Updated: August 28th, 2008

Newsflash: Perl 6 is not dead (in case you thought it was)!

I stumbled upon this most excellent series of posts by Moritz Lenz of perlgeek.de that describe the differences between Perl 5 and the upcoming Perl 6 (thanks to Andy Lester for the link). The posts are done in the form of tutorials, which helps comprehension. Simply awesome, Moritz.

It seems like Perl 6 is going to be a lot more object oriented, but such orientation is optional and not forced upon programmers the way it is in, say, Java. It warms my heart that I will be able to do this (you did see the new "say" function in Perl 5.10, right?):

my Num $x = 3.4;
say $x.WHAT; # Num
say "foo".WHAT; # Str

My favorite Perl 6 change so far is this:

# named arguments
 
sub doit(:$when, :$what) {
  say "doing $what at $when";
}
 
doit(what => 'stuff', when => 'once');  # 'doing stuff at once'
 
doit(:when<noon>, :what('more stuff')); # 'doing more stuff at noon'

I first saw this technique in Ruby (apparently Python has it too), and have been using an anonymous hash to emulate named arguments in Perl 5. Perl 6 does it in a much cleaner way.
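For comparison, here is a rough sketch of the same idea in Python, since it reportedly has named arguments too. The function name doit simply mirrors the Perl 6 snippet; the bare * in the signature is my own choice to force callers to name every argument.

```python
def doit(*, when, what):
    """Keyword-only parameters: every argument must be passed by name."""
    return f"doing {what} at {when}"

# The order of the arguments no longer matters because each value is named.
print(doit(what="stuff", when="once"))
print(doit(when="noon", what="more stuff"))
```

Calling doit("once", "stuff") without names raises a TypeError, which is about as close as Python gets to the strictness of the Perl 6 signature.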

I wonder if there are any Perl 6 changes specifically affecting file/disk access, MySQL interaction, and execution speed.

What is your favorite new feature? Comments welcome.

Edit: Whoa, string concatenation is now ~, the dot . is used for method calls. That's kind of upsetting, I'm so used to '.'.

Edit #2: Holy crap, regex changed so much, it just warped my head onto itself and now I have a black hole in place of my face, thanks a lot. Regexes are also now called "Rules". More here

So the other day I was setting up public key authentication for one of my users, which is usually very straightforward: generate a private/public key pair, stick the private key into the user's .ssh dir, set dir permissions to 0700, private key permissions to 0600, stick the public key into the authorized_keys file on the server, and the job's done. However, this time, no matter what I did, the public key was being rejected or ignored and the system was moving on to keyboard-interactive authentication.

Debugging on the client side with -v didn't help much:

artem@DeathStar:~/svn/b2/Fetch/LinkChecker> ssh -v monkey@192.168.1.30
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007   
...
lots of boring shit
...
debug1: Found key in /home/artem/.ssh/known_hosts:1
debug1: ssh_rsa_verify: signature correct
...
more boring shit
...
debug1: Authentications that can continue: publickey,keyboard-interactive
debug1: Next authentication method: publickey
debug1: Offering public key:
debug1: Authentications that can continue: publickey,keyboard-interactive
debug1: Offering public key:
debug1: Authentications that can continue: publickey,keyboard-interactive
debug1: Offering public key: /home/artem/.ssh/id_rsa
debug1: Authentications that can continue: publickey,keyboard-interactive
debug1: Trying private key: /home/artem/.ssh/id_dsa
debug1: Next authentication method: keyboard-interactive
Password:

After breaking my head over possible reasons why the pile of junk that thinks it's smarter than me next to my feet doesn't work, kicking it a few times, and observing the same result, I turned to debugging the ssh daemon itself - sshd.

  • The -d option disables the daemon mode and enables debug mode, in which only 1 connection is accepted for the lifetime of the server, after which it simply quits.
  • -dd simply enables a more detailed output.
  • -e sends this debug output to standard error instead of the system log.

However, to free up port 22, I had to stop the daemon that was already running, or else a "Bind to port 22 on 0.0.0.0 failed: Address already in use." error appeared (duh). An interesting question, though, especially for people doing this on remote boxes: what happens when one stops sshd? Ever thought of doing that but instead ran over to your mommy crying like a little girl? Well, fear no more, because I'll tell you exactly what happens:

  1. New users will have their connection refused.
  2. Your own connection will not be interrupted. sshd works by spawning a new instance of itself for every incoming connection, so your own sshd process will stay in memory.

So where was I?

/usr/sbin/sshd -dd -e
...
Authentication refused: bad ownership or modes for directory /home/monkey
...
Failed publickey for monkey from 192.168.1.30 port 56287 ssh2

AhA!! (emphasis on the last 'a'). What have we here?

artem@DeathStar:~> cd /home/
artem@DeathStar:/home/> l
drwxrwx--- 29 monkey  users 4096 2008-08-06 23:14 monkey/
 
DeathStar:/home/ # chmod 755 monkey
drwxr-xr-x 29 monkey  users 4096 2008-08-06 23:14 monkey/
 
artem@DeathStar:~/svn/b2/Fetch/LinkChecker> ssh -v monkey@192.168.1.30
OpenSSH_4.6p1, OpenSSL 0.9.8e 23 Feb 2007   
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to 192.168.1.30 [192.168.1.30] port 22.
debug1: Connection established.
debug1: identity file /home/artem/.ssh/id_rsa type 1
debug1: identity file /home/artem/.ssh/id_dsa type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_4.6
debug1: match: OpenSSH_4.6 pat OpenSSH*
...

Connection established, all systems are go, the key has been accepted.
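If you'd rather catch this before digging through sshd debug output, here is a minimal Python sketch that looks for the same kind of problem. It assumes the usual layout (.ssh and .ssh/authorized_keys under the home directory) and only checks for group or world write bits; the function name insecure_paths is made up for illustration, and sshd's own check also looks at ownership, which this does not.

```python
import os
import stat

def insecure_paths(home, names=(".ssh", ".ssh/authorized_keys")):
    """Return (path, mode) pairs for paths that are group- or world-writable.

    Checks `home` itself plus the assumed default key locations under it.
    Ownership is NOT checked here, only the permission bits.
    """
    bad = []
    for path in [home] + [os.path.join(home, name) for name in names]:
        try:
            mode = os.stat(path).st_mode
        except FileNotFoundError:
            continue  # a missing file is a different problem
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            bad.append((path, oct(stat.S_IMODE(mode))))
    return bad

# Example: inspect the current user's home directory.
for path, mode in insecure_paths(os.path.expanduser("~")):
    print(f"{path}: mode {mode} is group- or world-writable")
```

Run against a home directory with mode 770, like the drwxrwx--- listing above, this would flag the directory; after chmod 755 it should print nothing for it.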

Inspired by http://linux.derkeiler.com/Mailing-Lists/Fedora/2005-08/1105.html

P.S. Don't forget to /etc/init.d/sshd start. ;)

1. I want to download and play FLVs on my computer.

2. I don't want to use some crappy FLV player that only plays FLVs and has an interface from either 1995 or 2034 - I want to use my favorite player, like Media Player Classic.

3. Yes, VLC plays FLVs but it can't fast forward or rewind them. Yes, mplayer plays FLVs but I want a GUI. Yes, mplayer supports GUIs but they all pretty much suck. I don't particularly like VLC's or mplayer's interface - want to fight about it?

Enter the latest version of ffdshow. ffdshow is a decoding filter - think of it as a set of codecs for your media players. It supports FLV1, FLV4, H263, On2 VP6, H264, WMV, DivX, Xvid, and anything you fancy. I tried playing FLVs encoded to FLV4 and VP6, and both worked great in MPC, including fast forwarding and going full screen. Just download and install it, and everything will magically work - no reboot necessary.

As a bonus, here's a handy screenshot that I took of everything ffdshow supports:

[screenshot of everything ffdshow supports]

According to Wikipedia, in April 2008, the number of videos on Youtube was 83.4 million (ref: http://en.wikipedia.org/wiki/YouTube#cite_note-5). However, the link in the cite note now displays “*” video results 1 - 20 of millions, without showing the real count.

Here's one way I found to get an estimated, but relatively accurate, number of videos on the popular video sharing site Youtube. The idea is simple. Get this feed: http://gdata.youtube.com/feeds/api/videos/-/* and parse out the number inside the <openSearch:totalResults> tag.
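The parsing step could look roughly like this in Python. The sample feed below is a hypothetical, trimmed-down stand-in (the namespace URI and the number in it are placeholders, not real data), and it is parsed from a string rather than fetched, so swap in the real feed body yourself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the namespace URI and the count are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:openSearch="http://example.com/opensearch">
  <title>Videos</title>
  <openSearch:totalResults>142500000</openSearch:totalResults>
</feed>"""

def total_results(feed_xml):
    """Return the integer inside the first totalResults element, or None."""
    root = ET.fromstring(feed_xml)
    for elem in root.iter():
        # ElementTree spells namespaced tags as "{uri}local";
        # compare only the local part so the exact URI doesn't matter.
        local = elem.tag.rsplit("}", 1)[-1]
        if local.lower() == "totalresults":
            return int(elem.text)
    return None

print(total_results(SAMPLE_FEED))
```

Matching on the local tag name is a deliberate shortcut so the sketch doesn't depend on knowing the feed's exact namespace.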

So here it is: the number of videos on Youtube is currently fluctuating between about 141 million and 144 million. The number goes up and down, which points to the fact that these are estimates.

That's a whole boatload of video if you ask me. To put it into perspective, a modest and completely inaccurate estimate of the amount of space all these videos occupy would be something like

142,500,000 * (a + b + c + d), where

  • a = average size of an FLV, let's say 4MB, though I'm probably way off. There are lots of really short videos out there and Youtube has a 10 minute cap. It's just an estimate, anyway.
  • b = average size of an MP4, let's say the same 4MB. There are lots of factors that would make this number completely inaccurate, the biggest one being that I don't know at which point Youtube started generating MP4s (and whether they generated them for all videos or just the ones going forward). It also depends on whether they managed to save all originals that people uploaded.
  • c = average size of all images associated with the video, let's say 50KB. Small thumbnails and a larger first frame don't take that much space.
  • d = average size of an original uploaded to Youtube. These could be immediately discarded after the encoding is complete, or perhaps Youtube saves the past few months' worth, or if they're completely insane, they're saving ALL originals ever. I'm going to throw a semi-random number in - 50MB per file.

So, just the FLVs, MP4s, and images would equal ((4 MB) + (4 MB) + (50 KB)) * 142 500 000 = 1.06818788 petabytes.

If Youtube has been saving all originals since the beginning, this number goes up to ((4 MB) + (4 MB) + (50 MB) + (50 KB)) * 142 500 000 = 7.70386123 petabytes.
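To make the arithmetic reproducible, here is a small Python sketch of the two estimates. It assumes binary units (1 KB = 1024 bytes, 1 PB = 1024^5 bytes), which is what the petabyte figures above appear to be based on, and every per-video size is the same guess as in the list. The variable and function names are just for illustration.

```python
KB = 1024
MB = 1024 ** 2
PB = 1024 ** 5

VIDEOS = 142_500_000   # midpoint of the 141-144 million range

flv = 4 * MB           # a: guessed average FLV size
mp4 = 4 * MB           # b: guessed average MP4 size
images = 50 * KB       # c: guessed size of thumbnails per video
original = 50 * MB     # d: guessed average size of an uploaded original

def petabytes(bytes_per_video, count=VIDEOS):
    """Total storage in binary petabytes for `count` videos."""
    return bytes_per_video * count / PB

without_originals = petabytes(flv + mp4 + images)
with_originals = petabytes(flv + mp4 + images + original)

print(f"without originals: {without_originals:.2f} PB")
print(f"with originals:    {with_originals:.2f} PB")
```

Changing any of the four guesses at the top shows how sensitive the total is; the originals dominate everything else.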

In addition to the video files, I wonder how big Youtube's databases are. Depending on how the data is compacted over time (i.e. daily views folded into monthly after a month, monthly into yearly, etc.), I would estimate something along the lines of 1.5-2TB, which is negligible compared to the space needed for videos. I'm quite sure the databases are MySQL, split into many shards for better performance, perhaps tweaked with Google patches. Watch Youtube's Scalability Presentation and have a peek at this article for more info.

So there you have it, folks. Am I far off in my calculations? If so, don't hesitate to correct me.

Edit: It seems that I forgot that Youtube also generates 3gp, so add some space needed for that.

I'm usually not one to be easily impressed, but these 2 dudes at what is supposedly the Olympics opening ceremony are stirring up mixed feelings in me. They're insanely acrobatic. Anyway, watch this: