Updated: September 16th, 2012
According to Wikipedia, in April 2008, the number of videos on Youtube was 83.4 million (ref: http://en.wikipedia.org/wiki/YouTube#cite_note-5). However, the link in the cite note now displays “*” video results 1 – 20 of millions, without showing the real count.
Here's one way I found to get an estimated, but relatively accurate, number of videos on the popular video sharing site Youtube. The idea is simple: fetch the feed at http://gdata.youtube.com/feeds/api/videos/-/* and parse out the number inside the <openSearch:totalResults> tag.
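As a rough sketch of the parsing step, the snippet below pulls the total out of a feed document using only Python's standard library. The function name `parse_total_results` and the `SAMPLE_FEED` document (including its namespace URI and its count) are made up for illustration and are not a captured response. The lookup ignores the namespace and letter case of the tag, since I'm assuming those details rather than quoting them from the real feed. No network call is made here.

```python
import xml.etree.ElementTree as ET


def parse_total_results(feed_xml: str) -> int:
    """Return the integer inside the first totalResults element of a feed."""
    root = ET.fromstring(feed_xml)
    for element in root.iter():
        # ElementTree tags look like "{namespace-uri}localName";
        # keep only the part after the closing brace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name.lower() == "totalresults":
            return int((element.text or "").strip())
    raise ValueError("no totalResults element found in feed")


# Hypothetical sample document, NOT a real response from the feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">
  <title>Hypothetical sample feed</title>
  <openSearch:totalResults>142500000</openSearch:totalResults>
</feed>"""

if __name__ == "__main__":
    print(parse_total_results(SAMPLE_FEED))
```

To use it against a live feed, you would download the XML first (for example with `urllib.request.urlopen`) and pass the decoded text to the function.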
So here it is: the number of videos on Youtube is currently fluctuating between about 141 million and 144 million. The number goes up and down, which points to the fact that these are estimates.
That's a whole boatload of video if you ask me. To put it into perspective, a modest and completely inaccurate estimate of the amount of space all these videos occupy would be something like
142,500,000 * (a + b + c + d), where
- a = average size of an FLV, let's say 4MB, though I'm probably way off. There are lots of really short videos out there and Youtube has a 10 minute cap. It's just an estimate, anyway.
- b = average size of an MP4, let's say the same 4MB. There are lots of factors that would make this number completely inaccurate, the biggest one being that I don't know at which point Youtube started generating MP4s, or whether they generated them for all videos or just the ones going forward. It also depends on whether they managed to save all originals that people uploaded.
- c = average size of all images associated with the video, let's say 50KB. Small thumbnails and a larger first frame don't take that much space.
- d = average size of an original uploaded to Youtube. These could be discarded immediately after the encoding is complete, or perhaps Youtube saves the past few months' worth, or, if they're completely insane, they're saving ALL originals ever. I'm going to throw in a semi-random number: 50MB per file.
So, just the FLVs, MP4s, and images would equal ((4 MB) + (4 MB) + (50 KB)) * 142,500,000 = 1.06818788 petabytes (using binary units, where 1 petabyte = 1024^5 bytes).
If Youtube has been saving all originals since the beginning, this number goes up to ((4 MB) + (4 MB) + (50 MB) + (50 KB)) * 142,500,000 = 7.70386123 petabytes.
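The two totals above only come out as written if KB, MB, and petabytes are treated as binary units (powers of 1024). Here is a small sketch that reproduces the arithmetic. The constant and function names are mine, and the per-file sizes are the same guesses used in the list above, not measured values.

```python
# Binary units: the totals in the text match only with powers of 1024.
KB = 1024
MB = 1024 ** 2
PB = 1024 ** 5

VIDEO_COUNT = 142_500_000  # midpoint of the 141-144 million estimate


def estimate_petabytes(per_video_bytes: int, video_count: int = VIDEO_COUNT) -> float:
    """Total storage in (binary) petabytes for a given per-video footprint."""
    return per_video_bytes * video_count / PB


# a + b + c: one FLV, one MP4, and the images for each video (guessed sizes).
encoded = 4 * MB + 4 * MB + 50 * KB
# Adding d: a guessed 50 MB original kept for every video.
with_originals = encoded + 50 * MB

if __name__ == "__main__":
    print(f"encoded only:   {estimate_petabytes(encoded):.2f} PB")         # 1.07 PB
    print(f"with originals: {estimate_petabytes(with_originals):.2f} PB")  # 7.70 PB
```

Changing any of the guessed sizes, or `VIDEO_COUNT`, shows quickly how sensitive the total is to the assumption about originals, which dominates everything else.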
In addition to the video files, I wonder how big Youtube's databases are. Depending on how the data is compacted over time (e.g. daily views folded into monthly after a month, monthly into yearly, etc.), I would estimate something along the lines of 1.5-2TB, which is negligible compared to the space needed for videos. I'm quite sure the databases are MySQL, split into many shards for better performance, perhaps tweaked with Google patches. Watch Youtube's Scalability Presentation and have a peek at this article for more info.
So there you have it, folks. Am I far off in my calculations? If so, don't hesitate to correct me.
Edit: It seems I forgot that Youtube also generates 3GP files, so add some space for those as well.