Friday Post 2: Rocketbooming
Ze Frank is in the middle of a nerd fight about download stats, and it all boils down to Rocketbooming. I'm not going to get into whether Rocketboom's reaction to this is justified, because the baseline issue is actually more interesting.
It's about the metrics, really. Here's the thing: when I wrote the blurb for the back cover of the Crow book, I said 225,000 people had downloaded the Pig book. I got that number by looking at the download stats of the original files (in all languages), and running a few filters. We started at something approaching 375,000. Take out bots and other such nonsense, and it drops a good amount. But then you look at the downloads and you see that someone downloaded it three times in the space of a minute... maybe their PDF viewer was wonky, they hit reload a few times to make it work. Whatever it was, I tried to filter by IP based on timeframe, so that I could arrive at actual downloads. After a lot of that, I trimmed the number down to about 225,000, which is still fantastically large.
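To make that filtering concrete, here is a minimal sketch of the general idea, not the exact filters run on the Pig book logs. It assumes each log entry has already been parsed into an (IP, timestamp, user agent) tuple; the bot markers, the five-minute window, and the function name are all made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical substrings that flag a user agent as a bot; a real list would be much longer.
BOT_MARKERS = ("bot", "crawler", "spider")


def count_real_downloads(hits, window=timedelta(minutes=5)):
    """Count downloads after dropping bots and rapid repeats from the same IP.

    hits: iterable of (ip, timestamp, user_agent) tuples.
    window: a repeat from the same IP within this long of that IP's
            previous hit is treated as a reload, not a new download.
    """
    last_seen = {}  # ip -> timestamp of the most recent non-bot hit from that IP
    total = 0
    # Process hits in time order so "previous hit" means what it says.
    for ip, ts, agent in sorted(hits, key=lambda h: h[1]):
        if any(marker in agent.lower() for marker in BOT_MARKERS):
            continue  # bots and other such nonsense
        previous = last_seen.get(ip)
        last_seen[ip] = ts
        if previous is not None and ts - previous < window:
            continue  # same IP again within the window: probably a reload
        total += 1
    return total


# Made-up sample log: three quick hits from one IP, one bot, and two genuine downloads.
hits = [
    ("10.0.0.1", datetime(2006, 1, 1, 12, 0, 0), "Mozilla/5.0"),
    ("10.0.0.1", datetime(2006, 1, 1, 12, 0, 20), "Mozilla/5.0"),
    ("10.0.0.1", datetime(2006, 1, 1, 12, 0, 40), "Mozilla/5.0"),
    ("10.0.0.2", datetime(2006, 1, 1, 12, 1, 0), "Googlebot/2.1"),
    ("10.0.0.3", datetime(2006, 1, 1, 12, 2, 0), "Mozilla/5.0"),
    ("10.0.0.1", datetime(2006, 1, 1, 15, 0, 0), "Mozilla/5.0"),
]

print(len(hits), count_real_downloads(hits))  # 6 raw hits, 3 after filtering
```

Even this is fuzzy: a whole office behind one IP gets undercounted, and anyone whose IP changes between downloads gets overcounted. That is part of why a filtered number still feels iffy.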
I know if I wrote my initial stats (375,000) on the Crow book, it'd be much more impressive, and people would think I was super-1337. But I know in my heart that's not a valid number (even 225,000 is iffy to me), and all it does is set up a standard that is unhealthy. People have to be dishonest or suffer the consequences.
Let's say I, the author of the Crow book, am a different person. That book's been downloaded fewer than 50,000 times (with filtering). Without filtering, about 70,000. So as the author of a similarly-targeted product, what do I announce as my stats? 70,000? If I don't, I'm a small-fry, and advertisers will shy away from me (not that I'm looking for advertisers, but y'know). But if I say 70,000 and some advertiser wants to see results for that kind of audience, they won't get them, because my actual number is likely only 50,000! I'm setting myself up for a fall, but I don't have much choice because my competitor (the Pig book) is going around yelling huge numbers from the rooftops.
The thing is, Rocketboom needs to filter down their numbers, and not because they're being dishonest (I don't think you'd call it dishonest anyway). They have to do it because they're pushing metrics in the wrong direction... we have the power and intelligence on the web to at least improve on the TV model, which is a lot of silly guesswork and extrapolation. We should be able to say: our downloads are X, and our likely real audience is X-Y. You don't get numbers that compete with Grey's Anatomy, but you get a better ROI. Advertisers will get more actual bang for the buck because your bang is reasonable. And if Rocketboom trims their numbers, it will put less pressure on their competitors to fudge numbers to appear to be in the same league.
What will happen in this kind of arms race is that one day, a vidcast with a so-so audience will distribute its show via every possible outlet, claim 1 million downloads a day (of which 10,000 are actually watched) and push the overall value per download on the web so low that Rocketboom, with their 300,000 downloads a day, will start to lose money. And then they'll have to inflate their stats, and so on, until the only people who can play professionally are the ones who can sign deals with big distributors to help boost their download stats.
Transparency and admissions of imperfection are key to internet life, and it helps everyone to admit that their download stats are flawed. If we trim them back and try to present REALISTIC numbers rather than "competing with TV" numbers, advertisers will end up a lot happier.