Datasift, Twitter, and “Privacy”

Datasift (www.datasift.com) just announced a fancy new service.   They have worked out a deal with Twitter where they can provide 2+ years of historical tweets.

This is exciting for a number of reasons, not least that you can run very interesting trend analysis on streams that you weren't already capturing.  Step back for a moment (figuratively, not literally - I want you to still be able to read what I'm writing) and ponder this.  One of the real problems with doing any sort of analysis on rapidly emerging events is that you have to actually catch them when they first happen.

And the only way to do this would be to consume and store all of Twitter.  This is a tremendously difficult technical undertaking, and fortunately you don't have to do that now.   You can ask all kinds of interesting historical questions, see cause-and-effect.  Perhaps find a literal "butterfly effect" resident in the Twitter data.

You can see the very genesis of a topic and trace it back to its roots before it became a trending topic.   This is a hugely powerful idea for historians, politicians, and, yes, marketing people.

And therein lies some controversy.   Consider articles like these:

Daily Mail: Twitter secrets for sale

IT World: Time to clean out old Tweets so they won't define you the rest of your life 

Datasift is NOT providing content from private Twitter streams or tweets that have been deleted.   So, this means that the only Tweets that are accessible through the service are completely public tweets.

"Secrets"?  Not so much.  Is there really a person on this planet who believes that what they Tweet is anything other than completely public information?  If there is, then they are a strong argument for the need for a license to use the Internet.   Sir, please step away from the keyboard.

This is not new.  Politicians, historians, and evil marketing people have been consuming and storing Twitter feeds since Twitter became popular.   In fact, there are 2 major differences here:  

1) it's everything

2) Datasift have actually instituted the ability to remove Tweets.   With any other system storing Tweets, they don't get deleted when you delete them.

Oh, wait, I'm sorry.   The fact that it's everything isn't even new.  The Library of Congress stores every single Tweet.  And they don't even have a "don't store deleted Tweets" policy.  So, everything you've ever said on Twitter is stored permanently in a way that can be accessed.  Granted, with "scholarly" aspirations, but it's still there.

When you Tweet, you're using a free platform.  This platform has a Terms of Service that states that Twitter is allowed to do exactly what they're doing.  Twitter is a business that exists to make money; it is not an organization that aims to provide a free messaging platform for the good of all mankind.

But what really annoys me about all this is this sudden burst of OMG THINK OF THE PRIVACY from people who should know better.   I don't know why the EFF has been quoted here, but seriously, people.  If you Tweet something, don't you think the world is watching?  

I can understand the backlash against Google and Facebook with their privacy policy changes.   Twitter has not changed their policy, and the Tweets in question were posted out to the world at large.   

We are clearly biased, as we're close business partners with Datasift, and they're using our software to analyze this heap of data.

We think it's great.   And we're also teaching our children about the value of being circumspect on the Internet.