Twitter: The Creation of A New Data Corpus
Jorge Espinel / July 13, 2009
Last week, I attended TechCrunch’s Real-Time Crunchup. The event brought together many members of the Twitter ecosystem to showcase their products and several key investors to discuss their views of the Twitter phenomenon. This event helped me crystallize my view on Twitter, which is that Twitter is facilitating the creation of a new database of real-time content and information on the Web. Angel investor Ron Conway referred to it as a new “data corpus” on the Web. Google’s index is an example of another data corpus, which contains information about sites across the Web. Navteq’s mapping information is another data corpus. The creation of a new data corpus can unlock significant value creation opportunities.
So, how is Twitter creating a valuable and new data corpus? At the highest level, Tweets are pieces of real-time data that users contribute to a database. I am not referring to the original Twitter use case of “I am enjoying a juicy burger for lunch.” I am referring, rather, to the emerging and powerful “use-cases” of users reporting on news, sharing and commenting on article links, uploading photos and video, promoting blog post headlines, tracking flight information, following financial news, etc. While these are activities that we have been doing for some time on the Web, there are several reasons why Twitter has managed to leverage these use-cases to create a new and unique data corpus:
1) Public Network: To date, these activities have primarily taken place in private networks – i.e., most people shared articles with their friends and family rather than with the overall public. Most social networks had been dominated by activity in private networks. The fact that the Twitter database focuses on public information amplifies its value because it increases the potential number of consumers for it.
2) People-led: All of this “public” information is now being mapped to individuals rather than websites. This means that we can now use individuals to navigate information across the Web. As I have discussed in the past in the context of journalism, this is a powerful shift that suggests the emergence of “Intimacy” as one of the key differentiating elements of the Web as a commercial medium.
3) Frictionless: The experience of contributing information to the database requires little user involvement. As a result, the information is contributed in real-time and the database is constantly updating at an unprecedented speed. The myriad Twitter-centric tools (for uploading videos and photos, sharing links, etc.) make the overall experience for contributors to the Twitter database “frictionless.”
4) Optimized-for-mobile: The Twitter ecosystem of tools is becoming optimized around using phones to contribute information to the database. As the penetration of smartphones increases, the growth of Twitter’s database will accelerate.
There are several key implications from the creation of this new “data corpus”. A whole new infrastructure of services/experiences needs to emerge to unlock the value of the data. This has already begun, much as it did on the Web in the mid-nineties. Directory and search services lead the way (e.g., Summize, Collecta, Oneriot). Uploading tools and analytics companies follow (e.g., 12 seconds, Twitpic, Twitvid, Bit.ly). New user interfaces start to be tested (e.g., Twubs, ExecTweets, StockTwits). Lastly, a monetization model/solution eventually emerges.
Thanks to its open approach, the overall Twitter ecosystem is developing at a much faster speed than we have seen before. While Twitter remains small as a company in terms of number of employees, its ecosystem of services is growing rapidly. Judging by the size of the TechCrunch event, it is probably safe to say that the number of people involved in developing Twitter’s ecosystem of services may be more than 1000.
There is no guarantee that Twitter’s momentum will continue or that the service itself will become a permanent fixture of the Web experience in its current form. However, the real-time Twitter data corpus will likely stand the test of time and be one of the main differentiating elements of the Web experience going forward.
Filed in: Content.








