By: Clive Longbottom, Head of Research, Quocirca
Published: 6th February 2013
Copyright Quocirca © 2013
Toward the end of 2012, Quocirca met with an interesting company called DataSift. DataSift is a social data platform company: it takes feeds of data from the majority of social media sites and can then mine social conversations for content, trends and insights. This is of obvious interest to organisations that are tracking sentiment towards their brand in the market, but it may have other uses as well.
The obvious target for DataSift is Twitter: the vast majority of Twitter data is available in the public domain, with only direct messages (DMs) hidden from general view. However, DataSift can also track activity around an organisation’s Facebook page and content from blogs and forums, including semi-private information that the organisation can access through the social network connections established between itself and the public.
The platform is cloud-based, with prices based on a combination of 'complexity', hours and hourly cost, along with a data cost. The hours element is the simplest to explain: it is the length of the period being analysed, so a week is 168 hours and a month is (nominally) 720 hours. Complexity is more difficult, as it is based on a calculation that can only be completed once the query has been created. However, the business model does mean that you only pay for what you get: there are no ongoing subscriptions that have to be paid no matter what, and everything is on a per-use basis. The data cost is based on a small charge per Tweet analysed. DataSift recommends a 10% sample rate as sufficient for statistical validity, which lowers the price significantly.
As a test, Quocirca asked DataSift to run a Twitter-only analysis of 2012 Twitter activity for a named set of vendors that are often mentioned in the same breath as big data. The query required just 10 lines of code and gave a complexity score of 2.1. Without the 10% filter in place, 2.23 million Tweets were analysed.
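The pricing elements described above can be sketched as a simple calculation. This is only an illustration: the article does not give DataSift's actual formula or rates, so the way the elements are combined here (complexity multiplied by hours and hourly cost, plus a per-Tweet charge scaled by the sample rate) is an assumption, and the function name and all rate values are hypothetical placeholders.

```python
def estimate_query_cost(complexity, hours, hourly_rate,
                        tweets, per_tweet_rate, sample_rate=1.0):
    """Rough cost estimate for one query.

    ASSUMPTION: processing cost = complexity * hours * hourly_rate and
    data cost = tweets * sample_rate * per_tweet_rate. The real DataSift
    formula is not stated in the article.
    """
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in the range (0, 1]")
    processing_cost = complexity * hours * hourly_rate
    data_cost = tweets * sample_rate * per_tweet_rate
    return processing_cost + data_cost


# Figures from the test query: complexity 2.1, 2.23 million Tweets, all of 2012.
HOURS_IN_2012 = 366 * 24      # 2012 was a leap year
HYPOTHETICAL_HOURLY_RATE = 0.20      # placeholder, not a real DataSift price
HYPOTHETICAL_PER_TWEET_RATE = 0.0001  # placeholder, not a real DataSift price

full = estimate_query_cost(2.1, HOURS_IN_2012, HYPOTHETICAL_HOURLY_RATE,
                           2_230_000, HYPOTHETICAL_PER_TWEET_RATE)
sampled = estimate_query_cost(2.1, HOURS_IN_2012, HYPOTHETICAL_HOURLY_RATE,
                              2_230_000, HYPOTHETICAL_PER_TWEET_RATE,
                              sample_rate=0.10)
print(f"Full feed: {full:.2f}  10% sample: {sampled:.2f}")
```

Under these assumptions, only the data element shrinks when the 10% sample rate is applied; the processing element depends on the query and the period analysed.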
The topic made an interesting basis for the test, and Quocirca will be writing a more detailed piece on the findings, but the highlights below illustrate the potential power of the system.
As one point of interest, Quocirca looked at HP at a sentiment analysis level. Through the first part of the year, people’s views of HP remained fairly level, with a net sentiment score (positive comments minus negative comments) of 0. This is not good news in itself, but it could have been worse. However, between 14th November and 10th December, a lot of sentiment activity took place.
On 21st November, HP’s sentiment score plunged close to -10,000. It recovered to zero by the 24th, went back down to -5,000 on the 28th, rose again and then crashed to -7,000 on 1st December.
Why? On 20th November, HP’s CEO Meg Whitman told Wall Street analysts that HP had massively overpaid for software firm Autonomy, and accused former executives at Autonomy of cooking the books. Financial and technical analysts went into a frenzy, and these are the very people who use social networking the most to get information out as quickly as possible. The ongoing fall-out was what caused the triple dip in sentiment scores over the following weeks.
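The net sentiment score used above (positive comments minus negative comments) can be sketched as a per-day tally. This is a minimal illustration, not DataSift's implementation: it assumes each Tweet has already been labelled as positive, negative or neutral, and the function name and sample rows are invented for the example.

```python
from collections import defaultdict
from datetime import date


def net_sentiment_by_day(labelled_tweets):
    """Return {day: positive count - negative count}.

    `labelled_tweets` is an iterable of (day, label) pairs, where label is
    'positive', 'negative' or anything else (treated as neutral).
    ASSUMPTION: labels come from some upstream sentiment classifier.
    """
    scores = defaultdict(int)
    for day, label in labelled_tweets:
        if label == "positive":
            scores[day] += 1
        elif label == "negative":
            scores[day] -= 1
        else:
            scores[day] += 0  # neutral: keep the day in the result
    return dict(sorted(scores.items()))


# Invented sample rows for illustration only; not real Twitter data.
sample = [
    (date(2012, 11, 20), "positive"),
    (date(2012, 11, 21), "negative"),
    (date(2012, 11, 21), "negative"),
    (date(2012, 11, 21), "positive"),
    (date(2012, 11, 22), "neutral"),
]

for day, score in net_sentiment_by_day(sample).items():
    print(day.isoformat(), score)
```

A score of 0 on a given day can mean either no opinionated Tweets at all or an even split between positive and negative, which is why a flat line near zero is not necessarily good news.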
This shows that, although HP took fourth place in the number of mentions it had around big data, those mentions were not necessarily positive for HP’s brand. This is why a company such as DataSift is important: it not only removes the grunt work of analysing the massive firehose of data that comes from social networks, but also applies solid analytics to that data, so that the results a customer sees are placed in context.
Published by: electronicdawn Ltd.