IT-Director.com

 

Analysis

Big data - big misunderstandings, big mistakes?

By: Clive Longbottom, Head of Research, Quocirca
Published: 8th May 2012
Copyright Quocirca © 2012

If an organisation is sitting on top of 10 databases, each of which is 100TB in size, it has a big data issue, right?

Not necessarily – it certainly has a problem in that it has a lot of data to deal with, but federating the databases and applying data cleansing, master data management (MDM) and business analytics can provide a pretty decent solution. Big data introduces a different set of problems – ones that require a different way of thinking, which may take many outside their comfort zone.

Let’s begin by taking a simple view of information within an organisation. In the dim, dark past when I got into the ICT world, the rule of thumb was that around 20% of an organisation’s information was in electronic format, the rest on paper. Of the electronic stuff, about 80% was held within formal databases. Roll the clock forward by a couple of decades and this has essentially flipped – around 80% of an organisation’s information is now in electronic format, and only around 20% of that will be held in a formal database. The rest of the electronic stuff will be held in various file formats dotted around on file servers, personal devices and so on.

Any “big data” approach that deals only with the data held within databases is therefore using just 16% of the available information (20% of the 80% that is electronic) – not a good way to reach mission-critical decisions.
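The 16% figure follows directly from the two rule-of-thumb percentages above. A minimal sketch of the arithmetic, where the variable names are purely illustrative and the shares are the rough estimates quoted in the text rather than measured values:

```python
# Rule-of-thumb shares quoted in the article (estimates, not measurements).
electronic_share = 0.80  # share of all information held in electronic format
database_share = 0.20    # share of that electronic information held in formal databases

# Share of ALL of the organisation's information that a database-only approach can reach.
database_only_coverage = electronic_share * database_share

print(f"Database-only coverage: {database_only_coverage:.0%}")
```

Change either share to match your own organisation and the coverage figure moves with it; the point is that the two percentages multiply.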

This is further complicated by how information usage has changed. Back at that earlier time, an organisation’s data assets were pretty easy to define – the data was in that database that was on that server in that data centre. Now, the organisation’s information assets have to include shared information across the value chain of customers and suppliers – and then beyond that into the information held on the internet itself and across social networking sites.

All of a sudden, the “big data” approach of federating information across those large databases that the organisation controls is looking a little measly. Even if it is assumed that those databases are large – say a total of 10 petabytes (PB), or close to 1,000 times the amount of information held in the American Library of Congress – the total size pales into insignificance against the volume of information held on the internet, where other potentially useful information can be found in semi-structured or unstructured formats. The current information volume of the internet is estimated to be around 2 zettabytes (ZB) – or 2 million PB. Bringing that into the equation takes the 16% of available information that you may have thought you were acting on down to a very small fraction of a single per cent.
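To put a rough number on that "very small fraction", the sketch below compares the 10 PB of in-house databases with the 2 ZB estimate for the internet. It is an order-of-magnitude illustration only: it assumes decimal units (1 ZB = 1,000,000 PB), takes the article's figures at face value, and ignores the organisation's own non-database information. The variable names are illustrative.

```python
# Decimal (SI) units: 1 ZB = 10**21 bytes, 1 PB = 10**15 bytes.
PB_PER_ZB = 10**21 // 10**15  # 1,000,000 PB per ZB

internal_databases_pb = 10         # the article's example of large in-house databases
internet_pb = 2 * PB_PER_ZB        # the article's estimate of 2 ZB for the internet

# Share of the wider information pool covered by the in-house databases.
share_of_internet = internal_databases_pb / internet_pb

print(f"Internet volume: {internet_pb:,} PB")
print(f"In-house databases as a share of that: {share_of_internet:.4%}")
```

On these figures the databases amount to five parts in a million of the internet's volume, which is the scale of the gap the argument rests on.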

Sure, a lot of the available information out there on the internet is either complete dross or is not germane to the problem you are dealing with. The problem is that some of it is – the views of customers being propagated through the social networks; the performance and activities of competitors; the dynamics of the markets in which you are operating, whether these are vertical or geographic. You need the tools to identify that useful stuff, and then the means to bring it into an environment where it can be analysed and reported against in a manner that allows intelligence to be gleaned from a broader set of sources – in other words, a true big data approach.

A term that is being used around big data sums it all up nicely – it is about volume, velocity and variety. The volume side is the one everyone accepts, but is also the one that vendors have latched on to and focused on. The velocity side is where the big battles seem to be playing out – how fast can one vendor provide insights against the large volume of data that is under focus?

But variety is often glossed over – and yet it is the most important. Less structured information held in documents and spreadsheets, along with information that can be gleaned from less traditional sources such as voice and video and the internet sources alluded to earlier, is all potentially relevant. Those who can use the right technologies to bring this variety of information sources together, such that volume and velocity needs are also met, will be the outright winners in the world of true big data – those who just look at it as a problem with volumes of structured data under their direct control will face major problems.

For a bit more on this subject, see Quocirca’s argument on why “big data” should be re-termed “unbounded information”.


Published by: IT Analysis Communications Ltd.
T: +44 (0)190 888 0760 | F: +44 (0)190 888 0761