Analysis

The problem with data quality solutions part 3

By: Philip Howard, Research Director - Data Management, Bloor Research
Published: 2nd December 2008
Copyright Bloor Research © 2008
So far in this series of articles I have discussed the failures of traditional data quality tools at matching in general, and at matching product and other complex data in particular. However, these aren't the only areas in which they fall down: they are not very good at dealing with names either (which makes one wonder what they are good at).

Suppose you are Chinese and you go to live in America. Do you keep your Chinese name? Do you anglicise it? If so, how? Do you reverse your names so that your forename goes first? Now consider a data quality solution trying to match in these circumstances. Or think about criminals with 30 different aliases: how do you match these names?
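To make the name-order problem concrete, here is a minimal sketch in Python. It is an illustration only, not how any product mentioned in this article works: the function names `normalise` and `name_similarity` are hypothetical, and the approach (sort the parts of each name, then compare with the standard library's `difflib.SequenceMatcher`) is just one simple way to stop a reversed forename and surname from defeating a match.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> list[str]:
    """Lower-case a name, treat punctuation as spaces, and return its parts sorted."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return sorted(cleaned.split())


def name_similarity(a: str, b: str) -> float:
    """Return a score from 0 to 1 that ignores the order of the name parts."""
    left = " ".join(normalise(a))
    right = " ".join(normalise(b))
    return SequenceMatcher(None, left, right).ratio()


# Reversing the order of the names no longer affects the score.
print(name_similarity("Wang Xiaoming", "Xiaoming Wang"))  # 1.0
```

Note what this does not solve: an anglicised name, a different transliteration, or an unrelated alias will still score poorly, which is precisely why name matching needs more than string comparison.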

Fortunately, the data quality fraternity (or some of it) has owned up to this gap in its capabilities. Thus IBM bought LAS (now Global Name Recognition) and Informatica more recently acquired Identity Systems, though the other vendors in the market remain out in the cold in this regard.

However, if you have read the previous articles in this series you will know that weakness in name matching is the least of my concerns about data quality tools. My real worry is that all the leading products have been built using out-of-date technology that has now been superseded.

In the first article I highlighted Netrics, which uses mathematical modelling as an alternative to the conventional pattern-matching used by the traditional vendors. And in the second article I mentioned Silver Creek, which uses a semantic approach. In particular, both of these products feature self-learning capabilities (as does Zoomix, recently acquired by Microsoft) that improve the efficiency of the match process over time while reducing the amount of human involvement that is required.

It is not that these products are new (Silver Creek has been around for a number of years and Netrics has 150-odd customers), but I have now reached the point where I think the existing market needs to be radically disrupted. Current products are being incrementally improved, but incremental improvements are not enough: we need dramatic improvements. Otherwise, most companies will continue to rely (ineffectively) on manual effort for data cleansing, because they cannot see the cost benefit of moving to inadequate pattern-based matching products, and I am not sure I can blame them.

If data quality is the huge issue we all say it is, and it is, then we owe it to users to provide them with technologies that actually help them resolve their data quality problems, rather than the sop that most of them are getting. Leading vendors need to recognise that the likes of Netrics and Silver Creek offer far better technology than they do, and they need to buy or build comparable capability as soon as possible if they are not to continue to disappoint the market generally.

Reader Comments


4th December 2008: 'Bob Barker' said:

Great discussion, Philip, but is the data quality world really this homogeneous? While mathematical modeling and semantic analysis are extremely useful in some solutions, it doesn’t follow that they can solve every problem in every domain equally well. For example, a solution great at matching product data may fail miserably when applied to a Wall Street insider trading problem. Sometimes combining different analytics is more effective, depending on the problem domain. I posted a longer discussion of this point yesterday on www.identityresolution.com for anyone who’s interested.


4th December 2008: 'Philip Howard' said:

Good point. It would suggest that the ultimate data quality product would support multiple matching engines that can be deployed as appropriate, depending on the class of problem, in a way analogous to the use of different algorithms as provided by data mining tools.
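As a rough illustration of that idea, the Python sketch below routes each comparison to a different matching engine according to the class of problem. Everything in it is an assumption made for the example: the engine functions, the `ENGINES` table and the problem-class names are hypothetical and far simpler than anything a real data quality product would use.

```python
from difflib import SequenceMatcher
from typing import Callable

# A matching engine takes two values and returns a score from 0 to 1.
Matcher = Callable[[str, str], float]


def exact_match(a: str, b: str) -> float:
    """Score 1.0 only when the values are equal, ignoring case and outer spaces."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def fuzzy_match(a: str, b: str) -> float:
    """Character-level similarity, tolerant of small spelling differences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_set_match(a: str, b: str) -> float:
    """Jaccard overlap of the words in each value, ignoring word order."""
    left, right = set(a.lower().split()), set(b.lower().split())
    if not left or not right:
        return 0.0
    return len(left & right) / len(left | right)


# Hypothetical mapping from class of problem to the engine deployed for it.
ENGINES: dict[str, Matcher] = {
    "identifier": exact_match,
    "person_name": token_set_match,
    "product_description": fuzzy_match,
}


def match(problem_class: str, a: str, b: str) -> float:
    """Score two values using the engine registered for the problem class."""
    try:
        engine = ENGINES[problem_class]
    except KeyError:
        raise ValueError(f"no matching engine registered for {problem_class!r}")
    return engine(a, b)


print(match("person_name", "Wang Xiaoming", "Xiaoming Wang"))  # 1.0
```

The point of the table is that adding a new class of problem means registering another engine, not rewriting the ones already in place.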


10th December 2008: 'Bob Barker' said:

Oops. Last week I referred to an expanded version of my response and made a brilliant typo. The blog address is www.identityresolutiondaily.com if you're interested.


The messages above were all contributed by IT-Director.com readers. Whilst we take care to remove any posts deemed inappropriate, we can take no responsibility for these comments. If you would like a comment removed please contact our editorial team.

Published by: IT Analysis Communications Ltd.
T: +44 (0)1908 880760 | F: +44 (0)1908 880761