By: Philip Howard, Research Director - Data Management, Bloor Research | Published: 5th September 2007 | Copyright Bloor Research © 2007
While there are a number of open source ETL (extract, transform and load) vendors, I had not encountered an open source data quality solution until I recently spoke with Infosolve Technologies. However, Infosolve is not your typical open source vendor.
Infosolve in fact has two products: OpenDQ and OpenCDI (data quality and customer data integration respectively), where the latter leverages the former. So, how does Infosolve differ from other open source vendors?
The biggest difference between Infosolve and the rest of the open source community is that Infosolve does not believe you can make money simply by having a download site and then trying to sell support or services on the back of that download. Instead, Infosolve believes that you need to do the complete reverse: go out and sell your professional services, in this case for data quality, through a direct sales force. Then you implement your solution for the customer on a “free” open source platform. In other words, as I have remarked before, Infosolve is using open source as simply a different licensing model. Typical service engagements range between three weeks and nine months, though the company informs me that it hopes shortly to sign a two-year engagement.
In addition to its own direct sales force, Infosolve is also exploiting the channel: partnering with systems integrators and sub-licensing OpenDQ to other open source (and, for that matter, non-open source) vendors and ISVs.
Remaining on the open source discussion, Infosolve is a partner of Sun's and runs on Sun grid technology and, in particular, is available via Sun's utility computing offering, meaning that you can have OpenDQ hosted for you using a utility-based approach that can cost as little as an hour. Infosolve refers to this open source, utility-based model as a "zero-based data solution".
This means that, apart from the initial professional service engagement (to determine and set up appropriate data quality business rules, for example) and any on-going service fees, you will have more or less zero costs for the whole project—actually more but at an hour not much more. You can of course run the OpenDQ software on your own hardware should you prefer to do that.
On the technical side, OpenDQ is tightly integrated with Pentaho's data integration (formerly KETTLE) product but perhaps most interesting is the fact that the company will shortly be introducing support for unstructured data. This is important when it comes to non-name and address data such as product data, where information about products often comes into the organisation in unstructured format. The company will be using natural language processing to support unstructured data, which is probably the best approach to take.
The introduction of unstructured support is interesting, not just because it is clear that product data quality is becoming more of an issue but because it suggests (and I want to make it clear here that this is my own inference) that Infosolve may introduce an OpenPIM (product information management) product to go alongside its OpenCDI offering. This, of course, raises the whole question of open source MDM (master data management): while that is a discussion for another day, we see no reason why Infosolve shouldn't be as successful with MDM as it is with data quality.
3rd February 2008: 'Kasper Sørensen' said:
Why one would call OpenDQ "Open Source" is a mystery to me. Where exactly IS the source available? As far as I can see the source code is only "open" to paying customers, making it quite closed source in my eyes and no different from proprietary solutions sold by almost everybody else.
It seems everybody wants to call themselves open source these days, without even finding out what Open Source is.
Oh yeah, and I read somewhere else (http://blogs.cnet.com/8301-13505_1-9802297-16.html) that the software is accompanied by a GPL license, but it is still quite inconsistent with the Open Source Definition (http://www.opensource.org/docs/osd), which the GPL is supposed to be compliant with:
"... The license shall not require a royalty or other fee for such sale."
"Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost preferably, downloading via the Internet without charge."
12th February 2008: 'mark madsen' said:
I was contacted by them a few months ago too and opted not to write about them because they have the benefits of open source backwards. The value to a customer of open source where there is no community and there is no support and service (they offer none) is actually negative. Why would I as a customer hire someone to use and build custom data quality software that I then have to inherit and maintain? I'd be far better off buying off the shelf and implementing because of this.
I talked to a few people about the legality (GPL violation) of what they're doing and it's in a gray area. The legal opinion I trust most suggested that this is legal since the source is available to the people it is distributed to. One of the license authors stated it was gray and subject to interpretation of what is meant by the distribution requirement. Another one said "absolutely not, it should be reported".
13th February 2008: 'Kasper Sørensen' said:
That's exactly my point. Actually I'm beginning a new Open Source project for data quality, which is still in the very early stages. But for people interested, have a look at my project, which is called DataCleaner: http://www.eobjects.dk/trac/wiki/DataCleaner
It's being released under the Apache License, Version 2.0, and I hope to have the first alpha ready by March 2008.
The messages above were all contributed by IT-Director.com readers. Whilst we take care to remove any posts deemed inappropriate, we can take no responsibility for these comments. If you would like a comment removed please contact our editorial team.
Published by: IT Analysis Communications Ltd.