By: Andy Hayler, Associate Analyst, Bloor Research
Published: 22nd February 2008
Copyright Bloor Research © 2008
Not the Oscars this time, but a data warehouse appliance. Teradata carved out a successful high-end niche in database and hardware technology specifically aimed at analytic rather than transactional processing, succeeding where previous attempts (e.g. Red Brick, Britton Lee) had faltered. However, it is the rapid rise of Netezza that has caused a flurry of look-alike appliance vendors, such as DATAllegro, Dataupia and ParAccel, to sprout up in the last couple of years. I believe that it will be much easier to convince conservative buyers about appliances if they do not come with proprietary hardware, and indeed this is the approach taken by Dataupia. However, the software-only appliance route was taken a couple of years earlier by Kognitio (a re-brand of WhiteCross). Kognitio initially had a proprietary hardware link and had built up some impressive references in the UK, such as BT (which has serious data volumes), but had not succeeded as broadly commercially as it might have done; in my view it was held back by the proprietary hardware issue (especially in a conservative UK market). This has been addressed: a major re-engineering exercise has now allowed its WX2 V6 product to run on commodity x86 hardware such as data blades.
WX2 is an RDBMS that relies on scanning rather than indexes, and it achieves its performance through hardware parallelism and smart use of memory in preference to disk access where possible. The product reads in data from a flat file, loads it quickly (up to 1 terabyte an hour) and can then achieve extremely fast read performance. In one test 23 billion rows were read in two seconds. This approach differs from column-oriented databases (e.g. Sybase, ParAccel), whose design can also achieve high performance for certain analytic queries. A typical Kognitio implementation may involve 80 servers in groups of four. Resilience is obviously a key issue for such large data volumes, and the company claims that if you pull a server out of the rack and so artificially crash the system, it is able to restart in just a few minutes.
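To put the quoted figures in perspective, here is a back-of-envelope calculation in Python. It is only a sketch: the article does not say which cluster ran the 23 billion row test, so applying the "typical" 80-server configuration to it is my assumption, as is reading "terabyte" as a decimal terabyte. The constant and function names are purely illustrative.

```python
# Back-of-envelope throughput from the figures quoted in the article.
ROWS_SCANNED = 23_000_000_000            # rows read in the quoted test
SCAN_SECONDS = 2                         # time taken for that scan
LOAD_BYTES_PER_HOUR = 1_000_000_000_000  # "up to 1 terabyte an hour" (decimal TB assumed)
SERVERS = 80                             # ASSUMPTION: the "typical" cluster size is applied
                                         # to the test; the article does not state this


def rows_per_second(rows: int, seconds: float) -> float:
    """Aggregate scan rate in rows per second."""
    return rows / seconds


def megabytes_per_second(bytes_per_hour: int) -> float:
    """Convert an hourly load rate in bytes to decimal megabytes per second."""
    return bytes_per_hour / 3600 / 1_000_000


aggregate_rows_per_s = rows_per_second(ROWS_SCANNED, SCAN_SECONDS)
per_server_rows_per_s = aggregate_rows_per_s / SERVERS
aggregate_load_mb_s = megabytes_per_second(LOAD_BYTES_PER_HOUR)

print(f"Scan rate, whole cluster: {aggregate_rows_per_s:,.0f} rows/s")
print(f"Scan rate, per server:    {per_server_rows_per_s:,.0f} rows/s")
print(f"Load rate, whole cluster: {aggregate_load_mb_s:,.1f} MB/s")
```

Under those assumptions each server would be scanning on the order of 140 million rows a second, which gives a feel for why the design favours memory over disk access.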
The technology does not compete with data quality tools, as it assumes that pre-validation of data has been completed prior to loading. It could be characterised in philosophy as ELT (rather than ETL), since with such fast performance at its disposal it may be more efficient to carry out transformations within the database engine than to pre-process the data prior to loading. An ODBC interface allows the loaded data to be queried by any normal reporting tool. Against conventional databases such as Oracle, appliances can show dramatic results. In one recent proof of concept on a half-terabyte sample database, some queries were demonstrated to be 40 times faster than on the existing warehouse.
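The ELT idea mentioned above can be illustrated with a minimal sketch: load the flat file untouched into a staging table, then do the transformation in SQL inside the database. This is not WX2 code. Python's built-in sqlite3 module stands in for the database engine, and the table names, column names and sample rows are all invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat file; in practice this would be a file on disk.
RAW_FILE = io.StringIO(
    "order_id,store,amount_pence\n"
    "1,Leeds,1250\n"
    "2,Leeds,830\n"
    "3,York,400\n"
)


def load_raw(conn: sqlite3.Connection, flat_file) -> int:
    """E and L: copy the flat file into a staging table with no transformation."""
    conn.execute(
        "CREATE TABLE staging_sales (order_id TEXT, store TEXT, amount_pence TEXT)"
    )
    rows = [
        (r["order_id"], r["store"], r["amount_pence"])
        for r in csv.DictReader(flat_file)
    ]
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", rows)
    return len(rows)


def transform_in_database(conn: sqlite3.Connection) -> list:
    """T: type conversion and aggregation happen in SQL, after loading."""
    conn.execute(
        """
        CREATE TABLE sales_by_store AS
        SELECT store,
               SUM(CAST(amount_pence AS INTEGER)) / 100.0 AS total_pounds
        FROM staging_sales
        GROUP BY store
        """
    )
    return conn.execute(
        "SELECT store, total_pounds FROM sales_by_store ORDER BY store"
    ).fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_raw(conn, RAW_FILE)
totals = transform_in_database(conn)
print(f"Loaded {loaded} raw rows")
print(totals)
```

In an ETL pipeline the casting and aggregation would run in a separate tool before the load; here the load step stays dumb and fast, and the database engine does the work.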
Kognitio already has nearly half its customers on its software as a service model, which I wrote about previously. The more traditional licences result in orders typically in the £300k–£1.2m range. The company has added more solid customer references such as Marks and Spencer and Scottish Power (it has a few dozen customers now), and has grown to 78 employees and around £8 million in revenue, having been profitable for three years. This solid commercial performance has now given it the base to branch out into the massive US market, and it is about to open a head office in Chicago with sales offices in Boston and San Francisco.
Kognitio has the advantage of non-proprietary hardware ties (unlike Netezza) and a solid and lengthy track record of successful reference customers (unlike more recent appliance start-ups), which should be a potent combination if it can crack sales and marketing to the US market.
Posted: 26th February 2008 | By Samantha Stone :
It’s refreshing to see a healthy debate brewing about the definition and benefits of a data warehouse appliance. At Dataupia we believe the integration of software and hardware addresses a more complete set of data warehouse challenges than software alone and is optimal for simple administration and query execution efficiency. However, we also believe that the use of proprietary hardware can be costly to maintain. Therefore, we’ve architected a solution that encompasses the best of both worlds: a complete data warehouse appliance but with non-proprietary hardware components. Because it is built with commodity, off-the-shelf components, you benefit from an economic solution without compromising simplicity and efficiency.
Published by: IT Analysis Communications Ltd.