By: Philip Howard, Research Director - Data Management, Bloor Research | Published: 4th July 2008 | Copyright Bloor Research © 2008
Real-time data integration has become an increasingly significant issue over the last few years, supporting predictive analytics, operational business intelligence, real-time dashboards, risk management, zero-downtime migrations, service oriented architectures and so on.
However, despite its increasing popularity, one of the most common assumptions about real-time data movement is rarely scrutinised or discussed: that the means of supporting it (typically change data capture) is exclusively an adjunct to traditional batch methods of data integration, a side dish rather than the main course. In other words, batch capabilities are typically thought of as coming first, via standard ETL (extract, transform and load) tools, and real-time data integration is then implemented as an accessory, whether for trickle feeding data into a warehouse or for other purposes. What is not generally considered, except perhaps for specific projects, is the idea of implementing real-time capabilities on their own, without any use of batch processing.
What would be the value of this?
Well, consider the characteristics of batch loading. Put simply, you have to load and transform all of the data that you need to move within the confines of your batch window. What this means is that you must have enough hardware and processing power so that these peak loads can be handled with appropriate performance. Worse, of course, is the fact that batch windows are narrowing so that, relatively speaking, these peaks are becoming increasingly onerous and demanding, meaning that the required processing power to service these needs is similarly increasing.
This is where batch loading stumbles over its own proverbial feet and real-time processing begins to overtake it: where batch loading concentrates its processing into a specific window of time, real-time data integration spreads the same effective amount of processing over a much longer period. As a simple (and simplistic) example, if twelve hours of batch loading are required each week, replacing this with a real-time approach would mean that those twelve hours of work are spread across the whole 168-hour week. In other words, since 168 / 12 = 14, you require 14 times as much processing power to handle peak loads in this particular batch environment as you do if exclusively moving the data in real time.
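The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration of the simplistic example above: the function name `peak_capacity_ratio` is hypothetical, and the sketch assumes that the total amount of work is the same in both approaches and that a real-time feed spreads it perfectly evenly, which a real workload would not do.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours available to a continuous, real-time feed


def peak_capacity_ratio(batch_hours_per_week: float,
                        available_hours: float = HOURS_PER_WEEK) -> float:
    """Return how many times more peak processing power batch loading needs
    than the same work spread evenly over `available_hours`.

    Assumption (hypothetical model): the total work is identical in both
    cases, so required power is inversely proportional to the time allowed.
    """
    if not 0 < batch_hours_per_week <= available_hours:
        raise ValueError("batch window must be positive and fit within the period")
    return available_hours / batch_hours_per_week


if __name__ == "__main__":
    # The article's example: twelve hours of batch loading per week.
    print(peak_capacity_ratio(12))  # 14.0
    # A narrower window for the same work raises the peak further.
    print(peak_capacity_ratio(6))   # 28.0
```

Under the same assumptions, halving the batch window to six hours doubles the ratio to 28, which is the effect of narrowing batch windows described earlier.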
Of course there is more to it than this but, simplistically, it should be clear that the argument is valid: real-time data integration requires less computing power than a traditional approach, whether that is batch on its own or batch combined with real-time data movement. And that translates into less hardware, less space on the data centre floor, lower power requirements and less need for cooling. So real-time is greener than batch, and there would be a good case for using it as a standard approach to integration even if we didn't have an increasing need for real-time information.
This article was co-authored by Daniel Howard.
4th July 2008: 'Tony Sceales' said:
Bravo! Great message Philip - and one we've really taken to heart at Celona as you know.
Spreading the impact on people and processes by migrating smaller, better distributed, business-focused slices of data over a period of time rather than killing the network and performance of both source and target environments is a key design principle of ours.
So green is also lower risk and helps key managers sleep at night - a real win-win.
Tony Sceales
CTO
Celona Technologies
www.celona.com
Published by: IT Analysis Communications Ltd.