By: Philip Howard, Research Director - Data Management, Bloor Research
Published: 29th June 2011
Copyright Bloor Research © 2011
Often when you see this sort of heading you expect to read something like "long live the EDW" later in the piece. Not this time. The EDW (enterprise data warehouse) is dead. Period. Like a dodo. Like Monty Python's parrot.
This came up last week at SAS' analyst conference in Athens. I was having dinner with Keith Collins, CTO of SAS, and he asked me what I thought the future of the EDW was. I said that I thought that it had no future. You might think that this would have led to a long and interesting debate. It didn't: he completely agreed with me. We swiftly moved on to other topics.
There are two ways to look at this. The first is to look at the broader picture. In this context there are three types of data that you want to analyse: structured, unstructured and streamed. Of course, these terms are hopelessly confusing. What is the difference between a 140-character product description, stored in a table in a relational database, and a tweet? The truth is that data is neither inherently structured nor unstructured. We extract structure from data by using BI or search or similar capabilities, and we impose structure by the way we store and manage data. Storing data in a relational database imposes structure on data, which is reflected as metadata. What Hadoop is doing is imposing structure on so-called unstructured data. Streamed data is also structured; it's just that the metadata is external: we know that a stock tick consists of a stock symbol followed by the value of the tick.
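The point about streamed data can be made concrete with a minimal sketch. It assumes a hypothetical feed in which each tick arrives as a bare line of text, "symbol, space, value"; the symbols, prices and the names `Tick` and `parse_tick` are invented for illustration. The raw line carries no field names, so the structure comes entirely from the layout the reader agrees on in advance.

```python
from typing import NamedTuple


class Tick(NamedTuple):
    """The externally agreed layout: a stock symbol followed by a value."""
    symbol: str
    price: float


def parse_tick(raw: str) -> Tick:
    """Impose structure on a raw tick using the external metadata above."""
    symbol, price = raw.split()
    return Tick(symbol=symbol, price=float(price))


# Hypothetical stream content: nothing in these strings names the fields.
stream = ["ABC 101.25", "XYZ 164.10", "ABC 101.40"]
ticks = [parse_tick(line) for line in stream]

for tick in ticks:
    print(tick.symbol, tick.price)
```

The same strings could just as easily be treated as unstructured text; what makes them "structured" here is the definition of `Tick`, which lives outside the data.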
So, what we are talking about here is how you store and process data for query purposes, and structured, unstructured and streamed data (assuming we can't get rid of this terminology) have very different requirements. It is just about possible to conceive of a platform that supports relational, Hadoop-like and streaming data at some point in the future, but it's not going to happen soon, if it ever does. So, certainly for the time being, there is no prospect of an EDW supporting all of these different types of query processing.
However, even if that were conceivable, the EDW is not going to make a comeback even if we look at the narrower field of structured data alone. There are two reasons for this. The first is that I don't think the concept of a single monolithic EDW was ever the right one. It was too time-consuming and expensive to set up and run. But even if you think that it was conceptually the right approach, it has been overtaken by practical considerations: in particular, data marts and application-specific appliances have proliferated throughout large enterprises and there is no way that that is going to change. Further, data virtualisation and federation are now mature technologies that allow federated query access across data marts and warehouses, so there is no longer any incentive to try to centralise everything.
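As a toy illustration of what federated query access means, the sketch below uses Python's built-in `sqlite3` module and its `ATTACH DATABASE` statement to join two separate in-memory databases in a single query. The mart names, tables and figures are all hypothetical, and real data virtualisation products federate across different database systems rather than two SQLite stores; the sketch only shows the principle that the data stays where it is and is combined at query time.

```python
import sqlite3

# One connection acts as the "federation layer" over two separate stores.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales_mart")
conn.execute("ATTACH DATABASE ':memory:' AS finance_mart")

# Each mart keeps its own table; nothing is copied into a central warehouse.
conn.execute("CREATE TABLE sales_mart.orders (region TEXT, amount REAL)")
conn.execute("CREATE TABLE finance_mart.budgets (region TEXT, budget REAL)")
conn.executemany(
    "INSERT INTO sales_mart.orders VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0)],
)
conn.executemany(
    "INSERT INTO finance_mart.budgets VALUES (?, ?)",
    [("EMEA", 150.0), ("APAC", 100.0)],
)
conn.commit()

# A single query spans both marts.
rows = conn.execute(
    """
    SELECT b.region, SUM(o.amount) AS actual, b.budget
    FROM finance_mart.budgets AS b
    JOIN sales_mart.orders AS o ON o.region = b.region
    GROUP BY b.region, b.budget
    ORDER BY b.region
    """
).fetchall()

for region, actual, budget in rows:
    print(region, actual, budget)
```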
Of course, smaller organisations might get away with a single data warehouse, but this would be an SMEDW. And, no doubt, there will be diehards who have invested so much money in their existing infrastructure that they are afraid to admit that they got it wrong. But, going forward: RIP the EDW.
Posted: 11th July 2011 | By Robert Eve :
Philip -
"The EDW is Dead" is quite a statement!
I think the EDW is more like the case of another King, Elvis Presley.
Many cling to the memory of the young Elvis, shaking it on stage and in his popular movie hits. That is Elvis the King. And that King will never die.
Fewer think about an older, less-vibrant Elvis, overweight and overdosed. I think this is the King you speak of when you say the EDW is dead.
Perhaps there is a middle-ground analogy in another oft-stated Elvis reference: "Elvis has left the building."
So
Posted: 20th September 2011 | By Greenlightwilly :
And the mainframe is dead too. ... I have to ask if you have ever participated in designing a data architecture for a large corporation? Particularly a finance department? Being in the midst of just such an endeavor (as well as many others in my past) it's not a stretch to say that no data professional who has trodden that path would make such a comment as yours. I admit it's tempting to move to the next shiny object instead of dwelling in 'yesterday's' concepts, but merely stating that those shiny objects will nullify something without clearly defining what that 'something' actually does is absurd. Real-time streaming and big-data solutions may not be a good fit for an EDW - and you at least allude to that - but go no further. That hardly signals the end of EDW. The need to integrate and store structured enterprise data over time, data that has been subjected to countless layers of rules and latency issues, as vetted, approved 'snapshots' is why these things exist. Suggesting that the EDW needs to 'make a comeback' presumes that it went somewhere in the first place. Which, from my experience, it definitely hasn't. Misused, misunderstood and misapplied - yes. But good ones continue to thrive when properly built, managed and inserted correctly into a data solution. ....And where is it written that the Big-Bang, monolithic EDW projects, upon which you base your first 'proof' that EDWs are a doomed premise, are an inherent feature of all EDWs? Again, anyone who has worked in this space over the last 20 years will attest that that approach, versus an incremental one, is exactly the wrong way to go if you want success. Your second point supporting your premise is also specious - you do not explain how the proliferation of datamarts and appliances signifies the end of EDW. As if the presence of one disproves the need for the other (Google 'Kimball/Inmon War' for an update on how that all turned out).
This is like suggesting the introduction of railroads signalled the end of cargo ships. They both carry freight but... oh well, forget it. The needs of an organization will drive whether or not an EDW component must be part of their data solution - not the musings of internet pundits and software vendors who have an ax to grind. You might have been on more solid footing if you had established what an EDW's architectural premise is, like you attempted to do on the big-data end of your comparison. But then you wouldn't have made the point your title suggests. ...And what does Elvis have to do with any of this???
The messages above were all contributed by IT-Director.com readers. Whilst we take care to remove any posts deemed inappropriate, we can take no responsibility for these comments. If you would like a comment removed please contact our editorial team.
Published by: Electronicdawn Ltd.