By: Joe Clabby, President, Clabby Analytics
Published: 18th March 2013
Copyright Clabby Analytics © 2013
For some unknown reason the topic of memory has come up a lot in my research this week. It started when I was comparing cache designs on IBM's POWER7+ microprocessor to Intel's i7 x86 architecture, then moved into how much main memory each system could support—and then I chose to add an IBM System z mainframe to the discussion.
As I looked at each processor environment, here's what I found:
Why is this important? Because the closer in-memory data sits to the processor, the faster that data can be processed.
I then started to look at main memory for each system. And here's what I found.
Why is this important? Again, because the more data that you can place near the processor, the faster that data can be processed.
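To put a rough number on that point, here is a minimal sketch of how the average time to reach a piece of data changes depending on where it lives. The latency figures are order-of-magnitude assumptions chosen for illustration only; they are not measurements of POWER7+, i7 or System z, and the function name is made up for this example.

```python
# Illustrative, order-of-magnitude latencies in nanoseconds.
# These are assumptions for the sketch, not measurements of any system
# discussed in this article.
LATENCY_NS = {
    "on-chip cache": 10,
    "main memory": 100,
    "solid state drive": 100_000,
    "spinning disk": 10_000_000,
}


def average_access_ns(served_from):
    """Weighted average access time for a mix of tiers.

    served_from maps a tier name to the fraction of requests answered
    from that tier; the fractions must sum to 1.
    """
    total = sum(served_from.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(LATENCY_NS[tier] * frac for tier, frac in served_from.items())


# Half of the requests fall through to spinning disk...
half_on_disk = average_access_ns({"main memory": 0.5, "spinning disk": 0.5})

# ...versus everything being answered from cache or main memory.
all_in_memory = average_access_ns({"on-chip cache": 0.5, "main memory": 0.5})

print(f"half on disk:  {half_on_disk:,.0f} ns on average")
print(f"all in memory: {all_in_memory:,.0f} ns on average")
```

Under these assumed figures the slowest tier dominates the average, which is why keeping the working set in cache and main memory matters so much.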
I then started to think about IBM's Flex System architecture. This environment can run POWER and/or x86 chips (note: POWER chips can process twice as many threads, and POWER has significantly more on-chip cache). This environment has access to plenty of main memory. It also has eight internal, on-the-compute-node solid state drives that can act as extended memory and that can accelerate applications that benefit from high IOPS (input/output operations per second) performance. Applications that perform extremely well within a Flex System environment include various data mining and database applications, multimedia streaming and video-on-demand, a wealth of financial services applications (that rely on results for quick decision making), surveillance and security applications (especially for real-time security checks against reference materials), and video rendering. I then asked myself: are Big Data applications appropriate for this environment?
This meant I had to venture away from memory into storage (storage feeds memory). IBM's PureSystems/Flex System architecture offers access to large amounts of internal storage (blades typically do not). IBM's Storwize V7000 storage array can be mounted within a Flex System environment and can thus speed access to data (no need for multiple hops). Additionally, PureSystems/Flex System offers compute nodes direct access to up to eight SSDs located within each compute node. These SSDs act like extended, fast memory, and are also positioned to provide 'hot data' rapidly to compute elements.
I then started thinking outside the box (literally, about external storage subsystems). IBM's storage offerings are particularly strong in the areas of tiering (placing the data used most often on fast disk for fast access), compression, and interoperability. But it is the tiering that interests me most because, yet again, it places hot data closer to the processor. And the closer that data is to the processor, the faster it can be processed...
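The tiering idea can be shown with a toy example. This is a minimal sketch, assuming a simple policy of promoting the most frequently read items to a small fast tier; the class and method names are hypothetical, and it does not describe how IBM's (or any vendor's) tiering software actually decides what to move.

```python
from collections import Counter


class TieredStore:
    """Toy two-tier store: a small 'fast' tier and an unbounded 'slow' tier."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.access_counts = Counter()
        self.fast = set()
        self.slow = set()

    def write(self, key):
        # New data lands on the slow tier until it proves itself 'hot'.
        if key not in self.fast:
            self.slow.add(key)

    def read(self, key):
        # Record the access and report which tier served it.
        if key not in self.fast and key not in self.slow:
            raise KeyError(key)
        self.access_counts[key] += 1
        return "fast" if key in self.fast else "slow"

    def rebalance(self):
        # Promote the most-read keys to the fast tier and demote the rest.
        # Ties are broken by key name so the result is deterministic.
        all_keys = self.fast | self.slow
        ranked = sorted(all_keys, key=lambda k: (-self.access_counts[k], k))
        self.fast = set(ranked[: self.fast_capacity])
        self.slow = all_keys - self.fast


store = TieredStore(fast_capacity=1)
for name in ("orders", "archive"):
    store.write(name)

for _ in range(3):
    store.read("orders")   # 'orders' is read often, so it becomes hot
store.read("archive")      # 'archive' is read once

store.rebalance()
print(store.read("orders"))   # now served from the fast tier
print(store.read("archive"))  # still served from the slow tier
```

Real tiering products work on blocks or extents rather than named items and rebalance continuously, but the principle is the same: track what is hot and keep it on the fastest media.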
I think we're soon going to see systems designed around in-memory database processing. Traditional blade architecture is not positioned to support very large memory (VLM) databases due to memory/footprint constraints. But other architectures such as traditional mainframes, Power Systems, and scale-up x86 designs are indeed well positioned for in-memory database processing.
Next week I'm starting a research report on systems designs and will discuss this topic in greater depth. But I would welcome any feedback and thoughts from readers of this article in the meantime. Please consider dropping me an e-mail or commenting on this article.
Posted: 18th March 2013 | By Philip Howard :
Yes, but there's always the question of how you use memory or SSDs. Oracle, for example, uses its Flash Cache very differently from the way that IBM does.
Posted: 18th March 2013 | By Pae MunKyu :
Yes, in-memory database vendors like Altibase are waiting for systems designed around in-memory databases.
Posted: 18th March 2013 | By Joe Clabby :
Phil is right. There are differences in how vendors are using solid state drives. This is why I'm expecting some big news in systems designs this year. My belief is that we're going to see a bunch of new Big Data configurations that feature large amounts of solid state and that use it like memory. I've actually started writing a systems design report that talks about converged systems, expert integrated systems and this new class of solid state systems.
Posted: 18th March 2013 | By Ranjit Nayak :
Tier 0 SSD cards such as the one from LSI / Cisco and EMC for the Cisco blade servers are addressing the data proximity. More details in this video -
Posted: 18th March 2013 | By Joe Clabby :
Thanks Pae. This is exactly why I started using this blog. I'm looking for as much field insight as I can get. Please keep the feedback coming.
Published by: IT Analysis Communications Ltd.