IBM has announced a new model in its PureData for Analytics (previously Netezza) range. The previously existing model is now designated the N1001 and the new model the N2001. The big difference between the two is that the N2001 offers significantly improved performance and efficiency.
Let's talk about performance first. To begin with, the N2001 uses faster components throughout: faster disks, faster FPGAs (field programmable gate arrays) and faster CPUs. That's only to be expected with a new generation of hardware. However, IBM has also amended the architecture. Historically, there was one disk for each FPGA and CPU, but in the N2001 that 1:1:1 ratio has changed to 2.5 disks per FPGA.
A fractional number of disks per FPGA may not be an intuitive concept, but it works like this: each FPGA engine reads a page from a disk as quickly as it can, and then it's ready for the next one. That is how you arrive at an aggregate figure of 2.5. The other question one might ask is why it would be a good idea to have more than one disk per FPGA in the first place. After all, the whole point of FPGAs is that, in effect, you can stream data directly from disk to the FPGA, thereby getting much greater throughput. If it is a good idea to have more than one disk per FPGA, why didn't Netezza do that previously? The answer is that FPGA technology has been, and still is, advancing more rapidly than disk technology, and the increased number of logic gates in the latest generation of FPGAs means that each one can handle more data even though the disks themselves are larger (now 600 GB). What this all means is that you get an effective scan rate (assuming 4x compression) of 128 GB/sec, which compares (very) favourably with competitive figures from the likes of Teradata and Oracle.
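To make the arithmetic behind those figures concrete, here is a minimal Python sketch. It uses only the numbers quoted above (128 GB/sec, 4x compression, 2.5 disks per FPGA). Two things in it are assumptions rather than anything IBM has stated: that the effective scan rate is simply the raw rate multiplied by the compression ratio, and the 5-disks-to-2-FPGAs grouping, which is just the smallest whole-number way of expressing the ratio and not a description of how the hardware is wired. The function names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the article.
EFFECTIVE_SCAN_GB_PER_SEC = 128   # effective scan rate, after compression
COMPRESSION_RATIO = 4             # the 4x compression the figure assumes
DISKS_PER_FPGA = Fraction(5, 2)   # 2.5 disks per FPGA


def raw_scan_rate(effective_gb_per_sec, compression_ratio):
    """Physical GB/sec read from disk, ASSUMING effective = raw x compression."""
    return effective_gb_per_sec / compression_ratio


def smallest_whole_grouping(disks_per_fpga):
    """Smallest whole-number (disks, fpgas) pair with the given ratio.

    Illustrative only: the article does not say how disks are grouped.
    """
    ratio = Fraction(disks_per_fpga)
    return ratio.numerator, ratio.denominator


raw = raw_scan_rate(EFFECTIVE_SCAN_GB_PER_SEC, COMPRESSION_RATIO)
disks, fpgas = smallest_whole_grouping(DISKS_PER_FPGA)

print(f"Raw scan rate implied by the quoted figure: {raw} GB/sec")
print(f"2.5 disks per FPGA is the same ratio as {disks} disks to {fpgas} FPGAs")
```

In other words, the headline scan rate depends heavily on the compression assumption: data that compresses less well than 4x would see a proportionally lower effective rate.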
In terms of configuration, the N2001 is available starting with half a rack, which has 96 TB of user capacity (assuming 4x compression), and scales up to 4 racks.
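As a rough guide to what that range implies, the following Python sketch extrapolates from the half-rack figure. The 96 TB and 4x numbers come from the text; the linear scaling with rack count is purely an assumption on my part, as are the function names, so treat the larger configurations as estimates rather than published specifications.

```python
HALF_RACK_USER_TB = 96    # user capacity of half a rack, from the article
COMPRESSION_RATIO = 4     # compression assumed by that capacity figure


def physical_tb(user_tb, compression_ratio=COMPRESSION_RATIO):
    """Physical space the user data occupies once compressed."""
    return user_tb / compression_ratio


def user_capacity_tb(racks, half_rack_tb=HALF_RACK_USER_TB):
    """User capacity for a number of racks, ASSUMING linear scaling."""
    return half_rack_tb * (racks / 0.5)


for racks in (0.5, 1, 2, 4):
    user = user_capacity_tb(racks)
    print(f"{racks:>3} rack(s): {user:6.0f} TB user capacity, "
          f"{physical_tb(user):5.0f} TB on disk")
```

As with the scan rate, the user capacity figure is only as good as the compression ratio you actually achieve on your own data.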
I mentioned earlier that this release is not just about improved performance (overall the company reckons 3x) but also about efficiency. Because there is more capacity per rack, less floor space is required, and power and cooling costs are reduced, both for that reason and more generally because the hardware itself is more efficient. So ongoing costs should compare favourably with competitors. Also, the number of spare disks has increased to 34 per rack, which should mean fewer callouts. RAID 1 mirroring is used.
One final point on the hardware side is that the N2001 will be available as the analytic accelerator on zSeries.
Going back to efficiency, there is also an improved software component being released, namely NZPortal 2.0. This provides a web-based environment for all administrative, monitoring and capacity planning purposes, with both new functions and improved usability.
Finally, it's worth thinking about in-memory and SSD technology. Competitors to Netezza will argue that in-memory technology will ultimately make its FPGA design obsolete. Well, that may be true but it isn't going to happen soon. In any case, I think it's safe to say that IBM has its eye on this particular ball so it's not something that I would worry about. In the meantime the N2001 looks impressive and should keep IBM/Netezza well ahead of the chasing pack in terms of price/performance.