By: Dana Gardner, Principal Analyst, Interarbor Solutions | Published: 27th March 2009 | Copyright Interarbor Solutions © 2009
Greenplum has extended massively parallel processing (MPP) of data with the introduction this week
of its "MPP Scatter/Gather Streaming" (SG Streaming) technology, which
manages the flow of data into all nodes of the database, eliminating
the traditional bottlenecks of massive data loading.
The San Mateo, Calif., company, which provides large-scale analytics and data warehousing, says SG Streaming has allowed customers to achieve production-loading speeds of over four terabytes per hour with negligible impact on concurrent database operations. [Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.]
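To put the quoted figure in perspective, here is a back-of-the-envelope conversion. It assumes decimal terabytes (the article does not say which unit is meant), and the 40-node cluster in the usage note is purely hypothetical, not a Greenplum configuration.

```python
# Rough conversion of a load rate quoted in terabytes per hour.
# Assumption: 1 TB = 10**12 bytes (decimal); the article does not specify.
BYTES_PER_TB = 10**12
SECONDS_PER_HOUR = 3600


def bytes_per_second(terabytes_per_hour: float) -> float:
    """Aggregate load rate in bytes per second."""
    return terabytes_per_hour * BYTES_PER_TB / SECONDS_PER_HOUR


def bytes_per_second_per_node(terabytes_per_hour: float, nodes: int) -> float:
    """Share of the aggregate rate each node absorbs if load is spread evenly."""
    return bytes_per_second(terabytes_per_hour) / nodes
```

Under these assumptions, `bytes_per_second(4)` works out to roughly 1.1 GB per second in aggregate, and `bytes_per_second_per_node(4, 40)` to roughly 28 MB per second per node, which suggests why spreading the load across every node matters more than speeding up any single channel.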
Under
the "parallel everywhere" approach to loading, data flows from one or
more source systems to every node of the database without any
sequential choke points. This differs from traditional "bulk loading"
technologies, used by most mainstream database and parallel-processing
appliance vendors, which push data from a single source, often over a
single channel or a small number of parallel channels, and result in
fundamental bottlenecks and ever-increasing load times.
The new technology
"scatters" data from all source systems across hundreds or thousands of
parallel streams that simultaneously flow to all nodes of the database.
Performance scales with the number of nodes, and the technology
supports both large batch and continuous near-real-time loading
patterns with negligible impact on concurrent database operations.
Data
can be transformed and processed in-flight, utilizing all nodes of the
database in parallel, for extremely high-performance extract-load-transform (ELT) and
extract-transform-load-transform (ETLT) loading pipelines. Final
"gathering" and storage of data to disk takes place on all nodes
simultaneously, with data automatically partitioned across nodes and
optionally compressed.
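The scatter, in-flight transform, and gather steps described above can be illustrated with a toy sketch. This is not Greenplum's implementation or API: the names `node_for`, `scatter`, `gather`, and `load`, the CRC32-based partitioning, and the four-node cluster are all assumptions made for illustration. In this simplified version the scatter step runs serially in one process, and only the per-node gather work runs concurrently (on threads standing in for database nodes).

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical cluster size


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Pick a node with a stable hash so a given key always lands on the same node."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def transform(row: dict) -> dict:
    """Example in-flight transform: tidy one field while the row is in transit."""
    return {**row, "name": row["name"].strip().upper()}


def scatter(sources, num_nodes: int = NUM_NODES) -> dict:
    """Route every row from every source system into a per-node stream."""
    streams = defaultdict(list)
    for source in sources:
        for row in source:
            streams[node_for(row["id"], num_nodes)].append(row)
    return streams


def gather(node_id: int, rows: list, compress: bool = False):
    """Transform one node's rows and serialize them, optionally compressed."""
    transformed = [transform(r) for r in rows]
    payload = "\n".join(f'{r["id"]},{r["name"]}' for r in transformed).encode("utf-8")
    if compress:
        payload = zlib.compress(payload)
    return node_id, len(transformed), payload


def load(sources, num_nodes: int = NUM_NODES, compress: bool = False) -> dict:
    """Scatter rows to node streams, then gather on all nodes concurrently."""
    streams = scatter(sources, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        futures = [pool.submit(gather, n, rows, compress) for n, rows in streams.items()]
        results = [f.result() for f in futures]
    return {node_id: (count, payload) for node_id, count, payload in results}
```

Calling `load([[{"id": "a1", "name": " ann "}], [{"id": "b2", "name": "bob"}]])` returns a dictionary keyed by node number, each entry holding the row count and the serialized bytes that node would write to disk. The point of the sketch is the shape of the pipeline: no single channel carries all the data, and transformation and storage work is divided among the nodes rather than performed at one choke point.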
It was just six months ago that Greenplum publicly unveiled how it wrapped MapReduce approaches into the newest version of its data solution. That advance allowed users to combine SQL queries and MapReduce programs into unified tasks executed in parallel across thousands of cores.
Published by: IT Analysis Communications Ltd.