IT-Director.com
Analysis

Greenplum aims to eliminate massive data load 'choke points' with Scatter/Gather technology

By: Dana Gardner, Principal Analyst, Interarbor Solutions
Published: 27th March 2009
Copyright Interarbor Solutions © 2009

Greenplum has extended massively parallel processing (MPP) of data with the introduction this week of its "MPP Scatter/Gather Streaming" (SG Streaming) technology, which manages the flow of data into all nodes of the database and eliminates the traditional bottlenecks of massive data loading.

The San Mateo, Calif., company, which provides large-scale analytics and data warehousing, says SG Streaming has allowed customers to achieve production loading speeds of over four terabytes per hour with negligible impact on concurrent database operations. [Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.]

Under the "parallel everywhere" approach to loading, data flows from one or more source systems to every node of the database without any sequential choke points. This differs from traditional "bulk loading" technologies, used by most mainstream database and parallel-processing appliance vendors, which push data from a single source, often over a single channel or a small number of parallel channels, resulting in fundamental bottlenecks and ever-increasing load times.

The new technology "scatters" data from all source systems across hundreds or thousands of parallel streams that simultaneously flow to all nodes of the database. Performance scales with the number of nodes, and the technology supports both large batch and continuous near-real-time loading patterns with negligible impact on concurrent database operations.
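The "scatter" step can be pictured with a small sketch. This is not Greenplum's implementation; it is a minimal illustration that assumes rows are (key, value) pairs and that the target node is chosen by hashing a distribution key. The names `node_for` and `scatter` are hypothetical, and threads stand in for network streams.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def node_for(key: str, num_nodes: int) -> int:
    """Pick a target node with a stable hash of the distribution key (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def scatter(sources, num_nodes):
    """Route rows from every source toward every node, one stream per source."""

    def stream(rows):
        # Each stream builds its own per-node batches, so no stream waits
        # on a shared coordinator while it routes rows.
        batches = [[] for _ in range(num_nodes)]
        for key, value in rows:
            batches[node_for(key, num_nodes)].append((key, value))
        return batches

    # Run all source streams concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        per_stream = list(pool.map(stream, sources))

    # Combine the batches into each node's inbox. In a real cluster each
    # node would receive its batches directly; this merge is only for the demo.
    nodes = [[] for _ in range(num_nodes)]
    for batches in per_stream:
        for i, batch in enumerate(batches):
            nodes[i].extend(batch)
    return nodes
```

Calling `scatter([[("a", 1), ("b", 2)], [("c", 3), ("a", 4)]], 4)` returns four node inboxes, and both rows keyed "a" land on the same node because the hash is stable across streams.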

Data can be transformed and processed in-flight, utilizing all nodes of the database in parallel, for extremely high-performance extract-load-transform (ELT) and extract-transform-load-transform (ETLT) loading pipelines. Final 'gathering' and storage of data to disk takes place on all nodes simultaneously, with data automatically partitioned across nodes and optionally compressed.
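The in-flight transform and the final "gather" step can be sketched in the same spirit. Again, this is an illustration under stated assumptions rather than the vendor's method: rows are dictionaries, the transform (normalizing a hypothetical `region` field) is invented for the example, and "storage" is simply a byte string per node, optionally compressed with zlib.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor


def transform(row: dict) -> dict:
    """Hypothetical in-flight transform: normalize the 'region' field."""
    return {**row, "region": row["region"].strip().upper()}


def gather(node_rows, compress=True):
    """Transform one node's rows and return the bytes that node would store."""
    payload = json.dumps([transform(r) for r in node_rows]).encode("utf-8")
    return zlib.compress(payload) if compress else payload


def load_all(nodes, compress=True):
    """Run the transform-and-store step on every node at the same time."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return list(pool.map(lambda rows: gather(rows, compress), nodes))


def read_back(blob, compressed=True):
    """Decode a stored blob back into rows, for checking the result."""
    data = zlib.decompress(blob) if compressed else blob
    return json.loads(data.decode("utf-8"))
```

Because each node transforms and stores only its own partition, adding nodes adds transform capacity as well as storage, which is the property the ELT and ETLT pipelines described above rely on.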

It was just six months ago that Greenplum publicly unveiled how it wrapped MapReduce approaches into the newest version of its data solution. That advance allowed users to combine SQL queries and MapReduce programs into unified tasks executed in parallel across thousands of cores.

Published by: IT Analysis Communications Ltd.