By: David Norris, Practice Leader - Analytics, Bloor Research
Published: 25th November 2011
Copyright Bloor Research © 2011
The biggest barrier that I see to the widespread adoption of Big Data is the skills required to deliver the benefits that we all agree can be obtained. In the standard MIS layers of the BI tool suites (Reporting, Dashboards, and OLAP) we are seeing an increasing emphasis on what is being labelled Agile BI: tool sets that offer the same power as the traditional tools, but which cost less, are easier to use, are targeted at the business user and not the IT professional, are far more visual in how they are controlled and what they output, and which deliver a step-change in productivity. But in the area in which Big Data offers the biggest potential return, that of data mining, the application of statistical and mathematical modelling to identify patterns of significance, there has been no comparable change, until now. Alpine Miner is the first offering I have seen that clearly addresses the challenges of the scale and affordability of exploiting Big Data.
I have, for a long time, been a big fan of KXEN as an alternative to SPSS or SAS for those businesses that do not have the skills required to really make the most of the considerable power of the established market leaders. It is easier to deploy, and its results are easier to understand if you are not a statistician, whilst it still delivers models of comparable statistical validity. But all of those technologies are, at present, going to struggle to cope economically with the scale of the data when it comes to Big Data. This is where Alpine Data Labs provide the first sight of a next generation of data mining solution, one which copes with the scale of Big Data, but is still affordable, and is designed to be used by people in the business world and not just statisticians.
Alpine Data Labs are a spin-off from Greenplum (created just prior to the EMC acquisition of Greenplum last year). Their primary product, Alpine Miner, is a data mining and analytics platform meant to leverage the processing capabilities of MPP databases like Greenplum and Oracle's Exadata. Alpine is headquartered in San Mateo, California, with a sizeable development shop in Beijing. They have over 15 early-adopter customers in both the US and China, and over 500 evaluation downloads have already been taken, so there is a lot of interest and the company is showing very solid growth based on quality opportunities.
This is a disruptive technology, aiming to put advanced analytics into the hands of people capable of using it to change the business landscape. The user interface is a drag-and-drop GUI, and the technology is designed to be cost effective, so there are few reasons at the business level not to invest. The promise of Big Data will only be achieved when the ROI is compelling. If the benefits can only be obtained by the deployment of very expensive technology, by very expensive consultants, operating over elongated time periods, businesses will not bother.
So let's see how Alpine Data address these challenges. Firstly, we should note that Alpine Miner is very much a work in progress. The company has a clear and compelling vision of where it is heading, and has already established the fundamental building blocks. The first challenge is one of scalability. This they are confident they have addressed, and they are well on the way to handling any size of data on any platform. Next, they want to provide a platform that is well integrated, so that the modelling does not just provide insight; that insight must be readily actionable, so that it drives business improvement. Again, this is there already but will probably just get better and better over time. The third point they want to address is making data mining a participative technology, so that it is embedded into the business decision-making process and used naturally within the business to aid effective business management. This model is coming in 2012 with Alpine Miner v3.0. Finally, they see some sort of SaaS offering down the road.
The key to much of what Alpine delivers is that they are embedding the computation into the data, and not moving data to the tool. Alpine Miner is an analytics engine that connects directly to Greenplum, PostgreSQL and Exadata, with offerings for Netezza and Hadoop on the roadmap. Alpine runs all of the transformations, calculations, and analytic processes directly within the database itself, thus eliminating the need to extract data out of the database and send it off to another (smaller) analytic server for processing. On the client side, a PC or Mac controls things through a point-and-click GUI, with no arcane statistical notation to navigate. This "in-database" approach leverages the MPP capabilities of the appliance, and eliminates many of the constraints on scalability and integration seen with traditional data mining tools. The tool is designed to be used easily by BI analysts and should be a natural extension of their BI toolkits.
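To make the contrast concrete, here is a minimal sketch of the two patterns in Python. It is not Alpine Miner's API: it uses SQLite from the standard library as a stand-in for an MPP database, and the `purchases` table, its columns, and the function names are all hypothetical, chosen only for illustration. The first function pulls every row out to the client and computes there; the second sends the computation to the database as SQL, so only the small result set travels back.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (a stand-in for an MPP appliance)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (customer_id INTEGER, amount REAL)")
    rows = [(1, 10.0), (1, 30.0), (2, 5.0), (2, 15.0), (2, 40.0), (3, 100.0)]
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    conn.commit()
    return conn


def avg_spend_client_side(conn):
    """Traditional pattern: extract every row, then compute on the client."""
    totals = {}
    cursor = conn.execute("SELECT customer_id, amount FROM purchases")
    for customer_id, amount in cursor:
        total, count = totals.get(customer_id, (0.0, 0))
        totals[customer_id] = (total + amount, count + 1)
    return {cid: total / count for cid, (total, count) in totals.items()}


def avg_spend_in_database(conn):
    """In-database pattern: the database does the work, one row per customer returns."""
    query = "SELECT customer_id, AVG(amount) FROM purchases GROUP BY customer_id"
    return dict(conn.execute(query).fetchall())


conn = build_demo_db()
client_result = avg_spend_client_side(conn)
db_result = avg_spend_in_database(conn)
```

Both functions return the same averages. The difference is where the work happens: with six rows it is invisible, but at Big Data scale the first pattern moves the whole table across the network, while the second moves only the aggregated result.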
This model is capable of revolutionising how analytics are deployed. The slow, expensive model of the past is being replaced by a quick-to-deploy, rapid-time-to-results, affordable alternative tailored to the needs of the business user. None of this changes what we do, just how we achieve it. We are still looking to understand the customer and what they value, and then how to market to them the things that they will value the most, but the cost of doing that has been changing dramatically, making the ROI really compelling.
Alpine Miner is a really exciting offering, which makes the promise of Big Data analytics more of a reality, to a broader audience than has been true before. I suggest you track their progress.
Posted: 25th November 2011 | By Jim Kaskade :
I suppose folks contributing to Mahout (scalable machine learning and data mining open source project for BIG DATA) would like to think they are getting close to "real". But it's limited without broad application across all SQL and NOSQL data stores.
Let's not forget those who have been in the data mining and statistical solution category for decades, aka SAS, and have already ported their solutions to the Hadoop framework with connectors to all the usual RDBMS suspects. But again, one will argue that you still need to extract data into their proprietary data stores.
I'm guessing that "real" means others may not be leveraging in-database analytics to help accelerate algorithm training and scoring as well as Alpine Data.
Posted: 28th November 2011 | By David Norris :
By "real" I am implying that whilst the traditional data mining vendors can tackle big data, I am not yet convinced that they can do so at a price that is affordable to the majority in the business community. I do not think they cannot tackle this issue; it is just that Alpine Data Labs, in my opinion, hold a lead on affordability matched to functionality at present.
The messages above were all contributed by IT-Director.com readers. Whilst we take care to remove any posts deemed inappropriate, we can take no responsibility for these comments. If you would like a comment removed please contact our editorial team.
Published by: IT Analysis Communications Ltd.