I recently attended an update briefing from SAP about its BusinessObjects Predictive Analysis offering. Not a lot has changed since the tool’s general availability in November of last year, but the briefing did underline the company’s commitment to, and investment in, advanced analytics. While some might argue that this investment comes too late, given that heavyweights such as SAS and IBM SPSS have been doing a lot of this 'stuff' for decades, I would urge caution over writing off SAP’s Predictive Analysis efforts too soon. In fact, the company is using its late entry to its advantage by trying to overcome some of the common hurdles that have so far kept predictive analytics out of the mainstream, in particular barriers around ease of use and predictive modelling performance.
Predictive analytics remains a red hot opportunity for vendors
The need for organisations to get smarter with their data and become more forward looking is driving interest in, and uptake of, predictive analytics. Traditionally it has been a popular way of helping organisations predict customer behaviour, spot opportunities ahead of the competition, respond to changing business conditions, and understand and detect exposure to risk. While these business opportunities and challenges have always been the preserve of advanced analytics, a gradual maturing of analytics practices, coupled with the availability (and in some cases increasing affordability) of big data technology, is now accelerating that interest. SAP’s Predictive Analysis offering, now on version 1.0.7, aims to tap this market opportunity.
Targeting business users with big data problems
Predictive Analysis is a statistical analysis and data mining tool aimed at the general business user. It combines native functionality, support for the open source R statistical analysis language, and in-memory data mining capabilities, courtesy of HANA, for handling large data volumes (see my recent blog on HANA here and link through to download a free 3-page report too).
In contrast with more established tools, SAP is not specifically targeting this offering at data scientists, statisticians or data engineers; instead it is making a play for the data-savvy business analyst. It aims to do this by leveraging Visual Intelligence, its data visualisation tool, which Predictive Analysis uses for data acquisition and manipulation. In fact, Predictive Analysis is best described as a superset of Visual Intelligence, as it uses the same code base, although the full version of Visual Intelligence is not available as part of Predictive Analysis. What it does provide, however, is a visual UI for common tasks such as loading data, transforming it and analysing it using techniques such as scatter plot matrices, cluster graphs and decision trees. Having had a quick demo of the tool, my overall impression is that while it’s not the most advanced data visualisation tool on the market, the company has given due consideration to how to apply algorithms to data visually, in a way that enables users to build simple models and interact with the results more intuitively.
Talking of algorithms, the tool supports a growing number of them. A selection is available natively within Predictive Analysis, including time series, outlier detection and trend analysis. For those that want to extend these capabilities, support can be supplemented in two other ways. Firstly, the offering can tap into the popularity and flexibility of the open source R language and use it to support other algorithms, including decision trees, K-Means and neural networks. Secondly, users can tap into HANA’s Predictive Analysis Library (PAL) algorithms to perform in-database analytics.
At a high level the range of advanced analytics supported as part of Predictive Analysis looks comprehensive. But there are a couple of caveats to all of this. Firstly, it’s not totally clear how you decide on, or are guided towards, a particular algorithm library given that there’s a choice of three. For example, how do end-user organisations know when the analytics support provided within Predictive Analysis might run out of steam and necessitate moving out of the visualisation environment? Secondly, from what I can gather, R appears to be the backstop for any algorithm shortcomings found within Predictive Analysis or PAL (for example, PAL doesn’t currently support logistic regression). In this latter scenario an R package can instead be called from a HANA SQLScript and executed within its in-memory environment; equally, it does appear from some of the blogs on SAP’s community network that the company plans to include a wrapper as a way of incorporating a greater selection of R packages into the tool.
Integration with HANA is high up on the agenda
SAP is heavily pushing the integration of Predictive Analysis with HANA, its in-memory platform, although the two are not actually dependent on each other. While it’s no surprise that SAP has put HANA at the forefront of its analytic developments, especially given its strategic importance, the ability to hook the two up does provide some advantages. Firstly, as models are created and trained within the database, the developer or statistician doesn’t need to extract data into a separate environment for this purpose, lessening data movement and duplication. It’s good to see SAP provide this level of support given that all the other major BI and analytics vendors already offer this capability.
Similarly, bringing analytic processing into HANA’s in-memory engine means it can use HANA’s performance and scalability to improve the model development process, as more model build iterations can be completed in a shorter time. Likewise, the ability to run algorithms on a full set of data as supported through HANA, rather than just a sample, promises to improve the accuracy of results as well as the quality of the models produced. All in all this is where we believe SAP’s predictive analytics strategy is heading: the ability to perform analytics on large volumes of data in-memory using a visual front-end is a sweet spot that many vendors are aiming for. The only question remaining is: what will all this cost? We don’t imagine it comes cheaply.
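As a footnote to the point about running algorithms on the full data set rather than a sample, here is a minimal Python sketch of why sample size matters. It has nothing to do with SAP's, HANA's or PAL's actual APIs: the data is synthetic, and the function name `mean_abs_sampling_error` and all the figures are assumptions chosen purely for illustration. It simply estimates an average from samples of different sizes and measures how far those estimates drift from the value computed on the complete data.

```python
import random
import statistics


def mean_abs_sampling_error(population, sample_size, trials, seed=0):
    """Average absolute gap between a sample mean and the full-data mean.

    Hypothetical helper for illustration only; not part of any SAP product.
    """
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        # Draw a random sample without replacement and compare its mean
        # with the mean computed over every record.
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)


# Synthetic "customer spend" figures (assumed values, not real data).
data_rng = random.Random(42)
population = [data_rng.gauss(100.0, 20.0) for _ in range(20_000)]

small = mean_abs_sampling_error(population, 100, trials=200)
large = mean_abs_sampling_error(population, 5_000, trials=200)
full = mean_abs_sampling_error(population, len(population), trials=1)

print(f"error with 100-row samples:   {small:.3f}")
print(f"error with 5,000-row samples: {large:.3f}")
print(f"error using every row:        {full:.3f}")
```

The error shrinks as the sample grows and disappears when every row is used, which is the statistical argument behind in-memory, full-data processing. Whether that gain justifies the cost is a separate question.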