I just returned from the 6th annual Text Analytics Summit in
Boston. It was an enjoyable conference, as usual. Larger players
such as SAP and IBM had booths at the show alongside pure-play
vendors Clarabridge, Attensity, Lexalytics, and Provalis
Research. This was good to see, and it underscores the fact that
platform players acknowledge text analytics as an important piece
of the information management story. Additionally, more analysts
were at the conference this year, another sign that the text
analytics market is becoming more mainstream. Most importantly,
a variety of end users attended, and they were looking at using
text analytics for different applications (more about that in a
second).
Since a large part of the text analytics market is currently
being driven by social media and voice of the customer/customer
experience management related applications, there was a lot of
talk about this topic, as expected. Even so, some universal,
application-agnostic themes emerged. Interesting nuggets
include:
-
The value of quantifying success. I found it
encouraging that a number of the talks addressed a topic near
and dear to my heart: quantifying the value of a technology.
For example, the IBM folks, when describing their Voice of the
Customer solution, specifically laid out attributes that could
be used to quantify success for call center related
applications (e.g. handle time per agent, first call
resolution). The user panel in the Clarabridge presentation
actually focused part of the discussion on how companies
measure the value of text analytics for Customer Experience
Management. Panelists discussed replacing manual processes,
identifying the proper issue, and other attributes (some easy
to quantify, some not so easy to quantify). Daniel Ziv, from
Verint, even cited some work from Forrester that tries to
measure the value of loyalty in his presentation on the future
of interaction analytics.
-
Data Integration. On the technology panel, all
of the participants (Lexalytics, IBM, SPSS/IBM, Clarabridge,
Attensity) were quick to point out that while social media is
an important source of data, it is not the only source. In many
instances, it is important to integrate this data with internal
data to get the best read on a problem/customer/etc. This is
obvious but underscores two points. First, these vendors need
to differentiate themselves from the 150+ listening posts and
social media analysis SaaS vendors that exclusively utilize
social media and are clouding the market. Second, integrating
data from multiple sources is a must-have for many companies.
In fact, there was a whole panel discussion on data quality
issues in text analytics. The structured data world has been
dealing with quality and integration issues for years. On the
unstructured side, aside from companies dealing with the quality
of data in ECM systems, this is still an area that needs to be
addressed.
-
Home Grown. I found it interesting that at
least one presentation and several end-users I spoke to stated
that they have built or will build homegrown solutions. Why? One
reason was that a little could go a long way. For example,
Gerand Britton, from Constantine Cannon LLP, explained that the
biggest bang for the buck in eDiscovery was performing
near-duplicate clustering of documents. This means putting
functionality in place that can recognize that an email and a
reply that merely acknowledges receipt of it are essentially the
same document, so that a cluster like this is reviewed by one
person rather than two or three. To put this together, the
company used some SPSS technology and homegrown functionality.
Another reason for going homegrown is that companies feel their
problem is unique. A number of attendees I spoke to mentioned
that they had either built their own tools or that their problem
would require too much customization, and that they could hire
people from universities to help build specific algorithms.
-
Growing Pains. There was a lot of discussion
on two topics related to this. First, a number of companies and
attendees spoke about a new "class" of knowledge worker. As
companies move away from manually coding documents to
automating extraction of concepts, entities, etc., the kind of
analysis that will be needed to derive insight will no doubt be
different. What will this person look like? Second, a number of
discussions sprang up around how vendors are being given a hard
time about figures such as 85% accuracy in classifying, for
example, sentiment. One hypothesis given for this was that it
is a lot easier to read comments and decide what the sentiment
should be than to read the output of a statistical analysis.
-
Feature vs. Solution? Text
analytics is being used in many, many ways. These range from
building full-blown solutions around problem areas that require
the technology to embedding it as part of a search engine or
URL shortener. Most people agreed that the functionality would
become more pervasive as time goes on. People will ultimately
use applications that deploy the technology and not even know
that it is there. And, I believe, it is quite possible that
many of the customer voice/customer experience solutions will
simply become part of the broader CRM landscape over time.
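To make the near-duplicate clustering idea from the Home Grown item a bit more concrete, here is a minimal sketch in Python. It is an illustration only, not the approach Constantine Cannon or SPSS used, which the talk did not spell out in detail. I am assuming a simple technique: break each document into overlapping three-word "shingles", compare documents by Jaccard similarity, and group any pair above a threshold. The function names, the shingle size, and the 0.6 threshold are all my own choices for the example.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short documents become a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_near_duplicates(docs, threshold=0.6, k=3):
    """Group document indices whose shingle sets are similar enough.

    The threshold and k are illustrative values, not tuned ones.
    """
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    sets = [shingles(d, k) for d in docs]
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


docs = [
    "Please review the attached merger agreement and send your comments by Friday.",
    "Got it, thanks. Please review the attached merger agreement and send your comments by Friday.",
    "Lunch menu for the cafeteria next week is now posted on the intranet.",
]
print(cluster_near_duplicates(docs))
```

Here the original email and the "got it" reply that quotes it land in one cluster, while the unrelated message stays on its own, so a single reviewer could handle the first two together. Comparing every pair of documents like this would not scale to a real eDiscovery collection; a production system would more likely rely on something like MinHash with locality-sensitive hashing, but the underlying idea is the same.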
I felt that the most interesting presentation of the Summit was a
panel discussion on the semantic web. I am going to write about
that conversation separately and will post it in the next few
days.