Neo4j

By: Philip Howard, Research Director - Data Management, Bloor Research
Published: 11th May 2012
Copyright Bloor Research © 2012

This is the third in my series of articles about graph databases. Here I am going to highlight Neo4j from Neo Technology, but first a further discussion of the use of graph databases versus (other) NoSQL approaches.

A graph database has a different storage paradigm from, say, Hadoop, typically storing data in triples: a subject entity, a relationship and an object entity. So, for example, you could have "Philip Howard" "is bridge partner of" "Dave", where Dave and I are both nodes in the graph and "is bridge partner of" is the edge or relationship.

Now you could store this information as key-value pairs in something like Hadoop: name-value, relationship-value and name-value. But suppose you now want to add that I am also a partner of Wendy, and that Wendy also plays with Dave; that when I play with Dave we play a Precision Club system, when I play with Wendy we play 2 over 1 game forcing, and when Dave plays with Wendy they play Acol. This all gets a lot more complicated, and it becomes easier to represent this sort of information in a graph, not least because graph databases allow entities and relationships to be qualified. So Precision Club, 2 over 1 game forcing and Acol (all of which, for the uninitiated, are bidding systems) could all be qualifiers to the relationship, and you could further add our rankings (life master and so forth) as attributes of each of us. It would be incredibly difficult and time-consuming to manage this sort of environment, and then run queries against it, using a key-value store (and, in any case, Hadoop doesn't support ad hoc queries) or, for that matter, a traditional data warehouse, because the essential element that you are interested in is the relationships rather than the entities.
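To make the bridge example concrete, here is a minimal sketch of the same data as a graph with qualified relationships. It is plain in-memory Python, not Neo4j's API; the class names, the BRIDGE_PARTNER_OF label, the "system" and "ranking" keys and the helper function are all assumptions made for illustration, and the ranking values are placeholders rather than real data.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity in the graph, with optional attributes."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed edge between two nodes, with optional qualifiers."""
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes; the "ranking" values are illustrative placeholders only.
philip = Node("Philip Howard", {"ranking": "placeholder"})
dave = Node("Dave", {"ranking": "placeholder"})
wendy = Node("Wendy", {"ranking": "placeholder"})

# Each partnership is one relationship, qualified by its bidding system.
relationships = [
    Relationship(philip, "BRIDGE_PARTNER_OF", dave, {"system": "Precision Club"}),
    Relationship(philip, "BRIDGE_PARTNER_OF", wendy, {"system": "2 over 1 game forcing"}),
    Relationship(dave, "BRIDGE_PARTNER_OF", wendy, {"system": "Acol"}),
]


def partnerships(player_name):
    """Return (partner name, bidding system) pairs, treating edges as undirected."""
    result = []
    for rel in relationships:
        if rel.rel_type != "BRIDGE_PARTNER_OF":
            continue
        if rel.start.name == player_name:
            result.append((rel.end.name, rel.properties["system"]))
        elif rel.end.name == player_name:
            result.append((rel.start.name, rel.properties["system"]))
    return result


for partner, system in partnerships("Wendy"):
    print(f"Wendy plays {system} with {partner}")
```

The point of the sketch is that the bidding system lives on the relationship itself, so a question such as "who does Wendy play with, and what do they play?" is a walk over edges rather than a reassembly of separate key-value entries.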

Moving away from Bridge, the same applies to social network analysis of other types, in retail for example. Moreover, relationships don't have to be between people and people or even people and things: things can also have relationships to one another. For example, any network management environment, whether it is pipelines or traffic or an IT network or the Cloud, essentially consists of things and relationships so it might make sense to build relevant applications in these areas on top of a graph database. This would include such things as SIEM (security information and event management) where a graph-based approach might make a lot more sense than the file-based systems that typify the SIEM market. Other potential markets include bioinformatics, medicine, capital markets and, of course, security services and agencies.
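As a rough illustration of "things and relationships" in a network management setting, the sketch below models a small IT network as a graph and walks it to find everything downstream of a failed device. The device names, the "depends on" reading of the edges and the function name are hypothetical; a graph database would run this kind of traversal natively rather than in application code.

```python
from collections import deque

# Hypothetical network: each key is a device, each value lists the
# devices that depend on it directly.
dependents = {
    "core-router": ["switch-a", "switch-b"],
    "switch-a": ["web-server", "db-server"],
    "switch-b": ["mail-server"],
    "web-server": [],
    "db-server": [],
    "mail-server": [],
}


def affected_by(failed, graph):
    """Breadth-first traversal: every device downstream of the failed one."""
    seen = {failed}
    order = []
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


print(affected_by("switch-a", dependents))
```

Here the question being asked ("what is affected if this fails?") is entirely about relationships, which is why this class of application is a natural fit for a graph model.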

So, to talk about Neo4j: this is probably (almost certainly) the leading and best-known commercial product in the graph database market. It has implementations at Adobe, Cisco, Deutsche Telekom, Viadeo, Comparex and the Telenor Group, amongst others, and the applications for which these companies are using Neo4j are diverse. As I have previously mentioned, Neo4j is suitable not only for running queries against relationships but also for transaction processing: it is ACID compliant and supports XA-compliant two-phase commit.

One particularly interesting use case at one of its customers is for master data management. And the reason is interesting: the company in question had previously implemented MDM on Oracle RAC, but the number of relationships and hierarchies that had to be managed was so large and complex that, in the words of the CIO, "performance was killing us". Hence the move to an environment that was designed to understand and manage relationships, as opposed to the relational database world which, despite its name, was not designed with that in mind. Well it was, but very much on a one-to-one-to-one basis rather than for the many-to-many environment that reflects the real world. Indeed, a relational database is very good for transaction processing where the queries are understood in advance, so that you can pre-determine the types of queries and design the tables to support them. However, where the queries are not predictable, the schema-free approach provided by graph databases can bring significant benefit.

It is also worth pointing out that while Neo4j is not a clustered solution (at present at least), the company does offer a high availability option whereby you can have a second server acting as backup. This is an active-active arrangement, so that during normal operations you can run transactions on one system and queries on the other, thereby providing a genuine hybrid approach. Because this is a scale-up rather than a scale-out approach, you neither have nor need MapReduce. You can, nevertheless, use various common languages for programming, just as with a conventional database. Thus Neo4j doesn't just support Java (as the j implies) but also Python, Perl, Ruby and so on, as well as SPARQL.

To conclude, astute readers will have gathered that I think graph databases are very interesting. I cannot say that, at this stage, I have looked at a wide enough range of graph database products to say definitively that Neo4j should be recommended over the alternatives, but it is a good place to start, especially for transactional and hybrid environments where relationships are key to your application.

Published by: IT Analysis Communications Ltd.
T: +44 (0)190 888 0760 | F: +44 (0)190 888 0761