Do we need the semantic web?

By: Dr Fern Halper, Partner, Hurwitz & Associates
Published: 8th June 2010
Copyright Hurwitz & Associates © 2010

What kinds of applications do we need a semantic web for? Is the semantic web practical? These questions (among others) were posed by Jamie Taylor of Metaweb Technologies to a group of panelists at the Text Analytics Summit last week. The panelists were no lightweights. They included Vladimir Zelevinsky from Endeca, Ron Kaplan from Microsoft, and Kathleen Dahlgren from Cognition. I found this to be one of the most engaging segments of the Summit.

First of all, many people define the semantic web as a "web of meaning" or a "web of data" that will allow computer applications to exploit the data directly. Check out the W3C webpage for more information about definitions. The panelists at the Summit got into an interesting discussion about parsing data sources for the semantic web. Here are a few of the highlights. Please note that I asked some additional questions after the panel itself, so if you are reading information you didn't hear on the panel, that is the reason.

  • What kind of applications is the Semantic Web good for? It depends on what you want to know. For example, one of the panelists pointed out that you don't need the semantic web to find a hardware store in Boston. However, more unusual queries might require it. Most people have had the experience of knowing what they are looking for, using a five- or six-word query, and still not finding it. The panelists pointed out that entities (people, places, things) are relatively easy to extract; it is the relationships between the entities that are harder. Vladimir Zelevinsky explained it like this, mapping each information retrieval need to an information retrieval technology:
    • Known Item Search -> Keyword Search (e.g., Google, where you need to find what you know exists);
    • Unknown Item Search -> Guided Navigation (e.g., faceted search, where you need to explore the data space);
    • Unknown Relationship Search -> Semantic Web (where you are looking not for separate items in the repository, in this case the web, but for the connection(s) between them).

The semantic web could pay off in applications that require understanding the relationships between these entities. Ron Kaplan also noted that semantic web technology provides a standard way of merging data from different sources, and that will probably enable some useful new applications.
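
To make the "unknown relationship search" idea concrete, here is a minimal sketch in plain Python. It is not anything the panelists showed: the entity names, the predicates, and the helper functions `merge` and `find_connection` are all hypothetical, and facts are stored as simple (subject, predicate, object) triples rather than in a real semantic web toolkit. It illustrates two points from the discussion: data from different sources merges trivially when it shares one format, and the search target is the chain of relationships between two entities rather than the entities themselves.

```python
from collections import deque

# Hypothetical facts from two independent sources,
# each stored as (subject, predicate, object) triples.
source_a = {
    ("Alice", "works_for", "Acme"),
    ("Acme", "located_in", "Boston"),
}
source_b = {
    ("Bob", "supplies", "Acme"),
    ("Bob", "lives_in", "Chicago"),
}


def merge(*sources):
    """Merge sources that share the triple format: a plain set union."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def find_connection(triples, start, goal):
    """Breadth-first search for a shortest chain of triples linking two entities.

    Triples are followed in both directions. Returns the list of triples
    on the chain, or None if the entities are not connected.
    """
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((o, (s, p, o)))
        neighbours.setdefault(o, []).append((s, (s, p, o)))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None


graph = merge(source_a, source_b)
connection = find_connection(graph, "Alice", "Bob")
print(connection)
```

In this toy data, neither source alone links Alice to Bob; the connection through Acme only appears once the two sources are merged.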

  • Scaling the semantic web. Everyone seemed to agree that manually tagging documents is a brittle exercise. Vladimir Zelevinsky, from Endeca, suggested putting a parser on each machine. He said that since people type more slowly than one sentence per second, semantics could be injected into the document at the moment of creation. Of course, it is a bit more complex than this, but it was an interesting notion. Kathleen Dahlgren from Cognition said that NLP at scale was the wave of the future. NLP is complex but deeply distributed. Computers are getting faster and cheaper, and this can make it fast and scalable.

  • Is it practical? There is a huge amount of data out there and it keeps changing. There is also a lot of duplicate information on the web. Is it economically viable to think about parsing the web? Ron Kaplan said he had done a back-of-the-envelope calculation using the following assumptions:

"The simple order-of-magnitude calculation goes as follows: There are roughly 2.5M seconds in a month, so an 8-core machine gives you 20M cpu seconds. If it takes 1 second on the average to process a sentence (an upper bound), then you can do 20M sentences per month. If a web page has on the average 20 sentences, you get 1M pages per month per machine. So, 1000 machines can do a billion pages per month. More if 1 second over estimates, less if 20 sentence/document underestimates."

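The arithmetic in the quote can be checked with a few lines of Python. This is only a sketch of the estimate as quoted; the constant and function names are my own, and the default values are the assumptions stated above (2.5M seconds per month, 8 cores, 1 second per sentence, 20 sentences per page, 1000 machines).

```python
# Assumptions taken from the quoted estimate; the names are illustrative only.
SECONDS_PER_MONTH = 2_500_000   # "roughly 2.5M seconds in a month"
CORES_PER_MACHINE = 8
SECONDS_PER_SENTENCE = 1.0      # stated as an upper bound
SENTENCES_PER_PAGE = 20
MACHINES = 1000


def pages_per_month(machines=MACHINES,
                    cores=CORES_PER_MACHINE,
                    seconds_per_sentence=SECONDS_PER_SENTENCE,
                    sentences_per_page=SENTENCES_PER_PAGE):
    """Order-of-magnitude estimate of web pages parsed per month."""
    cpu_seconds = SECONDS_PER_MONTH * cores              # per machine
    sentences = cpu_seconds / seconds_per_sentence       # per machine
    pages_per_machine = sentences / sentences_per_page
    return machines * pages_per_machine


print(f"{pages_per_month():,.0f} pages per month")
# Sensitivity to the two assumptions the quote flags:
print(f"{pages_per_month(seconds_per_sentence=0.5):,.0f} if parsing takes 0.5 s per sentence")
print(f"{pages_per_month(sentences_per_page=40):,.0f} if pages average 40 sentences")
```

With the quoted defaults this gives one billion pages per month. Halving the time per sentence doubles the figure, and doubling the sentences per page halves it, which is the sensitivity the quote points out.
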
So this is economically feasible, if there is a need. And that remains the question: is it economically viable and necessary to try to find the information in the long tail?

Published by: IT Analysis Communications Ltd.