By: Dr Fern Halper, Partner, Hurwitz & Associates
Published: 8th June 2010
Copyright Hurwitz & Associates © 2010
What kinds of applications do we need a semantic web for? Is the semantic web practical? These questions (among others) were posed by Jamie Taylor of Metaweb Technologies to a group of panelists at the Text Analytics Summit last week. The panelists were no lightweights. They included Vladimir Zelevinsky from Endeca, Ron Kaplan from Microsoft, and Kathleen Dahlgren from Cognition. I found this to be one of the most engaging segments of the Summit.
First of all, many people define the semantic web as a "web of meaning" or a "web of data" that will allow computer applications to exploit the data directly. Check out the W3C webpage for more information about definitions. The panelists at the Summit got into an interesting discussion about parsing data sources for the semantic web. Here are a few of the highlights. Please note that I asked some additional questions after the panel itself, so if you read something here that you didn't hear on the panel, that is the reason.
The semantic web could pay off in applications that require understanding the relationships between these entities. Ron Kaplan also noted that semantic web technology provides a standard way of merging data from different sources, and that will probably enable some useful new applications.
Is it practical? There is a huge amount of data out there and it keeps changing. There is also a lot of duplicate information on the web. Is it economically viable to think about parsing the web? Ron Kaplan said he had done a back-of-the-envelope calculation using the following assumptions:
"The simple order-of-magnitude calculation goes as follows: There are roughly 2.5M seconds in a month, so an 8-core machine gives you 20M cpu seconds. If it takes 1 second on the average to process a sentence (an upper bound), then you can do 20M sentences per month. If a web page has on the average 20 sentences, you get 1M pages per month per machine. So, 1000 machines can do a billion pages per month. More if 1 second over estimates, less if 20 sentence/document underestimates."
So this is economically feasible, if there is a need. And that remains the question: is it economically viable and necessary to try to find the information in the long tail?
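For readers who want to replay Kaplan's order-of-magnitude arithmetic with their own assumptions, here is a minimal Python sketch. The figures are the ones quoted above; the function name `pages_per_month` and the constant names are illustrative labels only, not anything presented at the panel.

```python
# Back-of-the-envelope estimate using the figures quoted above.
# All names here are hypothetical labels chosen for this sketch.
SECONDS_PER_MONTH = 2_500_000    # "roughly 2.5M seconds in a month"
CORES_PER_MACHINE = 8            # an 8-core machine
SECONDS_PER_SENTENCE = 1.0       # quoted as an upper bound
SENTENCES_PER_PAGE = 20          # quoted average per web page


def pages_per_month(machines,
                    seconds_per_sentence=SECONDS_PER_SENTENCE,
                    sentences_per_page=SENTENCES_PER_PAGE):
    """Estimate how many pages a cluster can parse in one month."""
    cpu_seconds = SECONDS_PER_MONTH * CORES_PER_MACHINE * machines
    sentences = cpu_seconds / seconds_per_sentence
    return sentences / sentences_per_page


print(f"{pages_per_month(1):,.0f}")     # prints 1,000,000
print(f"{pages_per_month(1000):,.0f}")  # prints 1,000,000,000
```

Changing the two keyword arguments shows how sensitive the estimate is: if a sentence takes half a second rather than one, the same 1000 machines would cover roughly two billion pages a month, while longer documents pull the figure down in proportion.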
Published by: IT Analysis Communications Ltd.