Seth Grimes on the semantic web – but is B2B media ready to benefit?

Seth Grimes, an analyst specialising in business intelligence and text analysis, gave a fascinating presentation – “an introduction to the semantic web and text-mining” – last week in London.

I will try to give you a flavour of his presentation, and Peter Thomas has also written it up. But Seth certainly got me thinking about whether B2B media is ready to benefit from the semantic web.

What is “the semantic web”?
When you look up a keyword or phrase on Google, the search engine returns content on the basis of the frequency of those words within the text and the links to it from other sites, among other things. The semantic web takes that concept further, returning content by recognising not only the frequency of the words and calibre of the links, but also the context of the request. In short, the semantic web aims to understand user searches in a more human way, adding context to queries.

Seth kicked off with an article by Hans Peter Luhn in the IBM Journal of 1958, in which Luhn, the pioneer of information services, complains that “no attention is paid to the logical and semantic relationship the author has established”.

Hans Peter Luhn, “The Automatic Creation of Literature Abstracts,” IBM Journal, April 1958

Seth argued that even then Luhn foresaw a time when “sense making” would matter:

Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance
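
As a rough illustration of the idea in that quote, here is a minimal Python sketch that scores sentences by how often their words appear in the whole text. It is a simplification under my own assumptions, not Luhn's actual method: the stop-word list, the function names and the sample text are all made up for the example.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list (hypothetical, for illustration only).
STOP_WORDS = {
    "the", "a", "an", "of", "and", "is", "in", "to",
    "it", "on", "for", "by", "at", "with", "can",
}


def word_frequencies(text):
    """Count how often each non-stop word appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def score_sentences(text):
    """Return (score, sentence) pairs, highest score first.

    A sentence's score is the sum of the document-wide frequencies of
    its non-stop words: a crude "relative measure of significance".
    """
    freqs = word_frequencies(text)
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        score = sum(freqs[w] for w in words if w not in STOP_WORDS)
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up sample text, not taken from the presentation.
sample = (
    "Stress at work is a safety issue. "
    "Managers can reduce stress with better safety training. "
    "The canteen opens at noon."
)
ranked = score_sentences(sample)
for score, sentence in ranked:
    print(score, sentence)
```

Sentences that repeat the text's most frequent words ("stress", "safety") float to the top, while the sentence about the canteen sinks. That is purely statistical: nothing here captures the "logical and semantic relationship" Luhn was complaining about.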

Today that need for sense among all the disorder of content is even greater with the “unstructured data challenge”, as Seth called it, of blogs, emails, surveys and office documents.

Ever more relevant today

Seth used, as an example, a Twitter application called twitrratr, which applies “sentiment analysis” to tweets. But, using the word “kind” as an example, he showed how difficult that is to do in English, where one word can carry several meanings.

Sentiment analysis by twitrratr of the word "kind" by Seth Grimes
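
To see why a word like “kind” trips up simple approaches, here is a small Python sketch of keyword-based sentiment scoring. The word lists, function name and example tweets are my own assumptions for illustration. This is not twitrratr's code, and I am not claiming it works this way.

```python
import re

# Hypothetical keyword lists; a real tool would use its own, much larger, lists.
POSITIVE = {"kind", "great", "love"}
NEGATIVE = {"awful", "hate", "bad"}


def naive_sentiment(tweet):
    """Label a tweet by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up tweets using three different senses of "kind".
tweets = [
    "She was so kind to me today",   # "kind" as an adjective: genuinely positive
    "What kind of phone is that?",   # "kind" as a noun: no sentiment at all
    "That was kind of awful",        # "kind of" as a hedge: actually negative
]
labels = [naive_sentiment(t) for t in tweets]
for tweet, label in zip(tweets, labels):
    print(label, "-", tweet)
```

Only the first tweet is labelled sensibly. The question about a phone comes out as positive, and the complaint comes out as neutral because “kind” cancels out “awful”. Without context, counting words is not enough.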

“Is search up to the job?” he asked.

Only if it provides content semantically enriched with linked data, that is context sensitive and location aware.

And the sooner media companies get in on the act, the better.

The digital universe by industry 2007 from “The Diverse and Exploding Digital Universe”, IDC, 2008

Seth quoted from a survey by IDC to show just how little those who are responsible for content benefit from it.

The broadcast, media and entertainment industries garner about 4% of the world’s revenues but already generate, manage, or otherwise oversee 50% of the digital universe

Finally Seth went on to push text mining on sites using automatic content categorization, text augmentation and information extraction (disclosure: the presentation was sponsored by text mining platform Nstein). The market, he argued, citing his own study “Text Analytics 2009: user perspective on solutions and providers” (partly funded by Nstein), was worth $350 million in 2008 and due to grow by 25% in 2009.

Seth’s own research showed…

Yet, surprisingly, when asked about the relative importance of several online qualities, clients placed content management a lowly fifth, below brand values.

"What are the primary applications where text comes into play?" survey by Seth Grimes

But, interestingly, clients were more likely now to analyse social media content than traditional news articles.

Is B2B media ready to exploit the semantic web?

B2B media is the opposite of mass media. Mass media has a few big sites for millions of people; B2B media has a mass of sites for small but very well defined communities. Indeed B2B sites do not want millions of the wrong people coming to their sites but rather a few of the right people. Users’ familiarity with an existing print brand and their social media activity both help to refine who gets to the sites. But key to this refinement is search.

Take SHP, a B2B site for safety and health professionals, as an example. It boasts articles on stress. We do not want millions of people finding the site because they are Googling the word “stress”. What we do want is for safety and health practitioners who are searching on phrases such as “stress in factories in northern England” to find the site. The more complex the keyword phrase used to get to a B2B site, the more qualified the user.

And, finally…

The issue for us, therefore, is that our more qualified readers have for years been making choices informed by Seth’s linked data, context and even “location awareness”. How then will the semantic web benefit them? In fact, does the semantic web have something to learn from B2B?

Photo credit: dullhunk

2 Responses to Seth Grimes on the semantic web – but is B2B media ready to benefit?

Andrew says:

June 26, 2009 at 2:58 pm

The trouble is that automatic content extraction, categorization, and augmentation are incredibly difficult promises to deliver on when the scope is large and the tolerance for failure is small-to-nothing. When ‘content meaning’ is observed within a particular subject area or entity, then semantic analysis becomes simpler and more reliable (and this should apply to the niche focus of a business publisher).

I always point people to the demos on Zemanta.com for a good example of real-life usage. And we are really excited about OpenCalais, which is an awesome API on which to build intelligent services and content delivery platforms.