The Semantic Web Gets Down to Business

Despite the recession, luggage retailer Ebags enjoyed phenomenal 2010 holiday sales -- some 33% higher than the previous year. (The online retail sector as a whole reported a 15% gain this past holiday season.) Both Black Friday and Cyber Monday sales set all-time records, according to Ebags Inc. co-founder Peter Cobb.

Cobb credits much of these gains to his company's deployment of Endeca Technologies Inc.'s online retail platform, which uses semantic technology to analyze shoppers' keyword choices and clicks, and then winnows down results from categories to subcategories and microcategories. The end result? "Guiding the shopper to the perfect bag very quickly," Cobb says.

Endeca's Web site navigation software allows shoppers to use type, brand, price and size filters to get to relevant choices, Cobb explains. "With over 500 brands and 40,000 bags, we recognized a few years ago how important semantic search and guidance was to the shopping experience."
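The filter-based narrowing Cobb describes can be pictured with a short sketch. This is not Endeca's software or API; it is a minimal illustration in Python, and the catalog entries, the `Bag` class and the `narrow` and `facet_counts` helpers are all hypothetical names invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bag:
    name: str
    bag_type: str
    brand: str
    price: float
    size: str


# Hypothetical catalog entries, invented for illustration only.
CATALOG = [
    Bag("Weekender Duffel", "duffel", "Acme", 59.0, "medium"),
    Bag("Metro Backpack", "backpack", "Acme", 89.0, "small"),
    Bag("Summit Backpack", "backpack", "Northway", 129.0, "large"),
    Bag("Carry-On Spinner", "luggage", "Northway", 149.0, "medium"),
]


def narrow(bags, bag_type=None, brand=None, max_price=None, size=None):
    """Return the bags that satisfy every filter the shopper has set."""
    result = []
    for bag in bags:
        if bag_type is not None and bag.bag_type != bag_type:
            continue
        if brand is not None and bag.brand != brand:
            continue
        if max_price is not None and bag.price > max_price:
            continue
        if size is not None and bag.size != size:
            continue
        result.append(bag)
    return result


def facet_counts(bags, attribute):
    """Count the remaining bags per value of one attribute.

    These counts are what a site would show next to each filter choice.
    """
    counts = {}
    for bag in bags:
        value = getattr(bag, attribute)
        counts[value] = counts.get(value, 0) + 1
    return counts


backpacks = narrow(CATALOG, bag_type="backpack")
print(facet_counts(backpacks, "brand"))                     # {'Acme': 1, 'Northway': 1}
print([b.name for b in narrow(backpacks, max_price=100)])   # ['Metro Backpack']
```

Each filter shrinks the candidate set, and the per-value counts tell the shopper which further choices are still worth making.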

By providing highly detailed descriptions of products and their attributes, and linkages between categories, the semantic technology has also enabled Ebags to attain higher placement on Web search engine results pages, according to the e-retailer's chief technology officer, Chris Cummings.

In the late 1990s, Tim Berners-Lee, now widely known as the father of the World Wide Web, announced his vision of a "semantic Web" that would help people find exactly the information, answer or product they were looking for. This would happen, he hoped, without users having to design complex queries or try dozens of different keyword combinations or sort through thousands of irrelevant URLs.

To help make this happen, the World Wide Web Consortium (W3C), under Berners-Lee's direction, has developed standards that allow computer platforms and software agents to identify, access and integrate information from disparate Web sites and domains, as well as from various information silos within an enterprise.

Using the W3C standard Resource Description Framework (RDF), for example, retailers and manufacturers could pass detailed product information back and forth, says Jay Myers, a lead Web development engineer at a retail chain. "Right now, a lot of our vendors provide product information in spreadsheets, which makes it hard to distill." Myers' company isn't currently taking full advantage of RDF's capabilities; that's still a future goal, he says.

Indeed, Berners-Lee's dream is still a long way from reality, although it's getting closer. Many business decision-makers remain skeptical that the paybacks of adopting semantic technology will make up for the costs and risks. What's needed is a killer app that will persuade a critical mass of business users to invest in semantic Web software, says Phil Simon, a consultant and the author of The Next Wave of Technologies.

About this story

This is the first part of a two-part series about the semantic Web. This installment explains various semantic Web technologies, including search. It explores their potential uses and paybacks, illustrated with real business cases, including ones involving the use of sentiment analysis. It also provides some best practices and tips from the trenches for anyone planning, or at least considering, a deployment.

Part 2 will provide an overview of commercial and open-source products, frameworks and services that support semantic technology and will discuss how they can be used as building blocks to develop a successful semantic Web infrastructure. It will also delve into the implications of growing industry support for W3C semantic standards.

Slowly but surely, however, semantic Web technology is catching on. Business users in industry sectors ranging from e-commerce, e-publishing and healthcare to marketing and financial services are reaping its benefits, even if they don't always understand how it works and even though hard ROI numbers have been hard to come by. An established practice like sentiment analysis -- the art of figuring out what customers and others really think of your company and product -- is getting a boost from semantic technology. (See related story.)

Moreover, enterprise software vendors like IBM, Oracle, SAS and Microsoft have started to incorporate semantic search and W3C standards into their platforms, as have Web search engines like Google, Microsoft's Bing and Yahoo. Myers can attest to this: Soon after his team began adding semantic metadata to product pages on store blogs, he reports, they saw an increase of about 30% in "organic" search traffic -- meaning traffic that results from user searches rather than clicks on Web ads.

What it's all about

Semantic software uses a variety of techniques to analyze and describe the meaning of data objects and their inter-relationships. These include a dictionary of generic and, often, industry-specific definitions of terms, as well as analysis of grammar and context to resolve language ambiguities such as words with multiple meanings.

For example, the phrase "there are 40 rows in the table" uses rows as a noun, whereas "she rows five times a week" uses rows as a verb. Likewise, the word stock has one meaning in the phrase "I used beef bones for my soup stock," another in "the supermarket keeps a lot of stock on hand" and yet another in "analysts are bearish on the stock."
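One simple way to picture context-based disambiguation is to score each sense of a word by how many of its cue words appear in the same sentence. The sketch below does this for the three "stock" sentences. It is a deliberately tiny illustration, loosely in the spirit of dictionary-overlap methods, and not how any product named in this story works; the `SENSES` table and the `disambiguate` function are hypothetical.

```python
import re

# Hypothetical mini-dictionary: each sense of a word lists cue words
# that tend to appear near it. Real systems use far larger resources.
SENSES = {
    "stock": {
        "broth": {"soup", "beef", "bones", "simmer", "broth"},
        "inventory": {"supermarket", "warehouse", "shelves", "hand", "keeps"},
        "equity": {"analysts", "bearish", "bullish", "shares", "market"},
    }
}


def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears at all.
    """
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    tokens.discard(word)
    scores = {sense: len(cues & tokens) for sense, cues in SENSES[word].items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("stock", "I used beef bones for my soup stock"))           # broth
print(disambiguate("stock", "the supermarket keeps a lot of stock on hand"))  # inventory
print(disambiguate("stock", "analysts are bearish on the stock"))             # equity
```

Telling the noun "rows" from the verb "rows" works on the same principle, except that the clues are grammatical (the words before and after) rather than topical.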

Advice for going semantic

* Remember that collaboration between subject matter experts and IT staff is crucial when developing a semantic ontology.

* Make sure you have a specific business mission before you build an ontology; otherwise, it will wind up being a useless exercise.

* Resist the urge to jump in with both feet right away; it's better to go slowly, implement projects that solve real problems and win converts along the way.

Resolving language ambiguities ensures that a shopper who does a search using the phrase "used red cars" will also get results from Web sites that use slightly different terms with similar meanings, such as "pre-owned red automobiles," for example.

It also makes it possible for a user to, say, type in a complex query like "progressive rock songs from the 1970s with odd time signatures and atmospheric feels" at a music Web site like iTunes and get back Pink Floyd, says Simon.
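The "used red cars" example can be sketched as a lookup that maps different surface terms to a shared concept before comparing a query with a listing. The synonym table and function names below are hypothetical, and a real system would draw on a much larger dictionary plus the grammar and context analysis described earlier.

```python
# Hypothetical synonym table: surface term -> canonical concept.
CANONICAL = {
    "used": "used",
    "pre-owned": "used",
    "secondhand": "used",
    "car": "car",
    "cars": "car",
    "automobile": "car",
    "automobiles": "car",
}


def concepts(text):
    """Reduce a phrase to the set of concepts it mentions.

    Terms missing from the table are kept as-is.
    """
    return frozenset(CANONICAL.get(token, token) for token in text.lower().split())


def matches(query, listing_title):
    """A listing matches when it covers every concept in the query."""
    return concepts(query) <= concepts(listing_title)


print(matches("used red cars", "pre-owned red automobiles"))  # True
print(matches("used red cars", "new red cars"))               # False
```

Matching on concepts rather than literal strings is what lets the two differently worded phrases find each other.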

Once defined, content is tagged with descriptive metadata or "markups" and is mapped into an ontology. (See diagram.) Ontologies are schemas that describe data objects and their relationships. Developing them is typically a collaborative effort involving technicians who understand semantic schemas and subject matter experts who understand business language.

Semantic Web technology refers to products and architectures that support semantic searches, queries, publishing and retrieval based on W3C standards. These include Web Ontology Language (otherwise known as OWL), the Resource Description Framework (RDF) and the SPARQL Protocol and RDF Query Language (SPARQL), as well as existing Web standards like XML and HTTP.
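To make these standards a little more concrete: RDF expresses facts as subject-predicate-object triples, an ontology adds statements about classes (for example, that a backpack is a kind of bag), and a SPARQL query is essentially a set of triple patterns with variables to fill in. The sketch below imitates that idea in plain Python. It is not a SPARQL engine and uses no RDF library; the `ex:` identifiers and the `match` and `query` functions are hypothetical.

```python
# Facts stored as (subject, predicate, object) triples, the shape RDF uses.
# The "ex:" identifiers are hypothetical; rdf:type and rdfs:subClassOf
# are written as plain strings here purely for illustration.
TRIPLES = {
    ("ex:MetroBackpack", "rdf:type", "ex:Backpack"),
    ("ex:Backpack", "rdfs:subClassOf", "ex:Bag"),
    ("ex:MetroBackpack", "ex:brand", "ex:Acme"),
    ("ex:WeekenderDuffel", "rdf:type", "ex:Duffel"),
    ("ex:Duffel", "rdfs:subClassOf", "ex:Bag"),
    ("ex:WeekenderDuffel", "ex:brand", "ex:Northway"),
}


def match(pattern, triples=TRIPLES):
    """Yield variable bindings for one triple pattern.

    Terms that start with '?' are variables; everything else must match exactly.
    """
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                if binding.get(want, have) != have:
                    break
                binding[want] = have
            elif want != have:
                break
        else:
            yield binding


def query(patterns, triples=TRIPLES):
    """Join several patterns so that shared variables agree across them."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Substitute variables that earlier patterns already bound.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(bound, triples):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions


# "Find every item whose class is a kind of Bag, along with its brand."
rows = query([
    ("?item", "rdf:type", "?cls"),
    ("?cls", "rdfs:subClassOf", "ex:Bag"),
    ("?item", "ex:brand", "?brand"),
])
for item, brand in sorted((r["?item"], r["?brand"]) for r in rows):
    print(item, brand)
```

The query never mentions backpacks or duffels; the class hierarchy supplies that link, which is the kind of inference an ontology is meant to enable.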

The hidden helper

Ebags' Cummings admits that he's not all that familiar with semantic technology. However, he is very aware that Endeca's semantic-based online retail platform has played a major role in increasing Ebags' sales. "Since it was deployed, our conversion rates have doubled," he reports. (Conversion is the term used to describe what happens when a shopper who clicks on a link to an e-commerce site actually buys something.)

Indeed, business users, and even some IT executives, don't always realize that their e-commerce or enterprise software platforms are using semantic technology. However, they definitely appreciate the paybacks.

In addition to stronger sales numbers, other benefits of semantic technology can include more clicks from Web search engines, higher customer satisfaction rankings and, internally, more timely and effective decision-making and faster responses to competitors and market changes.

One of the earliest applications of semantic technology has been to help business users more easily find and access the information they need, no matter where the data is located and no matter who owns it.

Michael Lang, CEO of Cambridge Semantics Inc., a Boston-based maker of semantic middleware and plug-ins, is betting that semantic platforms will supplant traditional business intelligence systems. The main reason he's expecting this to happen, he says, is that semantic technology eliminates the need to extract, transform and load all relevant data from disparate information silos into data warehouses and marts that need to be constantly updated.

With semantic technology, all of that happens on the fly and in the background.

According to Lynda Moulton, an analyst at Gilbane Group, a Cambridge, Mass.-based research arm of Outsell Inc., semantic technology can provide significant benefits for enterprises that are confronted with data that has some combination of the following characteristics:

• It's voluminous, with millions of unstructured documents.

• It's complex in scope and depth.

• It's valuable to end users, but in small, disparate pieces.

• It's needed by highly paid and highly skilled professionals for use in their areas of expertise.

• It's undifferentiated for e-discovery and research purposes. That means, for example, that the information lacks metadata and is not available in a structured format that supports intelligent searches.

• It's likely to have an impact on the bottom line, indirectly or directly, when discovered.

Semantic technology can process such information so that it can be "aggregated, federated, pinpointed or analyzed to reveal concepts or meanings" that are logistically impossible for human beings to obtain manually, Moulton says. Early adopters of semantic technology included companies in the publishing and life sciences industries; they're now being followed by enterprises "whose content has grown to proportions unmanageable by humans," says Moulton.

Competing for clicks

Semantic technologies can "make search engines better or more precise in finding relevant content," says Moulton. So if your company operated a retail Web site, semantically enabled searches would do a better job of leading shoppers to your site and then helping them find products they want to buy. Myers' company, for example, realized "high ROI in terms of increased store and product visibility on the Web," he says. While adding semantic metadata to product pages on some 1,100 store blogs was no small task, Myers' team saved a great deal of technical grunt work by using GoodRelations, an ontology that German university professor Martin Hepp developed specifically for e-commerce.

GoodRelations provides a standardized vocabulary -- the semantic Web term for ontology -- for product, price and company data. This information can be embedded into existing Web pages, then processed by other computers, applications and search engines that support W3C protocols. As mentioned above, this makes richer product information available to search engines that support W3C standards. It also provides the potential for cross-domain semantic querying across e-commerce sites -- as long as other e-commerce companies incorporate the vocabulary into their data, too. So far, only a handful of retailers have done so.

While Myers could give no hard numbers on time savings, he said that, in contrast with most deployments of new methodologies and technologies, "we spent very little overhead time implementing GoodRelations in our markup." After an "initial introduction," developers typically found working with GoodRelations as easy as coding standard HTML, Myers says.

Myers' company is exploiting the power and precision of semantic search not only to help shoppers find what they want but also to bring their attention to specific types of products, such as "long-tail" items that don't generate huge sales, Myers explains. And early last year, his team developed a program, based on semantic Web standards, that makes it easy for store managers to publish information about "open box" or returned products on the store's WordPress blog. Because these products are slightly cheaper, they are much in demand among customers with budget restrictions, Myers points out.

Semantic Web platforms from vendors such as Expert System, Cambridge Semantics, Sinequa and Lexalytics allow users to query both internal enterprise data and Web sources, including blogs, social networks like Facebook and other Web 2.0 media.

Answering employees' questions

Bouygues Construction is using Sinequa's Context Engine to put employees in touch with in-house experts who can answer their questions in a broad range of areas, says Eric Juin, the worldwide construction firm's e-services and knowledge management director. "It could be a lawyer, an engineer or an executive, anywhere in the world." The semantic platform identifies and categorizes all experience within the company, worldwide, by analyzing vast quantities of unstructured information, including training materials, project documentation and other internal sources, as well as Web-based newspapers and scientific publications, Juin says.
