Time-to-Adoption Horizon: Four to Five Years
The idea behind the semantic web is that although online data is available for searching, its meaning is not: computers are very good at returning keywords, but very bad at understanding the context in which keywords are used. A typical search on the term “turkey,” for instance, might return traditional recipes, information about the bird, and information about the country; the search engine can only pick out keywords, and cannot distinguish among different uses of the words. Similarly, although the information required to answer a question like “How many current world leaders are under the age of 60?” is readily available to a search engine, it is scattered among many different pages and sources. The search engine cannot extract the meaning of the information to compile an answer to that question even though it can return links to the pages that contain pieces of that answer. Semantic-aware applications are tools designed to use the meaning, or semantics, of information on the Internet to make connections and provide answers that would otherwise entail a great deal of time and effort.
The vision for the semantic web, originally advanced by Sir Tim Berners-Lee, is that eventually it might be able to help people solve very difficult problems by presenting connections between apparently unrelated concepts, individuals, events, or things — connections that it would take many people many years to perceive, but that could become obvious through the kinds of associations made possible by semantic-aware applications. There are currently two theoretical approaches to developing the semantic capacity of the web. One, the bottom-up approach, is problematic in that it assumes metadata will be added to each piece of content to include information about its context; tagging at the concept level, if you will. The top-down approach appears to have a far greater likelihood of success, as it focuses on developing natural language search capability that can make those same kinds of determinations without any special metadata.
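As a minimal sketch of the bottom-up idea, the following Python example contrasts a plain keyword match with a match that also consults a concept-level tag, using the "turkey" example from above. The pages, URLs, the `sense` field, and both function names are hypothetical and exist only for illustration; no real semantic search tool is being reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    url: str
    text: str
    sense: str  # hypothetical concept-level tag supplied as metadata


# Made-up pages for illustration only.
PAGES = [
    Page("example.org/roast", "How to roast a turkey for the holidays", "food"),
    Page("example.org/bird", "The wild turkey is a large ground-dwelling bird", "animal"),
    Page("example.org/travel", "Visiting Turkey: Istanbul and beyond", "country"),
]


def keyword_search(pages, term):
    """Return every page whose text contains the term, whatever its meaning."""
    term = term.lower()
    return [p for p in pages if term in p.text.lower()]


def semantic_search(pages, term, sense):
    """Narrow the keyword matches to pages tagged with the requested sense."""
    return [p for p in keyword_search(pages, term) if p.sense == sense]


if __name__ == "__main__":
    print([p.url for p in keyword_search(PAGES, "turkey")])
    print([p.url for p in semantic_search(PAGES, "turkey", "country")])
```

The keyword search returns all three pages, while the tagged search returns only the page about the country. The sketch also shows the cost of the bottom-up approach: someone, or something, has to supply the `sense` tag for every page.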
Most currently available semantic-aware applications are intended to assist with searching and finding, with making intellectual or social connections, or with advertising. Tools like TrueKnowledge (http://trueknowledge.com), Hakia (http://www.hakia.com), Powerset (http://www.powerset.com), and SemantiFind (http://www.semantifind.com) are designed to provide more accurate search results, either by scanning metadata tags added to content (the bottom-up approach, taken by SemantiFind) or by using semantic algorithms or lexica (the top-down approach, taken by Hakia). Yahoo! has released an open search platform, SearchMonkey (http://developer.yahoo.com/searchmonkey), that allows developers to create custom applications to return a certain type of information (about movies, say, or people), using semantic search of marked-up content to categorize information.
Tools for making connections between concepts or people are also entering the market. Calais (http://www.opencalais.com) is a toolkit of applications to make it easier to integrate semantic functionality in blogs, websites, and other web content; for instance, Calais’ Tagaroo is a plugin for WordPress that suggests tags and Flickr images related to a post as the author composes it. Zemanta (http://www.zemanta.com) is a similar tool, also for bloggers. SemanticProxy, another Calais tool, automatically generates semantic metadata tags for a given website that are readable by semantic-aware applications, without the content creator’s needing to do it by hand. Calais includes an open API, so developers can create custom semantic-aware applications. TripIt (http://www.tripit.com), a social semantic-aware application for travelers, organizes travel plans and makes useful connections; a TripIt user simply forwards a confirmation email from any travel provider — airlines, hotels, car rentals, event tickets — and TripIt automatically creates an itinerary by interpreting and organizing the information in the email according to its semantic context.
Advertisers are also finding a use for semantic-aware applications. Tools like Dapper MashupAds (http://www.dapper.net/mashupads/) extract information from the page the user is browsing and tailor sidebar advertisements to that content. If you are browsing flights to Orlando, for instance, MashupAds might show a sidebar with Orlando hotels; if you are shopping for a home, the ad might show you sample mortgage rates for comparable properties in that particular area. BooRah (http://boorah.com) is a tool that pulls information from restaurant reviews all over the web, analyzing the tone of the reviews to assign positive or negative ratings to restaurants. The links, ads, and recommendations on a BooRah detail page are all local to the restaurant’s area as well.
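To make the idea of rating by review tone concrete, here is a small Python sketch that scores reviews against a word list. It is an assumption-laden illustration, not a description of how BooRah or any other product works: the lexicon, the function names `tone_score` and `rate`, and the sample reviews are all invented for the example.

```python
import re

# Hypothetical, deliberately tiny tone lexicon; a real system would need far more.
POSITIVE = {"delicious", "friendly", "great", "fresh"}
NEGATIVE = {"slow", "bland", "rude", "cold"}


def tone_score(review):
    """Count positive words minus negative words in a single review."""
    words = re.findall(r"[a-z']+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def rate(reviews):
    """Label a restaurant by the summed tone of its reviews."""
    total = sum(tone_score(r) for r in reviews)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    sample = ["Great food and friendly staff", "Service was slow"]
    print(rate(sample))
```

A word-counting approach like this ignores negation and sarcasm ("not great" still counts as positive), which is one reason that incorrectly identified content is common in early tools of this kind.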
Semantic-aware applications like these allow meaning to be automatically inferred from content and context. The promise of these applications is to help us see connections that already exist, but that are invisible to current search algorithms because they are embedded in the context of the information on the web. Semantic-aware applications are still in early development, and many of those named here are in beta at press time; errors and incorrectly identified bits of content are not unusual. However, there is a great deal of work going on in this area, and we can expect to see significant advances in the coming years.
Relevance for Teaching, Learning, Research, or Creative Expression
Education-specific examples of semantic-aware applications are still rare. To date, development of semantic-aware applications has mostly focused on creating tools to automate the process of contextualizing information and tools to process content against a semantic lexicon; end-user applications are, by and large, still in very early development. One application that illustrates some of the potential of semantic-aware applications for education is Twine (http://twine.com), a social network organized around topics of interest. Members join a “twine” on a particular topic, like biological evolution, where they can add resources and connect with others who are interested in the topic. Twine sorts resources into categories based on the type of information they contain: places, people, organizations, and so on. Twine is not focused solely on education, but there are twines on many educational topics.
The capability of semantic-aware applications to aid in searching and finding has implications for research, especially in light of the rate at which web content is being created. As semantic search tools continue to develop, it will be more common to see highly relevant results that display desired information in the hit list summary itself, saving time that is now spent clicking through to each page in turn. Semantic search also promises to reduce the number of unrelated or irrelevant results for a given search and to facilitate natural-language queries, both potentially useful features for researchers.
Like the tools described in the 2008 Horizon Report under Social Operating Systems, semantic-aware applications hold the potential to organize and display information embedded in our data in meaningful ways that make it easier to draw connections. Semantic-aware tools to help visualize relationships among concepts and ideas are just beginning to emerge, including mashups that not only plot data on graphs or maps, but also emphasize and illustrate conceptual links. For instance, WorldMapper (http://www.worldmapper.org/) produces maps that change visually based on the data they represent; a world map showing total population enlarges more populous countries (China, India) and shrinks those that have a smaller fraction of the world’s population.
A growing number of companies and educational institutions are conducting research into semantic connections. For instance, the Multimodal Information Access and Synthesis (MIAS) Center at the University of Illinois at Urbana-Champaign is conducting research and developing prototype projects on topics such as contextualizing data automatically, natural-language search, and assembling contextual information for photographs based on text that appears near similar photographs (http://www.mias.uiuc.edu/mias/research).
A sampling of use cases for semantic-aware applications across disciplines includes the following:
- Research. The Fundación Marcelino Botín in Santander, Spain is seeking to create a research portal to cultural heritage information about the Cantabria region, using semantic-aware applications to draw connections and combine data from a wide variety of sources, including bibliographies, prehistoric excavations, industrial heritage, and others.
- Collections Tagging. The Powerhouse Museum of Science and Design in Sydney, Australia is using Open Calais to add contextual tags to objects in its online collection. The process of tagging the more than 66,000 objects in the collection would be impossible by hand, but Open Calais has been able to pick out important tags from object descriptions, facilitating navigation and search through the collection.
- Law. A prototype project at the Autonomous University of Barcelona assists newly appointed judicial officials in resolving complex legal questions based on collected information from prior cases. Developed for the Spanish General Council of the Judiciary, the system uses contextual information to suggest solutions to problems that new judges might typically refer to more experienced judges, potentially speeding up the legal process.
Examples of Semantic-Aware Applications
The following links provide examples of semantic-aware applications.
The Cleveland Clinic is using semantic web concepts to search patient data to improve future patient care.
Semantic MediaWiki is an extension to MediaWiki (the software upon which Wikipedia is based) that makes it easy for editors to insert “hints” into articles to enable semantic searches.
The University of Mary Washington, in addition to hosting a blogging platform for the UMW community, is experimenting with a semantic portal as a way to organize and find content, explore the community, and find people. For instance, the “Link Friends” exhibit makes friendship recommendations based on similar linking habits.
SemantiFind is a web browser plug-in that works with Google’s search bar. When a user types a word into the search bar, a drop-down menu prompts the user to select the exact sense of the word that is desired, in order to improve the relevance of the results that Google displays. The results are based on user labels on the pages being searched.
SIOC.Me (pronounced “shock me”) is a semantic visualization tool that lets the viewer browse an Irish bulletin board (web forum) site in a 3D space. Concepts and other data are linked semantically.
For Further Reading
The following articles and resources are recommended for those who wish to learn more about the semantic web and semantic-aware applications.
An Introduction to the Semantic Web
(Manu Sporny, YouTube, December 2007.) This six-minute video explains the idea of the semantic web in simple terms.
In the Cusp: a Global Review of the Semantic Web Industry
(David Provost, Semantic Business, 30 September 2008.) This blog post announces the release of (and links to) a report by the author on the current state of the industry with regard to semantic-aware applications and the semantic web.
The Semantic Web in Education
(Jason Ohler, EDUCAUSE Quarterly, Vol. 31, No. 4, 2008.) This article introduces the concept of the semantic web in an educational context and suggests some ways semantic-aware applications might be used in teaching and learning.
Semantic Web: What is the Killer App?
(Alex Iskold, ReadWriteWeb, January 2008.) This article examines what is needed for the semantic web to become mainstream: a killer app that attracts and engages.
Yahoo Embraces the Semantic Web — Expect the Internet to Organize Itself in a Hurry
(Michael Arrington, TechCrunch, 13 March 2008.) This post describes Yahoo’s announcement that it will expand its Open Search Platform to make use of semantic tags embedded in web content to improve search results.
Delicious: Semantic-Aware Applications
(Tagged by Horizon Advisory Board and friends, 2008.) Follow this link to find resources tagged for this topic and this edition of the Horizon Report, including the ones listed here. To add to this list, simply tag resources with “hz09” and “semanticweb” when you save them to Delicious.
Posted by NMC on January 18, 2009