Aaron B. Helton
MCIS 6309.01
Abstract
Tim Berners-Lee described his vision of a Web in which information could be transferred not
just to humans but also to machines, ushering in an era where machines could combine data in
ways that only humans could do before. The end goal is the potential generation of new
knowledge, machine-interpretable intelligence, and the ability for machines to determine the
answer to specific questions. While his vision has yet to materialize significantly, many of the
enabling factors have begun to form and mature, and it is from these that the next steps, albeit
intermediary in nature, can be taken. Based on the case studies and use cases available at the
World Wide Web Consortium (W3C) Web site (Herman, I. & Stephens, S., 2007), such early
Semantic Web enabling factors to date have been in very specific domains, solving specific
problems. This is likely to continue, but it is only by building these foundations that a full
realization of the Semantic Web can be achieved. This paper demonstrates how such building
blocks can be created here and now, with an eye on a particular domain: the United States
tourism industry.
According to Tim Berners-Lee, the creator of the World Wide Web (“Tim Berners-Lee,”
2008), the Semantic Web is a Web in which computers “become capable of analyzing all the data
on the Web…machines talking to machines” (“Semantic Web,” 2008). The repercussions of this
include the recombination of Web-enabled data and information in ways unimagined by the
original creators and extensible to the application of any new domain. In short, the Semantic
Web will allow both end users and machines to ask specific questions and get meaningful
answers, as opposed to being presented a simple list of documents with keyword matches.
The concept of the Semantic Web is not new. At the time of this writing, it is nearly ten
years old, and yet it has not fully materialized beyond an extensive set of framework and
solve domain-specific problems. But lest it remain forever locked in architectural documents, its
problems as possible. This will ultimately enable the full realization of the Semantic Web.
While the architecture and framework pieces have been fairly well documented,
application has only recently begun, and there is still much to be done. The next step is to create
specific use cases so that others may follow suit. An iterative approach seems most likely, as
adoption in some industries is evident, while the Semantic Web is largely absent in many others.
As new implementations arise, adoption will reach a critical mass, and early efforts should
The Semantic Web is not going to build itself. New implementations will foster other
new implementations, but those intermediary applications have to be created. What is needed,
Scaffolding the Semantic Web 4
then, is an approach to designing such applications that can be readily demonstrated, including
what components are necessary and what concerns may arise. The rest of this document takes a
look at just how this can be accomplished, with a particular use case in mind, that of US tourism.
Of all the domains that can benefit from the Semantic Web, the US tourism industry, especially
as represented by the taxpayer-funded state tourism campaigns, could begin reaping those
benefits today.
When approaching the Semantic Web, one must begin with a general question that
comprises the set of all more specific questions in the domain. Thus a general question regarding
tax-funded tourism might be along the lines of “What does [given state] have for me to do?” Or,
“What interesting things are located nearby that I can visit this weekend?” Answering either of
these questions, of course, depends heavily on what one likes to do or finds interesting; they are
too broad for this purpose, except to highlight that the broad categories of travel and tourism that
have already been developed (and which help to focus these questions) should be a useful
starting point. In the spirit of simplicity first, complexity later, these questions can be pared
down to something a little more specific. For instance, one might want to know, “What state
parks are within a 50-mile radius of my house, have hiking trails, and allow camping and
fishing?” While this might very well comprise all such state parks in that vicinity, it is not a
given, and so the question becomes the gateway into developing a richer set of semantics to
Before any attempt is made to answer the meta-question, that is, the question whose
answer provides the answer to all such questions, some comparison must be made with the
existing information that can be gleaned from the Web. For this exercise, the state park system
of Texas can serve as an example, as the state is large and contains a good number of parks. A
number of sites exist to describe one or more dimensions of this query. Among them is the
Texas Parks and Wildlife site (“TPWD: Find a Park,” n.d.), which contains a list of all the Texas
state parks, including addresses and attractions (camping, hiking, etc.). With no way to see all of
the parks on a map, and no way to easily compare one’s own location to that of any set of parks,
the TPWD site is of limited use, though it is a good starting point. Other sites that do include such
features (“Texas Outside Guide,” n.d.) are marginally better, although the data is still not
accessible for meaningful use or integration by other machines. What is needed is a bridge,
something to cross the chasm between what the Web now provides and what it can provide. The
goal is to enable future developers to use available data, arranged semantically, to answer
one form or another, the data exists already, so it does not need to be created again. It does need
to be collected and arranged so that further automatic processing is possible. In current terms,
that means that information about the various state parks needs to be put into the context of the
Semantic Web. The framework that exists for parsing and assigning metadata to information for
the Semantic Web is the Resource Description Framework (RDF). RDF “is a language for
representing information about resources in the World Wide Web” (Manola, F. & Miller, E.,
2004). Once this data is available in an RDF format, it can be retrieved with semantic query
languages like SPARQL, which is the query language specifically designed for RDF (for more
information on SPARQL, see Prud'Hommeaux, E., & Seaborne, A., 2008). In fact, this will
serve as a good test of the project. Once the data has been collected, organized, and marked up,
an application that makes a standard set of queries can be developed against it and packaged.
This will represent one of the first fully Semantic applications dealing with a tourism topic, and
it can be extended to include more kinds of destinations and other kinds of metadata.
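The collect, arrange, and query workflow described above can be illustrated with a small sketch. It is not the project's implementation: it simply holds RDF-style subject-predicate-object triples in plain Python and matches a conjunctive pattern in the manner of a SPARQL basic graph pattern. The ex: vocabulary, the park identifiers, and the activity values are assumptions invented for this example; a real application would rely on an RDF store and a published vocabulary.

```python
# Illustrative only: the ex: vocabulary and the park data are hypothetical.
TRIPLES = {
    ("ex:ParkA", "rdf:type", "ex:StatePark"),
    ("ex:ParkA", "ex:offers", "ex:Camping"),
    ("ex:ParkA", "ex:offers", "ex:Hiking"),
    ("ex:ParkB", "rdf:type", "ex:StatePark"),
    ("ex:ParkB", "ex:offers", "ex:Fishing"),
}

# Roughly equivalent SPARQL (prefix declarations omitted):
#   SELECT ?park WHERE {
#     ?park rdf:type ex:StatePark ;
#           ex:offers ex:Camping, ex:Hiking .
#   }
QUERY = [
    ("?park", "rdf:type", "ex:StatePark"),
    ("?park", "ex:offers", "ex:Camping"),
    ("?park", "ex:offers", "ex:Hiking"),
]


def is_var(term):
    """Terms beginning with '?' are query variables, as in SPARQL."""
    return term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for pattern_term, value in zip(first, triple):
            if is_var(pattern_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern_term, value) != value:
                    matched = False
                    break
            elif pattern_term != value:
                matched = False
                break
        if matched:
            yield from match(rest, triples, candidate)


parks = sorted({b["?park"] for b in match(QUERY, TRIPLES)})
print(parks)  # ['ex:ParkA']
```

Because the data are plain triples, adding a new kind of destination or a new property requires no schema change; only new triples and new query patterns are needed.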
Lest this project be regarded as a mere toy, consider the power that Semantic Web-enabled
information can have. Advertisers are continually looking to target their advertisements,
and a Semantic database of state parks and other destinations could make advertisement
integration trivial. When someone searches for Texas state parks within fifty miles of postal
code 78704 (Austin), with the additional criteria that they include camping and hiking, any
number of geo-coded support resources (retailers, outfitters, hotels, restaurants) could ensure a
relevant audience for their advertisements. Relevance in this case comes from being in the same
geographic location and appealing to the same set of interests entered by the person doing the
search. This is but one of many possible uses for such data, and it is very likely that other
uses will emerge beyond the original scope and intent of this project.
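The search described above, parks within a given radius that offer a required set of activities, can be sketched as follows. This is a minimal illustration under stated assumptions: the origin coordinates are a stand-in for the searcher's location, and every park name, coordinate, and activity list is made up. Distance is computed with the haversine great-circle formula, one common choice and not one the project prescribes.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical records: names, coordinates, and activities are invented.
ORIGIN = (30.25, -97.75)  # assumed stand-in for the searcher's location
PARKS = [
    {"name": "Park A", "lat": 30.50, "lon": -97.75,
     "activities": {"camping", "hiking"}},
    {"name": "Park B", "lat": 30.30, "lon": -97.80,
     "activities": {"fishing"}},
    {"name": "Park C", "lat": 32.25, "lon": -97.75,
     "activities": {"camping", "hiking"}},
]


def find_parks(origin, parks, radius_miles, required):
    """Names of parks inside the radius that offer every required activity."""
    return [
        p["name"] for p in parks
        if required <= p["activities"]
        and haversine_miles(*origin, p["lat"], p["lon"]) <= radius_miles
    ]


results = find_parks(ORIGIN, PARKS, 50, {"camping", "hiking"})
print(results)  # ['Park A']
```

The same filter could be run over geo-coded advertiser records, which is what would make the advertisement matching described above straightforward once the data share a common structure.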
References
Herman, I., & Stephens, S. (2007, December 4). Semantic Web Education and Outreach Interest
Group Case Studies and Use Cases. Retrieved February 13, 2008, from
http://www.w3.org/2001/sw/sweo/public/UseCases/.
Manola, F., & Miller, E. (2004, February 10). RDF Primer. Retrieved February 11, 2008, from
http://www.w3.org/TR/rdf-primer/.
Prud'Hommeaux, E., & Seaborne, A. (2008, January 15). SPARQL Query Language for RDF.
Semantic Web. (2008, February 11). Retrieved February 11, 2008, from
http://en.wikipedia.org/wiki/Semantic_Web.
TPWD: Find a Park. (n.d.). Retrieved from http://www.tpwd.state.tx.us/spdest/findadest/.
Texas Outside Guide. (n.d.). Retrieved February 11, 2008, from http://www.texasoutside.com.
Tim Berners-Lee. (2008, February 11). Retrieved February 11, 2008, from
http://en.wikipedia.org/wiki/Tim_berners-lee.