ABSTRACT
The current business model of online newspapers is to create content and publish it on the Web. Content in turn attracts users to the online presence of a newspaper. This attention of users is monetized by presenting advertisements to them. The revenue of online content providers is mainly generated by advertisements and strongly depends on the number of users consuming their content.

In general, content providers therefore aim to attract as many users as possible with their content. However, to a large extent content providers make their content available only on one individual site in an unstructured form and consequently limit the accessibility and reusability of their content. Several reasons for this exist, among them the way news has been published in the past, intellectual property rights and the way advertisements are currently sold.

In this paper we inspect the current situation of online newspapers and propose solutions for the current dilemma of content providers. We illustrate how Semantic Web technologies can help content providers to open their content, i.e. to make it reusable and integrable for third parties, without loss in revenue.

Categories and Subject Descriptors
H.4.4 [Information Systems]: Information Systems Application

General Terms
Semantic Web

Keywords
Linked Data, commercial aspects, licensing, usage restrictions

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright I-SEMANTICS 2010, September 1-3, 2010 Graz, Austria Copyright © ACM 978-1-4503-0014-8/10/09...$10.00.

1. INTRODUCTION
In recent years the Web has been undergoing a dramatic change. The concept of Web 2.0 [5] summarizes several perceived changes on the Web, which have led, amongst others, to the emergence of innovative mash-up applications aggregating and combining content and data from different sources. As described in our previous work [8], the Content Republishing phenomenon, which can be observed on the Web 2.0, boosts diffusion and dissemination of content. Users may republish content by reblogging external content on their blogs, by retweeting microblog posts from other users on their own microblog or by posting external content to their Facebook¹ wall. Content becomes increasingly separated from individual sites and can be reused and recombined, providing new value for content consumers.

¹ http://facebook.com

Professional online content providers, such as online news portals, have been present on the Web since its early days and utilize it as one medium to publish their content. At present, to a large extent professional online content providers publish their content only on dedicated Web sites, most notably on their own. To successfully operate their business, they provide advertisements, acquired by themselves or taken from third parties (e.g. Google), embedded in their content. To determine their attractiveness for potential advertising customers, online news providers measure the usage of their site with Web analytics tools. These tools allow recording and analyzing usage statistics at a very detailed level. As a rule of thumb, the more users are attracted to an online portal, the higher its attractiveness for an advertiser, because more people can be reached by an advertising campaign. Therefore the amount of displayed advertisement directly influences the revenue of content providers, while usage statistics influence it indirectly.

Consequently, content providers aim to reach and inform as many users as possible with the help of their content. But at present, they limit accessibility and reusability of their
content, as content is usually made available only on one single site, most notably on their own. Unfortunately, by doing so, they are not able to benefit from potential cross-monetization effects, as they are unwilling to disseminate their content. Currently, there is no feasible way to measure content dissemination and thereby make the increased reach transparent to advertising customers. This results in the “dilemma of content providers”.

In this paper we discuss Semantic Web technology oriented solutions to address the dilemma of content providers and illustrate how professional online content providers can exploit new possibilities emerging through the recent paradigm shift of the Web. In Section 2 we introduce current models for online advertising and methods for the reuse of content on the Web. Section 3 provides a detailed description of the current dilemma of commercial content providers and Section 4 proposes approaches for solving the previously described problems. Finally, we relate our work to existing work in this area and draw conclusions for future work.

2. BACKGROUND
In this section we shed light on information that forms the basis of our work. We introduce current models for online advertising and methods for the reuse of content on the Web. The information introduced here sets the foundations for later sections.

2.1 Advertising based Business Models of online Content Providers
At present, the revenue of online content providers is mainly generated by advertisements and therefore strongly depends on the number of users consuming their content and the advertisements.

The two most prominent advertising pricing models are:

1. Cost per impression (CPI) model:
   The cost per impression model ensures that the content provider gets a certain amount of money from the advertiser for having displayed an ad a certain number of times on its Web site. The advertisement profit depends on how often an ad is displayed and the price the advertiser is willing to pay for an ad impression. The most popular form of this advertising model is classical banner advertising.

2. Cost per click (CPC) model:
   The cost per click model demands that a visitor clicks on an ad to visit the advertiser’s site. That means the advertisement profit depends on the number of clicks on an ad and the price the advertiser is willing to pay for a click. The most popular and well known cost per click advertising program is Google Adsense².

² http://www.google.com/adsense

In addition to the advertising pricing model, the advertising serving model also influences the advertisement revenue:

1. Advertisements are served directly by (or on behalf of) the content provider:
   If advertisements are directly served by the content providers they can be either embedded statically into the page or be loaded dynamically. In the latter case advertisements can be loaded from local or remote servers. Advertisement revenue is generated directly by the content providers. If an external server is used, a (usually volume based) fee is paid for the usage of the ad server.

2. Advertisements are served by third party services:
   If third party services serve the advertisements they are loaded dynamically to the site. Content providers embed certain tags as placeholders into their Web pages. The appropriate advertisements are loaded dynamically from the ad server. The advertisement revenue is shared between advertiser and the content provider.

The two presented pricing models regulate the way money is paid from the advertiser to the publisher. The technical realization of ad serving does not depend on whether advertisements are billed by CPI or CPC models. In terms of technological realization the central difference is whether the ad is embedded statically into the page or loaded dynamically from a local or remote server.

2.2 Content Reuse Methods
If content is reused across application boundaries we must distinguish two different methods:

(1) Content Reuse by Value:
    If content is copied by value, two individual instances of the same content item are generated.
    An advantage of this traditional copy and paste mechanism is that instances of the same content item can evolve independently from each other. However, the biggest disadvantage of copying content by value is that the content is not kept up to date. If the original content changes, the copied versions are not updated. Another disadvantage is that copyright statements often prohibit copying content by value.

(2) Content Reuse by Reference (Transclusions):
    Transclusions allow including existing content into documents without duplicating it. Content is copied by reference and content values are fetched and reloaded on demand. The concept of transclusions was originally introduced in the early 1960s by Ted Nelson. Transclusions have several advantages compared with traditional cut-and-paste mechanisms: they allow for keeping copied content up to date, avoiding copyright problems and saving disk space [4]. As shown by [3], basic Web technologies such as plain HTML, JavaScript, CGI scripts and a specialized HTTP proxy application allow realizing transclusions on the Web.
    The biggest disadvantage of transclusions is that for textual content they never had their breakthrough and never found wide adoption. Furthermore, if the document from which the transcluded content originates becomes unavailable, the compound document must handle the broken transclusion.

3. PROBLEM DEFINITION
In general content providers aim to reach and inform as many users with their content as possible. If they opened their content, interlinked it with other content and described it semantically, third parties could aggregate and use
their content. Since content would become accessible via different sites, it could possibly attract more users and the value and reach of content could increase. Content providers thus could reach more users by allowing third parties (i.e. users, Web application developers, software agents) to display and use their content.

However, at present content providers are forced to limit the access to their content to their own sites for the following reasons:

1. Presenting advertisements: Currently widely used advertising methods (see Section 2.1) force content providers to keep their content locked on their sites, because ads are bound to individual Web sites instead of being bound directly to the content. Consequently the revenue of content providers depends on users consuming content via their sites. Content providers cannot open their content to make it integrable and reusable for third parties without a loss in revenue.

2. Licences: Parts of the content published by content providers usually come from third parties, such as press agencies or photographers. Original intellectual property can be held by editors who create content exclusively for a newspaper or by news agencies which (pre-)produce content for several news portals. This content falls under limited licenses. Only online newspapers paying for the content are allowed to publish it on their sites. Republishing by other sites is prohibited by the license of the content.

3. Measuring attractiveness: The attractiveness of online news sites for an advertiser is determined using performance indicators, such as page impressions, unique clients, visits or use time. The page impressions indicator shows how many pages were displayed in a distinct period of time. The unique client factor indicates the reach of a portal in terms of unique users. The visits indicator summarizes how often users visited a site and subsequently retrieved a set of pages. The use time indicates how long an individual user has stayed on a site during one visit.
   All of these traditional performance indicators operate on a site level and not on a content bits or data level. That means these indicators assume that digital content bits are only published and consumed on one individual site.

These contradicting factors cause the current dilemma of content providers.

4. SOLUTION APPROACH
In this section we refer to the three problems which force content providers to limit the access to their content (see Section 3) and show how Semantic Web technologies can be used to address these problems.

4.1 Presenting Advertisements
Since content providers need a way to open and interlink their content without losing advertisement revenues, we suggest semantically annotating content (e.g. articles and advertisements) published on a Web site to allow machines to interpret the semantics of content and the relations between them. Usage control policies allow controlling not only who may access which content, but also how the content may or may not be used afterwards [6]. Machine-interpretable usage control policies can be used to expose, for example, that an article can only be reused together with its advertisement.

Listing 1 shows an example of a semantically annotated news site. The different resources (advertisement, article, embedded video) that are displayed on the site and the relations between them are described. The article is an instance of the class Post from the SIOC³ ontology and has certain properties such as content, author and topic. The fact that the article embeds a video is exposed by using the embeds property of the SIOC ontology, which indicates that a resource embeds external content (which may be related with certain policies and licenses). The article is related with an advertisement, which is an instance of the class Ad of our advertisement ontology, by using the has_ad property. An ad has the following properties:

• has_advertiser/advertiser_of: relates an ad with an advertisement customer (instance of class Person or Organization of the FOAF⁴ ontology) who is paying for the ad

• advertises: relates an ad with a product (instance of class ProductOrService of the GoodRelations⁵ ontology) for which an advertisement is made

• code: relates an ad with its HTML and/or JavaScript code

³ http://sioc-project.org/
⁴ http://xmlns.com/foaf/spec
⁵ http://www.heppnetz.de/ontologies/goodrelations/v1

    <div xmlns="http://www.w3.org/1999/xhtml"
         xmlns:content="http://purl.org/rss/1.0/modules/content/"
         xmlns:sioc="http://rdfs.org/sioc/ns#"
         xmlns:ex="http://example.com/terms/">
      <p about="#article" typeof="sioc:Post"
         property="content:encoded">
        <p>article text</p>
        <div property="sioc:embeds">
          <p about="#video"
             property="content:encoded">video here</p>
        </div>
        <div property="ex:has_ad">
          <p about="#ad" typeof="ex:ad">
            <span property="ex:code">
              <object><embed
                src="http://mydomain.com/picturebutton.swf"
                type="application/x-shockwave-flash">
              </embed></object>
            </span>
            <span rel="ex:has_advertiser" resource=
              "http://www.gigasport.at/about/#company"/>
            <span rel="ex:advertises" resource=
              "http://www.gigasport.at/product/234/#this"/>
          </p>
        </div>
      </p>
    </div>

Listing 1: "Semantic description of advertised content"

To describe how resources can be used and reused we are using the Open Digital Rights Language (ODRL⁶).

⁶ http://odrl.net

ODRL is a vocabulary which allows expressing terms and conditions over assets (Web resources which include any physical
or digital content). ODRL allows expressing permissions which indicate the actions that a certain party (e.g. user, role, group) is permitted to perform on a specific target asset manifestation (i.e. format) or a range of manifestations of the target asset. Constraints may optionally restrict permissions, and duties may indicate requirements that must be fulfilled to receive a permission [1].

Listing 2 extends the example of Listing 1 with usage policies which indicate that the video can only be reused, i.e. be aggregated, modified and extracted (permission), together with the ad (duty). The meaning of permissions and duties is defined by the ODRL data dictionary⁷.

    <div xmlns="http://www.w3.org/1999/xhtml"
         xmlns:ox="http://odrl.net/1.1/ODRL-EX#"
         xmlns:od="http://odrl.net/1.1/ODRL-DD#">
      <p typeof="ox:Permission">
        <span rel="ox:hasasset" resource="#video"/>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="excerpt" />
        </span>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="aggregate" />
        </span>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="modified" />
        </span>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="annotated" />
        </span>
        <span rel="ox:hasduty">
          <p typeof="ox:Duty">
            <span rel="ox:hasobject" resource="#ad" />
            <span rel="ox:hasaction">
              <span typeof="ox:Action" property="ox:name"
                    content="display" />
            </span>
          </p>
        </span>
      </p>
    </div>

photographers, original intellectual property can be held by editors that created the content. Therefore content providers need a way to expose under which conditions what kind of content usage is allowed. To address this problem we again suggest the use of Semantic Web policies. With the help of policies, machines will be able to interpret whether the permission to reuse a piece of content in a certain context is given or not.

Listing 3 extends the example of Listing 1 with usage policies which indicate that the article can only be displayed. Content reuse actions, such as extractions, aggregations and modifications, are explicitly prohibited by machine-interpretable policies.

    <div xmlns="http://www.w3.org/1999/xhtml"
         xmlns:ox="http://odrl.net/1.1/ODRL-EX#"
         xmlns:od="http://odrl.net/1.1/ODRL-DD#">
      <p typeof="ox:Prohibition">
        <span rel="ox:hasasset" resource="#article" />
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="extract" />
        </span>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="aggregate" />
        </span>
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="modify" />
        </span>
      </p>
      <p typeof="ox:Permission">
        <span rel="ox:hasasset" resource="#article" />
        <span rel="ox:hasaction">
          <span typeof="ox:Action" property="ox:name"
                content="display" />
        </span>
      </p>
    </div>

Listing 3: "Copy restricted content"
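
To make the intended effect of the policies in Listings 2 and 3 more concrete, the following Python sketch shows how a consuming application might decide whether a reuse action is allowed. It is only an illustration under stated assumptions: the `Policy` class, the `is_allowed` function and the evaluation rules (prohibitions override permissions, a permission with a duty only applies once the duty is fulfilled, everything else is denied) are hypothetical and are not taken from the ODRL specification or from any existing library. The action names are copied from the listings as they appear there.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# A duty is modelled as an (action, object) pair, e.g. ("display", "#ad").
Duty = Tuple[str, str]


@dataclass(frozen=True)
class Policy:
    """One ODRL-style rule over a single asset (hypothetical model)."""
    kind: str                    # "permission" or "prohibition"
    asset: str                   # fragment identifier, e.g. "#video"
    actions: FrozenSet[str]      # action names as written in the listings
    duty: Optional[Duty] = None  # requirement attached to a permission


def is_allowed(policies, asset, action, fulfilled=frozenset()):
    """Return True if `action` on `asset` is permitted.

    Assumed semantics: a matching prohibition always wins, a permission
    with a duty applies only if that duty is in `fulfilled`, and any
    action that is not explicitly permitted is denied.
    """
    matching = [p for p in policies if p.asset == asset and action in p.actions]
    if any(p.kind == "prohibition" for p in matching):
        return False
    return any(
        p.kind == "permission" and (p.duty is None or p.duty in fulfilled)
        for p in matching
    )


# Mirrors Listing 2: the video may be reused only together with the ad.
VIDEO_POLICY = Policy(
    kind="permission",
    asset="#video",
    actions=frozenset({"excerpt", "aggregate", "modified", "annotated"}),
    duty=("display", "#ad"),
)

# Mirrors Listing 3: the article may only be displayed.
ARTICLE_POLICIES = [
    Policy("prohibition", "#article", frozenset({"extract", "aggregate", "modify"})),
    Policy("permission", "#article", frozenset({"display"})),
]

POLICIES = [VIDEO_POLICY, *ARTICLE_POLICIES]

if __name__ == "__main__":
    # Aggregating the video without showing the ad is denied.
    print(is_allowed(POLICIES, "#video", "aggregate"))
    # Once the duty "display the ad" is fulfilled, it is allowed.
    print(is_allowed(POLICIES, "#video", "aggregate", {("display", "#ad")}))
    # The article may be displayed but not aggregated.
    print(is_allowed(POLICIES, "#article", "display"))
    print(is_allowed(POLICIES, "#article", "aggregate"))
```

In a real setting the policies would be extracted from the RDFa markup rather than written by hand, and the default-deny rule chosen here is only one possible reading of how unlisted actions should be treated.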