Integrating TV Ratings with Multimedia Content

Ryota Hinami, University of Tokyo
Shin’ichi Satoh, National Institute of Informatics

A general framework for audience behavior mining integrates the analysis of TV ratings and multimedia content. Focusing on change points in TV ratings data, the authors propose applications for interactive mining of audience behavior and for detecting popular news topics.

TV ratings have been investigated for many years, but most work1,2 has focused on forecasting the ratings of particular TV programs. The motivation was to estimate the cost of TV advertising, because advertising rates are directly linked to TV ratings. Few works have targeted the mining of ratings data, even though such data contains valuable information. Moreover, researchers have not investigated how to integrate TV ratings with multimedia content—such as with video data from TV programs.

Integrating TV ratings with multimedia content could help identify relationships between audience behaviors and TV program content. Consider, for example, Figure 1, which shows the ratings data for two TV stations covering the flooding of the Kinugawa River on 10 September 2015, along with thumbnail images from the two broadcasts. Both stations were […] “blue” station. Similar behavior from the “blue” station to the “green” station also occurs at the gray dotted line, where again, the thumbnails show a change in live coverage. This indicates that viewers were greatly interested in this event and switched between stations as needed for continued event coverage. This example captures audience behavior for just one event, but analysis of larger amounts of such data could enable the automatic extraction of audience behavior patterns. Furthermore, integrating such data with multimedia content could reveal hidden behaviors.

Understanding audience behavior is useful for several reasons. First, it indicates what is of interest to people, which is important for creating TV programs that attract viewers. It’s also important in terms of advertising, because identifying patterns that lead to higher ratings can help broadcasters obtain more sponsors. Finally, it helps with risk management by revealing how people get information following a disaster, which could help determine how best to convey important information to people in an emergency. (For more information, see the “TV Audience Ratings” sidebar.)

Here, we focus on discovering knowledge in TV ratings data by combining such data with multimedia content (videos and transcripts). Our goal is to establish a generic framework for mining TV audience behavior from TV ratings. Although user behavior is being explored by many works,3,4 most of this recent work focuses on social networks; little work has addressed the behavior of TV audiences.

To discover relationships between TV ratings and multimedia content, we focus on the change points—that is, the points in time when people first tune in to a particular TV program. Because these points reflect the active intention of TV viewers, they contain valuable information about viewers’ interests. We describe these points using visual features extracted from video and keywords extracted from transcripts. Because the number of such points is huge, we apply various filters. Here, in this extended version of our earlier work,5 we propose two applications using our framework.
44 IEEE Transactions on Knowledge and Data Engineering(Volume: 24, Issue: 2, Apr.-June 2017)
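To make the data model concrete, a change point as described above can be pictured as a small record combining the ratings change with the multimedia descriptions attached to it. The following is a minimal, hypothetical sketch; the field names are illustrative and not the actual schema used in this work.

```python
from dataclasses import dataclass, field

@dataclass
class ChangePoint:
    """Illustrative record for one micro-level change point."""
    channel: str           # station identifier
    minute: int            # minute index within the day
    delta: float           # per-minute ratings change, in percentage points
    kind: str              # "increase" or "decrease"
    visual: dict = field(default_factory=dict)    # visual features of the frame
    keywords: list = field(default_factory=list)  # nouns extracted from captions

# Example: a 1.3-point jump on channel 2 during flood coverage
cp = ChangePoint(channel="ch2", minute=932, delta=1.3, kind="increase",
                 keywords=["Kinugawa", "flood"])
```

Filtering and aggregation then operate over collections of such records, grouping by the attached descriptions rather than by the raw ratings curve.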
TV Audience Ratings
TV audience ratings (TV ratings) are a key indicator in the field of TV broadcasting; they’re used to assess the popular-
ity of TV programs. A program’s TV rating indicates the percentage of all TV households tuned in to that program. TV
ratings are a standard measure used to determine the impact of advertising—that is, how many targeted people are
watching the advertisement. Broadcasters focus on increasing the ratings of their programs to acquire more
sponsors.
TV ratings can also be used as a sensor to gauge the interests of people. Programs that capture people’s interests
get high ratings. Conversely, if people are not interested in a program’s content, they switch to another channel,
and the ratings decrease. TV ratings can thus act as social sensors that indicate popular topics and social trends—
such as what types of news are of interest and which performers are currently popular. Discovering such knowledge
from TV ratings data can help broadcasters create TV programs that attract more people.
TV ratings also include important information for risk management. In an emergency, such as a natural disaster,
the government must deliver correct information to people in a timely manner. TV is a key medium for conveying
information to many people in real time. By analyzing TV ratings to determine how people get information from TV,
we can judge whether critical information has been correctly delivered and can take measures to improve the dis-
semination of such information.
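Because a rating is simply the percentage of all TV households tuned in, converting a rating to an absolute audience size is a one-line calculation. The sketch below uses the Kanto-area household figure cited in the main text; the function name is ours.

```python
# A TV rating is the percentage of all TV households tuned in to a program.
KANTO_TV_HOUSEHOLDS = 18_200_000  # Kanto-area figure cited in the main text

def rating_to_households(rating_percent: float,
                         universe: int = KANTO_TV_HOUSEHOLDS) -> int:
    """Convert a rating (in percent) to an estimated household count."""
    return round(universe * rating_percent / 100)

# A 1 percent ratings increase corresponds to about 182,000 households.
print(rating_to_households(1.0))
```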
TV Ratings in Japan

In our work, we use TV ratings provided by Video Research, which started audience measurement in Japan in 1962. It has been the only […]
[…]ing the other channels and continues watching that channel, even after the targeted drama has ended. Such passive watching behavior does not reflect the audience’s intention.

The only cases when we can directly observe the intentions of audiences are when they change channels after “zapping” (also known as “channel surfing”). In the examples in Figure 2, channel switching when the topic changes or when a CF starts corresponds to such cases. Audiences change channels and start zapping when uninteresting content (such as a CF) starts. Moreover, because audiences get the information of all channels by zapping, they can select the channel of most interest. In other words, such cases show the active intentions of audiences. Our framework extracts such active-audience intentions by focusing on the change points in TV ratings, which facilitates the discovery of audience interests.

Figure 2. Example behaviors of two TV audiences: one audience member (a) searches for information on specific news topics and another (b) turns on the TV to watch a specific drama. The channels that the audience member watches are highlighted and other channels are shaded. The audience member only has information on the highlighted channels. (CF: commercial film.)

A Framework for Mining Audience Behavior

To automatically find particular patterns or events indicating the interests of people, we focus on the change points in ratings data. In particular, we focus on the micro-level change points, where the per-minute ratings change significantly in a minute. Because the number of viewers increases or decreases suddenly at such points, we assume that these points provide more valuable information than other points, so we detect and analyze them in combination with video content and other metadata, such as captions. We can better understand the interests of viewers by analyzing the content corresponding to these change points.

The meaningful patterns in TV ratings data can be discovered by mining a large number of change points. The pipeline of the proposed framework, shown in Figure 3, has three steps:

1. Detect change points: Detect as many micro-level change points as possible from ratings data.
2. […] describing, filtering, and aggregating the change points using multimedia content.

Various features are extracted from one-minute data for each point to attach rich descriptions that characterize the points. These characterizations are used to filter and aggregate the points, and the filtered points are then visualized. Visualizing the statistics of change points helps us better understand patterns in ratings increases. We can interactively add or change filters to visualize more detailed cases and find meaningful patterns.

Figure 3. The pipeline of the proposed framework: ratings data, video, captions, and EPG information feed into change-point detection, change-point description, filtering and aggregation, and visualization.

We use the following multimedia data to describe the change points:

- video—that is, broadcast videos corresponding to the TV ratings;
- captions—that is, text transcripts of TV programs; and
- electronic program guide (EPG) information—that is, information on TV programs, including title, category, and description.

We can describe the content at each change point with rich information using this data. Filtering and aggregating described points enables a precise analysis of the relationship between the ratings and the multimedia content.

Functions

Here, we review the functions of the proposed framework.

Detecting Change Points

We adopt a simple approach to detect change points—we identify the points where the ratings increase (or decrease) above a predetermined threshold in a minute. The rate of increase (or decrease) for each minute is calculated as the difference between the previous and current per-minute ratings. A certain number of households start watching a particular TV program at change points, either by turning on the TV or switching from other programs. For example, given that the Kanto area has 18.2 million households with a TV, a rating increase of 1 percent means that over 182,000 households tuned into the program.

Describing Change Points

We use several features to describe change points.

Visual features. We use several visual features to characterize images (summarized in Table 1).7 We use color and texture as low-level visual features, and we use object and emotion features as mid-level features. The details of each feature are described elsewhere.5

Table 1. Visual features for each frame.

Object features. We generate two object category features based on the ImageNet large-scale visual recognition competition (ILSVRC) classification score.8 First, we use the scores for the nine top-level categories of ImageNet, calculated by aggregating the classification scores for 1,000 object categories. Second, we define new object categories suitable for TV content by clustering the features of convolutional neural networks. On the basis of these categories, binary features are obtained for each frame, where each dimension indicates whether the frame belongs to each cluster. Each generated cluster represents certain objects or scenes that frequently appear in TV content. Figure 4 shows three examples of created clusters (representing sumo wrestling, other sports, and weather reports).

Figure 4. Examples of clusters created for TV content: the clusters represent (a) sumo wrestling, (b) other sports, and (c) weather reports.

Keywords. Keywords are extracted from the transcripts at each change point. The words that characterize the TV content are selected as keywords. We first extract nouns from captions and then remove words that are not important to characterize the content. The details of keyword extraction are described elsewhere.5

We detected CFs using a method based on frequent sequence mining.9 We used two filters to reject change points. The first filter rejects change points that occur during a CF and within two minutes after the CF. This helps remove changes motivated by a CF (instead of by user interests). The second filter extracts change points corresponding to the transition from a channel showing a CF to another channel, which is used to focus on people’s interest in the program content.
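The detection rule and the first CF filter described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the authors' implementation: `ratings` is a per-minute series (in percent) for one channel, and `cf_minutes` is a set of minutes already detected as CF airtime.

```python
def detect_change_points(ratings, inc_th=0.7, dec_th=0.9):
    """Flag minutes whose rating differs from the previous minute by more
    than a predetermined threshold (micro-level change points)."""
    points = []
    for t in range(1, len(ratings)):
        delta = ratings[t] - ratings[t - 1]
        if delta >= inc_th:
            points.append((t, delta, "increase"))
        elif delta <= -dec_th:
            points.append((t, delta, "decrease"))
    return points

def reject_cf_points(points, cf_minutes, window=2):
    """First filter: drop change points during a CF or within two minutes
    after one, since such changes are motivated by the CF itself."""
    near_cf = set()
    for m in cf_minutes:
        near_cf.update(range(m, m + window + 1))
    return [p for p in points if p[0] not in near_cf]

ratings = [10.0, 10.2, 11.5, 11.4, 10.3]
pts = detect_change_points(ratings)  # minutes 2 (increase) and 4 (decrease)
kept = reject_cf_points(pts, cf_minutes={4})  # minute 4 falls in a CF window
```

The thresholds of 0.7 (increase) and 0.9 (decrease) percentage points match those used in the experiments described later; in practice they would be tuned to the granularity of the ratings data.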
Figure 5. The interface for the user behavior mining tools. It shows how to use the tools and what each function indicates: applying, adding, and changing filters; feature statistics of the filtered points; the feature distribution; breakdowns by object and by dominant color and emotion; details of a selected feature; a ratings graph; and a list of change points.
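The “feature statistics” view in Figure 5 can be sketched as a difference of means between increase and decrease points. The code below is an illustrative approximation, assuming each change point carries normalized per-feature values in [0, 1]; the function name and data layout are ours.

```python
from statistics import mean

def feature_statistics(points):
    """For each visual feature, the difference between its mean value over
    increase points and over decrease points. A large positive difference
    suggests the feature is associated with ratings increases."""
    by_kind = {"increase": [], "decrease": []}
    for kind, features in points:
        by_kind[kind].append(features)
    names = {n for _, f in points for n in f}
    diffs = {}
    for n in names:
        inc = mean(f.get(n, 0.0) for f in by_kind["increase"])
        dec = mean(f.get(n, 0.0) for f in by_kind["decrease"])
        diffs[n] = inc - dec
    return diffs

toy = [("increase", {"blue": 0.9, "brown": 0.1}),
       ("increase", {"blue": 0.7, "brown": 0.3}),
       ("decrease", {"blue": 0.2, "brown": 0.8})]
stats = feature_statistics(toy)  # "blue" scores higher at increase points
```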
We used an “increase” threshold of 0.7 percent and a “decrease” threshold of 0.9 percent; 67,728 and 41,367 points were detected, respectively. We computed all of the visual features listed in Table 1 for the detected points, using the features to filter and aggregate points. We then rejected the change points based on user-specified filters and visualized the aggregated result. Analyzing the increase and decrease points revealed valuable knowledge, such as which types of programs were of interest to people, which program features resulted in high TV ratings, and how people behaved following a certain type of event.

Figure 5 shows the system interface and flow. In the following, we exemplify system usage by analyzing what type of news captured users’ interests.

Filtering. The user first specifies the filters in accordance with the analysis target. Table 2 summarizes the filters that can be used. Here, because we want to analyze the news, we select the news category filter (see the top left corner of Figure 5).

Statistical analysis. The points after filtering are aggregated and visualized, which provides clues for finding meaningful patterns of TV rating increases/decreases. That is, the visualized results display information that should help identify which feature is related to the TV ratings. For each visual feature, the difference in the means of the feature values over the increased/decreased change points is shown in Figure 5 (see the “feature statistics”). This information indicates which features are correlated with TV ratings. Features with a large difference seem to contribute to an increase in the TV rating, and vice versa.

A breakdown of the change points by dominant color and emotion, in the form of pie charts, is also displayed in Figure 5. “Dominant color” means the color with the highest value of the 11 basic colors. The same is true of dominant emotion. The user selects the feature that seems to be important from these charts in order to analyze it in detail. In this example, the user selects the feature “blue.”

Feature details. The system then shows, in the form of a graph, details for the features selected by the user. The graph shows the distribution of the selected features in terms of increase and decrease points. It also shows a breakdown by dominant generated objects for the change points with the 200 highest values (the “bluest” objects, in this example). Thumbnails of the points corresponding to each object are also shown. These charts reveal that blue is the most significant color because of the weather reports, indicating that the viewers were highly interested in such reports.

In addition to these charts, the system shows lists of points with thumbnails and related information, sorted by the feature value (see the list of change points in Figure 5). The system also provides more detail for each point in the form of a “ratings graph” along with the program guide information.

Refiltering with interactive feedback. We can interactively add or change filters to discover additional patterns. In the example shown in Figure 5, the object corresponding to weather reports (ID = 85) can be filtered out by adding a filter for object features. This filter lets us analyze popular news topics after removing the noise from the weather reports. In this way, numerous patterns of audience behavior can be discovered by combining various types of filtering with interactive feedback.

Figures 6 and 7 show examples of questions that can be asked to obtain knowledge from TV ratings data using our framework. The first three questions (Q1–Q3, shown in Figure 6) represent an analysis for finding a particular pattern of a ratings increase for a specific category—here, the categories were news, dramas, and variety shows. In addition to the program category filter, we used the filters that extract transitions made during a CF to focus on people’s interest in the program content.

For the news programs (Figure 6a), we found that weather reports and sports news were popular. For dramas (Figure 6b), scores for “black” and “dominance” (computed from the brightness and saturation) were the highest. The example thumbnails indicate that scenes where the screen is heavily dark tend to be serious scenes during which viewers usually stay on that channel. For variety shows (Figure 6c), we found that high entropy and amusement tend to increase TV ratings. The example thumbnails indicate that lively scenes match the needs of people watching variety shows. The difference in user behavior for variety shows versus dramas becomes clear when comparing the visualizations produced by our system.

In contrast to Q1–Q3, our fourth question (Q4) focuses on the decrease points (see Figure 7a). Because the initial statistics suggest that there’s a correlation between the color brown and decrease points, we selected brown and found that people tended to switch channels during news commentary and live televised broadcasts of parliamentary proceedings. On the basis of the results, we filtered out objects corresponding to news commentary and parliamentary proceedings to analyze other factors. After filtering, black seemed the most significant feature. By selecting and further analyzing this color feature, we found that live broadcasts in the dark are also correlated with a ratings decrease.

For Q5, we tested whether animals influence ratings (see Figure 7b). We first compared the distribution of the scores for an animal between the increase and decrease points for certain categories and found that a difference was observed, especially for variety shows. To discover more about how an animal can contribute to TV ratings, we used an animal filter with a threshold score of 0.3. Analyzing decrease points revealed that animals do indeed seem to influence ratings, except for animals in water.

For Q6, we targeted a specific event (the flooding of the Kinugawa River) to learn how viewers behaved following that event. We searched for the target event by combining the following filters: month, program category, and keywords. The increase and decrease points revealed that people tended to watch live broadcasts and turn to other channels when live event coverage ends.

News Event Detection and Analysis

We developed an application to detect news stories of interest by analyzing the set of change points in news programs. We use increase points observed in two categories—news and informational programs—which reflect viewers’ interest in news stories. In addition to finding popular
Figure 6. Examples of analyzing user behavior using the proposed mining system to detect an increase in ratings for a specific category. These examples answer questions about (a) news, (b) drama, and (c) variety shows.
(a) Q1: In what type of news are people interested? (Filters: news; news with the weather-report object (85) removed.)
(b) → A: People tend to stay on a channel in serious scenes when the screen becomes dark. (Filter: drama.)
(c) Q3: What is the key indicator to increase ratings for variety shows? → A: People want lively scenes with high entropy in variety shows. (Filter: variety.)
news topics, our analysis revealed other valuable knowledge. For example, by comparing broadcasting times and ratings increases, we could analyze whether the intention of TV stations matched the interests of viewers. In addition, analyzing audience behaviors when watching news about a disaster can help broadcasters and others determine how best to disseminate information regarding disaster preparedness to maximize who sees, retains, and acts on such information.

We detect news stories by analyzing keywords that co-occur with increase points. We first construct a graph from change points, with each node corresponding to an increase point. We can find clusters that consist of multiple change points by mining the graph. The clusters that contain the most increase points correspond to the news stories of most interest. The details of the detection algorithms are described elsewhere.5

To evaluate whether our method could detect important news stories, we used the top 10 news stories in Japan for 2015, as reported by the Yomiuri newspaper, as the ground truth. Table 3 shows the correspondence between the ground truth and our top 10 detected results. The CP column shows the rank based on the number of change points in each detected news story. This result shows our method detected five of the 10 ground-truth stories. The Yomiuri rankings show the “big” news stories in general, while our rankings show the news that viewers were most interested in (because many people switch channels to a specific news story at the change point). Our results revealed that viewers tended to switch the channel to negative news stories, such as those about a murder or disaster, to proactively collect more information about the specific news stories. However, viewers didn’t actively switch the channel to positive news stories, such as those covering a Nobel Prize award or the Rugby World Cup; the viewers instead passively watched the stories.

Table 3 also shows the accumulated airtime of each news story. The murder cases that were ranked in our results but not ranked by the
Figure 7. Examples of analyzing user behavior using the proposed mining system to (a) detect a rating decrease for news programs, (b) determine whether animals contribute to ratings, and (c) detect audience behavior related to an event.
(a) Q4: When do people switch the channel in news programs? → A: People tend to switch channels during news commentary, parliamentary proceedings, and live broadcasts in the dark. (Filters: news; news with objects 18, 82, and 67 rejected.)
(b) Q5: Do animals influence ratings? → A: Animals influence ratings, especially for variety shows, except for animals in water.
(c) Q6: How did the audience behave when the Kinugawa River flooded? → A: People tended to watch live broadcasts and switch the channel when the scene changed. (Filters: news+information; keyword “Kinugawa.”)
Yomiuri newspaper (the Osaka, Kawasaki, and Wakayama murder cases) were ranked highly in airtime, while the important news stories that were not detected by our method, such as the launch of the Hokuriku Shinkansen, also received much airtime. This indicates that TV stations broadcast important news stories equally to some extent, regardless of people’s level of interest, while news stories of public interest receive more airtime.

Table 3. Correspondence between the top 10 news stories in the Yomiuri newspaper and our top 10 detection results based on the number of change (increase) points.

We also analyzed how the information related to a disaster is broadcast and viewed. We compared three disasters: the flooding of the Kinugawa River on 10 September 2015 (Figure 8a), the Great East Japan Earthquake on 11 March 2011 (Figure 8b), and the Great Hanshin/Awaji Earthquake on 17 January 1995 (Figure 8c). We extracted the broadcasting time for each disaster with appropriate keywords and compared the three disasters using airtime and ratings within the broadcasting time.

Figure 8 shows the accumulated airtime per day and the highest per-minute rating in a day within broadcasting time for each of the three disasters. Airtime is indicative of the TV stations’ intent, while we used the highest per-minute rating to show what percentage of people watched news about the disaster that day. This result shows that the flooding of the Kinugawa received the most airtime (Figure 8a), and many people watched the broadcast (26 percent, which is significantly high in Japan). On the other hand, the Great Hanshin Earthquake, which occurred 20 years earlier, was not as widely broadcast and thus had lower ratings (Figure 8c). Although ratings for the Great East Japan Earthquake, which happened four years before the Kinugawa flooding, are not as significant as for the flooding, on that particular day (11 March 2011), TV stations spent a lot of time broadcasting the disaster, and nearly 20 percent of potential viewers were watching. Moreover, this event occurred between two holidays, which also might have been a contributing factor in terms of the number of people watching. This suggests the intention of
Figure 8. The accumulated airtime and the highest per-minute rating per day in terms of broadcasts about the three disasters: (a) the
flooding of the Kinugawa in 2015, (b) the Great East Japan Earthquake in 2011, and (c) the Great Hanshin Earthquake in 1995.
TV stations to make people more aware of disaster preparedness.

[…] specifically. MM

Acknowledgments
This special issue is a collaboration between the 2016 IEEE International Symposium on Multimedia (ISM 2016) and IEEE MultiMedia. This article is an extended version of “Audience Behavior Mining by Integrating TV Ratings with Multimedia Contents,” presented at ISM 2016. This article is based on the joint research […]

References
1. P.J. Danaher and T.S. Dagger, “Using a Nested Logit Model to Forecast Television Ratings,” Int’l J. Forecasting, vol. 28, no. 3, 2012, pp. 607–622.
2. P.J. Danaher, T.S. Dagger, and M.S. Smith, “Forecasting Television Ratings,” Int’l J. Forecasting, vol. 27, no. 4, 2011, pp. 1215–1240.
3. H. Yin et al., “A Temporal Context-Aware Model for User Behavior Modeling in Social Media Systems,” Proc. 2014 ACM SIGMOD Int’l Conf. Management of Data (SIGMOD), 2014, pp. 1543–1554.
4. A. Ferraz Costa et al., “RSC: Mining and Modeling Temporal Activity in Social Media,” Proc. 21st ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (SIGKDD), 2015, pp. 269–278.
5. R. Hinami and S. Satoh, “Audience Behavior Mining by Integrating TV Ratings with Multimedia Contents,” IEEE Int’l Symp. Multimedia (ISM), 2016, pp. 44–51.
6. TV Rating Guide Book [in Japanese], Video Research, 2016; www.videor.co.jp/rating/wh/rgb201610.pdf.
7. J. Machajdik and A. Hanbury, “Affective Image Classification Using Features Inspired by Psychology and Art Theory,” Proc. 18th ACM Int’l Conf. Multimedia (MM), 2010, pp. 83–92.
8. A. Krizhevsky, I. Sutskever, and G.E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” Proc. Advances in Neural Information Processing Systems (NIPS), 2012, pp. 1097–1105.
9. X. Wu and S. Satoh, “Ultrahigh-Speed TV Commercial Detection, Extraction, and Matching,” IEEE Trans. Circuits and Systems for Video Technology, vol. 23, no. 6, 2013, pp. 1054–1069.
10. N. Katayama et al., “Mining Large-Scale Broadcast Video Archives Towards Inter-Video Structuring,” Advances in Multimedia Information Processing—PCM 2004, LNCS 3332, Springer, 2005, pp. 489–496.

Ryota Hinami is a PhD candidate in the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, at the University of Tokyo. His research interests include multimedia and computer vision. Hinami received an MS in information and communication engineering from the University of Tokyo. Contact him at hinami@nii.ac.jp.

Shin’ichi Satoh is a professor in the Digital Content and Media Sciences Research Division at the National Institute of Informatics (NII), Japan. His research interests include image and video analysis, database construction and management, image and video retrieval, and knowledge discovery based on image and video analysis. Satoh has a PhD in information engineering from the University of Tokyo. Contact him at satoh@nii.ac.jp.