Tobacco Control


International Agency for Research on Cancer

World Health Organization

Volume 12

Methods for Evaluating Tobacco Control Policies


This Handbook was made possible thanks to the generous funding by the Ministère de la Santé, de la Jeunesse et des Sports
International Agency for Research on Cancer

The International Agency for Research on Cancer (IARC) was established in 1965 by the World Health Assembly, as an
independently financed organization within the framework of the World Health Organization. The headquarters of the Agency
are in Lyon, France.
The Agency conducts a programme of research concentrating particularly on the epidemiology of cancer and the
study of potential carcinogens in the human environment. Its field studies are supplemented by biological and chemical
research carried out in the Agency's laboratories in Lyon and, through collaborative research agreements, in national
research institutions in many countries. The Agency also conducts a programme for the education and training of
personnel for cancer research.
The publications of the Agency contribute to the dissemination of authoritative information on different aspects of
cancer research. Information about IARC publications, and how to order them, is available via the Internet at:
This publication represents the views and opinions of an IARC Working Group on Methods for Evaluating Tobacco
Control Policies which met in Lyon, France, 12 - 19 March 2007.


The IARC Tobacco Control Handbook Volume 12 was funded by the Ministère de la Santé, de la Jeunesse
et des Sports, France.

IARC Handbooks of Cancer Prevention

Published by the International Agency for Research on Cancer,

150 cours Albert Thomas, 69372 Lyon Cedex 08, France

International Agency for Research on Cancer, 2008

Table of Contents

List of participants
Acknowledgements
Preface

Chapter 1
Ensuring effective evaluation of tobacco control interventions

Chapter 2
General methods and common measures
Section 2.1
The importance of design in the evaluation of tobacco control policies
Section 2.2
Developing and assessing comparable questions in cross-cultural survey research on tobacco

Chapter 3
Outcomes and major determinants
Section 3.1
Measuring tobacco use behaviours
Section 3.2
General mediators and moderators of tobacco use behaviours
Section 3.3
Measurement of nicotine dependence

Chapter 4
Existing data sources
Section 4.1
Data sources for monitoring tobacco control policies
Section 4.2
Using production, trade and sales data in tobacco control
Section 4.3
Data sources for monitoring global trends in tobacco use behaviours

Chapter 5
Strategies for evaluating specific policy domains
Section 5.1
Measures to assess the effectiveness of tobacco
Section 5.2
Measures to assess the effectiveness of smoke-free policies
Section 5.3
Measures to assess the effectiveness of tobacco product regulation
Section 5.4
Measures to assess the effectiveness of restrictions on tobacco marketing communications
Section 5.5
Measures to assess the effectiveness of tobacco product labelling policies
Section 5.6
Measures to assess the impact of anti-tobacco public communication campaigns
Section 5.7
Measures to assess the effectiveness of tobacco cessation interventions

Chapter 6
Summary

Chapter 7
Recommendations

References
Appendices
Working Procedures for the IARC Handbooks of Tobacco Control

List of Participants

Ron Borland (Co-Chair)
Cancer Control Research Institute
The Cancer Council Victoria
1 Rathdowne Street
Carlton, Victoria 3053
Australia

K. Michael Cummings (Co-Chair)
Department of Health Behaviour
Roswell Park Cancer Institute
Elm and Carlton Streets
Buffalo, NY 14263
USA

Timothy Baker (not attending)
Center for Tobacco Research and Intervention
University of Wisconsin Medical School
1930 Monroe Street, Suite 200
Madison, WI 53711-2027
USA

Ursula Bauer
Tobacco Control Program
New York State Department of Health
ESP Corning Tower, Room 710
Albany, NY 12237-0676
USA

Frank J. Chaloupka
Economics, College of Business Administration
Health Policy and Administration
University of Illinois at Chicago
601 S. Morgan St, Room 2103
Chicago, IL 60607-7121
USA

Carolyn Dresler (not attending)
Tobacco Prevention and Cessation Program
Arkansas Department of Health
4815 W Markham St.
PO Box 1437, Slot H-3
Little Rock, AR 72203-1437
USA

Jean-Francois Etter
Faculte de Medecine
Universite de Geneve
1 rue Michel-Servet
CH-1211 Geneve 4
Switzerland

Geoffrey T. Fong
Ontario Institute for Cancer Research
and Department of Psychology
University of Waterloo
200 University Avenue West
Waterloo, Ontario N2L 3G1
Canada

Gary A. Giovino
Department of Health Behavior
School of Public Health and Health Professions
SUNY at Buffalo
622 Kimball Tower
Buffalo, NY 14214-3079
USA

G. Emmanuel Guindon
Centre for Health Economics and Policy Analysis
Health Sciences Centre 3H1 area
McMaster University
1200 Main Street West
Hamilton, Ontario L8N 3Z5

Prakash C. Gupta
Healis-Sekhsaria Inst. for Public Health
Plot No. 28, Sector 11
CBD Belalpur
601/B Great Eastern Chambers
Navi Mumbai
India

David Hammond
Department of Health Studies and Gerontology
University of Waterloo
200 University Avenue West
Waterloo, Ontario N2L 3G1
Canada

Gerard Hastings (not attending)
Centre for Tobacco Control Research
University of Stirling and the Open University
Stirling FK9 4LA
Scotland

Andrew Hyland
Department of Health Behaviour
Roswell Park Cancer Institute
Elm and Carlton Streets
Buffalo, NY 14263
USA

Luk Joossens (not attending)
Belgian Foundation Against Cancer
479 Chaussée de Louvain
B-1030 Brussels
Belgium

Alan Lopez (not attending)
The University of Queensland
Herston Road
Herston Qld 4006
Australia

Anne Marie MacKintosh (not attending)
Institute for Social Marketing
University of Stirling and the Open University
Stirling FK9 4LA
Scotland

Ann McNeill
Division of Epidemiology & Public Health
University of Nottingham
Clinical Sciences Building
Nottingham NG5 1BP

Mark Parascandola
Tobacco Control Research Branch
National Cancer Institute
6130 Executive Blvd. MSC 7337
Bethesda, MD 20892
USA

Armando Peruga
Tobacco Free Initiative
World Health Organization
Geneva
Switzerland

Patrick Petit
Tobacco Free Initiative
World Health Organization

Megan E. Piper
Center for Tobacco Research & Intervention
University of Wisconsin Medical School
1930 Monroe St., Suite 200
Madison, WI 53711-2027
USA

Martina Potschke-Langer
Cancer Prevention and WHO Collaborating Center for Tobacco Control
Deutsches Krebsforschungszentrum
Im Neuenheimer Feld 280
D-69120 Heidelberg
Germany

James F. Thrasher
Health Promotion, Education and Behavior
School of Public Health
University of South Carolina
800 Sumter Street, Room # 215
Columbia, SC 29208
USA; and
Instituto Nacional de Salud Pública,
Cuernavaca,
Mexico

Charles (Wick) Warren
Office on Smoking and Health
Centers for Disease Control and Prevention
4770 Buford Highway, NE
Atlanta, GA 30341-3717
USA

Representatives

Nathan Jones
Office on Smoking and Health
Global Tobacco Control Program
Centers for Disease Control and Prevention
4770 Buford Highway, NE
Atlanta, GA 30341-3717
USA

IARC Secretariat

Andrea Altieri
Robert Baan
Julien Berthiller
Paolo Boffetta (Group Head)
Lars Egevad
Fabrizio Giannandrea (Post-Meeting)
Julia Heck
María E. León (Responsible Officer)
Beatrice Secretan
Kurt Straif

Administrative assistance

Catherine Benard (Secretarial)
Latifa Bouanzi (Library)
John Daniel (Editor)
Jennifer Donaldson (Editor)
Roland Dray (Graphics)
Sharon Grant (Library)
Georges Mollon (Photography)
Sylvia Moutinho (Secretarial)
Annick Rivoire (Secretarial)
Josephine Thevenoux (Layout)

The Working Group acknowledges the major contribution to the work presented in this Handbook by
Mary E. Thompson (University of Waterloo, Waterloo, Ontario, Canada), Daniel M. Bolt (University of
Wisconsin, Madison, Wisconsin, USA), Matthew C. Farrelly (RTI International, Research Triangle Park,
North Carolina, USA), Timothy P. Johnson (University of Illinois, Chicago, Illinois, USA) and Karl E.
Wende and Jennifer L. Graf (University of Buffalo, Buffalo, New York, USA).

The IARC secretariat is grateful to the staff of the Libraries at the International Agency for Research on
Cancer, Lyon, France and the World Health Organization, Geneva, Switzerland.

The IARC Handbooks on Cancer Prevention have traditionally presented the scientific evidence on the effects of interventions, such as sun protection or dietary chemoprevention, on preventing cancer, as well as the evaluation of the strength of the evidence in addressing the alleged protective effect.

In Volume 11, the first dedicated to tobacco control, the effects of smoking cessation on the risk of developing or dying of cancer, cardiovascular diseases, or chronic obstructive pulmonary disease were examined. In that volume, the health benefits of quitting smoking were investigated by comparing epidemiological studies reporting the risk of disease in never, former, and current smokers, as well as differences in risk with length of smoking abstinence, when available. An evaluation of the weight of the evidence was given for each disease contemplated. For IARC, Volume 11 was exceptional in including disease outcomes other than cancer. Given the prominent etiologic position of smoking in other disease outcomes, limiting the review to cancer would have given a partial picture of the benefits derived from quitting smoking.

How individuals overcome the smoking habit to achieve sustained abstinence has not been covered in the Handbooks. However, we know from numerous publications that one way of inducing quitting in a proportion of the population of smokers is through policy measures, implemented by local, regional, and/or national governments, intended to reduce both the number of smokers and the amount smoked in persistent users (e.g. by increasing the cost of tobacco products through the use of pricing and taxation policies). Interventions, which have been implemented at the individual and societal level to control the use of tobacco and concomitant health effects, have been adopted at different paces and with varying degrees of comprehensiveness in countries around the world, generating an irregular response to the tobacco epidemic. These interventions have included, to list a few, total or partial bans on smoking in work and public places; suppression of tobacco advertising, promotion, and sponsorship; anti-tobacco education and communication campaigns to raise awareness; changes to tobacco product labeling; and smoking cessation services.

A global, coordinated effort to use legislation and associated programmes to arrest the tobacco use epidemic is now led by the World Health Organization through the Framework Convention on Tobacco Control (WHO FCTC). The WHO FCTC encompasses a range of measures, in their totality representing a comprehensive approach designed to control tobacco use and supply. The body of policies stipulated in the WHO FCTC treaty became binding international law on February 27, 2005. Of the 38 articles, articles 6 to 14 cover policy interventions directed at preventing tobacco use, decreasing consumption, reducing toxicity, protecting non-smokers, and diminishing tobacco use initiation. Articles 15 to 17 relate to measures controlling the availability of tobacco (WHO, 2003). In other words, the policies are a series of measures conceived to counteract multiple domains of tobacco availability and use. The joint observance of the treaty by countries around the world will make it a global response to the tobacco epidemic. However, the reach of the policy interventions included in the WHO FCTC will depend on how effectively countries formulate and implement these policies. As of November 7, 2008, 161 countries have become parties to the treaty (work/en/index.html; accessed November 10, 2008).

The FCTC has propelled tobacco control into a new era, as countries all over the world incorporate its policies and recommendations into their own laws. As tobacco control policies are formulated and implemented, it is important that they undergo rigorous evaluation. In the same way that evidence-based medicine has been built from thorough evaluation of treatment options, evidence-based public health must build on a database of rigorous evaluations of public health policies. Such knowledge will allow implementation of the most powerful policy interventions, and will do so in ways that will maximize their effectiveness.

Towards this goal, IARC convened a working group of international tobacco control experts from March 12-19, 2007 to propose a framework for guiding the evaluation of tobacco control policies expected to be formulated worldwide in response to WHO FCTC. Four broad questions were considered by the working group, each with several more specific related sub-questions, to guide the review of the scientific literature on the methods and measures of tobacco policy evaluation. The broad questions cover how the effects of a policy are determined, the core constructs for understanding how and why a given policy works, the potential moderator variables to consider when evaluating a given policy, and the data sources that might be useful for evaluation.

The working group proposed a common conceptual framework to guide future FCTC policy evaluation, specifying two levels of mediating variables: those specific to the policy, and those that are part of more general pathways to the outcomes of interest. It also accepted that various other factors (moderators) might affect the size of the effect, and recognized the possibility of effects incidental to those an intervention is designed to produce. Given the already well-established relationship between tobacco use and disease, and the lag time between reductions in tobacco use prevalence and observed reductions in disease outcomes, this Handbook (Volume 12) recommends that tobacco use be utilized as the appropriate endpoint for most policy evaluations. The group elaborated on the model most completely for tobacco use outcomes, but it was also applied to policies affecting product harmfulness.

Included in this Handbook are logic models outlining relevant constructs for evaluating the effectiveness of policies on tobacco taxation, smoke-free environments, tobacco product regulations, limits on tobacco marketing communications, product labeling, anti-tobacco public communication campaigns, and tobacco use cessation interventions. Additionally, it provides examples of measures used to assess key constructs, with special attention to measurement issues with survey methods. Also provided are descriptions of sources of data on tobacco control policies, tobacco production and trade, and repositories of youth and adult surveillance surveys. These sources of information are particularly important for making comparisons between countries, and in some cases can be used to demonstrate the impact of policies, although not the mechanisms by which they occur. Thus, Volume 12 is offered as a guide to evaluators in the field, and consequently a frame for future IARC Handbooks that focus on evaluating the impacts of societal level interventions to control cancer, and other preventable diseases, caused by tobacco use.

Chapter 1
Ensuring effective evaluation of tobacco control interventions

Introduction

This volume is concerned with methods for evaluating the evidence for the effects of policy initiatives. By policies we mean the enacted decisions of governments and their consequences on the environment (legal, social and physical) in which tobacco use takes place or on tobacco use directly; that is, specific instances of the policy's manifestations (interventions). This means evaluating the effects of laws, regulations, taxes, administrative decisions, programmes and efforts to publicise or disseminate discrete interventions such as smoking cessation aids. It includes evaluation of policies that have the explicit goal of tobacco control, as well as policies that affect tobacco use incidentally, although our focus is primarily on the former. The Working Group (WG) is primarily interested in evaluating interventions that are designed to have effects at a population level, especially those enacted at a national level, but the principles apply to many subnational- and even local-level policies. While the focus of the WG is on how to assess policy consequences of governments, the evaluation framework we have developed could equally apply to the disseminated programmes of non-governmental agencies.

This chapter provides an introduction to the importance of having well-evaluated, population-level tobacco control interventions and of having a framework for achieving them. It outlines criteria used to evaluate constructs and measures, and how these relate to strategies for most effectively gathering information to evaluate the effectiveness of interventions, the mechanisms by which they work, and the conditions that moderate their effects.

Cigarette smoking is not only the most prevalent form of tobacco use, it is also among the most harmful, as it kills one in two long-term users prematurely. In the 20th century, cigarette smoking caused an estimated 100 million deaths worldwide. Most of these deaths were in developed countries of the world where cigarette smoking first became popular in the 1920s to 1940s. This resulted in an epidemic of smoking-induced cancer, heart disease, and chronic obstructive pulmonary disease (COPD) deaths. In 2000, tobacco use was responsible for approximately 4.83 million deaths, evenly divided between the industrialised and non-industrialised worlds (Ezzati & Lopez, 2003). If current trends continue, it will cause some 10 million deaths each year by 2030, with around 70% in low-resource countries (Peto & Lopez, 2001; Ezzati & Lopez, 2004). This projected shift is due, in part, to increasing population size and increased smoking in low-resource countries, but it is also partly due to greater success in controlling smoking in many higher-resource countries. In the 21st century, if current usage patterns persist, smoking will cause approximately 1000 million deaths, a tenfold increase over the previous century (Gajalakshmi et al., 2000). A substantial fraction of these expected deaths could be averted by efforts to discourage tobacco use and to assist those addicted to tobacco to quit (IARC, 2007a).

Most countries have ratified the World Health Organization's (WHO) Framework Convention on Tobacco Control (FCTC). It is the first piece of international law emanating from the WHO. Its objective is:

"to protect present and future generations from the devastating health, social, environmental and economic consequences of tobacco consumption and exposure to tobacco smoke by providing a framework for tobacco control measures to be implemented by the Parties at the national, regional and international levels in order to reduce continually and substantially the prevalence of tobacco use and exposure to tobacco smoke." (Article 3) (WHO, 2003).

To achieve this objective, the WHO FCTC calls for a comprehensive range of measures, specifically:

Price and tax measures to reduce demand (Article 6)
Protection from exposure to tobacco smoke (Article 8)
Regulation of the contents of tobacco products (Article 9)
Regulation of tobacco product disclosures (Article 10)
Controls on packaging and labelling of tobacco products (Article 11)
Programmes of education, communication, training and public awareness (Article 12)
Bans on tobacco advertising, promotion and sponsorship (Article 13)
Programmes to promote and assist tobacco cessation and prevent and treat tobacco dependence (Article 14)
Elimination of illicit trade in tobacco products (Article 15)
Measures to prevent sale of and promotion of tobacco to young people (Article 16)
Provision of support for alternative crops to tobacco (Article 17)

In addition, Part VII of the WHO FCTC, on Scientific and Technical Cooperation and Communication of Information, spells out a framework for research, surveillance and technical cooperation to facilitate the achievement of the policy goals.

Article 20, Research, surveillance and exchange of information, calls for "The parties [to] undertake to develop and promote national research and to coordinate research programmes at the regional and international levels in the field of tobacco control." The article, among other things, calls for the development and promotion of national research efforts, national systems of surveillance of tobacco consumption and related social, economic and health indicators; coordination of activities so that data can be compared across countries; exchange of publicly available scientific, technical, socio-economic, commercial and legal information, as well as information regarding practices of the tobacco industry; and that the financial and institutional resources be put in place to allow this to happen.

Article 22, Cooperation in the scientific, technical, and legal fields and provision of related expertise, expands on Article 20 with regard to such things as providing developing countries with technical and material support and training, and identifying methods for tobacco control, including comprehensive treatment for nicotine addiction.

The WHO FCTC will likely result in the proliferation of policies and associated programmes designed to reduce tobacco use. These will include but not be restricted to those mandated or recommended by the Convention. Ensuring the right mix of policies requires an understanding of the determinants of tobacco use and of how tobacco harms health. Tobacco use is determined by multiple factors, and attempts to control the epidemic require changes in societies as well as individuals (see Figure 1.1). Analysis of the factors that influence tobacco use should encompass smokers, those vulnerable to uptake, tobacco products, those who produce and sell tobacco products, and governments who determine the parameters of use. The role of cultural and economic diversity should also be considered. Further, we need to understand how both the determinants of use and actual use and/or exposures are affected by interventions.

Policies and the disseminated programmes that result from policy decisions are of particular interest because of their potential to affect large numbers of people, in some cases entire populations. As a result, it is important to be able to show that they achieve their objectives and do so in a cost-effective way, with any incidental effects ideally having net benefits. Evaluation allows the most effective interventions to be maintained (and perhaps improved further) while less effective interventions are either improved or abandoned.

[Figure 1.1: diagram not reproduced. Labelled elements: Environment (physical, institutional, communication, policy, legal, scientific, cultural, social & inter-personal); Tobacco industry; Tobacco marketing; Tobacco products; Cues; Tobacco use control; People (awareness, appraisals, experiences, habits, values, expectancies, choices, etc.); Host biology; Tobacco use; Active and passive exposures; Socio-economic effects; Health outcomes]

Figure 1.1 Major influences on tobacco use and its consequences

Used with permission of Ron Borland

Tobacco and health

The amount of harm created by tobacco use in a given population is a function of the toxicity of the products, the site(s) of exposure, the toxins taken in, the period over which exposures occur, and the distribution of those exposures in the population (IARC, 2004, 2007b). The harms from tobacco use are mainly from long-term use, which is made more likely by the addictive nature of the product. Calculation of the potential harms that tobacco products cause should consider the composition of what is ingested and how the products are designed to be used. Thus for combusted tobacco products, the focus needs to be on the smoke, rather than on the unburned product, although the composition of the unburned product is relevant to the extent that it influences the composition and/or density of the smoke. Mode of ingestion is often ignored; however, some chemicals are more toxic when absorbed through the lungs than through the mouth lining or stomach because the lungs are more sensitive. The notably lower health risk of exclusive cigar or pipe smokers compared with cigarette smokers (Doll, 2004) is probably because these smokers tend to take the smoke only into their mouths.

Decades of research on the health effects of tobacco have identified numerous diseases causally related to tobacco use, including several sites of cancer (including lung, oral cavity, esophagus, larynx, stomach and pancreas), major vascular diseases (including ischemic heart disease, peripheral vascular disease and cerebrovascular disease), major respiratory diseases (including chronic obstructive pulmonary disease, tuberculosis, and pneumonia), reproductive effects and reduced bone health. Epidemiological methods have been applied to estimate how much of these diseases in different populations with different tobacco use histories is due to tobacco (Peto et al., 1992).

While prolonged exposures are responsible for most fatal consequences of tobacco use, there is increasing evidence of adverse short-term effects, seen most clearly in the rapidly reversible impacts of passive smoke exposures on non-smokers (Raitakari et al., 1999; Wong et al., 1999; Wakefield et al., 2003a). There is no safe level of exposure to tobacco smoke. Risks of cardiovascular problems are largely reversible, and effects seem to asymptote at lower doses than those related to cancers and chronic lung conditions (e.g. emphysema), where the dose-response curve is more linear across typical exposure patterns (Law & Wald, 2003; Pechacek & Babb, 2004). The addictive nature of tobacco makes it likely that people who begin to use it will not be able to stop before the negative effects associated with long-term harm start to occur.

Nicotine is the main psychoactive ingredient of tobacco and the source of its addictiveness, but is otherwise a minor contributor to the harm (Murray et al., 1996; Benowitz, 1999). Most of the harm is due to other constituents in tobacco and tobacco smoke (IARC, 2004). Thus nicotine only indirectly contributes to most of the harms, by leading to prolonged use of dirty delivery systems, especially cigarettes.

The epidemiology is clear. The health risks of smoking are far greater than those associated with smokeless tobacco use. The health risk of each kind of smokeless tobacco varies significantly as a function of its toxicity. For smoked products, the likely variability in toxicity does not seem to translate into clear differences in health risks. To date, cigarettes with levels of toxins reduced by enough to be plausibly less harmful are not used by smokers, so are irrelevant to tobacco control efforts.

Some harms, particularly minor harms and those related to cardiovascular disease, are reversible on quitting smoking. While quitting can improve health, cutting down on consumption does not seem to (Hecht et al., 2004; Tverdal & Bjartveit, 2006). This may be in part because, for some illnesses, much of the harm occurs at relatively small doses, and partly because smokers who reduce the number of cigarettes they smoke often smoke the remaining cigarettes harder, ingesting more toxins per cigarette, thus reducing or eliminating the potential benefits of smoking less (National Cancer Institute, 2001). There has been some success in reducing the toxicity of smokeless tobacco products. Changing from smoked to smokeless products (particularly the toxin-reduced forms) can reduce harm, but does not eliminate it (Critchley & Unal, 2003; Foulds et al., 2003; Roth et al., 2005; Henley et al., 2007). Reducing or eliminating smoked tobacco use is a higher priority for health than reducing smokeless tobacco use. Research is needed to determine whether smokeless tobacco might play a role in this or whether nicotine replacement products and other cessation aids are all that is needed.

Patterns of tobacco use

Tobacco is a plant containing the psychoactive and addictive drug nicotine. It has a long history of use and has been used in a wide variety of forms. The two main forms of tobacco use are by smoking and by chewing or parking wads of tobacco in the mouth and allowing the active ingredients to be absorbed (smokeless use). In the 20th century, the use of cigarettes came to dominate both the smoked and overall markets in nearly all countries. It is also the product that has been the focus of most of the research. In most countries factory-made cigarettes dominate the market; however, roll-your-own cigarettes have enjoyed a resurgence in some countries. In other countries, most notably India, people consume a diverse range of tobacco products, both smoked and smokeless. Among smoked products, the bidi (tobacco hand-rolled in a leaf) is the predominant form used in the Indian sub-continent. Use of water pipes is common, particularly in the Middle East. Cigars occupy a position as a luxury tobacco product, but use is generally low. All forms of smoked tobacco are extremely dangerous to health, and there has been no major progress towards creating less toxic versions of these products that are sufficiently acceptable to consumers to be successfully marketed.

Smokeless tobacco is not used in many parts of the world, but use is significant in other parts, with the products used ranging widely in places like India (e.g. gutka, use with betel quid, nicotine toothpaste), but is limited to one main form in others (e.g. snuff (powdered tobacco) either in loose or prepackaged, small tea-bag-like portions). Use of smokeless tobacco is increasing in some places (e.g. Sweden) (Foulds et al., 2003). Non-cigarette tobacco use is under-researched in comparison to cigarette use.

The proportion of the population who use tobacco varies greatly from around 20% to around 60% (Shafey et al., 2003). In many countries, few women smoke, often accompanied by high smoking rates in males (e.g. in Asia). By contrast, in most developed countries female

The experience of countries like Singapore and Thailand, which have so far successfully prevented female uptake, suggests that the Lopez et al. model does not describe a necessary progression, but that the epidemic may be able to be largely averted in some sub-populations, most notably women, when effective tobacco control policies are implemented.

Over the last 20-30 years, smoking prevalence has fallen markedly in some countries. This is well documented for some industrialised countries (Gilmore, 2000; Giovino, 2002; White et al., 2003). One country, Bhutan, has banned the sale of tobacco products to its citizens. However, in some other countries, rates of tobacco use may have increased.

The great diversity both between countries and within countries over time creates huge challenges and opportunities for scientific understanding. One challenge, for example, concerns preventing women from smoking in societies where few currently smoke. This challenge needs to be taken up in ways that are not contrary to the greater emancipation of women in those societies. In developed countries, e.g. in North America

The most comprehensive change in tobacco control has been in attitudes and rules about smoking in enclosed public places and workplaces. As late as 20 years ago, smoking was effectively ubiquitous in most countries, with smoking allowed virtually everywhere (except where there was a danger of fires or damage to equipment). In some countries, this environment has transformed; several countries (starting with Ireland and Norway) now prohibit smoking in all public places and workplaces, and other countries are following rapidly.

The social acceptance of smoking is declining in most places where it has been studied. This decline seems to be related to the length of time the society has taken to regard the problem as serious, and to progress in the implementation of smoke-free places. In Thailand, for example, equivalent levels of smokers see their habit as non-normative (i.e., that society disapproves) as in Western countries such as Australia, Canada, the UK and the USA, all of which have decades of strong action. By contrast, even though personal disapproval of smoking is high in neighbouring Malaysia,
smoking rates are typically only a and Western Europe, the tobacco which has only recently taken up the
few percentage points below that of industry skilfully used female issue systematically, smokers are
males. There has been some pre- emancipation as a strategy for far less likely to perceive societal
dictability in these patterns of use, linking smoking to images of the disapproval (ITC South East Asia
leading to Lopez, Collishaw & modern woman. The slogan project, unpublished data).
Pihas (1994) four-stage model of Youve come a long way baby However, it is not just trends in
the tobacco epidemic, with devel- from the notoriously successful tobacco use and tobacco-related
oped countries first to experience it. Virginia Slims advertising cam- knowledge that are likely to affect
In this model, female smoking ini- paign typifies this strategy (US efforts to control tobacco use.
tially lags behind male smoking, Department of Health and Human Broader societal issues may also
with female rates eventually rising. Services, 2001). play a key role. The rapid

emergence of China and other countries as economic powerhouses is likely to affect tobacco use, at least in those countries, as more and more people have money to spend on consumer products like tobacco that are marketed to appeal to modern sensibilities. Worldwide concerns about the environment, including the issue of global warming, and the rise of religious fundamentalism in some countries are also likely to have effects, but it is beyond the scope of this volume to speculate as to what these effects might be. However, unless efforts are made to understand how tobacco control fits into broader social changes that are sweeping the world, important determinants of use may be missed, with the resultant reduction in the capacity to identify and implement policies and programmes that work.

In thinking about the potential health benefits of interventions, it is important to consider both their potency and their timing (see Figure 1.2). While the understanding of their potency is focal to this volume, it needs to be remembered that the sooner action is taken, the more lives will be saved. Every year of delay adds millions to the eventual burden of lives lost. Enough is known to act in a comprehensive manner now. The evaluation effort is primarily about helping us refine those interventions, to ensure they are delivered in ways that maximise their effects, and only secondarily about the development of new, more effective interventions.
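
The interplay of timing and potency can be made concrete with a deliberately crude sketch. It is not the model behind Figure 1.2. It assumes, purely for illustration, a constant annual death toll in arbitrary units and an intervention that cuts that toll by a fixed fraction from its start year onward, with no lag. The function name, its parameters and all numbers are hypothetical.

```python
def cumulative_deaths(baseline_annual, first_year, last_year,
                      intervention_year=None, effect_size=0.0):
    """Sum annual deaths over first_year..last_year (inclusive).

    Toy assumptions (not from the Handbook): deaths per year are constant
    at `baseline_annual`, and from `intervention_year` onward they are
    reduced by the fraction `effect_size` immediately, with no lag.
    """
    total = 0.0
    for year in range(first_year, last_year + 1):
        annual = baseline_annual
        if intervention_year is not None and year >= intervention_year:
            annual = baseline_annual * (1.0 - effect_size)
        total += annual
    return total


# Hypothetical inputs: 100 units of deaths per year over 2000-2049.
baseline = cumulative_deaths(100, 2000, 2049)
early = cumulative_deaths(100, 2000, 2049, intervention_year=2010, effect_size=0.5)
late = cumulative_deaths(100, 2000, 2049, intervention_year=2020, effect_size=0.5)
weak_early = cumulative_deaths(100, 2000, 2049, intervention_year=2010, effect_size=0.25)

print("averted, early and strong:", baseline - early)
print("averted, late and strong: ", baseline - late)
print("averted, early and weak:  ", baseline - weak_early)
```

In this toy setting each year of delay forgoes `effect_size * baseline_annual` averted deaths, which is the arithmetic behind the point that timing matters as well as potency.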

Figure 1.2 Projected impact of population-level tobacco control interventions on estimated cumulative tobacco deaths
Impact of policies depends on factors including:
- Intervention date
- Effect size
Estimated cumulative tobacco deaths 1950-2050 showing the effects of different intervention strategies. In red: baseline; in blue: if the proportion of young adults taking up smoking halves by 2020; in green: if adult consumption halves by 2020.
Adapted from Jha & Chaloupka (1999), The World Bank

Where does this volume fit within Tobacco Control?

This Handbook is not intended to be a one-stop resource for all tobacco control evaluation needs. It is designed to present a framework for evaluation directed at policy effects and to provide strategies and measures that are specific to tobacco control, rather than try to replicate material that is general to all forms of evaluation.

In analysing the potential contribution of research to policy evaluation, it is useful to outline the various roles it can play. Applied science proceeds through a series of iterative stages once a problem has been identified (in this case tobacco as a cause of health harm): elaboration of a theory or theories as to the cause of the problem and of possible solutions, observation and description of the problem informed by the theory, understanding causal mechanisms, intervention

development, intervention deployment and evaluation, and re-evaluation of the problem. From this, there might be the need for new or revised solutions, which may require refinement of the theory or development of a new one. Research can play a number of important roles in the process of developing and disseminating the most effective policy interventions. It can be used to:

1. help in the development of new interventions;
2. help make the case for an intervention being adopted;
3. fine-tune an intervention before implementation to meet local needs (formative evaluation);
4. document the quality of implementation (process evaluation);
5. assess the effectiveness of component parts, or of the intervention under ideal circumstances;
6. evaluate the effects of the intervention as implemented, both intended and incidental;
7. determine the cost-effectiveness of the intervention; and
8. assess the cumulative effects of changes in outcomes on health.

Of these, only number 6 is of focal interest here. All of the others are important, but to have covered them all would have made the volume too broad and too long. We also do not consider evaluation of the efficacy of discrete interventions that can readily be tested in randomised trials; e.g. smoking cessation aids. The Cochrane Collaboration (www. ..., for reviews) provides regularly updated reviews of evidence in these areas. However, we are concerned with the evaluation of effects of these interventions when applied to populations.

The focus of this volume is the evaluation of tobacco control policies in the short to medium term. We concluded that for policies directed at tobacco use, tobacco use was the outcome of interest, rather than the subsequent health effects. Clearly, as we move forward, we will want to evaluate the summative effects of all the efforts to reduce tobacco use, and the consequential health outcomes. For a few jurisdictions that have had active tobacco control programmes for decades, this process is already underway (Thun & Jemal, 2006). However, the reality is that for most countries, we will never know exactly how many tobacco-caused deaths are being averted, because there is insufficient data on how many such deaths are currently occurring. The global estimates referred to earlier are a result of careful extrapolation from those countries where good data is available and from studies that have been able to estimate the fraction of deaths from various causes that are due to tobacco. The methods for doing this are beyond our remit, as are ways to model the potential impacts of interventions on smoking prevalence or on the burden of disease (e.g. Levy et al., 2006).

The typical evaluation research study can be thought of as having five components:

1. A research design
2. The choice of constructs and measures to assess them (predictors and outcomes)
3. A sampling strategy
4. Study implementation
5. Data analysis

Of these, we only focus on the first two, although some attention is given to issues of sampling, particularly the value of having representative samples as a core part of the research design. We do not consider data analysis as the tools here are largely generic and are covered in the main computer analysis packages, including the emerging techniques of GEE models (Generalized Estimating Equations) (Hanley et al., 2003).

This Handbook was not written with the needs of those conducting evaluations at a community level in mind. However, much within it is likely to be relevant, at least at a conceptual level. The cumulative approach adopted means that for evaluations of interventions that have been shown to be effective in comparable situations, the need for intense evaluation will be less, as the evaluation can rely on indicators validated in previous work. However, for novel interventions, the more powerful methods outlined here should still be used wherever the resources allow. The US Centers for Disease Control (CDC) has published a useful guide to the evaluation of more local programmes (MacDonald et al., 2001). A major difference between that guide and the present volume is the capacity to use national surveys and data collections in ways that are not

usually possible for local initiatives. That said, to evaluate local initiatives, country-level data can be used as a control, with complementary data collected from the community to assess the intervention effects.

Policy areas not emphasised in this volume

There are a number of tobacco control policy domains that are either not included, or not emphasised. This is not because the WG believes that they are not important, but because it sought to keep the size of the volume manageable. Policy domains not focussed on include some that are designed to affect tobacco use directly, such as sales to minors, restrictions on sales outlets, and school-based prevention. Others are directed more at the tobacco industry, or parts of it, and include prevention of illicit trade, industry subsidisation, controls on access of for-profit companies into the market (and the role of government monopolies), and agricultural policies that affect leaf production.

The most significant area we have not focussed attention on in the volume is population-level prevention policies. There is a large body of evidence on the effectiveness of school-based education programmes (Thomas & Perera, 2006). The current evidence shows that, taken in isolation from other societal efforts, the impact of school-based programmes is generally weak, and there exists the potential for poorly thought-through programmes to actually be counterproductive. Most of the research on the effects of prevention programmes in schools is from industrialised countries. School programmes are plausibly of more importance in non-industrialised countries, where school is a conduit for new knowledge into the community in a way it no longer is in industrialised countries. The difficulty of developing successful prevention education comes at least in part from the problem that raising the issue engenders interest and thus curiosity about the products. Doing this in a way that overcomes the potential threat of curiosity leading to increased experimentation, and that has a net negative effect on use, has proven difficult. This may explain the interest of some tobacco companies in promoting such strategies. To the extent that educational programmes are translated into the mass media, strategies for evaluating them are covered in Section 5.6 on Measuring the Impact of Anti-Tobacco Public Communication Campaigns.

Another prevention strategy whose evaluation we do not address is policies to prohibit sales of tobacco products to minors, and to enforce these laws by using young people attempting purchases. Such programmes can result in a decline in the proportion of such attempts that result in sales, but the evidence that this actually reduces youth smoking is not strong (Stead & Lancaster, 2000).

In the broad area of tobacco industry control, there is some consideration of illicit trade in the section on sources of production and trade (Section 4.2) and in the section on tax policies (Section 5.1). Neglected areas include restrictions on the number or type of outlets in which products are sold. There are few examples of attempts to restrict the number or type of outlet selling tobacco. However, it seems inevitable that in the future some jurisdictions will try to restrict access to all smokers, not just youth.

We also do not address the evaluation of policies that restrict for-profit companies from operating in the market. Some countries have actual or virtual state monopolies on the sale or production of tobacco products. Several countries have been forced to abandon these monopolies by the World Trade Organization. It has been argued that non-profit control of the industry should make tobacco control efforts easier (Borland, 2003), but there is little work evaluating either the move to free markets or the potential of restricting the markets. In both these areas, research is needed to evaluate possible options and to estimate likely effects.

A critique of current approaches to evaluation

Achieving maximally effective tobacco control requires the development and ongoing refinement of a viable set of

methods for integrating research and evaluation in the implementation of tobacco control interventions. The population health challenge is to use scientific methods to ensure that systems are set up to understand the effects of the policy initiatives in such a way as to allow their evolution into the most effective ways of controlling the epidemic of tobacco use and related harms. Evaluation researchers in tobacco control, like professionals in other areas of population health, have been concerned for some time about limitations in the evaluation framework used.

The current dominant model of intervention evaluation for improving population health involves extrapolation from the use of randomised controlled trials (RCTs) of clinical (most typically, pharmaceutical) therapies. It is based on the desire to identify the active therapeutic agent or agents within any intervention. This model is important and extremely successful for testing the efficacy and often effectiveness of discrete interventions offered at the individual (and even small group) levels, particularly where double blinding is possible. This is where neither researcher nor participant knows who is getting the therapeutic agent under evaluation and who is getting either a placebo or the existing best-practice intervention. RCTs produce considerable certainty about causes. However, reliance on RCTs is not always possible or appropriate for the evaluation of policy impact in the population for a number of related reasons. First, implemented policies cannot be randomised and analogue studies, where randomisation can occur, may lack critical elements of policy interventions (e.g. authority of law, or it being applied to all in the community). Second, over-reliance on RCTs, which focus on the detection of intervention effects, can lead to a neglect of theory, which is critical for generalising from results to related areas, and for understanding the mechanisms by which interventions work. Third, RCTs are not able to answer questions about the relative effectiveness of interventions across different populations. Fourth, when RCTs are compromised, in terms of deviation from the double-blinded ideal, they are less powerful, and may be less strong than alternative methods with different validity limitations. Finally, focussing on RCTs to provide answers to questions can result in a neglect of other evaluation techniques, which although not as inferentially strong as RCTs, may have complementary strengths. It is important to understand the conditions under which RCTs are limited and what the implications are for inference.

Limitations of RCTs

Determining whether a discrete intervention works involves answering three questions, which sometimes can only be answered separately: the questions of efficacy, effectiveness, and dissemination (Flay et al., 2005). First is the efficacy question: Can this intervention work? That is, when implemented in a controlled and optimal way, does it work? Here the double-blinded randomised controlled trial (RCT) is the gold standard, where possible. Second is the question of effectiveness: Does it actually work when implemented under real-world conditions, and with what degree of variation? Third is the question of dissemination: Is the intervention used by enough of the population who would benefit from it to have an impact? An effective intervention that few are prepared to offer or few are prepared to use is of little benefit. One must also consider the extent to which the intervention is similarly attractive for all with the problem. When only a subset of the population benefits, any barriers to selective adoption or influence should be examined. As we move from addressing questions of efficacy, through effectiveness, to dissemination issues, it becomes increasingly difficult to fit the conditions for RCTs, even for clinical interventions.

RCTs involve a number of (usually implicit) assumptions. First, RCTs assume that the measurement required for the evaluation does not affect the integrity of the intervention. Second, it is presumed that the interventions can be evaluated in isolation of environmental factors, including the society's response to

the intervention and to other cultural trends; i.e., that the effectiveness of the intervention can be determined prior to its widespread implementation. Third, it is assumed that any impact of personal choice over whether to have the intervention can be separated from the core therapeutic effect. Fourth, it is assumed that the intervention is uniformly effective for all who are eligible to be given it. None of these assumptions are tenable for policy interventions and disseminated programmes.

The assumption that a given dose of an intervention has an equivalent effect on all who have the condition it is intended to treat is problematic even with many pharmaceuticals. The solution to this problem has been to treat each identified population as novel and to require separate RCTs. This might work for major distinct differences, but when there are many possible populations to consider, the strategy becomes cumbersome and costly. More efficient strategies are required.

RCTs are similarly a cumbersome method for evaluating interventions that vary continuously, as they involve creating discrete categories for randomisation. This means there is, for example, poor quality information on optimal dosage, both amount per dose and duration of use. This makes RCTs a particularly cumbersome method for evaluating interventions where the dose of an intervention can vary considerably.

Finally, there is no capacity to consider closely related (indeed, functionally equivalent) interventions as a class, and develop different criteria for evaluating new versions of essentially the same intervention. For example, different executions of a cognitive-behavioural cessation treatment or the various forms of Nicotine Replacement Therapy (NRT) get treated as independent interventions. In the case of NRT, all variants have had to go through the same process of testing through independent randomised trials before they were able to be marketed.

Population interventions tend to be different in observable ways wherever they are implemented. Information-based interventions are dependent on language, and the language used must vary by culture, not just linguistic group. Language must be kept up-to-date to make it contemporary, and thus attract interest (and sometimes increase comprehension). People-based interventions invariably differ. Policy-related interventions encompass those major aspects of the system that allow them to operate, not just the core requirements. It is not reasonable to assume that population-based interventions have their effects independent of anything the person does or thinks, unlike most pharmaceutical interventions. Like virtually all psychological and social interventions, as well as some pharmaceutical and other ones, the effectiveness of policy interventions is critically dependent on how the individual responds to them. For clinical interventions, the frame is quite different. Their questions are framed: If the appropriate system is put in place to ensure the person with the illness uses the intervention properly (or is even given it properly), then can we demonstrate a benefit? The question the WG asks is quite different and much broader: Can a system be put in place that will make the intervention work, and how can that system be optimised under different conditions?

Where limitations exist on the internal validity of RCTs for making the inferences of interest, the strategy of using meta-analyses of similar studies to draw inferences is similarly problematic. Alternative means are required to control for these threats to inference. It is only in the context of being able to assume generality, having few enough interventions to assume each is an independent case, and having the capacity to test interventions in isolation of their context, that the model of RCTs as the keystone of evaluation is possible.

The allure of having a simple model based on RCTs to allow definitive inferences about the effects of interventions treated in isolation seems to have distracted us from considering the potential utility of other approaches. In particular, the RCT-focussed framework tends to neglect the role of theory and of the potential contribution of combined studies with different kinds of limitations.

The contribution of theory is undervalued in tobacco control and in public health more generally. We agree with the noted psychologist Kurt Lewin: "There is nothing so practical as a good theory." Some in the social sciences take theory to refer to the existing, sometimes demonstrably limited social science models, and take the theories from other areas (typically from the biological sciences) to be accepted fact, rather than theoretical models; e.g. of how a chemical will affect behaviour. Theory is thought of in an encompassing sense of the accumulation of our understanding of how things work, not merely the original ideas. Theory provides the mechanism to systematically use existing knowledge to understand likely future effects. The aim should be to develop consistent sets of ideas (theories) that describe and predict actual outcomes. A hunch or a past empirical finding is an unarticulated theory of what will happen in the future. Unless articulated, these implicit theories cannot be subject to proper scrutiny. If they turn out to predict outcomes, there is no capacity to work out why without first articulating them.

Theories specify mechanisms or mediating pathways of effects, allowing these pathways to be tested. They also can specify conditions under which interventions will work (i.e. moderate intervention impact). One can test whether these factors affect outcomes, and thus be better placed to develop the suite of interventions needed to provide maximal help to all, or to produce the desired structural or cultural changes. No single theory can encompass the complexity of controlling tobacco use; however, more can be done to consider how theories that deal with different aspects of the problem interrelate, including different timescales for change (e.g. behaviour change versus change in cultural norms and practices). The set of theories used should be compatible with each other, even if the nature of the interrelationships is not fully articulated.

The most important implication of considering theory is that it allows explicit linkage of tobacco control to relevant existing knowledge. A focus on evaluating interventions in isolation tends to distract from what is known, specifically:

- Information campaigns can increase knowledge about tobacco.
- Knowledge can change beliefs and attitudes.
- Beliefs and attitudes can affect tobacco use.
- Advertising can change behaviour independent of conscious awareness of the influence.
- There are programmes and aids that can help people quit using tobacco.
- There are ways that the toxicity of products can be reduced.
- Price rises affect levels of consumption of tobacco products.
- Poorly designed and/or executed communications can have boomerang effects.

This knowledge is part of a foundation that is sometimes forgotten. The question we are really asking is: Under what conditions can the desired effects be optimised? This includes concern about the form of the intervention, the ways it is delivered, and various characteristics of the populations to whom it is provided.

A new evaluation framework, one that is less reliant on the RCT, is required. It should have a systems perspective; use the best possible methods, including RCTs where appropriate; allow a more central role for theory, to allow more efficient consideration of possible variation in effects across populations; and provide a more efficient means of understanding effects of dosing and other aspects of implementation.

One approach to evaluation that is popular among public health practitioners, but that has less credibility with researchers, is that of programme evaluation (e.g. Patton, 1997). These models have grown in areas where there are no simple relationships between programmes and sought policy outcomes, yet there is a need to demonstrate progress. Thus the focus of these models of programme evaluation is often on determining intermediate effects when it is difficult to demonstrate effects on the main outcome goals. We believe that there is value in extending these models to consideration of outcomes as well. The essence of these approaches is to test the theory behind the programme, sometimes also

called the programme logic, to effective evaluation we need to Evaluation is enhanced by
assess whether the various consider what effects might occur showing the mechanisms of the
aspects of a programme can be (theory), and design studies that effects, not just restricting itself to
shown to contribute to the allow detection of effects in the determination of effect size. This
achievement of its goals (Mac- variables of interest (description) is critical in population research
Donald et al., 2001). The WG has and making of valid causal because most of the outcomes we
adopted the idea of using logic inferences about the contribution are interested in are potentially
models as a core element of the of the intervention to the observed determined by multiple factors;
framework we have developed. changes in outcomes. thus it helps demonstrate a
We found that doing so increased contribution from the focal
conceptual clarity and provided a Theory interventions as distinct from other
useful organising frame for interventions happening at the
thinking about the policies and a Evaluation must begin with a same time. Thus, the theory
more coherent way to organise the theoretical evaluation of how an needs to spell out the mediational
chapters and sections. intervention might work. Often model of how an intervention
there will be one clear theoretical might work. Mediational models
Framework for tobacco mechanism, generally provided as allow us to test each step along a
control evaluation part of the justification of having proposed causal chain from
the policy, but sometimes intervention exposure to beha-
The role of evaluation is to alternative modes of effect might viour (see Figure 1.3). If some
determine the effects of inter- be postulated. This is particularly relationships are not as predicted,
ventions, determine under what the case when the head of power the intervention may not be
circumstances these effects (constitutional source of capacity working, at least in the way it was
occur, and help identify ways to to legislate/regulate) under which intended to work. In cases where
make the interventions more policies are enacted is limited. the intervention is known to be
effective. To do this involves Thus policies to protect workers potent, evaluation of mediators
determining how the interventions from exposure to passive smoking may only need to proceed as far
work, and diagnosing any prob- cannot explicitly consider the as assessing uptake/exposure.
lems that either prevent them from possible benefits of smoke-free However, where the potency is
working as desired or diminish places for reducing cigarette unproven, testing the inter-
their impact, in particular any consumption or for enhancing ventions impact through to the
differences of effects within the quitting. Good evaluation requires desired outcomes (e.g. smoking
target population (equity issues). consideration of all potentially cessation) becomes necessary. In
In doing this one should consider important outcomes, not just those an area like tobacco control where
the totality of effects, both used to justify or provide a legal the main outcomes of interest
intended and incidental. To do basis for the policy. (e.g. smoking cessation, pre-

Policy Policy- specific General Policy

as implemented mediators mediators outcomes

Figure 1.3 A generalised model of mediation

CHAPITRE1.janvier12:Layout 1 12/01/2009 13:29 Page 13

Ensuring effective evaluation of tobacco control interventions

vention of uptake) are determined by multiple factors, mediational models can also help establish the relative contribution of specific interventions. Testing mediational models can also enhance understanding of basic mechanisms and facilitate the development of new and improved interventions.

Other theoretically important factors are those that may moderate the relationship between the intervention and outcomes. That is, what conditions affect the efficacy of the intervention, or how does its effectiveness vary by identifiable sub-groups. Where one finds or theorizes moderator effects, it is important to understand where they occur along the proposed mediational pathways, or indeed whether different mediational pathways exist for different groups or situations (see Figure 1.4). For example, if an intervention is not seen to be relevant to or targeted at a group, this group may not respond to it. Here, making the intervention relevant might be all that is needed to remove the moderating effect. A good example of this is advertisements whose spokespeople are old, which are typically not seen as relevant to young people (the converse is less likely to be true). Something as simple as choice of actor can create moderator effects, which under other conditions would not be present (or be so small as to be ignored).

Incidental effects must also be considered. Sometimes it can be useful to separate these out from the intended effects (see Figure 1.5). Incidental effects can occur for a range of reasons; some may be theoretically expected, while others may not. Some can occur as a result of counter-actions of sections of the tobacco industry to reduce the threats of policies to their profitability. These effects can be incorporated within the more general model (Figure 1.4) as all such effects can be either due to reactions to the policy, or to other independent factors (and thus should be treated as moderators).

Description

The relevant theory tells us which constructs to measure. Evaluation requires a good description of the problem and its context, and of how these are changing. This involves finding appropriate measures of the constructs of interest and collecting data using the appropriate measures. The goal here is to provide population estimates of what people do and think, focusing on key outcomes. It involves collecting data in four principal domains: 1) who uses tobacco, what they use, how much, and where and when they use it, as well as any relevant knowledge, beliefs and attitudes (including those of ex-smokers and non-smokers); 2) tobacco industry behaviour, including characteristics of their products; 3) tobacco control activities to which people are exposed; and 4) aspects of the broader environment that might affect tobacco use or tobacco harm outcomes (cultural norms, controls on activities like alcohol consumption that are linked to tobacco use). High-quality data collections, such as regular cross-sectional surveys, are essential to describing the nature of the problem and the

Figure 1.4 A generalised model of mediation, making allowance for moderator effects
(Diagram boxes: Policy as implemented; Policy-specific mediators; General mediators; Policy outcomes)
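The mediation chain in Figure 1.4 can be illustrated with a small numerical sketch. It assumes a simple linear model in which the effect carried along a chain is the product of its path strengths (the product-of-coefficients approach). That assumption, the group names and all the numbers are hypothetical and are not taken from this Handbook. Comparing the paths group by group shows where along the chain a moderator effect arises.

```python
from math import prod

# Steps of the generic chain in Figure 1.4.
STEPS = [
    "policy -> policy-specific mediator",
    "policy-specific mediator -> general mediator",
    "general mediator -> outcome",
]

# Hypothetical standardised path strengths for two sub-groups (illustrative only).
paths_by_group = {
    "young adults": [0.2, 0.5, 0.4],
    "older adults": [0.6, 0.5, 0.4],
}


def indirect_effect(paths):
    """Effect transmitted along the chain, assuming a linear model with no interactions."""
    return prod(paths)


def moderated_steps(paths_a, paths_b, tolerance=1e-9):
    """Return the steps at which two groups differ, i.e. where moderation occurs."""
    return [
        step
        for step, a, b in zip(STEPS, paths_a, paths_b)
        if abs(a - b) > tolerance
    ]


effects = {group: indirect_effect(paths) for group, paths in paths_by_group.items()}
differing = moderated_steps(
    paths_by_group["young adults"], paths_by_group["older adults"]
)

print(effects)
print(differing)
```

In this made-up case the two groups differ only on the first step, which would point to the relevance or reach of the intervention, rather than its underlying potency, as the place to look.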


Figure 1.5 A generalised model of mediation, making allowance for both moderator and unintended or incidental effects
(Diagram boxes: Policy as implemented; Policy-specific mediators; General mediators; Targeted policy outcomes; Incidental effects)

quantification of trends in tobacco use and in key determinants of use. In tobacco control, because the tobacco industry or sections of it might be motivated to moderate the effects of policies, it is important to conduct surveillance of possible counteractions to policies. More generally, possible incidental effects of policies should always be considered and measured where appropriate.

There are five broad types of outcomes that relate to individuals: improvements in knowledge, changes to attitudes and related normative beliefs, changes to behaviour patterns, changes in exposures, and health outcomes (particularly acute ones that can be detected soon after a policy is implemented). Interventions typically change the environmental conditions that affect and thus sustain these outcomes. Mechanisms for behaviour change can be through rules and restrictions, making available alternatives or substitutes, and/or providing relevant resources and/or skills. The mediational pathways vary both for outcomes and policies. For example, mediational pathways to knowledge acquisition are shorter than ones to smoking cessation.

Inference

The core of good evaluation is designing studies to detect changes in outcomes that might be attributable to a specific intervention, and putting in place measures to rule out alternative explanations. These alternative explanations are of three types: those related to systematic errors of measurement (bias), those related to alternative mechanisms of effect (confounding), and chance effects. Bias occurs where the measures used to assess the constructs of interest actually measure something different (usually a closely related construct) or are contaminated by some systematic error (e.g. social desirability can affect responses about beliefs and intentions). Confounding occurs when the association with the outcome of interest appears stronger or weaker than it truly is as a result of an uncontrolled association between the intervention and other mechanisms of effect (e.g. a different policy intervention). The contribution of chance is a function of naturally occurring variability in outcomes of interest, and its impact is controlled for by ensuring adequate sample sizes.

The quality of evidence from any single study is a joint function of the study design and of the quality of the measures used: that is, their reliability and validity. Where optimal research designs


are not available, one must focus on the relative strengths of different designs. It is not enough to conduct meta-analyses of the individually strongest studies. A diversity of research designs (and associated measures) with complementary strengths should be combined, and that information combined in ways that increase the validity of inferences. Demonstration of similar effects with different methods and/or measures increases confidence in the reality of effects and of the plausible causal mechanisms.

Evaluation as a dynamic process

The evaluation of policy interventions occurs after they are instituted, as they first must be implemented somewhere before it is known how they actually work. Because the authority of government policy or law may affect compliance, it is not possible to confidently generalise from the results of analogue studies conducted prior to implementation. This means one cannot in principle be certain of the effectiveness of interventions before they are implemented; hence, lack of evidence needs to be used with caution as a reason for delaying needed policy change. Scientific methods can be used to help us minimise our risk of error, but they can never eliminate it completely. Science should not inhibit action when there is a need for action, but rather act to maximise the chances of success and minimise the risks of wasting resources. This involves a model in which science plays a role of evaluating interventions once they are disseminated, not just restricting its activity to evaluating interventions before they are disseminated. It is a science of evidence in action, not just of evidence preceding action. One aim of this volume is to provide the conceptual framework and some of the tools to allow more effective evaluation of implemented policies and programmes. It is designed to complement the often (necessarily) limited evaluation of interventions that occurs before they are implemented.

There is the possibility that empirical work will show the theoretical model used to develop and/or evaluate the intervention to be flawed: either incorrect in some of its assertions (including inclusion of factors that have little or no influence), or incomplete by ignoring important factors. It is only by specifying models that one can systematically work to make them better.

A model of evaluation is required that is designed for the dynamic, ever-changing world in which we live. The potential of the world's diversity must be viewed as a tool to aid in understanding, not an obstacle to be overcome. Each action of government is an attempt to influence outcomes in ways consistent with policy goals, which, hopefully, aim to improve the health and well-being of the community. Similarly, the actions of tobacco companies are also designed to affect smoking, in this case in ways that enhance shareholder value, which is why they are almost invariably directed at increasing or at least maintaining use. Even the best thought-through interventions sometimes fail to work as expected, and policies that work in one context sometimes stop working when the context changes. Because neither past experience nor theory can be relied upon to always deliver the best solution to our problems, methods must be established to check when and how things are working. This is what modern evaluation is about. A framework for effectively evaluating policy interventions is essential.

Such a model places less stringent tests on demonstrating that something has equivalent effects in a new context or when delivered in a new form (where there is no reason to expect changes in efficacy) than it does for evaluation of truly novel interventions or their implementation under conditions where differences in effects are plausible. However, it still calls for stronger evaluation methods when evidence accumulates to question an assumption of equivalence. Thus it provides an explicit link between the roles of ongoing auditing of programmes to ensure continued effectiveness and more intensive evaluation activity when there are concerns. As these decisions are based around clearly articulated theories, the framework is open to scrutiny and should allow the most cost-effective possible evaluation by demanding plausible reasons


before testing for differences in effects.

Characteristics of interventions

Typically, policy interventions are designed to have sustained effects, but in some cases this may require designing ongoing programmes to ensure that this happens. Further, there may be short-term onset effects. For example, there is evidence that warning labels on cigarette packs have an onset effect as well as a sustained effect (Hammond et al., 2007a). We need evaluation methods that can differentiate onset effects from sustained effects, and which also can help us understand the conditions under which both kinds of effects are maximised.

It is necessary to understand what, if anything, is required to sustain potential enduring effects: that is, what endures without further intervention and what requires regular updating, or a sustained presence. For example, anti-smoking mass media campaigns have a short-term impact on quitting (Snyder, 2001). It seems important to maintain cues in the environment to remind people of information for that information to have a maximal impact. The form of some kinds of interventions may also need to change over time if the effects of the intervention are to be sustained. This applies particularly to communication-based interventions. What is seen as up-to-date, and thus of most relevance for communication, changes quite rapidly in some communities. Similarly, across cultures, interventions may need to be framed differently to ensure cultural relevance. Under some circumstances it can be useful to conceptually separate the core concepts in an intervention from the mode of communication used to convey them. Thus evaluation might focus on the cultural relevance of the intervention or on its underlying potency, or both.

Analogous to the way societies and/or people change, interventions need to change to maintain their relevance. This calls for an equivalent model to that of how to create new immunizations for emerging strains of influenza. Here, the rate of change in the problem is too rapid for RCTs to be practical, and quite different methods are used.

Changes to interventions may also be required as a society progresses through the adoption of an innovation cycle for adopting new sets of values and behavioural options for tobacco use. Take, for example, encouraging the adoption of smoke-free homes. This happens first in the face of social disapproval, or at least lack of understanding. An entity instituting a ban will often be asked to justify it, and some might see it as unreasonable. However, as such bans become more common, there comes a tipping point where smoke-free environments become the norm. Since justification is no longer necessary, smokers often just do not smoke when indoors, and those without such bans feel a need to justify their positions. Before the tipping point, even quite intense interventions may have limited impact (as has been the case for implementing smoke-free homes (Hovell et al., 2000)), while after it people may be readily able to change without help (as evidenced by rapid adoption of the practice in some countries (e.g. Borland et al., 1999)). Where things change, the rate of change must be considered as well. When it is more rapid than the time for the institutionalisation of interventions through traditional testing of efficacy and so on, then new methods must be adopted to allow interventions to be changed in train with the changing context. This is one of the reasons why it is important to pre-test the messages used for cultural relevance, even for proven interventions when applied in new contexts. This is also why it is important to conduct ongoing evaluation of disseminated interventions.

How policy interventions that target behaviour work

Evaluations of population-level interventions are typically interested in determining the overall effect of the intervention. As a consequence, it is not so much about asking whether an intervention of this kind can work, but of asking under what circumstances it works and how to optimise those conditions to get maximal impact. This involves consideration of the reach of the intervention (sometimes no more than awareness),


the ways people respond to it and its underlying potency or efficacy.

There are three key aspects of interventions from the perspective of the individual: awareness of, acceptance of, and actions taken in response to policies. Evaluation must deal with all three. The first aspect is determining the extent to which the target population is aware of the intervention, which is a function of its implementation, dissemination, and surrounding publicity about it. Awareness is generally a prerequisite of policy effects, except in those rare cases where the policy creates environmental conditions that can have direct conditioned effects; i.e. independent of conscious awareness.

The second aspect is documenting attitudes towards the intervention by the target population, as this can affect their responses to it. Policies that are unpopular are more likely to be resisted, and forms of assistance that are seen as inappropriate to the person's needs are unlikely to be adopted. Thus, a smoker who objects to smoke-free rules is more likely to ignore the rules or to seek convenient alternatives, while a smoker who approves and sees this as an opportunity to gain greater control over their smoking may not only comply, but use the opportunity to either quit altogether or reduce their consumption. A price increase will only cause smokers to try to quit if they see the increased price as making smoking no longer worth the cost. Alternatively they could smoke more of each cigarette to maintain the value, or shift to a cheaper brand, or seek out sources of cheaper cigarettes, or even re-interpret smoking as something more exclusive and thus desirable. Like awareness, acceptance can only really be evaluated at a population level, although it is typically the acceptance of each individual that is critical. In some collectivist cultures, the views of community leaders are also critical, as they determine what it is acceptable to think and do. These roles are in addition to the roles of leaders in all cultures as policy makers.

The third aspect is the evaluation of the actions that result: that is, the consequences or outcomes of the intervention in terms of both intended and unintended incidental effects. This is a function of both the actions taken by the individual and the potency of the intervention. While traditional intervention evaluation restricts its focus to outcomes among those who are encouraged to use the interventions, for policy interventions this is not a useful restriction; one must consider the total impact on the population, including those who are unaffected. Outcomes should be considered as a joint function of the potency of the interventions, the ways they are used or responded to (a function of attitudes to them), and the degree of exposure to them.

The theories behind tobacco control

A critical step in developing an evaluation framework is having a coherent theory or set of theories as to what tobacco control is about. This should extend beyond the list of tasks identified in the WHO FCTC to an analysis of how the various domains of intervention are theorised to contribute to the overall goal. The nature of the relationship between tobacco use and harm must be sufficiently understood to know what behavioural aims are appropriate. Such an analysis should consider the broad scope of potential impacts, not just those that are part of the rationale for implementing any particular policy initiative. For example, smoke-free places, introduced to protect non-smokers, also have beneficial effects on smokers and do not appear to have some of the adverse effects on economic activity that some had feared (Scollo et al., 2003). Detailed analysis of the conceptual foundations of specific interventions is provided in the relevant sections later in this volume. Here the WG addresses a few broader issues.

A broad schematic overview of key influences on tobacco use and tobacco-related harm is provided in Figure 1.1. This figure makes it clear that policy and socio-cultural influences have indirect effects on use and that the most proximal determinants of use are the product; cues in the environment; characteristics of people, including cognitions about the products; and the person's biology (both conditioned and innate). Further, the behaviour and the product jointly determine exposures, which, in interaction with existing biology,


determine harm (see Figure 1.6). The role of a systematic science of tobacco control is to analyse and clarify the components of this system and their interrelationships over time, with the aim of introducing interventions that will minimise the harms. Figure 1.6 is a generic model for this. It is possible to elaborate this figure to include other impacts of policies (see Figure 1.7). With generic models of this kind, areas that require greater attention can be expanded upon and boxes where things are more straightforward can be combined.

Figure 1.6 Schematic diagram of main pathways by which policies affect tobacco use, tobacco exposures and tobacco harms
(Diagram boxes: Policy-related interventions; Other influences; Tobacco industry; Propositions about tobacco; Sensory stimuli; Tobacco products; Conscious processing; Tobacco product contents; Tobacco use; Tobacco product yields; Patterns of use; Toxin exposure per use; Cumulative exposure; Tobacco harms)

Tobacco control efforts can be focussed on users and potential users of tobacco products (e.g. changing knowledge and beliefs), or they can be designed to directly reduce use (e.g. price and availability controls), or to reduce use indirectly by changing the environment to increase cues to inhibit use (e.g. warning labels on packs), or reduce cues to use (e.g. by constraining tobacco companies' marketing practices), or by changing the nature of the tobacco products on the market (see Figure 1.8). Efforts can also be directed at reducing the toxicity of tobacco products (targeting the industry), and at reducing the exposures of non-smokers (targeting tobacco users). To intervene in any of these ways with either people or companies requires a good understanding (theory) of how the factors producing unwanted effects operate and how the intervention will affect those operations. It is beyond the scope of this volume to spell out such a complex theory, although in each section, relevant elements are canvassed.

Tobacco industry controls

Tobacco industry controls are about targeting the 4 Ps of marketing: Product, Price, Place (or availability) and Promotion; to which a fifth P can be added, Packaging; and, unrelated to marketing, the imposition of specific obligations to provide information (for example, warning material) regardless of its impact on the marketability of the products. This is


Figure 1.7 Model from Figure 1.6 expanded to illustrate where effects other than on tobacco use fit in
(Diagram boxes: Policy-related interventions; Other influences; Tobacco industry; Propositions about tobacco; Sensory stimuli; Tobacco products; Conscious processing; Tobacco product contents; Tobacco use; Tobacco product yields; Passive exposures; Patterns of use; Toxin exposure per use; Cumulative exposure; Tobacco harms)
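The lower part of the pathway in Figures 1.6 and 1.7 (patterns of use and toxin exposure per use feeding into cumulative exposure) can be made concrete with a short arithmetic sketch. The multiplicative form, the function name and every number below are illustrative assumptions, not quantities given in this Handbook. The aim is only to show why behaviour and product need to be measured together.

```python
def cumulative_exposure(uses_per_day: float, exposure_per_use: float, days: int) -> float:
    """Assumed simplification: exposure accumulates as uses x exposure per use x time."""
    return uses_per_day * exposure_per_use * days


# Hypothetical baseline: 20 uses a day, one arbitrary unit of exposure per use, one year.
baseline = cumulative_exposure(uses_per_day=20, exposure_per_use=1.0, days=365)

# Product change alone: 20% lower exposure per use, behaviour unchanged.
product_only = cumulative_exposure(uses_per_day=20, exposure_per_use=0.8, days=365)

# Same product change, but the user compensates by using the product 25% more often.
with_compensation = cumulative_exposure(uses_per_day=25, exposure_per_use=0.8, days=365)

print(baseline, product_only, with_compensation)
```

Under these assumed numbers the compensating change in behaviour cancels the product change entirely, which is why an evaluation that tracked only the product, or only the pattern of use, could reach the wrong conclusion about exposure.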

achieved through a mix of laws and agreements, generally targeted at manufacturers or distributors, but in other cases, at other points in the supply chain (e.g. retailers). Evaluation of tobacco industry controls also requires an analysis of possible industry actions to counter the intended effects, or to otherwise minimise adverse effects on their business.

Product controls (see Section 5.3) include rules about what types of products can be sold (e.g. smokeless tobacco is banned from sale in some jurisdictions), levels of constituents or emissions (e.g. upper limits on tar, nicotine and carbon monoxide as measured by ISO standard testing; restrictions on additives/ingredients), or on engineering features (e.g. mandating reduced ignition propensity cigarettes, filters). The aims of


Figure 1.8 Schematic overview of tobacco control interventions and how they relate to tobacco products, users and potential users
(Diagram labels. Tobacco Industry Control: Production regulations (types, constituents, engineering); Price controls and taxes; Constraints on [...]; Controls on promotion; Controls on packaging; warning labels. Tobacco Use Control: Rules on use, e.g. smoke-free policies; Education campaigns; Cessation aids. Other boxes: Tobacco; Cues to use; Cues to inhibit use; Passive exposures; Tobacco users and potential users)


product rules vary from preventing new forms of tobacco (to a market) becoming established (e.g. bans on smokeless), to reducing their appeal (e.g. bans on flavourings), both of which are designed to reduce use, and rules to reduce the harmfulness of the products (e.g. constituent limits), which can also have direct effects on the harm caused.

Price controls (see Section 5.1) include efforts to dampen demand through increasing prices (e.g. taxation of various forms), which can have direct effects on use, as well as strategies to prevent price-related marketing (e.g. setting minimum and/or maximum prices to prevent discounting and other forms of price-related marketing).

Place or availability controls refer to efforts to reduce the availability of the products and include restrictions on the number or types of outlets, and to whom they can be sold (e.g. age limits and bans on vending machines). Many of the existing rules have been put in place to discourage use by young people, but restrictions could also be used to reduce impulsive purchases and/or to discourage use in certain venues (e.g. bans on sales in bars).

Packaging controls include rules about what can be on the pack (e.g. use of terms like "Light" and "Mild"; see Section 5.5). They also include rules that prohibit sale of single cigarettes and establish a minimum pack size to stop use of packs with small numbers of cigarettes, which are known to appeal primarily to young people (Wilson et al., 1987; Assunta & Chapman, 2004a; Prokhorov et al., 2006). The effects of such policies may operate through reducing cues to use, or, by making the product less attractive, through reducing the value of using such products.

Controls on promotion (see Section 5.4) are the most prominent form of control on the industry. They are essentially about reducing cues to use, but in doing so, might also reduce the appeal of the products. Controls include bans on paid advertising, sponsorships, and product placement, and encompass restrictions on packaging (including controls on the use of trademarks, e.g. generic packaging). Because tobacco is sold in a competitive market, some signs differentiating products as belonging to a manufacturer/marketer are necessary. Even in places where brand displays and advertising are banned at point of sale, a generic sign saying that tobacco is sold is allowed. This promotes availability. Tobacco retailers can also promote products to customers by word of mouth.

The final type of rules is independent of attempts to control marketing, and is about what form and content are required for warnings. The content may include facts about the adverse effects of tobacco use, benefits of quitting, and information about toxin levels (see Section 5.5). Here the aim is to discourage use or at least ensure that any continuing or new use occurs in the context of some information about the risks; that is, it provides cues to inhibit use. Warning and other risk-related information can be required on packets, at the point of sale, on any permitted advertisements, or in conjunction with any depiction of trademarks or commercial mention of products.

Tobacco industry controls are often about reducing cues to use tobacco, while tobacco use control efforts and information provision requirements directed at industry are about increasing cues to discourage use. For cues to use, the effect on behaviour is often conditioned such that they will stimulate tobacco use unless actively resisted. By contrast, cues to inhibit use are more likely to operate via conscious processing.

Evaluation of tobacco industry control is first about assessing compliance with the rules. This is unlikely to be an issue where the rules are to control obvious activities of small numbers of companies (e.g. compliance with labelling requirements), but can be an issue where there is more potential for avoidance (e.g. many potential actors or where the action is not so obvious; e.g. payment/avoidance of taxes). Evaluation is next about determining the effects of the rules. What is involved here varies as a function of whether the rules mandate some actions (e.g. warning labels, higher prices) or whether they mandate removing something (e.g. promotional cues to smoke) that would otherwise be there. In the former case, issues of reactions to the change need to be evaluated. In the latter, the extent


of previous response to the cues (or other things) removed must be known before the impact of their removal can be effectively evaluated. As noted above, it is necessary to monitor and evaluate any industry actions that might occur to reduce the impact of the rules on their businesses.

Tobacco use control

Tobacco use interventions are those targeted at tobacco users or potential users directly. They include rules about use, attempts to provide messages aimed at providing information and changing attitudes and beliefs, and programmes to deliver interventions that can facilitate appropriate behaviour change, or in the case of prevention interventions, effectively inoculate against uptake of any addiction-level use.

Rules about use include policies to make various places smoke-free (see Section 5.2). Smoke-free rules are generally designed to protect non-smokers, although in doing so they have effects on smokers that need to be understood. Rules could also be about which products could be used, and by whom. However, where there are restrictions on use of products or who can use them, they are usually also codified as rules against selling such products (e.g. smokeless tobacco) or selling to particular individuals (e.g. minors), so these are best considered under industry control even when the parallel restrictions are imposed on individuals as well.

Provision of messages essentially relates to mass media campaigns, where the intent is to expose as many people as possible to the campaign (see Section 5.6). This may include campaigns to promote programmes. Campaigns are designed to inform people and to make the issue emotionally salient enough to stimulate appropriate action. One of the enduring challenges of tobacco control is that because the main adverse effects of smoking are not evident until after a long lag time, smokers do not experience any significant sense of the harm they are doing, and thus tend to underestimate its harmfulness (Slovic, 1998). There are extra issues to consider in the evaluation of prevention campaigns. Focussing on an issue increases awareness of it and may increase interest, which if unchecked could lead to increased experimental use. Designing prevention campaigns or programmes in ways that overcome this increased interest requires thought. There is evidence that some prevention campaigns, especially those emanating from tobacco companies, can have adverse effects (Wakefield et al., 2006), presumably through the increased interest in the issue they engender.

Programmes to disseminate interventions include rules regulating cessation medications, provision of services, and subsidies to products or services (see Section 5.7). The kinds of products/services vary, including self-help resources, stop-smoking pharmaceuticals and coaching or advice programmes of various types. As noted earlier, this volume is not concerned with evaluating the efficacy of these products or services themselves, but with evaluation of their community-wide dissemination and use. Beyond this, there is interest in considering the effects of the existence of cessation services on the broader community. There is some evidence that awareness of the availability of quit-smoking programmes can stimulate quitting activity even among those who do not use the services (Ossip-Klein et al., 1991).

Evaluation of tobacco use interventions should consider both their intended effects and incidental effects. They need to be informed by a sophisticated understanding of psychological principles, and where there are competing psychological processes involved, it is important to put in place measures of all relevant processes. Where additional effects to those sought are known (or hypothesised) they can become further justifications for action (or inaction, if they are or might be undesirable).

Use of logic models

Achieving a comprehensive approach to tobacco control requires adoption of a range of different strategies, underpinned by differing constructs and theories. It is important to spell out the relevant concepts to consider in each area in which a policy intervention might be


planned. The WG has adopted the strategy of encouraging the use of logic models or flow charts to spell out the main constructs that need to be measured for each type of policy. The criterion we adopted was to divide an area to the point where the causal pathways were sufficiently different to make dealing with the various possibilities difficult within the one frame. The WG used Figures 1.4 and 1.5 as generic models, but as will be seen, found the need to modify them considerably for some policy areas. We accept that as knowledge about how some of these interventions work accumulates, new distinctions may become necessary, which could lead to further subdivisions of intervention type. Further, in some cases, distinctions may be shown to be of lesser importance, allowing some of the existing boxes to be combined. It is only once a coherent theoretical model of the domain has been established that determining the constructs to measure becomes possible.

Measurement issues

Measurement is critical to evaluation. To measure the concepts of interest, these concepts must first be defined in ways that make them amenable to measurement. These definitions constitute the constructs. Constructs can be operationalised in many ways. This operationalisation must come from a clear consideration of the concepts and thus of the underlying theory. Because constructs are defined in terms of the theory and not directly in relationship to what measures them, error is localised in the imperfect relationship between the underlying construct and the measures used to assess it.

Many of the concepts that need to be measured are not directly observable, or, where they are, they sometimes stretch the capacity of the respondent to recall or otherwise come up with a valid answer (e.g. remembering quit attempts months or years ago). As a result, most measures are subject to a range of possible biases as indicators of their target constructs. Exceptions are characteristics such as sex and date of birth, which in most cultures at least can be reported very reliably (although not in all).

One of the great challenges of measurement is that the measures that are most easily obtained are often not ideal operationalisations of the constructs of interest. For self-reported data, most things people report are used as indicators of behaviour patterns or of underlying beliefs, behaviour patterns and/or understanding, not as simple answers to the question.

The lack of direct measures also occurs for many physical measures. For example, cotinine levels are sometimes used to assess intake of nicotine or extent of smoking. However, because people differ both in size and in rate of nicotine metabolism, cotinine is a biased measure of intake or exposure at an individual level, although it can be a good estimator at a population level.

The problem of the measure that is available not being a direct measure of the construct of interest may be greater when existing data are used, as compromises are commonly made in the interests of being able to use what is at hand. These data were often collected for quite different purposes to those of focal interest, and thus the measures used are often of related constructs, not the exact ones being studied. Dependent on the study, evaluators may be forced to use measures of constructs with different limitations. They need a language to help them talk about the quality of measures in relationship to the constructs they are using the measures to assess. Unfortunately there is no consistent language for talking about these distinctions, and the WG were unable to develop one for this volume. The WG views the development of such a language as critical to reducing the potential for conceptual confusion that can occur from failing to consider the limits of specific measures to actually measure the constructs evaluators are interested in measuring.

Determining what to measure

Choice of potential measures begins with an elaboration of the theory or theories as to how the intervention might work, including the range of expected outcomes and potentially mediating (or intermediate) and moderating variables (effect modifiers), as well as incidental effects. It might also
consider questions like: "What outcomes will lead to health gains?" and "What might influence policy adoption and/or continuation?" Evaluators should also consider whether the same outcomes are relevant to all cultures. For example, in Islamic countries and others where alcohol use is prohibited or not socially significant, consideration of smoking policies in bars is of little interest. Also the relevance of some issues can change as a function of a society's status in regards to tobacco control efforts. For example, support for and reports of smoke-free hospitals are now so high in many countries, it is no longer necessary to ask. However, in countries where passive smoking has not become an issue, asking about smoke-free hospitals may be critical to assessing emerging community concern. This analysis identifies the concepts that it would be desirable to measure.

Next, evaluators need to consider how they want to operationalize the concepts as constructs. This needs to be done in a way that ensures that the constructs are structurally independent of related constructs they might want to relate them to in causal pathways. Further, they need to consider whether the construct can always be measured in the same way. Physical measures typically measure the same thing regardless of context, but answers to questions may not. For example, the direction of social desirability biases might switch as smoking becomes less socially normative. For any given study, they must assess how well the constructs of interest can be measured. Where adequate measures do not exist, there will unavoidably be gaps in the modelling. Sometimes these gaps can be covered, at least in part, by using sets of measures of related constructs.

In Chapters 4 and 5 of this Handbook the WG provides guidance on measures that might be used in various evaluation contexts. For any domain of interest we attempt to characterise constructs that might be measured as one of:

1. Core constructs: those that should be included whenever this domain is being studied. These will include key outcomes along with major theorized mediators and moderators. Not having measures of any of these is likely to compromise the study, or at least limit the range of inferences that can be drawn.
2. Important complementary constructs, to use for detailed investigation of a domain.
3. Other measures or indicators that may add some limited or uncertain value, but which we cannot recommend (for or against), or only recommend in limited circumstances.
4. Not recommended: these only need to be specified for commonly used measures that have been shown to have no utility.

The quality or validity of the measures used for each construct also must be considered. Validity of measures refers to the extent to which they actually assess the construct they are designed to. This can be assessed through the relationship between the measure and a gold-standard measure (criterion validity), or by showing that the measure relates to other theoretically related constructs as hypothesized (convergent validity). One form of convergent validity is predictive validity, where the measure is shown to predict outcomes as theorised. A valid measure of one construct is unlikely to be an equally valid measure of even a closely related construct.

Also, the validity of a measure may vary as a function of how it is being used. Thus reports of awareness of environmental cues are not a valid measure of the extent to which any single individual is exposed (because of differences in sensitivity), but may be a valid measure of overall community exposure (as the individual errors are assumed to cancel out across the population). Validity also only relates to the contexts in which it is established. As the context changes the validity of a measure may vary. For example, self-reported age is generally a valid measure of how old somebody is. This is so in cultures where birthdays (anniversaries) are important occasions, but may be less so in cultures where people take no notice of birthdays. Also the validity of measures varies directly with the precision required of the measures: measures that may be valid for detecting large-scale effects might not be adequate for detecting small effects.

The WG uses the following broad categories to provide an
indication of the quality of measures:

Gold standard measure. Established valid measure of a construct of interest that is better than alternatives in all ways.
Clearly validated outcome or predictor. There is evidence that this is a good way of measuring the construct, in at least some specifiable contexts. Limits to validity should be noted.
Evidence of utility. There exists some validity data, but it is not strong. It might be one of a range of alternatives with no clear way of differentiating between them. These should only be chosen when no better measure is available.
Face validity. This involves an analysis of the extent to which the question taps the construct, and may be all that is available for single item self-report measures.

Where possible, we also provide an indication of the sensitivity of the construct to measurement error. For example, how robust is a question to differences in wording? Or indeed, might wording or contextualizing statements need to differ by context and/or by characteristics of the respondent? For example, some questions need to change for use with current smokers as compared to ex-smokers; e.g. "How confident are you that you will be able to stay quit, if/when you try?" (the last qualifying phrase is not needed for ex-smokers). Users of this manual should keep in mind that the quality of a measure may be dependent on the type of study in which it is collected and the use to be made of it. The assessments made here assume the measures are made in appropriate circumstances.

Types of data used in evaluation

The type of data needed for evaluation varies, and in some cases it can be found in existing data collections, although sometimes measured in ways that are less than ideal for the new purposes to which it is going to be put. In some cases, measures of the variables of interest are available from more than one source. In these cases, decisions need to be made as to which sources of information are most useful. Issues to consider here are validity, practicality of collection, and the extent to which the data can be related to specific individuals. However, in most cases, the necessary information will need to be collected, giving the researcher greater control over the ways in which the relevant constructs are measured. Some of the main types of data and major ways of collecting it are outlined below.

1. Documentation of policies. Critical to any form of evaluation is documenting the nature of the intervention. Documentation of policy can occur at two levels: the espoused intent or formal policy (something that is typically documented), and the actual program of activity that is put in place to implement it (which is usually more difficult to document). Policy documents should be collated and coded in ways that allow appropriate comparisons to be made. There is now an international repository of information about the content of national tobacco control policies (see Section 4.1), making this task easier, at least for national-level policies. Some countries collect this information for sub-national policies, but in most cases, the information will need to be collected from each jurisdiction. Where there are many such sets of rules (e.g. of workplaces, local governments), it is usually more convenient to either obtain samples of policies, or to use respondents in population studies to report on the rules that apply to them. Clearly, this latter form is subject to the problem that ordinary people often do not know about rules, and where they do not, may respond in terms of what they remember. For example, when asked if there are bans on smoking in their workplace, some will know the formal rules and respond appropriately, whereas others may know the rules but respond in terms of what actually happens (e.g. if there is a rule, but it is ignored, they will report that there is no rule, interpreting the question to mean, "Can people smoke?"). Others will only be able to answer in terms of what they infer from their recalled observations, e.g. "Nobody smokes there, so it must be banned." This means that such reports may not
be able to help differentiate between policy existence and policy implementation. Indeed, generally there are difficulties in directly determining implementation, especially for complex policies independent of their effects. This is only a problem when the research questions include asking whether problems with a policy occur at the level of policy content, or are a problem of implementation.

2. Identifying changes in the environment or factors that might moderate policy effects. The challenges of doing this differ by the environment under consideration.

a) Mass media. Monitoring of national and regional media, with sampling of communities for audit of local media, is the most objective source of what is potentially available. This does not cover some important sources like the Internet. An aggregated respondent report is useful where there are sufficient observations per community unit. Individual reports are subject to sensitivity bias, such that when thinking about quitting, or trying to quit, the person is likely to be sensitized to mentions or images of tobacco or smoking. This means that respondent reports should not be used as indicators of exposure in most individual-level analyses.

b) Physical environment. These consist of rules about public tobacco use and cues to tobacco use from things like point-of-sale displays, billboards, and posters. They can be collected through observation in sampled settings. They may also be estimated from reports from relevant organisations (e.g. of workplaces as to the restrictions on smoking), but are assessed more often by reports from ordinary citizens as to what they experience, or for smokers, what they actually did (e.g. "when last at a restaurant, did you smoke?"). These reports can be averaged across communities to estimate overall levels of these features. Like other respondent reports, these are subject to sensitivity bias, limiting their use for individual-level analyses.

c) Production and sales data. Various forms of sales data, or proxies for sales data, may be available, usually related to reporting on taxes and excises. These may be national-level, but in some cases can be separated by type of outlet or locality. At a national level, there are some international repositories of this information (see Section 4.2). Self-report of price paid is a fairly accurate indicator of prices, but little is known of possible systematic biases.

d) Characteristics of tobacco products on the market. These include composition and engineering features of products and performance characteristics. These can either be gathered from the manufacturers or through independent testing.

3. Effects on and characteristics of individuals

a) Self-report data. Characteristics of individuals (knowledge, attitudes and behaviour) are generally only available from self-reports (some scope for proxy reports, but limited beyond smoking status). Self-report data can be of internal cognitive states that are not independently verifiable (e.g. of attitudes, knowledge or experiences), as well as of things that can, at least in theory, be validated, such as behaviours. Sometimes answers to questions can also be used to infer internal states of which the respondent is either not aware or not thought able to report accurately (e.g. personality traits). Many countries have routine behavioural risk factor surveillance studies and/or tobacco-specific surveillance studies, and these can be useful in a range of contexts. Many countries use standardised methods and questions, and are working towards common repositories of data (see Section 4.3). Self-reports are affected by question wording and by other aspects of the ways in which the information is collected (see Section 2.2 for some examples).

b) Physical measures. This includes biological and chemical measures (e.g. of cotinine levels). These are often used to measure behaviour indirectly, but this should be done with caution. Limitations of these measures as well as their
strengths are well documented (Benowitz, 1996a; Matt et al., 1999; Al Delaimy, 2002).

c) Proxy reports. For observable aspects of behaviour, reports of others who know the target individual may be useful.

Survey methods for evaluation

Survey methods are crucial to many forms of policy evaluation. These can range from surveys of individuals to surveys of informants about the activities of organisations (e.g. of governments or workplaces). Two key issues are addressed here: the sampling frame and the way the questions are asked and answered.

Sampling: To be able to generalise to a population, the sample needs to be representative of the population. This is a function of both the sampling frame and participation. It is thus desirable to have broadly representative samples, recognizing that true representativeness is unattainable. Participation is also crucial. Any biases in participation threaten representativeness. Because often nothing is known about all or some of those who do not participate, quantitative estimation of biases is either impossible, or partial at best, meaning their likely effects need to be inferred. The higher the response rate, the less likely major biases are, but unless the rates are close to 100%, biases can occur.

Sample size is another important consideration. The two main factors to consider here are the size of effects that are expected (or the required power to detect) and the desire to explore potential moderator effects. In principle, making a study larger does not improve its representativeness. However, because size does increase power to detect moderator effects, larger samples can be used to increase confidence in the generalisability of the findings to all groups who have a sufficient sample size for such possible interactions to be tested.

Question asking: The main issue with surveys is inconsistency and bias in the ways in which people respond to questions. This is part of a general phenomenon of the frame of reference or context for the question affecting how it is understood, and thus how it is responded to. Variation in frame of reference includes mode of surveying (e.g. face to face vs. phone interview vs. self-completion). There is emerging evidence that some modes of surveying result in better response rates for sub-sections of the population. There is an urgent need for research to develop optimal methods for calibrating both questions and sample characteristics across modes (see Dillman & Christian, 2005, for a discussion of general issues concerning mixed-mode surveying). As it is beyond the scope of this volume to document the entire range of issues corresponding to questions (there are several excellent texts on this topic; e.g. Foddy, 1993; Fowler, 2001), we deal only with two issues in this chapter. These are the time frame over which answers apply, and cultural factors in interpreting question meaning.

The time interval over which the response is deemed to be valid is a crucial issue in testing causal models. Causes precede effects, so one must assume that predictor variables, when measured at the same time as outcomes, predated the occurrence of the outcomes. Sometimes questions are given a time frame or timing of events is asked for to assist in determining sequences. Self-reports of periods or of dates are subject to biases in reporting, with events sometimes displaced in time. Self-reports are typically better for recent events (due to memory effects). Salient events may be reported as experienced more recently than in reality, and less salient events are prone to be forgotten.

Aside from issues concerning the context of survey delivery, the way in which respondents interpret questions and response formats affects their answers. One key aspect is the extent to which the conceptual framework underpinning the questions reasonably applies across the cultural contexts under consideration. As research moves from studying issues like tobacco within Western European and North American cultures, to studying tobacco use across cultural settings where there may be different values and assumptions, there is a need to question the underlying assumptions that frame the research. Within all cultures, there will be variation that researchers should try to characterise and understand. The possibility that cultural differences may compromise the
utility of some questions needs to be reviewed on a case-by-case basis. Some of these issues and methods for overcoming them are covered in Section 2.2.

In principle, the response to a question can be directly compared when the respondents are answering the same question. People generally assume this means the same wording. However, under some conditions, the same wording can result in quite different questions being answered, and different wording may be required to achieve equivalence. The most obvious example is asking questions in different languages, but it can occur for the same language where respondents' assumptions about what is being asked can vary systematically, and achieving equivalence requires different contextualising words for different individuals. This can be caused by words having different nuances in different cultures, or effects due to the familiarity and/or normativeness of the issues being asked about.

As surveys become standardised, there is a tendency for surveys to converge on common ways of asking questions, thus implicitly operationalising the constructs they are interested in. To the extent that either the operationalisation has an arbitrary element or the measure is flawed, there is a risk of institutionalizing error. To avoid this, it may be important to analyze whether different ways of asking questions may improve the ability to measure a construct. There is always a role for asking questions in different ways. Where the answers are relatively invariant to the form of wording, one can have considerable confidence in generalisability across the inevitable wording differences between languages. However, where responses are sensitive to wording, it is less likely that different forms are actually measuring the same construct, and extra care will be required in translation.

Study designs for evaluating population interventions

To best understand the implications of policy change (including community-wide dissemination of interventions), research designs should be as strong as possible. In Section 2.1 the relative strengths of various evaluation designs are canvassed. In short, evaluation is strengthened with more observations (both before and after the intervention) within the population an intervention occurs in, the more populations that are studied in parallel, and the more alternative explanations for outcomes that are assessed within each study. In addition, the use of cohorts adds considerable power by allowing mediation and moderation effects to be tested more precisely. Finally, representativeness of the sample to the study population can increase the generalisability of findings. The ITC study (Fong et al., 2006a) is a good example of what can be achieved by attempting to implement as many of these attributes as possible.

Achieving the strongest possible evaluation involves putting in place measures of key outcomes (at least) as long as possible before the policies are implemented. Obviously the best way to do this is if the measures can be part of the country's ongoing surveillance system. Where this is not possible, the studies should be implemented as early in the process of discussing policy change as possible.

For detection of trends, it is important that both sampling frame and participation rates remain constant. This is to maximise the likelihood that biases remain constant, so that any changes are unlikely to be due to sampling effects. Repeatability is more important than representativeness for determination of trends because it requires comparability between estimates over time.

Such a research agenda requires monitoring of all relevant variables in a diverse range of communities or jurisdictions over a period of time in which there are differences in policy implementation between those communities. This will include use of repeated cross-sectional surveying, and where possible, more in-depth longitudinal cohort studies of samples of relevant individuals (e.g. smokers, and young people at risk of uptake), to begin to explore how the changes come about and whether some groups are affected differently to others. This surveying will need to be complemented by longitudinal monitoring of ecological variables. The level (nation, state, local area) of the variable measurement will determine the
practicality of maintaining ongoing monitoring of all activity or whether some sampling is necessary.

Such a program of data collection is needed to provide the infrastructure necessary for understanding the mechanisms of population-level change. Among other things, it would increase understanding of which factors are culture-sensitive, and which are not, and how the roles of various factors change as a person's position towards changing and adopting target behaviour changes. Similarly, it would allow for an understanding of how community readiness to change affects realized change and how readiness can be modified, as well as the conditions that facilitate the institutionalization of change. For policy makers, it can provide information on need for further action.

Drawing conclusions about causes

The approach the WG has taken to evaluation shares more with the methods used in epidemiology to determine causes of illness, than the reliance on RCTs to assess clinical interventions. As a result, when considering criteria to use in drawing conclusions about the effectiveness of policy interventions, we have adapted the criteria used in the epidemiology of disease (Hill, 1965). The adapted criteria are:

Magnitude of the observed effect, particularly in relationship to known naturally occurring variations;
Temporal relationship between intervention and change in target outcome;
Exposure-response gradient;
Biopsychosocial plausibility; that is, the effects can be explained as occurring through a plausible mix of biological, psychological and/or social process;
Coherence across lines of evidence with different threats to validity, e.g. similar results using aggregate data and self-reported consumption could rule out response biases;
Coherence of results from demonstrations of effects on different parts of the theorised causal pathway, or by demonstrating efficacy of components (e.g. the evidence of efficacy of many cessation aids makes it more likely that they have effects when delivered as part of programmes of help);
Evidence that this type of intervention can have effects on other comparable outcomes (e.g. on other behaviour patterns);
Consistency of observed effects across studies and populations, or clear patterns in the variability to demonstrate limits to generalisability;
To which we would add: Elimination of theoretically possible alternative mechanisms for explaining the observed effects.

Policy evaluation has added challenges to other forms of outcome evaluation, because policies usually occur in a mix and policies are only one set of factors that are responsible for the outcomes of interest. Smoking prevalence or rates of quitting are determined by multiple factors, and establishing the contribution of each individual intervention is difficult. The task of differentiating the contribution of all possible contributors to the observed effects is difficult.

In providing a summative evaluation of the effects of an intervention, we need to not only consider the size and nature of effects, we also need to consider the possibility that there is no meaningful effect. In particular, it is important to make a clear distinction between evidence of the absence of effects, and the situation where there is a lack of evidence; that we really do not know whether an intervention works or not. We recognize that science cannot prove the null hypothesis, but it can and should make statements about interventions where there is a consistent failure to find evidence of any meaningful effect.

We need to qualify effects with a statement about generalisability. Some interventions have similar effects in most contexts, others can be quite context-specific. This consideration needs to cover cultural adjustments to the intervention itself, as well as factors in the environment that might affect its potency (effect moderators). It is also important to consider the direction of effects. Some interventions might prove counter-productive. Clearly less evidence should be required to stop an intervention where the evidence suggests that it is counter-produc-
tive, than if it suggested no effect quitting. However, nobody has helping to show that policy
or only a small positive effect. shown that there is more quitting evaluation has rigorous methods
The levels of evidence in the context of stronger health and can make important
framework used to evaluate warnings being introduced. How contributions to knowledge.
discrete interventions is not reliably can one conclude that We also hope it will act as a
appropriate for use in evaluating stronger health warnings stimulate stimulus for further action to
policy interventions. We see more quitting? improve evaluation methods and
promise in adapting the criteria Finally, once the effectiveness measures. As such, this Hand-
used by the International Agency of an intervention is established, book will need to be kept as up-to-
for Research on Cancer (IARC) less powerful research designs date as possible. This might
for its Cancer Prevention will be needed to monitor involve periodic revisions once the
Handbooks. This is essentially a continuation of effects and/or to principles have been tested, or
four-level system: Sufficient evi- assess whether similar magni- some other mechanism for
dence of an effect, Limited tudes of effect are attained with moving our expected standards
evidence, Insufficient evidence, new populations. It is only when forward. There is a particular need
and Evidence suggesting lack of there is reason to believe that to update the material on specific
effect. The WG's concerns with there are real differences that measures and on the status of
adapting this framework to our stronger research methods might data repositories, as these are in
purposes, is that it does not allow need to be reapplied. a constant state of change.
for gradations in confidence of We hope this Handbook will
concluding no effects, it does not How to use this Handbook provide a stimulus to work towards
clearly differentiate adverse greater coordination of the ways in
effects, and it does not consider This Handbook is designed as a which policy evaluation operates
issues of generalisability, all of guide for program and policy and the development and/or
which are desirable qualifiers in evaluators. The WG hopes it will expansion of international reposi-
the policy context. One possibility be used as a tool for training new tories to collect the relevant data
would be to adopt a matrix as evaluators and those who need to and reports, and user-friendly
shown on this page, with understand evaluation principles. ways to extract this information
additional statements on effect It can act as a reference source for and synthesise it.
size (for established effects) and arguments about the role of Some future actions the WG
on generalisability. evaluation and the way to think would like to see:
The effect size could be rated about evaluation, and by
as: Small, Medium, or Large (or extension the development of Work to coordinate and arrive
undetermined). Consideration effective interventions. In doing at a set of core terms that are
needs to be given to whether the so, we hope it provides a most useful for our field.
highest level of certainly could be framework for increasing the Work on what the criteria for
applied to interventions where scientific credibility of the field, by validation should be for the
there had not been a direct
demonstration of effects on the The evidence matrix
target outcome, or whether
inferred effects could ever be No evidence is available
rated as better than Probable. For Possible effect: Negative Not meaningful Positive
example, it has been shown that
Probable effect: Negative Not meaningful Positive
larger health warnings lead to
more thought about quitting, and Established effect: Negative Not meaningful Positive
that more thoughts predict future

CHAPITRE1.janvier12:Layout 1 12/01/2009 13:29 Page 31

Ensuring effective evaluation of tobacco control interventions

various kinds of measures used, and how that relates to the different types of measures.
• Development and agreement on use of prototype formats for reporting on frequently repeated interventions, such as mass media campaigns. This will facilitate their combination into meta-analytic studies, especially important for understanding where and when things work.

In conclusion, this volume should be thought of as an important step in a process, rather than as a static recipe book for evaluating tobacco control interventions. The methods described and the measures provided are the best available today. The principles outlined in this volume will persist, but those principles require that methods and measures be adapted to the changing world. The WG has built into this Handbook some guidelines for seeking out the latest methods and some guidance in assessing the need to move beyond the measures and methods described here. We believe that this dynamic but systematic approach is the best way to approach the future because it provides a framework that allows evidence to guide action both before and after programmes or policies are implemented.

2.1 The importance of design in the evaluation of tobacco control policies

Introduction

The goal of this section is to describe elements of research design for evaluation studies and how they can form the basis for stronger conclusions about the impact of policies. The groundwork for evidence-based medicine has come from painstaking evaluation studies of treatment options. It follows then that the foundation of an emerging evidence-based public health policy must begin with building a database from rigorous evaluation of public health policies. It should be noted that the elements of research design that we offer in the domain of population-level tobacco control can easily be applied in efforts to evaluate any population-level policy or intervention in public health. Just as surely as the laws of gravity operate in Mumbai as they do in Lyon, the principles of causality, and the methods employed to make more confident judgments about causal relations, are not constrained by location or area of research.

This section does not offer a comprehensive review of evaluation research design (see Cook & Campbell, 1979; Shadish et al., 2002; Rossi et al., 2003 for [...] and Rootman et al., 2001 for the evaluation of health interventions). We focus on impact evaluation, that is, whether the implemented policy led to desired outcome(s), rather than other forms of evaluation, such as process evaluation (e.g. identifying and evaluating the processes that led to the creation and/or the implementation of a policy).

More specifically, our aim is to highlight how the inclusion of specific features in the design of a policy evaluation study can lead to more concrete conclusions about the possible causal impact of that policy. This section focuses mostly on the structural aspects of research design. Good evaluation design involves the selection of appropriate measures of high validity and reliability. Guidelines and recommendations for such measures, across tobacco policy domains, are provided in other sections of this Handbook.

This section does not provide a review of the statistical analyses that are employed in evaluation studies. However, we do wish to point out one common misconception about the role of statistical methods in attempts to ascertain causality from data: [...] design, not in the statistics. No statistical method, not even those whose name may imply some special status in this regard (e.g. causal models), can confirm causal direction. A structural equation model (with or without latent variables) that yields a significant coefficient for A→B cannot be used by itself to conclude that A causes B rather than B causes A. To do so would be to fall prey to the logical error of affirming the consequent:

Statement: If A causes B, then the A→B path will be statistically significant
Observation: The A→B path is statistically significant
False Conclusion: Then A causes B

The advantage of more advanced statistical techniques is that they can take into account characteristics of the data to yield a better estimate of the A→B path coefficient. For example, structural equation modeling with latent variables (Bollen, 1989; Hoyle, 1995; Kline, 2005) explicitly models the measurement error from multiple measures of a construct (latent variable), so that the resulting estimate of the relation between that latent variable and another variable
[...] that would otherwise have biased the estimate¹. However, this statistical method does not advance in any way the argument that A causes B rather than B causes A. In fact, a system of variables with paths going in one direction will yield exactly the same model fit as if that same system of variables had all the paths going in the opposite direction.

¹ This assumes that the common variance of the multiple measures of the construct perfectly captures the latent variable that the measures are intended to capture.

The key to advancing the quest for causality is to be found instead in the design of a study. Here we offer a review of the elements of the design of evaluation studies that will increase the confidence with which causal statements can be made between and among variables (e.g. whether a tobacco control policy had a desirable causal impact on behaviour).

In our review of research design features for the evaluation of tobacco control policies, we describe the framework of the International Tobacco Control Policy Evaluation Project (ITC Project), which incorporates a number of the design features that are discussed here (Fong et al., 2006a; Thompson et al., 2006).

The importance of pre-evaluation knowledge in the design of evaluation of policies

The planning and design of evaluation efforts should be the first step in the process of formulating and implementing a policy (or any kind of intervention). This suggestion is part of the recommendations for best practices that the US Centers for Disease Control and Prevention created for tobacco control programmes in 1999. They strongly recommended that 10% of the total budget for a comprehensive tobacco control programme be allocated for evaluation and surveillance efforts associated with the programme (1999a). The WHO EURO Working Group on Health Promotion Evaluation made a similar call for resources for proper evaluation (Rootman et al., 2001).

Planning should first identify the constructs that are theorized to be affected by the policy being evaluated (i.e. outcome variables and mediators), as well as those that could influence the strength of the impact of policies on those outcome variables and mediators (i.e. moderators). The choices of which constructs to include in an evaluation study come from this process. This Handbook provides descriptions of the constructs, and their measures, for many of the Framework Convention on Tobacco Control (FCTC) policy domains.

Identification of other possible events that might act as confounding factors (e.g. other tobacco control policies being implemented and programmes in operation, tobacco industry initiatives) should also be addressed in the planning stage. Knowledge of possible confounders may allow additional variables to be measured or design features to be incorporated, so that the evaluation of the policy can explicitly take them into account.

Causality

Ultimately, the goal of scientific inquiry is to attempt to identify causal relationships. The concept of cause has challenged and vexed philosophers and scientists alike through the centuries. The seminal work of epidemiologists, such as Doll and Hill (1950, 1954), Wynder and Graham (1950), and Levin et al. (1950), on the association between smoking and lung cancer, stimulated the thinking about identifying criteria that would be used in the determination of causality in epidemiology. This influential work was the basis of the US Surgeon General's Report of 1964, and was summarized in several articles including one by A. Bradford Hill (1965). We have adapted the original nine considerations of Hill, in assessing the strength of evidence, into seven criteria concerning the possible causal impact of a tobacco control policy:

• Consistency of observed associations across studies and populations
• Magnitude of the reported association
• Temporal relationship between intervention and change in target outcome
• Exposure-response gradient
• Biopsychosocial plausibility
• Coherence of results across other lines of evidence
• Evidence that this type of intervention can have effects on other comparable outcomes (e.g. other behaviour patterns).

From criteria for causality to research design: the framework of Cook and Campbell

Cook and Campbell's (1979) seminal treatise on the relationship between the research design of a study and the strength with which a causal relationship might be ascertained is our starting point for a discussion of how design features can be employed to evaluate the impact of population-level tobacco control policies.

Central to the Cook and Campbell framework is the concept of validity. Cook and Campbell defined four kinds of validity that are critical in assessing the validity of a causal statement: construct validity, external validity, statistical conclusion validity, and internal validity.

Construct validity refers to the extent to which a measure captures the construct that it is intended to assess. An issue that arises in considering construct validity is the method of measurement and whether there exists a close or distant relationship between those measurements and the construct. In the area of tobacco control, examples include: Is cotinine a valid measure of exposure to tobacco smoke? Is the Fagerström Test for Nicotine Dependence (Heatherton et al., 1991) a valid measure of nicotine dependence? What are the most valid measures of perceived risk among smokers? These basic measurement issues must be dealt with in order for the validity of a causal inference to be addressed with any substance or meaning. Sections 3.1 to 3.3 of this Handbook review the construct validity of measures to assess the effectiveness of tobacco control policies.

External validity, also known as ecological validity, refers to the extent to which the conclusions of a given study are maintained across different persons, settings, treatments, and outcomes (Shadish et al., 2002). External validity considers issues such as whether a phenomenon studied in a laboratory setting, often involving university undergraduates, will be obtained in a real-world environment, which includes individuals from the general population. However, in the public health realm, two issues of external validity (whether or not the issue is expressed in these terms) arise. First, there is the importance of sampling. In evaluating a tobacco control policy being implemented in a large and diverse population (e.g. in an entire country), probability sampling methods will provide the best assurance that the study sample will be representative of the population from which the sample has been drawn and to which the intended intervention is directed. To the extent that a sample deviates from a representative sample, the external validity may be correspondingly reduced; however, it should be noted that this conclusion is not automatic. It may be that the way in which a sample deviates from the population is not (strongly) associated with the variables being analyzed; thus, the net impact may not be as great as might have been expected.

Another way in which external validity applies to the evaluation of policies and interventions is in the distinction between efficacy and effectiveness (the former referring to a treatment effect in a controlled context, and the latter referring to the effect of that same treatment in a more real-world setting). In general, effectiveness is lower than efficacy. Interventions originally developed and tested in highly controlled experimental settings are often not as effective when implemented in the real world. This necessitates changes in an intervention when it is brought into real-world settings, in order to maintain the effectiveness it showed in the more controlled settings.

The two types of validity described above set the stage for the next two forms, which deal with the relationship between two variables and whether the measured association is indicative of a causal relationship. For simplicity, our discussion revolves around whether there is a causal relationship between two variables, although the logic applies to relationships among more complex sets of variables.

Statistical conclusion validity refers to whether there exists a statistical association between the two variables. Issues surrounding
the consideration of statistical conclusion validity include: statistical power, assumptions of the statistical tests being employed, the inflation of Type I error rates due to the conduct of multiple statistical tests, unreliability of measures, as well as the selection of appropriate covariates/control variables in estimating the relationship between the two variables. Though correlation is important and necessary, it is not sufficient to establish a causal relationship, as captured in the dictum that correlation does not suffice to establish causation.

Internal validity refers to the extent to which the study's design is rigorous enough to support the conclusion that the statistical relationship between two variables is due, at least in part, to a causal relationship. Here we focus on issues of internal validity, as adding design features to a study (e.g. a control group) is largely prompted by the objective of increasing the internal validity of the study. The most relevant threats to internal validity in the evaluation of tobacco control policies are presented in Table 2.1.

Basic study designs and features

We now proceed to a description of aspects of an evaluation study, and make a distinction between a study design and a study feature.

The study design is the structural aspect of an evaluation study, defined by three dimensions:

1. Who the study is collecting measurements from relative to the policy that is being evaluated. Some evaluation studies only measure the impact of the policy by collecting measurements from those who were exposed to the policy; other evaluation studies, however, measure the impact by also collecting parallel measurements from those who were NOT exposed to the policy.
2. When the measurements were collected relative to the policy's implementation. Some evaluation studies only collect measurements after the policy was implemented; others collect measurements both before and after the policy was implemented.
3. How many measurements are collected. Evaluation studies vary in the number of measurement time points, ranging from a pre-post design involving one pre-policy and one post-policy time point, to a time series design involving many measurements over time.

A further design parameter arises in evaluation studies involving more than one measurement over time; that is, whether those multiple measurements are obtained on the same individuals (the longitudinal or cohort design) or on different individuals (the repeat cross-sectional design).

In contrast, a study feature is a non-structural aspect of a study whose inclusion will enhance the ability to address threats to internal validity. One such feature is the inclusion of multiple measures within the domain of the policy that is being evaluated, toward the goal of achieving convergent validity (multiple measures of the same construct should be related to each other). For example, in a study of the impact of graphic warning labels, we would have greater confidence that there was a causal impact of the labels if, after being exposed to them, smokers were significantly more likely to: (1) self-report that the warnings made them think about the health risks of smoking, (2) call a quit line, and (3) cite the warnings as a reason for seeking assistance for quitting, than if only one of these measures was included in the study.

Another study feature is the inclusion of measures that are relevant to some other policy that is NOT being evaluated, as it is not changing in the study population, toward the goal of establishing discriminant validity (i.e. measures of different constructs should NOT be so related to each other). In the policy evaluation context, measures of the non-changing policy should NOT show change that is comparable to that in measures of the policy under evaluation. In addition, the inclusion of measures that will allow the testing of mediational models is designed to elucidate the causal pathways between the policy and an important outcome variable, such as a quit attempt. For example, in an evaluation study of graphic

Table 2.1 Threats to internal validity in the evaluation of tobacco control policies

AMBIGUOUS TEMPORAL PRECEDENCE: Lack of clarity about which variable occurred first may yield confusion
about which variable is the cause and which is the effect.

Cross-sectional survey data are particularly vulnerable to this threat.

SELECTION: Differences in respondent characteristics between groups that could also cause the observed effect.

For example, observed differences between countries could be due to characteristics of the inhabitants rather
than to differences in policies. Cross-sectional studies are particularly vulnerable to this threat.

CONCURRENT EVENT CONFOUNDING (HISTORY): Events occurring concurrently with treatment could cause the
observed effect.

For example, observed differences between countries could be due to other events or some other intervention (e.g.
mass media campaign) rather than to differences in policies. This kind of confounding also includes activities of
tobacco companies, which may be covert. These other events can cause the observed effect to seem stronger or
weaker, positive or negative, compared to the policy/intervention's true effect. Concurrent event confounding could
occur in longitudinal (cohort) studies, as well as in cross-sectional studies.

TEMPORAL TREND CONFOUNDING (MATURATION): Naturally occurring changes over time could be confused
with a treatment effect.

For example, trends over time occurring prior to the policy being evaluated, that are unrelated to the policy, could
mimic the expected impact of policy or an adverse impact of policy (e.g. bar revenues dropping prior to the
implementation of the policy could be the cause of a decrease in bar revenues observed after a smoke-free law
compared to before the law).

ATTRITION: Loss of respondents to treatment or to measurement can produce artefactual effects if that loss is
systematically correlated with conditions.

Artefactual effects due to attrition can occur in cohort surveys of different groups (e.g. countries) where the attrition
rate varies across the groups, and that attrition is linked to the outcome variable either directly or indirectly, via its
linkage with an important predictor of that outcome variable. Related to attrition is non-respondent bias, in which non-
respondents in an evaluation study could be differentially affected by the intervention (e.g. the very disadvantaged,
who may be missed by both the intervention and its evaluation). Note that attrition effects in cohort surveys and
selection effects in cross-sectional studies both involve biases in the sample that could lead to artefactual effects.

CONDITIONING (TESTING): Exposure to a test can affect scores on subsequent exposures to that test, an
occurrence that can be confused with a treatment effect.

An example of this threat is the presence of time-in-sample effects in cohort studies: participation in prior waves of
a survey changes the responses at the current wave (e.g. knowledge items, if repeated, can lead to observed higher
levels of knowledge because of taking part in prior surveys).
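
As a minimal illustration of the temporal trend threat listed in Table 2.1, the following Python sketch uses hypothetical quarterly bar revenue figures (invented for this example, not taken from any study) in which revenue is already falling before a smoke-free law takes effect. The helper name fit_line, the variable names, and all numbers are assumptions made for illustration only. The sketch compares a naive pre-post difference with a difference measured against the projected pre-law trend.

```python
from statistics import mean


def fit_line(ts, ys):
    """Ordinary least-squares line through (t, y) points; returns (slope, intercept)."""
    t_bar, y_bar = mean(ts), mean(ys)
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * t_bar


# Hypothetical bar revenue index by quarter. The smoke-free law (X) takes
# effect between quarter 4 and quarter 5. Revenue is already falling by
# 2 points per quarter before the law.
pre_t, pre_y = [1, 2, 3, 4], [100.0, 98.0, 96.0, 94.0]
post_t, post_y = [5, 6, 7, 8], [92.0, 90.0, 88.0, 86.0]

# Naive pre-post comparison: attributes the whole drop to the law.
naive_effect = mean(post_y) - mean(pre_y)

# Trend-adjusted comparison: project the pre-law trend forward and compare
# the observed post-law values with that projection.
slope, intercept = fit_line(pre_t, pre_y)
projected = [intercept + slope * t for t in post_t]
adjusted_effect = mean(obs - exp for obs, exp in zip(post_y, projected))

print(f"naive pre-post difference: {naive_effect:+.1f}")
print(f"trend-adjusted difference: {adjusted_effect:+.1f}")
```

With these invented figures the naive comparison shows a drop of 8 points, while the trend-adjusted comparison shows no change, because the post-law values lie exactly on the pre-law trend line. A straight-line projection is only one simple way to represent a prior trend and is itself an assumption; it requires several pre-policy observations.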


warnings, confidence that the introduction of graphic warning labels was responsible for an increase in quit line calls, rather than a mass media campaign, would be greater if there were measures included of the mass media campaign (e.g. recall measures of the campaign), and if these measures were not correlated with the likelihood of quit line calls.

In short, the internal validity of an evaluation study can be increased by including multiple measures of the policy, or other intervention, that is hypothesized to be responsible for the policy's impact, as well as measure(s) of other possible causes.

Designs for evaluation studies

In considering designs, we use the terminology of Cook and Campbell (Cook & Campbell, 1979; Shadish et al., 2002) in which X stands for the treatment/policy that is being evaluated (e.g. introduction of graphic warning labels, increase in taxation, smoke-free legislation), and O stands for an observation (e.g. a survey data wave, quarterly report of cigarette consumption, or a set of data gathered by an air quality monitoring device).

Designs without control groups

The one-group posttest-only design:

In this design, the researcher has conducted one post-policy observation on some relevant unit of analysis. For instance, the unit could be human respondents to a survey, consumption figures from an economic database, or a venue at which the levels of respirable suspended particulates are being measured. The diagram of this design is as follows:

X O1

O1 occurs after the policy X has been implemented.

In this post-only design, there is no sense of what the observations would have been in the absence of X; therefore, this design alone is very poor. It does not defend against any of the threats to internal validity except ambiguity about temporal precedence. The history effects, and all threats associated with changes over time, are uncontrolled.

Given that virtually none of the threats to internal validity is dealt with in this design, its value for evaluating policies, or interventions of any kind, is low. And yet it should be noted that the absence of a pre-test in this design often arises when the need for evaluation is recognized too late for a proper pre-test to be planned and implemented. This highlights the need for evaluation strategies to be established well before the intervention is applied, as discussed earlier.

In an effort to estimate the impact of X, researchers sometimes ask post-only respondents to recall their behaviour, opinions, or attitudes prior to X, or to make a judgment as to how X has affected them since. One should be cautious about the findings of studies relying solely on such strategies, as considerable experimental and survey evidence has demonstrated that such recall is subject to strong retrospective biases related to the respondents' theories on how the intervention might have affected them. These recall biases can occur when the respondent remembers the past as being more similar to the present than it actually was (consistency bias). When respondents are asked to estimate whether an intervention affected them, the recall bias could be in the direction of greater contrast (i.e. remembering the past as being more discrepant from the present than it actually was, with the magnitude of this contrast bias being correlated with the respondent's belief about the strength of the intervention (Conway & Ross, 1984; Ross, 1989; Pearson et al., 1992)).

Another, more promising method of amplifying the value of the one-group posttest-only design is to incorporate data about pre-policy observations that are available from other sources. For example, if a new tobacco surveillance survey were created after a tobacco policy had been implemented, incorporating prevalence data from other surveillance surveys conducted prior to the policy would offer some comparison with a pre-policy measurement. The adequacy of this strategy would depend on the similarity between the two surveys (e.g. sampling,
method of measuring the outcome variable(s)).

The one-group pretest-posttest design:

This design adds a pre-policy observation to the previous design, and is denoted as follows:

O1 X O2

Here the addition of the pre-policy observation allows the computation of the difference score, O2 − O1, some portion of which might be causally attributable to the intervention X. The presence of an explicit measurement of the pre-post difference makes this far superior to the post-only design.

This design is considerably better than the one-group posttest-only design. There is an explicit measurement prior to the policy that is not inferred or reliant on the validity of a respondent's memory or estimate of effect. The O1 acts as a control against which the post-policy measurement O2 can be assessed. In a repeat cross-sectional design, when O1 and O2 are taken from different samples in the same population, the control exists at the level of the group. In a cohort design, when O1 and O2 are measured from the same individuals, there is an additional level of power: each individual acts as their own control. Thus, response tendencies (e.g. the tendency to use the high end of a response scale, or to agree with survey questions (also known as acquiescence bias)) are controlled for at the individual level. This leads to greater statistical power, and the magnitude of this increased statistical power is a function of the extent to which individuals' responses at O1 and O2 are correlated.

Multiple pretest-multiple posttest design:

This design extends the single-group pretest-posttest design by the inclusion of additional pretest measurements and multiple posttest measurements within the group that received the policy/intervention, as in this example with 3 pretest and 3 posttest measurements:

O1 O2 O3 X O4 O5 O6

With many time point measurements, this design becomes a time series design. Variations within this multiple time point model include the multiple pretest-single posttest and the single pretest-multiple posttest designs. These designs provide opportunities for assessing the impact of policies/interventions on the time-related trends in the outcome variable that are unrelated to the policy, but which, without knowledge or measurement of those trends, would bias the measurement of the policy's impact. When present, time-related trends constitute an important confounding factor against which the effect of the policy must be evaluated. An example of the importance of taking into account these time-related trends is presented later in this section.

In addition, designs with multiple measurements over time allow the evaluation of policies/interventions whose intensity varies over time, permitting the possibility of correlating intensity of intervention (e.g. measured by programme expenditures) with its corresponding impact. An example of this approach was used in studies evaluating the California Tobacco Control Programme, which distinguished between three time periods characterized by different levels of programme intensity: pre-programme, early programme, and late programme (Pierce et al., 1998a).

Designs with a separate control group but with no pretest

Posttest-only design with non-equivalent groups:

In this design, a control group is added to the one-group posttest-only design. This design can be utilized if the evaluation process started too late to conduct a proper pretest measurement. If individuals were randomised to conditions, the groups would be equivalent on average, as randomisation equates groups with respect to all features of the individuals being measured. However, in the evaluation of national-level tobacco control policies, or in other cases where
the unit of intervention is a jurisdiction or organization, there is no possibility of randomisation, and hence, no possibility of equating groups². The resulting design is the posttest-only design with nonequivalent groups:

X O1
  O2

Case-control studies fall into this category, and often include various procedures to enhance the possibility of causal inferences, such as methods for matching the two nonequivalent groups. Issues surrounding these methods are well-identified in the epidemiological literature (Rothman & Greenland, 1998), but it should be noted that some of them, although possible with medical records among patient populations, may not be possible for implementation in evaluation studies of national-level policies.

Pretest-posttest designs with a control group:

This design is the basic quasi-experiment in which the pre-post measurement of the group that received the policy is compared to another group that did not receive the policy:

O1 X O2
O3   O4

The quasi-experimental design combines both elements that were used to enhance the internal validity of the one-group posttest design; added is a longitudinal component and a between-groups component. In this design, the critical starting point for an assessment of the causal impact of X is the construction of a multiple difference score; the change over time of the intervention group is compared to the change over time of the group that was not exposed to the intervention. The expectation, if the policy was effective, is that the pre-post difference in the policy group will be greater than the pre-post difference in the non-policy group.

The internal validity of the quasi-experimental design, although generally greater than that of the single group pre-post design, is dependent on the extent to which the non-policy group is similar to the policy group (e.g. similar levels of economic development, tobacco use prevalence). The greater the similarity, the more reasonable the comparison will be.

Randomisation to conditions is impossible in studies of policies. The strategy of strengthening an evaluation study via control groups depends on the selection of those control groups and their similarity. Various strategies can be used to enhance the selection of control groups that are objectively similar to the policy/intervention group on dimensions that matter (e.g. smoking prevalence, socio-economic status, similar levels of tobacco control intensity prior to the policy/intervention that is being evaluated in the study).

It would be more reasonable, for instance, to compare the impact of graphic warnings in Canada to a control group in the USA than to a control group in Bangladesh. It should be noted also that the similarity is not limited to the characteristics of the group. Relevant concurrent events should also be similar in the two countries. If, for example, the impact of graphic warnings in Canada were compared over time with a control group in the USA, but during the time between the pre- and post-policy measurements there was a large decrease in taxes in the USA, but not in Canada, the test of the graphic warnings would be confounded by the fact that the control group had changed in ways that would mimic the hypothesized impact of the warnings. Although the discrepancy of the difference scores would be consistent with the

2 It should be noted that even in a fantasy world where people are actually randomly assigned to live in two different countries, one of which implemented a policy that the other did not, the randomisation would simply equate the personal characteristics of the respondents across the two groups. On average, the two countries would be populated by people who were equal on age, gender, age of initiation, number of past quit attempts, attitudes about the tobacco industry, etc. What would be left uncontrolled are the concurrent events that might occur along with the intervention being evaluated. The randomisation of people would offer no assistance in eliminating the possibility that observed differences between the two countries were due to differences in concurrent events. This demonstrates the limitations of randomised trials in the real world, even if such trials were possible.
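
The multiple difference score used in the pretest-posttest design with a control group can be illustrated with a minimal sketch. The percentages below are hypothetical values invented for illustration (they are not data from any study), and the function name is an assumption of this sketch.

```python
def difference_in_differences(o1, o2, o3, o4):
    """Compute the multiple difference score for the design

        O1 X O2   (policy group: pre-policy, post-policy)
        O3   O4   (non-policy group: same two time points)

    Returns (policy_change, control_change, did), where did is the change
    in the policy group minus the change in the non-policy group.
    """
    policy_change = o2 - o1
    control_change = o4 - o3
    return policy_change, control_change, policy_change - control_change


# Hypothetical percentages of smokers reporting that warnings made them
# think about health risks; invented numbers, for illustration only.
policy_change, control_change, did = difference_in_differences(
    o1=30.0, o2=42.0,  # policy group, pre and post
    o3=31.0, o4=35.0,  # non-policy group, pre and post
)

print("Change in policy group:", policy_change)
print("Change in non-policy group:", control_change)
print("Difference of the two changes:", did)
```

A positive difference score is what an effective policy would be expected to produce, but it supports a causal reading only to the extent that the two groups are similar and that no concurrent event (such as a tax change in one country) affected one group and not the other.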

conclusion that the graphic warnings had a desirable impact, the pattern of the data could also be explained by a significant unfavorable change in the difference score in the US control group due to the decrease in taxes.

This example points out that the structural features of the design endow an evaluation study with the potential for teasing apart possible alternative explanations, but that full realization of this potential is found in the selection of measures and analytic strategies that are designed to test for the causal mechanisms that underlie an observed difference between a policy group and a non-policy group. These strategies are described below in the section on mediation.

Threats to internal validity and methods for reduction

Having described some of the basic designs and strategies used in evaluation studies, we now proceed to a discussion of the threats to internal validity and methods for reducing them. As mentioned earlier, the rigor of an evaluation study is not only found in its design, but also in the features added to a study to enhance its power and internal validity. Examples are provided below.

Ambiguous temporal precedence:

A necessary, but not sufficient condition for causality is that a cause must precede the effect. The temporal priority condition provides challenges to cross-sectional studies by measuring possible causes and effects at the same point in time. It should be noted, however, that the temporal priority condition refers to the temporal ordering of the underlying constructs that are being measured, rather than the temporality of the data collection or observances per se.

In most cases, it is relatively simple to establish that the policy precedes a measurement. Even in a posttest-only design, temporal precedence is established: the measurement followed the implementation of the policy. However, because the key question is whether the evaluation measure changed as a result of the policy (i.e. whether the policy caused a change in the evaluation measure), the single measurement made in the posttest-only design is insufficient even as the temporal precedence condition is satisfied.

This discussion highlights the importance of multiple time point studies in assessing the causal impact of a policy/intervention, and is illustrated in greater detail below.

Selection: systematic differences over conditions in respondent characteristics that could also cause the observed effect:

Selection bias refers to the fact that individuals in different groups (e.g. different states, provinces, countries) are non-equivalent; that is, they could differ on dimensions that are correlated with the outcome measures used for the evaluation of the policies. Selection biases are difficult to identify and eliminate. Randomisation to conditions of an experiment is a powerful method for equalizing potential biases due to the non-equivalence of characteristics of individuals. However, randomisation is not possible in studies evaluating national-level tobacco control policies; therefore, selection bias in some form remains in all evaluation studies.

One approach to dealing with selection bias within a given evaluation study is to select control groups that are as similar as possible to the policy group. Thus, in evaluating the impact of policies in Canada, using the USA as a non-policy control group would be advantageous, as they are quite similar on many cultural and societal dimensions. If a policy in Canada were evaluated using, say, Kenya, as a control group, the inherent differences in the two countries would be much greater, leaving room for many more confounding factors.

A second approach is to measure differences between countries on constructs that might vary and act as possible confounding factors in the evaluation of policies. For example, in evaluating a policy in China compared to the USA, a possible confounder might be the fact that China is known to be a more collectivistic society, while the USA is a more individualistic


society. Knowing this difference, the evaluation study could add a measure of individualism-collectivism (Triandis & Gelfand, 1998), and correlate this variable with the policy-relevant variables in each country. If individualism-collectivism was uncorrelated with the policy-relevant variables, then this would suggest that, even though the two countries differed on this, it was not correlated with the policy and thus could not be a viable alternative explanation for observed policy impact.

The third approach considers multiple evaluation studies of the same policy in different settings and different times (i.e. of the overall consistency of the effects). This is adopted from one of Hill's criteria. If graphic warning labels are found to be effective in motivating individuals to quit smoking in Canada, Thailand, Venezuela, Brazil, and Belgium, then our confidence increases in making a general conclusion about the causal impact of graphic warning labels. Making general conclusions about policy impact will not and cannot occur on the basis of a single study, but rather after the consideration of multiple studies across multiple countries and time points. This principle is not limited to the evaluation of tobacco control policies.

It is worth noting that lack of consistency across studies provides an opportunity to examine what factors might be responsible for that variance. It may be that studies with weak designs yield different conclusions than those with stronger ones. In tobacco research, it has been shown that tobacco industry-funded studies of secondhand smoke are much more likely to conclude that it is not harmful, which is at odds with the very large number of non-industry-funded studies concluding that secondhand smoke is harmful (Barnes & Bero, 1997, 1998; for review, see Bero, 2005).

History: events occurring concurrently with treatment could cause the observed effect:

The internal validity of studies that evaluate the impact of policies over time is threatened by events occurring concurrently with the treatment/target policy which could cause the observed effect. It is often the case that one treatment/policy intervention is implemented in conjunction with other policies/initiatives relevant to tobacco control. There are often other events, programmes, and interventions that are ongoing at the time of the policy that is being evaluated. Therefore, a major challenge is to estimate the impact of a specific policy in the field of other interventions that are ongoing simultaneously.

This is likely a common occurrence. If a government launches a comprehensive tobacco control programme, a frequent and recommended strategy would be to implement multiple policies and interventions. This comprehensive approach might include mass media campaigns, higher taxation, advertising/promotion/marketing restrictions, bans, increased resources for cessation programmes, and/or campaigns to raise awareness of existing cessation programmes.

For example, in 2003, countries of the European Union implemented new tobacco-use warnings, which were prominently displayed covering 30% of the package area. This corresponded with the minimal standard of warning labels under the Framework Convention on Tobacco Control (FCTC). The ITC Four Country Survey was launched in October 2002, in order to collect the pre-policy data for evaluating the impact of this enhancement of the warning labels. In May 2003, the second wave was conducted in the same manner, as the first post-policy data collection.

By the time of the second survey, another important tobacco control policy had been put into action. In February 2003, the United Kingdom implemented a comprehensive ban on advertising and promotion of tobacco-related products, via billboards, magazines and newspapers, direct mail, domestic sponsorship (May 2003), website advertising and promotions, and exterior signs in store windows. This second policy complicated the quest for measuring the impact of the enhancement of the European Union's warning labels. Below, we outline an empirical strategy for distinguishing the effects of different interventions.

Factors that also influence the outcome measures of an evaluation study of a specific

tobacco control policy include activities of the tobacco industry, which are designed to reduce or neutralize the effect of tobacco control policies and programmes. Without consideration of these countermeasures (which could include explicit inclusion of industry activity variables), a policy evaluation study could lead to incorrect conclusions.

Although the importance of identifying and measuring the impact of tobacco industry activities cannot be over-emphasized, the impact of such activities will vary depending on the outcome measure. Broad, downstream outcome measures, such as prevalence rates, quit attempts, etc., are likely to be most strongly affected by tobacco industry activities. In contrast, more policy-specific outcome measures, such as label salience or the self-reported extent to which a smoker states that the warnings have made them think about the health risks of smoking, would be less likely to be influenced by industry activities. And here there is a trade-off: the measures of policy impact that are specific to that policy are less vulnerable to influence by tobacco industry counter-activity; as the measures become broader (e.g. going from label salience to perceptions of risk to intentions to quit to quit attempts), they are more vulnerable to impact from tobacco industry influences.

Maturation: naturally occurring changes over time could be confused with a treatment effect:

Typically, the term maturation refers to natural changes in individuals over time, such as changes that children undergo as they grow older. However, the concept might instead be called time-dependent changes that are unrelated to the treatment. An example of how this concept must be identified and controlled for comes from the claim made by opponents to the comprehensive smoke-free legislation in Ireland that sales volume in pubs had declined as measured before and after the March 29, 2004 ban (Figure 2.1).

Figure 2.1 Pub sales volumes immediately before and after implementation of the Irish smoking ban in
(Marked on the figure: Irish ban, March 2004)
Source: Central Statistics Office of Ireland
Sales volumes are indexed so that sales volume in 1995 = 100


The data on the volume of pub sales before (2003) and after the 2004 ban, as shown in Figure 2.1, reveal that the volume of pub sales (indexed at 100 for volume of pub sales in 1995) in 2004 was lower (103.9) than it was for 2003 (109.6). With just those two data points, it might be concluded that the Irish ban caused a decline in sales in pubs.

However, Figure 2.2 presents the volume of pub sales for nine years (1995-2003) prior to the Irish ban. Taking into consideration the data from years prior to 2003 leads to a very different conclusion.

Sales volumes had been rising steadily since 1995, hit their peak in 2001, and then began to fall fairly steeply. When the full nine-year profile is considered, the decrease between 2003 and 2004 does not appear to be any different from what would be expected from the secular trends. The decline between 2003 and 2004 was not significantly more dramatic than the declines experienced between 2001 and 2002, and between 2002 and 2003. When the more long-term maturation trends are considered, there was no greater decline after the smoke-free law had been implemented. Thus, the hypothesis that the Irish ban had a detrimental impact on the volume of pub sales is not supported.

Time trends can also work in the opposite direction. Suppose that the ban in Ireland was implemented between 1997 and 1998. If the evaluation study had been conducted with data from only those years, it would have shown an increase in sales, which might lead to the false conclusion that the ban was the cause of this increase. Again, consideration of the pre-policy time trends would reveal that the secular trend was indicative of increasing sales, and taking that trend into account would likely lead to a more proper conclusion that the ban had no impact on sales.

The implications for research design are clear: evaluating the

Figure 2.2 Pub sales volumes in Ireland for the period 1995-2004
(Marked on the figure: Irish ban, March 2004)
Source: Central Statistics Office of Ireland
Sales volumes are indexed so that sales volume in 1995 = 100
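
The comparison made in the Irish pub sales example, in which the change across the policy date is set against the year-over-year changes that preceded it, can be sketched as follows. This is a minimal, informal illustration and not a significance test. The index series is hypothetical (it is not the Central Statistics Office data), and the helper names are invented for this sketch.

```python
def year_over_year_changes(series):
    """Map each period after the first to its change from the previous period."""
    periods = sorted(series)
    return {cur: series[cur] - series[prev]
            for prev, cur in zip(periods, periods[1:])}


def exceeds_recent_trend(series, policy_period, n_prior=2):
    """Return True only if the change into policy_period is a larger decline
    than each of the n_prior year-over-year changes immediately before it."""
    changes = year_over_year_changes(series)
    prior = [p for p in sorted(changes) if p < policy_period][-n_prior:]
    return all(changes[policy_period] < changes[p] for p in prior)


# Hypothetical indexed sales volumes (period 1 = 100): a rise, a peak at
# period 4, then a steady fall. The policy takes effect in period 7.
sales_index = {1: 100.0, 2: 104.0, 3: 108.0, 4: 112.0,
               5: 107.0, 6: 102.0, 7: 97.0}
policy_period = 7

# A two-point, pre-post comparison sees only a decline.
two_point_change = sales_index[policy_period] - sales_index[policy_period - 1]

# Placing that decline in the context of the preceding changes shows
# whether it is any steeper than the trend already under way.
trend_flag = exceeds_recent_trend(sales_index, policy_period)

print("Change across the policy date:", two_point_change)
print("Steeper than the two preceding changes:", trend_flag)
```

In this invented series the pre-post comparison shows a decline, but the decline is no larger than those of the two preceding periods, which mirrors the reasoning applied to the Irish data. A full analysis would model the trend formally and would still need information on concurrent events.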


The importance of design in the evaluation of tobacco control policies

impact of policies is best conducted with the inclusion of data that allow the evaluation to take place within the context of time trends. This example highlights the value of having a surveillance system in place for collecting data over time on outcome variables of interest.

Although the Irish pub data illuminate the importance of time trend data, they also provide an example of how even good time trend data alone can sometimes be incapable of yielding a clear estimate of policy impact. To illustrate this, suppose the ban occurred in 2001 instead of 2003, and the evaluation was conducted with pub volume data from just 2001 and 2002. Here, consideration of the time trend, which was still positive up to that point, might be taken to mean that the ban definitely reduced sales.

If only the time trend were taken into account, one might be even more confident of the conclusion that the ban decreased sales. However, in 2001, Ireland passed a law that limited the use of alcohol, which had an adverse impact on sales volume. Because of the presence of this known negative causal factor, the impact of the Irish smoking ban would remain ambiguous. Although time trend data are important in resolving some threats to internal validity, they fail to eliminate the threat to validity represented by concurrent events in the absence of information on the impact of such events.

A research design that is also concerned with understanding the impact of an intervention over time is the interrupted time series design (a specific version of this general design is the regression discontinuity design). In these designs, which require a fairly lengthy series of observations over time, the impact of an intervention can be measured by its impact on the mean function of the time series. In the regression discontinuity analytic framework, a distinction is made between the regression line that fits the data points (capturing the relation between the outcome variable and time) before the intervention, and the regression line that fits the data points after the intervention. The analysis compares the two lines; the effect of the intervention is measured as the difference in the slope, the intercept, or both parameters of the line. This kind of design can provide powerful evidence for the impact of a policy in its temporal context. There are a number of sources that describe these models (Trochim, 1984; Trochim et al., 1991; Box et al., 1994).

Time series approaches have been used in evaluating the impact of tobacco control programmes. For example, Pierce et al. (1998a) used piecewise regression analysis on time series data on cigarette consumption from 1983-1997 in California, versus the rest of the USA, to demonstrate that the California Tobacco Control Programme, initiated in 1989, led to declines in consumption. They also found that the impact of the programme was greater for the first five years than for the subsequent three. Biener et al. (2000) used similar methods to analyze prevalence data in Massachusetts versus the remaining US states (except California because of its similar comprehensive programme), and concluded that the Massachusetts programme led to a continued downward trend in prevalence, compared to the flattening of the downward trend in the other US states during that same time period.

Keeler and colleagues (1993) examined monthly time series data from 1980 to 1990 in California in their analysis of the association of cigarette prices, taxes, income, and anti-smoking regulations with cigarette consumption. Reduced consumption was found to be associated with tobacco control policies. They highlighted the impact of the tax increase in 1989, which led to a greater decline in consumption, followed by additional tax increases at other points along the time series.

In general, multiple time point data, particularly if such data are also available with control groups, provide strong potential for teasing out possible confounding due to time-related alternative factors, and for providing confirmatory evidence for the impact of policies and programmes. The strength of this potential (and therefore confidence in attributing changes in behaviour or some other important outcome measure) grows with the number of post-intervention data points, which means that more definitive conclusions might be reached


only after a greater delay than would be desired. The ability to come to more definitive conclusions increases with the number of other evaluation studies of a particular policy, or type of policy; within a specific (well-designed) study, the ability grows with the passage of time. Both require greater effort/time than is possible within a single pre-post evaluation study.

Attrition: loss of respondents to treatment or to measurement can produce artefactual effects if that loss is systematically correlated with conditions:

Attrition is a major concern in cohort surveys. In surveys about smoking, for example, those who quit are less likely to stay in the survey, even when specific provisions have been made for those who quit to move to a non-smoker/quitter survey, as in the ITC Surveys (Thompson et al., 2006). Thus, it may be that if a policy or intervention is successful in increasing the proportion of individuals who quit, the greater attrition rate in the policy group, skewed as it is toward those who quit, will attenuate the observed treatment effect (i.e. it will make the statistical test of group differences more conservative). Another potential bias due to attrition is seen in respondents with low socioeconomic status (SES), who are more likely to drop out. If the policy/intervention is more likely to have an impact on high SES individuals, the differential dropout will lead to an artificial enhancement of the treatment effect. The cumulative result of attrition will be the net effect of conservative and liberal biases, which will lead to uncertainty regarding the overall impact of differential attrition in any given survey situation.

Although attrition is unique to cohort surveys, non-response bias is a problem in cross-sectional studies, as well as cohort surveys. Non-response bias occurs when the surveyed sample differs from the population, because some types of respondents are less likely to agree to participate in the survey, or are less apt to be contacted in the first place. This poses the same problems as attrition; many factors contributing to non-response bias are present in biases from attrition.

As with all threats to validity, an approach to dealing with attrition is to measure its impact. The goal is to develop a model of the correlates of attrition that identifies variables that are associated with the likelihood of attrition and the strength of the relationship. Toward this end, it is valuable in cohort designs to replenish cohort members lost to attrition at each stage with newly recruited respondents from the same sampling frame. Differences between the responses of the cohort and the newly recruited replenishment sample can then be attributed to biases in attrition, and to time-in-sample effects, to which we turn next.

Time-in-sample: exposure to a test can affect scores on subsequent exposures to that test, an occurrence that can be confused with a treatment effect:

A time-in-sample effect (also known as rotation group bias) is a phenomenon whereby an individual's responses to the same question over time vary as a function of how many times the individual has responded to the same question in the past (i.e. the number of prior survey waves the individual has participated in (Duncan & Kalton, 1987)). In a cohort survey of nutrition, respondents were systematically rotated out of the survey, so that at each survey wave there were respondents who had participated 1, 2, 3, and up to 9 times before. It was found that respondents reported eating smaller quantities of food purely as a function of the number of prior survey waves they had been administered (Nusser et al., 1996). It is valuable to take into account the time-in-sample effect in the analysis of cohort data.

Additive and interactive effects of threats to internal validity: the impact of a threat can be added to that of another threat or may depend on the level of another threat:

This statement reminds us that, as with any study, there exists more than one threat to internal validity and more than one source of bias in the estimate of an intervention effect. Some of these biases may

be in the direction of over-estimating the effect; others may be in the direction of under-estimating the effect. The impact of one source of bias can depend on the level of a second source of bias. For example, the overall impact of participation bias over time will depend on the level of attrition.

Cost effectiveness in the design of evaluation studies

On some dimensions, study design can be guided by a calculation of costs in relation to its benefits. The allocation of total sample size to number of clusters, and number of individuals within clusters, is one example where prior information (e.g. the incremental cost of conducting the study in an additional cluster; the intraclass correlation, a measure of the correlation of individuals within a cluster compared to the correlation of individuals belonging to different clusters) can be entered into formulas to create the optimal sampling design given specific resources available for the study. In principle, the same is true for designing an evaluation study to reduce threats to internal validity, that is, a study that stands to yield a more confident judgment about the causal impact of the policy/intervention. Here, however, the process cannot be guided by formula or algorithm in the same way as can be accomplished in creating an optimal sampling plan. The increment in internal validity due to the addition of a second or third post-policy time point, for example, cannot be measured quantitatively. The reason is that the actual value is dependent on knowledge of the impact of spurious causal factors. The value of the second or third time point depends on whether the other causal factors would have exerted a policy-consistent or policy-inconsistent impact, which is unknown. In fact, if we actually felt confident enough about the impact of the other causal factors to put them in such a formula, there would be little need to actually conduct the evaluation study in the first place! Even though we cannot be specific about the value of a certain design feature in an evaluation study, we can make some general statements about the likely relative value of one feature or design element over another.

As described earlier, the single-group post-only design is not sufficient for evaluation of a policy (or any other intervention). So what could be added to this single measurement? There are two basic possibilities: (1) create a one-group pretest-posttest design by adding a pre-policy measurement from the same sampling frame as the post-policy measurement: either the same individuals who will be measured at post-policy (cohort design) or other individuals (repeat cross-sectional design); and (2) create a posttest-only design with nonequivalent groups by adding a post-policy measurement from another group that is not receiving the policy/intervention.

For example, suppose a researcher is planning an evaluation of the graphic warning labels introduced in Thailand in 2005 knowing that a post-policy measurement is required. But when adding another group to the design, should this second group be a pre-policy measurement in Thailand, or a post-policy measurement in another country, such as the neighboring country of Malaysia? It is strongly recommended that a pre-intervention measurement be added. This is because the starting point for all considerations of measuring the causal impact of an intervention is in the difference between pre- and post-policy (i.e. how respondents changed from pre- to post-policy on a label-relevant variable). Having an explicit measurement of this pre-post difference is much preferred to adding a control group (Malaysia), as the researcher would still have to infer what the outcome variable would look like in the absence of the policy at a time prior to the policy's implementation. As long as there is sufficient time to collect pre-policy data, this recommendation is also the easiest to implement. In the evaluation of national-level policies, it is simpler to obtain multiple measurements within one's own country than it is to obtain the same measurements in a different country.

Thus, the single expansion would favor the addition of pre-policy measures. In addition, the logistics of setting up the parallel study (e.g. a survey) in another country, with the establishment of a second research team, and the challenges of making the two parallel research efforts comparable in method and measures, would be great.

Summary of study design considerations

To summarize, in the absence of a randomised trial, there are two study design strategies that can be employed for the rigorous evaluation of the effects of policies. First is the use of measurements both before and after the policy's implementation. These measurements can be taken from units (usually, but not limited to, individuals; the same logic would apply if the measures were of households, schools, or other venues) that are either the same (as in a cohort design) or different, but drawn from the same sampling process (as in a repeat cross-sectional design). The second design strategy is the use of a quasi-experimental design, in which one group that is exposed to a policy is compared to a similar unexposed group, as discussed above. Combining these two strategies in a single study yields a two-group, pre-post design, which offers a higher degree of internal validity than either feature alone. The utility of longitudinal designs is strengthened if there are multiple data collections before and/or after policy implementation, allowing more precise specification of effects (e.g. taking into account temporal trends that were occurring before the implementation of the policy).

Considerations of study features in the evaluation of policies

We have made a distinction between study designs and study features. In addition to the two design considerations, there are two study feature strategies that contribute to increasing an evaluation study's internal validity. The first is the measurement of policy-specific variables that are theorised to be affected initially after the policy is implemented. For example, in evaluating the impact of a new warning label policy on behaviour, one might reasonably predict that for the policy to exert its effect on behaviour, the target population must first report noticing the new warning labels (Hammond et al., 2006). A second strategy is the measurement of policy-specific variables for policies that have not changed; such variables act as another form of control. In a country where labels have been enhanced and where taxation has not, for example, we would expect that label salience would be improved over time, but taxation-relevant variables (e.g. perceived cost of cigarettes) would not. Recommendations for measures in each FCTC policy domain are provided in other sections of this Handbook.

Combining the two design and two study feature strategies, along with the inclusion of other explanatory variables (covariates) that might help explain differences, yields a powerful research design allowing more confident inferences to be made about the causal effects of policies and/or combinations of policies. We now turn to an illustration of the use of these strategies in the International Tobacco Control Policy Evaluation Project.

The International Tobacco Control Policy Evaluation Project (ITC Project)

The ITC Project was established with the goal of measuring the psychosocial and behavioural impact of key policies of the FCTC on tobacco use among adult smokers (Fong et al., 2006a; Thompson et al., 2006). As smokers are directly affected by tobacco control policies, this understanding is crucial to assessing the extent to which the FCTC objectives are met, and of desirable and undesirable collateral effects. The ITC Surveys were explicitly shaped by the four strategies described above. To date (as of December 2007), the ITC Surveys are a set of parallel prospective cohort surveys of representative samples of adult smokers in 15 countries: Canada, USA, UK, Australia, Ireland, Thailand, Malaysia, South Korea, Mexico, Uruguay, France, Germany, The Netherlands, New Zealand, and China, with additional ITC Surveys under development in other countries (Bangladesh, India and Bhutan). With these additions, the ITC
[...] evaluation of FCTC policies in countries inhabited by over 50% of the world's population,
60% of the world's smokers, and 70% of the world's tobacco users.

The ITC evaluation framework utilises multiple country controls, a longitudinal design, and a
pre-specified, theory-driven conceptual model to test hypotheses about the anticipated effects
of specific policies.

Conceptual model of the ITC Project:

The first step in creating the ITC Surveys was to determine how policies may achieve their
desirable effects. How do policies work?

In order to address this important issue, a couple of assumptions need to be described. The
first is that the most appropriate level of analysis, to understand the mechanisms by which
policies may ultimately change public health outcomes, is that of the individual person. It is
the individual who smokes or does not smoke, the individual who is influenced by anti-smoking
media campaigns or by marketing campaigns of the tobacco industry, the individual who is or is
not influenced by societal norms or by influences from close friends and family, and the
individual who does or does not form intentions to quit and then either does or does not engage
in an attempt to quit.

Saying this does not preclude the possibility, indeed the reality, that the individual can be
[...] levels of analysis (e.g. social structure and organization), and by factors at even finer
levels of analysis (e.g. individual differences of genetic susceptibility, such as high versus
low metabolism for nicotine). Ultimately, however, it is individuals whose behaviour will or
will not be influenced by policies, and in order for us to understand these behaviours, we must
focus on the individual.

The second assumption is that there exists a causal chain of changes within the individual
through which the impact of policy flows. This assumption directly relates to the idea of
mediation: that policy causes changes in one or more constructs, and/or a chain of constructs
within the individual, which then eventuates in behavioural change. The ITC Project team
created a conceptual model of how tobacco control policies might work based on a combination of
existing models from the psychosocial literature and from health communication theories. The
resulting conceptual model, which is presented in Figure 2.3, guided the selection of questions
included in all ITC Surveys.

The ITC conceptual model assumes that each policy ultimately has an influence on behaviour
through a specific causal chain of psychological events. It is a general framework for thinking
about policies and their effects on a broad array of important psychosocial and behavioural
variables, and for [...] relate to their effectiveness. Several key characteristics of this
conceptual model require further explanation.

First, the model focuses on how policies affect the behaviour of individual smokers, and thus
circumvents the potential hazards of making inferences about individuals from aggregates (i.e.
policy studies in which countries are the unit of analysis, or individual-level studies that
are repeat cross-sectional analyses conducted over time). The presence of macro-level causal
forces that exert pressure on an individual is acknowledged in the ITC conceptual model. For
example, societal norms toward smoking, economic conditions, messages from the media that are
either pro- or anti-tobacco use, and the influence of family and friends are taken into
consideration. The model specifies, however, that the impact of those macro-level causes must
be measured at the level of the individual through their perceptions of the presence of such
factors (e.g. beliefs about the norms and expectations of society, close friends, and family on
smoking). In the end, it is the individual who takes up smoking, who increases or decreases
tobacco consumption, who does or does not attempt to quit, who is successful or unsuccessful at
attempting to quit, and who may contract a smoking-related disease and die. Of critical
importance, and a focus in the ITC conceptual model, is to capture and measure the influences
of the
[Figure 2.3 components]
Policy-specific variables: label salience; perceived cost; ad/promo awareness; awareness of
alternative products; proximal behaviours (e.g. forgoing a cigarette because of labels)
Psychosocial mediators: outcome expectancies; beliefs and attitudes; perceived risk; perceived
severity; self-efficacy/perceived behavioural control; normalisation beliefs; quit intentions
Policy-relevant outcomes: quit attempts; successful quitting; consumption changes; brand
switching; tax/price avoidance; attitude/belief changes (e.g. justification)
Impact: economic impact; public health impact
Moderators: country; sociodemographics (e.g. age, sex, SES, ethnic background); past behaviour
(e.g. smoking history, CPD, quit attempts); personality (e.g. time perspective); psychological
state (e.g. stress); potential exposure to policy (e.g. employment status)

Figure 2.3 Conceptual model guiding the formulation of questions in the ITC Surveys
Adapted from Fong et al. (2006a)

The importance of design in the evaluation of tobacco control policies

[...] experienced by the individual. Ultimately, in order for us to understand the impact of
policies and other macro-level influences on populations, it is essential to measure them at
the individual level. It is a fallacy that the presence of macro-level causal forces requires
that macro-level modelling be conducted.

Second, policies are seen as potentially affecting individuals along a variety of psychosocial
and behavioural variables, of which there are two classes. The most immediate effects are those
on the policy-specific variables (those variables that are proximal (conceptually closest), or
most specifically related to the policy itself). Thus, new graphic warning labels should
increase salience and the ability to notice warnings; price should affect perceived costs of
cigarettes (for example, belief that cigarettes have become too expensive); and lifting of
restrictions on alternative nicotine products should lead to increased awareness of the
availability of those products. These effects may also increase the likelihood of discrete
behaviours specifically linked to the manifestations of the policy, such as smokers hesitating,
or even forgoing or stubbing out cigarettes because of the warning labels. Examples of survey
questions designed to measure policy-specific variables are presented in Table 2.2. Other
sections of this Handbook describe these and other measures of policy-specific variables in
each of the FCTC policy domains.

The more downstream effects are on the non-specific psychosocial mediators, which are
conceptually distant from the policy and theorised to be affected by multiple influences, not
just policies. Among these are variables such as self-efficacy and intentions, which come from
well-known psychosocial models of health behaviour, including the theory of planned behaviour
(Ajzen, 1991), social cognitive theory (Bandura, 1986), the Health Belief Model (Becker, 1974),
and Protection Motivation Theory (Rogers & Prentice-Dunn, 1997). The ITC conceptual model holds
that policies will affect these general mediating variables indirectly, through their prior
effects on the policy-specific variables. As each policy has its own policy-specific variables,
there exists potential to estimate the relative contributions of various policies to the
outcomes of interest.

Third, the ITC conceptual model explicitly identifies the mediators of policy and articulates
the goal of understanding the psychosocial processes that explain how and why a given policy
may lead to changes in smoking behaviour. The longitudinal design allows the explicit testing
of the causal chain of effects that is depicted in the model. With a repeat cross-sectional
design, the capabilities of modeling the dependence of change in an outcome on the changes in
an explanatory variable are diminished as data on the same individuals are not [...]

The policy-relevant outcomes that are measured in the ITC surveys include those that confer
public health benefits (for example, quitting), but also include important compensatory
behaviours that the smoker may engage in that, although responsive to the policy, may not lead
to the economic and public health benefits that are ultimately the goal of such policies. For
example, smokers may switch to discount brands in response to price increases, which would
confer no public health benefit. The ITC Project thus attempts to provide a more complete
account of the effects that may result from the implementation of a tobacco control policy, and
includes both the detection of desirable effects and of unintended, undesirable side effects.

In summary, the ITC conceptual model is a causal chain model, and, as such, suggests that the
policy-specific variables play a critical mediating role because they reside between the policy
and the outcome variables that are important in public health (e.g. quitting behaviour). These
causal paths, from policy-specific variables to behaviour, could be direct, but more typically
will be through the more general mediators. In some cases, there may be pathways through
several kinds of mediators, both the policy-specific, proximal variables, and the more general,
distal variables. Policies are theorized to vary in the psychosocial routes that they take to
affect behaviour,
Policy Domain: Examples of Questions Measuring Policy-Specific Variables

Warning Labels
    In the last month, how often, if at all, have you noticed warning labels on cigarette
    packages?
    "Warning labels make me think about the health risks of smoking" (level of agreement or
    disagreement with this statement)

Smoke-Free Legislation
    Which of the following best describes the rules about smoking in drinking establishments,
    bars, and pubs where you live?
        Smoking is not allowed in any indoor area
        Smoking is allowed only in some indoor areas
        There are no rules or restrictions
    For each of the following public places, please tell me if you think smoking should be
    allowed in all indoor areas, in some indoor areas, or not allowed indoors at all?
        Drinking establishments (e.g. pubs/bars)
        Restaurants and cafés

Price/Taxation
    Where did you last buy cigarettes for yourself?
    How much did you pay for your cigarettes?
    The last time you bought cigarettes for yourself, did you buy them by the carton, the pack,
    or as single cigarettes?
    The last time you bought cigarettes or tobacco for yourself, did you use any coupons or
    discounts to get a special price?

Pro-Tobacco Advertising
    In the last 6 months, how often have you noticed things that promote smoking?
    In the last 6 months, have you noticed cigarettes or other tobacco products being
    advertised in any of the following places: television, radio, at the cinema/movie theatre
    before or after the film/movie, on posters or billboards, in newspapers or magazines, on
    shop/store windows or inside shops/stores where you buy tobacco?
    Now I would like you to think about advertising or information that talks about the dangers
    of smoking, or encourages quitting. In the last 6 months, how often, if at all, have you
    noticed such advertising or information?

Product Regulation
    Do you agree or disagree with this statement about light cigarettes: "Light cigarettes are
    less harmful than regular cigarettes"?

Table 2.2 Examples of Questions Designed to Measure Policy-Specific Variables in the ITC Surveys

[Diagram: Policy -> Proximal variables (Policy-Specific) -> Distal variables (Psychosocial
Mediators) -> Behaviour]
Figure 2.4 Schematic model of how a policy intervention might work (general pathway)

[Diagram: Labels -> Label salience, Perceived effectiveness, Depth of processing -> Perceived
risk, Perceived severity -> Intentions to quit -> Quit attempt]
Figure 2.5 Schematic model of how an intervention such as warning labels on cigarettes might work

[Diagram: Ad ban -> Advertising salience, Positive association -> Denormalisation beliefs,
Social acceptability, Subjective norms -> Intentions to quit -> Quit attempt]
Figure 2.6 Schematic model of how an intervention such as banning of pro-tobacco advertisement
might work

[...] mediational model for how it is theorized to operate (Figure 2.4).

For example, an enhancement in warnings may first increase salience/noticing, depth of
processing, and other constructs that have been identified by communication theory as being an
important initial step for a communication attempt to be effective. The resulting heightened
perception of the risk or hazards of smoking should affect overall attitudes and outcome
expectancies, which affect intentions, which in turn affect behaviour (Figure 2.5).

In contrast, advertising bans may first decrease awareness of tobacco-favorable messages, which
may lead to reductions in the perceptions that smoking is a socially acceptable behaviour, then
to the idea that subjective and societal norms are more negative toward smoking, which is
theorized to lead to quit intentions and quitting behaviour (Figure 2.6).

The specific articulation of these mediational models leads to specific, theory-driven
empirical tests. The strategy of testing the impact of policies through mediational models of
this kind differs from the approach taken in dealing with threats to internal validity. That
approach, which is a process of falsification, uses research design and analytic tools to
determine that a possible confounding factor was NOT responsible for the observed pattern of
data, whereas explicit
tests of mediational models provide the possibility for confirmatory analyses, which test
whether a policy had its impact on an important outcome variable because it first caused
changes in a policy-relevant mediator.

In general, the design of the ITC Surveys is guided by the possibility of disentangling the web
of alternative explanations and competing forces through the careful selection of specific,
theory-driven mediators.

The ITC conceptual model offers an opportunity to test how policies impact or fail to impact
anticipated behaviour. For example, the mere existence of a policy, even if implemented
properly, does not guarantee that smokers will be exposed to its consequences in the ways
anticipated. Using the example of warning labels, some smokers barely look at a pack when they
are smoking and may rarely or never notice the warnings. This, however, could be due to
motivated avoidance, and it is important to measure whether this has an impact on behaviour. In
a cohort survey of Ontario smokers, Hammond and collaborators (2003) found that avoidance of
the graphic Canadian warning labels, by means such as covering them up or by putting them in a
cigarette case, was not associated at follow-up with a decreased likelihood of a quit attempt.

Additional research questions can be addressed, such as whether it is sufficient for someone
merely to notice warnings or whether it is necessary to read them closely, or [...] level. And
what role do microbehavioural reactions, such as forgoing a cigarette as a result of
noticing/reading warning labels, play in determining longer-term outcomes, such as quitting?

In order to address these and other conceptual questions about the impact of warning labels,
the ITC Surveys include multiple measures to empirically identify from the survey results which
measures may be important in understanding the impact of warning labels. In this regard, it
should be noted that the best measure for understanding the impact of warnings may depend on
whether the warning is text-based or whether it includes graphic images.

Mediational models have the potential to identify causal mechanisms, and the importance of this
is that knowledge of the causal mechanisms can inform the creation of interventions of
potentially greater power. Thus, the general mediation model is realized differently in diverse
policy domains; different policies are mediated by different constructs. Because the ITC
Surveys measure all of these constructs, it is possible to begin to distinguish whether a
change of behaviour (e.g. quit attempt) was due to a given policy, in the context of other
policies, or to other alternative events that occurred at the same time.

The use of mediational models as a mechanism for establishing the effect of policies:

As described earlier, an important threat to internal validity is the concurrent events threat
(also known as a history threat): the presence of events that occurred concurrently, such as
multiple policies, or a mass media campaign that was implemented at the same time as the policy
that is being evaluated. How can these threats be measured and dealt with?

The only method of keeping possible alternative causes from becoming confounders is to measure
their potential impact, and to include them explicitly in a model that competitively tests
their impact. For example, if a mass media campaign is being implemented at the same time as a
policy to be evaluated, measures of noticing, and the impact of, that mass media campaign (see
Section 5.6) could be included in a post-policy survey, and those measures used as covariates
in an analysis of the impact of the policy. Although the study might originally have been
conceptualized as evaluating the policy, including measures of the mass media campaign would
augment the study as a simultaneous evaluation of the impact of both policy and the campaign.
The general point here is that unconfounding of alternative events in the evaluation of a
policy can only be attempted through the measurement of the possible impact of those
alternative events.

It should also be noted that even randomisation to conditions does not eliminate the threat to
internal validity posed by concurrent events. If randomisation
[Figure 2.7, panels a-f: path diagrams linking Countries, the four policy domains (Taxation,
Labels, Ad/Promo, Smoke-free), and Rate of quit attempts. A star marks a significant path and
a 0 marks a null path.]

(a) Basic layout of mediational model designed to test whether any of the policies might have
been causally responsible for the difference between countries in the rate of quit attempts.

(b) Between the two ITC survey waves, for each of the four policy domains, did any of the
countries make a change?

(c) Between the two ITC survey waves, suppose there were two policy domains in which one
country changed: Labels and Ad/Promo (starred paths from countries to those two policy
domains). There were no changes over time in the other two domains. Thus, those paths are equal
to zero, indicating that differences across countries in the rate of quit attempts could not
have been mediated by changes in Taxation and Smoke-free.

(d) The reduced mediational model, having eliminated Taxation and Smoke-free policies as
possible mediators.

(e) We then examine the paths from each of the two policy domains (that is, the policy-specific
measures for each of the domains) to Rate of quit attempts to test whether the change in those
policy-specific measures is associated with differences in the Rate of quit attempts. We find
that the Label measures are associated with the Rate of quit attempts (indicated by a star),
but the Ad/Promo measures are not (indicated by a 0).

(f) Thus, Ad/Promo was not supported as a mediator between countries and rate of quit attempts.
That is, changes in Ad/Promo do not help explain why countries varied in quit attempts. In
contrast, the significant paths from Countries to Labels and from Labels to Rate of quit
attempts support the contention that the change in warning labels mediated the pathway from
Countries to Rate of quit attempts and that the change in warning labels was responsible for
the increase in the rate of quit attempts.
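
The logic of panels (a) to (f) can be sketched numerically. The example below is only an
illustration under stated assumptions, not the ITC Project's analysis: the data are simulated,
the outcome is a hypothetical continuous "quit tendency" score rather than a binary quit
attempt, a single 0/1 indicator stands in for "Countries", and the paths are estimated with
ordinary least squares in the spirit of the regression approach of Baron & Kenny (1986). All
variable and function names are invented for the illustration.

```python
import numpy as np

def ols(y, *predictors):
    """Ordinary least squares coefficients of y on the predictors (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def mediation_paths(country, mediator, outcome):
    """Estimate the two paths of a simple mediation model.

    a: Countries -> policy-specific measure
    b: policy-specific measure -> outcome, holding Countries constant
    """
    a = ols(mediator, country)[1]
    _, direct, b = ols(outcome, country, mediator)
    total = ols(outcome, country)[1]
    return {"a": a, "b": b, "indirect": a * b, "direct": direct, "total": total}

# Simulated (hypothetical) data: country = 1 where labels and ad/promo rules changed.
rng = np.random.default_rng(0)
n = 4000
country = rng.integers(0, 2, size=n).astype(float)
label_salience = 0.8 * country + rng.normal(size=n)   # changed by the policy
ad_awareness = -0.6 * country + rng.normal(size=n)    # also changed by the policy
quit_tendency = 0.5 * label_salience + rng.normal(size=n)  # driven by labels only

labels = mediation_paths(country, label_salience, quit_tendency)
ads = mediation_paths(country, ad_awareness, quit_tendency)

for name, paths in (("Labels", labels), ("Ad/Promo", ads)):
    print(name, {key: round(float(value), 3) for key, value in paths.items()})
```

In this simulation both Countries-to-mediator paths are non-zero, as in panel (c), but only the
Labels measure carries a non-zero path to the outcome, which mirrors the pattern described in
panels (e) and (f). A real analysis would also need standard errors for the indirect effect,
and a model suited to a binary outcome.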

[...] studies, there would still be the need to measure the impact of other possible influences
on behaviour that had occurred between the policy intervention and the post-policy assessment
point.

A more complete articulation of the strategy of teasing apart the impact of multiple policies,
and/or the presence of other possible influences/confounding factors can be found in the
approach to mediational analyses (e.g. Baron & Kenny, 1986; MacKinnon et al., 2002; Mathieu &
Taylor, 2006; and Spencer et al., 2005). An extended example of the logic of the approach is
provided in Figures 2.7 a-f. The scenario is that ITC countries varied in the rate of quit
attempts. For simplicity, four policies are listed: taxation, labels, ad/promo, and smoke free,
and the analysis involved the policy-specific variables associated with each of the four
policies.

Moderator variables in the ITC Project:

One of the most intriguing lines of inquiry in the ITC Project is to determine whether the
impact of the same or similar FCTC policy differs across different countries. In the domain of
health warnings (Article 11), the ITC Project is addressing whether the impact of graphic
warnings differs across different countries. Among the ITC countries to date, Thailand and
Australia have introduced graphic warnings since the beginning of the ITC surveys, and several
other countries are anticipated to do so in the future.

The ITC Project is also examining the impact of smoke-free laws in several ITC countries. To
date, the impact has been remarkably similar in Ireland (Fong et al., 2006b) and Scotland
(Hyland et al., 2007). Ongoing ITC surveys will allow a rigorous comparative evaluation of the
impact of smoke-free laws in other ITC countries including France, Germany, The Netherlands and
China. Given that the ITC Surveys are using identical or very similar measures and parallel
data collection methods across the set of ITC countries, the potential for making conclusions
about the commonality or differences of the impact of smoke-free laws, graphic warnings, and
the other FCTC policy domains will be strong.

Thus, country, and the environmental and cultural factors that a country embodies, constitutes
an important moderator variable in the ITC conceptual model.

Further, within a country, it is possible to test for differential policy impact on subgroups
of a population, by including variables to determine which subgroups are more favourably (and
less favourably) influenced by FCTC policies. These moderators fall into five broad classes:
sociodemographics (age, sex, SES, ethnic background); past behaviour (smoking history, current
consumption (cigarettes per day), quit attempts); personality characteristics (time
perspective, depression, sensation seeking); other environmental effects (stress levels); and
potential exposure to policy (unemployed people should be less affected by workplace smoking
policies).

Dealing with hypothesised moderators is relatively straightforward when they are postulated
merely to add predictive power to linear models. The issues become more complex when different
mediational pathways are postulated for subpopulations. For example, individuals who avoid
warnings might change behaviour through more emotion-related pathways, while those who take in
the information on warning labels might be influenced through more cognitive pathways. The ITC
Surveys have the design and the measures that will allow the creation of separate models for
these different subpopulations, which will make it possible to test whether different
subpopulations within a country, as well as between different country populations, respond in
the same way or differently to tobacco control policies.

Conclusions

This section has provided some basic principles of how evaluation studies can be designed to
offer more confident judgments about the causal impact of tobacco control policies. It has also
illustrated the use of study designs (the structural aspects of an
evaluation study) and study features (the selection of measures to be used in an evaluation
study, including theoretically guided mediators and moderators).

The eventual outcome of rigorous evaluation studies does not end with a causal statement,
however. If mediational analyses demonstrate that a given policy [...] putative mediator but
not another, non-policy interventions (e.g. mass media campaigns) can be tailored to influence
those mediators that had been identified in the evaluation study to be the operating causal
forces leading to favorable changes in behaviour. Thus, rigorous evaluation of FCTC policies
has the potential [...] of these policies on tobacco use, but also to provide valuable insights
into the development of more effective non-policy efforts to reduce the burden of tobacco use
throughout the world.
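
The simplest kind of moderator analysis discussed in this section, in which a moderator only
adds an interaction term to a linear model, can be sketched as follows. This is a minimal
illustration under stated assumptions and not the ITC Project's method: the data are simulated,
the outcome is a hypothetical continuous score, and employment status is used as the moderator
because the text suggests that unemployed people should be less affected by workplace smoking
policies. All names are invented for the illustration.

```python
import numpy as np

def fit_interaction(policy, moderator, outcome):
    """Least-squares fit of outcome on policy, moderator, and their product."""
    X = np.column_stack([np.ones(len(outcome)), policy, moderator, policy * moderator])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    names = ["intercept", "policy", "moderator", "policy_x_moderator"]
    return dict(zip(names, coef))

# Simulated (hypothetical) data.
rng = np.random.default_rng(1)
n = 4000
policy = rng.integers(0, 2, size=n).astype(float)    # 1 = exposed to a workplace smoking policy
employed = rng.integers(0, 2, size=n).astype(float)  # moderator: potential exposure to policy
# Assumption built into the simulation: the policy affects employed respondents only.
outcome = 0.6 * policy * employed + rng.normal(size=n)

fit = fit_interaction(policy, employed, outcome)
print({name: round(float(value), 3) for name, value in fit.items()})
```

A clearly non-zero interaction coefficient alongside a near-zero policy coefficient is the
pattern expected when the policy works only in the exposed subgroup. Where different
mediational pathways are postulated for subpopulations, separate models per subgroup would be
needed instead, as the text notes.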
questions in cross-cultural survey research on

Introduction

The WHO FCTC aims to address the global tobacco epidemic by coordinating national policies to
combat tobacco use. This volume illustrates possible conceptual frameworks, methods, and data
sets that will be useful for conducting comparative, international research to better
understand which policies work and why. This section aims to provide researchers with a basic
overview of measurement issues involved in the design and analysis of cross-cultural
comparative research, as well as some of the methods currently recommended for attempting to
resolve these issues. When possible, we illustrate our points with examples from cross-cultural
tobacco research. The organisation of the section follows the general stages of research
design, illustrating the corresponding methods used to assess and to avoid introducing
systematic measurement error due to cultural differences across the populations in which the
research is carried out.

The growing literature that we discuss generally reflects concerns related to conducting
comparative [...] linguistic groups. In most cases, however, the implications and methods we
describe extend to intranational studies involving different ethnic groups or even single
ethnic groups that speak the same language (e.g. Spanish-speaking Latinos in the USA; people
from different socioeconomic groups). In this regard, our general approach may be useful to
researchers interested in ensuring the validity of comparative analyses across cultural
subgroups within increasingly multi-cultural, intranational settings.

Cross-cultural and cross-national research is often done under the unexamined assumption that
question meaning, comprehension and measurement properties are equivalent across cultural
groups (Bollen et al., 1993; Smith, 2004a). However, cross-cultural differences in language,
social conventions, cognitive abilities and response styles may cause systematic measurement
error that biases results in unpredictable ways (Fiske et al., 1998; Harkness et al., 2003a).
Apparent differences found across socio-cultural groups may be merely due to measurement
artefacts, such as [...] meanings ascribed to the same question, whether phrased in the same or
different languages. Conversely, true differences may be obscured by such factors as the
differential influence of social desirability or the exclusion of items that are important
indicators of study constructs in one cultural context but not in another. Whereas the
implications of these issues appear most obvious for international comparative research, if
left unaddressed, they may also impede our understanding of why certain tobacco policies work
better among some socio-cultural groups than among others. In the end, valid cross-cultural
comparison demands that measurement error be minimised across the settings and groups of
interest (Bollen et al., 1993; Smith, 2004a).

Equivalence of conceptual frameworks

Cross-cultural survey research should begin by assessing whether the conceptual definitions and
theoretical frameworks that orient the study reasonably apply across the contexts in which the
survey
[...] of the universal applicability or culturally-specific nature of study concepts is
important because their definitions should inform subsequent stages of question selection,
development, adaptation and assessment. For example, some concepts may have single or multiple
dimensions, each of which should be reflected in its conceptual definition. In some populations
the social acceptability of smoking can be characterised by at least two dimensions, one that
references close social network members and another that concerns perceptions of a more distal,
abstract socio-cultural milieu (Thrasher et al., 2006a). These referents may be further
subdivided by perceptions of the actual behaviour (i.e. descriptive norms) and desired
behaviour (i.e. injunctive or prescriptive norms) (Cialdini, 2003). Hence, at least four
dimensions could be delineated within a conceptual definition of the social acceptability of
smoking. Nevertheless, the number of dimensions may vary between or within any particular
population. Cross-cultural studies should consider construct dimensionality and whether it
might differ across cultural groups.

Ensuring the equivalence of concepts across cultural contexts or groups should begin with
literature reviews on the topic and concepts of interest. Pertinent literature may nevertheless
escape the reach of search engines or the linguistic capabilities of [...] this literature may
simply not exist. This problem may be addressed by establishing collaborative research groups
that involve at least one representative from each country or cultural group in which surveys
will be conducted (Kuechler, 1987). Ideally, each representative should have native language
proficiency and be knowledgeable of both the study topic and the particular contexts in which
data collection will take place. Formulating the study's conceptual framework in dialogue among
a team of such researchers can help anticipate incongruities in the conceptual framework across
survey contexts, and thereby avoid any ethnocentric or universalist tendencies in measurement
that might result (Van de Vijver & Hambleton, 1996). Furthermore, this dialogue may help
identify cultural or contextual factors that may be important modifiers of tobacco policy
effects. Such potential modifiers may otherwise escape consideration because researchers in one
context either take them for granted because of their ubiquity or have never considered them
because of their absence. For example, strong religious beliefs in some countries may play such
a role.

The collaborative process of defining the concepts and framework that orient questionnaire
design goes some way toward ensuring that the survey instrument will be meaningful for study
participants. There are a number of tensions and difficulties [...] however. As the number of
nations or cultural groups involved in the study increases, so do the amount of difficulty and
time spent to coordinate efforts and reach consensus (Kuechler, 1987). Granting agencies often
demand clearly defined conceptual frameworks before they will fund a project, and without
funding to develop this framework, it may be difficult to engage collaborators. The local
representatives with whom collaboration occurs may actually be quite cosmopolitan, perhaps
directly or indirectly socialised into the Western scientific enterprise. Hence, the cultural
perspective any particular representative provides may be a hybrid form that is at once
transnational yet circumscribed by particular social class, gender, and cultural divisions
within the country of interest. In this regard, people who have direct knowledge of the local
realities of target populations in which survey research will take place may make more
substantial contributions toward the development of culturally applicable concepts. Even so,
status asymmetries among group members may ultimately overwhelm more local (and perhaps more
locally relevant) epistemologies, theories and concepts, particularly if they are incongruent
with Western scientific principles (Johnson, 1998). These challenges should be recognised and,
to the extent possible, overcome. Collaboration with representatives from each
forces at least some consideration of cultural particularities and concerns. The resulting conceptual framework should be more likely to fit the contexts studied than a framework constructed in the absence of input and involvement of representatives from these different settings.

Question selection and development: equivalence of indicators

The practice of selecting or developing questionnaire items in one language and translating them into other languages is common in cross-cultural survey research. The use of established items saves time, is inexpensive, and allows for ready comparison with other studies that have used the same measures. Ideally, these items will have been pre-tested and found to have suitable measurement properties across subgroups who speak the source language, as well as among those from the linguistic and cultural groups in which the research will be conducted. Such analyses have been done only for a few tobacco survey questions, including those related to dependence (see Section 3.3). If sound measurement properties have been found for the item in one linguistic or cultural context, these properties do not necessarily carry over to the translated version of the item, no matter how good the translation (Harkness et al., 2003b). To help ensure equi[...]prehension and meaning, pre-testing is needed in each major cultural context or major socio-cultural group under consideration (see page 68).

One reason why item selection matters is that wording that appears neutral may actually contain phrases or terms with culturally idiosyncratic connotations, making translation difficult (Harkness, 2003). Attempts to capture the meaning of culturally anchored wording, no matter how unambiguous in the original language, may produce awkward translations that violate question design principles and thereby introduce systematic error. One clear example comes from the German General Social Survey item "Das Leben in vollen Zügen genießen", which literally translates to English as the nonsensical "Enjoy life in full trains". For American English, a more appropriate translation is the adapted, non-literal phrase "Live life to the fullest" (Harkness, 2003). The often unconscious embedding of cultural anchors in questions may lead to their discovery only through the translation process itself. Similarly, question meanings may not be shared across contexts, and different items will need to be developed in order to adequately reflect study concepts. For these reasons, cross-cultural survey methodologists increasingly argue for methods that open up the translation process to greater scrutiny and more conscious group decision-making (Harkness [...] Hambleton et al., 2005). When cultural anchoring is discovered, unambiguous phrasing in the translated version of the question may necessitate changing the wording of the original language item in order to maintain equivalence (see page 68). Literal question translation may nevertheless result in equivalent meanings across languages. However, it is crucial to consider whether the resulting question adequately captures the concept of interest and whether a non-literal adaptation of the question is necessary to do so (Van de Vijver & Leung, 1997; Van de Vijver, 2004).

Cross-cultural survey research generally involves translating items that are established measures for particular constructs in one language group. For this reason, our next sub-section focuses more intensively on translation approaches. However, researchers may nevertheless consider developing a core set of indicators for use across all sites, supplemented by culture-specific indicators of the same constructs. The selection of culturally-specific indicators should consider measurement research on the same or related concepts conducted within the culture. However, such research may not exist or may involve items that researchers believe are inadequate to capture the meaning of the concept of interest. Item development can follow any of a variety of methods that are standard practice in measurement [...]
driven techniques (DeVellis, 1991) or those that involve eliciting meanings from the target group of interest, as with focus groups (Stewart & Shamdasani, 1998), structured interviews (Spradley, 1979), free-listing, pile sorts and other qualitative techniques (Bernard, 1994; Berkowitz, 2001). Rapid anthropological assessment techniques have also been developed to reduce the time and effort required for more traditional ethnographic methods, with one such effort having already developed a framework for tobacco-related research among youth (Mehl et al., 2002). These and other methods could also be used for developing equivalent concept definitions across contexts.

One rarely used approach to item selection and development involves simultaneous, yet independent work by each group responsible for a particular linguistic or cultural subgroup involved in the study (Harkness et al., 2003b). This strategy is likely to work best when teams use conceptual definitions that adequately apply across contexts, thereby removing the likelihood that the concepts under consideration are too culturally-specific and, hence, idiosyncratic. Each team would assemble and/or develop items that they believe best reflect the study concepts. In the end, however, incommensurability of items across contexts presents analytic difficulties, as few statistical [...]parison of dissimilar stimuli. Furthermore, cross-cultural comparison of only those items with similar content may exclude culturally specific items that are the best and most meaningful indicators of the concept of interest. Overall, this approach involves relatively high development costs, openness to making changes to the source instrument, and complex organisational structure to adequately coordinate teams (Harkness et al., 2003b).

Example of focus groups for item development:

Before fielding an international survey of adult smokers in Mexico, in-depth interviews and focus groups were conducted with adult smokers, with discussions oriented by the conceptual domains included in the survey (Thrasher & Bentley, 2006; Thrasher et al., 2006a). One concept of interest involved perceived voluntary control over smoking behaviour. This attribution to tobacco consumption behaviour may not only be relevant to self-efficacy regarding quit attempts, but also to perceptions of tobacco products as deviant when compared to other products that people freely decide to consume. When prompted, almost all Mexican smokers agreed that tobacco was addictive; however, they found it difficult to explain what addiction meant. It became clear that the more common manner of talking about and understanding toba[...] was through the term "vicio" or "vice", which connotes a guilty pleasure that is difficult to control, potentially dangerous, and often looked down upon socially. Participants generally agreed that the term "addiction", as well as the term "droga" or "drug", also had these connotations. Analyses of data from a subsequent pilot survey of items developed to capture these additional meanings (fumar es un vicio [smoking is a vice]; el cigarro es una droga [a cigarette is a drug]) found that these items loaded onto the same dimension as the primary indicator of perceived behavioural control (tabaco es adictivo [tobacco is addictive]), improving the measurement properties of the construct (Thrasher et al., 2006a). While the meaning of "a cigarette is a drug" would likely translate back to English, the use of an equivalent English language item that included the term "vice" may be meaningful only within certain subcultural religious groups. As such, this example helps illustrate the development of a culturally-specific item that complements a core item shared across surveys. Cognitive testing of the original item in English and Spanish (see sub-section on Questionnaire Pre-Testing) could complement further statistical analyses (see sub-section on Quantitative assessment) in order to determine whether the single item on "vice" in the Mexico sample might be used as equivalent to the single item on "addiction" in samples from other [...]
Approaches to survey translation

Translation of surveys in cross-cultural research is often an afterthought, with little attention paid to the design issues involved in the complex task of producing instruments with comparable measurement properties across languages and contexts (Harkness & Schoua-Glusberg, 1998; Harkness, 2003). Steps described above to ensure the applicability and relevance of construct definitions across diverse contexts provide a foundation for sound translation practices (Harkness et al., 2003b). Yet, even with such a framework in place, any of a variety of translation methods could be followed, each with its own advantages and disadvantages. Generally, survey research follows the Ask-the-Same-Question model, in which a questionnaire is developed in the source language and translated to other target languages. Because of its widespread use, we describe methods based on this model, including the decentering approach, whose iterative process of translation demands at least some flexibility in the wording of the source language questionnaire.

Ideally, people who translate a questionnaire should be skilled, professional translators who are bilingual in the source and target languages, while having at least some basic training in general principles for developing questions with good measurement properties (for some basic recommendations regarding instrument design, see: Dillman (2007), Bradburn and coworkers (2004) and/or Willis (2005)). If this is not possible, then translation should be conducted by people who are fluent in both languages and practiced in the translation between them. At first glance, a single-person translation appears time- and cost-effective. However, relying on a single person to make all translation decisions may introduce comprehension problems due to regional variance in linguistic expression and meaning, as well as the translator's own idiosyncratic interpretations and inevitable oversights (Harkness et al., 2004). Since these issues may result in non-equivalent stimuli and, hence, invalid comparison, the efficacy of single-translator methods increasingly has been called into question (Harkness & Schoua-Glusberg, 1998; Hambleton et al., 2005).

A team approach to translation, which involves more than one person who is fluent in the source and target languages, appears to help overcome some biases that result from single-person translations. Team approaches open up to examination and discussion the complex decision-making that occurs in translation, providing a greater range and more balanced critiques of translation options (Guillemin et al., 1993; McKay et al., 1996; Harkness & Schoua-Glusberg, 1998). Aside from skilled, professional translators (of which there may be more than one), Harkness (2003) suggests that two additional roles be filled in the team approach. Reviewers should have language abilities that are as strong as the translators', supplemented with knowledge of questionnaire design principles, study design and the topic of interest. Adjudicators should at least share this methodological and topical knowledge, as they will make the final decisions about which translation to adopt, preferably in cooperation with the reviewers and translators who have been more intimately involved in the details of translation and evaluation. When an adjudicator does not understand the source or target language well, Harkness suggests that consultants should be hired to provide this skill. Team approaches involve greater expense, time and coordination than single-person translations; however, this approach is recommended and used by numerous ongoing survey operations, including the Survey of Health, Ageing and Retirement in Europe (Börsch-Supan et al., 2005), the US Consumer Assessment of Health Care Providers and Systems (Weidmer et al., 2006), the US Census Bureau (Pan & de la Puente, 2005) and the European Social Survey (Harkness & Blom, 2006).

The committee approach to translation is increasingly viewed as the gold standard in cross-cultural survey research (Harkness & Schoua-Glusberg, 1998; Harkness et al., 2004). Generally two to four translators are used, with each additional translator providing more material for critical [...]

bilities. The parallel translation method involves each translator independently translating the same source questionnaire in its entirety. Some of the costs associated with parallel translations can be cut by employing split translations, in which each translator is assigned different parts of the source questionnaire. In either case, translators bring their independent translations to a reconciliation meeting where at least one reviewer and perhaps the adjudicator work with the translators to reach agreement on the best translation. The chosen wording could be taken directly from one translation, a mixture of the different phrasings offered, or a previously unconsidered wording that emerges from discussion of the independent translations. Because each question is translated independently by at least two people, parallel translations are likely to offer a greater range of translation possibilities than either split translations or a single translator would produce. The final versions can be adjudicated at the reconciliation meeting or, perhaps, provided to the adjudicator for later consideration.

The team approaches to translation may seem extravagant in the context of many low-resource environments. However, the relatively low additional cost of hiring a second translator is likely to offset subsequent costs and data quality issues that might result from an unscrutinised translation. Indeed, this process [...] questionnaire problems that otherwise only come to light in pre-testing or data analysis. This is not to suggest, however, that this strategy should replace questionnaire pre-testing. Both researchers and translators are likely to come from social strata that differ from the majority of research participants. Hence, translation assessment procedures described below are critical to ensuring sound comprehension and equality of measurement.

Researchers may want to consider allowing for minor changes to the source language questionnaire due to issues that emerge through translation. As described earlier, cultural anchoring of words and phrases may result in translated items that shift original meaning or that violate good question design principles. Either way, systematic measurement error may result. One possible approach to equalising question meaning involves an iterative translation process called decentering (Werner & Campbell, 1970). In this method, a source questionnaire provides the starting point for translation to target languages, which could be done using any of the aforementioned methods. However, translators and reviewers signal which items appear to introduce non-equivalence of meaning. Those in charge of each language version of the questionnaire then work in iterative fashion, changing items by tacking back and forth across the translations until all versions [...]ple, one project using this method translated an English language item that included the term "embarrassed", which existed in the target languages but had stronger connotations than in English. Researchers decided to substitute another term, "unhappy about", which was easier to harmonise across the target languages and did not compromise the measurement properties of the original language item (Eremenco et al., 2005).

The iterative approach to translation is difficult, time-consuming and expensive, and each additional language included in the process will multiply these disadvantages (Harkness et al., 2003b). Unlinking questions from their cultural connotations may result in unwanted ambiguity due to vague, unidiomatic phrasing. Furthermore, changes in source item wording may necessitate pre-testing in order to ensure that measurement properties have not suffered.

Whichever translation approach is taken, we strongly recommend that those involved in cross-cultural tobacco research document their decisions regarding item selection, development and translation. Study concepts should be clearly specified and linked to original, source language items. Translators should be encouraged to keep notes regarding their decision-making processes when translating the item to another language. Similarly, team approaches to translation review should involve further docu[...]

decisions were made. If the entire questionnaire is not subject to later pre-testing, these notes will help determine which subset of items should be scrutinised more closely. This documentation will also enable future researchers to adequately interpret the data associated with these questions, while providing critical information for further improvement of the measures in later studies.

Example of the committee approach:

One example of the committee approach using parallel translation involves translating an American English-language source survey of adult smokers to the Mexican variety of Spanish. Independent translations of the survey were provided by four bilingual professional translators, three of whom were Mexican nationals and the fourth an American who had been living in Mexico for 19 years and working as a professional translator for 24 years. Although all of them had at least some experience with survey translation, each was provided with summary materials on question design principles and asked to follow them. Two of the Mexican translators were recruited because they were regular smokers, as was a young adult, bilingual Mexican research assistant who had been involved in earlier stages of the project and who served as a reviewer at the reconciliation meeting. As members of the target population in which the survey [...] people helped ensure the use of natural terminology and comprehensibility among smokers. Because of logistical and cost constraints, representatives were not included from each of the different regions of Mexico where the survey would be administered. This was a potential limitation.

The reconciliation meeting involved a full day of work with three translators (one was unable to make the meeting but provided her independent translation), two bilingual reviewers, and a bilingual reviewer/adjudicator. After beginning the session with a further discussion of question design principles, we examined the original English version and all four translations, addressing one question at a time. As emphasised in the description of the methodology, this process produced a range of possible translations, even for questions that, on the surface, appeared straightforward. The beginning of the process was time-consuming and challenging. However, decision-making became easier as participants became comfortable with the process and as we reached agreement on terms, grammatical structure, and response options that were repeated throughout the questionnaire.

As an illustration of the decision-making processes involved in this method, the following describes how we translated the last phrase of the question: On average, how many cigarettes do you smoke each day, including both factory-made [...] clarification to this standard question had been included in the source language questionnaire in order to ensure that respondents considered roll-your-own cigarettes, particularly as switching to lower-cost tobacco is a common response to raising the price of cigarettes (Young et al., 2006). One non-smoking translator deleted the last clause of the English version because she had never heard of people using such cigarettes in Mexico. However, we did not want to exclude mention of this practice since it occurs in Mexico, although at a low prevalence. Indeed, one aim of the survey was to estimate this prevalence, although it would be measured with more precision in a question that appeared later in the survey instrument. Two general options for describing factory-made cigarettes emerged: one was a more literal translation (cigarros hechos en fábricas, literally "cigarettes made in factories") and the other turned the focus toward branded and marketed cigarettes (cigarros de marcas comerciales, literally, "commercial cigarette brands"). This second focus was discarded since rolling tobacco is also branded and marketed, even though unbranded, loose tobacco can be bought in some regions of Mexico. The more literal translation sounded awkward and seemed to divert attention from the main question content. In the end, we decided on a phrase that could be roughly translated as "cigarettes from the pack" [...]

word for pack (cajetilla) connoted factory-made without sounding awkward, while setting up the contrast with the roll-your-own type cigarettes that would be mentioned thereafter.

For the final clause in the question, two options emerged from the three independent translations. One used a term for rolling that is also common for rolling marijuana cigarettes (cigarros forjados a mano) while the other introduced the participant as the one who made (hacer) the cigarettes (cigarros hechos por usted, literally "cigarettes made by you"). There was agreement that either option could confuse people who did not engage in rolling cigarettes (this would be the vast majority of study participants). However, reference to the participant making the cigarettes seemed on track, since not including the participant as agent could cause people to think of cigars, which are also hand rolled, but by someone else. We agreed on a longer version, "cigarettes that you make by hand" (cigarros que usted hace a mano). Later cognitive interviews indicated that this phrase nevertheless connoted marijuana cigarettes for some participants, and so the final, pre-tested version clarified that these were cigarettes made with tobacco: En general, ¿cuántos cigarros al día fuma, incluyendo los cigarros de cajetilla y los cigarros de tabaco que usted hace a mano? (Literally, "In general, how many cigarettes do you smoke each day, including cigarettes from the pack [...] make by hand?"). Finally, interviewer training included a focus on the meaning of the question, so that interviewers could anticipate and respond to any comprehension difficulties that they sensed among participants.

This example illustrates a number of the advantages that accompany the committee approach to translation. Importantly, there were a variety of options to choose from. Consistency of terminology and phrasing across translation options would have provided support for selecting a particular translation. The example above indicated inconsistencies in the terms and wording, which led to group decision-making about the best way to resolve discrepancies. Moreover, resolutions to discrepancies did not appear in the originally translated versions. Finally, the version agreed upon in the reconciliation meeting still needed to be altered a little after cognitive testing indicated undesirable connotations for one part of the question.

Culturally moderated response styles

Comparisons across cultural groups may be biased by systematic differences in response styles, such as social desirability, extreme responding, and acquiescence. Of particular concern are social desirability effects, which manifest when respondents misrepresent or edit their true responses to a question [...] themselves that accords with their perceptions of social norms and expectations (Marlow & Crowne, 1960). The phenomenon appears to be universal across societies, with stronger effects found when considering self-report of behaviours or beliefs that are socially sanctioned within a given cultural context (Johnson & Van de Vijver, 2004). Hence, the differential effects of social desirability on self-reported tobacco attitudes, beliefs, and behaviours should be proportional to the level of tobacco's social unacceptability across the socio-cultural groups under consideration. Because social desirability effects also appear stronger among minority or disenfranchised groups within a society (Ross & Mirowsky, 1984; Edwards & Riordan, 1994; Warnecke et al., 1997), it may disproportionately influence national samples that contain more minority group participants.

Social desirability appears positively correlated with a number of macro-level societal characteristics, such as higher levels of collectivism and lower levels of individualism. Higher levels of social desirability appear congruent with, and may stem from, collectivist codes of social interaction that emphasise courtesy, maintaining harmonious relations and saving face (Marín & VanOss Marín, 1991; Johnson & Van de Vijver, 2004). Smokers from collectivist societies that stigmatise tobacco use may view true representation of their thoughts and behaviours in an [...]

these more important elements of social interaction. On the other hand, people from individualist societies appear to have stronger prohibitions against providing misleading information (Triandis, 1995). Hence, smokers in these societies may be less likely to provide socially desirable responses independent of the extent of social sanctions against smoking. This suggests that individualism/collectivism and social sanctions against tobacco are likely to interact, producing differential social desirability effects on tobacco survey questions. The strongest effects of social desirability should occur under conditions of strong stigmatisation of smoking behaviour in a collectivist society, whereas the weakest effects would occur in individualist societies with weak stigmatisation. Future research should empirically test this proposition.

Several other response styles have also been found to vary across cultures (Baumgartner & Steenkamp, 2001). Two that have perhaps received the most attention are extreme response styles (Smith, 2004b) and acquiescence (Knowles & Condon, 1999). Extreme response styles refer to the greater preference of respondents from some cultures to select the most extreme endpoints of response scales, whereas respondents from other cultures are more likely to make less extreme choices when answering. Moreover, some respondents exhibit a greater tendency to agree [...]viewers, even when the questions are contradictory, a process referred to as acquiescent responding.

Although there is general agreement that social desirability, extreme responding and acquiescence are each moderated by culture, there is less consensus or available evidence regarding how to best account for these potential sources of measurement error when conducting cross-cultural research. Several researchers have attempted to neutralise social desirability effects by explicitly measuring these propensities and then statistically adjusting for them (Nederhof, 1985). Most reported attempts to introduce social desirability corrections, however, have been unsuccessful (Ones et al., 1996; Ellingson et al., 1999; Fisher & Katz, 2000), suggesting that other approaches should be explored (for reviews of other methods of addressing social desirability in survey research, see Nederhof (1985) and Paulhus (1990)). Some researchers have also reported studies in which they assessed extreme responding and/or acquiescence via structural equation modelling (Mirowsky & Ross, 1991; Greenleaf, 1992; Watson, 1992; Billiet & McClendon, 2000; Cheung & Rensvold, 2000). In general, however, there is no consensus on how to best confront problems of systematic cross-cultural variability in survey response styles.

During data collection, efforts are also commonly made to [...] between respondents and interviewers by attempting to match them on ethnic background or demographic characteristics in hopes of minimising the social desirability pressures placed on respondents. For example, in contexts where deference to authority is a key cultural value, interviews conducted by older people of higher social status may induce strong social desirability effects. Numerous studies are available that demonstrate respondent deference to interviewers who represent differing cultural backgrounds (Cotter et al., 1982; Anderson et al., 1988; Finkel et al., 1991; Davis, 1997; Johnson et al., 2000), although it should be noted that none of these studies are based on experimental evidence. Under some circumstances, too little social distance between respondents and the person interviewing them may encourage socially desirable responding (Dohrenwend et al., 1968). Concern with the effects of social distance can also be extended to interview mode, as the degree of privacy afforded by each mode of data collection may exert differential pressures on respondents to provide socially desirable information. Although little information is available with which to examine cultural variability in mode of interview effects (Marín & Marín, 1989), it would seem likely that the social sensitivity of the answers being requested and respondent culture might interact with survey mode in ways that either magnify or [...]

across groups. These effects may be difficult to predict, particularly given the near absence of research on this topic. Researchers should thus carefully consider how the social sensitivity of the topics examined might vary across the groups studied, the types of questions asked, and how the mode of data collection might influence participants' responses.

Questionnaire pre-testing and translation assessment

We focus on two approaches to questionnaire pre-testing and translation assessment. First, we discuss back-translation, which has been used frequently and even viewed as a gold standard for translation assessment; however, we describe a number of pitfalls that recommend against its use as a sole assessment method. Second, cognitive interviewing is described, since it is increasingly recognised as a crucial pre-testing stage before surveys go into the field within particular socio-cultural settings. We suggest that the rationale in favour of this approach be extended to support the use of cognitive interviewing to assess translated questionnaires. Another method for determining comprehension and meaning attributed to items involves focus group evaluation with members of the target population. This assessment approach is likely to be better than no pre-testing of the survey instrument; however, the information from cognitive inter[...] because it better approximates the dyadic interplay of survey administration than do focus group dynamics. Finally, another promising tool for assessing respondent cognitions related to translated questions is behavioural coding, a technique which codes respondent and/or interviewer reactions to questions in recorded interviews to identify problematic survey questions (Fowler, 1995; Van der Zouwen & Smit, 2004; Johnson et al., 2006). Overall, we emphasise the importance of translation assessment and pre-testing as a means of ensuring sound measurement properties of the target language survey instruments.

Back-translation:

Back-translation is often mistaken as a method of translation, but it is actually a method for assessing the quality of a translation that has already been made into a target language (Harkness, 2003). It involves independent translation of the target language questionnaire back into the source language and comparing the result with the original source language questionnaire. Back-translation presumes that the greater the similarity between the results, the more acceptable the translation (Brislin, 1970). However, languages are not isomorphic, and an unnatural-sounding or even incomprehensible target language translation may produce, or even be necessary for, a good back-[...]lation may reveal some problems with target translations, it does not adequately assess the translated questions' comprehensibility within the target population (Harkness & Schoua-Glusberg, 1998; Harkness, 2003). Furthermore, the methodology provides no guidance about what qualifies as an acceptable level of similarity across the source and back-translated versions. Finally, when a back-translated questionnaire depends on a single translator for the forward translation into the target language, as it often does, it neither opens up the translation process to critical scrutiny nor does it produce the range of translation options that are found in team approaches. These factors recommend against the use of back-translation as the only method of translation assessment. Translation quality also needs to be evaluated in a more direct fashion.

An example provided earlier helps illustrate these concerns. The German General Social Survey item "Das Leben in vollen Zügen genießen" literally translates to English as "Enjoy life in full trains". This translation is readily back-translated to and reproduces with fidelity the original German source language phrase. However, the nonsensical nature of the English translation could go undetected without further review. Moreover, an appropriate British adaptation of this phrase ("Live life to the full") would sound awkward in American English, for which different wording would be necessary (i.e. Live life to the

missed, and in fact be discouraged, with back-translation that did not entail further review by bilinguals (Harkness, 2003).

Cognitive interviewing:

Cognitive interviewing is increasingly used to pre-test and thereby improve comprehension and related measurement properties of questionnaires within particular societies (Willis, 2005). The rationale for and principles that orient this practice should extend to assessment of translated questionnaires. In the absence of such pre-testing, there is no guarantee that the target language instrument will have sound measurement properties, even when the instrument has been pre-tested in the source language and best practices have been followed when translating it (Harkness et al., 2003b). We describe a few basic principles of cognitive interviewing, while referencing key works for readers who are interested in more detail.

Cognitive interviewing follows from research on the cognitive processes involved in responding to survey questions (Willis, 2005). The response process generally involves question comprehension (i.e. meaning of terms and perceived intent of question), retrieval from memory (i.e. availability of and strategies to access relevant information), judgment processes (i.e. motivation to respond and to respond truthfully) and mapping the internally generated response to the question onto the response

along this pathway may introduce measurement error, cognitive interview techniques focus on these aspects of the recall process.

The think aloud and verbal report protocols generally involve asking participants to openly describe the stream of thought in which they engage as they answer a survey question (Ericsson & Simon, 1984; Conrad & Blair, 2004). Responses are usually audio-recorded and transcribed for analysis. Advantages of the method include the minimal training requirements for the interviewer, whose main task is simply to read the question and listen. This generally passive interviewer stance may result in lesser bias than more pro-active methods. However, although the open-ended format of this approach may allow unanticipated response issues to emerge, subjects may need to be trained to think aloud, with some people unable to develop the skills necessary to provide useful feedback. Even good participants wander off track, thinking in ways that may only vaguely correspond with the mental processes required to respond to the question under normal circumstances (Willis, 2005).

Verbal probing techniques are increasingly favoured over think-aloud strategies in cognitive interviews (Willis, 2004, 2005). Probes have been developed in accordance with principles of sound question design, with specific probes used to uncover specific processing issues (see

generally developed to anticipate which kinds of probes, if any, will be necessary for each question. However, the interviewer may also freely employ probes to address issues that unexpectedly emerge during the course of an interview. As such, the use of verbal probes demands the active involvement and training of the interviewer. However, training is less of an issue for the survey respondent than in the think-aloud. Probes may nevertheless influence respondents in ways that do not adequately reflect cognitive processes under real survey conditions. In particular, care must be taken to develop unbiased, neutral probes that do not lead participants to respond in particular ways.

When addressing survey instruments within particular socio-cultural settings, Willis (2005) recommends that each round of cognitive interviews involve survey administration among 8 to 12 people from the target population. At least two testing rounds are necessary to assess the adequacy of the original questionnaire as well as changes that result from the first round. Although the number of testing rounds will depend on the quality of the original instrument and the proposed revisions, Willis suggests that there are likely to be diminishing returns after three rounds of testing. This may or may not be the case in dealing with more complicated cross-cultural issues that involve translated questionnaires, where each round

What to read: interviewer may have difficulty determining what parts of the question to read
Missing information: information that the interviewer needs to administer the question is not provided
How to read: question is not fully scripted and therefore difficult to understand

INSTRUCTIONS: Look for problems with any introductions, instructions or explanations from the respondent's point of view
Conflicting or inaccurate instructions, introductions or explanations
Complicated instructions, introductions or explanations

CLARITY: Identify problems with communicating question intent or meaning to the respondent
Wording: question is lengthy, awkward, ungrammatical or contains complicated syntax
Technical terms: terms undefined, unclear or complex
Vague: multiple ways to interpret the question or to decide what is to be included or excluded
Reference periods: missing, not well specified, or in conflict

ASSUMPTIONS: Determine problems with the assumptions made or underlying logic

Inappropriate assumptions are made about the respondent or about his/her living situation
Assumes constant behaviour or experience for situations that vary
Double-barrelled: contains more than one implicit question

KNOWLEDGE/MEMORY: Check whether respondents are likely to not know or have trouble remembering information
Knowledge may not exist: respondent is unlikely to know the answer to a factual question
Attitude may not exist: respondent is unlikely to have formed an attitude about the argument being asked about
Recall failure: respondent may not remember the information asked for
Computation problem: the question requires a difficult mental calculation

SENSITIVITY/BIAS: Assess questions for sensitive nature or wording and for bias
Sensitive content (general): the question asks about a topic that is embarrassing, very private, or that involves illegal
Sensitive wording (specific): given that the general topic is sensitive, the wording should be improved to minimize
Socially acceptable: a socially desirable response is implied by the question

RESPONSE CATEGORIES: Assess the adequacy of the range of options

Open-ended question: is inappropriate or difficult to answer without categories to guide
Mismatch: question does not match response categories
Technical terms: are undefined, unclear or complex
Vague: responses categories are subject to multiple interpretations
Overlapping: categories are not mutually exclusive
Missing: some eligible responses are not included
Illogical order: order not intuitive

ORDERING OR CONTEXT problems across questions

Table 2.3 Questionnaire Design Issues, from Willis (2005)

would be followed by efforts to coordinate and translate questionnaire changes until any cross-group discrepancies in question interpretation and comprehension appear to be resolved. Where equivalence of meaning cannot be achieved, researchers should document why, and make sure this documentation is accessible to those who will ultimately analyse the data. Researchers who use the data at a later date may otherwise believe that the questions are equivalent and make invalid comparisons across cultural groups. Drawing from the previous example regarding the "vice" connotation of "addiction" in Mexico (see page 62), it may be inappropriate to compare Mexican smokers and smokers from other countries on the item "tobacco is addictive" if the dominant meaning of addiction is compulsive behaviour in other countries. This situation could be documented by describing how addiction in Mexico appears to more strongly connote vice and less strongly denote compulsion than in other countries.

Cognitive interviewing example:

One recent example of cognitive interviewing to pre-test translated items involved the Spanish version of the Adult Tobacco Survey (ATS) for the United States National Center for Health Statistics and the Office on Smoking and Health at the Centers for Disease Control and Prevention. The goal was to produce a Spanish-language version of the ATS questionnaire that was equally comprehensible and that shared the same meaning among Latinos in the US who speak different national varieties or dialects of Spanish. In the first step, a committee approach was used involving independent, parallel translations by bilingual translators of Mexican, Puerto Rican and South American heritage. This was followed by two rounds of cognitive interviews with Latinos from nine countries and Puerto Rico. The first round involved 40 participants using think-alouds after every question. In the second round, the resulting survey was administered in normal fashion to 28 participants, followed by a debriefing that targeted particular comprehension issues.

One of the many issues that came up concerned the translation of the often-asked English-language question, "Have you smoked 100 or more cigarettes in your life". Participants repeatedly thought that this question referred to daily smoking, even after the word "entire" was inserted to read "in your entire life" ("en toda su vida") and the phrase was printed in boldface type to ensure its emphasis by survey administrators. This underscores the point that modification of a question may not resolve the problem, hence modified versions should also be pre-tested (Forsyth et al., 2004). To resolve the issue, an introductory phrase was added to both the English and Spanish language questions: "For this question, we want you to think of all the cigarettes you ever smoked in your whole life, not on a single day." In this case, changes made to the Spanish-language items meant re-evaluating and changing the wording of the original, English-language version in order to reinforce equivalence. Anecdotal evidence suggests that similar comprehension problems characterised the original English-language version, so the addition of this introductory phrase may have improved comprehension across languages.

Quantitative assessment of measurement properties and systematic measurement error

Despite all precautions to ensure item equivalence across social-cultural groups and linguistic variants of a questionnaire, some unaccounted-for factor may nonetheless systematically and differentially influence responses provided by the groups under consideration. The strategies described here are best employed after collecting pilot data, but before implementing the full survey. Results can be used to eliminate, change or replace items that appear to be biased. However, these methods can also be used to assess measurement equivalence after survey data are collected, with the drawback that it is too late to change items with poor measurement qualities. As has been emphasised when addressing other measurement equivalence issues described in this section, it is recommended that such issues be documented so that others who use the data at a later date will be aware of these issues.

Three approaches are briefly described here: single indicators, alternative indicators and latent variable Structural Equation Modeling (SEM). When multiple indicators of a construct are used, more statistical means are available to try to rule out systematic measurement error across groups. However, some approaches demand that single constructs be measured with a large number of items, which makes them less applicable to survey research. These methods, such as multi-trait multi-method (Saris, 2003a), multi-dimensional scaling (Fontaine, 2003), and item response theory approaches (Saris, 2003b) are detailed elsewhere.

Single-item measures of constructs:

When a single item is used to measure a construct, it may be difficult to assess whether observed similarities or differences in the measure are valid or whether these observations result from some other nuisance factor. Differential patterns of item non-response or "do not know" may indicate non-equivalence. Indeed, these non-random patterns violate assumptions that are necessary when dealing with this issue through pairwise or listwise deletion, as well as when using multiple imputation techniques (Groves, 2001). Nevertheless, theory and previous empirical findings can be drawn upon in order to predict how the indicator should correlate with other variables. In other words, expected correlations with other particular variables provide evidence of convergent validity. The absence of such correlations does not necessarily disprove the validity of the measure, however. Rather than disconfirming the validity of the measure, this lack of correlation may instead merely indicate the inadequacy or general inapplicability of the theory. Indeed, even when the measure under consideration is correlated with a set of theoretically related variables, this merely provides evidence, not confirmation, of the measure's convergent validity; systematic measurement error across the theoretical set of variables may still bias group comparisons.

Alternative measures of the same construct:

When there are multiple indicators of a particular construct, differential item functioning across cultural groups can be assessed by alternatively considering each indicator (Bollen et al., 1993; Smith, 2004a). With two items, a relatively clear indication involves consistent results for group differences in means (e.g. both higher in one group versus another) and in correlations with other constructs (e.g. number of days and number of cigarettes per day correlated with addiction). If the two indicators show inconsistent results, then strong claims about either result will depend on one's ability to convincingly argue for the use of one indicator over another. Although such post-hoc argumentation may be suspect, it can also establish the focus for subsequent research to clarify measurement and the interpretations that result. With three alternative indicators of the same construct, results from the third indicator can tip the balance in favour of the "preponderance of evidence". Consistency across all three indicators provides relatively strong confirmation of the validity of the results. Smith suggests that the most robust evidence will come from consistent results across alternative indicators that not only contain linguistically different stimuli, but that also have different response formats (Smith, 2004a).

Simultaneous assessment of multiple indicators:

Data collection on multiple indicators of the same construct also allows for statistical assessment of all indicators simultaneously, instead of the sequential format outlined above. Simultaneous consideration of multiple indicators lessens the impact of idiosyncratic, and therefore problematic, indicators (Bollen, 1989; Bollen et al., 1993). It also allows for the application of more formal statistical procedures to test, improve and attempt to equalise construct measurement properties across groups.
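
The consistency check described under "Alternative measures of the same construct" can be illustrated with a minimal Python sketch. It only compares the direction of the group difference in means for each alternative indicator. The indicator names, the two groups and all scores are hypothetical illustration values, and the helper functions are assumptions introduced here rather than an established procedure; a real analysis would also test the differences statistically and take the survey design into account.

```python
from statistics import mean


def direction_of_difference(group_a, group_b):
    """Return +1 if group A has the higher mean, -1 if lower, 0 if equal."""
    diff = mean(group_a) - mean(group_b)
    return (diff > 0) - (diff < 0)


def indicators_consistent(scores):
    """Check whether all indicators point the same way.

    scores maps an indicator name to a pair (group A values, group B values).
    Returns (True/False, per-indicator directions).
    """
    directions = {
        name: direction_of_difference(a, b) for name, (a, b) in scores.items()
    }
    consistent = len(set(directions.values())) == 1
    return consistent, directions


# Hypothetical pilot data for two alternative indicators of the same construct.
scores = {
    "days_smoked_per_month": ([28, 30, 25, 30], [12, 20, 15, 10]),
    "cigarettes_per_day": ([15, 20, 10, 18], [5, 8, 6, 4]),
}
consistent, directions = indicators_consistent(scores)
print(consistent, directions)
```

If the directions disagree, the sketch simply flags the inconsistency; deciding which indicator to trust remains a substantive judgement of the kind discussed above.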


Exploratory factor analysis (EFA) techniques can provide evidence for the equivalence of construct dimensionality and discrimination across groups, although special techniques are often necessary to ensure adequate comparison (Van de Vijver & Leung, 1997). Items may be considered for elimination if substantial group differences are found for factor loading values on the same dimension or for the extent of cross-loading across dimensions. Cronbach's alpha may also be used to determine group differences in inter-item reliability. Although some statistics are available for evaluating factorial agreement across groups, the sampling distributions for these statistics are unknown, hence there are no statistical means of testing for what counts as an unacceptable difference (Van de Vijver, 2003). Moreover, these techniques generally assume normally distributed, continuous variables, and survey indicators often violate these assumptions.

Latent variable structural equation modelling (SEM) offers a more direct means of testing invariance of construct parameters and measurement properties across groups (Bollen, 1989, 2002; Joreskog & Sorbom, 1996). As with EFA, the dimensionality of different concepts can be examined. However, a key advantage of SEM concerns the ability to use statistical tests of construct parameter equivalence across groups. Moreover, whereas factor analysis parameter estimates assume continuous, normally distributed indicators, SEM allows estimation using non-normally distributed categorical and ordinal indicators (Joreskog & Sorbom, 1996; Muthen & Muthen, 2004). SEM techniques estimate items' unique weighted contributions toward the measurement of latent variables. EFA, on the other hand, involves summing or averaging variables that comprise a particular dimension, treating each indicator as equally weighted. Finally, several SEM packages now adjust for study design effects and sampling weights, adjustments that are often important in generating reliable, unbiased estimates in cross-cultural survey research. Taken as a whole, these key advantages recommend SEM methods over standard EFA techniques. Cepeda-Benito and colleagues (Cepeda-Benito et al., 2004) provide a recent example of the use of these models to compare the structure of the Questionnaire of Smoking Urges survey instrument across samples of American and Spanish smokers.

Summary and Recommendations

Evaluation of tobacco control policies and other population-level interventions often involves data collection efforts across diverse national, cultural, linguistic and social groups. Comparison across such groups is often necessary to clarify policy effects, how these effects happen, and how effects might differ across populations. The literature discussed in this section suggests that these comparative studies should consider measurement equivalence issues in the following ways:

• Research teams should include collaborators from the socio-cultural groups in which the study is being conducted in order to help anticipate issues regarding the comparability of the theoretical framework, constructs and the measurement of these constructs across groups. When research involves participants from distinct language groups, at least one, and preferably more, team members should be fluent in the source language and the target language in which the survey will be administered.
• Whenever possible, it is recommended to use measures that have been appropriately validated for the populations in which the questionnaire will be administered. Even when a measure has been validated within one population group, its validity may not extend to other groups, and additional steps may be necessary to increase validity and improve the value of comparisons across groups.
• Translation of questionnaire items from one language to another should involve experienced translators. Review and adjudication of multiple, independent translations of the same items is currently considered the gold standard. If only one person translates the questionnaire, then translation review should involve a group of bilingual people who are knowledgeable of questionnaire design principles and of key study concepts. Translation assessment should not merely consist of back-translation.
• Researchers should carefully select and translate items with the goal of achieving equivalence of construct meaning across study populations. In some cases, literal translation of a questionnaire item across linguistic variants of the survey will not adequately capture the construct of interest, and more flexible translation and adaptation of the question will be necessary.
• All surveys, not just those that are translated, should be pre-tested to assess comprehension issues among the populations in which the survey will be administered. Ideally, pre-testing would involve cognitive interviewing before a survey is fielded. Cognitive interviewing or other pre-testing methods may also be used post-hoc to increase the validity of comparisons or to determine whether inconsistent results may be due to differential question comprehension.
• Researchers should consider and seek solutions to minimise the ways in which culturally moderated response factors (e.g. social desirability, acquiescence, extreme responding) may influence responses.
• Researchers should document decisions related to measurement development and item wording, especially where conceptual equivalence is suspect, translation is difficult, or where cognitive interviewing or other pre-testing methods reveal systematic differences in meaning. Researchers should also document issues around survey administration.
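
As a worked illustration of one of the quantitative checks mentioned in this section, the sketch below computes Cronbach's alpha separately for two groups so that inter-item reliability can be compared. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The group labels and item scores are hypothetical, the function name is an assumption introduced here, and the sketch handles complete cases only; it does not test whether a difference between groups is statistically meaningful.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of respondents, each a list of k item scores (complete cases).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item scale answered by four respondents in each group.
group_1 = [[1, 2], [2, 1], [3, 3], [4, 4]]
group_2 = [[1, 1], [2, 2], [3, 3], [4, 4]]

alpha_1 = cronbach_alpha(group_1)
alpha_2 = cronbach_alpha(group_2)
print(round(alpha_1, 3), round(alpha_2, 3))
```

A noticeably lower alpha in one group would be a prompt to look more closely at individual items, for example through the cognitive interviewing methods recommended above, rather than proof of non-equivalence.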

3.1 Measuring tobacco use behaviours

Introduction

The majority of tobacco control policies are designed to reduce tobacco use or exposure to tobacco smoke in the environment, strategies that are clearly supported by the scientific literature (US Department of Health and Human Services, 2004, 2006; IARC, 2004, 2007a). Preventing initiation and promoting quitting are the two major tobacco control strategies designed to reduce use. To facilitate progress, Article 20 of the WHO Framework Convention on Tobacco Control (FCTC) calls for Parties to:

(a) establish progressively a national system for the epidemiological surveillance of tobacco consumption and related social, economic and health indicators
(b) cooperate with competent international and regional intergovernmental organizations and other bodies, including governmental and nongovernmental agencies, in regional and global tobacco surveillance and exchange of information on the indicators specified in paragraph 3(a) of this Article
(c) cooperate with the World Health Organization in the development of general guidelines or procedures for defining the collection, analysis and dissemination of tobacco-related surveillance data.

In addition, Section 1-d of Article 21 requires each ratifying nation to provide periodic updates on surveillance and research as specified in Article 20. Article 22 calls for cooperation among the Parties to promote the transfer of technical and scientific expertise on surveillance and evaluation, among other topics (WHO, 2003).

This section will first review the natural history of tobacco use (e.g. initiation, current use, cessation). In epidemiologic studies of disease etiology, such as those discussed in IARC Monographs (e.g. IARC 2004) and reports of the Surgeon General (US Department of Health and Human Services, 2004), tobacco use behaviours (e.g. number of years smoked, number of cigarettes consumed each day) serve as independent variables. In the evaluation of the tobacco policies discussed in this Handbook, tobacco use behaviours serve as dependent variables. The section will then discuss factors that can influence the validity of self-report and factors that can influence comparability across surveys. The section will end by describing several measures to assess use, providing examples from cross-national surveillance and evaluation systems (Section 4.3), as well as national sources.

Natural history of tobacco use

The natural history of tobacco use is often conceptualized as a series of steps that can progress from never use, to trial, experimentation, established use, attempting to quit, relapse, and/or maintenance of cessation (Figure 3.1 and Table 3.1) (US Department of Health and Human Services, 1990, 1994; Marcus et al., 1993; Pierce et al., 1998b; Mayhew et al., 2000; Choi et al., 2001; Hughes et al., 2003). Prior to actual initiation of use, never users often think about use, a step in the process that is described in Section 3.2. After initial trial, users can either continue to experiment or discontinue and become former triers. Experimenters can either progress to established user or discontinue use and become former experimenters. Recent research suggests that nicotine dependence may appear during the experimentation phase, before use becomes established (DiFranza et al., 2002a; O'Loughlin et al., 2003; Fidler et al., 2006). Use becomes established when a threshold of cumulative lifetime exposure is surpassed. The exact threshold of established use is unknown and likely varies considerably, but is often considered as having smoked at least 100 lifetime cigarettes, or being exposed to a similar amount


[Flow diagram. Stages shown: Never user; Trier; Former trier; Experimenter; Former experimenter; Transition to established use (100 cigarettes); Non-daily user; Daily user; Quit attempt; Former user]

Note: Use involves consumption of cigarettes, other forms of smoked tobacco products, and/or various smokeless tobacco products.

Figure 3.1 The natural history of tobacco use


I. Initiation
a. Intention to try (Section 3.2)
b. Initial trial
i. Discontinuation after initial trial
c. Experimentation
i. Discontinuation of experimentation

II. Transition to established use

a. Ever daily versus never-daily

III. Current use

a. Frequency of use (daily versus non-daily)
b. Type of product used
c. Brand used
d. Intensity of use (units/day)
e. Topography (for smoked products)
f. Purchase patterns (partly covered in Section 5.1)

IV. Cessation
a. Intention to quit (Section 3.2)
b. Quit attempt
i. Intentionality
1. Planned
2. Spontaneous
ii. Dose management
1. Abrupt discontinuance
2. Gradual reduction
iii. Methods (Section 5.7)
1. Assisted
2. Unassisted
c. Maintenance of abstinence versus return to use

Here the term "use" means consumption of cigarettes, other forms of smoked tobacco products, and/or various forms of smokeless tobacco products.

Table 3.1 The Natural History of Tobacco Use: Key Constructs
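
The stages in Figure 3.1 and Table 3.1 can be expressed as a simple classification rule. The following Python sketch is illustrative only: the function name, its parameters and the category labels are assumptions introduced here, and it relies on the 100 lifetime cigarettes convention, which the text notes is a commonly used threshold rather than a known one. It does not distinguish triers from experimenters, because no numerical boundary between those stages is given.

```python
# Lifetime cigarettes often taken to mark established use (a convention,
# not a known biological threshold).
ESTABLISHED_THRESHOLD = 100


def classify_use(lifetime_cigarettes, smokes_now, smokes_daily=False):
    """Assign a respondent to a broad stage of the natural history of use.

    lifetime_cigarettes: self-reported number of cigarettes ever smoked
    smokes_now: True if the respondent reports any current smoking
    smokes_daily: True if current smoking is daily (ignored if not smoking)
    """
    if lifetime_cigarettes == 0:
        return "never user"
    if lifetime_cigarettes < ESTABLISHED_THRESHOLD:
        # Trier and experimenter are grouped: the boundary is not specified.
        if smokes_now:
            return "trier or experimenter"
        return "former trier or experimenter"
    if not smokes_now:
        return "former user"
    return "daily user" if smokes_daily else "non-daily user"


# Hypothetical respondents.
print(classify_use(0, False))
print(classify_use(40, True))
print(classify_use(2500, True, smokes_daily=True))
print(classify_use(2500, False))
```

In practice the inputs would come from survey items such as those discussed later in this section, and the same rule would need adapting for tobacco products other than cigarettes.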

of other tobacco products. Established use is generally manifested as daily use. However, persistent, regular non-daily use can also take place (Evans et al., 1992; Husten et al., 1998; Trosclair et al., 2005). Once past the threshold of established use, discontinuance involves an attempt to quit, with the outcome of each quit attempt being either relapse or maintenance of cessation (US Department of Health and Human Services, 1990; Gilpin & Pierce, 1994; Hughes et al., 2003; West, 2006). Quit attempts can be planned or spontaneous, involve abrupt discontinuance or gradual reduction in use before quitting, and may or may not be assisted by one or more of several available treatment strategies (Fiore et al., 1990; Giovino et al., 1993; West et al., 2001).

Validity of self-report of current tobacco use behaviours

Survey-based measures of current tobacco use behaviours, assessed in samples that are representative of a given population, allow researchers and policy-makers to estimate patterns of and trends in use overall and for subgroups in the population. National prevalence estimates have, in the vast majority of cases, been based on self-reports of personal behaviours. Self-report, however, may be subject to misclassification bias. Survey respondents can either state that they do not currently use tobacco, when in fact they do (misclassification of use as non-use), or that they do currently use tobacco when, in fact they do not (misclassification of non-use as use). Each of these misclassification biases can compromise the validity of a survey estimate.

Determining validity:

Validation of self-report is generally conducted using biomarkers of exposure to tobacco or tobacco smoke as criteria. Biomarkers of exposure that have been used in studies include nicotine; cotinine, a major metabolite of nicotine; carbon monoxide; and thiocyanate (Society for Research on Nicotine and Tobacco, 2002; Al-Delaimy, 2002). Nicotine and cotinine are almost exclusively specific to tobacco products. Very low levels of nicotine can be found in some vegetables, but their impact on cotinine levels is insignificant (Pirkle et al., 1996; Society for Research on Nicotine and Tobacco, 2002). Cotinine is preferred over nicotine as a biomarker, because it has a longer half-life in biological fluids than nicotine (~16 hours versus ~2 hours), thus reflecting use over the previous three days for the general population (Society for Research on Nicotine and Tobacco, 2002). Cotinine can be obtained from saliva, urine, and blood (serum). Saliva is the biological fluid of choice in population-based surveys, because it is the easiest to obtain. Hair nicotine levels reflect exposure over a longer period of time (Al-Delaimy, 2002). Hair samples are even easier to obtain than saliva. However, measurement of nicotine in hair can be influenced by hair color, treatment, and growth rate and identifying nicotine from actual tobacco use versus exposure to environmental sources can be problematic (Al-Delaimy, 2002).

Unfortunately, the use of biomarkers as indicators of actual use is also subject to error. Studies using cotinine to validate self-report must determine a cut-off for discriminating users from non-users. Cut-offs generally range from 10.0-20.0 ng/ml for serum or saliva cotinine among adults (Pirkle et al., 1996; Caraballo et al., 2001, 2004; Society for Research on Nicotine and Tobacco, 2002) and 5.0-11.4 ng/ml saliva or serum for adolescents (McNeill et al., 1987; Caraballo et al., 2004; Post et al., 2005). Optimally, a cut-off is selected in a manner that results in the highest accuracy, defined as the best combination of sensitivity and specificity (Caraballo et al., 2001, 2004). However, actual users may have cotinine levels below the cut-off if their most recent use was not recent enough or of sufficient intensity (in terms of units/day) to generate adequate levels of cotinine to exceed the cut-off, and thus be incorrectly classified as deceivers (Dolcini et al., 1996; Caraballo et al., 2004). Alternatively, some actual non-users of a product (e.g. cigarettes) may be exposed to extremely high doses of secondhand smoke, or they may use other tobacco products or nicotine replacement therapy, and thus may test positive for cotinine. Exposure to secondhand smoke, and use of other tobacco products that are available in a given nation, should be determined by questionnaire assessment and accounted for in validity assessments. In addition, cotinine levels may be influenced by racial/ethnic differences in the rate of nicotine metabolism and intake of nicotine per cigarette smoked (Caraballo et al., 1998; Perez-Stable et al., 1998; Benowitz et al., 2002), suggesting that different cut-offs may be needed for different racial/ethnic groups. Furthermore, the cut-off for pregnant women is lower (e.g. 10 ng/ml) than for the general adult population (Rebagliato et al., 1998; Owen & McNeil, 2001; Society for Research on Nicotine and Tobacco, 2002).

Self-reports from studies with a high demand for abstinence can be biased (Velicer et al., 1992; Patrick et al., 1994; Benowitz et al., 2002). Misclassification of use and non-use has been observed in clinical studies of adult smokers who have been advised to quit and subsequently interviewed about their smoking, often times by persons associated with the intervention. This is particularly true among subjects who have diseases or conditions that would benefit from quitting. For example, it was reported that 15 (65%) of 23

self-reported quitters in a cessation trial of chronic obstructive pulmonary disease patients in the
Netherlands misreported use as non-use (Monninkhof et al., 2004). In a US study to increase smoking
cessation among pregnant women, 49% of self-reported quitters receiving the intervention
misclassified use as non-use (Kendrick et al., 1995). In the UK, 11 (22%) of 51 myocardial
infarction survivors who had been advised to quit smoking misclassified use as non-use when followed
up during the year after infarction (Sillet et al., 1978). In the same report, 40% of subjects in a
trial of nicotine gum misclassified their use as non-use.

Population-based surveys, however, are, in general, composed of people who experience
smoking-attributable morbidity at approximately the rate of the general population, are not linked
to advice to quit, and are administered by interviewers or data collectors who are not known to the
respondent. In general, self-reports of current use from surveys are reasonably accurate, providing
estimates of prevalence that are comparable to those obtained from use of a biomarker (Pierce et
al., 1987; Velicer et al., 1992; Patrick et al., 1994; Caraballo et al., 2001, 2004; Vartiainen et
al., 2002). Data from the surveys used to evaluate the North Karelia project indicate very little
misclassification of use as non-use, with no difference in misclassification in North Karelia, where
the community-based intervention took place, compared to three other Finnish communities (Vartiainen
et al., 2002).

However, in cultures in which smoking among women is socially unacceptable, misclassification
appears to be more common. Household interviews were conducted on 1403 Southeast Asian adult
immigrants who resided in the USA (Wewers et al., 1995). The cotinine-adjusted estimates of current
smoking prevalence were substantially higher than those based on self-report for Cambodian females
(21.5% versus 6.6%) and Laotian females (10.8% versus 4.2%). In 1992, health surveys were conducted
among 1000 adults residing in Pitkäranta in the District of Karelia, Russia and among 2000 adults
residing in North Karelia, Finland (Laatikainen et al., 1999). The cotinine-adjusted estimates of
current smoking prevalence exceeded the estimates based only on self-report by a wider margin among
women from Pitkäranta (21% versus 10%) than among women from North Karelia (16% versus 13%). The
researchers attributed the difference to misclassification of actual use as non-use, most likely
because of the social unacceptability of smoking among women in that region of Russia. More
recently, concerns were raised about misclassification of use as non-use in population-based surveys
conducted in the UK and Poland (West et al., 2007). For the UK, cotinine-adjusted prevalence
estimates were 2.8 percentage points higher than estimates based on self-report (27.5% versus
24.7%); for Poland, the difference was 4.2 percentage points (41.8% versus 37.6%).

Misclassification of use as non-use is also more likely in household interviews with adolescents,
where privacy may be compromised and disclosure is lessened among those who do not want their
parents to learn about their behaviour (Turner et al., 1992; US Department of Health and Human
Services, 1994; Brittingham et al., 1998; Fowler & Stringfellow, 2001; Kann et al., 2002). The
prevalence of seven tobacco use behaviours was studied (e.g. lifetime cigarette use, current
cigarette use, current smokeless tobacco use, current cigar use) in an experiment that varied mode
of administration (paper-and-pencil instrument (PAPI) versus computer-assisted self-interview
(CASI)) and survey setting (school versus home) (Brener et al., 2006). Prevalence differed only for
smoking a whole cigarette before age 13 (lower in the PAPI condition) and current smokeless tobacco
use (higher in the school setting). Thus, for most of the tobacco-use behaviours measured, home
settings can provide prevalence estimates as high as school settings if privacy is increased (both
PAPI and CASI afford more privacy than either face-to-face or telephone interviews). It was also
demonstrated that when adequate privacy is provided, estimates of cigarette smoking from adolescent
surveys conducted in households are similar to those obtained from
surveys conducted in school settings (Gfroerer et al., 1997). Privacy in these studies is afforded
by computer-assisted technology, which may not be available in all countries. The four major surveys
of adolescents discussed in this Handbook (see Section 4.3) are conducted in schools, which afford
even more privacy than homes and provide more efficient venues for data collection.

Self-reports of the number of cigarettes smoked each day appear to be underreported in surveys
(Hatziandreu et al., 1989; Section 4.2). Even though cotinine levels increase with increasing number
of cigarettes smoked each day (Caraballo et al., 2001; Blackford et al., 2006), survey respondents
demonstrate evidence of digit bias towards round numbers (e.g. 10, 15, 20, 30 cigarettes per day)
(Klesges et al., 1995), and appear to round down more often than they round up. Comparisons between
consumption data and survey-based estimates of consumption should be conducted routinely in
countries to provide a crude indicator of the discrepancies between the two sources of information.

Some adolescent survey respondents may indicate they smoke or use smokeless tobacco when they
actually do not, perhaps to impress their friends (Cohen et al., 1988; Fowler & Stringfellow, 2001;
Stein et al., 2002). However, misclassifying non-use as use appears to be far less common than
misclassifying use as non-use (Stein et al., 2002). Adolescent reports that they have smoked during
a recent period of time, even when cotinine levels are below threshold values, may still be
accurate, because nicotine dosing from infrequent smoking may not result in levels of cotinine that
are high enough to exceed the cut-off value (Caraballo et al., 2004; Dolcini et al., 1996). The
Centers for Disease Control and Prevention conducted a test-retest study of reporting and found that
answers were reasonably stable over a two-week period, with estimates of prevalence being virtually
identical (Fowler & Stringfellow, 2001; Brener et al., 1995). The reliability of answers does not
prove that they were not distorted on both occasions, but remembering an exaggerated answer is
likely more difficult than remembering a true one (Fowler & Stringfellow, 2001).

Methods to enhance validity:

Methodological techniques have been developed to enhance privacy in survey settings, such as having
the respondent complete a paper-and-pencil survey form instead of answering a face-to-face
interview, which can be overheard (Brittingham et al., 1998); listen to survey questions using
headphones connected to a laptop computer, providing answers via the keyboard (Horm et al., 1996;
Brener et al., 2006); and respond to questions posed in a telephone interview by pressing the
appropriate number button on the key pad instead of replying verbally (Biener et al., 2004). An
experiment was conducted to determine if estimates of adolescent drug use obtained from data
collected confidentially would differ from those based on data that were collected anonymously
(O'Malley et al., 2000). They observed no differences in prevalence estimates, but cautioned that
any work conducted without anonymity must convince respondents that all their answers will be kept
completely confidential. If a survey respondent believes that the veracity of their self-report will
be checked biochemically, then they may be more likely to disclose use (Murray & Perry, 1987; Cohen
et al., 1988; Aguinis et al., 1993).

Question wording can also influence the validity of self-report (Babor et al., 1990; Brener et al.,
2003; Section 2.2). Survey respondents must first understand a question, interpret it properly, and
then encode it into memory. The outputs from this process are then used to search memory and
retrieve relevant information, which is evaluated in the decision-making stage of the process. If
the information retrieved is considered to be an adequate response, then a response will be
generated. If not, then additional retrieval attempts will be made, sometimes involving estimation
strategies or adoption of simple rules of thumb that people use to make judgements quickly and
efficiently.

If questions are difficult to understand, for example by asking about more than one concept, then
the accuracy of response will be compromised. If questions are biased, for example by presenting
tobacco use in a negative context,
then answers will also likely be biased. Survey questions must be clear and objective, and
constructed in a manner that involves the use of cognitive interviewing techniques, such as those
described in Section 2.2.

In an experiment involving the use of three different sets of questions assessing smoking behaviours
that held all other conditions constant, researchers obtained similar estimates of adolescent
smoking prevalence from the three conditions (Brener et al., 2004). Using a convenience sample of
4140 high school students (most were 14-18 years old), approximately equal numbers were randomly
assigned to receive questions assessing 14 tobacco use behaviours, based on the actual questions or
adapting the question styles of one of these three US surveys: Monitoring the Future Survey, Youth
Risk Behaviour Survey, or National Household Survey on Drug Abuse. Questionnaire type was
significantly associated with three tobacco-use behaviours: lifetime cigarette use, smoking a whole
cigarette before age 13, and purchasing cigarettes at a store or gas station. Nine other measures,
including those assessing prevalence of cigarette smoking and smokeless tobacco use, did not vary by
questionnaire type. No one questionnaire type proved superior in this experiment. Each set of
questions was written in a clear and objective manner.

Question wording can also influence the prevalence estimate obtained depending on what is being
measured. Adult respondents to the 1992 National Health Interview Survey (NHIS) who had ever smoked
100 lifetime cigarettes were randomly assigned to be asked, "Do you smoke now?" (the question used
prior to 1992) or "Do you now smoke cigarettes every day, some days, or not at all?" (the question
used since 1992). Prevalence was 25.6% for those who were asked the first question and 26.5% for
those asked the second (Centers for Disease Control and Prevention, 1994a). Including an option on
non-daily smoking expanded the range of possible affirmative options, and by doing so provided data
on an important behaviour, that of occasional smoking.

The effect of question wording on self-disclosure of smoking in a multiethnic prenatal population in
the USA was studied (Mullen et al., 1991). Questions about smoking were embedded in a survey
instrument assessing multiple risk behaviours. In one condition, subjects were asked "Do you smoke?"
and were forced to answer either "yes" or "no". All other subjects were asked, "Which of the
following statements best describes your cigarette smoking. Would you say: 1) I smoke regularly now,
at about the same amount as before finding out I was pregnant; 2) I smoke regularly now, but I've
cut down since I found out I was pregnant; 3) I smoke every once in a while; 4) I have quit smoking
since finding out I was pregnant; or 5) I wasn't smoking around the time I found out I was pregnant,
and I don't currently smoke cigarettes." The prevalence of smoking was higher in the group given
multiple response options (14.0%), compared to the group given the usual question with the
dichotomous response categories (9.2%). Most of the women given the multiple choice question
reported that they had cut down since learning that they were pregnant, a response option that
allows them to disclose their smoking and still display a partially positive image. The researchers
estimated that this increase in disclosure would identify an additional 55000 pregnant smokers in
the USA each year. In a survey conducted among pregnant women in the UK, cigarette smokers were
identified as those who answered "yes" to the question, "Do you smoke at all nowadays?"
Approximately 4% of pregnant women misclassified use as non-use (Owen & McNeill, 2001). Widespread
adoption of the question used by Mullen and colleagues might reduce such misclassification.

The overall content of a questionnaire may also influence disclosure. Respondents answering a
questionnaire that allows them to portray some positive attributes may be more likely to disclose
negative attributes, than if they were answering a questionnaire that only assessed negative
attributes (Fowler & Stringfellow, 2001).

In 2002, the Society for Research on Nicotine and Tobacco Subcommittee on Biochemical Verification
concluded that the added precision gained by biochemical verification is not
required and may not be feasible in large-scale population-based studies with limited face-to-face
contact (Society for Research on Nicotine and Tobacco, 2002). Nevertheless, strategic assessment of
validity in situations in which social desirability may lead to substantial underreporting could be
beneficial (Wewers et al., 1995; Laatikainen et al., 1999). In addition, data collected in countries
that routinely gather biospecimens for cotinine validation and assessment of exposure to secondhand
smoke could provide a sense of the scope and nature of underreporting, especially as tobacco control
progresses and tobacco use becomes increasingly undesirable in a given society.

Issues to consider when comparing different survey estimates

Surveillance and evaluation systems will provide comparable estimates of tobacco use behaviours to
the extent that they use similar methods. The factors that influence validity (e.g. assurance of
privacy and that answers will remain completely confidential, question wording, social desirability)
will influence estimates of prevalence and thus comparisons between surveys. Factors that can
influence prevalence estimates in ways that do not influence validity are described below.

Definition of a user:

Differing definitions of a user will often yield differing estimates of prevalence of use. For
example, in a country where multiple forms of tobacco are available, as in India and the USA, a
survey providing an estimate of any tobacco use would result in a higher estimate of prevalence than
one that only reports on the prevalence of tobacco smoking. Similarly, an estimate of cigarette
smoking prevalence would be lower than estimates of tobacco use and of tobacco smoking. In the same
way, estimates of current daily smoking would be lower than estimates of current smoking, which
include both daily and non-daily smoking.

Sample frame:

The sample frame of a survey can influence the prevalence estimates generated. For example,
prevalence could differ substantially for surveys of persons aged 15 years and older, aged 25 years
and older, and 25 to 64 years old. Likewise, a frame drawn only from major metropolitan areas in a
given country would likely produce substantially different prevalence estimates than if the entire
population were sampled. Each of the estimates from the sample frames discussed here could be valid
for the population covered by the respective sample frame. Thus, knowledge of each survey's sample
frame is important when making comparisons across surveys.

Another sample frame issue deals with telephone coverage. Telephone surveys are frequently conducted
in developed countries. The major advantage of such surveys is that they are less expensive to
conduct than household interviews. Telephone surveys are generally not conducted in developing
countries, where coverage does not permit the drawing of a representative sample. In developed
countries, however, the increasing prevalence of adults who own a wireless telephone, but live in a
household with no landline telephone, presents a potential for bias, because sample frames for
telephone surveys are drawn from numbers for landline telephones. According to data from the 2004
and 2005 US National Health Interview Survey (NHIS), approximately 1.7% of adults lived in
households that did not have any telephone service, 5.6% of adults lived in households with only
wireless telephones, and 92.8% of adults lived in households with landline telephones (Blumberg et
al., 2006). The prevalence of cigarette smoking was 19.7% (95% CI: 19.2-20.2) among adults living in
households with landline telephones, 32.9% (95% CI: 30.9-35.0) among adults in households with only
wireless telephones, and 36.9% (95% CI: 33.4-40.3) among adults in households with no telephone
service. Thus, all other things being equal, the prevalence of cigarette smoking that would have
been estimated from a telephone survey that only reached households with landline telephones would
have been 19.7%, whereas the prevalence in all households in the NHIS was 20.9%, a difference of 1.2
percentage points (P < 0.05). Telephone surveys provide valuable information. Rates of coverage will
likely vary across nations. The small difference in cigarette smoking prevalence estimates seen in
the USA suggests that comparisons of prevalence estimates from telephone and household surveys
should consider the possible influence of coverage bias.

Samples for surveys of adolescents are drawn either from school-based frames, providing access to
enrolled students, or from household lists and subsequent enumerations of household members. Only
household frames provide access to school dropouts, who are more likely to smoke cigarettes than
students of the same age (Gfroerer et al., 1997). This issue poses greater concern for older (i.e.
ages 16-17 years) adolescents than for their younger counterparts, who are less likely to have
dropped out of school. Another comparability issue is that household surveys may not report data for
an age group that is comparable to one found in a school survey. For example, if a household survey
reports estimates for young people who are 12-17 years old, and a school survey reports estimates
for students enrolled in grades 9-12 (most of whom are 14-18 years old), then the school survey will
likely have higher prevalence estimates simply because there are no 12-13 year olds enrolled in
schools in this frame, and the household age group does not include 18 year olds. Consumers of
survey data should consider these and other factors when comparing data from school and household
surveys.

Editing procedures:

Surveys that use self-administered questionnaires, such as the youth surveys described in Section
4.3, require decision rules for dealing with inconsistent answers. The effects of five approaches
for handling such inconsistencies in the 1998 Florida Youth Tobacco Survey were described (Bauer &
Johnson, 2000). The approaches ranged from doing nothing, which ignored inconsistencies and analyzed
each item as a separate entity, to a preponderance approach, which evaluated each record and
assigned values based on the weight of the evidence for each respondent. The cigarette smoking
prevalence estimates generated from these approaches ranged from 25.6% (95% CI: 24.1-27.1) to 29.7%
(95% CI: 28.2-31.2). Boys exhibited more inconsistencies and therefore more variability across
approaches. While recognizing the impossibility of discerning which approach is the most valid, the
authors suggested that editing procedures be described when findings are reported. Approaches for
handling inconsistencies can influence prevalence estimates and survey comparability (Brittingham et
al., 1998; Bauer & Johnson, 2000).

Type of survey:

Recent reports indicate that prevalence estimates obtained from surveys in California (Cowling et
al., 2003) and New Hampshire (Ramsey et al., 2004) in the USA are lower in surveys with a tobacco
focus than in general health surveys. The phenomenon was studied using a factorial design, and the
researchers concluded, after a series of multivariate analyses, that the introduction to the tobacco
survey cued some people, mainly women, who didn't want to spend the time on the survey, to
misclassify themselves as non-users (Cowling et al., 2003). The researchers argued that the social
stigmatization of tobacco use in California may have contributed to the misclassification bias they
observed.

Type of parental consent in school-based surveys of adolescents:

In most countries, letters are sent home notifying parents that their children will participate in a
survey (parental notification). In some countries, such as the USA and Australia, two types of
parental permission are required for school-based survey research. In both systems, a letter is sent
to parents describing the upcoming survey research project and requesting their child's
participation. In active parental permission, a form must be returned, signed by a parent, granting
the child permission to participate. If no signed form is returned, disapproval is assumed. In
passive permission, parents send back a signed form only if they do not want their child to
participate. If no form is returned, parental approval is assumed. In the USA, selected state and
municipal governments require active
permission. Three US reports have noted that estimates of tobacco use are lower when active parental
permission is required (Severson & Ary, 1983; Dent et al., 1993; Anderman et al., 1995). It is
suggested that active permission laws exclude high risk students because they are less likely to
return signed permission forms. Differences were not observed in ever smoking or smoking during the
previous week in a study of active versus passive consent conditions in Australia (White et al.,
2004).

An analysis of the 2001 Youth Risk Behaviour Survey (YRBS) data was undertaken to determine if type
of parental consent was related to the magnitude of estimates for 26 behaviours, including lifetime
cigarette smoking, current cigarette smoking, and current smokeless tobacco use (Eaton et al.,
2004). Of 13195 eligible students, 65% lived in passive conditions. In passive condition schools,
86.7% of sampled students participated; 77.3% of students in active condition schools did so. The
difference was due to the 9.5% of students in the active condition who did not return a permission
form. Type of consent did not influence any of the tobacco measures; in fact, it was related to only
two of the 26 behaviours measured. The conclusion was that the requirement for active consent will
not influence prevalence estimates if participation rates are sufficiently high (Eaton et al.,
2004). It was also argued that the anonymity offered by the YRBS might have lessened any concerns
students had about their parents' negative attitudes about certain risk behaviours and facilitated
disclosure. Thus, comparisons of estimates from school surveys in various countries should assess
the degree to which active consent is required and the participation rate in each condition.

Response rates:

Concern has been raised about the effects of declining response rates in telephone surveys,
especially in the USA. As the US rates declined in the 1990s, no differences in the degree of
representation in samples of population subgroups were observed (Biener et al., 2004). The
researchers also compared cigarette smoking prevalence estimates from telephone surveys conducted in
Massachusetts and California, where response rates dropped substantially, with those from the
Tobacco Use Supplement to the Current Population Survey (TUS-CPS), in which response rates dropped
only very slightly and were substantially higher in 1998-1999 (76%-81% in the TUS-CPS versus 69% in
Massachusetts and 51% in California). The smoking prevalence estimates obtained from the
Massachusetts and California surveys remained reasonably close (as judged by overlapping confidence
intervals) to those from the TUS-CPS, with no evidence of an increasing disparity over time.

Despite the findings from this study, researchers should work diligently to maximize response rates,
and continue to monitor response rates, sample characteristics, and prevalence estimates across
surveys with differing response rates to identify variables that might compromise comparisons.

Survey-based measures of tobacco use behaviours

A general outline of the variables used to monitor the natural history of tobacco use is presented
in Table 3.1. A description of detailed question items for almost every component of the process,
and some commentary on each, are provided in Tables 3.2 through 3.18. Intention to try (I.a. in
Table 3.1) and intention to quit (IV.a. in Table 3.1) are discussed in Section 3.2. The methods used
in cessation attempts (IV.b.iii. in Table 3.1) are discussed in Section 5.7. Topography (as an
indicator of smoke intake) (III.e. in Table 3.1) is discussed in the text below; however, no survey
items are recommended for this topic, as questionnaire assessments of smoking topography have not
been shown to be valid.

Tables 3.2 through 3.18 list questions relevant for each topic that are either used in the
cross-national surveys described in Section 4.3, or in country-specific surveys. The latter are
added in instances where they supplement the items used in the cross-national surveys. In
reliability assessments shown in the tables, kappa statistics of 61-80% were considered substantial
and 81-100% were almost perfect (Brener

Construct Construct I.b. on Table 3.1 (Initial Trial)

Measure "On how many occasions (if any) during your lifetime have you smoked cigarettes?" Number of
occasions: 0, 1-2, 3-5, 6-9, 10-19, 20-39, 40 or more (ESPAD)

"How old were you when you first tried a cigarette?" I have never smoked cigarettes; 7 years old or
younger; 8 or 9 years old; 10 or 11 years old; 12 or 13 years old; 14 or 15 years old; 16 years old or
older (GSHS)

"Have you ever tried or experimented with cigarette smoking, even one or two puffs?" (GYTS)

"Have you ever smoked tobacco? (at least one cigarette, cigar or pipe)" (HBSC)


Validity Face validity. Kappa for ever use of cigarettes was 83.8% in CDC 14-day reliability study among high
school students (Brener et al., 1995). 81.5% agreement in a two year study (Shillington & Clapp, 2000).
92.3% of baseline ever users reported consistently at follow-up survey, with consistency decreasing with
increasing time between assessments (Huerta et al., 2005).

Variation Items are adaptable for assessments of other tobacco products. For example, a survey could ask, "On
how many occasions (if any) during your lifetime have you used smokeless tobacco?" Number of
occasions: 0, 1, 2-3, 4-9, 10-19, 20-39, 40 or more

Comments This variable is assessed mostly in youth surveys. The only cross-national adult survey which
conceptually can indicate ever use is the GATS, which asks non-current users: "In the past, have you
smoked tobacco (cigarettes, cigars or pipes) on a daily basis, less than daily, or not at all?"

Definitions Ever users have tried one or more smoked or smokeless tobacco products. Never users have not tried
tobacco, even the least amount asked about. Definitions more specific to product type(s) can be
employed (e.g. ever smoker, ever cigarette smoker, ever user of smokeless tobacco, ever user of betel

GYTS: Global Youth Tobacco Survey

HBSC: Health Behaviour of School-aged Children
ESPAD: European School Survey Project on Alcohol and Other Drugs
GSHS: Global School Health Survey
GATS: Global Adult Tobacco Survey
CDC: Centers for Disease Control and Prevention

Table 3.2 Initial Trial - Ever Use of Cigarettes or Smoked Tobacco
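
The reliability figures quoted in the Validity rows of these tables (for example, a kappa of 83.8% for ever use of cigarettes) can be read against the bands used in this section: kappa of 61-80% substantial and 81-100% almost perfect; ICC of 0.75 and higher excellent and 0.60 to 0.74 good. The following is a minimal sketch, not the procedure used in the cited studies, of how Cohen's kappa might be computed for a yes/no item asked on two occasions and then labelled. The function names and the test-retest counts are hypothetical, and treating the integer bands as contiguous (above 0.80, above 0.60) is an assumption made here for illustration.

```python
def cohens_kappa(both_yes, yes_then_no, no_then_yes, both_no):
    """Cohen's kappa for a yes/no item asked of the same people twice."""
    n = both_yes + yes_then_no + no_then_yes + both_no
    if n == 0:
        raise ValueError("no respondents")
    # Share of respondents who gave the same answer on both occasions.
    observed = (both_yes + both_no) / n
    # Agreement expected by chance, from the marginal "yes" shares.
    p_yes_first = (both_yes + yes_then_no) / n
    p_yes_second = (both_yes + no_then_yes) / n
    expected = (p_yes_first * p_yes_second
                + (1 - p_yes_first) * (1 - p_yes_second))
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1 - expected)


def kappa_label(kappa):
    """Label a kappa (as a proportion) using the bands named in the text.

    Assumption: the 61-80% and 81-100% bands are treated as contiguous.
    """
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    return "below the bands named in the text"


def icc_label(icc):
    """Label an intraclass correlation coefficient using the bands in the text."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    return "below the bands named in the text"


if __name__ == "__main__":
    # Hypothetical test-retest counts for an ever-use item (not real data).
    k = cohens_kappa(both_yes=60, yes_then_no=5, no_then_yes=3, both_no=32)
    print(f"kappa = {k:.3f} ({kappa_label(k)})")
```

Kappa is used rather than raw percent agreement because two administrations of a yes/no item will agree fairly often by chance alone when most respondents give the same answer.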

et al., 1995). Also, intraclass correlation coefficients (ICC) of 0.75 and higher were considered
excellent, and 0.60 to 0.74 were considered good (Johnson & Mott, 2001). Most of the measures are
listed in terms of smoking behaviour. Modifications of each item can be made for smokeless tobacco
use.

Initial trial:

This construct distinguishes persons who have never used from those who have ever used tobacco
(Table 3.2). The proportion of young people who have never tried a cigarette is one of the Centers
for Disease Control and Prevention's (CDC) key outcome indicators (Starr et al., 2005). Reducing the
number of people who ever try tobacco will reduce the number who become established users (US
Department of Health and Human Services, 1994; Starr et al., 2005). Best measured in school surveys
of adolescents, initial trial can be assessed for whichever tobacco
products are of most relevance in a particular country. Trends in this measure have been studied for
more than 30 years in the USA, where lifetime use of cigarettes among high school seniors (i.e. 12th
grade students, the vast majority being 17-18 years old) was 73.6% in 1975 and 50% in 2005 (Johnston
et al., 2006). Cross-national findings on initial use have been reported in several reports (Warren
et al., 2000; Global Youth Tobacco Survey Collaborative Group, 2002; Godeau et al., 2004; Hibell et
al., 2004; Global Tobacco Surveillance System Collaborating Group, 2005; White & Hayman, 2006). Here
we define a trier as someone who has tried smoking, but has only taken one or more puffs, but never
a whole cigarette/cigar/pipe, or as someone who has tried smokeless tobacco, but only on one
occasion (Table 3.3).

The age of first use is another CDC key outcome indicator (Starr et al., 2005). The younger people
are when they start using tobacco, the more likely they are to use it as adults (US Department of
Health and Human Services, 1994). Trends over time in average age or grade of first use have been
reported (Kopstein, 2001; Johnston et al., 2006). Measures of actual age of first use have been used
to calculate the incidence of initiation of first use (Centers for Disease Control and Prevention,
1998; Kopstein, 2001). The average age of first use varies across countries, likely reflecting the
influence of media and of cultural values (Warren et al., 2000; Global Youth Tobacco Survey
Collaborative Group, 2002; Global

Construct Construct I.b. and I.c. on Table 3.1 (Initial Trial and Experimentation)

Measure How many cigarettes have you smoked in your entire life? None; 1 or more puffs, but never a whole
cigarette; 1 cigarette; 2 to 5 cigarettes; 6 to 15 cigarettes (about ½ pack total); 16 to 25 cigarettes (about
1 pack total); 26 to 99 cigarettes (more than 1 pack but less than 5 packs); 100 or more cigarettes (5
or more packs) (GYTS OPTIONAL)

Source GYTS

Validity Face validity. US smokers aged 10-18 years who had smoked 20-98 lifetime cigarettes were more likely
to report that they smoked because "it relaxes or calms" them and because "it's really hard to quit" than
were smokers who had smoked fewer than 20 lifetime cigarettes (Centers for Disease Control and
Prevention, 1994a).

Variation Items are adaptable for assessments of other tobacco products. For example, a survey could ask, "On
how many occasions (if any) during your lifetime have you used smokeless tobacco?" Number of
occasions: 0, 1, 2-3, 4-9, 10-19, 20-39, 40 or more

The parenthetical examples of the number of packs listed in the item above for cigarettes apply only in
countries in which there are 20 cigarettes in each package.

Comments Definitions for cigarette smoking are based on Choi et al., 2001.

Definitions A trier is someone who has tried smoking, but has only taken a few puffs, or someone who has tried
smokeless tobacco, but only once. An experimenter is someone who has smoked more than a few
puffs, but fewer than 100 cigarettes. For other tobacco products, the US National Center for Health
Statistics uses cut-offs of 1-49 cigars or pipes full of tobacco, or use of smokeless tobacco
on 1-19 occasions.

GYTS: Global Youth Tobacco Survey

Table 3.3 Trial versus Experimentation
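
The definitions in Table 3.3, together with the 100-cigarette threshold used in Table 3.6, can be applied mechanically to the GYTS lifetime-consumption response options. The following is a minimal sketch under stated assumptions: the function name, the dictionary name, and the stage labels are hypothetical and chosen only for illustration, and the pack-size hints in parentheses are dropped from the response strings.

```python
# Illustrative sketch only. The names below are hypothetical; the cut-offs
# follow the trier / experimenter / established-use definitions in the text.
LIFETIME_USE_STAGE = {
    "None": "never user",
    "1 or more puffs, but never a whole cigarette": "trier",
    "1 cigarette": "experimenter",
    "2 to 5 cigarettes": "experimenter",
    "6 to 15 cigarettes": "experimenter",
    "16 to 25 cigarettes": "experimenter",
    "26 to 99 cigarettes": "experimenter",
    "100 or more cigarettes": "established user",
}


def classify_lifetime_use(response: str) -> str:
    """Map a lifetime-consumption response option to a stage of use."""
    try:
        return LIFETIME_USE_STAGE[response]
    except KeyError:
        # Refuse to guess when the response is not one of the listed options.
        raise ValueError(f"Unrecognised response option: {response!r}") from None
```

A lookup table is used rather than numeric comparisons because the survey item collects categories, not counts; the same pattern could be adapted to the occasion-based cut-offs suggested for other tobacco products.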


Measuring tobacco use behaviours

Tobacco Surveillance System Collaborating Group, 2005). Table 3.4 describes the construct Age of First Use.

Discontinuation after initial trial:

Some young people will try tobacco, for example, by taking a few puffs on a cigarette, and then never use again. Tobacco control policies aim first to prevent initial trial and, if initial use has occurred, to prevent progression beyond such use. Researchers used one month with or without use to distinguish recent from non-recent experimenters (Choi et al., 2001). However, approximately three in 10 non-recent experimenters, according to their definition, progressed to established use. The question recommended in Table 3.5 permits use of other time periods after initial trial. Three months since initial use can be used to define former triers. This strategy, while somewhat arbitrary, is based on the assumption that triers who have not used for at least three months would be less likely to progress to established use than

Construct Construct I.b. on Table 3.1 (Initial Trial)

Measure When (if ever) did you first do each of the following things? A) Smoke your first cigarette? Never; 9
years old or less; 10 years old; 11 years old; 12 years old; 13 years old; 14 years old; 15 years old; 16
years or older (ESPAD)

How old were you when you first tried a cigarette? I have never smoked cigarettes; 7 years old or
younger; 8 or 9 years old; 10 or 11 years old; 12 or 13 years old; 14 or 15 years old; 16 years old or
older (GSHS)

How old were you when you first tried a cigarette? I have never smoked cigarettes; 7 years old or
younger; 8 or 9 years old; 10 or 11 years old; 12 or 13 years old; 14 or 15 years old; 16 years old or
older (GYTS)

At what age did you first do the following things? Smoke a cigarette: Never, ___ (write in age). (HBSC)


Validity Face validity. Kappa for smoking first whole cigarette before age 13 years was 68.1% in CDC 14-day
reliability study among high school students (Brener et al., 1995). Intraclass correlation coefficient (ICC)
was good (range = 0.637-0.666) in three tests of children and moderate (0.517) in a fourth in a two-year
reliability study (Johnson & Mott, 2001). The ICC was 0.73 for males and 0.76 for females in an Israeli
study (Huerta et al., 2005). Forward telescoping (producing older estimates of age of first use upon
re-interview) has been observed (Shillington & Clapp, 2000; Johnson & Mott, 2001).

Variation Items are adaptable for assessments of other tobacco products.

Comments The NSDUH asks adolescents and adults, "How old were you the first time you smoked part or all of a
cigarette?" This measure has been used to assess incidence of
initiation (Centers for Disease Control and Prevention, 1998); NSDUH even assesses month of first
use in recent initiators.

ESPAD: European School Survey Project on Alcohol and Other Drugs

GSHS: Global School Health Survey
GYTS: Global Youth Tobacco Survey
HBSC: Health Behaviour of School-aged Children
CDC: Centers for Disease Control and Prevention
NSDUH: US National Survey on Drug Use and Health

Table 3.4 Age of First Use


would those abstinent for less than three months.

Experimentation:

Experimentation occurs when someone progresses beyond initial trial. Experimentation with cigarettes can be distinguished from initial trial and from established use with the question recommended in Tables 3.3 and 3.6. Experimenters are those who have consumed from 1-99 cigarettes. Regarding the use of other tobacco products, experimentation can be operationalised as smoking from 1-49 cigars or pipes full of tobacco, or having used smokeless tobacco on from 2-19 occasions. These are somewhat arbitrary cut-offs; the US National Center for Health Statistics uses 50 cigars, 50 pipes full of tobacco, and use of smokeless tobacco on at least 20 occasions to measure established use in a manner similar to the 100 cigarette question. Indicators of nicotine dependence have been observed during the experimentation process (Centers for Disease Control and Prevention, 1994b; DiFranza et al., 2002b; O'Loughlin et al., 2003).

Discontinuation of experimentation:

Another goal of tobacco control is to prevent the progression from experimentation to established use. As discussed above, a cut-off of three months of abstinence since experimenting can be used to define former experimenters (see Table 3.5).

Transition to established use:

Young people who have become established users are, compared to those who have not, at far greater risk of continuing to smoke as adults (US Department of Health and Human Services, 1994; Choi et al., 2001). Preventing progression to established use is a goal of tobacco control. CDC has identified the proportion of young people who have smoked 100 cigarettes or more during their lifetimes as a key outcome indicator for evaluating comprehensive tobacco control programmes (Starr et al., 2005). Similar indicators for other tobacco products are recommended in Table 3.6. Several other measures of transition have been described as well (Johnston, 2001).

Construct Construct I.b.i and I.c.i. on Table 3.1 (Discontinuation)

Measure When was the last time you smoked a cigarette, even one or two puffs? I have never smoked a
cigarette; today; not today, but some time during the past week; not in the past week, but some time in
the past month; 2-3 months ago; 4-6 months ago; 7-12 months ago; 1 or more years ago (GYTS)

Source GYTS

Validity Face validity. In one study, non-recent experimenters (those experimenters who had not smoked within
the previous 30 days) were less likely to progress to established smoking than were current
experimenters (Choi et al., 2001).

Variation Items are adaptable for assessments of other tobacco products.

Definitions A former trier is someone who has smoked only a few puffs, or who has tried smokeless tobacco only
once, and who has not used it for ≥ 3 months. A former experimenter is someone who has experimented
(defined in Table 3.3) and has not smoked/used tobacco for ≥ 3 months.

GYTS: Global Youth Tobacco Survey

Table 3.5 Time Since Last Use Among Triers or Experimenters
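
The three-month rule in Table 3.5 can be sketched as a small function. This is an illustration under stated assumptions, not a prescribed algorithm: the function name and labels are hypothetical, and time since last use is assumed to be available as a number of months. The GYTS response option "2-3 months ago" straddles the cut-off, so an analyst working from categorical responses would need to decide how to treat it.

```python
# Illustrative sketch only; names and labels are hypothetical.
FORMER_CUTOFF_MONTHS = 3  # "at least three months" without use


def discontinuation_status(stage: str, months_since_last_use: float) -> str:
    """Label a trier or experimenter as 'former' after >= 3 months without use."""
    if stage not in ("trier", "experimenter"):
        raise ValueError("stage must be 'trier' or 'experimenter'")
    if months_since_last_use < 0:
        raise ValueError("months_since_last_use cannot be negative")
    if months_since_last_use >= FORMER_CUTOFF_MONTHS:
        return "former " + stage
    # Below the cut-off the respondent keeps the original stage label.
    return stage
```

Keeping the cut-off in a named constant makes it easy to test other time periods, which the text notes the recommended question permits.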


Construct Construct II. on Table 3.1 (Transition to established use)

Measure How many cigarettes have you smoked in your entire life? None; 1 or more puffs, but never a whole
cigarette; 1 cigarette; 2 to 5 cigarettes; 6 to 15 cigarettes (about ½ pack total); 16 to 25 cigarettes (about
1 pack total); 26 to 99 cigarettes (more than 1 pack but less than 5 packs); 100 or more cigarettes (5
or more packs) (GYTS OPTIONAL)

Have you smoked 100 cigarettes or more in your lifetime? (ITC)

Have you smoked at least 100 cigarettes in your entire life? (NHIS, BRFSS, NSDUH, ATS, TUS-CPS)


Validity Evidence of utility (predictive validity). Adolescents who have smoked at least 100 lifetime cigarettes
are more likely to be established smokers in the future than those who have not (Choi et al., 2001).

Variation Items are adaptable for assessments of other tobacco products. For example, "On how many occasions
(if any) during your lifetime have you used smokeless tobacco?" Number of occasions: 0, 1, 2-3, 4-9, 10-19, 20-39,
40 or more

Comments Having ever smoked 100 cigarettes is considered established use (Choi et al., 2001; Starr et al., 2005).
It is a useful measure because it can be used as a marker for a threshold even for never daily users.
However, some people have difficulty understanding the concept of having ever smoked a total of 100
lifetime cigarettes. For other tobacco products, the use of ≥ 50 cigars or pipes full of tobacco or having
used smokeless tobacco on ≥ 20 occasions can be used as cut-offs to define established use.

GYTS: Global Youth Tobacco Survey

ITC: International Tobacco Control Policy Evaluation Survey
NHIS: US National Health Interview Survey
BRFSS: US Behavioural Risk Factor Surveillance System
NSDUH: US National Survey on Drug Use and Health
ATS: US Adult Tobacco Survey
TUS-CPS: US Tobacco Use Supplement to the Current Population Survey

Table 3.6 Threshold for Transition to Regular Use

Ever daily versus never-daily:

In the USA in 1991, approximately 7.5% of established smokers had never smoked on a daily basis (Husten et al., 1998). Among all established smokers, never daily smoking was more common among non-Whites (range = 12-17%) than among Whites (6%); among current smokers, never daily smoking was also more common among non-Whites (range = 11-17%) than among Whites (4%).

The average age of first daily use can vary among ethnic groups within a country and over time (Centers for Disease Control and Prevention, 1991). Compared with younger age of first daily use, starting at an older age has been associated with slightly lower rates of subsequently developing tobacco-attributable disease (US Department of Health and Human Services, 2004). Descriptions of the ever daily use and age of first daily use constructs are found in Tables 3.7 and 3.8.

Current use:

Current use is influenced primarily by rates of initiation and quitting, as well as by mortality, and to a far lesser extent, immigration into and emigration out of a given population. Current use is the most important construct because of its role as an outcome variable in policy evaluation studies. CDC rates it a key outcome indicator (Starr et al., 2005).

Each of the seven surveys described in Section 4.3 measures current use (Table 3.9). In


three (European School Survey Project on Alcohol and Other Drugs (ESPAD), Global School Health Survey (GSHS), Global Youth Tobacco Survey (GYTS)) of the four surveys of young people, a current user is someone who used tobacco at least once during the previous 30 days (month) (Warren et al., 2000, 2006; Hibell et al., 2004; WHO, 2007a). In the Health Behaviour of School-aged Children (HBSC) survey, a current user is someone who uses either daily or weekly (Godeau et al., 2004; Hublet et al., 2006). Current use is defined slightly differently in the adult surveys. In the Global Adult Tobacco Survey (GATS) and the STEPwise Approach to Chronic Disease Factor Surveillance (STEPS) survey, a current smoker is someone who currently smokes tobacco products daily or less than daily. GATS and STEPS can distinguish between current daily and current non-daily smoking (Table 3.9). GATS can also classify current non-daily smokers as ever daily or never daily smokers. The International Tobacco Control Policy Evaluation Survey (ITC) classifies current cigarette smokers as those who had ever smoked at least 100 lifetime cigarettes who currently smoke daily, weekly, or monthly.

Trends in and patterns of current use have been reported in numerous reports and publications (US Department of Health and Human Services, 1994, 1998, 2001; Warren et al., 2000; Kopstein, 2001; Giovino, 2002; White & Hayman, 2006). The WHO Global InfoBase documents prevalence of current use of various indicators, including current smoking, current daily smoking, and current tobacco use for countries throughout the world

Construct Construct II.a. on Table 3.1 (Ever daily and never daily)

Measure When (if ever) did you first do each of the following things? B) Smoke cigarettes on a daily basis:
Never; 9 years old or less; 10 years old; 11 years old; 12 years old; 13 years old; 14 years old; 15 years
old; 16 years or older (ESPAD)

Have you ever smoked cigarettes daily, that is, at least one cigarette every day for 30 days? (NYTS)

In the past, have you smoked tobacco (cigarettes, cigars or pipes) on a daily basis, less than daily, or
not at all? (GATS)

In the past, did you ever smoke daily? (STEPS)


Validity Face validity. Kappa for ever daily use was 86.6% in CDC 14-day reliability study among high school
students (Brener et al., 1995).

Variation In GATS, current non-daily smokers are asked, "Have you smoked tobacco daily in the past?" Items are
adaptable for assessments of other tobacco products.

Comments The prevalence of never daily smoking among adult smokers in the USA was documented (Husten et
al., 1998).

Definitions An ever daily user is someone who has ever smoked tobacco or used smokeless tobacco on a daily
basis. A never daily user has never smoked tobacco or used smokeless tobacco on a daily basis.

ESPAD: European School Survey Project on Alcohol and Other Drugs

NYTS: National Youth Tobacco Survey
GATS: Global Adult Tobacco Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance
CDC: Centers for Disease Control and Prevention

Table 3.7 Ever daily versus Never Daily Use


Construct Construct II.a. on Table 3.1 (Ever daily and Never Daily)

Measure When (if ever) did you first do each of the following things? Smoke cigarettes on a daily basis: Never;
9 years old or less; 10 years old; 11 years old; 12 years old; 13 years old; 14 years old; 15 years old;
16 years or older (ESPAD)

How old were you when you first started smoking daily? (GATS, STEPS)


Validity Face validity. Kappa for first smoking daily before age 13 years was 71.8% in CDC 14-day reliability
study among high school students (Brener et al., 1995). ICC was excellent for adults' assessments of
age of first daily use (0.815) in a two-year reliability study (Johnson & Mott, 2001). Forward telescoping
(producing older estimates of age of first daily use upon re-interview) has been observed (Johnson &
Mott, 2001).

Variation Items are adaptable for assessments of other tobacco products.

Comments The NSDUH asks adolescents and adults, "How old were you when you first started smoking every
day?" This measure has been used to assess incidence of initiation
of daily use (Centers for Disease Control and Prevention, 1998). Measures like this have been used to
calculate incidence of initiation of cigarette smoking (Pierce et al., 1994; Pierce & Gilpin, 1995; Centers
for Disease Control and Prevention, 1998).

ESPAD: European School Survey Project on Alcohol and Other Drugs

GATS: Global Adult Tobacco Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance
CDC: Centers for Disease Control and Prevention
NSDUH: US National Survey on Drug Use and Health

Table 3.8 Age at first daily use

(…e/infobase/web/InfoBaseCommon).

Frequency of use:

Frequency of use refers to the number of days when tobacco is used during a given time period (e.g. the previous seven days or the previous 30 days). Frequency of use is often dichotomized as either current daily or current non-daily use (Table 3.9). In the USA, current non-daily smoking is more common among African Americans and Hispanics than it is among non-Hispanic Whites (US Department of Health and Human Services, 1998). Overall, current non-daily smoking remained stable at about 18-19% of all current smokers from 1993 to 2004 (Trosclair et al., 2005).

In surveys of young people, current frequent users are those who smoked on 20 or more of the previous 30 days. Frequency of use is a predictor of quitting (with more frequent use associated with a lower probability of subsequent quitting than less frequent use) (Hyland et al., 2004).

Type of product used:

It is important to measure the type of product consumed, particularly in countries, such as India, where there exists a variety of commonly used forms of tobacco products. The variety of forms available, and the possibility of switching or multiple concurrent uses, may influence the probabilities of quitting and of disease risk. Country-specific lists of products to be monitored should be incorporated into each country's survey. Examples of items used in the various cross-national surveys are provided in Table 3.10.

Per capita consumption (by weight) of various tobacco products is often documented by government agricultural agencies (Capehart, 2007). A useful rule of


Construct Constructs III. and III.a. on Table 3.1 (Current use)

Measure Surveys of Youth

How frequently have you smoked cigarettes during the LAST 30 DAYS? Not at all; less than 1 cigarette
per week; less than 1 cigarette per day; 1-5 cigarettes per day; 6-10 cigarettes per day; 11-20 cigarettes
per day; more than 20 cigarettes per day (ESPAD)

During the past 30 days, on how many days did you smoke cigarettes? 0 days; 1 or 2 days; 3 to 5 days;
6 to 9 days; 10 to 19 days; 20 to 29 days; all 30 days (GSHS)

During the past 30 days (one month), on how many days did you smoke cigarettes? 0 days; 1 or 2
days; 3 to 5 days; 6 to 9 days; 10 to 19 days; 20 to 29 days; all 30 days (GYTS)

Do you smoke now? Not at all; occasionally, but less than once a month; some time each month, but
less than one cigarette per week; sometime per week, but less than one cigarette per day; every day
at least one cigarette? (GYTS OPTIONAL)

How often do you smoke at present? Every day; at least once a week, but not every day; less than
once a week; I do not smoke (HBSC)

Surveys of Adults

Do you currently smoke tobacco (cigarettes, cigars or pipes) on a daily basis, less than daily, or not at
all? (GATS)

Do you smoke every day, less than every day, or not at all? (including factory-made cigarettes or
hand-rolled cigarettes). NON-DAILY SMOKERS ARE ASKED: Do you smoke at least once a week?
THOSE WHO ANSWER NO ARE ASKED: Do you smoke at least once a month? (ITC)

Do you currently smoke any tobacco products, such as cigarettes, cigars, or pipes? IF YES: Do you
currently smoke tobacco products daily? (STEPS)


Validity Evidence of utility. Self-reports of current use have been shown to be reasonably valid for adults and
youths, when adequate privacy is afforded (Turner et al., 1992; Velicer et al., 1992; Patrick et al., 1994;
US Department of Health and Human Services, 1994; Gfroerer et al., 1997; Brittingham et al., 1998;
Caraballo et al., 2001; Fowler & Stringfellow, 2001; Kann et al., 2002; Caraballo et al., 2004; Brener et
al., 2006). Kappa for smoking on > 14 days during the previous 30 days was 80.1% in CDC 14-day
reliability study among high school students (Brener et al., 1995). Evidence indicated that for persons
aged > 18 years, current smoking prevalence estimates based on proxy reports are virtually identical
to those based on self-report (Gilpin et al., 1994).

Variation Items are adaptable for assessments of other tobacco products.

Definitions Among Youth: A current user is someone who used tobacco at least once during the previous 30 days
(month). A current frequent user is someone who used tobacco on ≥ 20 of the previous 30 days.
Among Adults: A current user is someone who consumes tobacco daily or less than daily (GATS, STEPS) or
someone who consumes tobacco daily or less than daily during the previous month (ITC). A current daily
user is someone who reports using on a daily basis.
Among both Youth and Adults: Frequency refers to the number of days smoked each month.

Table 3.9 Current Use (Daily versus Non-Daily)


Comments Comparisons of adolescent prevalence estimates with those of adults can be problematic. For example,
estimates of current use among adolescents are often considerably higher than those among adults.
However, adolescents who smoke generally do so on fewer days each month than do adult smokers.
Ideally, comparisons of use among youth and adults would be made with a measure of the number of
days smoked during the previous 30 days (e.g. ≥ 20 of 30 days). In countries where adult surveys do
not measure the number of days smoked out of the previous 30 days, comparing adult prevalence
of current use with the prevalence of current frequent use among adolescents would be preferred to
comparisons of past month use, because the vast majority of adult users consume tobacco on ≥ 20 of
the previous 30 days. Some countries measure use during the previous week. Comparisons of weekly
use among adolescents and adults would provide more comparable estimates than past month use.

ESPAD: European School Survey Project on Alcohol and Other Drugs

GSHS: Global School Health Survey
GYTS: Global Youth Tobacco Survey
HBSC: Health Behaviour of School-aged Children
GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance
CDC: Centers for Disease Control and Prevention
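
The youth definitions in Table 3.9 (current use: at least one of the previous 30 days; current frequent use: 20 or more of the previous 30 days) lend themselves to a short worked sketch. The code below is illustrative only: the function names are hypothetical, the number of days smoked is assumed to be available as an integer rather than as the categorical ranges used in GSHS and GYTS, and the prevalence calculation is unweighted, whereas real survey estimates would normally apply sampling weights.

```python
from typing import Callable, Iterable

# Illustrative sketch only; function names are hypothetical.


def _check_days(days_smoked: int) -> None:
    if not 0 <= days_smoked <= 30:
        raise ValueError("days_smoked must be between 0 and 30")


def is_current_user(days_smoked: int) -> bool:
    """Youth definition: used on at least 1 of the previous 30 days."""
    _check_days(days_smoked)
    return days_smoked >= 1


def is_current_frequent_user(days_smoked: int) -> bool:
    """Youth definition: used on 20 or more of the previous 30 days."""
    _check_days(days_smoked)
    return days_smoked >= 20


def prevalence(days_by_respondent: Iterable[int],
               definition: Callable[[int], bool]) -> float:
    """Unweighted share of respondents meeting the given definition."""
    days = list(days_by_respondent)
    if not days:
        raise ValueError("no respondents supplied")
    return sum(definition(d) for d in days) / len(days)


# Made-up responses from ten hypothetical students (days smoked in past 30).
sample = [0, 0, 1, 5, 20, 30, 25, 0, 10, 0]
current = prevalence(sample, is_current_user)             # 6 of 10 respondents
frequent = prevalence(sample, is_current_frequent_user)   # 3 of 10 respondents
```

In this made-up sample the two definitions give quite different figures, which illustrates why the Comments row prefers comparing adult current use with adolescent current frequent use rather than with adolescent past month use.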

thumb is that when the amount of tobacco consumed in a particular product (e.g. snuff) comprises less than 1% of total tobacco consumed, then use of that product need not be assessed in surveys. Exceptions to that rule may occur when use of a product that is rarely consumed in the overall population is more common among a subgroup of the population. In the USA, for example, the use of bidis is rare in the adult population, but of concern among young people (National Youth Tobacco Survey (NYTS) data, US National Survey on Drug Use and Health (NSDUH) data).

Brand used:

The prevalence of use of specific brands among users of a particular product type (e.g. manufactured cigarettes) reflects the influence of both marketing campaigns and product design (Centers for Disease Control and Prevention, 1994c; Tomar et al., 1995; Slade, 2001; Cummings et al., 2002a; Wayne & Connolly, 2002; Carpenter et al., 2005; Lewis & Wackowski, 2006). Tobacco control practitioners can use this information to implement policies (e.g. counter-marketing campaigns, tobacco product regulation) designed to reduce overall use. Survey-based measures of brand used are presented in Table 3.11; measures of brand switching are described in Table 3.12.

Sub-brand characteristics (e.g. strength, flavoring, length) are often determined by either asking for the name of the specific brand purchased or asking the name of a brand family, followed by each of several possible sub-brand characteristics (Table 3.11). Strength has often been described by industry terms such as "light" and "mild". Because these terms are misleading (National Cancer Institute, 2001), they have been banned in a number of countries (e.g. European Union countries, Australia) and replaced either by other terms or specific color schemes that indicate strength based on machine-measured yields. All of these indicators are still misleading, since the tests used to determine strength do not reflect actual human exposure (National Cancer Institute, 2001; Hammond et al., 2006b). Thus, it is important to capture the extent of use of these terms, either via survey-based questions (Table 3.11), or via documentation of what is on the actual package.

Detailed measurement of information about tobacco product packaging is important in order to determine the variant of product type used, movement between price sectors, and, potentially, to assess the use of tobacco from illicit sources. Interviewers can either collect empty packages or


take digital photographs of a given respondent's current pack. Package characteristics to document include: brand name, strength, flavoring, length, pack type (hard pack versus soft pack), package color, color in words (e.g. Silk Cut Silver, Silk Cut Purple), filter (e.g. non-filter, charcoal [if designated]), UPC code, number of cigarettes per pack, constituents measured and levels, text, warning label(s) (words, picture [if applicable], and location[s]), and the presence or absence of a tax stamp.

In addition to survey-based measures, governments should make available to researchers and policy makers sub-brand-specific sales data on a region-specific basis. This will allow researchers to better document the influence of tobacco product marketing practices.

Intensity of use:

Intensity of use reflects the average number of cigarettes, cigars, or pipes full of tobacco smoked each day for daily smokers, or on the days during which the respondent smoked for non-daily smokers. Selected questionnaire items used to assess intensity are listed in Table 3.13. Intensity decreases following the implementation of smoke-free policies (Fichtenberg & Glantz, 2002a; Section 5.2) and price increases (Chaloupka et al., 2001; Warner, 2006; Section 5.1). Intensity is inversely associated with the probability that a respondent will quit (Hyland et al., 2004), and is directly related to the probability of developing a tobacco-attributable disease (US Department of Health and Human Services, 2004; IARC, 2004).

Smoke intake:

The intake of smoke from a cigarette is generally determined in laboratory studies of smoking topography, which assess how cigarettes are smoked. Variables measured include the number of puffs taken per cigarette, the duration of each puff, inter-puff interval, puff volume, the draw rate of each puff, the unsmoked butt length, and the amount of obstruction of filter ventilation holes (Pechacek et al., 1984). Unfortunately, questionnaire assessments of this construct have not proven to be valid. Two alternative techniques have been developed that estimate smoke intake from the study of cigarette filter butts: one measures the amount of solanesol, a naturally occurring component of tobacco that is deposited during smoking in the cigarette filter butt (Watson et al., 2004a); and the other studies the staining pattern on filter butts as a proxy measure for total smoke volume (O'Connor et al., 2005; Strasser et al., 2006; O'Connor et al., 2007). Either of these techniques would require the collection of filter butts from survey respondents.

Purchase patterns:

Some policies influence how people obtain cigarettes. The ways in which adults change their purchase patterns after price increases may influence the probability of subsequent quitting, with those switching to less expensive cigarettes appearing to be less likely to quit than those who do not (Hyland et al., 2005; see Section 5.1 for items assessing adult purchase patterns). Among young people, policies are often enacted to reduce sales to minors (underage persons) (Lantz et al., 2000). These policies are not considered effective on their own (Fichtenberg & Glantz, 2002b; Fielding et al., 2005), in part because young people are more likely to give other people money to purchase cigarettes for them when restrictions on sales to minors are implemented (Everett Jones et al., 2002; White & Hayman, 2006). See Table 3.14 for questionnaire items on adolescent purchase patterns.

Quit attempts

A key outcome indicator of a policy is whether it leads to an attempt to discontinue use (Starr et al., 2005; Fong et al., 2006a). As shown in Table 3.15, questionnaire items that assess whether a respondent has ever tried to quit, the number of lifetime quit attempts, and the duration and recency of the last quit attempt are drawn from the ITC baseline survey. ITC follow-up assessments determine whether a respondent has tried to quit since the prior assessment and the longest period of abstinence during that time period. The GATS question assesses whether a quit


Construct Construct III.b. on Table 3.1 (Type of product use)

Measure During the past 30 days, on how many days did you use any other form of tobacco, such as [COUNTRY
SPECIFIC EXAMPLES]? 0 days; 1 or 2 days; 3 to 5 days; 6 to 9 days; 10 to 19 days; 20 to 29 days;
all 30 days (GSHS)

During the past 30 days (one month), did you use any form of smoked tobacco products other than
cigarettes (e.g. cigars, water pipe, cigarillos, little cigars, pipe)? (GYTS)

During the past 30 days (one month), did you use any form of smokeless tobacco products (e.g.
chewing tobacco, snuff, dip)? (GYTS)

Do you currently use smokeless tobacco on a daily basis, less than daily, or not at all? (GATS)

On average, how many times a day do you use the following: [snuff by mouth, snuff by nose, chewing
tobacco, betel quid, any others]? (GATS)

In the past month, have you used any other tobacco product besides cigarettes? IF YES: What did
you use? FOR EACH PRODUCT USED, How often do you currently smoke/use [PRODUCT]? Would
that be daily, less than daily but at least once a week, less than weekly but at least once a month, less
than monthly, or have you stopped altogether? (ITC)

Do you currently use any smokeless tobacco such as [snuff, chewing tobacco, betel quid]? IF YES:
Do you currently use smokeless tobacco products daily? (STEPS EXPANDED)

On average, how many times a day do you use [snuff by mouth, snuff by nose, chewing tobacco, betel
quid, other]? (STEPS EXPANDED)


Validity Evidence of utility. Only 2% of adolescents in Sweden who reported that they did not use cigarettes or
snus during the previous month had cotinine levels > 5 ng/ml (Post, 2005). It was shown that the use
of cotinine and thiocyanate could distinguish smokers from smokeless tobacco users (Noland et al.,
1988). Kappa for use of chewing tobacco during the previous 30 days was 72.3% in CDC 14-day
reliability study among high school students (Brener et al., 1995).

Variation Country-specific lists are used. In general, use of a product need not be measured in surveys if
consumption of tobacco in that product is by weight < 1% of the total tobacco consumed in the country,
as reported by government agricultural statistics. Exceptions to this rule can occur as, for example,
when use of a particular product among youth is of concern.

GSHS: Global School Health Survey

GYTS: Global Youth Tobacco Survey
GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance
CDC: Centers for Disease Control and Prevention

Table 3.10 Type of Tobacco Product Used
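
The 1% rule of thumb described in the Variation row of Table 3.10 can be expressed as a simple share-of-weight calculation. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the consumption figures are made up for illustration, and the `always_include` argument is one possible way to encode the exception for products of concern in a subgroup such as youth.

```python
from typing import Iterable, List, Mapping

# Illustrative sketch only; names and figures are hypothetical.


def products_to_survey(weight_consumed: Mapping[str, float],
                       always_include: Iterable[str] = (),
                       threshold: float = 0.01) -> List[str]:
    """List products whose share of total tobacco consumed (by weight)
    is at least `threshold`, plus any products flagged as exceptions."""
    total = sum(weight_consumed.values())
    if total <= 0:
        raise ValueError("total consumption must be positive")
    forced = set(always_include)
    return sorted(
        product
        for product, weight in weight_consumed.items()
        if weight / total >= threshold or product in forced
    )


# Made-up national consumption by weight (arbitrary units).
weights = {"cigarettes": 900.0, "snuff": 95.0, "bidis": 5.0}
default_list = products_to_survey(weights)                        # bidis at 0.5% dropped
with_exception = products_to_survey(weights, always_include=["bidis"])
```

Keeping the exceptions as an explicit argument, rather than lowering the threshold, preserves the rule of thumb while documenting which products were added for subgroup reasons.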


Construct Construct III.c. on Table 3.1 (Brand use)

Measure During the past 30 days (one month), what brand of cigarettes did you usually smoke? (SELECT
ONLY ONE RESPONSE) Did not smoke cigarettes during the past 30 days; no usual brand; Add 5
most common brands; other (GYTS)
What brand did you buy when you last purchased cigarettes? Were these cigarettes filtered or non-
filtered? Were these cigarettes light, mild, or low-tar? (GATS)

Do you smoke factory-made cigarettes, roll-your-own cigarettes, or both? IF BOTH: For every 10
(ten) cigarettes you smoke, how many are roll-your-own? In the last month, what brand of [cigarettes/roll-
your-own cigarettes] did you smoke more than any other? [SUB-BRAND CHARACTERISTICS ARE


Validity Face validity.

Variation In ITC, sub-brand characteristics (e.g. length, filter versus non-filter) are identified in one of two possible
ways. In many countries, such as Canada, Australia, and the United Kingdom, lists of every possible
brand are developed and a code is given to each brand. The interviewer needs to determine the
complete name of the brand the respondent is using. Often, the prompt "How do you ask for your
specific brand in the store?" is used to try to elicit the full name. In other countries (e.g. USA, China),
where the variety of sub-brands is too great, brand names are given specific codes and interviewers
determine specific sub-brand characteristics (e.g. menthol versus non-menthol, King Size, 100s, or
some other length).
Country-specific terms that communicate concepts similar to light, mild, or low-tar should be
substituted as appropriate. These can include colour, as well as terms such as Fine or Smooth.

Items are adaptable for assessments of other tobacco products and for non-cigarette potential reduced
exposure products (PREPs).

Comments If necessary, country representatives should generate a list of all the brands on the market and have it
available for interviewers to use to code answers. Observation of packaging to assess colour(s),
presence of a legal tax stamp, and/or counterfeit brands would complement self-report.

GYTS: Global Youth Tobacco Survey

GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey

attempt of at least 24 hours was made during the previous 12 months. A baseline question from the
Smoking Toolkit Study (West, 2006) assesses whether a serious quit attempt (i.e. whether the person
decided to make sure they never smoked another cigarette) was ever made and, if so, the duration and
recency of the last quit attempt. The follow-up questionnaires assess whether a serious attempt was made
during the previous 12 months, the number of attempts, and, for up to three attempts, the recency and
duration of each.

Intentionality:

Spontaneous quit attempts appeared to be more successful than those that were planned (Larabie, 2005;
West & Sohal, 2006). Items assessing this construct from ITC and from the Smoking Toolkit Study (West,
2006) are presented in Table 3.16.

Dose management:

People who quit abruptly (sometimes referred to as "cold turkey")

Measuring tobacco use behaviours

Construct Construct III.c. on Table 3.1 (Brand Use)

Measure About how long have you been smoking [current brand]? IF UNKNOWN: Would that be less than one
year, or at least one year? (ITC)

Approximately how long have you been smoking [NAME OF CURRENT BRAND]? Before the [NAME
OF CURRENT BRAND] that you smoke now, what brand did you smoke? (AUTS)

Sources ITC, AUTS

Validity Face validity.

Variation Items are adaptable for assessments of other tobacco products.

Comments Using data from the USA, it was demonstrated that 9.2% of smokers switched cigarette brands and
6.7% switched companies during the previous year (Siegel et al., 1996). Rates of switching may be
higher in locations where high prices lead smokers to search out less expensive brands. During a
three-year cohort study, it was observed that US adolescents who used snuff were more likely to switch
from a brand with low nicotine dosage to a brand with high, than to switch from a high dosage brand to
a low dosage brand (Tomar et al., 1995).

AUTS: Adult Use Tobacco Survey

ITC: International Tobacco Control Policy Evaluation Survey

appear more likely to succeed than those who gradually reduce the number of cigarettes they smoke each
day (Fiore et al., 1990; Gritz et al., 1999). Items assessing this construct from the ITC and the Smoking
Toolkit Study (West, 2006) are presented in Table 3.17.

Maintenance of abstinence versus return to use:

Discontinuing use of tobacco and maintaining abstinence are the most important disease-preventing
actions a user can take (US Department of Health and Human Services, 2004; Dresler et al., 2006). Items
assessing duration of abstinence are presented in Table 3.18.

Key constructs to measure

Several reports describe important constructs for tracking progress in reducing smoking prevalence (US
Department of Health and Human Services, 1989, 1990, 1994, 1998, 2001; WHO, 1998a; Husten et al.,
1998; Pierce et al., 1998b; Warren et al., 2000; Burns et al., 2000; Johnston, 2001; Kopstein, 2001;
Giovino, 2002; Global Youth Tobacco Survey Collaborating Group, 2002; Godeau et al., 2004; Hibell et al.,
2004; Global Tobacco Surveillance System Collaborating Group, 2005; Starr et al., 2005; Trosclair et al.,
2005; Hublet et al., 2006; Johnston et al., 2006; Mochizuki-Kobayashi et al., 2006; Warren et al., 2006;
White & Hayman, 2006; WHO, 2007a). Table 3.19 contains a list of key constructs to measure in
prevalence surveys. The key constructs involve current use. Since current use is influenced primarily by
initiation and cessation, these constructs are included as well.

Two constructs, both used in adult surveys, that are too complex to include in Table 3.19 will be
presented here. GATS questions permit a six-category classification of use status: 1) current daily use;
2) current non-daily use - formerly daily; 3) current use - never daily; 4) former daily use; 5) former
use - never daily; and 6) never used. These categories can be defined based on answers to three
questions: 1) Do you currently smoke [use smokeless] tobacco on a daily basis, less than daily, or not at
all?; 2) Have you smoked [used smokeless] tobacco daily in the


Construct Construct III.d. on Table 3.1 (Intensity of use)

Measure Youth Surveys

How frequently have you smoked cigarettes during the LAST 30 DAYS? Not at all; less than 1 cigarette
per week; less than 1 cigarette per day; 1-5 cigarettes per day; 6-10 cigarettes per day; 11-20 cigarettes
per day; more than 20 cigarettes per day (ESPAD)

During the past 30 days (one month), on the days you smoked, how many cigarettes did you usually
smoke? I did not smoke cigarettes during the past 30 days (one month); less than 1 cigarette per day;
1 cigarette per day; 2 to 5 cigarettes per day; 6 to 10 cigarettes per day; 11 to 20 cigarettes per day;
more than 20 cigarettes per day (GYTS)

Adult Surveys

On average, how many of the following do you smoke each <day/week>? Manufactured cigarettes;
hand-rolled cigarettes; pipes full of tobacco; cigars, cheroots, cigarillos; water pipe rocks (GATS)

On average, how many cigarettes do you smoke each <day/week/month>, including factory-made
cigarettes and roll-your-own cigarettes? (ITC)

On average, how many of the following do you smoke each day? Manufactured cigarettes; hand-
rolled cigarettes; pipes full of tobacco; cigars, cheroots, cigarillos; other (STEPS)


Validity Evidence of utility. In several countries, cotinine levels increased with increasing cigarettes per day
(CPD) and levelled off between 10 and 20 CPD (Caraballo et al., 1998; Blackford et al., 2006). Indicators
of nicotine dependence are associated with smoking intensity in adolescents (O'Loughlin et al., 2003)
and adults (Shiffman et al., 2004). Kappa for smoking > 1 cigarette/day during the previous 30 days was
76.2% in the CDC 14-day reliability study among high school students (Brener et al., 1995).

Variation Items are adaptable for assessments of other tobacco products. Smokeless tobacco is measured in
GATS in terms of the number of times the respondent uses a given product each day.

Comments Intensity is the number of cigarettes/cigars/pipes full of tobacco smoked each day for daily smokers
and on the days smoked for less than daily smokers (Marcus et al., 1993; Centers for Disease Control
and Prevention, 1994a).

ESPAD: European School Survey Project on Alcohol and Other Drugs

GYTS: Global Youth Tobacco Survey
GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance
CDC: Centers for Disease Control and Prevention

Table 3.13 Intensity of Use (Number of Cigarettes or Other Tobacco Products Smoked
During a Selected Time Period)


Construct Construct III.f. on Table 3.1 (Purchase patterns)

Measure During the past 30 days (one month), how did you usually get your own cigarettes? (SELECT ONLY
ONE RESPONSE) I did not smoke cigarettes during the past 30 days (one month); I bought them in a
store, shop or from a street vendor; I bought them from a vending machine; I gave someone else money
to buy them for me; I borrowed them from someone else; I stole them; an older person gave them to
me; I got them some other way (GYTS)

During the past 30 days (one month), did anyone ever refuse to sell you cigarettes because of your
age? I did not try to buy cigarettes during the past 30 days (one month); yes, someone refused to sell
me cigarettes because of my age; no, my age did not keep me from buying cigarettes (GYTS)

In the area where you live, do you know of any places that sell single or loose cigarettes? Yes; No

Where, or from whom, did you get the last cigarette you smoked? Tick only one box: I didn't buy it:
my parents gave it to me; my brother or sister gave it to me; I took it from home without my parent(s)'
permission; friends gave it to me; I got someone to buy it for me; other (specify) OR I bought it: at a
hotel, pub, bar, tavern, RSL club; at a supermarket; at a news agency; at a milk bar or delicatessen; at
a convenience store (e.g. Food Plus); at a tobacconist/tobacco shop; at a take-away food shop; at a
petrol station; through the internet; other (specify) (ASSAD)

If you bought your last cigarette, was it from a coin-operated (vending) machine? (ASSAD)

Sometimes people break open a packet of cigarettes and sell single cigarettes. In the last four weeks,
have you bought cigarettes that were not in a full packet (for example, buying one or more cigarette(s)
at a time)? IF YES: Thinking of the last time you bought cigarettes that were not in a full packet, where
did you buy the cigarette(s) from? I bought the cigarette(s) at a shop; I bought the cigarette(s) from a
friend or relative; I bought the cigarette(s) from someone else (ASSAD)

Sources GYTS, ASSAD (White & Hayman, 2006)

Validity Face validity.

Variation Items are adaptable for assessments of other tobacco products.

Comments Those who purchase in locations that provide less expensive cigarettes are less likely to quit (Hyland
et al., 2005). Young people are more likely to have other people purchase cigarettes for them in regions
where sales to minors are restricted (Everett Jones et al., 2002; White & Hayman, 2006).

GYTS: Global Youth Tobacco Survey

ASSAD: Australian Secondary Students Alcohol and Drug Survey

Table 3.14 Purchase Patterns


Construct Construct IV.b. on Table 3.1 (Quit attempts)

Measure Ever:
ITC BASELINE: Have you ever tried to quit smoking? IF YES: How many times have you ever tried
to quit smoking? How long ago did your most recent serious quit attempt end? Thinking about your last
serious quit attempt, how long did you stay smoke free? (ITC)

Have you ever made a serious attempt to stop smoking? By serious attempt I mean you decided that
you would try to make sure that you never smoked another cigarette. Yes; No; Don't know
IF YES: Thinking back to your most recent attempt to quit smoking, how long ago was it? SHOW
SCREEN: Within the last week; within the last 2-3 weeks; a month ago; more than 1 month and up
to 2 months; more than 2 months and up to 3 months; more than 3 months and up to 6 months; more
than 6 months and up to a year; more than one year and up to 5 years; longer than 5 years; don't know
AND: How long did your most recent quit attempt last? Less than a day; more than a day but
less than 3 days; more than 3 days up to a week; more than a week up to a month; more than 1
month and up to 2 months; more than 2 months and up to 3 months; more than 3 months and up to
6 months; more than 6 months and up to a year; more than one year and up to 5 years; more than
5 years; don't know; I am still not smoking (STS Baseline Questionnaire)

Past 12 months:
During the past year, have you ever tried to stop smoking cigarettes? I have never smoked cigarettes;
I did not smoke during the past year; yes; no (GYTS)

During the past 12 months, have you tried to stop smoking? IF YES: Thinking about the last time you
tried to quit, how long did you stop smoking? (GATS)

Follow-up assessments in a cohort study:

Have you made any attempts to stop smoking since we last spoke with you in [month of last interview]? IF YES:
Are you back smoking or are you still stopped? IF BACK SMOKING: What is the longest time that you
stayed smoke free since [month of last interview]? IF STILL STOPPED: When did you quit? (ITC)

The last time we spoke with you in [month of last interview] you had quit smoking. Are you back smoking or are you still
stopped? IF BACK SMOKING: What is the longest time that you stayed smoke free since [month of
last interview]? IF STILL STOPPED: So you have quit smoking since [quit date reported previously]
is that correct? IF NO: When did you quit? (ITC)

Have you made a serious attempt to stop smoking in the past 12 months? By serious attempt I mean
you decided that you would try to make sure that you never smoked another cigarette. Please include
any attempt that you are currently making. Yes; no; don't know.
IF YES: How many serious attempts to stop smoking have you made in the last 12 months?
(Choose one option only) 1 attempt; 2 attempts; 3 attempts; more than 3 attempts; don't know. How
long ago did your quit attempt start? (assessments are made for up to 3 attempts). How long
did your quit attempt last before you went back to smoking? (assessments are made for up to 3
attempts; still not smoking is an option) (STS Wave 1 and 2 postal questionnaires)

Sources ITC; STS (West, 2006); GATS

Table 3.15 Quit Attempts


Validity Face validity. However, respondents appear to forget many short quit attempts, especially those that
took place more than three months before the interview (Gilpin & Pierce, 1994; West et al., 2007). Having
ever quit for > 12 months or having quit for > 7 days during the previous 12 months has been classified
as a strong quitting history and is predictive of subsequent cessation (Pierce et al., 1998b).

Variation Items are adaptable for assessments of other tobacco products.

Comments ITC items are specifically crafted to assess change in a cohort study.

Definitions A quit attempt is an activity by a user in which the person tries to stop using with the intention of never
using again. Some surveys classify a period of abstinence as a quit attempt only if it lasts for > 24 hours.

GYTS: Global Youth Tobacco Survey

GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey
STS: Smoking Toolkit Study
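
The "strong quitting history" rule quoted under Validity (ever quit for > 12 months, or quit for > 7 days during the previous 12 months) can be expressed as a simple flag. This is a minimal sketch: the function and argument names are hypothetical, and treating 12 months as 365 days is an assumption made here, not a definition taken from Pierce et al. (1998b).

```python
DAYS_IN_12_MONTHS = 365  # assumption: 12 months approximated as 365 days


def has_strong_quitting_history(
    longest_quit_ever_days: float,
    longest_quit_past_12_months_days: float,
) -> bool:
    """Return True if either condition of the quoted rule is met.

    longest_quit_ever_days: longest lifetime quit attempt, in days (0 if none).
    longest_quit_past_12_months_days: longest quit attempt in the previous
        12 months, in days (0 if none).
    """
    quit_over_a_year = longest_quit_ever_days > DAYS_IN_12_MONTHS
    recent_quit_over_a_week = longest_quit_past_12_months_days > 7
    return quit_over_a_year or recent_quit_over_a_week


# Hypothetical respondents
respondent_a = has_strong_quitting_history(400, 0)   # quit for over a year, long ago
respondent_b = has_strong_quitting_history(10, 10)   # 10-day quit in the past year
respondent_c = has_strong_quitting_history(5, 5)     # only a 5-day quit
```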

past?; and 3) In the past, have you smoked [used smokeless] tobacco on a daily basis, less than daily, or
not at all? (Note that respondents are skipped past questions that do not apply to them, as indicated by
their answer(s) to initial item(s).)

The second construct involves a technique that assesses tobacco use activity during the 12 months prior
to being interviewed. The US Tobacco-Use Supplement to the Current Population Survey asks current
daily smokers, current non-daily smokers, and former smokers abstinent < 12 months, "Around this time
12 months ago were you smoking cigarettes every day, some days, or not at all?" This question, which
can be adapted to smokeless tobacco use, enables a retrospective cohort assessment of cessation activity,
transitioning from daily to non-daily use, transitioning from non-daily to daily use, and relapse to daily
or non-daily use (Gilpin & Pierce, 1994; US Department of Health and Human Services, 1998; Burns et al.,
2000).

Summary

This section describes the key concepts within the natural history of tobacco use, providing a conceptual
model to guide measurement of key constructs. Current tobacco use is the most important construct
because of its importance as an outcome in policy evaluation studies. Studies that have examined the
validity of self-reported measures of current use generally find these measures to be valid, although
there are conditions where the validity may be reduced.

It is important to measure the type of tobacco used, particularly in those countries in which there exists
a variety of forms. The variety of forms available, and the possibility of switching or multiple concurrent
use, may influence the probability of quitting and disease risk.

Detailed measurement of information about tobacco product packaging is important in order to determine
the variant of product type used, movement between price sectors, and, potentially, to assess the use of
tobacco from illicit sources.

Other important constructs in the measurement of tobacco use behaviour include early use, frequency and
intensity of current use, quit attempts, and duration of abstinence among former smokers.

Consumers of survey data, in which tobacco use measures are included, should be aware of factors that
can influence population estimates of tobacco use and take those into consideration when comparing
estimates from surveys conducted within and across countries.
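
The six-category GATS classification of use status described earlier in this section can be sketched as a small decision function over the three questions. The routing assumed here (question 2 asked only of current less-than-daily users, question 3 asked only of those not currently using) is an interpretation of the skip-pattern note, and the function and constant names are hypothetical rather than part of GATS.

```python
from typing import Optional

DAILY = "daily"
LESS_THAN_DAILY = "less than daily"
NOT_AT_ALL = "not at all"


def classify_use_status(
    current_use: str,
    daily_in_past: Optional[bool] = None,
    past_use: Optional[str] = None,
) -> str:
    """Map answers to the three questions onto the six use-status categories.

    current_use: answer to question 1 (DAILY, LESS_THAN_DAILY or NOT_AT_ALL).
    daily_in_past: answer to question 2, assumed to be asked only of
        current less-than-daily users.
    past_use: answer to question 3, assumed to be asked only of respondents
        who do not currently use.
    """
    if current_use == DAILY:
        return "1) current daily use"
    if current_use == LESS_THAN_DAILY:
        if daily_in_past is None:
            raise ValueError("question 2 is required for less-than-daily users")
        if daily_in_past:
            return "2) current non-daily use - formerly daily"
        return "3) current use - never daily"
    if current_use == NOT_AT_ALL:
        if past_use == DAILY:
            return "4) former daily use"
        if past_use == LESS_THAN_DAILY:
            return "5) former use - never daily"
        if past_use == NOT_AT_ALL:
            return "6) never used"
        raise ValueError("question 3 is required for current non-users")
    raise ValueError(f"unexpected answer to question 1: {current_use!r}")


# Hypothetical respondents
status_a = classify_use_status(LESS_THAN_DAILY, daily_in_past=True)
status_b = classify_use_status(NOT_AT_ALL, past_use=LESS_THAN_DAILY)
```

Raising an error on a missing follow-up answer, rather than guessing a category, keeps item non-response visible to the analyst.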


Construct Construct IV.b.i on Table 3.1 (Intentionality)

Measure When you made your last quit attempt, when did you choose your quit day? Chose it on the actual
day when you stopped; chose it on the day before you stopped; chose it more than one day before; or
actually decided to quit after having not smoked for some other reason (ITC)
Had you been seriously thinking about quitting in the days before you finally decided to stop, or was it
a spur-of-the-moment decision? I had already been seriously thinking about quitting; it was a spur-of-
the-moment decision (ITC)
Which of the following statements best describes how your most recent quit attempt started? SHOW
SCREEN: I did not plan the quit attempt in advance; I just did it; I planned the quit attempt for later the
same day; I planned the quit attempt the day beforehand; I planned the quit attempt a few days
beforehand; I planned the quit attempt a few weeks beforehand; I planned the quit attempt a few months
beforehand; none of these (other specify) (STS Baseline Questionnaire)

Please circle which applies to each quit attempt. (Choose one response for each quit attempt) I planned
the quit for later the same day or for a date in the future; I planned to quit as soon as I made the decision
(STS Wave 1 & 2 postal questionnaires)

Sources ITC; STS

Validity Face validity. Unplanned quit attempts were more likely to succeed than planned attempts (Larabie,
2005; West & Sohal, 2006).
Variation Items are adaptable for assessments of other tobacco products.

ITC: International Tobacco Control Policy Evaluation Survey

STS: Smoking Toolkit Study

Table 3.16 Quit Attempts: Intentionality

Construct Construct IV.b.ii on Table 3.1 (Dose management)

Measure On your most recent quit attempt, did you stop smoking suddenly or did you gradually cut down on the
number of cigarettes you smoked? Stopped suddenly; cut down gradually (ITC)

Did you cut down gradually by delaying the first cigarette you had each day for longer and longer, or
just by trying to smoke less and less? By delaying the first cigarette of the day; by trying to smoke less
and less; both (ITC)

Did you cut down the amount you smoked before trying to stop completely? (Choose one response
for each quit attempt) Cut down first; stopped without cutting down; cannot remember (STS)

Sources ITC; STS

Validity Face validity. Abstainers were more likely to stop without cutting down than were relapsers, who were
more likely to quit using gradual reduction (Fiore et al., 1990; Gritz et al., 1999).

Variation Items are adaptable for assessments of other tobacco products.

ITC: International Tobacco Control Policy Evaluation Survey

STS: Smoking Toolkit Study


Construct Construct IV.c. on Table 3.1 (Maintenance of abstinence)

Measure How long ago did you stop smoking? I have never smoked cigarettes; I have not stopped smoking;
1-3 months; 4-11 months; 1 year; 2 years; 3 years or longer (GYTS)

When was the last time you smoked a cigarette, even one or two puffs? I have never smoked a
cigarette; today; not today, but some time during the past week; not in the past week, but some time in
the past month; 2-3 months ago; 4-6 months ago; 7-12 months ago; 1 to 4 years ago; 5 or more years

How long has it been since you last smoked regularly? (GATS)


Have you made any attempts to stop smoking since we last spoke with you in [month of last interview]? IF YES:
Are you back smoking or are you still stopped? IF BACK SMOKING: What is the longest time that you
stayed smoke free since [month of last interview]? IF STILL STOPPED: When did you quit? (ITC)
ALTERNATIVE METHOD: Have you made any attempts to stop smoking since we last spoke with you
in [month of last interview]? IF YES: The last time we spoke with you in [month of last interview] you
said that you smoked [daily/less than daily but at least once a week/less than once a week but at least
once a month]. Do you still smoke [daily/less than daily but at least once a week/less than once a week
but at least once a month]?
once a week, or less than once a week, but at least once a month?
daily or are you smoking less than once a week, but at least once a month?

daily or less than daily, but at least once a week?

The last time we spoke with you in [month of last interview] you had quit smoking. Are you back smoking or are you still
stopped? IF BACK SMOKING: What is the longest time that you stayed smoke free since [month of
last interview]? IF STILL STOPPED: So you have quit smoking since [quit date reported previously]
is that correct? IF NO: When did you quit? (ITC)

How long ago did you stop smoking daily? (STEPS)


Validity Evidence of utility. Self-reports of having quit are reasonably valid when adequate privacy is afforded
and demand for abstinence is not high (Velicer et al., 1992).

Variation Items are adaptable for assessments of other tobacco products.

Comments ITC items are specifically crafted to assess change in a cohort study.

Definitions A former user is someone who has used more than the threshold level of established use and who no
longer uses. Sustained former use occurs when a former user has been abstinent for at least 12 months
(6 to 12 months, Starr et al., 2005; 12 months, Giovino & Borland, personal communication).

GYTS: Global Youth Tobacco Survey

GATS: Global Adult Tobacco Survey
ITC: International Tobacco Control Policy Evaluation Survey
STEPS: STEPwise Approach to Chronic Disease Factor Surveillance


Construct | Numerator | Denominator | Comments

Initiation of Use

Ever use | Number of ever users | Total number of respondents | A similar construct could be assessed for ever daily use.

Early initiation | Number of ever users who tried using before a given age | Number of ever users | GYTS uses 10 years old as cut-off. A similar construct could be measured for initiation of daily use before a given age.

Transition to established use | Number of current daily users | Number of ever users | Indicates probability of transition to and maintenance of more established use. (See Johnston, 2002 for other indicators of transition)

Discontinuance | Number of former triers | Number of ever users | A similar construct could be assessed for former experimenters.

Maintenance of Use

Current use | Number of current users | Total number of respondents | Various measures include current smoking, current smokeless tobacco use, current tobacco use, and current use of individual products. Similar constructs could be assessed for current daily use.

Frequency of use | Number of daily users | Number of current users | An inverse construct would define the percentage of current users who do not use on a daily basis. Some surveys describe frequent use as use on > 20 of the previous 30 days.

Intensity of use | Number of current users who use more than a given amount | Number of current users | Cut-offs should be standardised to permit comparisons. For example, for adult cigarette smokers, use of > 15 cigarettes/day could serve as a measure of heavy smoking. Mean numbers can also be presented.

Brand use | Number of current users who use a given brand | Number of current users | Variants could involve descriptors of roll-your-own cigarettes, Western versus domestic brands, and sub-brand characteristics as appropriate to a given nation (e.g. light/mild, menthol)

Purchase location | Number of current users who purchase in a given location | Number of current users | For adults, type of venue could indicate tax avoidance strategies. For youth, source of tobacco could indicate efforts

Table 3.19 Suggested Prevalence Indicators of Tobacco Use Behaviours


Construct | Numerator | Denominator | Comments

Cessation of Use

Former use among ever users | Number of former users | Number of ever users | Often called the "quit ratio" or "prevalence of cessation", this is a crude measure of quitting (Pierce et al., 1989; US Department of Health and Human Services, 1989, 1990).

Sustained abstinence | Number of former users abstinent for > 6 | Number of ever users | Relapse is less likely after being abstinent for > 12 months.

Making a quit attempt | Number of current users who tried to quit during the previous 12 months plus the number of former users abstinent for < 12 months | Number of current users plus the number of former users abstinent for < 12 months | Making a quit attempt is a dependent variable in many policy analyses

Former use for > 1 month among anyone who used during the previous 12 months and made a quit attempt | Number of former users abstinent for 1-12 months | Number of current users who tried to quit during the previous 12 months plus the number of former users abstinent for 1-12 months | Indicates > 1 month of abstinence among those who tried to quit during the previous 12 months. People abstinent for < 1 month would not be included in this analysis (Centers for Disease Control and Prevention, 1993)

Notes: The numbers in the numerator and denominator could be either the actual number of respondents in the survey or the weighted population
estimate. Also, fractions would be multiplied by 100 to obtain percentages.

3.2 General mediators and moderators of tobacco use behaviours

Introduction

Presented in this section are a core set of general mediator and moderator variables that should be
considered when evaluating tobacco control programmes and policies. A brief description and assessment
of several standard measures for assessing these constructs are provided as well. Mediators are variables
situated on the causal pathway between a policy and its public health impact (i.e. variables that are
affected by policies and that, in turn, influence health or behavioural outcomes). For instance, motivation
to quit may increase after an anti-tobacco information campaign, and motivation in turn predicts whether
smokers will quit. Moderators are factors not directly affected by the specific policy under scrutiny, but
that moderate the effect of that policy. For example, an information campaign may be effective among one
age group while being ineffective in another (Figure 3.2). Analyzing mediators sheds light on how policies
and interventions have an impact; analyzing moderators aids in understanding under what conditions and
in which groups they work, or do not work. In the context of policy evaluation, nothing is as practical as
a good theory that explains what to measure, how to interpret the results, what course of action to take
based on these results, and what consequences can be expected from these actions.

To establish a list of these mediators and moderators, the Working Group (WG) drew on relevant
behaviour theories (Conner & Norman, 1996) including the Social Cognitive Theory (Bandura, 1986), the
Health Belief Model (Janz & Becker, 1984), the Transtheoretical Model of Change (Prochaska et al., 1992),
the Protection Motivation Theory (Rogers, 1975), the Theory of Planned Behavior (Ajzen, 1991), and the
Prime Theory (West & Hardy, 2006). In particular, readers are referred to the theoretical framework of the
International Tobacco Control Policy Evaluation Survey (ITC), which was developed specifically for the
evaluation of the WHO Framework Convention on Tobacco Control (FCTC), and within which surveys can
be developed and interpreted (Fong et al., 2006a; Thompson et al., 2006). A comprehensive list of all the
psychosocial determinants of smoking behaviour would result in a long questionnaire in the context of
policy evaluation. Therefore, the WG established a short list of the variables considered to be the most
relevant and useful for the evaluation of tobacco control policies and interventions in general.
Researchers can complement this list by adding other relevant measures, depending on the aim and
cultural context of each study, and the specific interventions under evaluation.

Guiding principles in the establishment of this list were the usefulness of each measure, its influence in
the published literature, and the availability of associated validation studies (which were not always
available). Some measures for which no psychometric tests of validity were available were nevertheless
included because of their face validity and lack of alternative validated measures. Efficiency was also an
important criterion of selection: the WG chose instruments that were both brief and informative, excluding
long instruments, even if they were widely used. When several comparable scales were available, the most
influential one was chosen, based on the number of citations to the original articles describing these
scales (Bakkalbasi et al., 2006).

The psychological determinants of tobacco use and cessation range


[Figure 3.2: flow diagram. Causal chain: Policy; Policy specific; Psychosocial mediators; Policy relevant
outcomes (in particular, tobacco use); Public health and economic impact. Additional box: Sociodemographic
characteristics; Mental health; Alcohol and substance use. Psychosocial mediators listed: Knowledge;
Beliefs about risks, costs and; Self-exempting beliefs, justifications, regret; Attitudes towards smoking;
Functional utility of smoking; Anti tobacco industry attitudes; Concerns about SHS; Smoking
susceptibility; Intention to quit, quit date; Recent quit attempts and duration; Social influences,
perceived social norms.]

Figure 3.2 The role of psychosocial variables in the causal chain between policy and public health impact


from cognitive, motivational, and emotional variables to personality traits, personal life events, and psychopathology variables. It is important to note that many quit attempts are not planned (Larabie, 2005), that the triggers of relapse are often quite contextual, and that the timely response of the subject in each specific situation is decisive (West & Hardy, 2006). Thus, ideally, measurements should be both timely and contextual, which is not always feasible. Therefore, the WG excluded the assessment of temporary states of mind (e.g. the euphoria caused by an alcoholic drink) that are good proximal predictors of relapse, because their assessment requires specific techniques (ecological momentary assessments) that are not easily implemented in the context of policy evaluation (Shiffman et al., 2002).

Smoking prevalence is much higher in psychiatric patients than in the general population, and on average, smokers with psychiatric disorders are more dependent on tobacco than other smokers (Breslau, 1995). There is also a concern that, in countries where smoking prevalence declines, an increasing proportion of the remaining smokers have psychiatric disorders (Lasser et al., 2000). Thus, an assessment of mental health is relevant to the study of smoking behaviour. In addition, it is suggested that alcohol use and abuse be assessed, as both are strongly associated with tobacco use. Depending on the context, evaluators can also assess illicit drug use, for instance by using the WHO ASSIST questionnaire (WHO ASSIST Working Group, 2002; Newcombe et al., 2005).

The set of general mediators and moderators considered in this section was derived from theory, published research, and the WG's subjective assessment of what is relevant for policy evaluation. This list (Table 3.20), though not comprehensive, is believed to represent a core set of measures useful in explaining how policies and interventions work, in which population subgroups they work, and how to improve them.

Items and scales used to assess the psychological determinants of smoking

Mediators

Cognitive variables

Perceived risk and outcome expectancies

For many quitters, smoking cessation is preceded by a change in beliefs about the costs and benefits of smoking and of quitting (Etter et al., 2000a). These beliefs are often the target of prevention interventions, and it is therefore important to include them in programme evaluations. Assessing personalized beliefs that the respondent has about himself or herself is suggested, rather than general awareness, since personalized beliefs are stronger predictors of behaviour. Three questions are proposed to assess a respondent's perceived risk of disease: "How would you compare your chance of getting lung cancer compared to the chance of a nonsmoker?", "Do you worry that smoking will damage your health?", and "How much do you think you would benefit from quitting smoking?" (Table 3.21). Additional specific beliefs are covered in other sections of this Handbook.

Validity: For the question on worrying that smoking will damage the smoker's health, the test-retest intraclass correlation, assessed eight months apart in daily smokers with no quit attempts, was r=0.59 (Yan, 2007). In an analysis of daily smokers in the ITC surveys, this question predicted whether participants made a quit attempt (very worried versus not at all worried, odds ratio (OR) = 3.24 for quit attempts, 95% confidence interval (CI): 2.67-3.94) (Thompson et al., 2006; Yan, 2007). For the question on the benefits of quitting smoking, the test-retest intraclass correlation was r=0.54, for assessments made eight months apart in daily smokers with no quit attempts (Yan, 2007). In an analysis of daily smokers in the ITC surveys, the question on the benefits of quitting predicted smoking cessation after eight months (extremely versus not at all, OR = 2.11, 95% CI: 1.23-3.60) (Yan, 2007). These questions therefore have some evidence of validity.
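The validity paragraphs in this section report odds ratios (OR) with 95% confidence intervals (CI). The Handbook does not state how these estimates were computed, and they may well come from adjusted regression models. As a rough illustration only, the sketch below shows the generic unadjusted calculation from a 2x2 table with a Wald interval on the log-odds scale. The function name and the cell counts are hypothetical and are not taken from the ITC data.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: group 1 with the outcome      b: group 1 without the outcome
    c: group 2 with the outcome      d: group 2 without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: "very worried" versus "not at all worried" smokers,
# cross-classified by whether they made a quit attempt.
or_value, ci_low, ci_high = odds_ratio_wald_ci(120, 80, 60, 140)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to the usual reading that the item predicts the outcome at the 5% level, which is how the intervals quoted in this section can be interpreted.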


I. Mediators

a. Cognitive variables:
Beliefs about the risks, costs, and benefits of smoking and of quitting
Self-exempting beliefs, justifications, regret
Attitudes towards smoking, functional utility of smoking
Anti-tobacco industry attitudes
Concerns about exposing others to secondhand smoke

b. Motivational variables:
Smoking susceptibility (adolescents)
Intention to quit and quit date
Recent quit attempts and duration of the last quit attempt

c. Self-efficacy

d. Social influences, perceived social norms

II. Moderators

a. Sociodemographic characteristics:
Socioeconomic status (education, income, occupation)
Ethnicity, primary language, minority group status
Family structure, peer and family smoking
Country of residence and language of the interview (recorded by the interviewer)

b. Personality

c. Mental health:
WHO-5 Well-Being Index
2-item screening for current symptoms of depression

d. Alcohol use and abuse:

Alcohol Use Disorders Identification Test (AUDIT-C)

Table 3.21 Measures of the Psychosocial Determinants of Smoking


Self-exempting beliefs, justifications, and regret

Smokers continue to smoke, and nonsmokers start to smoke even though they are aware of the risks of smoking, in part because of self-exempting beliefs and other justifications (Chapman et al., 1993; Weinstein, 1999). Quitting smoking may require shedding such beliefs and accepting information about the risks of smoking. The WG suggests including one question derived from the ITC survey, on whether people think that the medical evidence that smoking is harmful is exaggerated (Table 3.21).

Validity: In daily smokers in the ITC survey, the test-retest reliability on the question "the medical evidence... is exaggerated" was 0.64 (Yan, 2007). This question predicted smoking cessation after eight months (strongly disagree versus strongly agree, OR = 2.23, 95% CI: 1.17-4.23) (Yan, 2007). This question has some evidence of validity.

Regret

Many smokers express regret that they ever started to smoke. The WG suggests including one question on whether the respondent would start smoking, if they had to do it over again.

Validity: In daily smokers in the ITC survey, the test-retest correlation for this question was 0.62 (Yan, 2007). Smokers who strongly disagreed with this statement were less likely to make a quit attempt in the next eight months than those who strongly agreed (OR = 0.42, 95% CI: 0.24-0.75), but they were as likely to quit smoking (Yan, 2007). This question may nevertheless be retained because of its face validity.

Attitudes towards smoking

Attitudes are defined as the degree to which people have a favorable or unfavorable evaluation of smoking (Ajzen, 1991). Among the main drawbacks of smoking, as reported by smokers themselves, are the health risks, the financial costs, the bad smell, and the fact that secondhand smoke (SHS) bothers other people (Etter et al., 2000a). Among the most frequently cited advantages of smoking are the pleasure of smoking, its relaxing effects, and the relief of withdrawal symptoms (Etter et al., 2000a). These elements are captured by several scales, for instance the Attitudes Towards Smoking Scale (ATS-18) (Etter et al., 2000a); using a few items from this scale is recommended.

Validity: The ATS-18 has a robust factor structure across various samples, and test-retest correlations were high (in the range of 0.8 to 0.9) (Etter & Perneger, 1999; Etter et al., 2000a; Christie & Etter, 2005). The hypothesized association between attitudes and intention to quit has been reproduced in several studies (Etter & Perneger, 1999; Etter et al., 2000a; Christie & Etter, 2005), and a differential score (advantages minus drawbacks) prospectively predicted both smoking cessation in current smokers and relapse in former smokers, with differences between smokers and quitters ranging from 0.5 to 1.4 standard deviation units of this scale (Etter et al., 2000a). This scale can therefore be considered to have adequate validity (Table 3.21).

Functional utility of smoking

Many smokers use cigarettes to control their weight or as a response to stress, even though tobacco withdrawal itself is a strong stressor. Two questions from the ITC survey, whether smoking helps smokers control their weight, and whether smoking calms them down when they are stressed or upset, should be included.

Validity: In a prospective sample of 272 current and former smokers, the item "smoking calms me down when I am stressed or upset" had a test-retest correlation of 0.8, and the item predicted relapse in ex-smokers (difference between abstainers and relapsers, 2.3 standard deviation units, p<0.001) (Etter et al., 2000a). This item can therefore be considered to have adequate validity.

For the question on whether smoking helps smokers control their weight, the test-retest reliability (eight months apart) in smokers in the ITC survey was r=0.74 (Yan, 2007). In the same sample, this question predicted smoking cessation after eight
months (strongly disagree versus strongly agree, OR = 1.39, 95% CI: 1.06-1.82) (Yan, 2007). Therefore, this question has some evidence of validity.

Anti-tobacco industry attitudes

Criticism of tobacco companies is a strategy sometimes used in prevention campaigns. Good campaigns can modify attitudes towards these companies, which in turn may lower the risk of youth smoking initiation (Sly et al., 2001a). Assessing anti-industry attitudes is therefore relevant in the context of programme evaluation. Two suggested items derived from the ITC surveys, whether tobacco companies can be trusted to tell the truth about the dangers of their products, and whether they have tried to convince the public that there is no health risk from SHS, should be included.

Validity: For the question on whether the industry tells the truth, the test-retest reliability in smokers in the ITC survey was r=0.59 (eight months apart) (Yan, 2007). For the question on whether the industry tried to convince the public that SHS carries no risk, the test-retest reliability was 0.45 (Yan, 2007). The figures are lower than usually recommended (Nunnally & Bernstein, 1994), but eight months may have been too long an interval to assess test-retest reliability for opinion items. In an analysis of daily smokers in the ITC surveys, the question on whether the tobacco industry can be trusted to tell the truth predicted smoking cessation after eight months (neither agree nor disagree versus strongly agree, OR = 0.65, 95% CI: 0.43-0.97). The question on whether the industry tried to convince the public that SHS carries no risk also predicted smoking cessation (disagree versus strongly agree, OR = 0.76, 95% CI: 0.61-0.93) (Yan, 2007). These questions have adequate evidence of validity.

Concerns about exposing others to secondhand smoke (SHS)

Decreasing exposure to secondhand smoke (SHS) is a priority of the FCTC. Policies targeting SHS may affect smokers' concerns about exposing others to it, which justifies including this topic. Two suggested questions are whether smokers think that their smoke is dangerous to those around them, and whether smokers think about the harm their smoking might be doing to other people.

Validity: In the ITC surveys, the test-retest correlation for the item "your cigarette smoke is dangerous to those around you", assessed eight months apart in daily smokers with no quit attempts, was moderate (r=0.47) (Yan, 2007). However, in an analysis of daily smokers, this question predicted smoking cessation after eight months (strongly agree versus strongly disagree, OR = 2.59, 95% CI: 1.03-6.46) (Yan, 2007). The test-retest correlation for the item on the harm done to other people, assessed eight months apart in daily smokers with no quit attempts, was also moderate (r=0.50). However, in an analysis of daily smokers, this question predicted smoking cessation after eight months (often or very often versus never, OR = 1.37, 95% CI: 1.16-1.62) (Yan, 2007). Therefore, these questions have some evidence of validity.

Motivational variables

Smoking susceptibility (adolescents)

To assess the susceptibility of taking up smoking, Pierce's Smoking Susceptibility Scale, a brief, three-item, and widely cited measure intended for adolescents, is suggested (Pierce et al., 1996).

Validity: Pierce's Smoking Susceptibility Scale has good predictive validity: in young never smokers, 6.5% of those with susceptibility ratings=0 had taken up smoking four years later, compared with 20.6% of those with ratings=3 (Pierce et al., 1996). This scale can therefore be considered to have adequate validity, and the research papers describing it are widely cited (Pierce et al., 1996; Choi et al., 2001; Pierce et al., 2005).

Intention to quit smoking

Intention to quit is a key predictor of smoking abstinence, as well as a key variable that policies and interventions intend to modify. Several approaches have been used to assess intention or
motivation to quit (Prochaska et al., 1992; Sciamanna et al., 2000). In particular, the concept of stages of change has been widely used. It proposes that people gradually progress towards smoking cessation through a series of stages, defined in particular by the level of motivation to quit (Prochaska et al., 1992). Indeed, the two most widely cited papers in the smoking and tobacco literature, as ranked in the report by Byrne and Chapman (2005), describe the stages of change theory (Prochaska et al., 1992, 1994). However, this theory has been criticized on the grounds that it does not accurately reflect reality, and that interventions based on it are no more effective than other interventions (West, 2005a). Furthermore, in the case of smokers unmotivated to quit (precontemplators), the stage of change theory recommends prescribing interventions of doubtful efficacy (e.g. information on health risks) instead of effective treatments of dependence. This may be counterproductive if, for instance, the lack of motivation is due to the severity of dependence and to the intensity of withdrawal symptoms (West, 2005a). In addition, the stage of change is presented as a single variable describing behaviour change, when in fact it is a haphazard mix of four different elements (smoking status, intention to quit, past quit attempts, and duration of abstinence). Because this theory is so controversial, it should be used with caution, and reliance should instead be placed on more face valid measures of each of the four components of stages separately. Smoking status and quit attempts are discussed in Section 3.1. Intentions may fluctuate even in short intervals of time (Hughes et al., 2005). Therefore, it may be preferable to ask about immediate plans to stop, since reports of plans beyond the short term may lack validity. A single question can be used on whether smokers are seriously thinking of quitting (No; Yes, but I have not decided when; Yes, I plan to quit within the next 30 days) (Table 3.21).

Validity: In daily smokers in the ITC survey, those who were not planning to quit were much less likely to have quit eight months later than those who planned to quit in the next month (OR = 0.16, 95% CI: 0.11-0.23) (Yan, 2007).

Quit date

Setting a quit date and sticking to it is a strategy recommended to smokers in major guidelines (Fiore et al., 2000). A question on the planned quit date could be asked of those who plan to quit in the next 30 days (Table 3.21).

Validity: In daily smokers in the ITC survey with no quit attempts between the two assessments eight months apart, the test-retest reliability of the question on whether smokers willing to quit had set a quit date was low (r=0.43) (Yan, 2007). In addition, having set a quit date was not a significant predictor of cessation after eight months (no versus yes, OR = 0.75, 95% CI: 0.47-1.17) (Yan, 2007). This question can nevertheless be retained because of its face validity and usefulness, and because eight months may have been too long an interval for analyses exploring this construct.

Previous quit attempts: Quit attempts may be affected by policy interventions, and are therefore a relevant measure for policy evaluation. Having recently made a quit attempt predicts future cessation, and the duration of the longest time off smoking is a particularly good predictor of future cessation (Ferguson et al., 2003; Hyland et al., 2006). It is worthwhile to ask smokers about the occurrence and duration of recent quit attempts.

Self-efficacy

Self-efficacy is the confidence in one's ability to stop smoking or to abstain from smoking in relapse situations (e.g. when having a drink with smokers) (Bandura, 1986). Self-efficacy predicts cessation in current smokers (Etter et al., 2000b) and relapse to smoking in former smokers (Gulliver et al., 1995). There are several multi-item scales measuring self-efficacy across various relapse situations that have satisfactory validation data, in particular, predictive validity (De Vries et al., 1988; Velicer et al., 1990; Etter et al., 2000b). However, these scales are too long for the purpose of policy evaluation, and single-item measures may be preferable. A single-item measure of self-efficacy derived from the ITC
survey that asks whether respondents are sure that they would succeed if they tried to quit, is suggested (Table 3.21).

Validity: The test-retest intraclass correlation for this self-efficacy item, assessed eight months apart in daily smokers with no quit attempts, was moderate (r=0.51) (Yan, 2007). However, in an analysis of daily smokers in the ITC surveys, this question predicted smoking cessation after eight months (extremely sure versus not at all sure, OR = 2.46, 95% CI: 1.68-3.59) (Yan, 2007). Therefore, this question has adequate evidence of validity.

Social influences, perceived social norms

Social influences are crucial in an adolescent's decision to take up smoking (De Vries et al., 1995). In many countries, social pressures also make it less acceptable for adults to smoke (Albers et al., 2004). Including three questions derived from the ITC survey to assess social influences is recommended. These questions cover whether others who are important to the respondent believe that they should not smoke, whether the respondent feels that there are fewer places where they feel comfortable smoking, and the respondent's perception that society disapproves of smoking.

Validity: The test-retest intraclass correlation for these three items, assessed eight months apart in daily smokers, was moderate (r=0.42, r=0.40, and r=0.33, respectively), but eight months may be too long an interval to assess test-retest reliability of opinion questions. In an analysis of daily smokers in the ITC surveys, answers to the first two questions ("people believe..." and "fewer places...") were not predictive of smoking cessation after eight months (Yan, 2007). However, people who agreed with "society disapproves of smoking" were more likely to have quit eight months later than people who disagreed with this statement (OR = 1.34, 95% CI: 1.01-1.78) (Yan, 2007). In spite of their mixed performance on validation tests, these questions can be included because of their face validity and utility.

Moderators

Sociodemographic characteristics

Sociodemographic characteristics are strong determinants of smoking behaviour (Townsend et al., 1994). Relevant variables include: age, sex, marital status and social support, socioeconomic status (education, income, occupation), ethnicity, primary language, minority group status, religion, family structure, peer and family smoking, country of residence and language of the interview (recorded by interviewer).

The most appropriate questions to assess sociodemographic characteristics vary between countries (e.g. for ethnicity, minority group status, education, etc.). Using either census questions in each country or standard questions from the World Bank surveys would be recommended (Grosh & Glewwe, 1998).

Other smokers in the household, friends who smoke

Workplace and home smoking restrictions are important policy outcomes, and in turn, they are relevant determinants of smoking behaviour. The presence of other smokers in the household decreases the chances of quitting smoking (Hymowitz et al., 1997), and increases the risk of smoking initiation in nonsmokers (Conrad et al., 1992; O'Loughlin et al., 1998; Tyas & Pederson, 1998). To assess this, it is recommended that questions about how many people in the household are smokers, and how many of the respondent's five best friends are smokers, be used (Table 3.21).

Validity: In the ITC survey, the test-retest intraclass correlation for the item on how many of their five best friends smoke, assessed eight months apart in daily smokers, was r=0.64 (Yan, 2007). In an analysis of daily smokers, this question predicted smoking cessation after eight months (four friends versus 0 friends, OR = 0.63, 95% CI: 0.43-0.92) (Yan, 2007). Therefore, this question has adequate evidence of validity.

Peer and family smoking (5 items), adolescents only

Peer and family smoking predicts smoking initiation in adolescents
(Conrad et al., 1992; O'Loughlin et al., 1998; Tyas & Pederson, 1998). A useful 5-item scale has been developed to assess the smoking status of family members and best friends (Pierce et al., 1998c). This widely cited scale is intended for adolescents aged 12-17, and can be administered over the phone (Table 3.21).

Validity: Peer and family smoking were not strong predictors of susceptibility to smoke (Pierce et al., 1998c) (OR = 1.19, non-significant). Nevertheless, this scale can be used, as several other studies have shown the importance of peer and family smoking (Conrad et al., 1992; O'Loughlin et al., 1998; Tyas & Pederson, 1998). Also, because this scale is widely used (cited by at least 227 articles), it enables comparison between samples.

Personality

Personality traits affect smoking behaviour. For instance, a heritable tendency for sensation seeking or for novelty seeking predicts smoking behaviour (Zuckerman et al., 1990; Pomerleau et al., 1992; Etter et al., 2003a). Most personality questionnaires are too long to be used in policy evaluation surveys (Cloninger et al., 1993; Barrett et al., 1998); however, depending on the research goals, short versions of some personality questionnaires, such as for sensation seeking, have been validated and could be considered for inclusion (Hoyle et al., 2002; Stephenson et al., 2003).

Mental health

Smoking behaviour is strongly associated with mental health, including depression (Glassman et al., 1990), which justifies the inclusion of a brief assessment of mental health in surveys of the general population. Among brief assessments suitable for general population surveys, evaluators can choose, according to their specific needs, between the WHO-5 Well-Being Index, which is a measure of mental well-being (Bonsignore et al., 2001), and a 2-item screening test for depression (Whooley et al., 1997). Mental health patients are often hard to reach and may not take part in population surveys. Because particular attention should be paid to this group, population surveys should be supplemented with specific surveys of mental health patients.

WHO-5 Well-Being Index (WHO-5)

Being a WHO product, the 5-item WHO-5 Well-Being Index (WHO-5) enables its users to compare their results with other WHO surveys (Table 3.21) (Bonsignore et al., 2001).

Validity: Using the Composite International Diagnostic Interview (CIDI) as the measure, WHO-5 had a sensitivity of 93% and a specificity of 64% to detect depression in primary care patients (Henkel et al., 2003). WHO-5 performed better than a clinical diagnosis to detect depression, using CIDI as the criterion (Henkel et al., 2004a), and can therefore be considered to have adequate validity.

A 2-item screening test for depression

A second way to assess depression in population surveys is to use a brief screening test, for instance, a widely cited 2-item test (Whooley et al., 1997). This test screens specifically for depression, whereas WHO-5 monitors a broader index of mental health. Another possibility is to use Kessler's K-6 scale (a 6-item measure of psychological distress) (Kessler et al., 2002). Finally, a question on whether the respondent has ever been diagnosed or treated for depression could also be included.

Validity: In patients without substance abuse, Whooley's 2-item test had a sensitivity of 96%, a specificity of 66%, and an area under the Receiver Operating Characteristic (ROC) curve of 0.84, using the Diagnostic Interview Schedule (DIS-II-R) as the criterion (Whooley et al., 1997). The sensitivity of this 2-item scale was better than for the Center for Epidemiologic Studies-Depression scale (CES-D short) (84%) and for the Beck Depression Inventory (BDI short) (87%), and its specificity was similar or somewhat lower (CES-D short=75%, BDI short=67%) (Whooley et al., 1997). In another study conducted in primary care patients, this 2-item test had a similar area under the ROC curve (0.859) compared with WHO-5 (0.862), and a comparable sensitivity (92% versus 93% for WHO-5) and specificity (59% versus 64% for WHO-5), using CIDI as the criterion (Henkel et al., 2004b). Whooley's 2-item screening test can therefore be considered to have adequate validity.

Alcohol use and abuse: Alcohol Use Disorders Identification Test (AUDIT-C)

Alcohol use and abuse are strongly associated with tobacco use, and, in former smokers, with relapse (Hymowitz et al., 1991). This justifies the inclusion of a well-validated and widely cited test of alcohol use and abuse: the 3-item Alcohol Use Disorders Identification Test (AUDIT-C) (Table 3.21) (Bush et al., 1998; Reinert & Allen, 2002; Rumpf et al., 2002).

Validity: The brief, 3-item version (AUDIT-C) performs as well as the full version of AUDIT to detect at-risk drinkers (Bush et al., 1998; Reinert & Allen, 2002; Rumpf et al., 2002). AUDIT-C has good sensitivity (54% to 98%) and specificity (57% to 93%) for various definitions of heavy drinking. AUDIT-C can therefore be considered to have adequate validity.

Discussion

An assessment of the psychosocial determinants of smoking is essential to understand how policies and interventions produce their effects, and how to improve them. Evaluation studies that neglect these elements lose an opportunity to help the field progress towards more effective and acceptable interventions. Importantly, analyzing psychosocial factors is also an issue of social inequalities. Some interventions may have adverse effects in a number of subgroups, and interventions targeted at the general population may not reach several subgroups in which smoking prevalence is particularly high (e.g. mental health patients, some minorities).

The issue of translation and cultural adaptation of the measures described in this section is addressed elsewhere in this Handbook (Section 2.2). Depending on the construct under scrutiny, even well-translated questions may not be relevant, or may not be understood in a culture distant from where the instrument was initially developed (Beaton et al., 2000). Many of the measures discussed here were developed in high-income, English-speaking countries, and there are very few data on their relevance or psychometric properties in other cultures.

Establishing a list of the psychosocial determinants of smoking is an impractical task that inevitably results in a list that is too long for some purposes, and too short for others. Such a list is potentially endless. The WG selected a core set of measures with general relevance for the evaluation of tobacco control programmes and policies. Their choice was based on influential theories of behaviour change, and in particular on a model derived from these theories: the conceptual framework of the ITC project (Fong et al., 2006a; Thompson et al., 2006). This model was developed specifically for the evaluation of the FCTC, and it is therefore relevant for the purpose of this Handbook. The WG also included some elements believed to be important, such as mental health and substance use. Whenever possible, validated measures were included (psychometric validation studies were not always available). Some measures that were not well validated were nevertheless included because of their usefulness and face validity. The WG's selection was also based on a subjective assessment of what is useful and important. Thus, this list should be supplemented by other elements according to the specific needs of each study and country, and take into account new contributions to theory (West & Hardy, 2006).

Even though this list is not comprehensive, the WG believes that it represents a core set of measures that are useful in analyzing how policies and interventions work, in which population groups they work, and why some interventions do not work. Progress in this field is possible only if thorough evaluations enlighten the path.

Summary and recommendations

This section describes mediators and moderators theorized to be important in understanding how policies and interventions affect tobacco use behaviours, and under what circumstances they
have an impact. A core set of measures likely to be important has been identified. Researchers should select from this list and, when appropriate, supplement it with other relevant measures, depending on the specific context and goals of each study. There are validated measures of many of the reviewed constructs, and researchers should, whenever possible, use them rather than develop their own ad hoc measures. Investigators should report the psychometric properties of their measurement instruments, and at least the test-retest reliability, convergent validity, and/or predictive validity. Psychological measures are particularly sensitive to wording and to cultural context; therefore, the methods for translations and cultural adaptations described in Section 2.2 should be utilised in populations where these measures have not been previously validated.
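The recommendation to report psychometric properties can be illustrated with a minimal sketch of two of the statistics quoted in this section: a test-retest correlation between two survey waves, and the sensitivity and specificity of a screening test against a criterion diagnosis. This is an illustration under stated assumptions, not the method used in the cited studies. The function names and all data are hypothetical, and a plain Pearson correlation is used here for simplicity, whereas several of the figures above are intraclass correlations, which are computed differently.

```python
import math


def pearson_r(wave1, wave2):
    """Pearson correlation between paired answers from two survey waves."""
    n = len(wave1)
    if n != len(wave2) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    mean1 = sum(wave1) / n
    mean2 = sum(wave2) / n
    sxy = sum((x - mean1) * (y - mean2) for x, y in zip(wave1, wave2))
    sxx = sum((x - mean1) ** 2 for x in wave1)
    syy = sum((y - mean2) ** 2 for y in wave2)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one wave has no variance")
    return sxy / math.sqrt(sxx * syy)


def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity of a screening test against a criterion."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical 5-point answers from the same five respondents at two waves.
retest_r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

# Hypothetical counts for 100 criterion-positive and 100 criterion-negative
# respondents, chosen only to mirror the percentages quoted for WHO-5.
sens, spec = sensitivity_specificity(true_pos=93, false_neg=7,
                                     true_neg=64, false_pos=36)

print(f"test-retest r = {retest_r:.2f}")
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Reporting the interval between waves alongside the correlation matters, since this section repeatedly notes that eight months may be too long an interval for opinion items.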


3.3 Measurement of nicotine dependence

Introduction (Shiffman et al., 2003). Also, long- Measures of cigarette-induced

term use of nicotine medications nicotine dependence
In this section, evidence of the has no documented untoward
validity of self-report measures of health effects, so therefore mea- The following section provides a
nicotine/tobacco dependence in surement of dependence to nicotine brief review of data on the
adults is examined. Measures are medications will not be included in measurement properties of seven
concentrated on that are potentially this review. Finally, while depen- self-report measures developed to
appropriate for population-based/ dence on tobacco products is assess the construct of cigarette-
epidemiologic research, as nicotine clearly evident among some youth, induced nicotine dependence: 1)
dependence is often assessed as a research on measures of nicotine Fagerstrm Test for Nicotine
potential moderator of programme dependence in adolescents is Dependence (FTND); 2) Heaviness
and policy effects. The Working limited, and will not be considered in of Smoking Index (HSI); 3) Diag-
Group (WG) has focused mainly on this section. For those interested in nostic and Statistical Manual-IV
scales measuring cigarette depen- a measure of nicotine dependence (DSM-IV) criterion of dependence;
dence, as cigarette smoking among youth, please refer to the 4) International Statistical Classi-
accounts for most of the health paper which describes the mea- fication and Related Health Prob-
damage caused by tobacco, and surement properties of the Hooked lems-10 (ICD-10) criteria; 5) Ci-
because the most widely used and on Nicotine Checklist (DiFranza et garette Dependence Scale (CDS);
best studied scales measure al., 2002b). 6) Nicotine Dependence Syndrome
cigarette dependence. This section Nicotine dependence is a hypo- Scale (NDSS); and 7) Wisconsin
has not attempted to review evi- thetical construct that is designed to Inventory of Smoking Dependence
dence evaluating measures to explain and predict societally- Motives (WISDM).
assess nicotine dependence of important outcomes, such as an Each measure will be evaluated
other types of smoked tobacco inability to quit smoking, heavy use, based on a review of the items that
products (e.g. cigars, pipe tobacco, and other problems occasioned by constitute the scales in terms of
bidis, hookah), although adaptations smoking or tobacco use (Piper et al., their reading level, face validity,
of measures used to assess 2006). Assessing tobacco depen- coverage of the dependence do-
cigarette smoking dependence dence is difficult and is made even main, and cross-cultural applica-
would be reasonable to consider. more so in population-based epi- bility. The WG will review the
The WG did include a review of demiologic research by the need for psychometrics of each scale,
measures of dependence on efficient assessment (valid and brief). including its reliability (e.g. internal
smokeless tobacco products, since Ideally, a measure should reflect the consistency) and factor structure,
the pattern of compulsive use of nature or domain of the construct of and will examine the predictive
these products is similar to that interest (i.e. tobacco dependence), validity of each measure, focusing
observed for cigarette smoking predict important outcomes (e.g. on two specific tobacco depen-
(IARC, 2007b). Persistent use of likelihood of quitting, problems en- dence criteria: a pattern of
nicotine medications has been countered through use), and be pervasive and heavy smoking and
described, but it is very rare relatively brief to assess. the ability to quit smoking.
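
Internal consistency, one of the reliability indices mentioned above, is commonly reported as Cronbach's alpha. The following is a minimal sketch of that calculation, assuming complete item-level responses; the function name and the example data are hypothetical and are not taken from any of the studies reviewed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents.

    responses -- list of rows, one per respondent, each row holding the
                 scores for the same k items (hypothetical input format).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses of five smokers to a three-item scale.
example = [
    [0, 1, 0],
    [1, 1, 1],
    [2, 2, 1],
    [3, 2, 3],
    [3, 3, 3],
]
print(round(cronbach_alpha(example), 2))
```

Because alpha depends on the item and total-score variances in the sample at hand, the same scale can yield different values in different populations.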

Pervasive and heavy smoking could be assessed using self-report measures (e.g. cigarettes smoked per day or lifetime cigarettes smoked), or using biomarkers of exposure (e.g. carbon monoxide (CO), cotinine, puff topography) (see Section 3.1), and the ability to quit smoking could be assessed using a number of strategies as well (see Section 3.1). These criteria reflect the sheer volume of tobacco products consumed and the intransigence of drug use, both of which have significant effects on the health and economics of both the individual and society. Although it is not a validation criterion, the evidence of genetic linkages to the various measures of tobacco dependence will be examined. This information may be helpful for researchers who are interested in using epidemiological measures to make inferences regarding etiology.

It is important to note that other criteria could be used to evaluate the performance of dependence measures. For instance, such measures could be evaluated with respect to prediction of withdrawal severity or other outcomes theoretically linked to dependence (Piper et al., 2006). However, such outcomes seem less relevant than the ones selected for measures to be used in epidemiologic research. For the purposes of epidemiologic research, a measure should reflect or predict outcomes of societal import, such as degree of tobacco exposure and use, the intransigence of use, and the likelihood of important negative outcomes of use. Obviously, a pattern of heavy, pervasive smoking will capture the degree of exposure to nicotine and the harmful constituents of tobacco/cigarettes. Moreover, a relative inability to quit smoking will forecast the likely continued exposure to such elements. Evidence shows that past, current, and future use of tobacco directly predict outcomes of societal import, such as money expended in buying tobacco products and disease outcomes (and associated costs) caused by tobacco use (US Department of Health and Human Services, 2004; Centers for Disease Control and Prevention, 2005).

Overarching issues:

It is important to note that dependence is a construct (i.e. a hypothetical entity). It is not, in theory, equivalent to any single measure or criterion (Piper et al., 2006), although single items can be used to estimate a person's standing on the construct. Thus, dependence is an inferred influence or force that produces the outcomes associated with it (e.g. high rates of smoking, relapse), although it is not the only predictor of such outcomes. Generally it takes multiple variables or items to adequately assess a complex, hypothetical entity such as nicotine dependence (Clark & Watson, 1995). In this section, however, considerable attention is devoted to very brief measures of dependence, as evidence shows that such measures (i.e. number of cigarettes smoked per day) can predict outcomes, such as relapse, as well as longer measures (e.g. DSM-III-R, FTQ, and FTND) (Razavi et al., 1999; Breslau & Johnson, 2000; Dale et al., 2001).

When considering the information presented here, it is important to remember that reliability and validity are not inherent in measures. It cannot be assumed that one can generalize psychometric properties across different use contexts, or that validity for one use of a measure is generalizable to a different use (e.g. predicting relapse likelihood versus withdrawal severity). Rather, these features are estimated based on patterns of statistical covariation and are influenced by the nature of the population being assessed (Nunnally & Bernstein, 1994; McDonald, 1999). For instance, there may be less variance in item scores, or item scores might have a less skewed distribution, when a dependence measure is used in a clinical population rather than a nationally representative population. This could easily affect both reliability and validity estimates. Different populations might yield different psychometric data because of true differences in the severity or range of dependence. However, differences might also arise because of other factors, such as secular or environmental events that might affect scores on dependence measures, while not actually changing the dependence per se. One study showed that US smokers had higher frequencies of severe nicotine dependence
(FTND ≥ 6) than did Spanish smokers (de Leon et al., 2002). It is possible that such population differences reflect different degrees or sources of error across the two populations (restrictions in smoking in the home, the amount of discretionary income, gender differences in smoking across the populations, the ways the smokers answer the questions and, indeed, understand them and so on) rather than differences in the biological/psychological internal processes that make up dependence. There are numerous environmental or social sources of error variance that could differentially affect the validity of a measure across populations: smoking policies in the workplace, taxes, religious or social norms, to list a few.

In recognition of the dependence of psychometric properties on the population being assessed, this section presents reliability and validity data from both clinical trials and epidemiologic studies conducted around the world, as well as data relating to the heritability of dependence as it is assessed using the different measures. The tobacco dependence measures will be divided into two groups: unidimensional and multidimensional. Unidimensional measures are intended to assess dependence as a single dimension (although some, it turns out, may actually be multifactorial). Such measures are useful, because the best of them are fairly efficient in that they possess significant validity given their length/response burden. In fact, as efficient as some of the unidimensional measures are, some data suggest that particular items from these measures possess predictive validities that meet or exceed those of the whole measure (Storr et al., 2005). Such items might be especially valuable for epidemiologic research.

A review of multidimensional measures of nicotine dependence is included despite their length and reduced efficiency, because they have the potential to provide information about the mechanism underlying nicotine dependence not supplied by unidimensional measures. For instance, multidimensional measures are intended to assess particular facets of dependence or dependence processes (e.g. particular motives for drug use). Thus, these measures may provide greater insights into the nature of tobacco dependence than do unidimensional measures. They also may provide greater discrimination amongst smokers/tobacco users to the extent that smokers may be distinguished on the basis of something other than a single intensity dimension (which might be well captured by a single severity dimension). For instance, some scales appear to reflect motives associated with initial versus extensive use of tobacco (Piper et al., 2004), and other scales differ in sensitivity to use patterns of highly dependent users versus "chippers" (those who engage in periodic or light tobacco use) (Shiffman & Sayette, 2005). Since the subscales of multidimensional measures tend to ask about relatively discrete processes (e.g. a taste motive for smoking) rather than global consequences of smoking (e.g. smoking causing problems in life), these multidimensional measures may be more suitable for genetics research, as they may tap processes that reflect a stronger genetic signal (Baker et al., in press). Finally, because multidimensional measures tend to ask about internal and subjective phenomena (e.g. role of affect regulation) rather than externally referenced events (e.g. latency to smoke in the morning, number of cigarettes consumed each day), these measures may be less susceptible to biasing by error due to regional secular or policy influences. Workplace smoking restrictions, for example, might exert a more direct and larger effect on number of cigarettes smoked per day than on the smoker's liking of the taste of cigarettes. On the other hand, multidimensional scales tend to ask about relatively subtle, psychological variables (e.g. asking individuals to attribute smoking urges or affect), and it is possible, indeed probable, that cultures may differ in how they make attributions or label internal phenomena. Of course, while entire multidimensional scales can be quite lengthy, individual items or subscales can be selected for use (Lerman et al., 2006); thus, this section will review relevant subscale data.

The foregoing discussion should make clear that blanket
recommendations cannot be given regarding dependence measures. Rather, the investigator must both weigh practical issues (e.g. response burden) and clearly identify the goals of assessment (e.g. predict probability of relapse) in order to select an appropriate dependence instrument or assessment strategy.

Unidimensional measures of tobacco dependence

Fagerström Test for Nicotine Dependence and the Heaviness of Smoking Index

The first unidimensional measure of tobacco dependence is actually a group of measures arising from the Fagerström Tolerance Questionnaire (FTQ) (Fagerström, 1978): these comprise the FTQ itself, as well as the 6-item Fagerström Test for Nicotine Dependence (FTND) (Heatherton et al., 1991) and the 2-item Heaviness of Smoking Index (HSI) (Kozlowski et al., 1994). See Appendix 1 for the items and scoring. These measures are based on the construct of physical dependence, which includes facets such as the need to smoke early in the morning to alleviate overnight withdrawal, the need to smoke numerous cigarettes per day, and the invariance of smoking behaviour (i.e. smoking even when you are ill) (Fagerström, 1978). The Flesch-Kincaid Reading Grade Level is 4.4 for the FTND and 4.2 for the HSI.

The FTND has been translated and used with population samples in Germany (John et al., 2003a; John et al., 2004a), Switzerland (Etter et al., 1999), Australia (Pergadia et al., 2006a), Canada (Howard et al., 2003), Austria (Lesch et al., 2004), and Brazil, Mexico, Poland, and China (Blackford et al., 2006; Huang et al., 2006). The HSI has also been used in research in Spain (Diaz et al., 2005), Australia, Canada, UK, and the USA (Heatherton et al., 1991; Hymowitz et al., 1997; Hyland et al., 2006). One of the questions on the FTND concerns smoking in forbidden places. The validity of this question may be affected by regional differences in environmental restrictions in smoking (Huang et al., 2006). In addition, two questions in this scale assume a pattern of daily smoking (e.g. questions 1 & 4, the two questions in the HSI). It is very likely that scores on these items will have reduced validity if used with non-daily smokers. An important goal of future research is to identify dependence measures that are appropriate for non-daily smokers.

Reliability and structure: Compared with the FTQ, the FTND has demonstrated better psychometric properties, such as internal consistency (Payne et al., 1994; Pomerleau et al., 1994; Haddock et al., 1999); however, these improved reliability coefficients are still low (Etter, 2005) and below traditionally accepted standards for clinical use (α = 0.80) (Nunnally & Bernstein, 1994). A study using a French translation of the FTND with light smokers found internal consistencies of approximately 0.70 (Etter et al., 1999), while a study with a German population found low internal consistency for the FTND (α = .57) in two separate samples (John et al., 2004b), and a study in China found that the FTQ had low internal consistency as well (α = .58) (Huang et al., 2006).

Some studies have shown that the FTND has a two-factor structure, suggesting that it does not measure a unitary construct of physical dependence (Payne et al., 1994; Etter et al., 1999; Haddock et al., 1999; Radzius et al., 2003; Breteler et al., 2004; John et al., 2004b). A population-based study in France found that while a two-factor model fit the data well, the two factors were highly correlated (Chabrol et al., 2003). Inter-item correlations also reveal that not all items are highly related (r = 0.06-0.39) (Transdisciplinary Tobacco Use Research Center (TTURC) Tobacco Dependence Phenotype Workgroup, 2007). These studies suggest that the two factors reflect morning smoking (i.e. whether one smokes more in the morning and whether one would rather give up the first cigarette of the day or all others), and smoking pattern (i.e. the number of cigarettes smoked per day, time to first cigarette, difficulty refraining from smoking, and smoking when ill), although some data indicate that time to first cigarette loaded on both factors (Radzius et al., 2003). Latent class analyses suggest that the FTND divides smokers into groups based on severity of dependence (Storr et al., 2005); that is, the two factors do not
appear to pick out smokers who differ in terms of types of dependence.

The HSI comprises only two items, which limits the relevance of internal consistency estimates. However, zero-order correlations between the two items in the measure indicate moderate levels of association (e.g. rs 0.30) (TTURC Tobacco Dependence Phenotype Workgroup, 2007).

Validation: The FTND and HSI predict both behavioural and biochemical indices of smoking in Chinese-, English-, French-, and German-speaking populations (e.g. CO, cotinine, lifetime amount smoked) (Heatherton et al., 1989, 1991; Kozlowski et al., 1994; Etter et al., 1999; John et al., 2003a; Huang et al., 2006). This should not be surprising, given that the FTND and HSI directly assess smoking heaviness. However, it is encouraging to note that smokers are indeed able to estimate their amount of smoking as indexed by biochemical tests in response to single items (e.g. Question #4 on the FTND, "How many cigarettes/day do you smoke?"). The FTND has demonstrated an ability to predict cessation outcomes in smoking cessation studies (Campbell et al., 1996; Westman et al., 1997; Alterman et al., 1999; Patten et al., 2001; TTURC Tobacco Dependence Phenotype Workgroup, 2007), and with college students in a population-based study (Sledjeski et al., 2007). In addition, the FTND has been shown to index a heightened risk for psychiatric comorbidities in a large population sample in Germany (John et al., 2005).

Some data indicate that the standard scoring method used with the FTND (adding up item responses) may not produce an optimal scaling of dependence level. Latent class analysis suggested that some items are particularly important to the assessment of dependence level (those that capture variance due to morning smoking) and that they are relatively underweighted in the typical scoring method (Storr et al., 2005). Therefore, investigators using the FTND may wish to explore alternative, empirically-based scoring or cut-score determination methods (e.g. latent class analysis, Receiver Operating Characteristic curves (Swets et al., 2000)).

While the FTND certainly can predict future smoking or likelihood of cessation, the HSI appears to account for much of the predictive validity of that measure (Breslau & Johnson, 2000; Heatherton et al., 1989; TTURC Tobacco Dependence Phenotype Workgroup, 2007). Population-based studies conducted in Australia, Canada, the UK, and the USA found that the two HSI items (number of cigarettes smoked and time to first cigarette in the morning) were the strongest predictors of quitting (Hymowitz et al., 1997; Hyland et al., 2006). Furthermore, recent research has shown that a single item on the FTND and HSI (Item #1, latency to first cigarette in the morning) predicts relapse vulnerability as well as, or better than, much longer multidimensional instruments (TTURC Tobacco Dependence Phenotype Workgroup, 2007). Recent population-based research shows that a single item on the HSI (item #1) is highly effective in predicting the likelihood of future cessation (TTURC Tobacco Dependence Phenotype Workgroup, 2007).

Heritability: In a study of young adult Australian twins, HSI-assessed dependence was found to be highly heritable (71%) (Lessov et al., 2004). In addition, the FTND and HSI were both related to the dopa decarboxylase gene, which is involved in the synthesis of dopamine, norepinephrine, and serotonin (Ma et al., 2005). One haplotype was significantly related to dependence in both African-American and Euro-American smokers, while another was related to dependence only in Euro-American smokers (Ma et al., 2005). Additional studies link FTND-defined dependence to particular genetic variants (Bierut et al., 2007; Gelernter et al., 2007; Saccone et al., 2007).

Summary: The FTND has been widely used in a number of different countries and a number of different languages. It is short and has an accessible reading level. In addition, while there are concerns regarding its structure and reliability, it has been found to predict smoking heaviness and cessation outcome. However, it
appears that the HSI is a more efficient predictor of outcome than is the FTND (using only two items). FTND and HSI scores have also been found to be heritable and related to specific dependence-linked genetic variants.

The Diagnostic and Statistical Manual, International Statistical Classification of Diseases and Related Health Problems, 10th Revision and the Tobacco Dependence Screener

Two different diagnostic systems are commonly used to diagnose tobacco dependence: both are typically considered to be unidimensional measures of tobacco dependence. One is the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition (DSM-IV) (American Psychiatric Association, 1995)¹, which is based on an empirically driven, syndromal medical model, rather than on a theoretical model of dependence (see Appendix 2 for the criteria). The second is the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), an international diagnostic classification system that was endorsed by the 43rd World Health Assembly in May 1990 and came into use by WHO Member States as of 1994 (see Appendix 3 for the criteria (WHO, 1993)).

The Tobacco Dependence Screener (TDS) (Kawakami et al., 1999) is a 10-item, self-report questionnaire designed to assess ICD-10, DSM-III-R (the 1987 revision of DSM-III), and DSM-IV symptoms of dependence; it has a Flesch-Kincaid Reading Grade Level of 8.1 (see Appendix 4 for items and scoring). To the best of our knowledge, this is the only published, self-report DSM/ICD questionnaire of tobacco/nicotine dependence. Most of the existing research has utilised the DSM criteria, and that will be the focus of this Handbook's review of diagnostic classifications of tobacco dependence.

DSM and ICD structured clinical interviews, such as the World Mental Health Survey Initiative version of the Composite International Diagnostic Interview (CIDI), or the National Institute of Mental Health Diagnostic Interview Schedule (DIS), have been translated into various languages and used in at least 11 population-based studies (Hughes et al., 2006) in countries including: Germany (John et al., 2003b (DSM); John et al., 2004a (DSM); Hoch et al., 2004 (DSM)), Australia (Pergadia et al., 2006b (DSM)), Canada and Taipei (Howard et al., 2003 (DSM)), Spain (de Leon et al., 2002 (DSM)), Austria (Lesch et al., 2004 (DSM & ICD)), Switzerland (Angst et al., 2005 (DSM)), Japan (Yoshimura, 2000 (ICD)), Korea (Lee et al., 1990 (DSM)), and the USA (Breslau et al., 2004 (DSM); Hughes et al., 2004a (DSM & ICD)). The ICD-10 criteria are available in 42 languages, including Arabic, Chinese, English, French, Russian, and Spanish. The DIS, CIDI, and other diagnostic interviews comprise a series of branching questions that are aimed at eliciting information about features relevant to nicotine dependence.

Some aspects of the DSM-derived interviews and similar instruments may cause problems in any sample, or when using the instrument with culturally diverse populations. Another important caveat to observe, with regard to the DSM measure of dependence, is that the scoring algorithm used in establishing formal DSM diagnoses does not appear to yield decision rules that agree with empirical methods, such as latent class analysis (Muthen & Asparouhov, 2006). Thus, the investigator may wish to explore different methods for item-weighting and cut-score estimation if a categorical outcome is desired. In addition, it should be noted that the tobacco sections of DIS and CIDI are quite long (over 30 items), and were designed to be administered either in a face-to-face interview or by a trained professional. New technology has made it possible to have individuals respond to text-based presentations of the questions, but it is unknown how valid this presentation method would be and it would remain quite time consuming.

¹ There has been a text revision of the DSM-IV (American Psychiatric Association, 2000); however, this revision did not alter any diagnostic criteria for any diagnostic categories, including the substance dependence diagnosis.
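
One empirical approach to the cut-score estimation mentioned above is to choose the threshold that maximises Youden's J (sensitivity + specificity - 1) along a Receiver Operating Characteristic curve. The sketch below illustrates that idea only; it is not the method used in any of the cited studies, and the function name, the scores, and the outcome coding are all hypothetical.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximising Youden's J = sensitivity + specificity - 1.

    scores   -- dependence scores, higher meaning more dependent (hypothetical)
    outcomes -- 1 if the criterion is present (e.g. relapse), otherwise 0
    A respondent is classified as positive when score >= cutoff.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative outcome")
    best_cutoff, best_j = None, float("-inf")
    # Try every observed score as a candidate threshold, lowest first.
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical sum scores and relapse outcomes for eight smokers.
scores = [0, 1, 2, 2, 3, 4, 5, 6]
relapsed = [0, 0, 0, 1, 0, 1, 1, 1]
print(youden_cutoff(scores, relapsed))
```

A cut-score chosen this way is specific to the sample and criterion used, so it would need to be re-estimated, or at least checked, in each new population.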

Reliability and structure: Data on the reliability and structure of diagnostic interview measures of nicotine dependence arise from studies using face-to-face administration strategies. Therefore, the following conclusions cannot be generalized to a different administration format. There is evidence that the various structured diagnostic measures yield reliable diagnoses as assessed by test-retest reliability (κ = 0.63, Grant et al., 2004; κ = 0.88, Hughes et al., 2004a; κ = 0.73, Koenen et al., 2005). One factor analysis indicated that responses to the CIDI had a strong single factor structure (Strong et al., 2003), although other factor analyses of the structured diagnostic items found that a two-factor structure was a better fit (Johnson et al., 1996; Radzius et al., 2004; Muthen & Asparouhov, 2006). Patterns of covariation that were found amongst the symptoms could be best accounted for by two factors (Muthen et al., 2006). The first accounted for covariance in the tolerance, larger amounts, and time spent using items (see Appendix 2). Thus, this factor seems to be highly related to sheer amount smoked. The second factor was related to persistent desire/unsuccessful efforts to cut down or quit, and continued use despite emotional/physical problems. Confidence in this solution is bolstered by the fact that it was obtained in three separate groups of individuals. It is also consistent with other recent factor analyses (Lessov et al., 2004). Investigators might wish to analyze these item parcels separately since they may be addressing somewhat distinct constructs.

The TDS, a written questionnaire assessing the presence of diagnostic criteria, has demonstrated acceptable internal consistency in Japanese smokers (α = 0.74-0.81) (Kawakami et al., 1999), but was less internally consistent among smokers in the USA (α = 0.64) (Piper et al., 2008). To date, there have been no studies comparing the reliability of the interview measures with the paper-pencil measure. Therefore, one cannot assume that the psychometric data generated by the interview-format delivery of DSM or ICD items would generalize to a self-administered format.

Validation: Evidence suggests that the small set of dichotomous DSM items can distinguish between light versus heavy smoking (Strong et al., 2003). An epidemiological study found that the DSM-III-R (as assessed by the DIS) was a significant, though weak, predictor of cigarette abstinence over one year, but that the FTND was a better predictor and that number of cigarettes smoked per day was the best predictor (Breslau & Johnson, 2000). Another study showed that DSM-IV diagnoses of nicotine dependence predicted heaviness of use and cessation outcome in a population-based study of college students (Sledjeski et al., 2007). Several studies have shown that DSM-IV nicotine dependence diagnosis is associated with greater risk of psychiatric comorbidities in adults and youth (Grant et al., 2004; John et al., 2004a; Dierker et al., 2006). In addition, DSM diagnoses of nicotine dependence were significantly associated with self-rated general health in a population sample in Germany (John et al., 2005). In sum, there is substantial evidence that DSM/ICD diagnoses are meaningfully related to smoking heaviness and a variety of health outcomes.

Studies have shown that the TDS is associated with smoking heaviness measures (e.g. number of cigarettes smoked per day, CO levels) and years of smoking (Kawakami et al., 1999; Piper et al., 2004). With respect to relapse, one study found that Japanese smokers with lower TDS scores were more likely to quit smoking after a health risk appraisal (Kawakami et al., 1999). However, data from smokers who participated in smoking cessation studies in the USA revealed that the TDS did not predict abstinence at 1 week or 6 months post-quit (TTURC Tobacco Dependence Phenotype Workgroup, 2007).

Heritability: There has been considerable research supporting the heritability of DSM/ICD-diagnosed nicotine dependence. In the Australian Twin sample study, analyses revealed that all of the DSM-IV symptoms and diagnosed DSM-IV dependence were meaningfully heritable (45-73%), and that the DSM-IV criteria of
tolerance, withdrawal, and difficulty quitting were the most highly heritable symptoms of nicotine dependence for both men and women (Lessov et al., 2004). A study of twin fathers, using the Vietnam Era Twin Registry, found that paternal DSM diagnosis of nicotine dependence was significantly associated with offspring DSM diagnosis of nicotine dependence (Volk et al., 2007). However, one study found that DSM nicotine dependence was not related to familial liability to smoking persistence, because familial density of persistence was not associated with smoking persistence among nicotine-dependent daily smokers (Johnson et al., 2002). Other genetics research has linked DSM-diagnosed nicotine dependence with the CYP2E1 genotype, which codes for a protein that metabolizes alcohol and tobacco smoke nitrosamines, and is implicated in creating metabolic cross-tolerance between alcohol and tobacco (Howard et al., 2003).

Summary: There is evidence that diagnostic measures effectively index smoking heaviness, smoking-related health and mental health risks, and likelihood of future cessation. There is also strong evidence of heritability of DSM-diagnosed nicotine dependence. It is unclear whether paper-pencil versions of such measures (the TDS) are comparable to the interview versions of such measures. Moreover, there is evidence that the FTND may predict cessation and health outcomes as well as the diagnostic measures (e.g. John et al., 2004a). In terms of the prediction of likelihood of future cessation, it is unclear that diagnostic measures possess any incremental validity relative to briefer measures, such as the HSI. The diagnostic scales have relatively high reading levels, which may hinder their use with certain populations (even if administered orally).

Cigarette Dependence Scale

The Cigarette Dependence Scale (CDS) is another unidimensional tobacco dependence measure (Etter et al., 2003b). This measure was developed using smokers' reports of signs that they believed indicated addiction to cigarettes. Both a 5- and 12-item version of the CDS were developed (see Appendix 5). The items overlap somewhat with the Fagerström tests (e.g. they both assess number of cigarettes smoked per day and time to first cigarette in the morning). The Flesch-Kincaid Reading Grade Levels were 4.9 for the CDS-12 and 6.8 for the CDS-5.

Reliability and structure: To date, only two published studies have reported data on the two versions of the CDS, using data collected via the mail or Internet (Etter et al., 2003b; Etter, 2005). The CDS-12 had strong internal consistency, the CDS-5 was within the acceptable range, and both scales were slightly skewed toward higher values. Test-retest correlations were 0.60 for all items and 0.83 for the full scales. Factor analysis suggested a unidimensional structure for the CDS-12.

Validation: The CDS scales were significantly correlated with number of cigarettes smoked per day (whether a smoker was a daily or occasional smoker), strength of urges during the last quit attempt, and cotinine level (Etter et al., 2003b). Curiously, the CDS-5 was more strongly correlated with cotinine levels than was the CDS-12. This was probably due to the fact that the question about smoking heaviness (Question #2) determined a greater portion of total scale variance in the 5-item version. In one study, none of the three dependence measures (i.e. the FTND, CDS-5, or CDS-12) was a significant predictor of relapse likelihood (Etter et al., 2003b); however, only a third of potential respondents participated in the follow-up study, which might have produced considerable response bias. In a second study, the CDS-12 weakly predicted smoking abstinence at 1 month post-quit, but in a counterintuitive direction (i.e. higher CDS-12 scores predicted abstinence) (Etter, 2005).

Heritability: To date, no data regarding heritability or genetics have been published using the CDS.

Summary: While the CDS scales do index smoking heaviness well, there is little evidence that they


predict likelihood of cessation effectively, or that they index other health outcomes of public health importance. Further, there is little evidence that they possess incremental validity relative to other measures, such as the diagnostic measures or the FTND. Overall, this measure is promising in that it can be used with paper-pencil administration and it has good reliability, but a meaningful evaluation must await additional validity research.

Multidimensional Measures of Tobacco Dependence

Nicotine Dependence Syndrome Scale

The Nicotine Dependence Syndrome Scale (NDSS) (Shiffman et al., 2004) is a 19-item multidimensional scale based on Edwards and Gross' 1976 theory of the alcohol dependence syndrome. The NDSS was intended to complement, not replace, traditional dependence measures, such as the DSM-based assessments, and therefore there is little content overlap between the NDSS and the unidimensional measures. The NDSS assesses five dimensions of nicotine dependence: Drive reflects craving, withdrawal, and smoking compulsions; Priority reflects preference for smoking over other reinforcers; Tolerance reflects reduced sensitivity to the effects of smoking; Continuity reflects the regularity of smoking rate; and Stereotypy reflects the invariance of smoking (Appendix 6). The Flesch-Kincaid Reading Grade Level is 7.7. This reading level is somewhat elevated relative to other self-administered scales, which may reflect the fact that some items contain unusual words and require integration of more than one sentence or statement. For instance, the item, "My smoking pattern is very irregular throughout the day. It is not unusual for me to smoke many cigarettes in an hour, then not have another one until hours later," involves three negatives over its two sentences. In addition, some questions are double-barrelled, such as "It's hard to estimate how many cigarettes I smoke per day because the number often changes." If a person answers no, it is unclear whether the answer refers to difficulty of estimation per se, or because the number of cigarettes smoked per day does not change. Some items may be significantly influenced by cultural factors, such as eating in restaurants that are smoke-free or experiences during air travel. These features may make the NDSS somewhat less appropriate than some other measures for individuals of modest reading abilities or educational status. The NDSS has been translated into Finnish (Broms et al., 2007).

Reliability and structure: To date, four studies of adult smokers have generated data on the NDSS; one study has reported on the NDSS in adolescents aged 12-18 (Clark et al., 2005).

Psychometric data discussed here are based on the revised 30-item scale. The internal consistency for the NDSS total scale, the NDSS-T, is good (Shiffman et al., 2004); however, data show that the internal consistencies of individual subscales are problematic (Piper et al., 2006). Principal components analysis revealed a 5-factor structure for the NDSS (Shiffman et al., 2004) as predicted by the underlying theory. Significant differences in the scores on the subscales between White and African-American smokers suggest the scale may operate differently in subpopulations, although there were no ethnic differences in the total NDSS score (Shiffman et al., 2004). A more recent study, using the 19-item questionnaire with the Finnish Twin Cohort Study population, found that a 3-factor structure (priority/drive, continuity/stereotypy, and tolerance) best fit the data, with the internal consistencies of the three factors ranging from 0.83 to 0.92 (Broms et al., 2007).

Validation: Much of the initial validation work was done with the 30- and 23-item NDSS, prior to its being refined to the 19-item version. These results indicated that the NDSS-T predicted time to lapse and time to relapse, but no individual subscale predicted lapse or relapse (Shiffman et al., 2004). However, new data suggest that the NDSS subscales are significantly, though modestly, related to cigarettes smoked per day (r = 0.12-0.26) and that the Tolerance and Continuity subscales are modestly related to CO


level (r = 0.12 and 0.13, respectively) (Piper et al., 2008). In samples of treatment-seeking smokers, the NDSS Priority and the Stereotypy subscales were found to predict cessation outcomes for up to 6-months post-quit (TTURC Tobacco Dependence Phenotype Workgroup, 2007; Piper et al., 2008). The NDSS Drive, Tolerance, and the total score were found to predict heaviness of smoking and cessation outcome in a population-based sample of college students (Sledjeski et al., 2007). In Finnish smokers, the NDSS was significantly correlated with both FTND and DSM-IV, as assessed by the CIDI measures of dependence (Broms et al., 2007). The NDSS subscales accounted for 51% of the variance in self-reported difficulty abstaining among chippers (light/non-daily smokers) (Shiffman & Sayette, 2005), with the Drive subscale having the strongest relation ( = 0.61), relative to the other scales ( = 0.13-0.28).

Heritability: In the Finnish cohort, NDSS was found to have a significant heritability estimate of 0.30, relative to a heritability estimate of 0.40 for the FTND (Broms et al., 2007).

Summary: Like the CDS, the NDSS is a relatively new scale and it is not yet possible to draw firm conclusions about its validity relative to other dependence instruments. In its favour is the fact that some of its subscales have been shown to predict smoking heaviness measures, other dependence measures, and smoking cessation likelihood (Broms et al., 2007; Piper et al., 2008). The majority of this research has been done on clinical populations and it is not known how well these results would generalize to population-based samples. There is evidence that the various subscales of the measure are differentially related to various dependence criteria (Shiffman & Sayette, 2005; Broms et al., 2007; TTURC Tobacco Dependence Phenotype Workgroup, 2007). This suggests that some of the subscales possess discriminative validity with respect to different dimensions or aspects of dependence. However, there is evidence that the NDSS is not able to predict the major dependence criteria of smoking heaviness or cessation likelihood better than shorter measures (TTURC Tobacco Dependence Phenotype Workgroup, 2007). In addition, the marginal reliabilities of some of the subscales, and the reading level and complexity of some of the items, may discourage use in large population-based samples.

Wisconsin Inventory of Smoking Dependence Motives

The Wisconsin Inventory of Smoking Dependence Motives (WISDM) (Piper et al., 2004) is a 68-item measure developed to assess the discrete motivational basis of dependence. This measure has 13 theoretically-based subscales designed to tap different smoking dependence motives: Affiliative Attachment, Automaticity, Behavioral Choice/Melioration, Cognitive Enhancement, Craving, Cue Exposure/Associative Processes, Loss of Control, Negative Reinforcement, Positive Reinforcement, Social and Environmental Goads, Taste and Sensory Properties, Tolerance, and Weight Control (see Appendix 7 for the items and scoring). The Flesch-Kincaid Reading Grade Level is 4.6; however, balanced against this easy reading level is the fact that the total scale is quite long. Therefore, investigators might wish to use individual, theoretically targeted subscales in epidemiologic research (subscales range from 4-7 items) (Lerman et al., 2006). Finally, relatively subtle psychological concepts are addressed in this measure, such as thinking of cigarettes as a friend or experiencing a loss of control, and this may affect the validity of such items in some cultures. There are English and Spanish versions of the WISDM (D.W. Wetter, personal communication, December 12, 2006).

While all subscales assess dependence, it should be noted that some of the subscales (i.e. Cue Exposure/Associative Processes, Social/Environmental Goads, and Taste/Sensory Properties) represent early-onset motives, which are present for all smokers even at modest levels of smoking experience, while other subscales represent late-onset motives (i.e. Affiliative Attachment, Automaticity, Behavioral Choice/Melioration, Cognitive Enhancement, Craving, and Tolerance), which are present only in individuals who smoke at a moderate daily rate or have at least moderate smoking experience (Piper et al., 2004).

Reliability and structure: To date, only one study has published data on the WISDM (Piper et al., 2004). Across two different samples all 13 subscales had strong internal consistencies that were evident across gender and across Whites and African-Americans. A new study found that the internal consistency of the subscales ranged from 0.74-0.94 with the total scale having a Cronbach's alpha of 0.96 (Piper et al., 2008). Factor analytic strategies indicated that the WISDM-68 is multidimensional, although some scales hit on related or overlapping dimensions of dependence. Thus, it is safe to say that some of the subscales are tapping the same underlying dimensions.

Validation: The total WISDM was correlated with smoking heaviness (cigarettes per day r = 0.63; CO r = 0.55) (Piper et al., 2004). Data also indicated that WISDM Total predicted outcome at both 1-week and 6-months post-quit (TTURC Tobacco Dependence Phenotype Workgroup, 2007). Thus, there is evidence that the whole scale is meaningfully related to the major dependence criteria. However, as with the NDSS, it appears that some shorter measures, such as the HSI, predict smoking heaviness and cessation likelihood as well or better than the longer WISDM (TTURC Tobacco Dependence Phenotype Workgroup, 2007).

The various WISDM subscales show different patterns of relations with the dependence criteria. For instance, the Tolerance subscale was the best predictor of CO level, but the Craving, Cue Exposure/Associative Processes, and Tolerance subscales were the best predictors of DSM-IV dependence when entered together into a multiple regression equation (Piper et al., 2004). One study found that although the total score was not a significant predictor of relapse after controlling for treatment, the combination of Automaticity, Behavioral Choice/Melioration, Cognitive Enhancement, and Negative Reinforcement subscales all predicted relapse by the end of treatment in a multivariate model (Piper et al., 2004). Data from two different smoking cessation trials found that WISDM Automaticity and Tolerance were predictive of outcome at 6-months post-quit (TTURC Tobacco Dependence Phenotype Workgroup, 2007).

Heritability: There is evidence that the Taste/Sensory Properties subscale was significantly related to a genetic variant that determines sensitivity to bitter tastes (the phenylthiocarbamide (PTC) haplotype) (Cannon et al., 2005). Data have also revealed a significant relation between the WISDM Tolerance subscale and the ratio of 3-hydroxycotinine to cotinine (Piper et al., 2008). These data suggest that some WISDM subscales may code for biological diversity so as to permit genetic mapping.

Summary: Like the CDS and the NDSS, the WISDM is a relatively new scale and it is too soon to draw firm conclusions about its validity relative to other dependence instruments. However, data reveal that some of its subscales predict smoking heaviness measures and smoking cessation likelihood (Piper et al., 2008). There is also evidence that the various subscales of the measure are differentially related to various dependence criteria (TTURC Tobacco Dependence Phenotype Workgroup, 2007; Piper et al., 2008), suggesting that this measure is able to capture different dimensions or aspects of dependence. However, there is evidence that the WISDM is not able to predict the major dependence criteria of smoking heaviness or cessation likelihood better than shorter measures (TTURC Tobacco Dependence Phenotype Workgroup, 2007). Some WISDM subscales have been related to various dependence-linked genetic components. It is important to note that the WISDM research has been done on clinical populations and it is not known how well these results would generalize to population-based samples.

Summary:

Assessment of cigarette-induced nicotine dependence is an important goal for three reasons. First, the human and economic


costs of cigarette-induced nicotine dependence are significant. Second, only a portion of cigarette smokers are dependent (as defined by traditional instruments), and those who are dependent are indeed distinguishable from other smokers on the basis of factors, such as likelihood of future cessation and amount smoked daily. Finally, cigarette-induced nicotine dependence may serve to moderate individuals' responses to different tobacco control programmes and policies, as well as the proximal and distal effects of these interventions.

It is important to note that there is considerable evidence that the various measures of nicotine dependence are not highly related to one another, and can have very different relations with validity measures (Hughes et al., 2004a; Piper et al., 2006). Thus it is critical that investigators select measure(s) that are psychometrically sound, appropriate for the intended population, and target the constructs in which the researchers are interested. If the goal is to assess a central core of nicotine dependence as a predictor of cigarette use cessation likelihood, or as an index of associated health risks, then the FTND or HSI appear best suited for this purpose (Tables 3.22 and 3.23). These instruments are brief and have relatively impressive predictive validities, and their reading level should make them appropriate for a broad range of populations. However, it is important to note that none of the dependence measures accounts for a large proportion of variance in outcomes in cessation likelihood. This is no doubt due to the fact that cessation likelihood is affected by countless situational/environmental factors, and other person factors. In addition, if one uses a brief measure, such as the HSI, it is important to recognize that it does not tap all dependence factors. It also does not appear to predict certain core features of dependence well, such as withdrawal, and it may be inappropriate in populations that do not smoke daily or have significant restrictions on smoking (e.g. restrictions that constrain smoking in certain contexts or times of day).

There may be situations when there is a need to assess particular, relatively discrete, facets of nicotine dependence. For example, identifying specific tobacco dependence mechanisms may facilitate: identification of a more proximal phenotype (Cannon et al., 2005), identification of specific dependence dimensions with which one could create treatment algorithms, monitoring of the development of tobacco dependence, or identification of a specific group of dependent tobacco users for whom a policy is particularly effective or ineffective. If this is the goal of the research, then a multifactorial measure (i.e. the NDSS and the WISDM-68, and their subscales) would be optimal, despite the fact that there is little evidence for incremental validity in predicting important public health outcomes. However, the relative lack of validity information on these scales may mean that researchers should use these instruments only in the context of exploratory research. They might be most appropriate for research addressing etiology and cultural or population-based differences in smoking determinants.

Measures of smokeless tobacco-induced nicotine dependence

Like cigarettes, smokeless tobacco (ST) products contain nicotine, although the levels vary considerably across products (Hatsukami et al., 1992; IARC, 2007b). Data on patterns of use of ST support the conclusion that many users are nicotine dependent (Henningfield et al., 1997; IARC, 2007b). Many ST users experience withdrawal symptoms upon abstinence (Hatsukami et al., 1992; 1999). Studies have used a biomarker of nicotine uptake, cotinine, to show that daily users of ST exhibit levels of nicotine absorption that are equivalent to daily cigarette smokers (Gritz et al., 1981).

Dependence on smokeless tobacco has often been assessed with questionnaires derived from FTND, with the addition of specific items, in particular, swallowing the tobacco juice (Boyle et al., 1995; Ebbert et al., 2006). In three different samples, use of ST within 30 minutes of waking and swallowing the tobacco juice were

Construct: Tobacco Dependence

Measure 1: Fagerström Test of Nicotine Dependence (FTND), 6 items

Source: Heatherton et al., 1991

Variation: It is possible to change the wording of the items to be culturally appropriate or to reflect non-cigarette
tobacco use. However, these changes may affect the reliability and validity of the data obtained.

Validity:
    Predicts both behavioural (e.g. lifetime amount smoked) and biochemical (e.g. CO, cotinine) indices
    of smoking in multiple countries
    Predicts cessation
    Evidence of linkage to specific dependence-linked genetic variants

Comments:
    This measure is recommended as an assessment of dependence's ability to predict cessation
    and heavy use
    Brief and well-known
    Strong predictive validity of heavy use and cessation
    Internal consistency is modest, which may reflect a 2-factor structure
    Some items may be influenced by smoking restrictions in the environment
    Has been translated into a number of different languages

Measure 2: Heaviness of Smoking Index (HSI), 2 items from the FTND: number of cigarettes smoked per day and
time to first cigarette in the morning

Source: Kozlowski et al., 1994

Variation: It is possible to change the wording of the items to be culturally appropriate or to reflect non-cigarette
tobacco use. However, these changes may affect the reliability and validity of the data obtained.

Validity:
    Predicts both behavioural (e.g. lifetime amount smoked) and biochemical (e.g. CO, cotinine)
    indices of smoking in multiple countries
    Predicts cessation; the HSI appears to be the strongest predictor of cessation and accounts for
    much of the predictive validity of the FTND
    Highly heritable (71%) and linked to specific dependence-linked genetic variants

Comments:
    This measure is recommended as the most efficient measure to assess dependence's ability to predict
    Using this measure may only involve the addition of one item (time to first cigarette) if number of
    cigarettes per day is already being collected
    Strong predictive validity of heavy use and cessation
    Items may be influenced by smoking restrictions in the environment
    Has been translated into a number of different languages

Table 3.22 Measures of Cigarette-Induced Nicotine Dependence


Construct: Tobacco Dependence

Measure: Fagerström Test of Nicotine Dependence (FTND), 6 items

Sources: Boyle et al., 1995; Ebbert et al., 2006

Variation: It is possible to change the wording of the items to be culturally appropriate or to reflect non-cigarette
tobacco use. However, these changes may affect the reliability and validity of the data obtained.

Validity:
    Predicts both behavioural (e.g. lifetime amount smoked) and biochemical (e.g. CO, cotinine)
    indices of smoking in multiple countries
    Predicts cessation
    Evidence of linkage to specific dependence-linked genetic variants

Comments:
    This measure is recommended as an assessment of dependence's ability to predict cessation and
    heavy use
    Brief and well-known
    Strong predictive validity of heavy use and cessation
    Internal consistency is modest, which may reflect a 2-factor structure
    Some items may be influenced by smoking restrictions in the environment
    Has been translated into a number of different languages

Table 3.23 A Measure of Smokeless Tobacco-Induced Nicotine Dependence

the variables most consistently associated with cotinine level (Boyle et al., 1995) (see Appendix 8 for the items and scoring).

Summary:

Like cigarettes, smokeless tobacco can result in nicotine dependence. While less research has been done to validate self-report measures of ST-induced nicotine dependence, questionnaires derived from FTND appear to provide a means for identifying ST users who are nicotine dependent.

Summary and recommendations

Nicotine dependence is an important construct to assess as a moderator for the effects of tobacco control programmes and policies. In this section the evidence was reviewed on the validity of various proposed measures of cigarette and smokeless tobacco nicotine dependence. For cigarette smoking, the 2-item Heaviness of Smoking Index is recommended for use in population level studies. If only a single item measure is possible, the use of time to first cigarette in the morning is recommended.

For smokeless tobacco, the FTND-ST appears to be a useful measure of nicotine dependence.
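
As an illustration of how the 2-item Heaviness of Smoking Index recommended in this section might be scored in a survey dataset, the following is a minimal sketch. The category cut-points are an assumption: they are the ones commonly published for the two FTND items on which the HSI is based, and they are not listed in this section, so they should be checked against the original instrument before use. The function name `hsi_score` is hypothetical.

```python
def hsi_score(cigarettes_per_day, minutes_to_first_cigarette):
    """Return an HSI total (0-6) from the two items; higher means heavier dependence.

    Cut-points are assumed from the commonly published FTND item scoring,
    not taken from this section.
    """
    if cigarettes_per_day < 0 or minutes_to_first_cigarette < 0:
        raise ValueError("inputs must be non-negative")

    # Item 1: cigarettes smoked per day, scored 0-3.
    if cigarettes_per_day <= 10:
        cpd_points = 0
    elif cigarettes_per_day <= 20:
        cpd_points = 1
    elif cigarettes_per_day <= 30:
        cpd_points = 2
    else:
        cpd_points = 3

    # Item 2: minutes from waking to first cigarette, scored 3-0
    # (earlier smoking scores higher).
    if minutes_to_first_cigarette <= 5:
        ttfc_points = 3
    elif minutes_to_first_cigarette <= 30:
        ttfc_points = 2
    elif minutes_to_first_cigarette <= 60:
        ttfc_points = 1
    else:
        ttfc_points = 0

    return cpd_points + ttfc_points
```

If only a single item can be fielded, the time-to-first-cigarette branch alone gives the corresponding single-item category.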

4.1 Data sources for monitoring tobacco control policies

Introduction

Do we know why the prevalence of smoking in Sri Lanka decreased from 54% in 1988 to less than 40% in 2003? What is this decrease related to? Does tobacco control have a part in this? If so, what specific policy interventions were most useful in decreasing the prevalence in Sri Lanka? How does that compare to other countries? To respond to these and similar questions on the relationship between the implementation of specific tobacco control policies and tobacco use prevalence in any country, researchers and policy-makers need a solid understanding of the current state of policies and their specific impact at the country level ( surveillance/infobase/web/InfoBaseCommon/).

This section describes the currently available sources of information on tobacco control policy interventions, with special attention to the new WHO Global Tobacco Control Report system, and assesses their credibility, completeness, and usefulness. It also discusses important methodological issues and gauges future prospects for such systems. Although tobacco control policy interventions can be initiated by private sectors of the civil society, this section is concerned with core governmental policy interventions, since in most countries, only governments have a population-wide reach and the capacity and authority to consistently enforce stringent measures. Such interventions typically include any governmental form of regulation, funding decision, institutional statement, organisational development, or administrative action to apply (or not apply) tobacco control policies. Further down, this section discusses evaluation criteria for tobacco control policy interventions monitoring systems, and reviews currently available data sources based on these criteria. The last part of the section builds on the first two and discusses renewed efforts to build comprehensive tobacco control monitoring systems in the new international tobacco control context.

Criteria for assessing tobacco control policy intervention monitoring systems

An ideal global tobacco control monitoring system would track interventions to decrease tobacco use in all relevant policy domains, and would make the data comparable across all jurisdictions, based on an explicit and transparent protocol. Such a monitoring system would have the following characteristics: (1) include all relevant tobacco control policies and regularly be updated to include new innovative policies; (2) characterise the interventions against current best practice standards; (3) include the degree of enforcement of policy interventions; (4) rely on credible sources; (5) cover all countries, as well as all relevant sub-national jurisdictions; (6) be updated as changes occur, or at least at regular and short intervals, while keeping historical information; and (7) span a long enough period to link changes in tobacco control policies to changes in the prevalence of tobacco use and other impact indicators. Therefore, tobacco control monitoring systems are assessed in this paper in relation to the following variables:

Policy scope
Characterization of interventions against best practice standard
Characterization of degree of enforcement
Source of the primary data
Geographical/jurisdictional coverage and comparability of data across jurisdictions
Timeliness and frequency of data collection
Characterization of evolution of policies over time.


Scope of policies covered:

Tobacco control interventions are wide in scope and vary in time and space. However, despite the sheer diversity of possible policy interventions, they can be regrouped in a few convenient categories that generally fall under "demand reduction measures" and "supply reduction measures" (although some policies do not easily yield to this rather strict dichotomy) (Table 4.1). In assessing the scope of tobacco control data systems, one must bear in mind that not all tobacco control policies are equal. Supply reduction policies are generally considered not to be very effective at reducing tobacco use, except perhaps for anti-smuggling measures under certain conditions (Rowena et al., 2000). Given the limited resources devoted to data gathering, efforts should first be dedicated to demand reducing policies. Even among such policies, however, there are wide differences in effectiveness. Increasing prices through high taxes, as well as smoke-free environments, are generally seen as the most effective tobacco control policies (Ranson et al., 2000), and therefore are considered essential in any information system.

If resources allow, clearly ineffective policies could be monitored. This could provide a scan of the policy environment and assess the imbalances produced by focusing on ineffective measures. For example, in the context of constant aggressions from the tobacco industry to avoid effective tobacco control, monitoring measures that are inefficient, but at the same time (and for the same reason?) the darling of the industry, might indicate how misguided the policy priorities of a given jurisdiction are. Examples are the effectiveness of school-based education programmes and prohibition of sales of tobacco products to minors (Ling et al., 2002; Cummings et al., 2003; Glantz et al., 2005; Wiehe et al., 2005).

Tobacco control monitoring systems should be assessed on the strategic choice of the policy domains and interventions they cover. Although this choice is generally implicit in all datasets, the data collector should clearly describe the basis for that choice, whether in terms of efficacy, or any other criteria.

Characterization of tobacco control policies based on best practice standards:

Once the scope is established, the data system must be assessed in relation to its capacity to characterise each policy according to an explicit standard or recognised best practices. For example, it is generally acknowledged that bans on advertising, promotion, and sponsorship should be comprehensive. Therefore, systems monitoring marketing restrictions should be

Demand Reduction Policy Domains                           Supply Reduction Policy Domains

Price and tax of tobacco products                         Liability and litigation

Protection from exposure to secondhand tobacco smoke      Access to tobacco by youth

Tobacco advertising, promotion and sponsorship            Banning sales of tobacco products

Packaging and labelling of tobacco products               Crop substitution

Treatment of tobacco dependence                           Contents of tobacco products

Education, communication and public awareness             Illicit trade in tobacco products

Table 4.1 Tobacco Control Interventions


assessed on their ability to provide and/or films, and sponsored standard considers that bans with
information that would allow events. In addition, the existence smoking rooms are not complete,
gauging the policies of any given of each element of the policy the Italian ban would not
jurisdiction against this standard. should be assessed with a Yes/No complete. However, the require-
According to this standard, the question that leaves little room for ments for smoking rooms are so
monitoring system must then interpretation and explicitly meets stringent, that Italian law de facto
select all relevant variables the best practice standards. The can be considered providing for a
describing the components of this monitoring system should clearly complete smoking ban, as
policy and collect the data accor- describe the criteria used in smoking rooms are rarely
dingly (Joossens et al., 2006). answering Yes/No questions, and available.
Following with the previous these criteria should be termed in Characterizing of any given
example, to ascertain the existence the same language as the laws. policy intervention becomes even
of a comprehensive ban on For instance, these apply notably more difficult in the absence of
advertising, promotion, and spon- in deciding whether a smoking clear information on regulations.
sorship, the monitoring system ban is complete, whether health Some countries have legal
should provide information warnings are effective, whether systems where regulation is very
separately on each form of com- advertisements are banned from general, leaving it to admini-
mercial communication, recom- the media, etc. strative actions to determine how
mendation or action, and any form To have a clear charac- regulations are to be applied.
of contribution to any event, terization of any given policy Some regulations have loopholes;
activity, or individual with the aim, intervention is not always easy. some countries have contradictory
effect, or likely effect of promoting a Even with all necessary legal decrees issued by many types of
tobacco product or tobacco use information, the data collector is authorities, with uncertain rules
either directly or indirectly. Such a left to match their own definition of determining which decree has
policy would include data on the the desired policy with the jargon precedence. In other countries,
existence of direct advertising bans of the law. One desired policy one must consider jurisprudence
of tobacco products or brands in might be a complete smoking ban. and court orders suspending or
every existing media, including However, even good laws typically modifying regulations.
national and international TV from do not provide for complete bans In summary, any tobacco
any source (cable, satellite, and could include some exemp- control monitoring system, be-
internet, etc.), national and inter- tions. The Irish law is a case in cause it attempts to verify the
national radio, local and inter- point; it does not provide for a existence of an implicitly defined
national magazines and news- complete ban strictly speaking. "good" policy intervention, must
papers, billboards, points-of-sale, However, judging when exemp- synthesize complex information to
the internet, and cinemas. More- tions are minor or not might be a answer simple questions. At one
over, the monitoring system should challenge, and setting a clear and time or another, collecting the
collect data on the ban of each detailed standard of excellence is information may call for some
specific form of promotion of important in assessing and judgment by the data collector. A
tobacco products, brand names, or collecting data for monitoring good tobacco control monitoring
company names, including direct tobacco control policies. More system should minimize the
mail giveaways, promotional dis- complicated is the assessment of impact of these judgment calls
counts, non-tobacco products the Italian law. It does contain a and make them as explicit as
identified with cigarette brand smoking ban, but exceptions are possible.
names, brand name of non- allowed in the form of smoking
tobacco products used for tobacco rooms, usually not considered a
product, product placement in TV best practice. If the applied

Enforcement

Any characterization of a policy intervention is not complete without assessing the actual enforcement of the measure. It is not enough to know that a policy intervention legally exists without knowing if it is applied. The system monitoring tobacco control policies can use two types of measures to assess the enforcement of a policy intervention: de facto implementation of the intervention in conformity with the policy, and enforcement efforts by the government. The first type of measure is best since it addresses exactly what needs to be gauged, while the second method is an indirect indicator that looks at the process leading to enforcement.

De facto implementation requires specific quantitative metrics based on direct observation of people or events, outside the purview of a monitoring system. Such measures are often unrealistic for many countries with low resources; measuring enforcement of smoking bans, for example, may require population surveys, sometimes including biological measures of exposure to secondhand smoke. Other metrics might include data provided by the industry, because of clear legal obligations (e.g. detailed sales or advertising data), that can help understand the impact of policy. Although preferable to other approaches, direct observation is not exempt from problems. Even surveys are difficult to interpret. In Brazil, for example, 70% of respondents to a survey reported that they had seen billboards with tobacco advertisement in the month before the survey, despite a successful complete ban enforced 5 years earlier. It is thus possible that survey respondents did not understand the question or that they might actually be reporting types of advertisement that are not covered by the law (Global Youth Tobacco Survey fact sheets; cco/global/GYTS/factsheets/paho/factsheets.htm).

A more feasible alternative is to rely on the opinion of key informants or experts, providing some sort of qualitative direct observation. The panel of key informants or experts is especially sensitive to judgment calls and must be assessed very carefully. In this respect, developing a stringent, multi-layered protocol is probably a sound base, but there is not yet a consensus on what would be a method that is inclusive enough at the national level, yet comparable enough at the international level. Indeed, qualitative assessment of enforcement is not easy, especially at the international level, where national experts might have a widely different appreciation of enforcement.

Methods based on quantitative measures can be used to gauge efforts (usually by the government) leading to enforcement. These can be measured by enforcement budget, number of full-time equivalent inspectors, number of inspections, number of fines distributed, etc. There are two obvious problems with such measures: first is that the existence of enforcement efforts does not indicate enforcement of the law necessarily; and second, the absence of enforcement efforts is not an indication of lack of enforcement in countries where tobacco control measures are widely respected without severe enforcement. Countries where interventions are self-enforcing from the beginning, or where significant efforts might not be needed after many years of successful enforcement, will fare quite badly next to a country with a severe enforcement problem despite significant government efforts. In addition, such statistics are not always available. In fact, in some countries, it is not clear who should enforce the law, and gathering statistics then becomes difficult. In the case of smoke-free environments, for example, sometimes police are in charge of enforcement and often do not use fines to enforce the law, given the low social acceptability of a fine for smoking in a restaurant; casual reprimand is used instead and no trace is left in any official record.

Given these difficulties and inherent limitations of the second approach of measuring enforcement efforts, it is probably better to mainly rely on the first approach, but to also use some basic measures of government efforts that are in line with recommendations on enforcement. Monitoring systems could, in this case, gather data on the existence of a clearly identified body in charge of enforcing the law, and if

Data sources for monitoring tobacco control policies

possible, the budget or staff of that specific agency or unit, if it exists.

Whichever approach is used, a monitoring system should be assessed on its explanation of the measure of enforcement used. The choice of approach and method must thus be explicit. If it uses a survey, a panel of experts, or any other investigation method to determine the actual impact of a policy, this method must be described in detail so that the reader can clearly understand its strengths and limitations.

Source of the primary data

The scope and characterization of policy interventions described above are key to assessing the relevance of the contents of an information system. However, the crucial element in evaluating the quality of the information it provides is the assessment of the primary source of data. Written laws and regulations are the usual source of primary data for policy interventions. Monitoring systems should make all legal documents available for users to consult when in doubt (online if possible), so that the reader can see what relevant information was collected.

However, assessing the existence of some policy interventions cannot be done by looking at the written regulations. This is typically the case of treatment and education efforts. The presence of an easily reachable quitline, for example, requires a measure of actual existence and use. Observing and characterizing these policy interventions must often rely on surveys, ad hoc metrics, qualitative measures, and expert judgment. Moreover, it is very difficult to use a method that is suitable for all national contexts; hence, the importance of describing and justifying methods used.

Geographical/jurisdictional coverage

An ideal monitoring system should provide data on policy interventions in all countries of the world, and in all relevant subnational jurisdictions within each country. Worldwide geographical coverage comes at a cost; a balance must be struck between coverage and thoroughness. Not only can resources prove to be a constraint, but the wider the geographical coverage, the more difficult it becomes to make the data comparable, and the less uniform relevant policy scope tends to be. The goal of the monitoring system must thus be carefully considered before deciding what the best geographical coverage is.

In general, global coverage should be the main goal, with very clear questions and definitions and thought given to specific regional issues. Given the broad diversity in national contexts, this type of exercise should also be decentralized; hence, the necessity for a wide, yet highly coordinated, network in order to make the data comparable. Such focus, however, should not preclude the existence of regional variations over and above a common core set of questions, in order to increase flexibility of the exercise and country level relevance and buy-in.

The coverage of specific subnational jurisdictions follows the same principle. In the countries where this is relevant, inclusion is an absolute priority. In Canada, for example, very stringent smoke-free laws are enforced at the provincial level, and excluding provinces would result in faulty answers. Yet, there are only a few cases where inclusion of subnational jurisdictions is essential, and once more, local knowledge on the existence and relevance of these policies is critical. Should municipal by-laws be included, for example? What if a city comprises a significant minority or even a majority of the population and has such by-laws? Given the complexity of some political systems and jurisdictions, this will typically require local consultation. These questions can only be resolved on a case-by-case basis, hence the necessity of the monitoring system to outline clear guidelines for inclusion/exclusion. Among the guidelines are the stability of these institutions and laws, number of people affected by the laws, their share in the national population, strong within-country variations, etc.

Timeliness and frequency of data collection

Given the pace of change in the field of tobacco control, an ideal monitoring system should be live, that is, updated as changes occur. Live systems demand the

existence of a stable tobacco control country level network and a central coordination mechanism. Short of that standard, and in the absence of a stable network, the frequency of updates should mainly depend on budgetary issues, with a careful balance to be struck between the frequency of updates and budgetary sustainability. In all cases, the data should not be more than one or two years old, or the time it takes for these policies to significantly affect prevalence.

Change of policies over time

Old data should also be kept and made available, so that researchers can track the evolution of policy in an attempt to link it to prevalence. Old laws, date of changes in the law, date of changes in the implementation of the law, etc., are all very important for monitoring systems whose aim is to track the evolution of policy, and not just current policy, if we are to assess these measures.

Description and assessment of current data collection systems

Only two global tobacco control monitoring systems are presently operational: the WHO Global Tobacco Control Report (GTCR) and the reporting instrument of the Conference of the Parties (COP) to the WHO FCTC. The GTCR is based on the previous work of the National Tobacco Information Online System (NATIONS) and on still existing WHO regional databases. Described below are the reporting instruments of the WHO FCTC, the precursors of the GTCR, and the GTCR itself.

The reporting instrument of the Conference of Parties to the WHO FCTC:

The WHO Framework Convention on Tobacco Control (WHO FCTC) is the first treaty negotiated under the auspices of the WHO. It was adopted unanimously at the 56th World Health Assembly, in May 2003. Its provisions obligate only parties that have ratified the treaty, which as of September 2008 were 160 WHO member states. An important provision of the WHO FCTC is that each Party is obligated to submit periodic reports on its implementation of the Convention, in accordance with Article 21. To this end, the first meeting of the COP in 2006 provisionally adopted a reporting system whose objective is to understand and learn from the various experiences of parties in implementing the WHO FCTC. Questions in the reporting instrument are clustered into three groups. Only Group 1 questions have been designed and applied by countries reporting to the second meeting of the COP in 2007 [the third meeting of the COP in November 2008 approved changes to Group 1 questions].

Scope and characterization of interventions:

Given the need to report on the wide range of obligations contained in the WHO FCTC, the policy scope of the COP reporting instrument is very large, but does not directly prioritize policies in terms of effectiveness. This instrument contains "Group 1" questions, which are wide in scope and range from tobacco use prevalence to measures taken to curb illicit trade, as well as education and public awareness programmes. Core Group 1 questions require information about tobacco use, licit supply of tobacco, duty-free sales volume, price and tax measures to reduce demand for tobacco, regulation of tobacco product disclosures, illicit trade in tobacco products measures, seizures of illicit tobacco, education, communication, training and public awareness activities, measures on sales to and by minors, liability measures, management of tobacco dependence and cessation services, measures to support alternatives to tobacco growing, research, surveillance and exchange of information, programmes and plans, national coordinating mechanisms, and technical and financial assistance provided and received.

The data is collected at the country level, and its purpose is not to provide a uniform framework for comparison, but rather a way of observing the progress of the implementation of the treaty obligations within each country. Therefore the possibility of comparing answers across countries is extremely limited, although the questions on legislative measures are in general quite detailed.

Enforcement:

There are no enforcement measures considered in the COP reporting instrument.

Data sources:

The information is self-reported by governments, which are required to provide the supporting legislative documents. However, there is no external validation planned. The absence of any formal standardization process, beyond the instructions of the reporting instrument, might mean that the user should go back to supporting documents in a systematic fashion. This is especially the case for the questions regarding legislation, where countries are asked if they have "adopted and implemented legislative, executive, administrative, or other measures" on specific policies whose level of implementation is sometimes quite vague (e.g. smoke-free environments are defined as "full," "partial," or "none", without any specific definitions of these terms).

Geographical coverage:

The geographical coverage of the reporting instrument is limited to the signatory parties, although the number of parties increases regularly and might finally include all WHO member states. The issue of subnational legislation is also absent from the questionnaire.

Timeliness, frequency of data collection, and trend:

Group 1 questions must be answered within two years of entry into force of the Convention for that Party, and then every three years after that. Group 2 and 3 questions must be reported within five and eight years of entry into force of the Convention for that country, respectively. [Group 2 questions were approved in November 2008. However, Group 3 questions have not been designed yet]. By the end of 2008, 140 parties will all have completed the Group 1 questions for the first time.

The main goal of the reporting instrument is to report on treaty implementation and not on tracking the evolution of tobacco control. In this respect, following the trend of legislative measures is not an objective of the COP reporting instrument. The periodic reports submitted by parties, however, may allow some trend analysis within each country.

In summary, the WHO FCTC reporting system in its current form is not designed to be a thorough, scientifically-oriented, annual monitoring programme. It has serious limitations on the immediate use of its data for monitoring policy interventions and comparing legislative measures across countries. Once the data are available publicly, however, independent researchers can undertake the type of work they choose to, but it will be based on their own interpretation of the data and their own assumption on the comparability of the information, since there is no detailed protocol to make the data comparable.

The reporting instrument, however, might evolve towards a monitoring system. An independent assessment of the current system is scheduled for 2009; the COP will further consider the matter of reporting in 2010. Decisions of the second COP, which met in Bangkok in the summer of 2007, already point to the need for increased standardisation through an improved questionnaire, as well as through the long-term evolution of the questionnaire with Group 2 and Group 3 questions.

The Global Tobacco Control Report (GTCR) precursors: NATIONS and the WHO regional databases

Although NATIONS (http://apps.) is not updated anymore, it was the first global monitoring system for tobacco control and played a historical role for later efforts. NATIONS was a collaborative effort by the United States Centers for Disease Control and Prevention (CDC) and the WHO, and also involved the American Cancer Society (ACS) and the World Bank (WB). Its aim was to monitor tobacco use and control, based on data gathered from several sources that stretched from governmental and international agencies to commercial entities, scientific literature, etc. A lot of the data was originally collected by the ACS and the

WHO to prepare the monograph Tobacco Control Country Profiles, which was first published in 2000, followed by a second edition in 2003 (Shafey et al., editors). After the adoption of the WHO FCTC by WHO Member States in May 2003, the data and further responsibility for collection efforts were transferred to the WHO, which undertook the creation of regional databases through its regional offices.

The data gathering process also underwent important changes. Data collection was decentralized to the regional level in order to increase proximity to the countries and obtain more accurate information on tobacco control measures and their implementation. The data being collected through the WHO regional offices became official, and had to be validated by national authorities before it could be published. The WHO Regional Office for Europe (EURO) (bacco/) has so far provided the most comprehensive data collection effort and has the most complete regional dataset of all regional offices. This database is used in turn to support the European Tobacco Control Report, a publication with detailed information on the state of tobacco control in the 52 countries of EURO (http://www.euro.who.int/InformationSources/Publications/Catalogue/20070226_1). What follows is a description of the EURO database.

Scope and characterization of interventions:

The scope of policies covered in the EURO database is ample (Table 4.2). As for NATIONS, the data covers more than tobacco control (e.g. prevalence, mortality, economics of tobacco); it additionally covers policies, such as taxation and cessation.

The criteria for guiding the choice of policies are not explicitly provided, and the dataset includes tobacco control measures of very diverse cost-efficiency without characterizing them. The protocol and definitions to make the data comparable are also absent from the publicly available information on the website. This might lead to some comparability issues. In the case of smoke-free environments, for example, the situation of a country is classified into one of three categories: smoking bans, restrictions, and voluntary agreements. The first problem is that smoking bans in the EURO database are not really complete and might allow for some exceptions. The second problem is that voluntary agreements are not described in enough detail to ascertain whether, independently of regulation by law, they prescribe 100% smoke-free environments or not. The same issue applies to all other tobacco control measures, where a clearer and more explicit protocol would be needed. The description of each tobacco control measure, and its characterization in terms of "Yes" and "No", are much more detailed than in NATIONS, thus leaving less room for interpretation by the data collector. The format of some of the data could also be improved, such as the tax data that provides not the rates, but the share of the price of a pack that goes into different types of taxes; the underlying tax rates and the methodology to convert them into shares of the price would be useful. However, most legal documents that were relied on are available on the website (except for taxes), thus mitigating that problem.

Enforcement:

The enforcement is assessed by the opinion of the focal point¹ collected by completion of a questionnaire. A score of 1 to 5 is provided for the enforcement of smoke-free legislation, bans on direct and indirect advertising, product regulation, and sales to minors. However, the assessment is not published on the website.

¹ A National Focal Point (NFP) is a national centre, designated by each State Party, which is accessible at all times for communications with WHO International Health Regulation Contact Points. While the exact structure and organisation of the NFP are left to the State, IHR (2005) define the role, functions and operational requirements for real time management of information and for efficient communications. It is foreseen that NFPs will be offices rather than individuals.
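
The remark above about tax data, that users would benefit from the underlying rates and the method used to convert them into a share of the pack price, can be illustrated with a minimal sketch. It is not the EURO database's methodology, which is not published. The function name `tax_share_of_price` and the tax structure it assumes (a specific excise per pack, an ad valorem excise levied on the retail price, and a VAT levied on the VAT-exclusive price) are hypothetical and chosen only for illustration.

```python
def tax_share_of_price(retail_price, specific_excise, ad_valorem_rate, vat_rate):
    """Return each tax's share of the retail price of one pack.

    Illustrative assumptions (not taken from the EURO database):
    - specific_excise is a fixed amount per pack, in the currency of retail_price;
    - ad_valorem_rate is applied to the final retail price;
    - vat_rate is applied to the VAT-exclusive price, so VAT makes up
      vat_rate / (1 + vat_rate) of the retail price.
    """
    if retail_price <= 0:
        raise ValueError("retail_price must be positive")
    shares = {
        "specific_excise": specific_excise / retail_price,
        "ad_valorem_excise": ad_valorem_rate,
        "vat": vat_rate / (1.0 + vat_rate),
    }
    # Total is computed from the three components before being stored.
    shares["total_tax"] = sum(shares.values())
    return shares


if __name__ == "__main__":
    # Hypothetical pack: price 5.00, specific excise 1.50, 20% ad valorem, 20% VAT.
    for name, share in tax_share_of_price(5.00, 1.50, 0.20, 0.20).items():
        print(f"{name}: {share:.1%}")
```

Publishing both the rates and a conversion of this kind would let a reader recompute the shares when prices change, which a table of shares alone does not allow.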

Tobacco Use
    Smoking prevalence in adults
    Smoking prevalence in young people

Economics
    Cigarette consumption
    Cost (in money and labour) of tobacco products
    Tobacco tax revenues from excise duties
    Duty stamps, earmarking of tobacco taxes
    Government ownership and financial incentives
    Studies of smuggling, economic and social costs, and litigation
    Annual price variations of tobacco products in real terms (%)
    Structure of taxation of tobacco products

Laws and Regulations
    Direct advertising of tobacco products
    Indirect advertising of tobacco products
    Distribution of tobacco products through various outlets
    Regulations for sale of tobacco products
    Smoke-free areas
    Smoke-free public transport
    Health warnings
    Measurement, regulation and disclosure of tobacco product ingredients and smoke constituents
    Treatment of dependence:
        - Interventions to support smoking cessation
        - Quitlines
        - Availability of smoking cessation treatment
        - Training for health professionals
    General policy: different sub-national laws or regulations
    Public information and advocacy
    Participation in WHO networks

Health Consequences and Costs
    Average number of years lost per death from smoking (years)
    Deaths attributed to smoking in all ages
    Deaths attributed to smoking in middle age (35-69 years)
    Proportion of deaths attributed to smoking in all ages (%)
    Proportion of deaths attributed to smoking in middle age (35-69 years) (%)
    Standardised death rate from trachea, bronchus, or lung cancer (per 100 000)

EURO: WHO Regional Office for Europe

Table 4.2 Scope of Policies Covered by the EURO Tobacco Control Regional Database

Data sources:

This database relies on a questionnaire that was distributed to national level tobacco control focal points, who often work from within their national Ministry of Health, thus ensuring accuracy and country endorsement. The data source is thus highly credible, but this process is not described on the website, so the reader cannot assess the validity of the information. Main sources are legislative measures to control tobacco, although other topics are also monitored, such as prevalence and epidemiological impact of tobacco consumption, as well as tobacco economics.

Geographical coverage:

The EURO database covers all European countries. Although data from subnational jurisdictions is not available, its existence is assessed for eight categories of legislative measures.

Timeliness, frequency of data collection, and trend:

The data collection involves a lot of back and forth between the countries and the regional office, in order to clarify and standardize answers, as well as ensure country buy-in. This, however, creates long delays between initiation and conclusion of the data collection effort. The last round of data collection, for example, was initiated in June 2005, but was not completed until the fall of 2006, which allowed for potential inaccuracies for countries that legislated during this period. The process of updating the data is not specified and there is no built-in regular update mechanism.

Situations in other regions:

Not all regional offices had the means to set up systems as complete as that of EURO (_data/regional_databases/en/index.html). In the Africa Regional Office (AFRO), the system does not exist and the outdated NATIONS represents the main source of data. In the Eastern Mediterranean Region (EMRO; tryProfile-Part6.htm) and the South East Asia Region (SEARO), the data was compiled in 2000-2002 and has been updated in 2008. The policy scope is much narrower than in EURO, reasons for selecting the indicators are not specified beyond being relevant and readily available, and geographical coverage could be improved. As for other regions, the protocol or criteria for interpreting laws are not explicitly described, thus raising issues of comparability between countries, but mostly between regions (some EMRO legal texts are available online). In the Pan American Health Organization (PAHO; osHome.asp) and the WHO Western Pacific Region (WPRO), the situation is somewhat in between that of EURO and the regions with least policy database documentation, and the datasets cover mainly the information available in legal texts for a subset of countries. Criteria for assessing this information are much more detailed, with very specific questions leaving little room for interpretation.

Overall, the WHO regional databases represented until now the best existing global data source on tobacco control policies. However, they suffer from many issues, of which timeliness and lack of enforcement data are the most immediately obvious ones. Most important is that the tobacco control indicators are not the same between regions, and are not defined with the same criteria (besides the fact that these criteria are never fully described). This raises serious issues of overall comparability.

The Global Tobacco Control Report (GTCR) system

The Global Tobacco Control Report (GTCR), released in early 2008, is the central instrument of a worldwide tobacco control monitoring effort by WHO (er/en/). The objective of the report is to monitor a core of essential tobacco control policy initiatives, and to report on their implementation on an annual basis. The GTCR aims to provide a highly structured and focused framework through which progress towards the implementation of defined, concrete tobacco control measures at the country level will be

compared in a standardised manner across countries. Essential indicators are measured through a short questionnaire that is completed by country level focal points.

Scope and characterisation of interventions:

The GTCR focuses on a few policies that were selected based on their efficiency and cost-efficiency. The questionnaire requires information on national prevalence of daily tobacco use; the share of tobacco taxes in the price of a pack; the existence of visible health warnings occupying at least 30% of the package of tobacco products; complete advertising, marketing, and promotion bans of tobacco products by type of media; complete smoking bans by sector; the availability of tobacco dependence treatment; and existence of national tobacco control policy objectives. Policies such as awareness campaigns or anti-smuggling initiatives are not considered. Answers to this annual questionnaire will be analysed in the GTCR, which will use gaps between optimal and existing policies revealed in these data and analyses to develop a strong advocacy message. Table 4.3 provides the scope of policies covered by the GTCR.

Enforcement:

The GTCR uses the following protocol to assess the enforcement of smoke-free environments, as well as direct and indirect advertising bans for each country (Table 4.4). The assessment of enforcement is integrated globally through an enforcement score, where a highly enforced policy is worth two points, a moderately enforced policy one point, and a minimally enforced policy no points, hence a maximum score of 10 points given the five experts. This system, although very simple, works quite well with the majority of countries with legislation providing the assessment and enforcement scores conforming to expectations. Moreover, the scores are credible at the global level, with a wide dispersion of values, as well as within countries, with very few polarized expert assessments and yet very few consensual situations. The score, however, suffers from

Tobacco use
    Internationally comparable smoking prevalence in adults

Economics
    Structure of taxation of tobacco products
    Earmarking of tobacco taxes
    Tobacco tax revenues from excise duties
    Price of main cigarette brands

Laws and Regulations
    Direct advertising of tobacco products
    Indirect advertising of tobacco products
    Smoke-free areas
    Health warnings
    Treatment of dependence:
        - Interventions to support smoking cessation
        - Quitlines
        - Availability of smoking cessation treatment
    General policy: different sub-national laws or regulations

GTCR: Global Tobacco Control Report

Table 4.3 Scope of Policies Covered by the GTCR

1 Choose five key (non-paid) experts of different institutions and professions. Preferably select individuals with the
following background: (1) one health professional with a strong background in tobacco control, (2) one academic
who specializes in tobacco control, (3) the head of a prominent non-governmental organisation in tobacco control,
(4) the government official responsible for tobacco control activities, (5) the WHO focal point for tobacco control (who
usually is also filling out the questionnaire).

2 Consult the experts separately. In many countries, tobacco control networks are very small and the same individuals
might wear many hats. For example, the chief tobacco control officers in the government are often dedicated to the
point of also being the head of leading tobacco control non-governmental organisations. All such experts are likely
to know each other and might not want to openly disagree or share the same limited experience, especially if this
disagreement might have some impact on issues not related to monitoring.

3 Ask each expert to score, in writing, enforcement for three broad categories of tobacco control measures on a scale
of 1 to 3 (minimally, moderately or highly enforced: (1) smoke-free environments, (2) direct advertising, (3) indirect
advertising (promotion and sponsorship).

4 Review the expert's opinion at the national level. The GTCR national focal point: review these answers and clarify
any pending issue or obtain more information regarding widely different answers.

5 Review national findings at the regional level. Consistency and comparability of the national answers could then be
compared at the regional level and scores revised if needed.

6 Integrate results globally.

GTCR: Global Tobacco Control Report

Table 4.4 GTCR Protocol to Assess In-Country Enforcement of Smoke-Free Environments, and Direct
and Indirect Advertising Bans
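
Steps 3 and 4 of the protocol in Table 4.4 can be illustrated with a minimal sketch. It is not the GTCR's own
procedure: the function name, the expert labels, the example scores, and the `max_spread` threshold used to flag
"widely different answers" are all assumptions made for illustration. The table itself only specifies the 1 to 3
scale and the review by the national focal point.

```python
from statistics import median

# Scale from Table 4.4: 1 = minimally, 2 = moderately, 3 = highly enforced.
VALID_SCORES = {1, 2, 3}


def summarise_scores(scores, max_spread=1):
    """Summarise independent expert scores for one policy category.

    `scores` maps a (hypothetical) expert label to a written score.
    `max_spread` is an assumed threshold: if the highest and lowest
    scores differ by more than this, the answers are flagged for
    follow-up by the national focal point.
    """
    if not scores:
        raise ValueError("at least one expert score is required")
    for expert, score in scores.items():
        if score not in VALID_SCORES:
            raise ValueError(f"{expert}: score must be 1, 2 or 3, got {score!r}")

    values = list(scores.values())
    spread = max(values) - min(values)
    return {
        "median": median(values),
        "spread": spread,
        "needs_follow_up": spread > max_spread,
    }


# Hypothetical scores for the "smoke-free environments" category.
smoke_free = {
    "health_professional": 2,
    "academic": 1,
    "ngo_head": 3,
    "government_official": 3,
    "who_focal_point": 2,
}
summary = summarise_scores(smoke_free)
print(summary)
```

Using the median rather than the mean keeps the summary on the original ordinal scale when the number of experts is
odd; the spread flag is one simple way to decide which answers the focal point should query further.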

the pitfalls of such measures described earlier, and the data collectors are aware of some countries
where there are very close links between the experts. The system, however, is successful enough to
serve as a basis for the next round of data collection.

Data sources:

In most cases, the source of primary data is legislation as assessed by country-level informants.
Informants also have to provide supporting information for these answers in the form of legal texts
and official policy guidelines, although supporting documents are generally incomplete. This
information is then assessed at the regional level by a regional data collector and then again at
the worldwide level. For most countries, this process results in a large flow of communications
where questionnaire answers are questioned, answered again, documented, and finally validated by
all. The validation process thus spreads throughout data collection and is completed by a final
country validation of the data. This validation includes official signing off on the questionnaire
answers by an authorized civil servant.² Additional primary data sources are the actual knowledge of
the country informant on local policies regarding the treatment of tobacco cessation. For example,
the informant has to collect information on the national availability of quitlines, as well as
counselling services for cessation. This information is not backed by supporting documents unless
policy papers, or even leaflet advertisements for these services, are available.

² This validation process was not followed for the European region in the first release, since the
source of the data was the already validated data used for the European Tobacco Control Report, in
addition to minor updates.

Some questionnaire items proved difficult to respond to. The simplicity of the questionnaire could
not capture well the complexity of national tax data. Government spending on tobacco control also
proved an elusive piece of information, because such expenditures are not clearly labelled and are
often scattered across many budget items. It is therefore likely that future editions of the GTCR
will need to modify the questionnaire to better capture very complex information. Finally, it proved
easier to handle prevalence data through WHO's Global InfoBase than through prevalence-related
questions on the questionnaire, given the clear advantage and networks InfoBase developed over the
years.

Geographical coverage:

The geographical coverage is very wide, including all 193 WHO member states, although 21 countries,
mostly from the Western Pacific and Americas regions, did not participate in the first release. At
this stage, the GTCR questionnaire does not collect information on subnational jurisdictions, but
does ask questions to certify the existence of such measures, in order to consider the feasibility
of collecting these measures in the next release.

Timeliness, frequency of data collection, and trend:

The GTCR will be released annually, even if annual differences are minimal. Some changes in the data
might occur despite the absence of any new measures, since a much larger team will be in charge of
assessing questionnaire answers and comparing them with legislation; hence, possible revisions and
refinements. The GTCR will keep an annual record of the situation in each country, which will permit
trend analysis.

Reflections on the future of tobacco control monitoring systems

None of the existing monitoring systems fully meets all the criteria developed in the second part of
this section, and thus it remains difficult to answer the questions outlined in the introduction
without undertaking a detailed country analysis and relying on experts' opinion (Joossens & Raw,
2006). In other words, reliable, comparable, comprehensive, and ready-to-use time series on the
prevalence of tobacco use and tobacco control measures do not exist and cannot be related to each
other. This means that given the current stage of existing data, it remains challenging to properly
and systematically assess all aspects of tobacco control as a public policy intervention at the
international level, although the GTCR offers a good basis to do so if developed properly.

A new context

The environment of tobacco control has evolved very rapidly over the past few years and many
initiatives either directly promote policy monitoring systems or create a strong demand for them. A
major change has been the reversal of the tide in most high-income countries, with decreasing
prevalence and number of smokers. However, despite prevalence rates that are also often decreasing
in low- and middle-income countries, higher demographic growth will inevitably lead to deaths on a
massive scale. Tobacco companies are also instituting shifts in their operations that are geared to
these new markets. For this reason, tobacco control needs to quickly implement the same shift and
undertake massive efforts in low- and middle-income countries.

Many factors could help this shift. The most important factor, and one that is often forgotten, is
that tobacco control is now a tried and tested policy, with a tried and tested network of dedicated
individuals and institutions. Tobacco control advocates can build on a lot of existing knowledge,
experience, and successes, as well as failures. Awareness is also much higher, as not even the
tobacco industry can argue anymore that tobacco is not bad for health.

The WHO FCTC is also a major structuring element for tobacco control. By signing it, a country de
facto accepts its premises and commits itself in front of the world
community to enact very specific tobacco control measures, and report on the implementation of their
international treaty obligations. By virtue of being a treaty, the WHO FCTC makes tobacco control a
concern that is much broader than health: it becomes an altogether international affairs issue;
hence, additional pressure through linkage with other "high politics" issues.

Finally, new private and highly significant initiatives, such as the large donation by New York City
Mayor Michael Bloomberg, add fuel and momentum to tobacco control. These initiatives not only help
strengthen existing efforts, such as the WHO FCTC, but also help empower tobacco control advocates
who can then set the standards at a higher level and convince governments to follow suit. This new
focus on tobacco control is thus a fantastic opportunity to start working on monitoring systems, as
it creates a new demand for such information. It is time to rethink tobacco control based on past
experience and highlight some of the improvements that should be implemented. These obviously have
to do with the nature and analysis of the data, but mostly with the capacity to gather them.

Capacity for relevant data collection

Tobacco control is also a field that has greatly evolved with our knowledge of tobacco and of its
social determinants and impact. Secondhand smoke, for example, was not a major concern for public
policy before research clearly linked it to specific health conditions (US Department of Health and
Human Services, 1986). Realizing that youth prevalence is a major explanatory factor for future
adult prevalence has meant that tobacco control could adopt much more aggressive policies towards
this specific market. Knowledge that some of the harm caused by tobacco to the cardiovascular system
can be reversed within a few years of cessation has given a tremendous boost to cessation policies.
The tobacco industry's reaction to original advertising bans has prompted a policy reaction that now
stretches to promotion and sponsorship, etc. Linking smoking further to a general discomfort and
economic costs for nonsmokers, and realizing that smoking bans were also a very efficient way to
help addicted smokers quit, helped justify further tobacco control in the field of secondhand smoke.
The health impact on nonsmokers, however, remains a crucial underpinning for public intervention in
this field.

Monitoring systems for tobacco control must thus be flexible enough to evolve and keep up with the
changes in overall policy objectives, tobacco control environment, and consumption patterns.
Monitoring systems for tobacco control are consequently much more than just gathering data. They
involve a complex process with clear objectives and constant reassessment of policy means. The most
striking implication of this policy process is the ensuing need for a dedicated network of
individuals, institutions, and ongoing discussions regarding both the evolution and continuity of
the system, as well as the nature and usefulness of the collected data. Health practitioners,
economists, epidemiologists, data managers and collectors, government officials, and many others
need a very high level of collaboration in order to set up and maintain a good tobacco control
monitoring system. A prerequisite to any good monitoring system is, therefore, a good organisation,
which points directly to the most important ingredients: dedicated work with regular, predictable,
and stable funding.

Referring back to the questions outlined in the first paragraph of the introduction: why can't we
better assess the impact of specific tobacco control policy interventions in terms of efficiency and
efficacy? One important factor is the capacity to build and sustain policy monitoring systems. In
fact, many initiatives were started and left incomplete, mainly because of irregular or insufficient
funding (perhaps as a reflection of lack of political will). As this section made clear, a
high-quality international monitoring system is first and foremost a good and stable network of
competent and highly coordinated individuals and institutions. Such networks are difficult to build
and maintain. In addition, close supervision of country-level activities is impossible to perform
from the outside, and this necessitates close involvement of local authorities and staff, hence the
absolute necessity of country buy-in.

This means that the most pressing demand from countries is in capacity building to gather and
analyse data. Indeed, based on past experiences, building a sustainable tobacco control monitoring
system is impossible without a prior effort to build a solid network of competent individuals and
institutions, and a national-level capacity that can sustain this system. Previous data collection
efforts were mainly donor-driven. A network of informants was set up from various sources
(ministries of health, non-governmental organisations, etc.), questionnaires were answered, stipends
paid, and when funds dried up, this embryo of a network was unfortunately left to disintegrate.
These data collection efforts provided highly valuable information, and individuals who worked on
them were pioneers in tobacco control, but unfortunately a lot of the data cannot be used now.

The incredible opportunity that now exists, thanks to the WHO FCTC, is a global demand for capacity
building, as countries will start to struggle to meet international obligations. Answering this
demand quickly is crucial to build a comprehensive international network for tobacco control. This
network is in turn a necessary condition to the emergence of a global tobacco control monitoring
system. It follows that in this new international context, capacity building should come first with
data collection undertaken as an integral part of it. This would ensure country buy-in, help keep
competent data collectors in the network, and answer the needs of countries regarding the WHO FCTC.
Most importantly, this would ensure that the data collection system does not vanish after a round of
data collection, as it will be linked to the overall policy needs of the countries, making these
efforts relevant not only for international users, but also for local users. This network also needs
to be expanded outside of the traditional country-level individuals from ministries of health, to
include officials from external affairs and economic ministries, as made possible, if not necessary,
by the WHO FCTC.

Towards one effective policy data collection system

A monitoring system that is solidly anchored in a network to be assembled by a significant capacity
building effort is a necessary condition for success, but surely not a sufficient condition;
dispersing efforts among several systems should be avoided. Countries should not be burdened by
excessive data collection, at least with regard to tobacco control. This means, for example,
completing the integration of the WHO regional databases and GTCR. It also means that over the next
few years, the relationship between this data collection system and the WHO FCTC should be carefully
assessed. Although the WHO FCTC does not yet cover all countries and does not gather data with the
aim of comparing them (at least for now), there is nevertheless a significant overlap between the
COP reporting instrument and GTCR. The closer these processes are, the easier data collection
becomes, and the more efficient the entire system will be.

Conclusions

This section describes the few existing data collection initiatives on policy interventions in the
field of tobacco control. Only the WHO GTCR system is, at this moment, a repository of good quality
information on a wide range of tobacco control policy interventions for the large majority of
countries. It is also the only one with sustainable funding, and therefore the most promising
initiative to support prospective national policy changes over time. Nevertheless, the GTCR only
focuses on policy domains that have been proven to be effective in reducing tobacco use. Its main
limitation is that it does not yet contain information about subnational policies. All policy
researchers studying policy differences between countries are encouraged to use the WHO GTCR system
in their investigations.

4.2 Using production, trade, and sales data in tobacco control

Introduction

Article 20 of the Framework Convention on Tobacco Control (FCTC) calls for parties to:

(a) establish progressively a national system for the epidemiological surveillance of tobacco
    consumption and related social, economic and health indicators;
(b) cooperate with competent international and regional intergovernmental organizations and other
    bodies, including governmental and nongovernmental agencies, in regional and global tobacco
    surveillance and exchange of information on the indicators specified in paragraph 3(a) of this
    Article (WHO, 2003).

One can envisage that as the FCTC is progressively implemented in a substantial number of countries,
a comprehensive and sustainable surveillance system will emerge. Such a system would allow advocates
and researchers a one-stop source of information where comparable key tobacco control statistics,
such as mortality attributable to tobacco use, prevalence of tobacco use, and consumption of and
trade in manufactured tobacco products are accessible. Unfortunately, such a system is not yet
available. Tobacco control researchers and advocates must find important data, such as cross-country
estimates of production, trade, and tobacco consumption from a variety of sources.

The objectives of this section are threefold: to discuss the potential usefulness of production and
trade data in tobacco control, with particular attention to the advantages and disadvantages of
using these data to measure tobacco consumption; to examine the use of export and import statistics
for measuring the illegal cigarette trade; and to review the availability and quality of existing
data.

Trade and production data in tobacco control

Data on trade and production of manufactured tobacco products can be obtained from national
statistical agencies and international databases with relative ease and provide valuable information
to tobacco control advocates. First, production data can provide a good indicator of the importance
of the national tobacco industry at both the national and international levels and, in the absence
of trade, production data can provide an accurate measure of the national tobacco market. Secondly,
data on the import and export of manufactured tobacco products can provide valuable information on
important, key players in the national tobacco control debate. For example, a close examination of
trade patterns in tobacco products can reveal the precise origin of cigarette imports; similarly, it
can identify key export markets. Such information can be invaluable in identifying important players
in the national tobacco control arena. Finally, production figures can be combined with import and
export figures, to provide a measure of national consumption of manufactured tobacco products that
may be useful in attempting to quantify the magnitude of the smuggling market. Sales data, based on
tax records, can also be used as an estimate of the consumption of various tobacco products.

Using aggregate data to measure cigarette consumption: advantages and disadvantages

Estimates of consumption and prevalence of use of tobacco products can originate from various types
of data. They can be based on (self-reported) tobacco use prevalence surveys, which provide
information on the proportion of tobacco users in a given population.
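
The combination of production, import, and export figures into a measure of national consumption can be
sketched in a few lines. This is an illustrative sketch only: the class and function names and all figures are
hypothetical, the division by population aged 15 and over follows the per capita adjustment this section
describes, and the default of one gram per cigarette is the conventional conversion assumption that the section
itself cautions may not hold.

```python
from dataclasses import dataclass


@dataclass
class CigaretteStats:
    """Annual aggregate figures for one country, in cigarettes (sticks)."""
    production: float
    imports: float
    exports: float
    population_15_plus: float


def tonnes_to_sticks(tonnes, grams_per_cigarette=1.0):
    """Convert trade figures reported in metric tons to cigarettes.

    One metric ton is 1,000,000 grams. The default of 1 gram per
    cigarette is a common simplifying assumption, not a measured value.
    """
    return tonnes * 1_000_000 / grams_per_cigarette


def apparent_consumption(stats):
    """Production plus imports minus exports."""
    return stats.production + stats.imports - stats.exports


def per_capita_consumption(stats):
    """Apparent consumption divided by the population aged 15 and over."""
    return apparent_consumption(stats) / stats.population_15_plus


# Hypothetical country: figures are invented for illustration.
example = CigaretteStats(
    production=50_000_000_000,
    imports=tonnes_to_sticks(10_000),   # 10,000 tonnes reported by weight
    exports=15_000_000_000,
    population_15_plus=30_000_000,
)
print(apparent_consumption(example))     # cigarettes per year
print(per_capita_consumption(example))   # cigarettes per person aged 15+
```

Passing a different `grams_per_cigarette` value shows how sensitive the estimate is to the conversion factor when
trade is reported by weight and production in units.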


Prevalence data combined with tobacco use intensity data (e.g. number of cigarettes smoked per day)
can also yield total consumption estimates. Consumption can also be derived from aggregate
production and trade statistics. Production plus imports minus exports will yield apparent
consumption estimates. For example:

    cigarette consumption = cigarette production + cigarette imports - cigarette exports
    per capita cigarette consumption = cigarette consumption / (pop. 15+)

National cigarette sales data, based on sales or tax records, can also be an estimator of
consumption (Guindon & Boisclair, 2003).

Prevalence surveys can provide important insights into patterns of and changes in consumption
according to sex, age, income, and education (Warner, 1977). They also allow distinguishing between
a change in the number of smokers and changes in consumption per smoker. On the other hand,
consumption data (the number of cigarettes consumed) based on surveys can suffer from significant
underreporting (Warner, 1978; Jackson & Beaglehole, 1985; Hatziandreu et al., 1989; Foss et al.,
1998). Surveys generally provide valid estimates of prevalence (Velicer et al., 1992; Patrick et
al., 1994; Caraballo et al., 2001; Caraballo et al., 2004), suggesting that the number of cigarettes
smoked each day is underreported. In addition, many population-based surveys do not interview people
in the military, prison, and psychiatric institutions and thus will not assess use in populations
with fairly substantial smoking prevalence. Another potential limitation is the infrequent
availability of trend data. Finally, the subjective nature of surveys and differences in survey
methodology (questions, definitions, languages, etc.) also make comparison of estimates across
countries difficult.

Aggregate production and trade statistics are objective data that eliminate the underreporting
problem inherent in data based on subjective survey responses (Warner, 1977). These data are also
readily available across time and countries. This feature, as well as the availability of
centralized data sources using common methodologies, allows for good comparability. However, most of
these large-scale tobacco statistics are only available for manufactured cigarettes. Data from the
Global Youth Tobacco Survey (GYTS) indicate that more than 10% of students used tobacco products
other than cigarettes, with the rate being highest in the southeast Asia region and the eastern
Mediterranean region (Warren et al., 2006). Specific examples include: India, where tobacco
consumption is dominated by use of non-cigarette tobacco (bidis, leaf tobacco etc.), resulting in
cigarette consumption representing only 15% of total tobacco consumption (Rijo, 2005); and Thailand,
where high levels of use of hand-rolled tobacco have been reported (Sarntisart, 2003).

The major problem with aggregate data is perhaps that, unlike prevalence survey-based data, they
cannot be used for analyzing changes in sex, age, income, and education distribution, and they do
not permit a distinction between a change in the number of smokers and changes in consumption per
smoker (Warner, 1977). Other important problems include illicit trade in cigarettes and illegal
manufacturing and counterfeit trade, resulting in export and import data not being registered in
official figures, which may lead to under- or overestimating consumption of tobacco products (WHO,
1998a). The problem of stockpiling may also emerge, as not all cigarettes will be consumed in the
year they are produced or imported. If this stockpiling is significant it may bias consumption
estimates. However, it is doubtful that stockpiling will affect trends since it is not likely to
vary from year to year, although tobacco companies have been known to time cigarette stockpiling
against health measures so that they appear less effective (WHO, 1998a). Transient populations will
affect aggregate trade and production statistics to a varying degree. Finally, the question of
measurement units can yield diverging trends and biased point estimates. More specifically:

• Apparent consumption will underestimate true consumption in countries where tobacco products are
  illegally imported and consumed, while it will overestimate true consumption where tobacco
  products are illegally exported to another country.
• Trade and production data can be reported in weight or in physical units. In countries where
  cigarette weights have not remained constant over time, cigarette consumption expressed in units
  and in weight can show diverging trends. For example, Australian cigarettes became progressively
  lighter in the late 1980s. When expressed in grams per capita, cigarette consumption in Australia
  fell by 4.9% between 1986 and 1990, while it increased by 5% when expressed in units (Chapman,
  1992).
• Trade and production statistics for an individual country can also be reported in different units.
  For example, manufactured cigarette imports and exports are often reported in metric tons, while
  production is expressed in units. When this is the case, it is usually assumed in the calculations
  that one cigarette weighs one gram. But this assumption may not hold and thus bias consumption
  estimates. The direction of the bias will depend on two factors: the true conversion factor, and
  the respective size of imports and exports. For example, in a country where production statistics
  are expressed in units, trade statistics in metric tons, and one gram of cigarette equals one
  cigarette, true consumption will be overestimated if the country is a net importer of cigarettes,
  and underestimated if the country is a net exporter.
• Apparent consumption will overestimate true consumption in countries with large transient
  populations (for example tourists or military), and small indigenous populations, such as Malta
  and the Maldives.

In addition to the measurement issues described above, production and trade figures reported by
national statistical agencies may not accurately reflect true figures. There may be a time lag of
three to six months between recording export and import statistics. It may also be the case that
import statistics are recorded more rapidly and accurately because of more prevalent import duties
(as compared to export duties). Finally, there may be reporting errors at the national level, and
between the national statistical agencies, international agencies, and organisations that report
cross-country statistics.

Production data can be used at the global level as a proxy for world consumption. It will be a poor
proxy for consumption in most countries, but as world exports must equal world imports, aggregating
cigarette production for all countries would do away with the problems associated with smuggling and
attenuate the problems associated with measurement units. Unfortunately, because of unequal data
availability through time, adding all production data points in a particular year can lead to
underestimation.

Sales data based on tax records are also aggregate data, and similarly present the same general
advantages and disadvantages as those described above for production and trade statistics. It should
be noted, however, that sales data are not as readily available across countries and are not
available in centralised databases. On the other hand, they do not suffer from the limitations
associated with measuring and reporting units or stockpiling. They also present the advantage
(unlike estimates obtained from trade and production statistics) of yielding consumption estimates
that exclude duty-free sales, most of which are to non-residents and are not consumed in the
country. Finally, sales data may be segmented by tobacco products (e.g. cigarettes, cigars, etc.),
brands and brand variant (e.g. length-type, and descriptor-type, such as light or mild), and thus
yield information on market shares by individual brands, brand family, and brand variant.

Population adjustments:

Total cigarette consumption can be useful to gauge the size of a tobacco market, but it does not
allow for comparison across time and across countries. To achieve the latter, total cigarette
consumption or sales can be weighted by population in order to provide an indicator of individual
consumption, usually by dividing total cigarette consumption by the population aged 15 years and
above. The age group 0-14 is normally omitted because of its limited contribution to tobacco use
(Chapman, 1992). However, differences between countries in demographic distribution and tobacco use
prevalence in the 10-20 age group can be important and diminish comparability.

The use of export and import statistics for measuring the illegal cigarette trade

The gap between global exports and global imports is often used to make estimates of the overall
size of cigarette smuggling. World cigarette production is known fairly accurately, and, since there
are not large numbers of cigarettes in storage because they do not keep for long, world production
is very close to world consumption. Global imports should thus be close to exports, after allowing
for legitimate trade usually excluded from national statistics. (These are principally imports for
duty-free sales to travellers, diplomatic staff, and military establishments.)

Imports, however, have long been lower than exports to an extent that cannot be explained by
legitimate duty-free sales. Even the lag time of three to six months between recording export and
import statistics cannot explain the differences between them, which have been high for years.
Worldwide, United States Department of Agriculture (USDA) data showed that recorded cigarette
exports exceeded recorded imports by more than 300 billion each year in the period 1995-2000. The
only plausible explanation for these missing cigarettes is smuggling (Joossens & Raw, 1995; Joossens
& Raw, 1998).

Some cautious interpretation of these results is advisable (Merriman et al., 2000). Many factors may
explain a discrepancy between recorded exports and imports. An analysis of data from the United
Nations Commodity Trade Statistics Database (UN Comtrade) shows large discrepancies between total
reported imports and exports of many goods. However, researchers admit that cigarettes are different
from other commodities, as cigarette exports consistently greatly exceed imports. It is concluded
that the most reasonable explanation for the observed data is that a large and growing fraction of
international trade is smuggled (Merriman et al., 2000).

USDA statistics for the period 2001-2004 showed that the gap between recorded cigarette imports and
exports had been reduced to around 150 billion cigarettes annually. There may be different
explanations for these reductions. USDA data are not always reliable at the national or worldwide
level. In 2002, the USDA magazine Tobacco: World Markets and Trade published data which showed that
the gap between exports and imports was 276 billion cigarettes in 2001. Two years later, the same
magazine released figures which showed that the gap had been reduced to 126 billion cigarettes in
2001. Caution with the analysis of USDA data is necessary.

Another explanation might be that the reduction of smuggling occurred as some major international
tobacco companies have reviewed their export practices due to lawsuits. The reduction of the gap may
finally be explained through the increase of illegal manufacturing and counterfeit cigarette trade,
which is a growing concern in many countries. The illegal nature of their production means that they
are not registered in the official export and import data.

Finally, the analysis of export and import practices can also be used to study the smuggling problem
at the national level. For instance, exports from the British tobacco companies to Andorra increased
from 13 million cigarettes in 1993 to 1,520 million in 1997. Taking into account that almost none of
these cigarettes were legally re-exported, that Andorra only has a population of 63,000, and that
smokers in Andorra on the whole do not smoke British brands, it was clear that these increased
exports were intended for the smuggling market (Joossens & Raw, 2002). Induced by high taxes in the
early 1990s, cigarette smuggling increased substantially in Canada. Virtually all smuggled
cigarettes had been previously exported from Canada. As Canada did not, and still does not, export a
large amount of cigarettes, exports proved to be
an accurate indicator for smuggling (Galbraith & Kaiserman, 1997). Similarly, a significant and
unlikely decrease in apparent cigarette consumption per capita was observed in Brazil, while
apparent consumption was rising rapidly in Paraguay in the late 1980s and early 1990s, driven by a
16-fold increase in exports to Paraguay (Shafey et al., 2002).

The aforementioned examples indicate the usefulness of examining production, trade, and consumption
data to gain insights into the smuggling market. That said, other methods exist and have been used
to estimate the size of national smuggling markets. Tobacco consumption estimated from production
and trade or sales data can be compared to estimates of consumption based on prevalence surveys
while taking into account under-reporting. The United Kingdom has used this method extensively to
estimate the size of the smuggling market (for more details, see HM Customs & Excise, 2001). In
Thailand, individuals who reported using tobacco products during face-to-face interviews were asked
to present their tobacco package to the interviewer. An examination of the health warnings (i.e.
absence of warnings or a warning in a language other than Thai) can reveal if the tobacco products
are likely to have been legally purchased (Sarntisart, 2003).

Availability and quality of existing data

This section describes various cross-country sources of production and trade statistics that provide
information on manufactured tobacco products, and discusses their strengths and weaknesses.

United Nations Commodity Trade Statistics Database (UN Comtrade):

The United Nations Commodity Trade Statistics Database (UN Comtrade) contains detailed import and
export statistics, including manufactured cigarettes and cigars, cheroots, and cigarillos reported
by statistical authorities of close to 200 countries or areas (http://unstats. trade/). It contains
annual trade (import and export) data from 1962 to the present. UN Comtrade is considered the most
comprehensive trade database available and is continuously updated. Unlike other existing data
sources where only total amounts are obtainable, UN Comtrade makes available the complete trade
matrix. Whenever trade data are received from the national authorities, they are standardised by the
United Nations Statistics Division and then added to UN Comtrade. Despite its comprehensiveness and
its online availability, UN Comtrade is rarely used by tobacco control researchers and advocates.

United Nations Statistical Division (UNSD) Industrial Commodity Production Statistics Dataset:

The current version of the UNSD Industrial Commodity Production Statistics Dataset contains the
entire database of industrial commodity statistics, including manufactured cigarettes and cigars,
cheroots, and cigarillos covering the period 1950-2003 (1970-2003 for manufactured cigarettes). Data
for the time period 1994-2003 are available in print in the 2003 Industrial Commodity Statistics
Yearbook (United Nations Statistical Division, 2003). The data contained in this database have
primarily been collected from questionnaires sent yearly to national statistical authorities.
However, data have also been collected from other governmental agencies, specialised agencies,
intergovernmental bodies, private institutes, and associations. The UNSD Industrial Commodity
Production Statistics Dataset can be considered the most reliable and comprehensive production
dataset available ( unsd/industry/ics_ intro.asp).

Food and Agriculture Organization of the United Nations (FAO) FAOSTAT:

The Food and Agriculture Organization of the United Nations FAOSTAT provides access to over 3
million time-series and cross-sectional data relating to


food and agriculture from over 100 countries and areas (http://faostat.). The FAOSTAT TradeSTAT module contains detailed agricultural trade data, including import and export statistics for manufactured cigarettes and cigars, cheroots, and cigarillos (i.e. as a grouping). Data are obtained from national statistical and agricultural agencies and are standardised, processed, and validated by the FAO Statistics Division, whereby the national commodity classification (usually the Harmonized System) is converted to the FAO commodity classification. TradeSTAT has just recently begun providing detailed trade matrices.

United States Department of Agriculture (USDA), Foreign Agricultural Service (FAS):

- Tobacco: World Markets and Trade ( co_arc.asp)
- Attaché Reports (http://www. Rep/default.asp)

The USDA's FAS World Market and Trade reports provide the latest data on a number of agricultural commodities, outlining the current supply, demand, and trade estimates both for the USA and for many major countries. FAS international offices provide information on production, consumption, and trade of many commodities, including manufactured cigarettes. It should be noted that the data contained in these commodity and country reports are not official USDA data, but represent estimates made by FAS Attachés. The publication Tobacco: World Markets and Trade was discontinued in September 2005, while tobacco attaché reports were discontinued in January 2006.
Data from the USDA are arguably the most widely used and cited cross-national consumption and trade statistics in tobacco control research and advocacy. The WHO Global Status Report (WHO, 1997) relies almost exclusively on data from the USDA. The much cited analysis of the impact of USA trade policy on cigarette use in Asia utilised cigarette consumption estimates that were derived from USDA data (Chaloupka & Laixuthai, 1996). Other more recent research examples include Gilmore & McKee (2004) and Gilmore & McKee (2005).

Market research reports:

There is a plethora of reports published by market research firms on the manufactured tobacco sector. Most provide country snapshots using various market size indicators including apparent consumption, which, as mentioned earlier, is constructed from trade and production figures. These reports often present market share data by brands, brand families, and companies. Many reports offer little original information (e.g. some rely almost entirely on USDA published data). The World Cigarette Reports, published by ERC Statistics International PLC, a London-based market research organisation, provide some original statistical information, including up-to-date production and trade figures for a number of the countries covered (ERC Statistics International PLC, World Cigarette Markets; http://www.erc-world.com).

United Nations Population Division (UNOP) World population prospects:

This dataset provides the official United Nations population estimates and projections prepared by the Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat (http://www.un.org/esa/population/publications/WPP2004/wpp2004.htm). Detailed population estimates stratified by sex and age for close to 200 countries and areas are available.
In addition to the data sources discussed above, there exist a number of initiatives that report cross-country data for smaller groupings of countries, often on a regional basis. Examples include the Organization for Economic Cooperation and Development (OECD) Health Data, which reports tobacco consumption estimates for OECD member states. The latest version of the OECD database was released in June 2006, and contains a number of comparable statistics on health and health systems across OECD countries. The database contains more than


1200 series covering a wide range of health topics (i.e. health status, health care resources, health care utilisation, expenditure on health, health care financing, social protection, pharmaceutical market, and non-medical determinants of health). OECD Health Data is developed jointly by the OECD Secretariat and the Institut de Recherche et d'Étude en Économie de la Santé (IRDES), a French research institute specialising in health economics and health statistics. The data are compiled from national statistical agencies and other relevant national organisations (http://www ,2340,en_2825_495642_12968734_1_1_1_1,00.html).
A second cross-country data source is the Interstate Statistical Committee of the Commonwealth of Independent States (CIS), Official Statistics of the Countries of the CIS (the CIS comprises Azerbaijan, Armenia, Belarus, Georgia, Kazakhstan, Kyrgyzstan, Moldova, Russia, Tajikistan, Turkmenistan, Uzbekistan, and Ukraine). The CIS database ( offst.htm) is updated annually and contains annual data on more than 3500 socioeconomic indicators from 1980 for all CIS countries. Another data source is the Asian Development Bank (ADB) Key Indicators (http://www. dicators/2006/default.asp), which reports up-to-date manufactured cigarette statistics for a number of countries. Most data, but not all, contained in the OECD, CIS, and ADB databases are also available in the UN databases discussed earlier. However, these databases offer a relatively easy opportunity to compare estimates of consumption and production from multiple sources.

Discussion

It is important to point out that a large amount of the data published and available from the data sources described above can differ substantially. In particular, the trade data reported by the USDA, UN Comtrade, and the FAO differ widely at times. This makes it important to use the best available data by first comparing data from multiple sources.
It is generally the opinion that data from UN Comtrade (export and import) and UNSD (production) are the most reliable and comprehensive available. FAO's TradeSTAT is a good source of data that can be used alongside UN Comtrade. Of particular concern are the country data published by the USDA. They are often significantly at odds with those published by other organisations, such as the United Nations Statistical Division and the FAO, or by national statistical agencies. For a great number of low- and middle-income countries (e.g. Albania, Algeria, Bangladesh, Bolivia, Ecuador, Jordan, Lebanon, and Viet Nam), USDA cigarette production and trade data appear at best to be an extrapolation based on a guesstimate. As discussed earlier, an examination of what is often referred to as the size of the smuggling market (the difference between total exports and total imports) yields a very different picture depending on whether one looks at data from the USDA or the FAO (UN Comtrade does not publish global figures of manufactured cigarettes import and export) (Guindon & Boisclair, 2003). For these reasons, it is strongly suggested to use published USDA data for low- and middle-income countries with great caution.
Researchers and advocates interested in production, trade, and consumption estimates from a single country are advised to always look first at potential local and national primary sources of information, such as government statistics agencies and ministries of trade and industry.
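
As a rough illustration of the comparisons recommended in this discussion, the sketch below computes apparent consumption from production and trade figures, converts it to a per capita value, and derives the gap between total recorded exports and total recorded imports that is often read as an indicator of the size of the smuggling market. It assumes the usual definition of apparent consumption (production plus imports minus exports). All record fields, function names, and figures are hypothetical placeholders and do not come from any of the databases described above.

```python
from dataclasses import dataclass


@dataclass
class CountryYear:
    """One country-year record; quantities are in million cigarettes (assumed unit)."""
    country: str
    production: float
    imports: float
    exports: float
    population: float  # number of persons


def apparent_consumption(rec: CountryYear) -> float:
    """Apparent consumption = production + imports - exports (million cigarettes)."""
    return rec.production + rec.imports - rec.exports


def per_capita_consumption(rec: CountryYear) -> float:
    """Apparent consumption expressed in cigarettes per person."""
    return apparent_consumption(rec) * 1_000_000 / rec.population


def export_import_gap(records: list[CountryYear]) -> float:
    """Total recorded exports minus total recorded imports across all records."""
    return sum(r.exports for r in records) - sum(r.imports for r in records)


def relative_difference(a: float, b: float) -> float:
    """Absolute difference between two source estimates, relative to their mean."""
    mean = (a + b) / 2
    return abs(a - b) / mean if mean else 0.0


# Hypothetical figures, for illustration only.
records = [
    CountryYear("Country A", 1000.0, 200.0, 300.0, 1_000_000),
    CountryYear("Country B", 500.0, 50.0, 100.0, 2_000_000),
]
gap = export_import_gap(records)
disagreement = relative_difference(110.0, 90.0)  # same quantity from two sources
```

Running the same calculation on figures drawn from two different sources and inspecting `relative_difference` is one simple way to flag country-years where the sources disagree enough to warrant checking national primary data.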

Data sources for monitoring global trends in tobacco use behaviours

Introduction

The purpose of this section is to describe the data collection efforts for global surveillance on tobacco use in youth and adults. We include only those surveillance systems that are cross-national and on-going. The youth surveys are school-based with a target survey population of students between 11 and 15 years of age, the primary age of smoking initiation in many countries. The surveillance systems described in this section include: the European School Survey Project on Alcohol and Other Drugs (ESPAD) (ESPAD, 2007), the Global School-Based Student Health Survey (GSHS) (GSHS, 2007), the Global Youth Tobacco Survey (GYTS) (GYTS, 2007), and the Health Behavior in School-Aged Children Survey (HBSC) (HBSC, 2007). The adult surveys have been population-based and target a wider age range (in most cases aged 15-64 or age 18+) than the youth surveys. The adult surveillance systems described include: the Global Adult Tobacco Survey (GATS) (GATS, 2007), the International Tobacco Control Survey (ITC) (ITC, 2007), and the STEPwise Approach to Chronic Disease Factor Surveillance (STEPS) (STEPS, 2007). A description of these youth and adult surveillance systems will be provided in regards to purpose, methodology, survey instrument, survey administration procedures, data analyses, dissemination of information, and utility in monitoring and evaluating articles from the WHO FCTC (WHO, 2003).

Youth

Purpose

European School Survey Project on Alcohol and Other Drugs (ESPAD):
The Pompidou Group is a multidisciplinary cooperation forum to prevent drug abuse and illicit trafficking in drugs, set up in 1971 and incorporated into the Council of Europe in 1980. At that time, the group recognized the need for countries to collect data on alcohol, tobacco, and other drug use as it relates to public health policy and programmes (ESPAD, 2007). Three points were apparent:

1) Systematic information is generally best gathered through surveys
2) Large-scale, on-going surveys have been conducted, but only in a few countries and not as part of a cross-nationally coordinated system
3) Previous surveys had different methodologies and content, so cross-country comparisons were not possible.

To address these data gaps, the Pompidou Group developed a standard questionnaire for school-based surveys which was pilot tested in eight European countries. Further work was not done until the early 1990s, when the Swedish Government convened a meeting of 21 European countries to build on the work of the Pompidou Group by developing a system for simultaneously collecting school-based data using a common methodology. This resulted in the development of the ESPAD project which has now completed four cycles of data collection: 1995, 1999, 2003, and 2007. Future expansion will occur on a four year cycle. The countries that have participated in ESPAD are shown in Table 4.5.
The goal of ESPAD is to collect cross-nationally comparable data on alcohol, tobacco, and other drug use among students in European countries, and monitor the trends in alcohol and drug use. This is very important as it relates to the European Union (EU) action plan on drugs (EPHA, 2007) and the WHO Europe declaration about young people and alcohol (WHO, 2007b).

1995: Croatia, Cyprus, Czech Republic, Denmark, Estonia, Faroe Islands, Finland, Hungary, Iceland, Italy, Latvia, Lithuania, Malta, Norway, Poland, Portugal, Slovakia, Slovenia, Sweden, Turkey, Ukraine, United Kingdom

1999: Croatia, Cyprus, Czech Republic, Denmark, Estonia, Faroe Islands, Finland, Hungary, Iceland, Italy, Latvia, Lithuania, Malta, Norway, Poland, Portugal, Slovakia, Slovenia, Sweden, Ukraine, United Kingdom, Greece, Greenland, Bulgaria, France, FYR Macedonia, Netherlands, Romania, Russian Federation

2003: Croatia, Cyprus, Czech Republic, Denmark, Estonia, Faroe Islands, Finland, Hungary, Iceland, Italy, Latvia, Lithuania, Malta, Norway, Poland, Portugal, Slovakia, Slovenia, Sweden, Turkey, Ukraine, United Kingdom, Greece, Greenland, Bulgaria, France, Netherlands, Romania, Russian Federation, Austria, Belgium, Isle of Man, Germany, Switzerland

2007: Croatia, Cyprus, Czech Republic, Denmark, Estonia, Faroe Islands, Finland, Hungary, Iceland, Italy, Latvia, Lithuania, Malta, Norway, Poland, Portugal, Slovakia, Slovenia, Sweden, Turkey, Ukraine, United Kingdom, Greece, Greenland, Bulgaria, France, FYR Macedonia, Netherlands, Romania, Russian Federation, Austria, Belgium, Isle of Man, Germany, Bosnia & Herzegovina

(ESPAD) by Year of Completion

Global School-Based Student Health Survey (GSHS):
The GSHS was developed in 2001 by WHO (Health Promotion Division) in collaboration with UNAIDS, UNESCO, and UNICEF, and with technical assistance from the United States Centers for Disease Control and Prevention (CDC), Division of Adolescent and School Health. A school-based survey, GSHS is designed to help countries assess behavioural risk and protective factors among students aged 13-15 years. GSHS data can be used by countries to develop priorities, establish programmes, and advocate for resources for school and youth health programmes and policies. It also can be used by international agencies, countries, and others to make comparisons across countries regarding the prevalence of health behaviours and protective factors and to analyze trends in the behaviours. Implementation of GSHS started in 2003; by the end of 2006, 23 countries had completed a GSHS (Table 4.6).

Global Youth Tobacco Survey (GYTS):
In 1998, WHO and the CDC convened a meeting in Geneva to address the issue of data needs on tobacco use among youth across all Member States of WHO. Three summary points were made at this meeting:

1) Research from developed countries has found that the majority of smokers begin using tobacco products well before the age of 18 years (Perry et al., 1994; Kessler, 1995)
2) Little information exists on tobacco use among youth in developing countries
3) To bridge this data gap and to promote tobacco control for all WHO Member States, WHO's Tobacco Free Initiative (TFI) and CDC's Office on Smoking and Health (OSH) agreed to support the development of the GYTS (GTSS Collaborating Group, 2005).

Implementation of GYTS started in 1999 with 12 countries (Table 4.7). By the end of 2006, 150 countries had conducted the GYTS, and over 50 countries had repeated the survey at least one time. In 2007, 11 countries conducted GYTS for the first time, 46 completed a second round, and 8 a third round.

Health Behavior in School-Aged Children Survey (HBSC):
In 1982, the HBSC was initiated by researchers from England, France, and Norway. The purpose of HBSC is to collect data on young people's health and well-being, health behaviours, and the social context in which youth live. Data from HBSC have been used to influence health promotion and education policy at national and international levels. In the mid-1980s, HBSC was adopted by the WHO European Regional Office as a WHO collaborative study. HBSC was developed by a multidisciplinary network of researchers from countries in Europe and North America. It was first conducted in 1983/84 (5 countries), then in 1985/86 (13 countries), and then every four years: 1989/90 (16

2003 2004 2005 2006 2007

China Chile Botswana Egypt Cayman Islands

Kenya Guyana Lebanon Guatemala Djibouti
Philippines Jordan Oman Morocco Philippines
Swaziland Namibia Senegal Tanzania India
Uganda Zambia Tajikistan Uruguay Libya
Venezuela United Arab Emirates Peru
Zimbabwe St Lucia
St Vincent &


1983/84 1985/86 1989/90 1993/94 1997/98 2001/02 2005/06

Austria Austria Austria Austria Austria Austria Austria

Denmark* Denmark* Denmark* Denmark Denmark Denmark Denmark
England Finland Finland Finland England England England
Finland Norway Norway Norway Finland Finland Finland
Norway Belgium Belgium Belgium Norway Norway Norway
Hungary Hungary Hungary Belgium Belgium Belgium
Israel Scotland Israel Hungary Hungary Hungary
Scotland Spain Scotland Israel Israel Israel
Spain Sweden Spain Scotland Scotland Scotland
Sweden Switzerland Sweden Spain Spain Spain
Switzerland Wales Switzerland Sweden Sweden Sweden
Wales Netherlands* Wales Switzerland Switzerland Switzerland
Netherlands* Canada Netherlands Wales Wales Wales
Latvia Canada Netherlands Netherlands
N Ireland Latvia Canada Canada Canada
Poland N Ireland Latvia Latvia Latvia
Poland N Ireland N Ireland N Ireland
Czech Rep Poland Poland Poland
Estonia Czech Republic Czech Republic Czech Republic
France Estonia Estonia Estonia
Germany France France France
Greenland Germany Germany Germany
Lithuania Greenland Greenland Greenland
Russia Lithuania Lithuania Lithuania
Slovakia Russia Russia Russia
Slovakia Slovakia Slovakia
Greece Greece Greece
Portugal Portugal Portugal
Rep of Ireland Rep of Ireland Rep of Ireland
FYR Macedonia FYR Macedonia
Italy Italy
Croatia Croatia
Malta Malta
Slovenia Slovenia
Ukraine Ukraine

countries), 1993/94 (25 countries), 1997/98 (29 countries), 2001/02 (36 countries), and 2005/06 (40 countries) (Table 4.8; http:// www.).

Methodology

European School Survey Project on Alcohol and Other Drugs (ESPAD):
The ESPAD is a school-based survey with the target population being students who are, or will be, 16 years old during the year the data are collected. ESPAD follows a cluster sample design to produce nationally representative data; but the sampling can be either total population sampling, simple cluster sampling, two-stage cluster sampling, or stratified cluster sampling. A minimum of 2400 completed interviews is recommended by ESPAD. If students aged 15-16 are in two or more grades, the survey protocol recommends that all these grades should be included in the sampling frame.

Global School-Based Student Health Survey (GSHS):
A school-based survey, GSHS is conducted primarily among students aged 13-15 years. It uses the same methodology as GYTS (discussed below in the GYTS methodology section). In 11 countries, GYTS and GSHS are currently being conducted simultaneously, sharing sampled schools, but different classes are randomly selected for each survey.

Global Youth Tobacco Survey (GYTS):
The GYTS is a school-based survey of a defined geographic area that can be a country, a province, a city, or any other geographic entity (Centers for Disease Control and Prevention, 2001). Samples are selected as follows:

- The country research coordinator identifies the grades that correspond to students aged 13-15 years in the educational system.
- The research coordinator prepares a database of schools that include the identified grades. Each school is assigned a unique identifier to facilitate school selection. The number of students enrolled in each school grade to be surveyed is added to the database, which forms the survey sampling frame. The amount of work involved in creating this database varies from country to country. In some countries, the creation of the sampling frame has been the most time consuming part of the GYTS.
- The database is sent to the CDC, where the GYTS sample is drawn using a two-stage cluster sample design. Schools are selected with probability proportional to school enrolment size during the first stage, and then classes within participating schools are selected as a systematic equal probability sample with a random start during the second stage. All students in the selected classes are eligible to participate in the survey. For this two-stage sample design, statistical analysis conducted by the CDC (Centers for Disease Control and Prevention, 1999b) has found that, for most sample designs, a minimum of 1500 completed student interviews is needed to obtain a precision level of 5% for a given estimate. WHO and CDC use this information to work with the countries to determine the sample size of schools and students needed for each site. The desired sample size is then adjusted for anticipated non-response at the school, class, and student levels. Sample size is further increased if regional or population subgroup estimates are requested within the country.

Since classes are carefully identified to correspond to students 13-15 years old, the majority of selected students are in this age group. However, all students in the selected classes are eligible to participate regardless of age; therefore, some students were younger than 13 years or older than 15 years.

Health Behavior in School-Aged Children Survey (HBSC):
The HBSC is a school-based survey with the target population of students 11, 13, and 15 years old. The desired mean age for the three age groups is 11.5, 13.5, and 15.5 respectively. In some countries, each age group can be found in the same school year, while in others they may be found across years with a proportion of

students being advanced or held back. Cluster sampling is used where the primary sampling unit is school class. The survey is carried out as a nationally representative sample in each participating country. The recommended sample size for each of the three age groups is set at approximately 1500 students. This target assumes a 95% confidence interval of ±3% around a proportion of 50% and a design effect of 1.2, based on analysis of existing HBSC data.
Given differences in school systems, age at admission, and the degree of advancement and holding back among students, imposing a uniform approach is problematic in the HBSC. To overcome this complexity, age has been a priority for sampling, with students of the relevant age selected across school years. This position can be further complicated when the target population is split across different levels of schooling, such as primary and secondary. Where the number of classes eligible for sampling is unknown, probability proportionate to size sampling is used, making use of actual or estimated school size. In some countries, to minimize the number of participating schools, classes for one age group were randomly sampled in schools, and then classes drawn from other grades in the same schools. In order to produce mean ages of 11.5, 13.5, and 15.5, the survey is administered at appropriate times of the year.

Survey Instrument

European School Survey Project on Alcohol and Other Drugs (ESPAD):
Questions on alcohol, tobacco, and drugs are included in the ESPAD. There are core questions that all countries are encouraged to include, as well as optional and module questions that may be added. Countries are encouraged to field-test their questionnaire. The final version of the questionnaire is translated into each language needed within the country, then back-translated into English as a quality control check. The research protocol specifies that questionnaires should be administered anonymously.
Tobacco-related questions in ESPAD include: lifetime cigarette use, use of cigarettes in the last 30 days (i.e. current cigarette smoking), age of initiation of cigarette smoking, number of friends who smoke cigarettes, and number of siblings who smoke.

Global School-Based Student Health Survey (GSHS):
The GSHS includes questions on alcohol and other drug use; dietary behaviours; hygiene; mental health; physical activity; protective factors; respondent demographics; sexual behaviour; tobacco use; and violence and unintentional injury. Each country develops its own questionnaire, which can include core modules, core-expanded questions, and country-specific questions. The final questionnaire is self-administered in classes during one regular class period. The questions are translated into the appropriate language of instruction for the students and pilot tested for comprehension. All questions share common characteristics to enhance the flow of the survey and comprehension by the student.
Core GSHS questions on tobacco use include: age of initiation, cigarette smoking during the past 30 days (i.e. current cigarette smoking), use of other tobacco products during the past 30 days, attempts to stop smoking during the past 12 months, exposure to secondhand smoke during the past 7 days, and use of tobacco by parents or guardians.

Global Youth Tobacco Survey (GYTS):
The GYTS questionnaire is a self-administered, school-based instrument consisting of a core set of questions that are used by all countries, unless the information is not relevant in that country (e.g. pro-cigarette advertising is not permitted in Singapore, so these questions are omitted). In addition, there is an optional set of questions from which a country can draw depending on its needs and priorities. Specific guidelines are followed for questionnaire translation into local languages and pilot testing. The final questionnaire is the responsibility of each participating country.
The 2007 core GYTS questionnaire consists of 54 questions, and includes items on the following topics: prevalence of tobacco use, age of initiation,

Data sources for monitoring global trends in tobacco use behaviours

exposure to tobacco advertising, perceptions and attitudes on behavioural norms with regard to tobacco use among young people, media and advertising, school curriculum, and secondhand smoke exposure. The GYTS core questionnaire includes information that can be used to monitor seven Articles of the WHO FCTC (Articles 8, 12, 13, 14, 16, 20, and 21) (WHO, 2003).

Health Behavior in School-Aged Children Survey (HBSC):
The HBSC questionnaire consists of a mandatory set of items that each country is required to include: health and well-being, tobacco smoking, alcohol use, cannabis use, physical activity, sedentary behaviour, eating habits, body image, weight control, body weight, oral health, bullying, physical fighting and victimization, and injuries. Countries can also include items specific to their national needs. The final questionnaire includes items on health and health-related behaviours and the life circumstances of young people.
HBSC questions on tobacco use include: lifetime tobacco use, current tobacco smoking, rate of consumption of cigarettes, and age of initiation of daily smoking.

Survey administration procedures

European School Survey Project on Alcohol and Other Drugs (ESPAD):
The ESPAD recommends data collection during March/April, and the research protocol states that the survey should be conducted during a week that does not precede a holiday. Schools that cannot perform the survey during an assigned week are encouraged to use the following week. When possible, the survey should be conducted at the same time in all classes in a school, thus avoiding the possibility of discussion among students in the school. Each ESPAD researcher decides who to use for survey administration (i.e. teachers, research assistants). ESPAD provides the survey administrator with written instructions on how to conduct the data collection in a class.

Global School-Based Student Health Survey (GSHS):
A survey coordinator in each country manages the GSHS. The coordinator is responsible for the overall management of the project, and functions as a liaison with other agencies and organisations in the country, as well as with WHO and CDC. Survey coordinators are trained during regional workshops on the specific procedures to follow for data collection and data management.

Global Youth Tobacco Survey (GYTS):
As with GSHS, the GYTS is managed by a survey coordinator in each country. Regional training workshops are held each year to train the coordinators on data collection and data management procedures. The intent is to standardize the data collection and management procedures across the countries and within each country across time. A GYTS research manual was developed, which includes detailed procedures for administering the GYTS in schools. The manual is modified for each subsequent GYTS training to meet the specific needs of the countries in those trainings. The manual includes information on obtaining school participation, procedures for completing all survey forms, protocol in the classroom, and instructions for returning the completed forms to CDC for data processing. The GYTS uses a generic answer sheet, which allows for a maximum of 99 questions, with eight response categories available per question. There are no open ended questions, skip patterns, or multiple response questions in the GYTS. The completed answer sheets are scanned through an optical reader. Edits for consistency and out-of-range responses are performed for each question. Data quality issues of this type have been rare; consistency failures or out-of-range responses rarely exceed 5% per question.
The GYTS is administered during one class period. GYTS administration procedures were designed to protect students' privacy by assuring that student participation was anonymous and voluntary. Before the survey is administered, each country follows local procedures for obtaining parental permission and institutional review.
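
The out-of-range edit described for the GYTS answer sheets can be pictured with a minimal sketch such as the one below. It is only an illustration and not the CDC's processing code: it assumes that the eight response categories are coded as the letters A to H, that unmarked items are stored as `None` and excluded from the denominator, and that a question is flagged when more than 5% of its marked responses fall outside the valid codes. The function names and the example data are hypothetical.

```python
# Assumption: the eight response categories on the answer sheet are coded A-H.
VALID_CODES = frozenset("ABCDEFGH")


def out_of_range_rate(responses):
    """Share of marked responses that are not one of the eight valid codes.

    Unmarked items (None) are treated as missing and left out of the denominator.
    """
    marked = [r for r in responses if r is not None]
    if not marked:
        return 0.0
    bad = sum(1 for r in marked if r not in VALID_CODES)
    return bad / len(marked)


def flag_questions(sheet_columns, threshold=0.05):
    """Return the questions whose out-of-range rate exceeds the threshold."""
    return sorted(
        question
        for question, responses in sheet_columns.items()
        if out_of_range_rate(responses) > threshold
    )


# Hypothetical scanned data: one list of responses per question.
scanned = {
    "Q1": ["A", "B", "H", "C"],
    "Q2": ["A", "X", "B", "9", None],
}
flagged = flag_questions(scanned)
```

A consistency edit (for example, comparing answers to two related questions) would need rules specific to the questionnaire and is not shown here.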

Health Behavior in School-Aged Children Survey (HBSC):
In most cases, data collection for HBSC is between October and May. Data collection consists of the delivery of questionnaires to selected schools for teacher administration. In some schools, researchers administer the survey in the classes in an attempt to minimize teacher burden. Once collected, the data are sent to the HBSC Internal Data Bank at the Norwegian Social Science Data Services for cleaning and final country dataset preparation.

Data analysis

Global School-Based Student Health Survey (GSHS) and Global Youth Tobacco Survey (GYTS):
Both GSHS and GYTS data are weighted to adjust for sample selection (school and class levels), non-response (school, class, and student levels), and post-stratification of the sample population relative to the grade and sex distribution in the total population. The weighting factor consists of the inverse of the probability of selection for each school; the inverse of the probability of selection of each classroom within each selected school; a school-level non-response adjustment calculated by school enrolment size category (small, medium, large), with school non-response calculated within each tertile; a class-level non-response adjustment factor calculated for each school; a student-level non-response adjustment factor calculated by class; and a post-stratification adjustment factor calculated by sex and grade. The computer program SUDAAN ( is used to compute standard errors, 95% confidence intervals, and weighted prevalence estimates.

Health Behavior in School-Aged Children Survey (HBSC):
HBSC employs a clustered sampling design, where the primary sampling unit is the class (or school) rather than the individual student, as in a simple random sample. Given such a design, the students' responses cannot be assumed to be independent, as students within the same class or school are more likely to be similar to each other than to students in general. Cluster sampling, therefore, results in standard errors that tend to be higher than would be the case if the same size of sample were obtained using a simple random sample. Consequently, standard errors must be calculated using an appropriate method that takes into account the correlation of young people in schools or classes (SUDAAN, STATA (, and EPI INFO (http://www.cdc.gov/epiinfo/) are statistical packages developed for the analysis of complex survey data). In addition, a number of countries and regions stratify their samples, classifying the sample frame into smaller units (i.e. geographical areas) to ensure coverage of all regions. This stratification is likely to reduce standard errors and should be taken into account when they are being calculated.

Dissemination of Information

Information on the ESPAD can be found at In addition, cross-national reports for study years 1995, 1999 and 2003 are available from the Swedish Council for Information on Alcohol and Other Drugs.
Information on the GSHS can be found at chp/gshs/en and http://www.cdc.gov/gshs. Country datasets can be obtained on both websites.
Information on the GYTS can be found at http://www. tobacco/global. The GYTS website includes Country Fact Sheets, Country GYTS Reports, and access to country datasets. In addition, over 45 articles using GYTS data have been published in peer-reviewed journals, such as Lancet, Tobacco Control, and Morbidity and Mortality Weekly Report.
Information on the HBSC can be found at Over 160 articles have been published featuring HBSC data, including recent articles in the European Journal of Public Health, Health Education, and the Journal of Adolescent Health.
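The weighting and clustering points made in the Data analysis section can be illustrated with a minimal Python sketch. It assumes, which the text does not state explicitly, that the GSHS/GYTS weighting components are multiplied together to give one final weight per student, and it uses the Kish approximation (design effect = 1 + (m - 1) x intraclass correlation) to show why cluster sampling inflates standard errors. All field names, function names, and example values are hypothetical; the surveys themselves rely on SUDAAN, STATA, or EPI INFO for variance estimation, not on code like this.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class StudentRecord:
    """One respondent; every field name here is hypothetical."""
    p_school: float        # probability that the student's school was selected
    p_class: float         # probability that the class was selected within the school
    school_nr_adj: float   # school-level non-response adjustment
    class_nr_adj: float    # class-level non-response adjustment
    student_nr_adj: float  # student-level non-response adjustment
    post_strat_adj: float  # post-stratification adjustment (sex by grade)
    smoker: bool           # illustrative outcome: current cigarette smoker


def final_weight(r: StudentRecord) -> float:
    """Assumed form: inverse selection probabilities times all adjustments."""
    return ((1.0 / r.p_school) * (1.0 / r.p_class)
            * r.school_nr_adj * r.class_nr_adj
            * r.student_nr_adj * r.post_strat_adj)


def weighted_prevalence(records: list[StudentRecord]) -> float:
    """Weighted share of respondents who are smokers."""
    total = sum(final_weight(r) for r in records)
    smokers = sum(final_weight(r) for r in records if r.smoker)
    return smokers / total


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Kish approximation for equal-sized clusters (an assumption)."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def clustered_se(p: float, n: int, deff: float) -> float:
    """Simple-random-sample SE of a proportion, inflated by sqrt(deff)."""
    return sqrt(deff * p * (1.0 - p) / n)


if __name__ == "__main__":
    sample = [
        StudentRecord(0.50, 0.5, 1.0, 1.0, 1.0, 1.0, True),
        StudentRecord(0.25, 0.5, 1.0, 1.0, 1.0, 1.0, False),
        StudentRecord(0.50, 0.5, 1.0, 1.0, 1.5, 1.0, False),
    ]
    print("weighted prevalence:", weighted_prevalence(sample))
    deff = design_effect(mean_cluster_size=30, icc=0.05)
    print("design effect:", deff)
    print("clustered SE:", clustered_se(0.2, 1000, deff))
```

With classes of about 30 students and even a modest intraclass correlation, the design effect in this sketch is well above 1, which is why standard errors computed as if the data came from a simple random sample would be too small.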

2007                  2008
Bangladesh            China
Brazil                Indonesia
Egypt                 Mexico
India                 Pakistan
Russian Federation    Philippines
Thailand              Poland
Ukraine               Turkey
Viet Nam

Table 4.9 Countries Participating in the Global Adult Tobacco Surveys (GATS) by Year of Survey Completion

Summary

Comparison of youth survey content

All four surveys measure tobacco use prevalence (see Table 4.12 for a full comparison of measures by survey). ESPAD and HBSC ask only about cigarette smoking. GSHS and GYTS ask about cigarette smoking, as well as use of other tobacco products. All four surveys ask about age of initiation of cigarette smoking; however, ESPAD, GSHS, and GYTS ask about first use, whereas HBSC asks about initiation of daily smoking.
ESPAD, GSHS, and GYTS ask respondents about secondhand smoke exposure, but use different indicators to assess exposure. ESPAD and GYTS ask about number of friends who smoke and ESPAD asks about number of siblings who smoke. GSHS and GYTS ask about exposure to secondhand smoke at home and in public places during the week prior to the survey, as well as smoking behaviour of parents.
GYTS assesses school curriculum by asking students if they were taught about the dangers of smoking in the year prior to the survey, if they discussed reasons why people their age smoke, and if they were taught about the specific health effects of smoking. The other three surveys do not include items to assess school curriculum components.
GYTS measures exposure to pro-tobacco media messages by asking students if they have seen actors smoking in movies, videos, or on TV; if they saw ads on billboards or in newspapers for tobacco products; and if they have an object with a cigarette brand logo on it. GYTS also asks students if they have seen anti-tobacco media messages. The other three surveys do not include indicators of media exposure to tobacco advertising.
GSHS and GYTS ask students about cessation behaviour. Both surveys ask students if they have tried to quit smoking in the year prior to the survey. GYTS also

2002: Australia, Canada, United Kingdom, United States
2003: Australia, Canada, United Kingdom, United States
2004: Australia, Canada, Ireland, United Kingdom, United States
2005: Australia, Canada, Ireland, Malaysia, Republic of Korea, Scotland, Thailand, United Kingdom, United States
2006: Australia, Canada, China, Ireland, Mexico, Scotland, United Kingdom, United States, Uruguay
2007: Australia, Canada, China, France, Germany, Ireland, Malaysia, Mexico, New Zealand, United Kingdom, United States

Table 4.10 Countries Participating in the International Tobacco Control Survey (ITC) by Year of Survey

asks students if they received help to quit smoking and from whom, and measures tobacco dependency using a standard indicator of addiction (time to first cigarette). The other two surveys do not include measures of cessation.
GYTS assesses minors' access to tobacco products by asking current smokers where they usually get their cigarettes, if they have been refused purchase of cigarettes when they tried to buy them in a store, and if they have been offered free cigarettes by a tobacco company representative. The other three surveys do not include measures of minors' access to tobacco products.

Limitations of youth survey content

There are several limitations inherent in each of the youth surveys. First, the target populations are young people in school, and by definition, school-based surveys do not attempt to collect information about the portion of the youth population that is out of school. School-based surveys are thus not representative of the entire youth population in any country. The extent to which the information collected by a school-based survey is not representative of the total youth population varies by country. Second, the school-based surveys described in this section conduct anonymous and self-administered interviews giving each student in a selected class one chance to participate. Students who miss class or refuse to participate are not represented in the sample. Third, extensive reliability testing of all the instruments used by the different surveys has not been completed; however, questions on tobacco use in GYTS that also appear in the CDC's Youth Risk Behavior Survey (YRBS) have been shown to have good test-retest reliability in a study conducted in the USA (Brener et al., 1995).

Adults

Purpose

Global Adult Tobacco Survey (GATS):
In 2006, the GATS was initiated with funds from the Bloomberg Foundation to reduce tobacco use in low- and middle-income countries. The initiative places a priority on countries with the greatest number of smokers. More than half of the world's smokers live in fifteen countries: China, India, Indonesia, Russia, Bangladesh, Brazil, Mexico, Turkey, Pakistan, Egypt, Ukraine, Philippines, Thailand, Viet Nam, and Poland (Table 4.9).
In addition to the CDC Foundation, other key partners in the Bloomberg Initiative include the Campaign for Tobacco-Free Kids, the World Lung Foundation, the Johns Hopkins Bloomberg School of Public Health, and the WHO. Partners are charged with working collaboratively to promote international support for tobacco control policies, increase effective advocacy, and implement tobacco-free programming. Specifically, the partner organisations will:
• Refine and optimize tobacco control programmes to help smokers stop and prevent children from starting
• Support public sector efforts to pass and enforce key laws and implement effective policies, in particular, to tax cigarettes, prevent smuggling, change the image of tobacco, and protect workers from exposure to other people's smoke
• Support advocates' efforts to educate communities about the harms of tobacco and to enhance tobacco control activities so as to help make the world tobacco-free
• Develop a rigorous system to monitor the status of global tobacco use.
The CDC Foundation worked with partners around the world, particularly with the WHO, and in high-burden countries, to develop GATS (i.e. establish systematic, standardised global surveillance and monitoring of the tobacco epidemic).

International Tobacco Control Survey (ITC):
The ITC Project began in 2002 as a prospective cohort study tracking and comparing the impact of national-level tobacco policies among representative samples of adult smokers in four countries: the USA, Canada, the United Kingdom, and Australia (Table 4.10). In 2004, ITC was expanded to include smokers from Ireland and a new cohort of smokers from the UK, to evaluate the 2004

Data sources for monitoring global trends in tobacco use behaviours

Ethiopia          Algeria           American Samoa    Burundi           Aruba             Angola
Fiji              Bangladesh        Cook Islands      Côte d'Ivoire     Iran              Barbados
Oman              Cameroon          Jordan            DRC*              Kuwait            Botswana
Samoa             India             Lebanon           DPRK*             Mauritania        Cambodia
                  Indonesia         Maldives          Egypt             Mongolia          Cape Verde
                  Kenya             Myanmar           Iraq              Sri Lanka         China
                  Marshall Islands  Nauru             Kiribati          Thailand          Cuba
                  Micronesia        Pakistan          Mauritius         Vanuatu           Curacao
                  Palau                               Mozambique        Zambia            Dominica
                  Sri Lanka                           Nepal                               Dominican Rep
                  Syria                               Saudi Arabia                        Equatorial Guinea
                                                      Solomon Islands                     Gaza Strip
                                                      Tokelau                             Ghana
                                                      Tuvalu                              Grenada
                                                      Zimbabwe                            Kenya
                                                                                          St Kitts & Nevis
                                                                                          South Africa
                                                                                          Trinidad & Tobago
                                                                                          Turks & Caicos
                                                                                          Viet Nam

Ireland smoke-free policy. In 2005, the collection of ITC countries was further expanded to include cohorts of smokers in Malaysia, Republic of Korea, Scotland, and Thailand. In 2006, ITC was further expanded to include China, Mexico, and Uruguay; in 2007, France, Germany, and New Zealand joined. The objective of the ITC is to apply rigorous research methods to evaluate the psychosocial and behavioural effects of national-level tobacco control policies. The ITC Project uses multiple country controls, longitudinal designs, and theory-driven mediational models that allow tests of hypotheses about the anticipated effects of given policies.


STEPwise Approach to Chronic section). There are currently two specific probabilities) is known.
Disease Factor Surveillance primary STEPS surveillance sys- Aside from needed oversampling
(STEPS): tems: the STEPwise approach to (e.g. by urban/rural and region),
In 2000, the 53rd World Health risk factor surveillance, and the random selection was used in each
Assembly passed a resolution in STEPwise approach to stroke stage in a way that makes
support of the need to prevent and surveillance. The survey is cur- selection probabilities among
control non-communicable di- rently being implemented in over respondents as equal as possible.
seases (NCD). The goal of the 80 countries with new countries Substitution or replacement sam-
resolution was to support WHO coming on board on a regular pling was not allowed in any stage
Member States in their efforts to basis (Table 4.11). STEPS is of the sample design. Four stages
reduce morbidity, disability, and active in all WHO regions except are included in the sample design:
premature mortality related to EURO (where existing sur- primary sample units (PSU) of the
NCDs. Development of a NCD veillance systems are already in smaller, or the smallest, recog-
surveillance system was one of place for NCD risk factors). Nearly nized geopolitical area units with
the primary objectives of this all AFRO countries have done or current statistical population (i.e.
effort, and WHO STEPwise ap- plan to do STEPS surveys. individual or household); count
proach to Surveillance (STEPS) data and quality cartographic maps
was developed to meet this need. Survey methodology (e.g. county, census tract, or block
The WHO STEPS is a simple, group, rather than state in the
standardised method for collec- Global Adult Tobacco Survey USA); secondary sampling units
ting, analyzing, and disseminating (GATS): (SSU) of recognized geopolitical
data in WHO member countries. The GATS is a household survey subunits to the area units used for
By using the same stan- of adults aged 15-64 years. The PSUs; individual housing/dwelling
dardised questions and protocols, sample domains include complete units (see Census website for
all countries can use STEPS population coverage, except for definitions of these geographic
information not only for monitoring areas that have special country terms), or small groups (<10) of
within-country trends, but also for circumstances (e.g. conflict areas, neighboring housing units (HUs
making comparisons across remote areas). In addition, compact segments); and finally,
countries. The approach encou- institutional populations (e.g. pri- within-household sampling of one
rages the collection of small sons, dormitories, hospitals) are study-eligible household resident
amounts of useful information on excluded. A multi-stage sampling from a roster of residents 15-64
a regular and continuing basis. design was used to include all years of age.
As a surveillance system, household members aged 15-64 Targeted sample sizes (for both
STEPS provides information on from a sample of households, with genders combined) for urban and
NCD risk behaviours that one individual randomly selected rural respondents should be
countries can use for better public per household. Interviews were approximately 4000 each. This can
health policy decision-making. completed face-to-face. In this be accomplished by selecting the
The goal of STEPS is to build the survey, a probability sample is same number of PSUs in urban
capacity of