
Representations of Lesbians in British National Newspapers 2013
A Corpus-based Critical Discourse Analysis

A dissertation submitted to The University of Manchester for the degree of


Master of Arts
in the Faculty of Humanities

2014

Daniela Alejandra Silva Paredes


School of Arts, Languages and Cultures

List of contents
List of tables
List of figures
Abstract
Declaration
Intellectual Property Statement
CHAPTER ONE
Introduction
1.1. Rationale of the study
1.2. Organisation of Chapters
CHAPTER TWO
Literature Review
2.1. Overview of Critical Discourse Analysis
2.2. Corpus-based Critical Discourse Analysis
2.2.1. Semantic Preference and Semantic Prosody
2.3. Language and Representations
2.3.1. Representations of Lesbians in the Media
CHAPTER THREE
Data Collection and Methodology
3.1. Research Questions
3.2. Data Collection and Corpus Description
3.3. Data Analysis
3.3.1. Drawing a collocational profile: A quantitative and qualitative approach
3.3.2. Refining the collocational profile: A qualitative approach
CHAPTER FOUR
Data Analysis I: The collocational profile of lesb*
4.1. Collocates and semantic categories
4.2. Concluding remarks
CHAPTER FIVE
Data Analysis II: Refining the collocational profile of lesb*
5.1. Ties: Couple and couples
5.1.1. Group 1: Relationships, parenthood, legal matters, and statistics
5.1.2. Group 2: Politics and law making, religion, arguments and disagreements
5.1.3. Group 3: Damage, anger, harmful behaviour, and violence
5.2. Entertainment: Star
5.3. Occupation: queen, housekeeper, vet, MP, and ministers
5.3.1. Queen
5.3.2. Housekeeper and MP
5.3.3. Vet
5.3.4. Ministers
5.4. Colour, Nationality, and Creed: Black, French, and Muslim
5.4.1. Black
5.4.2. French
5.4.3. Muslim
5.5. Age: Young
5.6. Appearance: Fat
5.7. Concluding Remarks
CHAPTER SIX
Conclusion
6.1. Main Findings and Implications
6.2. Assessment of the Study
6.3. Suggestions for Future Research
References
Appendix 1: List of lexical collocates

Final word count: 14,065

List of Tables
Table 3.1. Corpus of British national newspaper articles (BN13L) with at least one instance of the words lesbian, lesbianism, or lesbo, in singular and/or plural form, published in the year 2013
Table 4.1. Top 20 Lexical collocates of the query lesb* within a -4 and +4 span with a minimum MI value of 4, sorted by decreasing MI score
Table 4.2. Semantic categories of the lexical collocates of lesb* with a minimum MI value of 4, ordered by decreasing frequency
Table 4.3. Subcategories of Ties ordered by decreasing number of collocate frequency
Table 5.1. Sample of concordance lines of the collocates couple and couples and the query lesb* showing collocates of the semantic categories in groups 1 and 2
Table 5.2. Sample of concordance lines of the collocate couple and the query lesb* showing collocates of the semantic categories in group 3
Table 5.3. Sample of concordance lines of the collocate star and the query lesb*
Table 5.4. Sample of concordance lines of the collocate queen and the query lesb*
Table 5.5. Sample of concordance lines of the collocate MP and the query lesb*
Table 5.6. Sample of concordance lines of the collocate housekeeper and the query lesb*
Table 5.7. Sample of concordance lines of the collocate vet and the query lesb*
Table 5.8. Sample of concordance lines of the collocate ministers and the query lesb*
Table 5.9. Sample of concordance lines of the collocate black and the query lesb*
Table 5.10. Sample of concordance lines of the collocate French and the query lesb*
Table 5.11. Sample of concordance lines of the collocate Muslim and the query lesb*
Table 5.12. Sample of concordance lines of the collocate young and the query lesb*
Table 5.13. Sample of concordance lines of the collocate fat and the query lesb*

List of Figures
Figure 3.1. A LexisLibrary search interface
Figure 3.2. Close up of the LexisLibrary search interface, showing the query terms used in the first search
Figure 3.3. Close up of the LexisLibrary search interface, showing the query terms used in the second search

Abstract
For most people, the mass media are considered a reliable source of knowledge and
information. Every day, the media report events to large numbers of people, having the
potential to influence their audiences' attitudes and opinions about those events and the
people involved in them. Newspapers are an interesting source of data from which to
evaluate media messages that are being communicated to us as well as their ideologies.
These messages and ideologies are conveyed through language, which has an important
role in shaping, maintaining, and changing relations of power. The present study
pertains to language and sexuality, and attempts to identify how lesbians were
represented in British national newspapers published in 2013. This year was selected so
as to provide some insights into contemporary media representations of this sexual
identity group. This study uses a corpus-based methodology and critical discourse
analysis, which allow for the examination of large data sets from a critical perspective.
Previous studies on language and sexuality have been mainly concerned with
homosexual men. The ones that have examined media representations have found that
gay men are often represented within discourses of gay identity and homophobia,
among others. Regarding media representations of lesbians, a number of studies from
various disciplines have found that they tend to be represented within a discourse of
sexual desire. Similarly, the present study concluded that there was no single
representation of lesbians in the corpus, but a number of conflicting ones. These
findings coincide with those in previous studies about sexual identity groups, with
lesbians being portrayed within discourses of gay identity, sexual desire, and
homophobia. However, new representations have also emerged, which portray them in
relation to motherhood, vulnerability, and particularly under an aura of conflict.

Declaration
I declare that no portion of the work referred to in the dissertation has been submitted in
support of an application for another degree or qualification of this or any other university or
other institute of learning.

Intellectual Property Statement


i. The author of this dissertation (including any appendices and/or schedules to this
dissertation) owns certain copyright or related rights in it (the "Copyright") and s/he has
given The University of Manchester certain rights to use such Copyright, including for
administrative purposes.
ii. Copies of this dissertation, either in full or in extracts and whether in hard or electronic
copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988
(as amended) and regulations issued under it or, where appropriate, in accordance with
licensing agreements which the University has entered into. This page must form part of any
such copies made.
iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual
property (the "Intellectual Property") and any reproductions of copyright works in the
dissertation, for example graphs and tables ("Reproductions"), which may be described in
this dissertation, may not be owned by the author and may be owned by third parties. Such
Intellectual Property and Reproductions cannot and must not be made available for use
without the prior written permission of the owner(s) of the relevant Intellectual Property
and/or Reproductions.
iv. Further information on the conditions under which disclosure, publication and
commercialisation of this dissertation, the Copyright and any Intellectual Property and/or
Reproductions described in it may take place is available in the University IP Policy (see
http://documents.manchester.ac.uk/display.aspx?DocID=487), in any relevant Dissertation
restriction declarations deposited in the University Library, The University Library's
regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The
University's Guidance for the Presentation of Dissertations.

CHAPTER ONE
Introduction
Many studies have focused on the relation between language, gender, and sexuality
from various perspectives. A number of studies have concentrated on differences in
language use by different genders and/or sexual identity groups (e.g. Tannen 1990;
Hayes 1976), while others have examined how gender and sexual identities are
constructed or represented through language. Within the latter group, some studies have
analysed how individuals construct their own gender or sexual identity through
language (e.g. Cameron 1997; Holmes and Schnurr 2006; Koller 2013), while others
have focused on how gender or sexual identity is represented by others, which would
correspond to an out-group perspective (e.g. Baker 2005; Kosetzi and Polyzou 2009;
Atanga 2012).
Significantly, the body of literature on language and gender has tended to focus
specifically on heterosexual individuals. Similarly, scholarship on the relation between
language and sexuality has been mainly concerned with homosexual men (Baker 2008).
According to Lazar (2005: 10), lesbians are an especially vulnerable sexual identity
group, as they are not only subjected to discrimination from the hetero-gendered order
but they are also made invisible within the gay community. According to Ellis et al.
(2003: 135), it is a widespread belief among lesbian and gay psychologists that
invisibility is a way of perpetuating negative attitudes toward lesbians and gay men,
while also leading to ignorance about their issues, needs, and concerns (ibid.). With
this in mind, the existence of media representations can be seen as a way of contributing
towards tolerance and a better understanding of topics related to this sexual identity
group.
1.1 Rationale of the Study
For most people, the mass media are considered a reliable source of knowledge and
information. Every day, the media report events to large audiences, having the potential
to influence their audiences' attitudes and opinions about those events and the people
involved in them. News producers decide which items of news are worth reporting
and which are not, what to include in their reports, and what to omit. Because of
this, Fairclough (1989) defines media discourse as involving hidden relations of
power.
In his seminal work, Foucault (1972: 49) defines discourse as "practices which
systematically form the objects of which they speak". Similarly, Burr (1995: 48)
suggests that meanings, metaphors, representations, images, stories, statements and so on
have the potential to produce a particular version of events (ibid.). Against this
backdrop, it becomes necessary to identify which discourses or versions of lesbians
are being transmitted through the media, so as to determine their potential effect on
media consumers. The present study focuses on the discourses of print media, more
specifically on newspaper genres. The print media seem an interesting source of data
for the following reasons. First, as O'Keeffe (2011) suggests, media discourse is
manufactured, not spontaneous. Because of this, it is important to consider the
ideologies that go into the manufacturing process, and continuously evaluate the
messages that are being communicated to us. Second, as Wodak and Busch (2004)
assert, language becomes a powerful tool when it is used by those who have power. As
van Dijk (2010) notes, by controlling public discourse the media exercise this power,
indirectly controlling the public mind. This is facilitated by their continuous circulation
in the social world, which allows them to produce and reproduce cultural and
ideological meanings (Wodak and Busch 2004).
Issues related to same-sex relationships have received considerable media
attention in the UK, particularly in the last decade due to the passing of the Civil
Partnership Act 2004 and the Marriage (Same Sex Couples) Act 2013. Events such as these
have kept lesbians and gay men in the news headlines, with newspapers often reporting
on support for, opposition to, and approval of these laws, as well as other issues
related to lesbian and gay individuals. In the present study, I intend to ascertain the most
recurrent patterns of representation of lesbians in newspaper articles. The corpus
compiled for this study consists of original English news texts published in Great
Britain in the year 2013. This year was selected because it marked an important
milestone for lesbian, gay, bisexual and transgender (LGBT) rights with the passing of
the Marriage (Same Sex Couples) Act 2013 for England and Wales. It was therefore
envisaged that articles published during this period could provide some interesting
insights into contemporary media representations of this sexual identity group.
Due to the nature of the data that makes up my corpus, I bring two approaches
together, namely corpus linguistics and Critical Discourse Analysis (hereafter CDA). A
corpus-based approach will allow me to locate recurrent patterns in the data set in a
time-efficient and statistically significant manner. A CDA approach will allow me to
closely analyse my findings in their textual environment, being able to interpret them in
terms of their social implications. A critical approach seems particularly
suitable when focusing on a sexual minority group whose members often face
discrimination on account of both their sexual identity and their gender.
In the field of linguistics, there has been a surge in the number of studies using
corpus-based CDA to analyse the way in which certain topics or individuals are
represented in newspapers. Some of the topics these studies have focused on are
immigration in the UK (Baker and McEnery 2005; O'Halloran 2009; Baker et al. 2013);
feminism in the British and German press (Jaworska and Krishnamurthy 2012); and US
news media discourses pertaining to North Korea (Kim 2014). Additionally, the
methodological approach adopted in these studies has been widely recommended
as a good way of analysing large data sets from a critical perspective (see
Hardt-Mautner 1995; Baker et al. 2008; Baker 2012).
To the best of my knowledge, the work in linguistics that most closely
resembles the methodology and focus of this study was carried out by Baker (2005). In
his book, he examined the constructions and representations of gay men in a variety of
genres, such as debates, press, sitcoms, and adverts. Through a corpus-based approach,
he identified a series of related yet conflicting discourses of homosexuality. As can be
seen, corpus-based CDA studies focusing exclusively on lesbians have not been carried
out as yet, and that is an area in which this study aims to contribute.
1.2 Organisation of Chapters
Following this introduction, Chapter Two presents the main theoretical concepts that
allow for an understanding of my object of study. The chapter provides an overview of
the main tenets of CDA followed by an explanation of what makes a corpus-based CDA
approach a suitable methodology. Additionally, it presents some important notions that
are used in the analysis such as collocation, semantic preference, and semantic prosody.
Finally, the chapter refers to language and representations in more depth, supplementing
these notions with some findings about lesbian representations in different studies.
Chapter Three presents the research questions this study intends to answer, and
outlines the steps followed in the collection of the data. Additionally, it provides a
corpus description, and presents a detailed picture of the methodological procedure to
be followed so as to answer the research questions.

Chapters Four and Five present the results of the data analysis. The aim of these
chapters is to answer the research questions introduced in Chapter Three.
Chapter Six revisits my research questions, and discusses the main findings and
implications of this study. Finally, it includes an assessment of the study, and provides
some suggestions for future research.

CHAPTER TWO
Literature Review
2.1 Overview of Critical Discourse Analysis
According to Wodak and Meyer (2009: 2), CDA is a constitutive, problem-oriented,
interdisciplinary approach that focuses on the relations between discourse and society
(van Dijk 1995). Primarily, CDA aims to investigate any social phenomena, especially
when they can be regarded as a manifestation of social inequality, such as sexism,
homophobia, or racism. For CDA scholars, language plays an important role in the
"production, maintenance, and change of social relations of power" (Fairclough 1989:
1). Because of this, raising awareness of how language contributes to perpetuating such
relations of power, often through the domination of powerless social groups, has been
recognised as the first step towards emancipation (ibid.).
As Wodak and Meyer (2009: 8) suggest, CDA is interested in revealing power
structures and unmasking ideologies. Under CDA, dominant ideologies are
conceptualised as a set of everyday beliefs that tend to appear and be perceived by
society as neutral, rather than the product of power imbalances. According to Wodak
and Meyer (ibid.) discourse is one of the means through which social domination and
inequality are (re)produced by those in power, including the media, politicians, and
teachers, i.e. members of a symbolic elite that controls communication, information,
and knowledge (van Dijk 2010). Against this backdrop, examining how the elite
exercises its power via language is a major aim of CDA specialists.
As Wodak and Meyer (2009) assert, the concept of discourse has been used in
different ways by different researchers and academic cultures. In the context of CDA,
discourse is shaped by a dialectical relationship between any given discursive event and
the situation(s), institution(s) and social structure(s), which frame it and vice versa
(Fairclough and Wodak 1997: 258). In this way, discourse contributes to the
maintenance and reproduction of the social status quo, but it can also contribute to its
transformation.
There are several schools within CDA, each with its own theoretical background
(Wodak and Meyer 2009). Consequently, CDA does not stipulate a normative approach
to the analysis of discourse. The approach adopted by each school is determined by the
data under study and often involves a combination of methodologies (Baker 2012).
These studies can thus incorporate analytical strands pertaining to grammar, phonology,
semantics, or other semiotic dimensions, such as pictures, films, or gestures. Among the
different methodological approaches available to CDA specialists, this study has chosen
to draw on corpus linguistics.
2.2 Corpus-based Critical Discourse Analysis
As Wodak and Meyer (2009) note, corpus linguistics (henceforth CL) provides
quantitative methodological tools to carry out CDA studies and can be combined with
other CDA approaches. Combining methods from CL and CDA is a tradition that
started in the 1990s (e.g. Hardt-Mautner 1995; Krishnamurthy 1996) and continues to this
day, with an increasing number of corpus-based CDA studies being published every
year. As Orpin (2005) suggests, quantitative CL methodology is very useful when we
focus on lexical items whose collocational and syntactic patterns we aim to describe.
As Mautner (2009: 124) states, using CL tools can be a practical and efficient
time-saver. The CL methodology operates by quickly processing computer-held
collections of naturally occurring language. This approach is characterised by the
analysis of large data sets with computer software (e.g. concordancers) and
techniques (e.g. frequency counts and statistical tests used to derive collocational
profiles and keyword lists) that yield quantitative insights into the texts
under study. Furthermore, there are CL tools that allow researchers to approach their
data from a qualitative perspective. One of these tools produces concordances, which
are corpus extracts where it is possible to examine the co-text of a particular search
word (also referred to as node) and identify patterns within it.
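By way of illustration, the idea of a concordance can be sketched in a few lines of Python. The sketch below is not the software used in this study; the function name concordance, the simple letter-based tokeniser, and the sample sentences are assumptions introduced here purely for illustration. It lists each occurrence of a node matching lesb* together with a fixed number of words of co-text on either side.

```python
import re


def concordance(text, node_pattern, width=4):
    """Return (left co-text, node, right co-text) for every matching token."""
    # Naive tokeniser (an assumption): runs of letters and apostrophes, lower-cased.
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    node = re.compile(node_pattern)
    lines = []
    for i, token in enumerate(tokens):
        if node.fullmatch(token):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sentences, not taken from the corpus.
sample = ("The couple married in March. A lesbian couple won the case. "
          "Critics said the ruling on lesbian parents was overdue.")

for left, node_word, right in concordance(sample, r"lesb\w*", width=3):
    # Right-align the left co-text so that the node forms a central column.
    print(f"{left:>25} [{node_word}] {right}")
```

Aligning the node in a central column, as the loop does, is what makes recurrent patterns to its left and right easy to scan.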
In approaches that combine CL and CDA methods, each stage of the analysis
informs the next one (Baker et al. 2008). Once the analyst has made a decision about
what to include in the corpus, what her/his focus will be, and the cut-off points of
statistical significance s/he will consider, CL tools are applied and the results of
quantitative analysis are obtained. At this stage, the analyst must take a qualitative
stance so as to make sense of the findings yielded by CL tools. A popular way in which
this qualitative stance is adopted in corpus-based CDA studies is through the
identification of semantic preference and semantic prosody (see section 2.2.1 below).
This is carried out through a close analysis of concordance lines so as to identify
patterns in the co-text that are not immediately apparent by looking at frequencies,
collocations, or keywords. This type of analysis can also be supplemented with the use
of dictionaries, or the comparison of findings to those in other corpora.
As can be seen, CL methods can act as a powerful heuristic tool helping to clear
pathways to discovery (Mautner 2009: 124). By yielding quantitative findings, a
corpus-based approach helps to reduce researcher bias, as researchers can avoid over- or
under-interpreting results (Baker et al. 2008). This makes the corpus-based CDA
approach a good way of addressing an issue for which CDA has received a fair amount of
criticism, especially from Widdowson (e.g. 1995, 1996, 1998), who claims CDA lacks
objectivity.
2.2.1 Semantic Preference and Semantic Prosody
In order to explain the notions of semantic preference and semantic prosody, it is
necessary to be familiar with the concept of collocation. As Baker et al. (2008: 278)
suggest, collocation can be defined as "the above-chance frequent co-occurrence of two
words within a pre-determined span". With this in mind, collocates are words that
co-occur next to, or in close proximity to, another word with statistically significant frequency.
The collocates of a word can help to shape its meaning (Nattinger and DeCarrico 1992),
and even convey messages implicitly (Hunston 2002: 109). Based on these ideas, they
can be regarded as a useful way of discursively presenting a group (Baker 2006).
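The definition above can be made concrete with a short sketch. It assumes one common formulation of the Mutual Information (MI) score, MI = log2((f(node, collocate) × N) / (f(node) × f(collocate))), where N is the number of tokens in the corpus; corpus tools differ in how they adjust for span size, so the values it yields need not match those of any particular package. The function name mi_collocates, its parameters, and the toy token list are hypothetical; the span of four words either side and the minimum MI of 4 mirror the settings named in the captions of Tables 4.1 and 4.2.

```python
import math
import re
from collections import Counter


def mi_collocates(tokens, node_pattern, span=4, min_mi=4.0):
    """Return (collocate, MI) pairs with MI >= min_mi, highest MI first.

    Assumed formula: MI = log2(joint * N / (node_freq * collocate_freq)),
    with no adjustment for the size of the span.
    """
    node = re.compile(node_pattern)
    n_tokens = len(tokens)
    freq = Counter(tokens)
    node_positions = [i for i, tok in enumerate(tokens) if node.fullmatch(tok)]
    node_freq = len(node_positions)

    # Count every token that falls within `span` words of a node occurrence.
    joint = Counter()
    for i in node_positions:
        joint.update(tokens[max(0, i - span):i])
        joint.update(tokens[i + 1:i + 1 + span])

    scored = []
    for word, together in joint.items():
        if node.fullmatch(word):
            continue  # ignore other forms of the node itself
        mi = math.log2(together * n_tokens / (node_freq * freq[word]))
        if mi >= min_mi:
            scored.append((word, mi))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented toy corpus: 60 filler tokens, one node, one collocate, two more tokens.
toy = ["the"] * 60 + ["lesbian", "couple"] + ["said"] * 2
print(mi_collocates(toy, r"lesb\w*"))
```

In the toy data the very frequent filler word falls below the threshold even though it occurs near the node, while the two rare words pass it. This reflects the tendency of MI to reward low-frequency items, which is one reason why MI thresholds are often combined with a minimum frequency.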
According to Stubbs (2001: 65), semantic preference is a relation between a
lemma or word-form and a set of semantically related words. As Baker (2006) states,
semantic preference is a concept that is closely linked to that of collocation, when our
focus is on the meaning of words rather than their grammatical function. For example,
Partington (1998) studied how the intensifying adjective sheer collocated with other
lexical items, finding that it occurred in the company of words related to volume,
strength, persistence, and strong emotion, among others. Similarly, Baker (2006)
found that, in the British National Corpus, the word rising co-occurred with words
related to money and work, such as wages and unemployment.
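Grouping collocates by meaning, as in the studies just cited, amounts to a simple tallying procedure once the analyst has assigned each collocate to a category. The sketch below assumes a hand-made mapping; the category labels and collocates are those listed in the table of contents for Chapter Five, whereas the function name category_frequencies and the frequency figures are invented for illustration and are not results of this study.

```python
from collections import Counter

# Analyst-assigned categories (labels and words taken from the Chapter Five headings).
CATEGORY_OF = {
    "couple": "Ties", "couples": "Ties",
    "star": "Entertainment",
    "queen": "Occupation", "housekeeper": "Occupation", "vet": "Occupation",
    "young": "Age",
    "fat": "Appearance",
}


def category_frequencies(collocate_counts, category_of=CATEGORY_OF):
    """Sum collocate frequencies per semantic category, most frequent first."""
    totals = Counter()
    for word, count in collocate_counts.items():
        # Words the analyst has not classified are kept visible, not dropped.
        totals[category_of.get(word, "Uncategorised")] += count
    return totals.most_common()


# Invented frequencies, for illustration only.
invented_counts = {"couple": 5, "couples": 3, "star": 4, "young": 2, "said": 1}

for category, total in category_frequencies(invented_counts):
    print(f"{category:<15}{total}")
```

The assignment of words to categories remains an interpretive step carried out by the analyst; only the counting is automated.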
A concept related to that of semantic preference is semantic prosody. This
notion has been developed by authors such as Louw (1993), Sinclair (1996) and Stubbs
(2001), although the latter uses the term discourse prosody to refer to it. Semantic
prosody has been defined by Louw (1993: 157) as the "consistent aura of meaning with
which a form is imbued by its collocates". A semantic prosody is an aspect of
evaluative meaning (Partington 2004: 131). It is attitudinal (Sinclair 1996), as it often
expresses the speaker's/writer's stance (Hunston and Thompson 2000). In other words,
it is a descriptor of speaker attitude (Stubbs 2001: 88). Additionally, Stubbs (2001)
and Partington (2004) claim that the evaluative meanings of semantic prosodies are not
necessarily communicated through the directly adjacent collocates of a particular word,
but through the lexical item and its extended environment. As an example of semantic
prosody, let us consider one of Stubbs' (2001) studies. In his analysis of the verbs cause
and provide, he found that the former occurred more often with words designating
unpleasant events, while the latter occurred near lexical items pertaining to things that
were necessary or desirable. In another study, Partington (1991) found that the
maximizer perfectly tended to occur with words that referred to good things, while
absolutely showed a balance between positive and negative items.
2.3 Language and Representations
As Baker (2006) suggests, any object or concept is surrounded by not one, but various
ways of constructing it. These constructions or representations circulate in the social
world through language, having the potential to become naturalised through repetition,
which is an essential requisite for the construction of the social world (Bourdieu 1991).
These ideas can be illustrated in the findings of corpus research, which has
demonstrated that words tend to be used in routine phrases that have become
conventionalized in their semantics and are sometimes even lexically predictable
(Stubbs 2001). Similarly, Hoey (2005: 8) suggests that a word or word sequence
becomes cumulatively loaded with the contexts and co-texts in which it is encountered,
and our knowledge of it includes the fact that it co-occurs with certain other words in
certain kinds of context. This knowledge, he asserts, is not necessarily conscious, and
becomes part of our communicative competence.
One way in which representations can be easily and effectively disseminated is
through the mass media. As Krishnamurthy (1996) asserts, our daily contact with mass
media language has the potential to influence the language we use. Through the media
we access language that has been continuously recycled (Stubbs 2001), words and
expressions that can make us adopt certain attitudes and opinions (Krishnamurthy
1996). Consequently, understanding how a particular event, object or individual is
represented in the media can help us to get a sense of the ideas about and attitudes
towards this object that are being circulated to a large audience.

2.3.1 Representations of Lesbians in the Media


The majority of the studies that have aimed to identify media representations of lesbians
have focused on how they are portrayed in films and TV shows. These studies have
identified some changes in the way lesbians have been represented through time.
According to Ciasullo (2001), in the past they were usually represented as undesirable
and stereotypical. Among these stereotypes we find the masculine lesbian. Similarly,
Doty and Gove (1997: 86) note that they tended to be shown as violent predatory
butches. Nowadays, these representations are still available, but they have been
supplemented with others. As Jackson (2009: 199) asserts, lesbians are often portrayed
as defined by their embodied sexuality and constituted as sexually desiring and
desirable subject. This has also been acknowledged by McRobbie (1996, 2004), who
links these representations of self-pleasing and sexually plural individuals to
post-feminist discourses. Through the new representations, lesbianism has also come to be
portrayed in close association with the notion of heteroflexibility, a concept suggested
by Essig (2000). The idea of heteroflexibility pertains to representations of women
experimenting with same-sex sexuality. As Wilkinson (1996) notes, these depictions
help to trivialise homosexual relations, showing them as a choice and a temporary
phase. Finally, Jackson (2009: 201) suggests that the contemporary depictions of
lesbians that construct them within a post-feminist discourse of sexual desire objectify
them in the same way as heterosexual women have been objectified. This, in turn,
constructs them for consumption by the male gaze (ibid.).
In the field of linguistics, studies focusing on representations of lesbians have
been scarce. These studies have either focused on lesbians from an in-group perspective
(see Koller 2009, 2013), or analysed their representations in conjunction with those of gay
men. In one of these studies, Gouveia (2005) examined the ways in which gays and
lesbians were represented in a series of texts about gay power in a Portuguese
newspaper. Using a combination of CDA and Halliday's systemic functional grammar,
Gouveia concluded that the term gay people was mostly used in reference to gay men,
with lesbians barely appearing in the articles he analysed, and even less so as explicit
agents of power. The present study is developed against this backdrop.
The next chapter presents the research questions, data collection techniques, and
methodological steps that are followed so as to identify the representations of lesbians
in British national newspapers.

CHAPTER THREE
Data Collection and Methodology
This chapter presents the research questions that this study aims to answer.
Additionally, it outlines the process adopted to create the corpus that comprises the data
set for this study. A description of the corpus is also provided. Finally, it explains the
methodological steps followed in the analysis of the data set.
3.1 Research Questions
The present study aims to answer the following research questions:
Overarching question:
What does an approach that combines CL and CDA methods reveal about the ways in
which British newspapers represented lesbians in the year 2013?
Sub-questions:
1. What does a collocational analysis of the corpus reveal about the semantic
categories used in the representations of lesbians?
2. What does a closer analysis of concordance lines reveal about the semantic
preferences and the semantic prosodies featuring in the representation of
lesbians in the corpus?
3.2 Data Collection and Corpus Description
In order to investigate the representations of lesbians in British newspapers, a corpus of
articles published in 2013 was collected. This corpus is made up of 1,469,036 words
distributed across 2,027 articles published in 11 British national newspapers and their
Sunday editions, when available. These newspapers were selected because they
represent the whole range of British national newspapers which are readily accessible
on the LexisLibrary database. The newspapers are The Daily Mail, Mail on Sunday,
The Daily Telegraph, The Sunday Telegraph, The Daily Star, The Express, The Sunday
Express, The Daily Mirror, The Sunday Mirror, The People, The Guardian, The
Independent, The Independent on Sunday, The Observer, The Sun, The Times, and The
Sunday Times. Made up of both broadsheet and tabloid articles, the corpus represents
a wide range of positions within the UK's contemporary political spectrum. However,
this factor did not play a role in the project design, since the focus of this study is to
identify the representations of lesbians in the British national newspapers as a whole,
rather than across the different types of newspapers.
The articles that make up the corpus were selected according to the following
criteria. First, they had to be published between 1 January and 31 December
2013, both dates inclusive. Second, they had to contain at least one occurrence of the
word 'lesbian', 'lesbianism' or 'lesbo'¹, in either singular or plural form,
regardless of the journalistic genre the article belonged to. As a result, the corpus
contains articles from a variety of genres, such as news reports, news summaries,
reviews, obituaries, letters to the editor, sports news, and commentary pieces, among
others.
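The two selection criteria above can be illustrated with a minimal sketch. It is not the procedure used in this study, where selection was carried out through the LexisLibrary search interface and by manual inspection; the function names, the regular expression, and the "review" label for articles that only mention the island Lesbos are all assumptions made for illustration.

```python
import re
from datetime import date

# Assumed pattern: 'lesbian', 'lesbianism' and 'lesbo', singular or plural, as whole words.
TERM = re.compile(r"\b(lesbians?|lesbianisms?|lesbos?)\b", re.IGNORECASE)

# Publication window stated in the selection criteria (both dates inclusive).
START, END = date(2013, 1, 1), date(2013, 12, 31)


def classify(article_text):
    """Label an article 'include', 'review' or 'exclude' from its search-term hits."""
    hits = [m.group(0) for m in TERM.finditer(article_text)]
    if not hits:
        return "exclude"
    # Capitalised 'Lesbos' may be the Greek island rather than the plural of 'lesbo',
    # so articles with no other hit are flagged for manual checking (hypothetical rule).
    if all(hit == "Lesbos" for hit in hits):
        return "review"
    return "include"


def meets_criteria(published, article_text):
    """True if the article falls within 2013 and contains a clear search-term hit."""
    return START <= published <= END and classify(article_text) == "include"
```

An article labelled "review" would still need to be read, since capitalisation alone cannot separate a sentence-initial plural from the place name.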
The corpus articles were obtained using the LexisLibrary online database, which
was accessed through the University of Manchester. The data set was retrieved using
the duplicate options on the LexisLibrary search interface. If an article has been
published more than once in a newspaper², this database groups all of its versions into
one cluster. In this way, when the results are saved into a file, only one version of the
article is included in that file. The duplicate options offered by LexisLibrary allow users
to obtain results where highly similar or moderately similar³ articles have been grouped.
In this study, moderate similarity was chosen. This means that for every cluster of
articles identified as moderately similar by LexisLibrary, only one was included in the data
set. After the moderate similarity option had been selected, the collection of articles was
carried out in two searches⁴ (Figure 3.1 shows a screenshot of the LexisLibrary search
interface). The first search consisted of the following query terms⁵: 'ATL2(lesbian)' OR
'lesbianism' OR 'lesbo' (see Figure 3.2). In this search, the 'ATL2(lesbian)' query term
was used so as to retrieve articles that contained the word 'lesbian' at least twice, i.e. in
the body and the index⁶ of the article. The second search consisted of only one query
term, namely 'lesbian and not terms (lesbian)' (see Figure 3.3). This search was used so
as to exclude indexes from the results and thus helped to identify only those texts where
the search word occurred just in the body of the article. The decision to carry out two
consecutive searches was made after finding out that using only the word 'lesbian' as a
search word retrieved articles in which that word did not occur in the body of the
text but appeared only in the index, as 'Gays & Lesbians'.

¹ Articles featuring the name of the Greek island, Lesbos, and no reference to lesbians in their
body were not included in the corpus.
² As LexisLibrary contains all the editions of a newspaper that have been published on a single
day, several versions of one article may be retrieved by a search. These versions can be identical
or include changes the article has undergone throughout the day.
³ LexisLibrary states that highly similar documents must be nearly identical so as to be included
in the same group of similar documents. On the other hand, when moderate similarity is chosen,
documents with relatively less similarity are included in the same group of similar documents.
Unfortunately, LexisLibrary does not provide details on how the different degrees of similarity
are determined by the duplicate options tool.
⁴ The results of the searches retrieved articles where the search terms occurred in both their
singular and plural forms.
⁵ Each of the query terms in single quotation marks was entered into one individual search
terms box, as LexisLibrary allows users to enter up to five query terms at a time for only one
search.
⁶ The index of an article is a set of labels used to group its content according to the topics it
covers. This concept is similar to what is commonly known as keywords or tags.

Figure 3.1. A LexisLibrary search interface

Figure 3.2. Close up of the LexisLibrary search interface, showing the query terms used in the
first search.

Figure 3.3. Close up of the LexisLibrary search interface, showing the query terms used in the
second search.

The articles retrieved by the searches were saved in txt files of up to 100
documents each. These files were checked for the presence of duplicates of different
length, as the duplicate option was not reliable in these cases. So as to find these
duplicates, the files were loaded onto AntConc 3.4.1m (Anthony 2014). In order to
locate the articles that had been stored more than once, I began by using the query term
lesb*. By focusing on the collocates of the query term (using a collocation span of -1
and +1) and the concordance lines where such collocates appeared, I was able to
establish whether two articles could be regarded as duplicates. When duplicates were
found, only one article was kept, provided the other version(s) had been published on
the same day. Finally, each file was cleaned individually in order to remove metadata
(e.g. name of the newspaper or other technical terms pertaining to the process of
publication in print press media, such as copyright) that might otherwise have distorted
the findings of the corpus interrogation. Once these metadata had been removed, the
final 2013 British newspaper corpus of articles that mentioned lesbians, henceforth
BN13L, can be described as follows (Table 3.1).
BN13L CORPUS 2013                                N of Articles    N of Words   N of Tokens
The Daily Mail and Mail on Sunday                          165       161,222           266
The Daily Telegraph and The Sunday Telegraph               214       163,448           287
The Daily Star                                             137        38,206           189
The Express and Sunday Express                              72        39,677            91
The Daily Mirror and The Sunday Mirror                     163        64,768           235
The People                                                  20         6,763            23
The Guardian                                               270       272,471           500
The Independent and The Independent on Sunday              182       128,319           300
The Observer                                                90       103,390           127
The Sun                                                    308       111,767           439
The Times and The Sunday Times                             406       379,005           591
Total                                                    2,027     1,469,036         3,048

Table 3.1. Corpus of British national newspaper articles (BN13L) with at least one instance of
the words 'lesbian', 'lesbianism', or 'lesbo', in singular and/or plural form, published in the
year 2013.
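As a quick consistency check, the column totals reported in Table 3.1 can be recomputed from the per-newspaper rows. The sketch below only re-adds the figures printed in the table. The variable names are illustrative, and the third column is simply called `tokens` after its header, on the assumption that it needs no further interpretation for this check.

```python
# Rows copied from Table 3.1: (newspaper group, articles, words, tokens).
rows = [
    ("The Daily Mail and Mail on Sunday", 165, 161_222, 266),
    ("The Daily Telegraph and The Sunday Telegraph", 214, 163_448, 287),
    ("The Daily Star", 137, 38_206, 189),
    ("The Express and Sunday Express", 72, 39_677, 91),
    ("The Daily Mirror and The Sunday Mirror", 163, 64_768, 235),
    ("The People", 20, 6_763, 23),
    ("The Guardian", 270, 272_471, 500),
    ("The Independent and The Independent on Sunday", 182, 128_319, 300),
    ("The Observer", 90, 103_390, 127),
    ("The Sun", 308, 111_767, 439),
    ("The Times and The Sunday Times", 406, 379_005, 591),
]

# Sum each numeric column across the eleven newspaper groups.
total_articles = sum(articles for _, articles, _, _ in rows)
total_words = sum(words for _, _, words, _ in rows)
total_tokens = sum(tokens for _, _, _, tokens in rows)

print(total_articles, total_words, total_tokens)  # prints: 2027 1469036 3048
```

The recomputed sums agree with the totals row of the table.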

3.3 Data Analysis


The analysis of the corpus will be carried out using a combination of CL and CDA
methods. These, in turn, will allow me to address the data set using quantitative and
qualitative approaches. The analysis of the corpus has been divided into two parts.
3.3.1 Drawing a collocational profile: A quantitative and qualitative approach
The analytical procedure described in this section aims to identify the semantic
categories used in the representations of lesbians in the corpus. In order to achieve this
goal, the corpus will be loaded into AntConc 3.4.1m, a general-purpose corpus analysis
toolkit or concordancer. AntConc 3.4.1m can be used to analyse text in any language
supported by the Unicode standard. This software is freeware and will allow me to search
the data set for collocates using statistical measures. To obtain these collocates,
AntConc 3.4.1m can use either the Mutual Information (hereafter MI) or t-score
statistical measures. These measures determine the strength of the bond between two
lexical items. Among other functions, the software allows its users to look at
concordance lines and to access a file view. This last tool presents the key words in their
corresponding texts.
Through AntConc 3.4.1m, the collocates of the search query lesb* within a
collocation span of -4 and +4 words and a minimum collocate frequency of 10 will be
obtained. The asterisk in the query term acts as a wildcard that indicates that zero or
more characters can follow the search term. In this study, the collocates will be
retrieved using the MI algorithm, which measures the strength of the collocational bond
between the query term and its collocates. This measure is obtained by comparing
the observed frequency of co-occurrence of each potential collocate with the frequency
that would be expected by chance in the corpus. The MI statistical measure was
chosen because the highest MI scores tend to include content words⁷. These words are
more useful to determine what a text is about (Baker 2006; Stubbs 2011) and are therefore
more suitable for identifying discourses within the data set. The collocates with an MI score of at
least 4 will be examined and put into semantic categories according to the semantic
preferences they suggest. The categorisation process will consider only those collocates
that are content words, leaving out function words. Proper nouns will also be excluded
from the categorisation process. As Baker (2006) suggests, function words may yield
many irrelevant examples, while proper nouns may make it difficult to generalise our
findings. In the categorisation process, the meanings of the different collocates will be
determined using the Macmillan Dictionary as reference. As some words can belong to
more than one semantic category, their concordance lines will be closely scrutinised so
as to determine their meaning in context. This procedure will be carried out using the
concordance tool of AntConc 3.4.1m.
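The retrieval step described above can be sketched in a few lines of code. This is an illustration under stated assumptions rather than a reproduction of AntConc's internals: the function name is hypothetical, the input is assumed to be an already tokenised corpus, and MI is computed with a commonly cited formulation, log2(O × N / (f(node) × f(collocate))), which may differ in detail from the calculation AntConc 3.4.1m performs.

```python
import math
import re
from collections import Counter


def collocates_with_mi(tokens, node_pattern, span=4, min_freq=10):
    """Return {collocate: (co-occurrence frequency, MI score)} for a node pattern."""
    n_tokens = len(tokens)
    word_freq = Counter(tokens)
    # Positions of every token that matches the node query (e.g. lesb*).
    node_positions = [i for i, tok in enumerate(tokens) if node_pattern.fullmatch(tok)]
    node_freq = len(node_positions)

    # Count every word that falls within `span` tokens to the left or right of a node.
    cooccurrence = Counter()
    for i in node_positions:
        for j in range(max(0, i - span), min(n_tokens, i + span + 1)):
            if j != i:
                cooccurrence[tokens[j]] += 1

    results = {}
    for word, observed in cooccurrence.items():
        if observed < min_freq:
            continue  # apply the minimum collocate frequency
        # Assumed MI formulation: log2 of observed over chance-expected co-occurrence.
        mi = math.log2(observed * n_tokens / (node_freq * word_freq[word]))
        results[word] = (observed, mi)
    return results


# The wildcard query lesb* expressed as a regular expression (an assumption).
NODE = re.compile(r"lesb\w*")
```

With a tokenised corpus in `tokens`, the call `collocates_with_mi(tokens, NODE, span=4, min_freq=10)` mirrors the settings reported in this section, and keeping only entries whose score is at least 4 applies the MI threshold.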
As can be seen, the procedure described in this section involves a quantitative
approach, where CL tools are used to obtain the collocates of the query term lesb*
using a statistical measure. Additionally, there is a qualitative procedure involved in the
classification of the collocates into semantic categories. Once these semantic categories
have been identified, their collocates will be closely examined so as to identify lexical
items that may require more in-depth analysis. The procedure that will be followed in
the analysis of these collocates is explained in the next section.

⁷ By contrast, the highest t-scores tend to include function words (Mautner 2009), which
have the main function of relating content words to each other (Stubbs 2001).

3.3.2 Refining the collocational profile: A qualitative approach


As Mautner (2007: 55) asserts, quantitative indicators highlight particularly promising
entry points into the data. Based on this idea, some of the collocates obtained in the
previous section will be further analysed using a qualitative approach. These collocates
will be selected according to their potential to enhance the set of representations of
lesbians in the corpus.
The qualitative analytical procedure will involve the following steps. First, the
concordance lines of the collocates selected will be closely scrutinised using the
concordance and file view tools of AntConc 3.4.1m. The aim is to identify lexical items
that frequently co-occur with these collocates. Then, these lexical items, or collocates,
will be classified into semantic categories according to the semantic preferences they
suggest. Again, the categorisation process will be carried out using the Macmillan
Dictionary as reference. Finally, the semantic prosodies of the semantic categories will
be determined, according to the connotations the collocates in the semantic categories
suggest. The semantic prosodies in this study will be classified as either positive or
negative. As Mautner (2009: 128) suggests, the notions of semantic preference and
semantic prosody taken together show us what kind of social issues a particular lexical
item is bound up in, and what attitudes are commonly associated with it. In this way,
the identification of semantic preferences and semantic prosodies will make it possible
to identify the social domains and social attitudes associated with lesbians in the corpus.
As can be seen, a closer analysis of concordance lines will allow me to refine the
collocational profile of lesb* through the identification of collocates that contribute to
particular semantic preferences and prosodies of lesbians in the corpus. The aim is to
identify different ways in which the British newspapers represented this sexual identity
group in 2013. The findings of the data analysis procedures described in these two
sections are provided in the following two chapters.
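The first qualitative step, reading the concordance lines in which a selected collocate occurs near the query term, can be approximated with a simple keyword-in-context routine. The sketch below is illustrative only: the function name, the bracket notation around the node, and the display width are assumptions, and in this study the lines were inspected with the concordance and file view tools of AntConc 3.4.1m.

```python
import re


def concordance(tokens, node_pattern, collocate, span=4, width=6):
    """Return keyword-in-context lines where `collocate` lies within `span` of the node."""
    lines = []
    for i, tok in enumerate(tokens):
        if not node_pattern.fullmatch(tok):
            continue
        # Tokens inside the collocation span on either side of the node.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + span + 1]
        if collocate in window:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + width + 1])
            lines.append(f"{left} [{tok}] {right}")
    return lines
```

Each returned line shows the node in square brackets with up to `width` tokens of co-text on each side, which is the material from which semantic preferences and prosodies are then judged by hand.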


CHAPTER FOUR
Data Analysis I: The collocational profile of lesb*
This chapter offers a collocational profile of the query lesb*. The aim is to identify the
semantic categories that the collocational analysis of the corpus suggests so as to
determine what these semantic categories reveal about the representations of lesbians in
the data set. This is done using quantitative and qualitative analyses.
4.1 Collocates and Semantic Categories
The search for collocates of the query lesb* in the BN13L corpus, calculated using the
MI statistical measure, returned a total of 276 collocate types. Of these,
106 collocates had an MI score of at least 4, and 86 of those were
lexical words. The top 20 collocates are shown in Table 4.1 (a table with the 86 lexical
collocates can be found in the Appendix).
Rank   Collocate      MI score   Frequency
1      transgender    8.67468    201
2      bisexual       8.63505    294
3      bisexuals      8.62849    12
4      gays           7.8661     90
5      vet            7.65546    18
6      housekeeper    7.6068     22
7      fling          7.38748    11
8      lover          7.34121    59
9      graphic        7.29592    18
10     kiss           7.00046    33
11     trans          6.93661    13
12     couple         6.85356    135
13     gay            6.82954    861
14     affair         6.71319    38
15     couples        6.69796    117
16     fat            6.61255    15
17     Muslim         6.59197    13
18     explicit       6.37215    12
19     equality       6.25545    61
20     queen          6.25345    32

Table 4.1. Top 20 lexical collocates of the query lesb* within a -4 and +4 span with a
minimum MI value of 4, sorted by decreasing MI score.

After a close analysis of these collocates, the tokens were grouped into 12
semantic categories, as shown in Table 4.2.
N    Semantic Category          Collocates
1    Sexual identity & Gender   gay (861), bisexual (294), transgender (201), people (113),
                                gays (90), men (63), straight (30), heterosexual (20),
                                homosexual (13), trans (13), bisexuals (12)
2    Ties                       couple (135), couples (117), lover (59), friends (50), partner
                                (36), parents (35), daughter (26), families (26), wife (23), ex
                                (20), girlfriend (17), lovers (13), sister (13), pair (11), single
                                (11)
3    Sex/Love                   sex (96), love (70), relationship (45), affair (38), kiss (33),
                                relationships (19), graphic (18), porn (17), action (14),
                                explicit (12), fling (11), affairs (11), romance (10)
4    LGBT movement              equality (61), rights (37), LGBT (34), group (25), full (18),
                                charity (17), activists (14), festival (13), movement (10),
                                groups (10)
5    Entertainment              scenes (22), star (22), drama (21), plays (20), scene (18),
                                magazine (18), comedy (14), movie (11), sitcom (10)
6    Occupations                queen (32), housekeeper (22), vet (18), MP (11), ministers
                                (10)
7    Groups of people           community (28), staff (15), members (14)
8    Marriage                   married (23), marry (15), wedding (15)
9    Colour/Nationality/Creed   Black (22), French (15), Muslim (13)
10   Age                        young (31)
11   Visibility                 openly (20)
12   Appearance                 fat (15)

Table 4.2. Semantic categories of the lexical collocates of lesb* with a minimum MI value of
4, ordered by decreasing frequency.
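The ordering of the categories in Table 4.2 follows from the summed frequencies of their collocates. The short sketch below recomputes those sums from the figures printed in the table; the dictionary layout and variable names are illustrative assumptions, and no data beyond the table itself is used.

```python
# Collocate frequencies copied from Table 4.2, keyed by semantic category.
categories = {
    "Sexual identity & Gender": {
        "gay": 861, "bisexual": 294, "transgender": 201, "people": 113, "gays": 90,
        "men": 63, "straight": 30, "heterosexual": 20, "homosexual": 13,
        "trans": 13, "bisexuals": 12,
    },
    "Ties": {
        "couple": 135, "couples": 117, "lover": 59, "friends": 50, "partner": 36,
        "parents": 35, "daughter": 26, "families": 26, "wife": 23, "ex": 20,
        "girlfriend": 17, "lovers": 13, "sister": 13, "pair": 11, "single": 11,
    },
    "Sex/Love": {
        "sex": 96, "love": 70, "relationship": 45, "affair": 38, "kiss": 33,
        "relationships": 19, "graphic": 18, "porn": 17, "action": 14,
        "explicit": 12, "fling": 11, "affairs": 11, "romance": 10,
    },
    "LGBT movement": {
        "equality": 61, "rights": 37, "LGBT": 34, "group": 25, "full": 18,
        "charity": 17, "activists": 14, "festival": 13, "movement": 10, "groups": 10,
    },
    "Entertainment": {
        "scenes": 22, "star": 22, "drama": 21, "plays": 20, "scene": 18,
        "magazine": 18, "comedy": 14, "movie": 11, "sitcom": 10,
    },
    "Occupations": {"queen": 32, "housekeeper": 22, "vet": 18, "MP": 11, "ministers": 10},
    "Groups of people": {"community": 28, "staff": 15, "members": 14},
    "Marriage": {"married": 23, "marry": 15, "wedding": 15},
    "Colour/Nationality/Creed": {"Black": 22, "French": 15, "Muslim": 13},
    "Age": {"young": 31},
    "Visibility": {"openly": 20},
    "Appearance": {"fat": 15},
}

# Total frequency per category, then categories ranked from most to least frequent.
totals = {name: sum(freqs.values()) for name, freqs in categories.items()}
ranked = sorted(totals, key=totals.get, reverse=True)

for rank, name in enumerate(ranked, start=1):
    print(rank, name, totals[name])
```

Run on these figures, the ranking reproduces the order of the table, from 1,710 hits for sexual identity and gender down to 15 for appearance.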

As semantic category 1 in Table 4.2 shows, the strongest semantic preference
the search term lesb* displays in the corpus is for other lexical items used to label
sexual identities and gender. Within this group, the most frequent collocate is gay,
which is more commonly used as the label for male homosexuals. This collocate is
followed by bisexual and transgender, which represent the remaining member
categories of the LGBT group. Additionally, there are two collocates that refer to
heterosexuality, namely heterosexual and straight. This indicates that, even though most
collocates in this semantic category are related to other sexual minority groups, the
query lesb* collocates with lexical items from the whole spectrum of sexuality and
gender categories. Moreover, we also find the generic term people within the list of
collocates. This word was included in this semantic category after a detailed scrutiny of
the concordance lines featuring it. The analysis revealed that most occurrences of
people were pre-modified by gay and/or lesbian. Something similar happened with the
word men, which was the only gender label that made it into the list of 86 collocates.
Concordance analysis revealed that this noun was more typically used with the adjective
gay acting as a pre-modifier.
As semantic category 2 shows, the search query lesb* also displayed a strong
tendency to co-occur with lexical items designating ties or connections between
individuals. In this semantic category we can distinguish three types of ties or
subcategories: A) sexual or romantic ties, B) family ties, and C) friendship ties. As
Table 4.3 shows, most collocates belong to subcategory A. This suggests that lesbians
are predominantly represented in terms of their sexual or romantic relationships in the
corpus. As lesbian partners seem to be a frequent focus of the newspapers in the data
set, the two most frequent collocates in this semantic category, namely couple and
couples, have been earmarked for closer analysis in the following chapter.

    Ties subcategories    Collocates
A   Sexual or Romantic    couple (135), couples (117), lover (59), partner (36), ex (20),
                          girlfriend (17), lovers (13), pair (11), single (11)
B   Family                parents (35), daughter (26), families (26), wife (23), sister (13)
C   Friendship            friends (50)

Table 4.3. Subcategories of Ties ordered by decreasing number of collocate frequency.

Another semantic preference that the search query lesb* displays in the corpus
pertains to sex and/or love. A quick look at the collocates in semantic category 3 reveals
that the majority of these words involve the idea of sex. This is confirmed by our
reference dictionary in the definitions of words such as relationship(s), affair(s), and
fling. Additionally, a concordance analysis of the collocate kiss reveals it is usually
used in close proximity to the collocate sex, assigning a predominantly sexual quality to
it. The collocates love and romance are exceptions to these observations. However,
scrutiny of the concordance lines of love reveals it is occasionally used as a
synonym of sex.
The data set also contains frequent occurrences of words that can be related to
the LGBT movement. These terms have been grouped into semantic category 4, where
we find collocates such as equality, movement, and full. Concordance analysis of this
last word revealed it was primarily used as a pre-modifier of lexical items such as
equality, protection, and participation. Finally, a quick look at the concordance lines of
group and groups reveals these collocates relate to the idea of comradeship.
Other lexical items retrieved by collocation analysis indicate that the search term
co-occurs with words pertaining to entertainment. As semantic category 5 shows, these
terms can be primarily associated with visual displays, such as those in films, TV
shows, and magazines. The list of collocates in this semantic category suggests that in
2013 British national newspapers frequently wrote about scenes portraying
lesbians. Frequency counts of the collocates drama (21), comedy (14), and sitcom (10)
suggest there was a slightly higher tendency for lesbian characters to appear in
humorous fictional situations. Another frequent collocate in this semantic category is the noun star, which
hints at specific real life referents. The collocate star is analysed in the next chapter so
as to provide more insights into how lesbians were written about in the corpus when
fame was a salient issue.
Semantic category 6 comprises occupations and contains five terms that
belong to different social domains. These are politics, with collocates such as queen and
MP; religion, with the noun ministers; and occupations and trades, with the words vet
and housekeeper. These lexical items are interesting on account of their specificity.
They may refer to particular individuals and events reported in the newspapers. Because
of this, this semantic category is further analysed in the next chapter in order to shed
some light on the representations of lesbians these individuals (and events) suggest.
In semantic category 7 the query lesb* shows a tendency to be associated with
groups of people. Collocates like members and community suggest a group of
individuals who share common features. Additionally, a word like staff implies the idea
of workforce.
Another semantic preference identified in the data set pertains to marriage
(semantic category 8). This is not surprising considering same-sex marriage was an
important topic in Britain in the year 2013. However, the overall number of collocates
associated with it was surprisingly small, with fewer than a hundred hits for all three
collocates combined, as shown in Table 4.2.
The collocates in semantic category 9 were grouped into one category
although they refer to different but somewhat related notions. This semantic category
contains three collocates which refer to skin colour, nationality, and creed, respectively.
The fact that only one lexical item for each of these notions appears in the list of 86
collocates suggests that we may be dealing with specific referents. In order to identify
the representations these collocates suggest, a deeper concordance analysis of this
semantic category is carried out in the next chapter.
The last three semantic categories, namely 10, 11 and 12, contain only one
collocate each. Semantic category 10 concerns age, where the collocate young refers to
only one age group. This suggests that youth may be a relevant feature to consider when
determining the representations of lesbians in the corpus. A concordance analysis of the
collocate openly in semantic category 11 reveals it designates openness about one's
sexual identity, and it has therefore been categorised as pertaining to visibility. Finally, semantic
category 12 includes the term fat, the only adjective used to refer to appearance in the
whole list of 86 collocates (see Appendix). Given the specificity of the collocates young
and fat, a closer concordance analysis is necessary to help us understand the nature of
their association with lesb*. This is carried out in the next chapter.
4.2 Concluding Remarks
As the semantic categories of the collocates of lesb* show, the picture that emerges
from a collocational analysis of the corpus positions lesbians as being primarily
represented in connection with other sexual identities and being mostly referred to in
relation to their partners. This explains the frequent occurrence of lexical items that
pertain to the nature of the relationships they engage in. In this data set, these
relationships are predominantly sexual, as there are comparatively fewer references to
more formal types of commitments. Additionally, lesbians are frequently represented in
terms of their membership of a community and in relation to topics pertaining to
equality. The semantic categories also suggest that lesbians often appear in other types
of media, such as films and TV shows. This can be related to the notion of visibility,
which also made it into a separate semantic category. To a lesser extent, they are
identified in relation to other aspects of their social lives, such as work and culture, or in
terms of age and appearance.
The next chapter tries to refine the collocational profile drawn so far. This is
done through a closer analysis of the co-text of the collocates that have been earmarked
in this chapter.


CHAPTER FIVE
Data Analysis II: Refining the collocational profile of lesb*
This chapter presents a closer analysis of concordance lines of the collocates singled out
in Chapter Four. The aim is to identify semantic preferences and prosodies so as to
determine what these reveal about the representations of lesbians in the data set. An
in-depth focus on specific collocates is expected to refine the collocational profile and
enhance the set of representations identified in the corpus. This is done using qualitative
analysis.
5.1 Ties: Couple and couples
In the semantic category that designates ties, couple and couples are the most recurrent
collocates. The occurrences of these two lexical items combined (252) make up more than
two fifths of the total number of tokens (592) in the semantic category they belong to,
as the frequencies in Table 4.2 show. Because of
this, the concordances of these two collocates of lesb* are analysed in this section.
A concordance analysis of the collocates couple and couples shows they tend to
co-occur near terms that can be grouped into seven main thematic categories. These
categories are (1) relationships, (2) parenthood, (3) legal matters, (4) statistics, (5)
politics and law making, (6) religion, and (7) arguments and disagreements.
Additionally, there are a number of semantic categories that are exclusive to the
singular collocate couple. These categories are (8) damage, (9) anger, (10) harmful
behaviour, and (11) violence. To facilitate the presentation of the findings, these
categories have been divided into three groups. Group 1 includes categories 1, 2, 3, and
4; group 2 categories 5, 6, and 7; and group 3 categories 8, 9, 10, and 11.
5.1.1 Group 1: Relationships, parenthood, legal matters, and statistics
Within the relationships category (category 1), we find collocates that refer to ties,
gender and sexuality, marital status, separation, and sex. Within these semantic
preferences we come across words such as partner, women, straight, partnership,
divorce, dissolutions, sex and pleasure. The collocates pertaining to separation suggest a
negative semantic prosody. In the parenthood category (category 2), we come across
words that pertain to family and fertility with tokens such as upbringing, birth, children,
father, sperm, insemination, adopt, and pregnant. Regarding legal matters (category 3),
there are frequent occurrences of terms that designate court cases, with words such as
judgement, court and ruled. Close scrutiny of the concordance lines featuring legal
references shows the texts that contain them invariably refer to lawsuits about sperm
donors' parenting rights. Finally, the category that pertains to statistics (category 4)
includes tokens such as rate, quadrupled, and peaked, designating trends and figures. A
concordance analysis shows that the majority of the information being passed on using
statistical terms designates the domains of marriage and motherhood. With the
exception of the collocates that refer to separation, the semantic categories identified in
group 1 do not seem to have any particular semantic prosodies. Given these four
semantic categories, lesbian couples in our data set seem to be primarily represented
through their sexuality, type of bond, potential to have children, and difficulties arising
from their decision to form a family.
5.1.2 Group 2: Politics and law making, religion, and arguments and
disagreements
In group 2, politics and law making (category 5) includes collocates that pertain to
legislation, politics, equality, and inequality. Within these semantic preferences we
identify collocates such as government, ballot, amendment, party, leaders, recognition,
rights, discrimination, and bias. As can be expected, lexical items that refer to
inequality have a marked negative semantic prosody. The next category (category 6)
includes tokens from the domain of religion, with words such as church, and vicar.
Finally, there are collocates that designate arguments and disagreements (category 7)
with words such as controversy, arguing, complaints, battle, and row. This last category
also displays a negative semantic prosody. In terms of representations, this group of
semantic categories seems to portray lesbian couples in terms of discrimination and
their rights, both their secular ones and those in the domain of religion. Also, they are
portrayed in an atmosphere of dissent, as the collocates pertaining to arguments and
disagreements show.
A sample of concordance lines with collocates from the semantic categories in
groups 1 and 2 is shown in Table 5.1.

Table 5.1. Sample of concordance lines of the collocates couple and couples and the query lesb* showing collocates of the semantic categories in groups 1 and 2

5.1.3 Group 3: Damage, anger, harmful behaviour, and violence


As was previously stated, the semantic categories in this group come exclusively from
the concordance lines of the singular collocate couple. The reason why the collocate
couple shows a different set of semantic preferences from its plural counterpart may
have to do with the fact that singular forms tend to be more precise. This means that
when couple is used in our data it is likely to designate one specific pair of individuals
that are thus involved in a particular event in the news. If we consider the semantic
categories in this group, it appears that the events the couples in this data set were
involved in were of a predominantly negative nature. These categories, namely (8)
damage, (9) anger, (10) harmful behaviour and (11) violence, include lexical items such
as harm, threatening, rant, fury, addict, antisocial, tortured, killed, and slashed. As can
be seen, all of these collocates suggest very strong negative prosodies.
A closer analysis of concordance lines also reveals that the collocates associated
with damage and anger (categories 8 and 9) tended to be used when the lesbians in the
articles were depicted as victims of acts of discrimination. This is shown in Table 5.2,
lines 1, 2, 3, 4 and 5 (page 35). On the other hand, the collocates denoting harmful
behaviour and violence (categories 10 and 11) appeared when the lesbian couples were
considered (or thought to be) the agents of vicious acts. This can be clearly seen in
Table 5.2, lines 6, 7, 8, 9, and 10 (page 35). What these findings suggest is that lesbians
are being represented both as victims and perpetrators. In the reporting of the pieces of
news where they feature, they can undergo discrimination, but they are also capable of
committing crimes. Moreover, in this data set, their crimes are of an even crueller nature
than the ones they are reported as victims of.
As can be seen, the concordance analysis of the collocates couple and couples
has provided some interesting insights about the representations of lesbians in the 2013
corpus. Now let us consider the domain of entertainment.
5.2 Entertainment: Star
Within the semantic category that pertains to entertainment, the collocate star occupies
the second place in frequency. Even so, this collocate seems particularly interesting
because it refers to individuals whose actions tend to be sensationally and
extensively reported in the media as a result of their popularity. Because of this, the
representations the collocate star suggests are very likely to reach a wider audience.

The concordance lines of the collocate star show it frequently co-occurs with
other words related to entertainment, such as pop, show and actress. Additionally, it
shows a semantic preference for lexical items that refer to disclosing information,
gender and sexual identity, and sex. Among these collocates we find the words
confessions, secret, women, straight, threesome, and fling. Also in relation to sex, we
come across tokens that designate women in terms of their sexual attractiveness with
words such as babe, and beauty. However, a concordance analysis shows these terms
refer to an individual who is not identified as being a lesbian. Overall, these terms do
not seem to display any particular semantic prosody.
However, there are also a variety of lexical items that have distinctively negative
semantic prosodies. Some of these collocates refer to unruly and troubled individuals,
arguments and disagreements, and violence. Examples from these semantic categories
are the words thug, tearaway, rapist, rows, controversy, suicide, assault, and brawl.
Similarly, there are collocates that designate a variety of feelings involving anger and
shock, with collocates such as outrage, sickening, and terror.
As these semantic categories show, secrecy, sex, trouble and violence are
recurrent topics when the query lesb* occurs in close proximity to the collocate star.
Once again we identify collocates that bring lesbians closer to topics involving some
type of wrongdoing and violence. Some concordance lines that provide support for
these observations are shown in Table 5.3 (page 35).
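The kind of search that underlies these concordance lines, in which the query lesb* is retrieved whenever the collocate star occurs in close proximity, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name, the simple tokeniser, the window of five tokens on either side, and the sample sentence are all hypothetical, and the sentence is invented rather than taken from the corpus. The analysis in this study was carried out with a concordancer, not with this code.

```python
import re


def concordance_lines(text, query_prefix="lesb", collocate="star", window=5):
    """Return key-word-in-context lines for tokens that start with
    query_prefix and have `collocate` within `window` tokens on either side.
    The prefix match is a stand-in for the wildcard query lesb*."""
    # Hypothetical tokeniser: alphabetic words, keeping internal hyphens.
    tokens = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    lowered = [t.lower() for t in tokens]
    lines = []
    for i, tok in enumerate(lowered):
        if not tok.startswith(query_prefix):
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        # Context excludes the node word itself.
        context = lowered[start:i] + lowered[i + 1:end]
        if collocate in context:
            left = " ".join(tokens[start:i])
            right = " ".join(tokens[i + 1:end])
            lines.append(f"{left} [{tokens[i]}] {right}")
    return lines


# Invented sample sentence, not a line from the corpus.
sample = "The pop star denied rumours of a lesbian fling with the actress."
for line in concordance_lines(sample):
    print(line)
```

A wider or narrower window would change which co-occurrences count as collocations, so the value used here should be read as a placeholder rather than as the span applied in the study.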
In the next section we expect to enhance the current picture by focusing on
collocates that designate occupations.


Table 5.3. Sample of concordance lines of the collocate star and the query lesb*

Table 5.2. Sample of concordance lines of the collocate couple and the query lesb* showing collocates of the semantic categories in group 3

5.3 Occupations: queen, housekeeper, vet, MP, and ministers


Leaving their specificity aside, the collocates in this semantic category are distinctive
for two reasons. First of all, they refer to individuals in terms of an aspect of their social life
that is not closely related to their sexual identity. Secondly, they represent different
domains, such as politics, religion, and trades and professions. Due to this, a
concordance analysis of these collocates is expected to shed some light on new aspects
of the representations of lesbians in the corpus.
5.3.1 Queen
An examination of the co-text of the collocate queen reveals lexical items associated
with homosexuality, marriage, insemination, law, and royalty. Among these
semantic categories we find words like gay, same-sex, marriage, egg, birth, legalise,
succession, heir, and royal. These collocates do not suggest any particular semantic
prosodies. We also have frequent occurrences of the words Tory, lady, woman, and
child. A concordance analysis shows that this set of words identifies the individuals
involved in the situation reported in the texts. As it turns out, the collocate queen occurs
near the query lesb* due to the reporting of some comments made by a conservative
politician. These comments referred to the possibility of having a lesbian queen who
decides to have children using artificial insemination. These ideas were presented in the
context of the discussion about same-sex marriage and the laws of royal succession.
A concordance analysis also reveals several collocates with negative semantic
prosodies. These designate lack of intelligence, oddness, difficulty, potential danger,
and arguments. Among these semantic categories we find words such as silly, idiot,
eccentric, bizarre, worries, troubling, threat, spectre, much-feared, row, and kerfuffle.
The concordance lines show that the collocates in the first two semantic categories are
used to evaluate the comments made by the politician. Finally, there are some collocates
that refer to sexual relations that tend to be frowned upon in Western culture, with
words such as adultery and incest. As can be seen, these lexical items also display a
negative semantic prosody. A concordance sample showing some of these collocates is
provided in Table 5.4 (page 38).
Regarding representations, the semantic categories identified contribute to
portraying lesbians in terms of their choice to form a family. This is shown by the semantic
categories of marriage and insemination. These issues are linked to their rights and are
presented as a threat. The collocates pertaining to law, difficulty, danger, and arguments
provide support for these observations.


5.3.2 Housekeeper and MP
Concordance analyses of the collocates housekeeper and MP show that the majority of
the lines where these words occur near lesb* refer to the same incident. As a
consequence, these two collocates show very similar semantic preferences. Because of
this, the results of their analyses are presented jointly.
The incident that featured these collocates was a trial where a Tory MP was
accused by his (lesbian) housekeeper of offering her money to have sex with him and
his wife. Due to this, concordance lines contain frequent instances of collocates that
provide contextual information, with words such as Tory, wife, and maid; and terms that
refer to the legal procedure, such as action, and tribunal. Other collocates that designate
the context pertain to sex, with words such as threesome, and sex. These collocates and
semantic categories do not seem to display a particular semantic prosody.
Another semantic preference that the collocates housekeeper and MP display is
for words that suggest persuasion, untruthfulness, innocence, or refer to sexual
harassment. Here we find collocates such as lure, enticed, begged, falsely, lies,
exonerated, cleared, groping, and molested. With the exception of the semantic
category pertaining to innocence, these collocates display a negative semantic prosody.
A concordance sample showing some of these collocates is provided in Table 5.5 (page
38).
Given that housekeeper has twice as many tokens as MP does, it is not
surprising to come across other semantic preferences within the concordance lines of
this collocate. These semantic preferences are for lexical items that designate strong
negative emotions, with words like humiliating, shocked, scared, and suicidal. Once
again, these words exhibit a negative semantic prosody. Finally, the collocate
housekeeper frequently co-occurs with an adjective that designates age, namely
middle-aged. Some of these collocates are shown in Table 5.6, lines 8 and 9 (page 39).
The semantic categories of the collocates housekeeper and MP contribute to
representing lesbians as sexually desirable, particularly in terms of heteroflexibility. This is
supported by the occurrence of collocates that refer to sex such as threesome, and the
ones that designate sexual harassment. Again, they are represented in relation to
conflict, this time involving dishonesty and legal matters.


Table 5.5. Sample of concordance lines of the collocate MP and the query lesb*

Table 5.4. Sample of concordance lines of the collocate queen and the query lesb*

Table 5.6. Sample of concordance lines of the collocate housekeeper and the query lesb*

5.3.3 Vet
A concordance analysis of the collocate vet shows the term invariably refers to a
character in a British sitcom aired in 2013. Concordance lines show frequent collocates
that designate technicalities of the show, with words such as writing, starring, and
characters. Other collocates evaluate the show, suggesting amusement and dullness
with words such as hilarious, funny, awkward, and boring. These semantic categories
have positive and negative semantic prosodies respectively. Additionally, there are
various collocates that refer to social ties, with lexical items such as parents, relations, and
mates. Finally, a list of collocates provides contextual information that pertains to the
main plot of the show. This revolves around the life of an adult woman who intends to
reveal her lesbian sexual identity to her parents. Because of this, we come across a set
of collocates such as closet and the phrase come out, as well as words that designate
difficulty, such as struggling and afraid. These display a marked negative semantic
prosody. Some of these collocates are shown in Table 5.7.
As the analysis shows, the collocate vet helps represent lesbians mainly in terms
of their social bonds, sexual identity, and the distress that making their sexuality public
involves.
5.3.4 Ministers
Not surprisingly, the concordances of ministers show the collocate tends to co-occur
with other words that refer to religion such as congregation, and church. Other semantic
categories identified include lexical items that refer to elections, permission, sexual
identity, discussions, disagreement, danger, and reluctance. Some collocates from these
categories are votes, allow, gay, heterosexual, debate, controversial, narrowly,
backlash, terrified, warned and compromise. As can be seen, the last four categories
display distinct negative semantic prosodies. Some of these words are shown in Table
5.8.
In conclusion, the semantic categories identified contribute to represent lesbians
in terms of their inclusion in the domain of religion, as the words that refer to
permission suggest. The semantic category that pertains to elections hints at the idea of
change. However, there is also an ambience of divergence, and peril.
In the next section we expect to enhance the set of representations drawn so far
by focusing on a set of collocates that hint at diversity.


Table 5.8 Sample of concordance lines of the collocate ministers and the query lesb*

Table 5.7. Sample of concordance lines of the collocate vet and the query lesb*

5.4 Colour, Nationality, and Creed: Black, French, and Muslim


The collocates in this semantic category refer to three features that can be used to
characterise humans. In different degrees, these collocates can be connected to the
notion of culture. As they pertain to three different human qualities, a concordance
analysis of these collocates is expected to considerably enhance the set of
representations identified so far.
5.4.1 Black
The analysis of the concordance lines of the collocate black resulted in the identification
of a variety of semantic categories. First of all, we find several terms from the field of
politics, with words such as party, liberal, Democrat, Republican, conservatism, and
radical. Likewise, we find lexical items that refer to the leftist ideology, with collocates
such as leftwing, communism, left, comrade, and Sandinista. We also come across terms
pertaining to militancy and elections, with words such as organisation, movement,
collective, candidate, and voter. As Jaworska and Krishnamurthy (2012: 413) assert,
"radicalism, militancy, and leftist ideology (...) have distinctively negative
connotations". In this way, the semantic categories that express these ideas would have
negative semantic prosodies.
Additionally, there are references to race, human rights, and activism, with
words such as racial, white, equally, freedom, revolution, and activist. Another semantic
category includes references to arts, with the collocates theatrical, music, and
photographer. Terms that hint at diversity are also common, with words such as
Muslim, gay, and queer. These words do not seem to suggest any particular semantic
prosody. Finally, collocates pertaining to bias, such as supremacist, and discrimination,
can be considered to display negative semantic prosodies. Some of the collocates of
black can be seen in the sample concordance in Table 5.9 (page 44).
To summarise, a concordance analysis of the collocate black reveals that when
skin colour is mentioned, lesbians are represented in relation to activism, militancy and
politics, particularly those associated with leftist ideology. Also, they are represented in relation to
discrimination, diversity, and the arts.
5.4.2 French
A concordance analysis shows that the collocate French almost invariably occurs in
reference to films. Most of the tokens are about one particular film that was first
screened at the Cannes Film Festival in 2013. Because of this, we find frequent
collocates that refer to technical aspects of the movie, with words such as directed and
scenes, or to its genre, with the collocates romance and drama. There are
also collocates that refer to duration, with terms such as length, lengthy, and three-hour.
A concordance analysis shows these collocates refer to either a scene in the film or the
film proper. Other frequent collocates designate success, with words like top, and win.
Concordance lines show these collocates refer to the film competition at Cannes and the
performance of the film in it. Additionally, we come across collocates that pertain to
age, a sexual bond, or imply the notion of shock. In these three semantic categories we
find words such as teenagers, young, old, couple, affair, sex, relationship, scandal, and
taboo-busting. Some of these collocates appear in the sample of concordance lines in
Table 5.10 (page 44).
As can be seen, the analysis shows that the collocates of French help to
represent lesbians in terms of their relationship with other lesbians and the sexual
quality of those relationships. Additionally, there seems to be a focus on their age. They
are also represented in relation to the idea of shock.
5.4.3 Muslim
Scrutiny of the concordance lines shows that the frequent occurrence of the collocate
Muslim stems from the reporting of the wedding of two Muslim women. The event was
identified as the first of its kind to take place in the UK. Because of this, we find
frequent collocates that provide contextual information, with words such as first, two,
and women; and others that refer to the union, with the words married, partnership,
brides, and wedding. Finally, there are collocates that designate courage, and
endangerment with terms such as defied, brave, death, and threats. The last two
categories display positive and negative semantic prosodies respectively. A
concordance sample of Muslim is shown in Table 5.11 (page 45).
Once again, the semantic categories identified help to represent lesbians with a
focus on their connection to other lesbians. In this case, that connection is of a lawful
kind, implying a long-term commitment. The lesbians in the data set are represented as
rebellious, but also as facing potential danger. In the following section we focus on age.


Table 5.10 Sample of concordance lines of the collocate French and the query lesb*

Table 5.9 Sample of concordance lines of the collocate black and the query lesb*

Table 5.11 Sample of concordance lines of the collocate Muslim and the query lesb*

5.5 Age: Young


The fact that age is specified when writing about the lesbians in our data set suggests
that there may be differences between the various age groups. Youth is the only life
stage that made it into the list of collocates obtained in chapter four. Because of this,
focusing on this age group specifically is likely to produce some interesting findings.
A concordance analysis of young shows collocates that designate social ties and
kinship, with words such as couples, friends, parents, and family. We also come across
several references to sexual orientation and equality, with collocates such as sexuality,
equal, and rights. These lexical items do not seem to show a particular semantic
prosody. However, the concordance lines of the collocate young also contain various
collocates with marked negative semantic prosodies. Among these there are terms that
pertain to vulnerability, with words such as harm, victim, victimise, high-risk, fears, and
doubts. There are also words related to abuse, difficulties, and seclusion, with the
collocates rape, bullying, struggle, overcome, and isolation. Likewise, we find
collocates that designate violence, and dying, with words such as stabbings, attacks,
strangled, death, and suicide. Some of these collocates are shown in Table 5.12 (page
48).
As with previous analyses, the representations of lesbians near the collocate
young portray them in reference to their ties, often involving friendship and family.
Their rights and sexuality are also a distinctive feature. Finally, they are represented as
helpless, lonely individuals. In this way, youth seems to be an extremely difficult time
for lesbians, where abuse, death, and violence are pervasive aspects in their lives.
In the next section we analyse the concordances of the collocate fat. This is the
last collocate that is discussed in this chapter.
5.6 Appearance: Fat
The collocate fat stands out due to its specificity and the distinctly negative
connotations it has in Western culture. A concordance analysis of this collocate
allows us to address an aspect that has not been previously referred to in the findings,
namely appearance. With that in mind, it is possible to identify the following findings.
Close scrutiny of the concordance lines of the collocate fat shows it
invariably occurs in the reporting of a trial. This trial involved a beauty salon owner
accused of making sexist remarks that expressed his desire to employ only fat, gay, and
lesbian workers so as to avoid having female employees taking maternity leave. The
nature of this piece of news explains the frequent occurrence of collocates that refer to
the context and individuals involved. These individuals are designated in terms of their
profession, and sexual identity or gender, with words such as salon, stylist,
hairdressers, gay, and women. Similarly, we identify collocates that refer to
employment, maternity, and the justice system. Some lexical items from these semantic
categories are boss, employ, staff, pregnant, babies, tribunal, and lawyer. Up to this
point, the collocates do not seem to have any particular semantic prosodies. However,
the collocate fat also shows a semantic preference for words that designate harm,
annoyance/anger, and resentment, thus displaying a distinctive negative semantic
prosody. Within these semantic categories we find the collocates threatened, sick,
angry, rant, and aggrieved. Finally, there are collocates that pertain to discrimination
and abuse with words such as bias, sexist, and bully. These lexical items also display a
negative semantic prosody. Some of these collocates are shown in the concordance
sample in Table 5.13.
As a concordance analysis shows, the collocate fat does not denote lesbians.
However, its occurrence in close proximity to the query lesb* puts lesbians and
overweight individuals in the same category. This category represents them through
their potential to have children, as the collocates that pertain to maternity suggest. Other
semantic categories identified help to portray lesbians in connection to bigotry, and
within an atmosphere of hostility. These findings coincide with previous representations
in the corpus.


Table 5.13 Sample of concordance lines of the collocate fat and the query lesb*

Table 5.12 Sample of concordance lines of the collocate young and the query lesb*

5.7 Concluding Remarks


A closer analysis of the concordance lines of the collocates singled out in chapter four
resulted in the identification of various semantic preferences. These were made into
semantic categories. As the findings in this chapter suggest, lesbians are represented in
terms of their sexual identity, and also regarding their potential to have children. This is
demonstrated by the occurrence of collocates that designate parenthood, and
insemination. Similarly, they are characterised in terms of their ties to other individuals
and to their partners. This can be observed in the semantic preferences that designate
ties, sex, and marriage. Other semantic preferences identified reveal lesbians tend to be
represented in relation to equality, activism, and legislation. This is evidenced by the
collocates that refer to their rights, militancy, politics, and law making. None of the
semantic preferences mentioned so far seems to display any clear-cut semantic
prosody, although Jaworska and Krishnamurthy (2012) suggest militancy and
references to left wing politics, which also represent lesbians in the corpus, have a
negative semantic prosody.
In spite of what the majority of the previous findings suggest, there is a set of
semantic preferences that contributes to representing lesbians through unequivocally negative
semantic prosodies. This is evinced by the frequent occurrence of collocates that
designate trials, hostility, disagreements, arguments, danger, violence, and/or
discrimination. These semantic preferences portray this sexual identity group in a
setting where conflict and insecurity prevail. More importantly, these semantic
preferences are present, to different degrees, in the concordance lines of all the
collocates analysed in this chapter. In this way, the pervasive property of these semantic
preferences and prosodies permeates the representations in the corpus with a
characteristic negative quality. Particularly interesting are findings that portray lesbians
as perpetrators of vicious acts, although they are also represented as victims of
discrimination, and violence in the data set. The latter is especially the case when it
comes to young lesbians.
The following chapter offers an overview of the findings in Chapters Four and
Five of this study. Additionally, an assessment of its strengths and limitations, and
suggestions for future research are given.

CHAPTER SIX
Conclusion
The aim of this chapter is to revisit the main objective and research questions that gave
origin to this research project. Additionally, the principal findings and conclusions are
summarised. Finally, an assessment of the limitations of this study and suggestions for
future research are provided.
6.1 Main findings and implications
This study was designed with the aim to identify frequent lexical patterns that were used
to represent lesbians in British newspapers in 2013. This idea was motivated by three
main assumptions. First of all, that language has an important role in shaping,
maintaining, and changing relations of power. Second, that linguistic repetition is a
means through which a given representation of the social world can become naturalised.
And finally, that newspapers play a part in making these representations known, having
the potential to influence their audiences' perceptions and attitudes about the people and
events they write about. This influence is believed to be exerted in subtle ways. With
this in mind, the analysis carried out in this study was expected to reveal hidden
subtexts that may evoke particular discourses in the data set. The aim is to bring
awareness to both news producers and consumers of the power of the linguistic choices
they make and are exposed to respectively.
Additionally, this study intended to contribute in two different ways: firstly, to
the body of studies that combine a CL and CDA approach, and secondly, to the study of
language, gender and sexuality, in the latter case by focusing on how a sexual
minority that has been relatively understudied in the field is represented in newspapers.
As mentioned in Chapter Three, this study aimed to answer the following
overarching question:
What does an approach that combines CL and CDA methodologies reveal about the ways
in which British newspapers represented lesbians in the year 2013?
In order to answer this question, two specific sub-questions were posed. These
sub-questions were addressed through the creation and analysis of a corpus of newspaper
articles published in Great Britain that contained the words lesbian, lesbianism, or
lesbo. The findings emerging from these sub-questions are discussed below.
1. What does a collocational analysis of the corpus reveal about the semantic categories
used in the representations of lesbians?
Taking into account collocate frequency, the collocational analysis of the corpus
revealed two major findings. The first one suggests that lesbians are primarily
represented in relation to their sexual identity and that of others. In his study about gay
men, Baker (2005) identified different discourses that constructed these individuals.
One of these is what he labelled the gay identity discourse, where gay men are defined
as a social group. This coincides with what was found through the collocational analysis
of the corpus. As Baker suggests, this discourse also implies the idea of deserving equal
political rights, and freedom of expression. In the BN13L corpus, the presence of this
discourse is supported by semantic categories such as sexual identity and gender, LGBT
movement, and groups of people (categories 1, 4, and 7, respectively, in Table 4.2, page
26).
The second major finding suggests that lesbians are represented in relation to
their bonds to other people, particularly to their partners. This is supported by the
frequent occurrence of collocates that refer to sexual or romantic ties (see Table 4.3,
page 27). Other semantic categories that support these claims are the ones pertaining to
sex/love, and marriage (categories 3, and 8 in Table 4.2, respectively). However, based
on the frequency of occurrence of the collocates in these last two categories, there
seems to be a special emphasis on relationships of a sexual rather than a romantic kind.
In this way it could be said that the lesbians in the corpus are primarily represented in
terms of their sexual desire. This is consistent with contemporary post-feminist
representations of lesbians that have been identified in other media, as discussed by
Jackson (2009) and McRobbie (1996, 2004). These representations, Jackson (2009)
asserts, have the potential to position lesbians for consumption by a male audience.
2. What does a closer analysis of concordance lines reveal about the semantic preferences
and the semantic prosodies featuring in the representation of lesbians in the corpus?
The concordance analysis of the collocates scrutinised in this section revealed four
major groups of semantic preferences in the corpus. The first group represents lesbians
in terms of sexual identity, appearing often in association with their male counterparts.
Additionally, lesbians are represented in close relation to their political rights. These
ideas are supported by the identification of semantic categories that pertain to elections,
politics, law making, equality, militancy, activism, and human rights. As can be seen,
this pattern also seems to correlate with the gay identity discourse identified by Baker
(2005).
The second group of semantic preferences to emerge also coincides with
findings in the previous section. The semantic categories that make up this group
represent lesbians in relation to their ties to their partners and family. Regarding their
partners, the findings in this section revealed more references to a formal type of bond
rather than a sexual one. This representation helps to complete the picture that emerged
in the previous part of the analysis, where the bond was predominantly sexual. Semantic
categories that designate marital status, and marriage help to support these claims. In
connection to these semantic categories, the group also includes references to
motherhood, with collocates that designate fertility, insemination, and maternity. In this
way, the formal bond of lesbian partners that features in this category contributes to
represent lesbians in relation to their potential to form a family. However, concordance
analysis also showed evidence of a sexual desire discourse. Particularly interesting is
the collocate housekeeper, which represents lesbians as heteroflexible individuals. This
coincides with contemporary representations identified in other studies, as mentioned by
Jackson (2009) and McRobbie (1996, 2004).
The third group of semantic preferences represents lesbians in two different
ways. First of all, they are commonly represented as vulnerable, endangered individuals,
victims of violence, harm, and abuse. These situations are associated with
discrimination, inequality and, to a lesser extent, sexual abuse. Secondly, lesbians are
represented as agents of violent or cruel acts that are not connected to their sexual
identity. Semantic preferences that suggest this representation were less common than
those in the first group. However, this is a very interesting phenomenon, as it constitutes
a clear example of homophobia. As Baker (2005: 223) suggests, one of the ways in
which public homophobia is manifested includes seemingly casual collocations
between homosexuality and other lexis which imply relationship transience. With this
in mind, mentioning sexual identity when reporting events where sexual identity or
sexuality does not play a role is a clear example of public homophobia in our corpus. As
can be seen, there are conflicting discourses in this group of semantic preferences,
where lesbians are both victimised and demonised.
Finally, the fourth and last major group of semantic preferences in the data
represents lesbians in relation to conflicts and disagreements. These conflicts can be
related to lawsuits, personal struggles, or arguments and disagreements pertaining to their
rights. Similarly, there are semantic categories in this group that represent lesbians in
relation to anger, annoyance, and shock. What makes this group particularly interesting
is that these semantic preferences appear in the concordance lines of every collocate
analysed in Chapter Five.
Regarding semantic prosodies, the analysis of concordance lines revealed that
these were predominantly negative. The majority of the semantic prosodies identified in
the corpus were part of the semantic preferences in the fourth group that was just
discussed, with semantic categories that suggest conflicts and disagreements. As these
semantic preferences occur near each of the collocates analysed qualitatively in Chapter
Five, so do their semantic prosodies. In other words, negative semantic prosodies are
pervasive throughout all the representations identified through concordance analysis,
thus helping to represent lesbians in a distinctly negative way.
As can be seen, there is no single or dominant representation of lesbians in the
corpus, but a number of often conflicting ones. Overall, they coincide with findings in
previous studies that have examined representations of lesbians in other media, and also
representations of gay men. However, the analysis of the corpus has also revealed new
representations such as those pertaining to motherhood, vulnerability, and conflict.
6.2 Assessment of the study
The present study aimed to provide a picture of the representations of lesbians in
newspaper articles. This picture is quite specific, being limited to westernised,
English-speaking discourses produced in one year, and one geographical area. Because of this,
its findings should not be generalised or applied to other contexts or times, but taken as
a frame of reference from which new studies can be conducted. To the best of my
knowledge, this study is the first one to use a corpus-based CDA approach in the
identification of representations of lesbians in newspaper articles. One of the most
valuable benefits of applying this type of interdisciplinary approach is that it contributes
to removing researcher bias. This is mainly achieved through the calculation of statistical
measures that provide scientifically grounded evidence to support and guide our
findings. In spite of this important benefit, some limitations and shortcomings were
identified in the implementation of this project.
The first criticism that can be raised about this study concerns the collection of
the corpus data. This was done through the LexisLibrary database. Using the search
interface of this database presented two main technical difficulties. First, each search
could only return a maximum of one thousand articles. This meant that each newspaper
in the data set had to be searched individually for articles that contained the query term
lesb*. Although this did not constitute a major problem, it is an important aspect to bear
in mind when collecting articles within a longer time span or containing terms which
may occur with more frequency. The second difficulty of using LexisLibrary concerns
the duplicate option on the search interface. As mentioned in Chapter Three, this tool
did not work as expected, as duplicates were retrieved regardless of the duplicate option
being activated. As a consequence, the creation of the corpus was a very time-consuming
task, which not only required cleaning each of the files collected, but also
making sure that the data set did not contain more than one version of the same article.
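The check for repeated versions of the same article could in principle be partly automated. The following is a minimal sketch in Python, not the procedure followed in this study: the function names, the normalisation step and the similarity threshold of 0.9 are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case the text and collapse punctuation and whitespace."""
    return re.sub(r"\W+", " ", text.lower()).strip()


def deduplicate(articles, threshold=0.9):
    """Keep the first copy of each article and drop later near-identical ones.

    `threshold` is an assumed similarity cut-off, not a value from the study.
    """
    kept = []       # original texts that are retained
    kept_norm = []  # their normalised forms, used for comparison
    for text in articles:
        norm = normalise(text)
        is_duplicate = any(
            SequenceMatcher(None, norm, seen).ratio() >= threshold
            for seen in kept_norm
        )
        if not is_duplicate:
            kept.append(text)
            kept_norm.append(norm)
    return kept
```

Each article is compared against every article already kept, so the sketch is only practical for small data sets, and any threshold would need to be checked manually against a sample of the files.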
Another limitation of this study pertains to the second part of the analysis. As
was mentioned in Chapter Three, the collocates whose concordance lines were carefully
scrutinised in Chapter Five were only a part of the collocates of lesb* retrieved by the
concordancer in Chapter Four. Although the analysis of this selection resulted in new
and interesting findings, the picture they provide represents only a part of the whole
data set, which is not as comprehensive as it could be. So as to provide a wider picture,
a considerable amount of additional work would have been necessary. However, such a
big investment of time could not be justified within the scope of a Master's study.
A further limitation of this study concerns the identification of semantic
prosodies. One of the characteristics of CDA is that it does not consider objectivity an
essential element for the interpretation of findings. Nevertheless, when determining
semantic prosody I was faced with a dilemma. This arose from a desire to avoid letting
my own particular worldviews determine the type of semantic prosody, so as to provide
as neutral an interpretation of the data as possible and prevent my cultural
idiosyncrasies from getting in the way of the analysis. In order to accomplish this goal, I
decided to determine semantic prosodies with the help of the reference dictionary. After
doing this, I realised I could have done it in a different way, by comparing my findings
to those in a larger corpus. However, it was not possible to do this due to time
constraints and the lack of access to a synchronic corpus.
6.3 Suggestions for future research
This chapter concludes by suggesting ways in which the present study and its findings
can be expanded and complemented. The suggestions presented below incorporate
different perspectives and methods. I trust these ideas will be beneficial for future
studies of similar characteristics.
One of the ways in which this study can be improved is by adopting a diachronic
perspective. As Baker (2010) suggests, doing so can be a practical way of identifying
whether trends in representations continue or have changed over the years. The
most appropriate way to do this, he asserts, is to include more than three points in
time that are not too far apart, spread over a wide time span. In this way it is possible
to avoid obtaining results that may be due to chance.
Further work could also incorporate other CL tools such as keyword analysis.
Using statistical measures, this tool can identify words that are more frequent in a
corpus when compared to a reference one. Consequently, they can indicate the topics
that are distinctive in the data set under study. Another way of supplementing our
findings is to focus on parts of speech. By doing this it could be possible to determine
which nouns, verbs, or adjectives collocate with a particular lexical item. This last
approach has proved very fruitful in revealing patterns of representation in projects of
similar characteristics (see Pearce 2008; Baker et al. 2013).
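The keyword comparison mentioned above can be illustrated with a short sketch. It assumes log-likelihood as the statistical measure, which is one commonly used keyness statistic; the study itself does not specify a measure, and the function names and the tokenised input lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a, b: frequency of the word in the study and reference corpus
    c, d: total number of tokens in the study and reference corpus
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(study_tokens, reference_tokens):
    """Rank words that are relatively more frequent in the study corpus."""
    study = Counter(study_tokens)
    reference = Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = reference[word]
        if a / c > b / d:  # keep positive keywords only
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Words at the top of the returned list are those whose relative frequency in the study corpus departs most from the reference corpus; both corpora must be non-empty for the relative frequencies to be defined.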
The current research can also be expanded by including methods of analysis
from other CDA approaches. By way of illustration, the Discourse Historical Approach
could be used for closely scrutinising particularly interesting texts. This approach offers
the opportunity to focus on different discourse strategies. Among these we find
predication strategies, which can help to facilitate the identification of semantic
prosodies (see Wodak and Meyer 2009).
Finally, future work could compare lesbians against other social groups with
similar characteristics. Through this approach it could be possible to identify differences
and similarities between the newspaper representations of the different groups. This
type of analysis would position lesbians within a broader social context, considerably
enriching our findings.
Despite the limitations of this study, I believe it has allowed me to highlight
some interesting ways in which lesbians were represented in British newspapers in
2013. I hope this study is able to promote further interest in the media representations of
this sexual identity group. I expect further studies to be conducted across different
languages, cultures and types of media.

REFERENCES
Atanga, Lilian. 2012. The discursive construction of a model Cameroonian woman
within the Cameroonian Parliament. Gender and Language 6(1), 21-45.
Baker, Paul. 2005. Public Discourses of Gay Men. London: Routledge.
Baker, Paul. 2006. Using Corpora in Discourse Analysis. London: Continuum.
Baker, Paul. 2008. Sexed Texts: Language, Gender and Sexuality. London: Equinox.
Baker, Paul. 2010. Will Ms ever be as frequent as Mr? A corpus-based comparison of
gendered terms across four diachronic corpora of British English. Gender and
Language 4(1), 125-129.
Baker, Paul. 2012. Acceptable bias?: Using corpus linguistics methods with critical
discourse analysis. Critical Discourse Studies 9(3), 247-256.
Baker, Paul & Tony McEnery. 2005. A corpus-based approach to discourses of refugees
and asylum seekers in UN and newspaper texts. Journal of Language and
Politics 4(2), 197-226.
Baker, Paul, Costas Gabrielatos, Majid Khosravinik, Michal Krzyzanowski, Tony
McEnery, & Ruth Wodak. 2008. A useful methodological synergy? Combining
critical discourse analysis and corpus linguistics to examine discourses of
refugees and asylum seekers in the UK press. Discourse and Society 19(3), 273-306.
Baker, Paul, Costas Gabrielatos, & Tony McEnery. 2013. Sketching Muslims: a corpus
driven analysis of representations around the word Muslim in the British press
1998-2009. Applied Linguistics 34(3), 255-278.
Bourdieu, Pierre. 1991. Language and Symbolic Power. Oxford: Polity.
Brown, Gillian & George Yule. 1982. Discourse Analysis. Cambridge: Cambridge
University Press.
Burr, Vivien. 1995. An Introduction to Social Constructionism. London: Routledge.
Cameron, Deborah. 1997. Performing gender identity: young men's talk and the
construction of heterosexual masculinity. In Sally Johnson & Ulrike Meinhof
(eds) Language and Masculinity, 47-64. London: Blackwell.

Ciasullo, Ann. 2001. Making Her (In)visible: Cultural Representations of Lesbianism
and the Lesbian Body in the 1990s. Feminist Studies 27(3), 577-608.
Doty, Alexander & Ben Gove. 1997. Queer Representation in the Mass Media. In Andy
Medhurst & Sally Munt (eds) Lesbian and Gay Studies: A Critical Introduction,
84-98. Washington, DC: Cassell.
Ellis, Sonja, Celia Kitzinger & Sue Wilkinson. 2003. Attitudes towards lesbians and gay
men and support for lesbian and gay human rights among psychology students.
Journal of Homosexuality 44(1), 121-138.
Essig, Laurie. 2000. Heteroflexibility.
<http://www.salon.com/2000/11/15/heteroflexibility> [accessed 15 August
2014]
Fairclough, Norman. 1989. Language and Power. London: Longman.
Fairclough, Norman & Ruth Wodak. 1997. Critical Discourse Analysis. In Teun van
Dijk (ed.) Discourse as Social Interaction, 258-284. London: SAGE.
Foucault, Michel. 1972. The Archaeology of Knowledge. London: Tavistock.
Gouveia, Carlos. 2005. Assumptions about gender, power and opportunity: gays and
lesbians as discursive subjects in a Portuguese newspaper. In Michelle Lazar
(ed.) Feminist Critical Discourse Analysis: Gender, Power and Ideology in
Discourse, 229-250. London: Palgrave.
Hardt-Mautner, Gerlinde. 1995. Only Connect: Critical Discourse Analysis and Corpus
Linguistics. UCREL Technical Paper 6. Lancaster: University of Lancaster.
<http://ucrel.lancs.ac.uk/tech_papers.html> [accessed 12 April 2014]
Hayes, Joseph. 1976. Gayspeak. The Quarterly Journal of Speech 62, 256-266.
Hoey, Michael. 2005. Lexical Priming: a new theory of words and language. Oxford:
Routledge.
Holmes, Janet & Stephanie Schnurr. 2006. Doing femininity and work: More than just
relational practice. Journal of Sociolinguistics 10(1), 31-51.
Hunston, Susan. 2002. Corpora in Applied Linguistics. Cambridge: Cambridge
University Press.
Hunston, Susan & Geoffrey Thompson (eds). 2000. Evaluation in Text: Authorial
Stance and the Construction of Discourse. Oxford: Oxford University Press.
Jackson, Sue. 2009. Hot Lesbians: Young People's Talk About Representations of
Lesbianism. Sexualities 12(2), 199-224.
Jaworska, Sylvia & Ramesh Krishnamurthy. 2012. On the F word: A corpus-based
analysis of the media representation of feminism in British and German press
discourse, 1990-2009. Discourse & Society 23(4), 401-431.
Kim, Kyung Hye. 2014. Examining US news media discourses about North Korea: A
corpus-based critical discourse analysis. Discourse & Society 25(2), 221-244.
Koller, Veronika. 2009. Butch camp: On the discursive construction of a queer identity
position. Gender and Language 3(2), 249-274.
Koller, Veronika. 2013. Constructing (non-)normative identities in written lesbian
discourse: a diachronic study. Discourse and Society 24(5), 551-568.
Kosetzi, Konstantia & Alexandra Polyzou. 2009. 'The perfect man, the proper man':
Construals of masculinities in Nitro, a Greek men's lifestyle magazine: an
exploratory study. Gender and Language 3(2), 143-180.
Krishnamurthy, Ramesh. 1996. Ethnic, Racial and Tribal: The Language of Racism? In
Carmen Rosa Caldas-Coulthard & Malcolm Coulthard (eds) Texts and
Practices: Readings in Critical Discourse Analysis, 129-149. London:
Routledge.
Lazar, Michelle (ed.). 2005. Feminist Critical Discourse Analysis: Gender, Power and
Ideology in Discourse. London: Palgrave.
Louw, Bill. 1993. Irony in the text or insincerity in the writer? In Mona Baker, Gill
Francis & Elena Tognini-Bonelli (eds) Text and Technology, 157-176.
Amsterdam: Benjamins.
Macmillan Dictionary, online version at <http://www.macmillandictionary.com/>
[accessed 4 July 2014]
Mautner, Gerlinde. 2007. Mining large corpora for social information: the case of
'elderly'. Language in Society 36(1), 51-72.
Mautner, Gerlinde. 2009. Checks and balances: How corpus linguistics can contribute
to CDA. In Ruth Wodak & Michael Meyer (eds) Methods of Critical Discourse
Analysis, 122-143. London: Sage.

McRobbie, Angela. 1996. More! New Sexualities in Girls' and Women's Magazines. In
John Curran, David Morley & Valerie Walkerdine (eds) Cultural Studies and
Communications, 172-94. London: Arnold.
McRobbie, Angela. 2004. Postfeminism and Popular Culture. Feminist Media Studies
4(3), 255-64.
Nattinger, James & Jeanette DeCarrico. 1992. Lexical Phrases and Language Teaching.
Oxford: Oxford University Press.
O'Halloran, Kieran. 2009. Inferencing and cultural reproduction: a corpus-based critical
discourse analysis. Text & Talk 29(1), 21-51.
O'Keeffe, Anne. 2011. Media and Discourse Analysis. In James Gee & Michael
Handford (eds) The Routledge Handbook of Discourse Analysis, 441-454.
London: Routledge.
Orpin, Deborah. 2005. Corpus Linguistics and Critical Discourse Analysis: Examining
the Ideology of Sleaze. International Journal of Corpus Linguistics 10(1), 37-61.
Partington, Alan. 1991. A Corpus-based Study of the Collocational Behaviour of
Amplifying Intensifiers in English. Unpublished M.A. dissertation: University of
Birmingham.
Partington, Alan. 1998. Patterns and Meanings. Amsterdam and Philadelphia: John
Benjamins.
Partington, Alan. 2004. 'Utterly content in each other's company': Semantic prosody
and semantic preference. International Journal of Corpus Linguistics 9(1), 131-156.
Pearce, Michael. 2008. Investigating the collocational behaviour of MAN and WOMAN
in the British National Corpus using Sketch Engine. Corpora 3(1), 1-29.
Sinclair, John. 1996. The search for units of meaning. Textus IX, 75-106.
Stubbs, Michael. 1997. Whorf's Children: Critical comments on Critical Discourse
Analysis (CDA). In Ann Ryan & Alison Wray (eds) Evolving models of
language, 100-116. Clevedon: BAAL in association with Multilingual Matters.
Stubbs, Michael. 2001. Words and Phrases: Corpus Studies of Lexical Semantics.
Oxford: Blackwell.

Tannen, Deborah. 1990. You Just Don't Understand: Women and Men in Conversation.
London: Virago.
van Dijk, Teun. 1995. Aims of Critical Discourse Analysis. Japanese Discourse 1, 17-27.
van Dijk, Teun. 2010. Discourse, Power and Symbolic Elites.
<http://w2.bcn.cat/bcnmetropolis/arxiu/en/page5f80.html?id=21&ui=337>
[accessed 10 April 2014]
Wodak, Ruth & Brigitta Busch. 2004. Approaches to media texts. In John Downing,
Denis McQuail, Philip Schlesinger & Ellen Wartella (eds) The Sage Handbook
Of Media Studies, 105-123. London: Sage.
Wodak, Ruth & Michael Meyer. 2009. Methods of Critical Discourse Analysis. Second
Edition. London: SAGE.
Widdowson, Henry. 1995. Review of Fairclough Discourse and Social Change. Applied
Linguistics 16(4), 510-516.
Widdowson, Henry. 1996. Discourse and interpretation: Conjectures and refutations
[Reply to Fairclough, 1996]. Language and Literature 5(1), 57-69.
Widdowson, Henry. 1998. The theory and practice of critical discourse analysis.
Applied Linguistics 19(1), 136-151.
Wilkinson, Sue. 1996. Bisexuality a la mode. Women's Studies International Forum
19(3), 293-301.

Software
Anthony, Laurence. 2014. AntConc (Version 3.4.1m) [Computer Software]. Tokyo,
Japan: Waseda University. <http://www.antlab.sci.waseda.ac.jp/> [accessed 12
April 2014]

Appendix 1: List of lexical collocates


All lexical collocates of the query lesb* within a -4 and +4 span with a minimum MI value of 4,
sorted by decreasing MI score.
Rank  Collocate      MI score  Frequency
1     transgender    8.67468   201
2     bisexual       8.63505   294
3     bisexuals      8.62849   12
4     gays           7.8661    90
5     vet            7.65546   18
6     housekeeper    7.6068    22
7     fling          7.38748   11
8     lover          7.34121   59
9     graphic        7.29592   18
10    kiss           7.00046   33
11    trans          6.93661   13
12    couple         6.85356   135
13    gay            6.82954   861
14    affair         6.71319   38
15    couples        6.69796   117
16    fat            6.61255   15
17    Muslim         6.59197   13
18    explicit       6.37215   12
19    equality       6.25545   61
20    queen          6.25345   32
21    lovers         6.22739   13
22    affairs        6.16509   11
23    charity        5.96107   17
24    families       5.9597    26
25    activists      5.93335   14
26    openly         5.89153   20
27    scenes         5.74807   22
28    heterosexual   5.74097   20
29    porn           5.72349   17
30    community      5.5696    28
31    romance        5.53538   10
32    LGBT           5.49035   34
33    wedding        5.48372   15
34    ex             5.47469   20
35    girlfriend     5.40005   17
36    movement       5.39583   10
37    sitcom         5.37551   10
38    partner        5.36101   36
39    magazine       5.33353   18
40    plays          5.32106   20
41    marry          5.30017   15
42    straight       5.29698   30
43    staff          5.28745   15
44    friends        5.28366   50
45    relationships  5.27651   19
46    accused        5.25926   12
47    relationship   5.20347   45
48    ministers      5.16034   10
49    daughter       5.03738   26
50    bar            5.004     10
51    action         5.004     14
52    scene          4.92395   18
53    homosexual     4.91389   13
54    sister         4.89153   13
55    parents        4.88331   35
56    pair           4.87847   11
57    famous         4.86178   12
58    men            4.84311   63
59    club           4.83708   13
60    group          4.8294    25
61    love           4.7606    70
62    French         4.7584    15
63    members        4.75402   14
64    sex            4.71827   96
65    MP             4.66166   11
66    black          4.65052   22
67    secret         4.63397   11
68    groups         4.61652   10
69    rights         4.59954   37
70    festival       4.59648   13
71    full           4.5696    18
72    drama          4.56138   21
73    married        4.54719   23
74    star           4.461     22
75    wife           4.43552   23
76    having         4.36152   33
77    experience     4.35962   12
78    comedy         4.32732   14
79    support        4.30151   19
80    movie          4.27201   11
81    people         4.26669   113
82    young          4.23544   31
83    former         4.18161   21
84    single         4.1415    11
85    called         4.10584   23
86    knew           4.09244   10
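
For readers who wish to see how a list of this kind can be derived, the following is a minimal sketch in Python. It assumes the basic mutual information formula, MI = log2(observed / expected), with the expected frequency estimated from the corpus frequencies of the node and the collocate. It is not AntConc's implementation, which may differ in detail (for example, in how the span size is taken into account), so its scores should not be expected to reproduce those above. The function name and the `is_node` parameter are hypothetical.

```python
import math
from collections import Counter


def collocates_by_mi(tokens, is_node, span=4, min_mi=4.0):
    """Return (collocate, MI, co-occurrence frequency) sorted by decreasing MI.

    tokens:  the corpus as a list of word tokens
    is_node: function that returns True for forms of the query term
    span:    number of tokens considered to the left and right of the node
    """
    n = len(tokens)
    freq = Counter(tokens)
    node_positions = [i for i, tok in enumerate(tokens) if is_node(tok)]
    node_freq = len(node_positions)

    # Count every non-node token that falls inside the window of a node.
    cooccurrence = Counter()
    for i in node_positions:
        start, end = max(0, i - span), min(n, i + span + 1)
        for j in range(start, end):
            if j != i and not is_node(tokens[j]):
                cooccurrence[tokens[j]] += 1

    rows = []
    for word, observed in cooccurrence.items():
        expected = node_freq * freq[word] / n  # simple estimate, no span adjustment
        mi = math.log2(observed / expected)
        if mi >= min_mi:
            rows.append((word, mi, observed))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

A call such as `collocates_by_mi(tokens, lambda t: t.lower().startswith("lesb"))` would mirror the query lesb*, the -4 to +4 span and the minimum MI value of 4 described above, although filtering out grammatical words to leave only lexical collocates would remain a separate step.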