Neal Goldfarb
www.LAWnLinguistics.com
Thinking Like a Linguist
B.T. Sue Atkins, Practical Lexicography and its Relation to Dictionary Making,
Dictionaries: Journal of the Dictionary Society of North America (1992/93)
Faced with the overwhelming richness and subtlety of the language in a computerized
corpus, I no longer believe that it is possible to give a faithful, far less a true, account of the
meaning of a word within the constraints of the traditional [dictionary] entry structure.
1. Paragraphing has been altered in some places, and in the interest of readability, not all deletions
have been indicated.
Any attempt to write a completely analytical definition of any common word in a natural
language is absurd. Experience is far too diverse for that. What a good dictionary offers
instead is a typification: the dictionary definition summarizes what the lexicographer finds
to be the most typical common features, in his [or her] experience, of the use, context, and
collocations of the word. Necessary and sufficient conditions are fine for a great number
of purposes in the construction of scientific concepts, but they are defective as tools for the
description of natural language or human cognitive processes.
Linguistic corpora
A linguistic corpus (plural = corpora) is a computerized database of real-world texts that
enables users to research the real-life use of English. With a corpus that is sufficiently large,
it is possible to identify patterns of use that would otherwise be impossible to see.
Several large corpora hosted by Brigham Young University are available for public use
without charge. These include the Corpus of Contemporary American English (COCA)
and the Corpus of Historical American English (COHA). COCA consists of roughly 520
million words taken from more than 160,000 separate texts from the period 1990–2015.
These texts are equally divided between five genres: spoken language, fiction, popular
magazines, newspapers, and academic journals. COHA, in turn, consists of approximately
400 million words from the period 1810s–2000s (20 million words per decade).
Each word in these corpora is tagged with its part of speech (noun, verb, adjective, etc.),
and the interface makes it possible to perform different kinds of linguistically-oriented
searches. For example, one can search for the collocates of any given word, i.e., the
words that occur together with the target word. This is useful because seeing a word's
collocates can provide insight into how the word is used and therefore what it means.
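The idea behind a collocate search can be illustrated with a minimal sketch in Python. This is not how COCA itself is implemented or queried. The tiny tagged sample, the tag labels, and the function name noun_collocates are all hypothetical, chosen only to show how counting the nouns that immediately follow a target word yields a frequency-ordered list.

```python
from collections import Counter

# Hypothetical toy corpus: (word, part-of-speech) pairs. A real corpus such as
# COCA is vastly larger and uses its own, more detailed tag set.
TAGGED = [
    ("her", "PRON"), ("personal", "ADJ"), ("life", "NOUN"),
    ("was", "VERB"), ("private", "ADJ"),
    ("his", "PRON"), ("personal", "ADJ"), ("life", "NOUN"),
    ("and", "CONJ"), ("personal", "ADJ"), ("experience", "NOUN"),
    ("a", "DET"), ("personal", "ADJ"), ("computer", "NOUN"),
    ("my", "PRON"), ("personal", "ADJ"), ("life", "NOUN"),
]


def noun_collocates(tagged_tokens, target, window=1):
    """Count nouns found within `window` tokens to the right of `target`.

    Returns (noun, count) pairs, most frequent first.
    """
    counts = Counter()
    for i, (word, _tag) in enumerate(tagged_tokens):
        if word.lower() != target:
            continue
        # Look only at the tokens immediately after the target word.
        for other, tag in tagged_tokens[i + 1 : i + 1 + window]:
            if tag == "NOUN":
                counts[other.lower()] += 1
    return counts.most_common()


if __name__ == "__main__":
    for noun, freq in noun_collocates(TAGGED, "personal"):
        print(noun, freq)
```

In this toy sample the most frequent noun after personal is life, which mirrors the kind of ranked list a collocate search returns.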
This is the search request in COCA that will produce a list of the nouns that are
modified by the word personal, listed in order of frequency:
This is part of the KWIC (Key Words in Context) display for personal life:
A similar display can be called up for each of the other collocates. And for each line in the
KWIC display, it is possible to call up a longer excerpt that provides more of the context.
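A KWIC display can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the corpus interface's actual code. The sample sentences and the function name kwic are hypothetical. Each occurrence of the search phrase is placed in the same column, with a fixed amount of context on either side.

```python
# Hypothetical sample text; a real corpus would supply the source passages.
SAMPLE = (
    "She kept her personal life out of the press. "
    "Reporters asked about his personal life anyway."
)


def kwic(text, phrase, width=20):
    """Return one aligned line per occurrence of `phrase` in `text`.

    `width` is the number of context characters shown on each side.
    Matching is case-insensitive (assumes plain ASCII text).
    """
    lines = []
    lowered = text.lower()
    needle = phrase.lower()
    start = lowered.find(needle)
    while start != -1:
        end = start + len(needle)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}} [{text[start:end]}] {right}")
        start = lowered.find(needle, end)
    return lines


if __name__ == "__main__":
    for line in kwic(SAMPLE, "personal life"):
        print(line)
```

Widening the width parameter corresponds loosely to calling up a longer excerpt for a given line.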
Justice Thomas Lee of the Utah Supreme Court has used corpus research in concurring
opinions in several statutory interpretation cases, most notably in State v. Rasabout, 356 P.3d
1258, 1275–90 (Utah 2015) (Lee, J., concurring in part and concurring in the judgment).
However, the rest of the court hasn't yet climbed onto the corpus-linguistic bandwagon.
Despite the Utah court's wariness, the Michigan Supreme Court recently became the
first appellate court in the country (and possibly in the world) to endorse the use of corpus
linguistics in statutory interpretation. People v. Harris, ___ N.W.2d ___, 499 Mich. 332, 2016
WL 3449466 at *5 & nn.29–34 (2016); see also id., 2016 WL 3449466 at *11 n.14 (Markman,
J., concurring in part and dissenting in part).
Further reading:
Stephen C. Mouritsen, The Dictionary Is Not A Fortress: Definitional Fallacies and A Corpus-
Based Approach to Plain Meaning, 2010 B.Y.U. L. Rev. 1915 (2010).
Stephen C. Mouritsen, Hard Cases and Hard Data: Assessing Corpus Linguistics As an
Empirical Path to Plain Meaning, 13 Colum. Sci. & Tech. L. Rev. 156 (2012).
Recent Case, Statutory Interpretation – Interpretative Tools – Utah Supreme Court Debates
Judicial Use of Corpus Linguistics – State v. Rasabout, 356 P.3d 1258 (Utah 2015), 129
Harv. L. Rev. 1468 (2016).
James C. Phillips, Daniel M. Ortner & Thomas R. Lee, Corpus Linguistics & Original Public
Meaning: A New Tool to Make Originalism More Empirical, 126 Yale L.J. Forum 20 (2016),
http://www.yalelawjournal.org/forum/corpus-linguistics-original-public-meaning.
Lawrence M. Solan, Can Corpus Linguistics Help Make Originalism Scientific?, 126 Yale L.J.
Forum 57 (2016),
http://www.yalelawjournal.org/forum/can-corpus-linguistics-help-make-originalism-scientific.