
University of Virginia School of Law

Public Law and Legal Theory Research Paper Series No. 2009-14

Can Bad Science Be Good Evidence? Lie Detection, Neuroscience and the Mistaken Conflation of Legal and Scientific Norms

Frederick Schauer
University of Virginia School of Law

October 2009

This paper may be downloaded without charge from the Social Science Research Network Electronic Paper Collection: http://ssrn.com/abstract=1448744

A complete index of University of Virginia School of Law research papers is available at Law and Economics: http://www.ssrn.com/link/U-Virginia-LEC.html Public Law and Legal Theory: http://www.ssrn.com/link/U-Virginia-PUB.html

Electronic copy available at: http://ssrn.com/abstract=1448744

Can Bad Science Be Good Evidence? Lie Detection, Neuroscience and the Mistaken Conflation of Legal and Scientific Norms
Frederick Schauer

Abstract

As the capabilities of cognitive neuroscience, in particular functional magnetic resonance imaging (fMRI) "brain scans," have become more advanced, some have claimed that fMRI-based lie-detection can and should be used at trials and for other forensic purposes to determine whether witnesses and others are telling the truth. Although some neuroscientists have promoted such claims, most aggressively resist them, arguing that the research on neuroscience-based lie-detection is deeply flawed in numerous ways. And so these neuroscientists have resisted any attempt to use such methods in litigation, insisting that poor science has no place in the law. But although the existing studies have serious problems of validity when measured by the standards of science, and although the reliability of such methods is significantly lower than their advocates claim, it is nevertheless an error to assume that the distinction between good and bad science, whether as a matter of validity or of reliability, is dispositive for law. Law is not only about putting criminals in jail, and numerous uses of evidence in various contexts in the legal system require a degree of probative value far short of proof beyond a reasonable doubt. And because legal and scientific norms, standards, and goals are different, good science may still not be good enough for some legal purposes, and, conversely, some examples of bad science may, in some contexts, still be good enough for law. Indeed, the exclusion of substandard science, when measured by scientific standards, may have the perverse effect of lowering the accuracy and rigor of legal fact-finding, because the exclusion of flawed science will only increase the importance of the even more flawed nonscience that now dominates legal fact-finding. And thus the example of neuroscience-based lie detection, while timely and important in its own right, is even more valuable as a case study suggesting that Daubert v. Merrell Dow Pharmaceuticals may have sent the legal system down a false path. By inappropriately importing scientific standards into legal decision-making with little modification, Daubert confused the goals of science with those of law, a mistake that it is not too late for the courts to correct.


Forthcoming in Cornell Law Review, vol. 95 (2010), draft of 09/08/2009

CAN BAD SCIENCE BE GOOD EVIDENCE? NEUROSCIENCE, LIE-DETECTION, AND BEYOND

Frederick Schauer1

I. INTRODUCTION

How should the legal system confront the advances in the brain sciences that may possibly allow more accurate determinations of veracity -- lie detecting -- than those that now pervade the litigation process? In this Essay I seek to question the view, widespread among the scientists most familiar with these advances, that the neuroscience of lie-detecting is not, or at least not yet, nearly reliable enough to be used in civil and criminal litigation or for related forensic purposes. But in challenging the neuroscientists and their allies, I make no claims for the science of lie-detecting that go beyond the current state of scientific knowledge or, more importantly, my own ability to speak about the relevant scientific developments. Rather, I
1. David and Mary Harrison Distinguished Professor of Law, University of Virginia. This Essay was presented on September 5, 2009, at the Mini-Foro on Proof and Truth in the Law, Institute for Philosophical Research, Universidad Nacional Autónoma de México (UNAM); on April 7, 2009, as an Inaugural Lecture at the University of Virginia School of Law; and on June 1, 2009, at the Duck Conference on Social Cognition. Many of the ideas presented here were generated during various meetings of the John D. and Catherine T. MacArthur Foundation's Law and Neuroscience Project, whose tangible and intangible support I am delighted to acknowledge. Detailed and constructive comments by Charles Barzun, Teneille Brown, Greg Mitchell, John Monahan, and Bobbie Spellman have made this version immeasurably better than its predecessors.


argue that because law's goals and norms are different from those of science, there is no more reason to impose the standards of science on law than there is to impose the standards of law on science. Law must and should use science, and should always prefer good science to bad, but in some contexts good science may still not be good enough for law, and in other contexts -- hence the title of this Essay -- bad science, if measured by the standards of scientists, may still have valuable legal uses.

To be clear, however, my goal in this paper is decidedly not to argue that neuroscience-based lie detection should, now or even in the foreseeable future, necessarily be admissible in court or legitimately used for other forensic purposes. Rather, it is to argue that the question whether neuroscience-based lie detection should be used for legal-system purposes is a question that cannot be answered exclusively according to scientific standards of reliability and validity. Science can (and should) inform the legal system about the facts, including facts about degrees of reliability and the extent of experimental validity, but the ultimate normative and institutional question of whether and when, if at all, a given degree of validity or reliability is sufficient for this or that legal or forensic purpose is a legal and not a scientific question.

In important respects the analysis of the potential legal uses of neuroscience-based lie-detection is more case study than discrete topic. Most of what I argue here applies to other forms of lie-detection, to other forms of scientific evidence, and indeed to evidence generally. And as I elaborate in the latter part of this Essay, my central theme may call into doubt some dimensions of the modern revolution in the standards for the admission of scientific evidence.

Starting with Daubert v. Merrell Dow Pharmaceuticals, Inc.,2 and continuing through General Electric Co. v. Joiner3 and Kumho Tire Co., Ltd. v. Carmichael,4 the Supreme Court over the past sixteen years has attempted to deal with the very real problem of junk science by imposing increasingly stringent scientific standards of reliability and experimental validity on the admissibility of scientific evidence and expert testimony in the federal courts. In dealing with science and experts but not with the myths and superstitions that pervade the fact-finding process, however, the Court may unintentionally have lowered the quality of evidence generally. By discouraging poor science while leaving non-science untouched, the Daubert revolution may perversely have increased reliance on the even more flawed non-science that dominates the litigation process precisely by not masquerading as science at all. There may not be an easy solution to this problem, but its identification suggests that Daubert may have created as many problems as it solved. The revolution in scientific and expert testimony that started with Daubert, therefore, is, or at least should be, far from over.

II. NEUROSCIENCE-BASED LIE-DETECTION: CLAIMS AND COUNTER-CLAIMS

I commence by describing the current controversy over the legal and forensic5 uses of neuroscience-based lie-detection. In some respects the controversy should come as no surprise. The common law litigation process places huge reliance on the sworn testimony of

2. 509 U.S. 579 (1993).

3. 522 U.S. 136 (1997).

4. 526 U.S. 137 (1999).

5. In this Essay I use "forensic," as distinguished from "legal," to refer to those dimensions of criminal investigation that precede or exist apart from the trial.

witnesses, a phenomenon that is itself worthy of note.6 Many other methods of factual investigation, after all, employ dramatically different approaches relying far more heavily on primary rather than secondary sources of knowledge. The scientist who seeks to determine whether drinking red wine reduces the likelihood of heart disease does not summon representatives of the wine industry and the Temperance League to each make their cases, following which she decides which of the two advocates is more believable. Rather, she engages in the kind of primary research we call experimentation. So too with historians when they do original archival research, psychologists when they conduct experiments on subjects, empirical economists when they perform multiple regressions with large data sets, oceanographers when they explore the sea with scientific instruments or submersible watercraft, and researchers for policy-makers when they combine techniques such as these and others to determine the factual terrain for which they will make policy.7

6. Although sworn witness testimony also plays a large role in the civil law, the qualification in the text to the common law is a function of the somewhat larger role that judges play in many civil law countries in managing the process of direct factual investigation, especially in criminal cases. See Mireille Delmas-Marty & J.R. Spencer, European Criminal Procedures (2002); Jacqueline Hodgson, French Criminal Justice: A Comparative Account of the Investigation and Prosecution of Crime in France (2005).
7. Testimony and other forms of indirect evidence do play a significant role outside of law. See, for example, C.A.J. Coady, Testimony: A Philosophical Study (1992); Elizabeth Fricker, Testimony: Knowing Through Being Told, in Handbook of Epistemology 109 (I. Niiniluoto, M. Sintonen, & J. Wolenski, eds., 2004); Axel Gelfert, Indefensible Middle Ground for Local Reductionism about Testimony, 22 Ratio (new series) 170 (2009); John Hardwig, The Role of Trust in Knowledge, 88 J. Phil. 693 (1991). Nevertheless, law is noteworthy in relying on testimony and authority more than most other disciplines, and, conversely, and especially in courtroom settings, relying substantially less on direct investigation and experimentation. "[A]uthority and hierarchy play a role in law that would be inimical to scientific investigation." Richard A. Posner, The Problems of Jurisprudence 62 (1990).

Once we grasp the diverse array of primary techniques for determining facts -- for figuring out what is or was the case -- we can understand how unusual the legal system is in routinely using party-generated witnesses to provide information as to which they, but not the trier of fact, have first-hand knowledge,8 and as to which the trier of fact is ordinarily precluded from obtaining the first-hand knowledge that in other domains remains the gold standard for empirical reliability.

Still, the legal system we have, idiosyncratic as it is within the realm of empirical inquiry in relying so heavily on second-hand knowledge, is one in which it is often important to determine which of two opposing witnesses is telling the truth. Of course not all of litigation involves a conflict between a truth-teller and a liar. Honest misperceptions and more-or-less honest omissions, exaggerations, shadings, fudgings, slantings, bendings, and hedgings9 are an omnipresent feature of modern litigation. But so too is flat-out lying, and it should come as no surprise that because the legal system relies far more heavily on the reports of witnesses than on primary investigation by the trier of fact, the law has long been preoccupied with trying to assess whether the witnesses who testify in court (or otherwise provide information for legal or forensic decision-making) are telling the truth.

Historically, the law relied on the oath to serve the truth-warranting function. When people genuinely believed that lying under oath would send them to hell, the law could

8. Fed. R. Evid. 609.

9. See Frederick Schauer & Richard Zeckhauser, Paltering, in Deception: From Ancient Empires to Internet Dating 38 (Brooke Harrington, ed., 2009).

comfortably rely on a witness's fear of burning for eternity to provide confidence that witnesses were unlikely to say things they believed to be untrue.10 As religious belief diminished, or at least as the law's confidence in it as a guarantor of truth waned, the legal system had its own substitute, increasingly relying on faith in the lie-exposing powers of vigorous cross-examination. As celebrated in the Perry Mason television series of the 1950s and 1960s, and as reinforced by numerous items of popular culture since, the legal system has long believed that cross-examination will so reduce the effectiveness of lying that a truth-determining system relying on witness testimony and subsequent cross-examination would not be unacceptably vulnerable to intentional deception.11 More importantly, because cross-examination is far less effective in exposing lies and liars than television writers and their viewers believe, the legal system has placed its faith in judges and juries. Now, the task of determining veracity, and credibility in general, has been assigned largely to the trier of fact,12 most visibly even if not most frequently the jury, which in engaging in this task is asked

10. See Thomas Raeburn White, Oaths in Judicial Proceedings and Their Effect upon the Competency of Witnesses, 51 U. Pa. L. Rev. 373 (1903); Daniel Blau, Note, Holy Scriptures and Unholy Strictures: Why the Enforcement of a Religious Orthodoxy Demands a More Refined Establishment Clause Analysis of Courtroom Oaths, 4 First Amend. L. Rev. 223, 226-29 (2006).
11. See 5 John Henry Wigmore, Evidence in Trials at Common Law 32 (James Chadbourn ed., 1974).

12. See, e.g., United States v. Barnard, 490 F.2d 907, 912 (9th Cir. 1973) (the jury is "the lie detector in the courtroom"); State v. Christensen, 163 P.3d 1175 (Idaho 2007) (holding that admission of polygraph evidence would invade the province of the jury); Bloom v. People, 185 P.3d 797, 807 (Colo. 2008) (same); State v. Lyon, 744 P.2d 231, 240 (Or. 1987) (Linde, J., concurring) (same). See also United States v. Thompson, 615 F.2d 329, 332-33 (5th Cir. 1980) (stating that issues of credibility in general are for the jury); State v. Myers, 382 N.W.2d 91, 97 (Iowa 1986) (same). A comprehensive historical account is George Fisher, The Jury's Rise as Lie Detector, 107 Yale L.J. 575 (1997).

to assess, among other things, the demeanor of witnesses, their past records of truth-telling (or not), the internal coherence of their stories, and the external coherence of their stories with the stories of others, all in order to determine who is telling the truth and who is not.

Enter science. Because the criteria traditionally employed by judges and juries to evaluate the veracity of witnesses have been notoriously unreliable,13 the quest for a scientific way of distinguishing the truth-teller from the liar has been with us for generations. Indeed, what for many years was the prevailing legal standard for determining the admissibility of scientific evidence -- the Frye test14 -- arose in 1923 in the context of an unsuccessful attempt to admit into evidence a rudimentary lie-detection machine invented by William Moulton Marston. Marston is perhaps better known as the creator of the comic book character Wonder Woman, whose attributes included possession of a magic lasso, forged from the Magic Girdle of Aphrodite, which would make anyone it encircled tell the truth without fail. The device at issue in Frye was a simple polygraph and not a magic lasso, but not only did Frye set the standard for the admission of scientific evidence for more than half a century, its exclusion of lie-detection

13. Useful summaries of the research are in Aldert Vrij, Detecting Lies and Deceit: The Psychology of Lying and the Implications for Professional Practice (2000); Jeremy A. Blumenthal, A Wipe of the Hands, A Lick of the Lips: The Validity of Demeanor Evidence in Assessing Witness Credibility, 72 Nebr. L. Rev. 1157 (1993); Lindsley Smith, Juror Ability to Determine Deception and Veracity, 4 Comm. L. Rev. 4 (2000); Olin Guy Wellborn III, Demeanor, 76 Cornell L. Rev. 1075, 1082-88 (1991).
14. Frye v. United States, 293 F. 1013, 1014 (D.C. Cir. 1923) (holding that scientific evidence must use methods generally accepted in the relevant scientific community).

technology also paved the way for the continuing exclusion, with few exceptions, of lie-detection evidence in American courts.15 The science of lie-detection has improved considerably since 1923, but not by so much as to lead to large-scale changes in judicial attitudes. Indeed, after the replacement (in federal courts) in 1993 of Frye's "general acceptance" test by Daubert v. Merrell Dow Pharmaceuticals, Inc.16 and its insistence on various indicia of scientific reliability as a precondition of the admissibility of evidence purporting to be scientific, the situation has remained much the same.17

What makes the foregoing important is the rapidly changing state of cognitive neuroscience -- the study of human thinking using various methods of (indirectly) measuring

15. See, e.g., Brown v. Darcy, 783 F.2d 1389, 1394 (9th Cir. 1986) (excluding lie-detection evidence); United States v. Gilliard, 133 F.3d 809, 815 (11th Cir. 1998) (same); United States v. Sanchez, 118 F.3d 192 (4th Cir. 1997) (same); Wilkins v. State, 190 P.3d 957 (Kan. 2008) (same); State v. Jones, 753 N.W.2d 677 (Minn. 2008) (same); People v. Richardson, 183 P.3d 1146 (Cal. 2008) (same). Decisions more sympathetic to the use of polygraph evidence under some conditions include United States v. Cordoba, 104 F.3d 225, 228 (9th Cir. 1997) (holding that a per se exclusion of polygraph testimony does not survive Daubert); Rupe v. Wood, 93 F.3d 1434 (9th Cir. 1996) (holding polygraph evidence admissible to support defendant's statement at sentencing hearing); United States v. Posado, 57 F.3d 428 (5th Cir. 1995) (rejecting per se exclusion); United States v. Crumby, 895 F. Supp. 1354 (D. Ariz. 1995) (limiting polygraph use to corroborating or impeaching defendant's testimony); United States v. Galbreth, 908 F. Supp. 877, 896 (D.N.M. 1995) (allowing polygraph evidence if examiner is properly qualified); Commonwealth v. Duguay, 720 N.E.2d 458, 463 (Mass. 1999) (same). The issue is sometimes dealt with by statute. Such statutes typically exclude polygraph evidence, as in California Evidence Code § 351.1 (2008), but New Mexico is an exception. New Mexico Rules of Evidence 11-707 (2008).

16. 509 U.S. 579 (1993). Further refinements were added by the Supreme Court in Kumho Tire Co., Ltd. v. Carmichael, 526 U.S. 137 (1999), and General Electric Co. v. Joiner, 522 U.S. 136 (1997).
17. See Christopher B. Mueller & Laird C. Kirkpatrick, Evidence 701-04 (4th ed., 2009).

brain activity, as opposed simply to examining the overt behaviors that such brain activity generates. The tools of modern neuroscience are numerous, but the most prominent of them is fMRI -- functional magnetic resonance imaging.18 Commonly called "brain scanning," fMRI examination holds out the possibility of being able to determine which parts of the brain are being used for which cognitive tasks. Although novices seeing an fMRI scan sometimes believe that certain parts of the brain "light up" when engaged in certain tasks,19 what actually occurs is that the portion of the brain being used recruits more oxygenated blood cells to help it in its task, and thus what appears to be a "lit up" part of the brain is actually a part of the brain that has more oxygenated hemoglobin in it than it had when it was less or differently cognitively engaged.

As should by now be apparent, the development of fMRI technology has led some researchers to believe that this technology can be effective in distinguishing liars from truth-tellers.20 If -- and it is a huge if -- different parts of the brain are used for lying than for truth-

18. Accessible explanations of the fMRI include Scott A. Huettel, Allen W. Song, & Gregory McCarthy, Functional Magnetic Resonance Imaging (2004); Marcus Raichle, A Brief History of Functional Brain Mapping, in Brain Mapping: The Systems 33 (Arthur W. Toga & John Mazziotta eds., 2000); Marcus E. Raichle, A Brief History of Human Brain Mapping, 32 Trends in Neuroscience 118 (2008); Marcus E. Raichle & Mark A. Mintun, Brain Work and Brain Imaging, 29 Ann. Rev. Neuroscience 449 (2006).
19. See Teneille Brown & Emily Murphy, Functional Neuroimaging as Evidence of a Defendant's Past Mental States, 62 Stan. L. Rev. (forthcoming 2010), manuscript at 64.
20. See Christos Davatzikos, et al., Classifying Spatial Patterns of Brain Activity with Machine Learning Methods: Application to Lie Detection, 28 NeuroImage 663 (2005); G. Ganis, et al., Neural Correlates of Different Types of Deception: An fMRI Investigation, 13 Cerebral Cortex 830 (2003); Joshua D. Greene & Joseph M. Paxton, Patterns of Neural Activity Associated with Honest and Dishonest Moral Decisions, 106 Proceedings of the National Academy of Sciences USA 12506 (2009); F. Andrew Kozel, Tamara M. Padgett, & Mark George, Brief Communications: A

telling, or for deception rather than honesty, then the possibility exists of using a brain scan to determine whether the person whose brain is being scanned is lying or telling the truth. Or so it is claimed. And so especially is it claimed by those who see the commercial potential for just this technology. For-profit companies, in particular No Lie MRI21 and CEPHOS,22 have already begun marketing their lie-detection services, and these companies and their principals have been at the forefront of those touting the courtroom and forensic potential of the new technology.

Replication Study of the Neural Correlates of Deception, 118 Behavioural Neuroscience 852 (2004); Andrew Kozel, et al., Detecting Deception Using Functional Magnetic Resonance Imaging, 58 Biological Psychiatry 605 (2005); Andrew Kozel, et al., A Pilot Study of Functional Magnetic Resonance Imaging Brain Correlates of Deception in Healthy Young Men, 16 J. Neuropsychiatry & Clinical Neuroscience 295 (2004); Daniel D. Langleben, Detection of Deception with fMRI: Are We There Yet?, 13 Legal & Criminological Psych. 1 (2008); Daniel D. Langleben, et al., Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI, 26 Human Brain Mapping 262 (2005); Daniel D. Langleben, et al., Brain Activity During Simulated Deception: An Event-Related Functional Magnetic Resonance Study, 15 NeuroImage 727 (2002); Tatia M.C. Lee, et al., Neural Correlates of Feigned Memory Impairment, 28 NeuroImage 305 (2005); Tatia M.C. Lee, et al., Lie Detection by Functional Magnetic Resonance Imaging, 15 Human Brain Mapping 157 (2002); Donald H. Marks, Mehdi Adineh, & Sudeepa Gupta, Determination of Truth From Deception Using Functional MRI and Cognitive Engrams, 5 Internet J. Radiology (2006); Feroze Mohamed, et al., Brain Mapping of Deception and Truth Telling about an Ecologically Valid Situation: Functional MR Imaging and Polygraph Investigation -- Initial Experience, 238 Radiology 679 (2006); Jennifer Maria Nunez et al., Intentional False Responding Shares Neural Substrates with Response Conflict and Cognitive Control, 25 NeuroImage 267 (2005); Sean A. Spence, et al., Speaking of Secrets and Lies: The Contribution of Ventrolateral Prefrontal Cortex to Vocal Deception, 40 NeuroImage 1411 (2008); Sean A. Spence, et al., Behavioural and Functional Anatomical Correlates of Deception in Humans, 12 NeuroReport 2849 (2001); Paul Root Wolpe & Daniel D. Langleben, Lies, Damn Lies, and Lie Detectors, 86(2) Harv. Bus. Rev. 25 (2008); Paul Root Wolpe, et al., Emerging Neurotechnologies for Lie Detection: Promises and Perils, 5 Am. J. Bioethics 39 (2005).
21. See http://noliemri.com.

22. See www.cephoscorp.com.

Neuroscience-based lie detection follows a long history of lie-detection technology. The earliest polygraphs were based on blood pressure, and more modern techniques include electroencephalography, which measures electric current generated by the brain,23 the analysis of facial micro-expressions of the kind developed by the psychologist Paul Ekman and now at the center of the television series Lie to Me,24 periorbital thermography,25 which measures the temperature around the eyes, and near-infrared spectroscopy,26 which uses infrared light to measure changes in blood flow and is thus the precursor of the even newer fMRI technology. These techniques each have their adherents, but I focus here on fMRI primarily because it appears by all accounts to be the most reliable of these techniques, although plainly not reliable enough, as we will see, to be endorsed for courtroom or forensic use by the vast majority of those most familiar with the technology.

Indeed, in legal and policy debates almost as much as in physics, every action appears to produce an equal and opposite reaction. And so it has been with the reaction of mainstream academic neuroscientists to the claims regarding the lie-detection potential of fMRI scans. A prominent article by Stanford law professor Henry Greely and neuroscientist Judy Illes surveyed

23. See Lawrence A. Farwell & Sharon S. Smith, Using Brain MERMER to Detect Knowledge Despite Efforts to Conceal, 46 J. Forensic Sci. 135 (2001).

24. Paul Ekman, Telling Lies: Clues to Deceit in the Marketplace, Politics, and Marriage (3rd ed., 2001).

25. Ioannis Pavlidis & James Levine, Monitoring of Periorbital Blood Flow Rate Through Thermal Image Analysis and its Application to Polygraph Testing, 2 Engineering Med. & Biol. Soc'y 1183 (2002).
26. See Britton Chance, et al., A Novel Method for Fast Imaging of Brain Function, Non-Invasively, with Light, 1 Optics Express 411 (1998).

all of the existing studies of neuroscience-based lie-detection through 2006, concluding that each of the studies fell far short of existing scientific standards of rigor and that the studies, both individually and in the aggregate, did not come close to establishing the reliability of fMRI-based lie-detection.27 Accordingly, Greely and Illes urged a legally imposed moratorium on the use of the technology for courtroom or forensic purposes until its reliability according to scientific standards could be established to the satisfaction of a federal regulatory agency.28 Similarly, one leading neuroscientist has insisted that "the data offer no compelling evidence that fMRI will work for lie detection in the real world."29 Another has concluded that "[a]t present we have no good ways of detecting deception despite our very great need for them."30 And still another has concluded that using laboratory findings on fMRI lie-detection in settings that can potentially impact individuals' legal rights should, on the current state of knowledge,

27. Henry T. Greely & Judy Illes, Neuroscience-Based Lie Detection: The Urgent Need for Regulation, 33 Am. J. L. & Med. 377 (2007). See also Henry T. Greely, Neuroscience-Based Lie-Detection: The Need for Regulation, in Using Imaging to Identify Deceit: Scientific and Ethical Questions 46 (American Academy of Arts and Sciences, 2009); Henry T. Greely, Premarket Approval Regulation for Lie Detection: An Idea Whose Time May Be Coming, 5 Am. J. Bioethics 50 (2005); Jane Campbell Moriarty, Visions of Deception: Neuroimages and the Search for Truth, 42 Akron L. Rev. 739 (2009); Michael S. Pardo, Neuroscience Evidence, Legal Culture, and Criminal Procedure, 33 Am. J. Crim. L. 301 (2006).
28. Greely & Illes, supra note 27, at 419-20.

29. Nancy Kanwisher, The Use of fMRI Lie Detection: What Has Been Shown and What Not, in Using Imaging to Identify Deceit, supra note 27, at 7, 12. The word "compelling" is important, because it sets the standard, or at least Kanwisher's standard, for usability. Perhaps there is plausible evidence, or some evidence, or more-persuasive-than-not evidence, even if there is not compelling evidence, and the question, to which I shall turn presently, is why "compelling" rather than some other standard is the one to be used.
30. Marcus E. Raichle, An Introduction to Functional Brain Imaging in the Context of Lie Detection, in Using Imaging to Identify Deceit, supra note 27, 3, at 6.

remain a research topic, instead of a legal tool.31 An editorial in Nature Neuroscience joined the chorus of skepticism,32 as did a report from a National Research Council committee.33 And several published articles by researchers and practitioners from various disciplines insisted that fMRI lie-detection was not ready for the real world.34

Lying at the core of the campaign against the use of fMRI in real-world legal settings is the conviction that the existing state of the research is poor science. And it is poor science, it is said, not only because of doubts about the reliability rates of neuroscience-based lie-detecting methods, but also because the validity of the research alleged to support those methods and to determine the announced reliability rates is deeply flawed. The tests that have been conducted, the critics claim, are different in material ways from real-world lying and truth-telling, and thus it is a mistake to draw an inference from the accuracy of neural lie-detecting in experimental settings to the potential accuracy of those methods in detecting real-world liars.35

31. Elizabeth A. Phelps, Lying Outside the Laboratory: The Impact of Imagery and Emotion on the Neural Circuitry of Lie Detection, in Using Imaging to Identify Deceit, supra note 27, 14, at 20.
32. Editorial, 11 Nature Neuroscience 1231 (2008).

33. National Research Council, Committee on Military and Intelligence Methodology for Emergent Neurophysiological and Cognitive/Neural Research, Emerging Cognitive Neuroscience and Related Technologies (2008).
34. James R. Merikangas, Functional MRI Lie Detection, 36 J. Am. Acad. Psychiatry & L. 499 (2008); Michael S. Gazzaniga, The Law and Neuroscience, 60 Neuron 412 (2008); Jed S. Rakoff, Lie Detection in the Courts: The Vain Search for the Magic Bullet, in Using Imaging to Identify Deceit, supra note 27, at 40; Joseph R. Simpson, Functional MRI Lie Detection: Too Good to Be True?, 36 J. Am. Acad. Psychiatry & L. 491 (2008); Sean A. Spence, Playing Devil's Advocate: The Case Against fMRI Lie Detection, 13 Legal & Criminological Psych. 11 (2008).
35. Greely & Illes, supra note 27. To the same effect is the argument that "[r]eports of finding brain patterns of activation corresponding to deception almost always use subjects (often university students) who are told to lie about something (usually a relatively unimportant

Part of the difference, indeed the major difference, is that in most instances the experimental subjects have been instructed to lie, and whether an instructed lie is even a lie at all presents substantial questions of construct validity -- whether the experiment measures what it purports to measure -- that cast significant doubts on the research conclusions.36 There are also serious questions about sample size, levels of reliability, potentially confounding variables (whether subjects are left- or right-handed, for example, or whether the subject moved, factors that make a difference in evaluating fMRI results), external validity (whether experimental results would exist in even a parallel real-world situation), and the significant possibility that subjects could take counter-measures that would render the test results especially unreliable.37 Thus, the existing research stands accused of being flawed even as pure laboratory research, and of being far less applicable to non-laboratory settings than its proponents have typically claimed.

The charges against the existing research go even further. Many of the results have not been published in peer-reviewed journals and have not been replicated, thus failing to satisfy the normal standard for assessing scientific outcomes.38 And quite a few of the experiments -- indeed, most of them -- have been conducted by those with some connection with No Lie MRI or CEPHOS and consequently with a commercial interest in the outcome. And, finally, the

matter). Equating the lies told in such an artificial setting to the kind of lies people tell in reality is pure fantasy at this point. Editorial, supra note 32, at 1231.
36

See especially Kanwisher, supra note 29, at 12. See also Greely, Neuroscience-Based Lie-Detection, supra note 27, at 50.
37

Greely, Neuroscience-Based Lie-Detection, supra note 27, at 50-51; National Research Council, supra note 33; Raichle, supra note 30, at 6.
38

Kanwisher, supra note 29, at 12.

alleged degree of accuracy (as high as 90%, according to some claims39) of neural lie-detecting is considerably higher than what the experimental data actually show.40 To the extent that the proponents of neural lie-detecting maintain that their claims about the reliability and accuracy of their methods are scientifically sound as the product of scientifically valid experimentation, they appear to have been exposed as relying on flawed science. Without better evidence of external validity, without dealing with the construct validity problem of distinguishing the genuine lie from following an instruction to utter words that are not literally true, without more rigorous scrutiny of the existing claims of reliability, without higher verified rates of accuracy, without replication, and without subjecting the research to peer review by financially disinterested scientists, the claimed ability to use fMRI testing to identify liars appears to be just that: a claim, and far from what good scientists take to be a sound scientific conclusion.

III. ON THE RELATIONSHIP BETWEEN SCIENTIFIC AND LEGAL STANDARDS

The fact that the science to date appears both to be methodologically flawed and less than compelling in its conclusions is far from the end of the story, the arguments of the skeptics notwithstanding. The rest of the story, however, is not a story about science, but is instead primarily a story about law,41 and about why there may be good reasons to doubt that the

39

Kozel (2005), supra note 20. See also Langleben (2005), supra note 20 (76%); Davatzikos, supra note 20 (89%).
40

Greely, Neuroscience-Based Lie Detection, supra note 27, at 51.

41

See Dale A. Nance, Reliability and the Admissibility of Experts, 34 Seton Hall L. Rev. 191, 203 (2003) (arguing for use of legal standards in evaluating scientific expertise).

many scientific failings of fMRI-based lie-detection are or should be dispositive for the legal system. First, law is about far more than putting criminals in jail, although that particular type of legal decision appears to generate the fear that motivates so much of the existing scientific criticism. One of the scientists quoted above has also said that the results would be especially unreliable if the subject believed that the results could send him to prison,42 and another participant at the same symposium worried about a future in which the police may "request a warrant to search your brain."43 These may well be legitimate worries, but their seriousness depends largely on a view of the Fifth Amendment privilege against self-incrimination that would characterize an involuntary lie-detection test of whatever kind as physical and nontestimonial,44 an outcome that seems unlikely, albeit not impossible, under current law. Given that law enforcement authorities may not require a suspect to talk at all, it is difficult to imagine a legal state of affairs in which a defendant's statement is subject to an involuntary neuroscientific evaluation of its validity, and thus the circumstances in which an involuntary fMRI would be usable against a defendant would not only require a court to reject an earlier Supreme Court statement that the results of lie detector tests would be testimonial and thus

42

Statement of Nancy Kanwisher, as quoted in A Good Lie Detector is Hard to Find, http://web.mit.edu/newsoffice/2007/lying.html.
43

See Science News, Science Daily, February 19, 2007, at http://www.sciencedaily.com/releases/2007/02/070218184515.htm.


44

Schmerber v. California, 384 U.S. 757 (1964).

encompassed by the Fifth Amendment,45 but would also require that the fMRI not be used in conjunction with, or to test the validity of, any statement made by the defendant.46 But even conceding for the sake of argument that a future that included brain scan warrants is a legitimate worry,47 it is nevertheless a rather large leap to transpose that worry to a concern about numerous other potential courtroom and forensic uses of lie-detection technology. Most importantly, the question of what evidence the prosecution can use against a defendant is very different from the question of what evidence a defendant can use to attempt to establish his innocence under existing defendant-protective burdens of proof. Suppose, to attach some arbitrary but conservative numbers to the existing research, that an fMRI evaluation of a defendant's claim of innocence ("I was somewhere else," or "He started the fight," for example) has an accuracy rate of 70%.48 It is of course clear that we should not imprison people on a 70% probability of their guilt, and we do not do so. But whether to

45

Schmerber, 384 U.S. at 764.

46

See Benjamin Holley, It's All in Your Head: Neurotechnological Lie Detection and the Fourth and Fifth Amendments, 28 Developments in Mental Health L. 1 (2009); Sarah E. Stoller & Paul R. Wolpe, Emerging Neurotechnologies for Lie Detection and the Fifth Amendment, 33 Am. J. L. & Med. 359 (2007).
47

The worry is fueled in part by a case in Mumbai, India, in which an involuntary fMRI scan was used by the prosecution to challenge the veracity of a criminal defendant. See Editorial, supra note 32.
48

Note that the accuracy rate for identifying truth may differ from the accuracy rate for identifying a lie. Suppose a defendant claims he was somewhere else when the crime was committed, and suppose further that an fMRI indicates he is telling the truth. On the existing state of the research, see Kanwisher, supra note 29, at 8, this fMRI result is more reliable (has a smaller percentage of errors) than an fMRI result that indicated that the defendant's statement was false. In other words, truths are identified as lies less often than lies are identified as truths.
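The asymmetry this footnote describes can be made concrete with a short calculation. The following sketch uses invented rates (none of the numbers are drawn from the fMRI studies themselves) to show that how much confidence a "truth" verdict or a "lie" verdict warrants depends both on the per-class error rates and on the base rate of lying, and that the two verdicts can differ considerably in reliability:

```python
# Illustrative sketch of the asymmetric-error point in footnote 48.
# All numbers are assumptions for illustration, not results from the
# fMRI literature. "truth_hit" is the chance an actual truth is labeled
# truth; "lie_hit" is the chance an actual lie is labeled lie.

def verdict_reliability(p_lie, truth_hit, lie_hit):
    """Return (P(actually truthful | test says 'truth'),
               P(actually lying   | test says 'lie')) via Bayes' rule."""
    p_truth = 1.0 - p_lie
    # Joint probability of each (actual state, verdict) pair.
    truth_says_truth = p_truth * truth_hit
    lie_says_truth = p_lie * (1.0 - lie_hit)      # a lie misread as truth
    truth_says_lie = p_truth * (1.0 - truth_hit)  # a truth misread as a lie
    lie_says_lie = p_lie * lie_hit
    p_truthful_given_truth = truth_says_truth / (truth_says_truth + lie_says_truth)
    p_lying_given_lie = lie_says_lie / (lie_says_lie + truth_says_lie)
    return p_truthful_given_truth, p_lying_given_lie

# Suppose (purely for illustration) truths are identified as lies rarely
# (90% truth-detection), lies are identified as truths more often (only
# 70% lie-detection), and half of all statements are lies.
t, l = verdict_reliability(p_lie=0.5, truth_hit=0.9, lie_hit=0.7)
```

With these assumed inputs the reliability of the two verdicts diverges (here 0.75 versus 0.875), which is why a single headline "accuracy" figure understates what a court would need to know.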

imprison people who are 70% likely to be guilty is not the question, or at least not the only question. An equally important question is whether, if there is a 70% chance that a defendant's claim of innocence is accurate, we would want to conclude that his guilt has been proved beyond a reasonable doubt. Indeed, this was exactly the issue in the 1998 Supreme Court case of United States v. Scheffer,49 involving a defendant in a military court martial who sought to introduce in his own defense a polygraph test supporting the accuracy of his assertion of innocence. The test results had been excluded under Rule 707 of the Military Rules of Evidence,50 and the defendant challenged the constitutionality under the Due Process51 and Compulsory Process52 Clauses of Rule 707's absolute exclusion of polygraphic evidence. The Supreme Court, over Justice Stevens's dissent and in the face of at least some concern about a blanket rule of exclusion expressed by four other Justices who concurred in the judgment but not with all of what Justice Thomas wrote in announcing the conclusion of the Court, held that there was no federal constitutional right of a defendant to offer polygraphic exculpatory evidence. That there may not be such a constitutional right, however, does not answer the non-constitutional legal policy question of whether such evidence ought to be admitted when offered by a defendant under these or similar circumstances. Nor does it answer the question whether the majority's simple distinction between "reliable" and "unreliable" masks the important distinction between how reliable evidence must be in order to allow its use to a
49

523 U.S. 303 (1998).

50

Mil. R. Evid. 707.

51

U.S. Const. amend. V.

52

U.S. Const. amend. VI.

defendant who wishes to raise the possibility of a reasonable doubt as to his guilt, whether by buttressing the defendant's claim of innocence or, perhaps even more likely, by attacking the credibility of a police officer or some other prosecution witness. Of course any scientific test will have some level of reliability. Whether that level of reliability is high enough for admissibility, however, depends on the purposes for which it is being employed. If the outcome of a test is being used as the sole or principal evidence of whether a defendant should go to prison, as it more-or-less is in some current uses of DNA identification,53 we should demand extremely high levels of reliability. If the evidence is being used as merely one component of a larger story about whether a defendant should go to prison, then perhaps the level of reliability can be lower ("a brick is not a wall," as the famous adage in the law of evidence goes).54 After all, it is not true that just because the standard of proof for conviction of a crime is proof beyond a reasonable doubt that every piece of evidence admissible to (cumulatively) establish proof beyond a reasonable doubt must be capable of individually proving beyond a reasonable doubt that the defendant is guilty.55 Nor is it true that every piece of evidence introduced by the prosecution must be reliable beyond a reasonable doubt, for such a conclusion would collapse the standard for determining guilt into the

53

See, for example, United States v. Beasley, 102 F.3d 1440 (8th Cir. 1996); United States v. Cuff, 37 F. Supp. 2d 279 (S.D.N.Y. 1999); State v. Bartylla, 755 N.W.2d 8 (Minn. 2008). See generally Peter Donnelly & Richard Friedman, DNA Database Searches and the Legal Consumption of Scientific Evidence, 97 Mich. L. Rev. 931 (1999).

54

McCormick on Evidence § 185, at 640-41 (John W. Strong ed., 5th ed., 1999). To the same effect, "it is not to be supposed that every witness can make a home run." Judson F. Falknor, Extrinsic Policies Affecting Admissibility, 10 Rutgers L. Rev. 574 (1956).

55

On confusing admissibility with sufficiency generally, see Dale A. Nance, Conditional Relevance Reinterpreted, 70 B.U. L. Rev. 447, 449-59 (1990).

standard for determining the admissibility of an individual piece of evidence.56 Moreover, if we are using the evidence to show why a defendant should not go to prison, then the level of reliability can be lower still, arguably much lower.57 After all, we do not have a system in which a defendant goes to prison unless he can prove beyond a reasonable doubt that he is not guilty. Many of the same considerations apply to civil cases as well. The American legal system employs the standard of proof by a preponderance of the evidence in almost all civil cases because the system is committed to the view that failure to award damages, say, to an injured or otherwise wronged plaintiff is every bit as serious an error as is wrongly requiring a non-

56

See United States v. Glynn, 578 F. Supp. 2d 567, 574-75 (S.D.N.Y. 2008); In re Ephedra Products Liability Litigation, 393 F. Supp. 2d 181, 187-88 (S.D.N.Y. 2005).
57

For an extended argument about the advantages of asymmetry between prosecution and defense in the standards for admission of scientific evidence, see Christopher Slobogin, Proving the Unprovable: The Role of Law, Science, and Speculation in Adjudicating Culpability and Dangerousness 131-44 (2007). The strongest response to the argument for asymmetry is that the justifiable defendant-skewed epistemic goals of the criminal justice system are already incorporated in the presumption of innocence and the proof-beyond-a-reasonable-doubt burden of proof, and to overlay special evidentiary burdens on the prosecution (or special evidentiary benefits on the defense) would thus be a form of double-counting. See Larry Laudan, Truth, Error, and the Criminal Law: An Essay in Legal Epistemology 123-28, 144 (2006). But this argument rests on the assumption, perhaps a justified one but perhaps not, that the existing standard of proof achieves the socially proper distribution of errors of false acquittal and of false conviction. If it does not, then, given the historical provenance (and thus resistance to modification) of the beyond-a-reasonable-doubt standard, adjusting the results of that standard (whether upwards or downwards) by the use of other evidentiary, substantive, or procedural devices seems hardly inappropriate. Nor is there any reason to believe that the best way to achieve the optimal distribution of error is with one burden of proof rule rather than a combination of multiple evidentiary and procedural rules. See Raphael M. Goldman & Alvin I. Goldman, Review of Truth, Error and the Criminal Law: An Essay in Legal Epistemology, by Larry Laudan, 15 Legal Theory 55 (2009); Michael S. Pardo, On Misshapen Stones and Criminal Law's Epistemology, 86 Texas L. Rev. 347 (2007) (book review).
Regardless of the outcome of the debate about asymmetry, however, the very existence of the debate, and the terms on which it is conducted, demonstrates the folly of trying to determine questions of the legal usability of evidence without taking account of legal goals and legal standards.

culpable defendant to pay damages.58 And once again, therefore, it is hardly clear that a party in a civil lawsuit seeking to bolster his assertions about facts should be precluded from doing so by use of a method whose reliability is nowhere near sufficient to send someone to jail. Awarding damages is less serious than imprisoning someone, or so our legal system believes, and as long as that is so then it is a mistake to assume that a uniform standard of reliability should govern all legal uses of a particular type of evidence. Although the foregoing is about reliability and not validity, the same analysis applies to questions of experimental validity as well. The experiments alleged to establish the reliability of fMRI lie-detection have been attacked as lacking in external and construct validity,59 but, like questions of reliability, these issues of validity are also matters of degree,60 and whether some degree is or is not good enough will again depend on the uses to which the experiment is being put. Consider first the question of external validity, the question whether laboratory results on
58

On the decision theoretic aspects of burdens of proof in civil cases in general, see James Brook, Inevitable Errors: The Preponderance of the Evidence Standard in Civil Litigation, 18 Tulsa L.J. 627 (1982); Bruce Hay & Kathryn Spier, Burdens of Proof in Civil Litigation: An Economic Perspective, 26 J. Legal Stud. 413 (1997); John Kaplan, Decision Theory and the Factfinding Process, 20 Stan. L. Rev. 1065 (1968); Dale A. Nance, Civility and the Burden of Proof, 17 Harv. J.L. & Pub. Pol. 647 (1994); Frederick Schauer & Richard Zeckhauser, On the Degree of Confidence for Adverse Decisions, 25 J. Legal Stud. 27 (1996). An important challenge to the conventional view about burdens of proof in civil cases is Ronald J. Allen, The Nature of Juridical Proof, 13 Cardozo L. Rev. 373 (1991); Ronald J. Allen, A Reconceptualization of Civil Trials, 66 B.U. L. Rev. 401 (1986), arguing, correctly, that because the plaintiff must typically prove each of multiple elements of an offense by a preponderance of the evidence, then the actual burden on the plaintiff is substantially higher than the burden on the defendant.
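The decision-theoretic treatments of burdens of proof cited in this footnote are often summarized by a simple threshold rule; the following is a minimal sketch (the error-cost numbers are assumptions chosen for illustration): impose liability only when the probability of the claimant's account exceeds the ratio of the cost of a false finding of liability to the sum of the two error costs.

```python
# Minimal sketch of the standard decision-theoretic threshold for a
# burden of proof (in the spirit of the literature cited above; the
# specific costs below are illustrative assumptions).
#
# Expected cost of holding the defendant liable, at probability p of
# actual liability:        (1 - p) * cost_false_liability
# Expected cost of not holding the defendant liable:
#                          p * cost_false_nonliability
# Liability minimizes expected cost once p exceeds the threshold below.

def proof_threshold(cost_false_liability, cost_false_nonliability):
    return cost_false_liability / (cost_false_liability + cost_false_nonliability)

# Treating the two errors as equally serious yields the civil
# preponderance standard (anything over 0.5).
civil = proof_threshold(1.0, 1.0)

# Treating a wrongful finding of liability as, say, nine times worse
# than a wrongful exoneration pushes the threshold to 0.9, one way of
# rationalizing a demanding criminal standard.
criminal = proof_threshold(9.0, 1.0)
```

The point of the sketch is the one made in the text: where the threshold sits is a function of the legal system's valuation of the two kinds of error, not of anything internal to the evidence itself.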
59

See notes 27-37 and accompanying text, supra.

60

See Kenneth R. Foster & Peter W. Huber, Judging Science: Scientific Knowledge and the Federal Courts 17 (1999); Erica Beecher-Monas, A Ray of Light for Judges Blinded by Science: Triers of Science and Intellectual Due Process, 33 Ga. L. Rev. 1047, 1062 (1999); Nance, supra note 41, at 200.

a certain subject population can be employed to reach conclusions or make predictions about a different and non-laboratory subject population. The issue often arises in psychological experiments, where the question is whether conclusions drawn from experiments using university undergraduates (a common pool of experimental subjects) can be applied to the behavior of non-undergraduates in non-laboratory settings. What makes the experimental research useful, however, is that other experiments have demonstrated a substantial correlation, for many kinds of studies, between the results reached in the laboratory with experimental subjects and the results observed in non-laboratory settings.61 These correlations are not perfect, of course, but they are positive to a substantial degree, and whether that substantial degree is good enough will depend on the uses to which the research is to be put. When the research is to be used to enact policy with negative consequences for some segment of the population, for example, we should demand a higher correlation between laboratory results and non-laboratory conclusions than we would if, for example, the population were merely being warned to be aware of a dangerous phenomenon that thus far had been demonstrated only in laboratory settings on subjects motivated differently from subjects in the external (non-laboratory) world. Although perhaps less obvious, the same considerations apply to construct validity as well. Suppose we wish to determine whether there is a relationship between eating a big breakfast and increased proficiency in performing mathematical tasks. And then suppose

61

See Craig J. Anderson, James J. Lindsay, & Brad J. Bushman, Research in the Psychological Laboratory: Truth or Triviality, 8 Current Directions in Psychological Sci. 3 (1999); Leonard Berkowitz & Edward Donnerstein, External Validity is More than Skin Deep: Some Answers to Criticisms of Laboratory Experimentation, 37 Amer. Psych. 245 (1982); Douglas J. Mook, In Defense of External Invalidity, 38 Amer. Psych. 379 (1983).

someone were to have conducted an experiment demonstrating a relationship between eating a big breakfast and an increased ability to avoid misspellings for the rest of that day. If this experiment were used to support a claim about breakfast and mathematical proficiency, it would be open to a charge of lacking construct validity, because what the experiment measured (spelling ability) was not the same as what the experiment was being offered to show (mathematical proficiency). But if there were to exist a demonstrated correlation between avoiding spelling mistakes and avoiding mathematical errors, then an experiment showing an effect on the former would provide some evidence that there would be an effect on the latter. It would not be conclusive evidence, but the deficiencies in construct validity would not render the experiment actually conducted totally spurious in terms of the ability to draw conclusions about a different but correlated effect. So too, perhaps, with the flaws in construct validity in many existing experiments about neuroscience-based lie-detection. Although there are important exceptions,62 most of the experiments purportedly establishing the reliability of fMRI lie-detection are experiments in which subjects are told by the experimenter to lie or not to lie. When the subject is told to lie, it is argued, following that instruction is not lying at all, and thus an fMRI result demonstrating a certain kind of brain activity for following an instruction to lie tells us nothing about the kinds of brain activity involved in actual lying.63 But even though this gap between the instructed lie and the real lie is a significant problem of construct validity, it would render

62

See especially Greene, supra note 20.

63

See Greely & Illes, supra note 27; Kanwisher, supra note 29.

the experimental results totally useless only if there were no correlation at all between the brain activity involved in the real lie and that involved in the instructed lie. As yet we do not know whether such a correlation exists, but if there is some correlation, even if small, once again it would be incorrect to conclude that the existing studies condemn the use of fMRI-based lie-detection as completely without evidentiary value, as opposed to offering, say, only slight evidence. Although slight evidence ought not to be good enough for scientists,64 it is a large part of the law. Not only do basic principles of evidence law (as well as human thinking) routinely allow the cumulation of weak but not spurious pieces of evidence, whether in holistic story-creation form,65 or for the related purpose of prompting an inference to the best explanation,66 or in more linear, Bayesian fashion,67 and not only might weak evidence be sufficient to allow a defendant to resist a prosecution's claim to have established his guilt beyond a reasonable doubt, but low standards of proof or persuasion pervade the legal system. A plaintiff in some states can resist a defendant's motion for a directed verdict by offering only a "scintilla" of

64

See David H. Kaye, Statistical Significance and the Burden of Persuasion, 46 Law & Contemp. Probs. 13 (1983). And see Raichle, supra note 30, at 5 (equating validity with high statistical quality).
65

See Inside the Juror (Reid Hastie ed., 1993); Reid Hastie, The Role of Stories In Civil Judgments, 32 Univ. Mich. J.L. Ref. 1 (1999); Reid Hastie & Nancy Pennington, A Cognitive Theory of Juror Decision-Making: The Story Model, 13 Cardozo L. Rev. 519 (1991).
66

See Amalia Amaya, Inference to the Best Legal Explanation, in Legal Evidence and Proof: Statistics, Stories, Logic (H. Kapstein, H. Prakken, & B. Verheij, eds., forthcoming 2009); Ronald J. Allen & Michael S. Pardo, Juridical Proof and the Best Explanation, 27 L. & Phil. 223 (2008).
67

See Alvin Goldman, Quasi-Objective Bayesianism and Legal Evidence, 42 Jurimetrics 108 (2002).
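The Bayesian cumulation of weak evidence described in the text can be sketched concretely. All of the numbers below (the prior and each item's likelihood ratio) are invented for illustration, and the sketch assumes the items are independent conditional on guilt or innocence; the point is only that several individually weak pieces of evidence can jointly move a posterior substantially:

```python
# Sketch of cumulating weak-but-not-spurious evidence in Bayesian
# fashion. Each item's likelihood ratio is
#   LR = P(evidence | guilt) / P(evidence | innocence),
# and independent items multiply on the odds scale. Illustrative
# numbers only.

def posterior(prior, likelihood_ratios):
    """Update prior odds by the product of the likelihood ratios,
    then convert back to a probability."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# Five weak items, each only twice as likely under guilt as under
# innocence. Starting from even odds, the posterior odds become
# 2**5 = 32 to 1, i.e., a probability of 32/33, roughly 0.97.
weak_items = [2.0] * 5
p = posterior(prior=0.5, likelihood_ratios=weak_items)
```

Each item on its own is a brick rather than a wall, yet together (on these assumed numbers) they approach proof beyond a reasonable doubt, which is exactly why excluding individually weak evidence as "unreliable" can be a category mistake.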

evidence.68 In many contexts, evidence that is substantial but less than a preponderance can generate legal results.69 And the police may stop and frisk a person upon "reasonable suspicion"70 and can obtain a search warrant by showing "probable cause" to believe that the search will yield usable evidence.71 For all of these purposes, and many more, weak (and thus potentially flawed) evidence serves pervasive functions in the legal system, and to require that only compelling or conclusive or even highly reliable evidence be usable, certified as such on the basis of highly valid scientific processes, would be to dramatically revamp the legal system as we know it.

IV. JUDGES, JURIES, AND THE DANGERS OF MISUSE

In resisting much of the foregoing argument, it is sometimes said that juries are simply not very adept at distinguishing one piece of evidence from another, or at evaluating technical evidence critically, and that as a result superficially persuasive pseudo-scientific evidence will have more of an effect on deliberations than it should have.72 In effect, the claim is that one brick will in fact constitute the entire wall for most jurors. Yet even apart from the diminishing

68

See, e.g., AALAR, Ltd. v. Francis, 716 So.2d 1141, 1147 (Ala. 1998).

69

See, e.g., De la Fuente v. FDIC, 332 F.3d 1208, 1220 (9th Cir. 2003).

70

Terry v. Ohio, 392 U.S. 1 (1968).

71

U.S. Const. amend. IV.

72

See, for example, Walter Sinnott-Armstrong, Adina Roskies, Teneille Brown, & Emily Murphy, Brain Images as Legal Evidence, 5 Episteme 359 (2008).

role that juries play in American litigation,73 the state of the empirical evidence on jury overvaluation is decidedly mixed.74 Indeed, were we (and the neuroscientists) to subject the common claims of jury over-valuation to the same scrutiny to which we subject scientific evidence, we might well find that some of the scientific basis for excluding bad scientific evidence is itself an example of less than ideal science. A good example is the research purportedly showing that people will take brain scan images as having more evidentiary value than such images, in some instances, actually have.75 One of the studies,76 however, compared the effect of textual "neurobabble" with that of accurate explanations, another compared brain scans to

73

Some of the data are reported in Frederick Schauer, On the Supposed Jury-Dependence of Evidence Law, 155 U. Pa. L. Rev. 165 (2006).
74

See Laudan, supra note 57, at 214-18. Good summaries of the existing primary research, much of which suggests that juries are not nearly as inept at evaluating scientific or expert evidence as is often supposed, can be found in, for example, Richard D. Friedman, Minimizing the Jury Over-Valuation Concern, 2003 Mich. St. L. Rev. 967 (2003); Michael S. Jacobs, Testing the Assumptions Underlying the Debate About Scientific Evidence: A Closer Look at Juror Incompetence and Scientific Objectivity, 25 Conn. L. Rev. 1083, 1086-93 (1993); Daniel A. Krauss & Bruce D. Sales, The Effect of Clinical and Scientific Expert Testimony on Juror Decision Making in Capital Sentencing, 7 Psychology, Pub. Pol. & L. 267 (2001); Richard O. Lempert, Civil Juries and Complex Cases: Taking Stock After Twelve Years, in Verdict: Assessing the Civil Jury System 181, 235 (Robert E. Litan ed., 1993); Dale A. Nance & Scott B. Morris, Juror Understanding of DNA Evidence: An Empirical Assessment of Presentation Formats for Trace Evidence with a Relatively Large and Quantifiable Random-Match Probability, 42 Jurimetrics 403 (2002); Neil Vidmar & Shari S. Diamond, Juries and Expert Evidence, 66 Brook. L. Rev. 1121 (2001). Joseph Sanders, The Paternalistic Justification for Restrictions on the Admissibility of Expert Testimony, 33 Seton Hall L. Rev. 881, 937-38 (2003), endorses the jury over-valuation worry, but appears to base it more on problems of complexity of evidence than on jury misunderstanding of science or expertise in non-complex cases.
75

D.P. McCabe & A.D. Castel, Seeing is Believing: The Effect of Brain Images on Judgments of Scientific Reasoning, 107 Cognition 343 (2008); Deena Skolnick Weisberg et al., The Seductive Allure of Neuroscience Explanations, 20 J. Cognitive Neuroscience 470 (2008).
76

Weisberg et al., supra note 75.

plain text and to black and white bar graphs,77 and a third compared the effect of a neuroimage with testimony that was read aloud.78 As a result of failure to exclude the potentially confounding variables of complexity and of photographic representation and color, however, we have no idea whether the allegedly distorting effect of brain scans is in fact an effect of brain scans or is instead the effect of any photographic image,79 or an effect of any image (or even drawing) in color, or the effect of complex information presented without opposing explanations and without opportunity for cross-examination. And thus we lack evidence supporting the belief that judges and juries will overvalue brain scan evidence as such, although we do have general evidence indicating that jurors may understand more than we think they do.80 Moreover, in practice if not in theory, the admissibility and use of some types of evidence, like the actual practice with respect to hearsay and many other exclusions, may vary with whether it is judge or jury who is serving as the trier of fact. Because juries make none of the decisions regarding reasonable suspicion to stop, probable cause to search, and other decisions as to which the credibility of (especially) a police officer is at issue, and because juries in fact make only a small percentage of the decisions in trials themselves, both criminal and
77

McCabe & Castel, supra note 75.

78

Jessica R. Gurley & David K. Marcus, The Effects of Neuroimaging and Brain Injury on Insanity Defenses, 26 Behavioral Sci. & L. 85 (2008).
79

See Adina L. Roskies, Are Neuroimages Like Photographs of the Brain?, 74 Phil. Sci. 860 (2007); Sinnott-Armstrong et al., supra note 72, at 367-68. On the distorting effect of photographs generally, see David A. Bright & Jane Goodman-Delahunty, Gruesome Evidence and Emotion: Anger, Blame, and Jury Decision Making, 30 L. & Human Behavior 183 (2006). And on the distorting effect of colored images, see Aura Hanna & Roger Remington, The Representation of Color and Form in Long Term Memory, 24 Memory & Cognition 322 (1996).
80

See authorities cited in note 74, supra.

civil, it may be a mistake to extrapolate what we know about jury decision-making or jury comprehension of scientific evidence to the legal system generally. Admittedly, there would be difficulties in designing an evidentiary system in which admissibility varied depending on whether the trier of fact was judge or jury, and formally if not informally the American legal system has rejected such an approach, but whether what we know about juries, even if the skepticism about juror comprehension of science is well-grounded, is the appropriate model for all of law is once again a determination that cannot be made without regard to the normative goals and varied tasks of the legal system. That the standards of sufficient reliability should vary with their legal uses remains subject to the concern about contamination across the different uses, leading to inappropriate uses as well as appropriate ones.81 There is a concern, for example, that allowing fMRI lie detection by a defendant as a way of supporting his claim of innocence might lead to allowance of the same technique by the prosecution as a way of sending a (possibly innocent) defendant to prison or by a plaintiff as a way of obtaining damages against a possibly non-culpable defendant in a civil case. And a related worry would be that allowing judges to hear fMRI lie detection evidence in evaluating the credibility of a police officer at a suppression hearing before a judge or magistrate might again lead, eventually, to allowing such evidence to be heard by juries in determining ultimate guilt or innocence. These worries may not be completely fanciful, but again the claims are empirical and causal ones about the effect of one action on another, and it is more than a bit ironic that those who are most insistent about
81

See D. Michael Risinger, Navigating Expert Reliability: Are Criminal Standards of Certainty Being Left on the Dock?, 64 Alb. L. Rev. 99 (2000).

finding a sound scientific and empirical basis for the admission of various forms of evidence seem often to be comfortable abandoning the science in favor of their own hunches when the question is about the potential downstream dangers of allowing certain forms of evidence to be used for a particular purpose. Those dangers may well exist, but to date there is no scientific evidence to support them. For now, the empirical support for the view that allowing fMRI lie detection by a defendant in a criminal case will lead to allowing fMRI lie detection by the prosecution against an unwilling defendant appears to be weaker than the admittedly weak empirical support for the view that fMRI lie detection can actually distinguish liars from truth tellers.

V. COMPARED TO WHAT?

In law as well as in science, one of the most important questions is, "compared to what?" And the "compared to what?" question can usefully be applied to questions about the determination of witness veracity in courts of law. Traditionally, the task of assessing witness credibility and veracity has been left to the scientifically unaided determination of the trier of fact (often the jury), but just what are the mechanisms that the trier of fact uses to make these determinations? We know that jurors often use characteristics other than the content of what a witness says to evaluate the truth of a witness's claim, we know that these non-content characteristics include factors such as whether a witness looks up or down, fidgets, speaks slowly or quickly, and testifies with apparent confidence, and we know from serious research that these alleged indicators of veracity are at best highly unreliable, and at worst totally


random.82 And this is why, in numerous studies of the ability of untrained people to determine truth-telling in others, accuracy rarely rises above 60%, where 50% would be random.83 Moreover, the problem of unreliable determination of veracity by judges and juries is exacerbated by the rules of evidence themselves, which presume, on the basis of scarcely more than venerable superstition, that those who have been convicted of serious crimes, even crimes not involving dishonest statements, are more likely to lie than those who have not,84 which allow witnesses to offer testimony about whether other witnesses have a reputation in the community for lying or truth-telling,85 and which allow witnesses to offer their personal opinions about the general credibility of other witnesses.86 We can thus reframe the question. The question is not whether fMRI-based lie detection is reliable enough in the abstract to be used in court. Rather, it is whether there are good reasons to prohibit the use of evidence of witness veracity that may well be better than the evidence of witness veracity that now dominates the litigation process, and at the very least is probably no worse. The choice is not between very good evidence of veracity and less good (bad, if you will) fMRI evidence. Rather it is between admittedly bad fMRI evidence and the
82 See authorities cited in note 13, supra.

83 See Charles F. Bond, Jr. & Bella M. DePaulo, Accuracy of Deception Judgments, 10 Personality & Soc. Psychol. Rev. 214 (2006); Maureen O'Sullivan, Why Most People Parse Palters, Fibs, Lies, Whoppers, and Other Deceptions Poorly, in Deception, supra note 8, at 74; Aldert Vrij et al., Increasing Cognitive Load to Facilitate Lie Detection: The Benefit of Recalling an Event in Reverse Order, 32 Law & Hum. Behav. 253, 253 (2008).
84 Fed. R. Evid. 609(a)(1).

85 Fed. R. Evid. 608(a).

86 Fed. R. Evid. 608(a).
even worse evidence that is not only permitted, but in fact forms the core of the common law trial process.

VI. ON THE USES AND LIMITATIONS OF SCIENCE: DOUBTING DAUBERT

The tone of some of the foregoing notwithstanding, it is decidedly not my purpose in this Essay to argue in favor of the admissibility of fMRI-based lie-detection evidence, whether in the courtroom itself or for related forensic purposes. Rather, it is to suggest that the reliability and validity standards for scientific evidence that the courts should use must be standards that come, ultimately, from the goals of the legal system and not from the goals of the scientific system.87 Science can tell us that a certain scientific process has, say, a 12% error rate, or some rate of false positives and some other rate of false negatives. And scientists must decide for scientific purposes whether such a rate is sufficient, for example, to assert that something is the case, to conclude that a finding is adequate for publication, or to find a research program promising enough for renewal of a research grant. But whether such an error rate is sufficient for a trier of fact to hear it, put someone in jail, keep someone out of jail, or support an injunction or an award of damages is not itself a scientific question. The same applies to the methods of inquiry. Science properly relies on peer review, replication, and other indicia of sound methodology. But whether those indicia are the right ones for purposes of non-scientific action, including but not limited to courtroom verdicts, is

87 See Margaret A. Berger, Upsetting the Balance Between Adverse Interests: The Impact of the Supreme Court's Trilogy on Expert Testimony in Toxic Tort Litigation, 64 Law & Contemp. Probs. 289, 300-02 (2001) (distinguishing legal and scientific standards of causation).

not itself a scientific question,88 and to think otherwise is to believe, erroneously, that one can derive a legal or policy ought from a scientific is. Evidence cognoscenti will detect in much of this a challenge to Daubert itself, and that may be so. Daubert and its successors, General Electric Co. v. Joiner and Kumho Tire Co. v. Carmichael,89 were directed primarily to the problem of products liability cases and mass tort verdicts that have been based on allegedly persuasive but seemingly unreliable "junk science."90 Indeed, no one who reads the description of the tire-failure expert in Kumho Tire can fail to recognize that junk science really does exist.91 And there is no doubt that the legal system must guard against a world in which experts in astrology, phrenology, and countless other bogus "-ologies," some of which appear superficially more plausible than astrology and phrenology but have little more grounding in fact, have a place in the courtroom. Moreover, and as an important and very recent National Academy of Sciences study has documented in detail,92 many of the traditionally used methods of forensic identification, including bitemarks,

88 See David L. Faigman et al., Science in the Law: Standards, Statistics and Research Issues § 13.5.1, at 43 (2002) (determining the value of scientific expert opinion is a matter of policy, not science).
89 See note 16, supra.

90 See United States v. Starzecpyzel, 880 F. Supp. 1027, 1036 (S.D.N.Y. 1995); Joseph Sanders, Bendectin on Trial: A Study of Mass Tort Litigation (1998).
91 On the junk science problem generally, see Peter W. Huber, Galileo's Revenge: Junk Science in the Courtroom (1991).
92 National Research Council of the National Academies, Strengthening Forensic Science in the United States: A Path Forward (2009).

shoe prints, handwriting analysis, ballistics, tool marks, and even fingerprints,93 may have less scientific backing than their proponents have claimed and than the legal system has historically accepted.94 Identification of the problem is thus comparatively straightforward: prior to the Daubert revolution and its insistence on reliability and scientific validity,95 American courts admitted expert testimony and tests purporting to demonstrate defective manufacture, causation, or identification, but resting on empirical findings that do not have a sound scientific basis as measured by the standards of science. Without Daubert, so the argument goes, this kind of scientific evidence will continue to be admitted and juries will be persuaded by it notwithstanding its scientific weakness, causing innocent defendants to be wrongfully identified and convicted, and non-culpable defendants in tort cases to be held liable. These steps to what we might call the Daubert conclusion are based on empirical claims as to which there is little empirical evidence. We do not know, for example, how often the admission of scientifically substandard evidence has produced an erroneous verdict. It would do so whenever the admission of substandard evidence was unaccompanied by better evidence

93 See United States v. Monteiro, 407 F. Supp. 2d 351, 355 (D. Mass. 2006) (excluding ballistics testimony); United States v. Green, 405 F. Supp. 2d 104, 120-22 (D. Mass. 2005) (same). Cf. Robert Epstein, Fingerprints Meet Daubert: The Myth of Fingerprint Science Is Revealed, 75 S. Cal. L. Rev. 605 (2002) (questioning the scientific basis for fingerprint matching).
94 See Michael J. Saks, Explaining the Tension Between the Supreme Court's Embrace of Validity as the Touchstone of Admissibility of Expert Testimony and Lower Courts' (Seeming) Rejection of Same, 5 Episteme 329 (2008); Michael J. Saks, Merlin and Solomon: Lessons from the Law's Formative Encounters with Forensic Identification Science, 49 Hastings L.J. 1069 (1998).
95 Daubert, 509 U.S. at 591 n.9.

leading to the same conclusion, whenever the admission of the substandard evidence caused the trier of fact to find for the prosecution or plaintiff where without that evidence the verdict would have been otherwise, and whenever the defendant was not in fact guilty or culpable. It is possible that there are many cases in which all three of these conditions are present, but we do not know how many there are, or what proportion of all cases, verdicts, accidents, or crimes they constitute. So it is far from clear how much of a problem there is, and how effective Daubert has been in solving it. Nor is it clear that the problem of junk science is best solved at the point of admissibility, because solving it, or at least ameliorating it, by the vigorous use of summary judgment and dismissal could likely achieve the same result without mistakenly importing the all-things-considered determination of whether the plaintiff or prosecution should prevail into the determination of the admissibility of particular pieces of evidence.96 But even if Daubert has significantly reduced the number of erroneous verdicts actually caused by poor science, the "compared to what?" question still looms. Bad science is worse than good science, but not necessarily worse than the non-science that lurks in the heads of judges and jurors. And flawed science may still be superior to the superstitions and urban legends that influence so much of public policymaking and legal decision-making. Daubert is based on the sound premise that manufacturers of products should not be held liable for damages unless there is a basis for believing that some negligent act of a manufacturer actually caused injury to the user of the product, but it is important to consider what occurs when bad science, measured by scientific standards, is excluded from litigation. We do not know with
96 See Richard D. Friedman, Squeezing Daubert Out of the Picture, 33 Seton Hall L. Rev. 1047 (2003); Nance, supra note 33, at 252.

certainty the answer to this question in all circumstances, but we do know that, because scientific reliability and validity are not prerequisites for the admission of all evidence, much non-scientific evidence might well fill the gap left by the excluded flawed scientific evidence. We do not, after all, have a civil litigation system in which we prohibit those who are injured in automobile accidents from recovering unless they can show with scientific reliability that the defendant was driving negligently, nor do we have a criminal litigation system in which we prohibit defendants from offering a wide variety of non-scientific evidence to keep them out of prison. Requiring that all legal determinations of guilt or innocence, liability or non-liability, be guided by science thus seems utopian in both the best and worst senses of that word.97 Best in the sense that such a system might in fact achieve more justice than the one we now have. But worst in the sense that eliminating bad or flawed science from the courtroom, the legal system, and the rules of evidence would require such a dismantling of the entire edifice of common law adjudication that it is fanciful to think it might ever happen. Moreover, attempting to make the existing system more scientific by keeping out bad science, while doing nothing about the non-science with which the entire system is infused, may produce a system that is in fact less scientific and less reliable, precisely because it keeps out somewhat poor science in favor of keeping in the really poor science that sneaks in the back door by not billing itself as science at all.

97 And perhaps that is why Kumho Tire made clear that although Daubert's broad concept of reliability would be applicable to all expert testimony, using the norms of science to evaluate reliability would be required only where the proposed evidence or testimony purported to be scientific. See Glynn and Ephedra, supra note 56.

At the heart of the controversy over the legal system's use of poor science may be the justified concern of scientists to keep their scientific enterprise as free as possible from nonscientific taint. When science that is not yet ready for scientific prime time is commandeered for commercial gain, the enterprise of science suffers, which is exactly what has happened with the commercialization of fMRI-based lie detection. Encouraging the use of shoddy science for legal or policy purposes is bad for science, and in the long run may (and this too is an empirical question) hurt us all by degrading both science itself and its use by the public and policymakers. But the tension between the valuable goals of long-term scientific integrity and the short-term uses of scientific output is hardly a new one, and hardly unique to lie detection or to the law of evidence. When medical researchers performing placebo-controlled experiments reach a point at which they suspect, but do not yet know with scientific confidence, that a new drug will cure a fatal disease,98 they face a moral dilemma, one faced by Dr. Martin Arrowsmith in Sinclair Lewis's great novel99 and by countless real research physicians before and since: whether to sacrifice science to immediate suffering, or instead to sacrifice people's health and lives to long-term scientific integrity. The stakes with respect to fMRI-based lie detection may be smaller, but the question is the same. If incomplete or shoddy or commercially motivated science is usable in law, science will suffer. But if incomplete or shoddy or commercially motivated science is barred from the law in the name of science, law's

98 See Robert J. Levine, The Ethics and Regulation of Clinical Research (2d ed. 1986); Sharona Hoffman, The Use of Placebos in Clinical Trials: Responsible Research or Unethical Practice?, 33 Conn. L. Rev. 449 (2001).

99 Sinclair Lewis, Arrowsmith (1932).
own goals may suffer,100 and the tension and necessary tradeoffs between the goals of law and the goals of science can never be completely eliminated.101 The claim that evaluating science within the legal system must be based on characteristically legal standards and norms is in fact an example of a larger issue about the (partial) distinctiveness of legal thinking, legal analysis, and legal decision-making.102 Judges routinely base their decisions on precedent and stare decisis, but every elementary textbook on informal logic treats an argument from past practice as a fallacy. Lawyers are expected to rely on authority, but thoughtful scientists recognize that reliance on scientific authority is in great tension with the scientific method. When Blackstone observed that it is "better that ten guilty persons escape, than that one innocent suffer,"103 he not only drew on ideas now

100 Especially important in this context is the obligation of law simply to reach a decision; the option of postponing judgment until better evidence is available is one that law rarely has. See Neil B. Cohen, The Gatekeeping Role in Civil Litigation and the Abdication of Legal Values in Favor of Scientific Values, 33 Seton Hall L. Rev. 943 (2003).
101 In suggesting that the decision about the use or non-use of neuroscience-based lie detection for trial or forensic purposes must be made according to legal standards, I do not mean to suggest that the decision should be made solely by lawyers and judges. Committees or other decision-making processes on which both legal and scientific professionals are represented would be preferable to leaving the decision solely to legal professionals or solely to scientists, and my principal concern in this paper is only to argue against the view that only scientists should decide on the appropriate uses of science or its conclusions.
102 See Frederick Schauer, Thinking Like a Lawyer: A New Introduction to Legal Reasoning (2009).

103 4 William Blackstone, Commentaries *352. A useful exploration is Alexander Volokh, n Guilty Men, 146 U. Pa. L. Rev. 173 (1997). Modern analyses of the Blackstonian maxim have understood, properly, that we should be interested in the consequences or utilities of four, and not just two, different outcomes: true convictions, false convictions, true acquittals, and false acquittals. See Alan Cullison, Probability Analysis of Judicial Fact-Finding: A Preliminary Outline of the Subjective Approach, 1 U. Toledo L. Rev. 538 (1969); Richard Friedman, Standards of Persuasion and the Distinction Between Fact and Law, 86 Nw. U. L. Rev. 916 (1992); Erik Lillquist, Recasting Reasonable Doubt: Decision Theory and the Virtues of Variability, 36 U.C. Davis L. Rev. 85 (2002); Michael DeKay, The Difference Between Blackstone-Like Error Ratios and Probabilistic Standards of Proof, 21 Law & Soc. Inquiry 95 (1996); Laurence Tribe, Trial by Mathematics: Precision and Ritual in the Legal Process, 84 Harv. L. Rev. 1329, 1375 (1971).

associated with Type I and Type II errors and false positives and false negatives, but also made clear that law's own goals required subjugating maximum accuracy to the transcendent (or at least heavily weighted) value of personal liberty.104 In all of these respects, and more, it is a mistake to assume that the job of law is to enforce or replicate the decision-making modes of other disciplines and domains, and so it is with the standards that law uses to determine whether evidence, scientific and otherwise, is sufficiently reliable to be usable for this or that legal purpose. Law should listen to what the neuroscientists say about neuroscience, but it should be attentive to the adjectives and the adverbs. When neuroscientists say there is no "compelling" evidence of fMRI's lie-detecting reliability, that there is "very little" basis for confidence in the results produced so far, or that claims about fMRI have been made "prematurely," they are imposing an evaluative standard on experimental results. That is as it must be, for we cannot make sense of any results without having some evaluative standard. But the evaluative standard used by the law, even when science is being evaluated, must be based on law's goals, law's purposes, and law's structures, and, as is so often the case, what is good enough outside of law may not be good enough inside it. Less obvious but often more important is the corollary that what is not good enough outside of law may be good enough for parts of the law.

104 See David L. Faigman, Expert Evidence in Flatland: The Geometry of a World Without Scientific Culture, 34 Seton Hall L. Rev. 266, 267 (2003) (achieving the balance between false positives and false negatives is a legal question).
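The four-outcome analysis cited in note 103 can be made concrete in a stylized expected-utility sketch (the formalization below is an illustration only, not drawn from the works cited there; it makes the standard simplifying assumption that correct outcomes carry zero disutility). Let p be the fact-finder's probability that the defendant is guilty, and let a false conviction be k times as costly as a false acquittal. Conviction minimizes expected disutility only when:

```latex
% Stylized decision-theoretic reading of the Blackstone maxim.
% Convict iff the expected disutility of convicting is less than
% that of acquitting:
\[
  (1-p)\,k \;<\; p
  \quad\Longleftrightarrow\quad
  p \;>\; \frac{k}{k+1}.
\]
% With Blackstone's k = 10 (ten guilty persons escaping rather than
% one innocent suffering), the implied standard of proof is
% p > 10/11 \approx 0.91; with k = 1, the threshold is p > 1/2,
% the civil preponderance standard.
```

On this stylized view, the choice of k, and hence of the proof standard, is exactly the kind of evaluative judgment that belongs to law rather than to science.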
