Electronic Discovery For Dummies
RenewData Special Edition
A Reference for the Rest of Us!
Prepare and protect your electronic data for legal action
by Ryan Williams
For general information on our other products and services, please contact our
Customer Care Department within the U.S. at 800-762-2974, outside the U.S. at
317-572-3993, or fax 317-572-4002. For details on how to create a custom For
Dummies book for your business or organization, contact bizdev@wiley.com.
For information about licensing the For Dummies brand for products or services,
contact BrandedRights&Licenses@Wiley.com.
ISBN: 978-0-470-22607-0
Manufactured in the United States of America
Publisher's Acknowledgments
We're proud of this book; please send us your comments through our
online registration form located at www.dummies.com/register/.
Some of the people who helped bring this book to market include
the following:
Acquisitions, Editorial,
and Media Development
Project Editor: Jan Sims
Editorial Manager: Rev Mengle
Business Development Representative: Karen Hattan
Production
Project Coordinator: Kristie Rees
Carl Byers,
Stephanie D. Jumper,
Julie Trippetti
Proofreader: Charles Spencer
Special Help: Trisha Wattier
Introduction
about the law than they ever wanted to. If you are reading
this book, you're off to a good start.
E-discovery can be a complex process, but this book
has reduced it to its essential components, making it a
valuable tool in understanding the fundamental concepts.
It tells you what you need to know when you
need to know it. You can read through it from front to
back (it's short, so that shouldn't be too much of a
challenge), or you can go straight to a chapter and
read what you need to know right then.
Part I
Part II
Computer?
In This Part
Looking at common sources of electronic data
Examining relevant data found in those sources
Catching onto e-discovery technical terms
Network servers
Network servers are basically larger computers that have
been configured to serve as storage space connected to
either the Internet at large or smaller, company-based
networks called intranets. A server receives requests
from PCs and other devices (known collectively as
clients) and either shows data or stores it for future use
on its hard drives. Servers may be located either on-site
or at a remote location; it's not uncommon for a company
to rent space from another company and store the
information there.
Servers can contain more than just documents
or e-mails. Servers can also handle databases,
voicemails, fax transmissions, and other
types of data. To hold all this data, a typical
server may use several hard drives.
Servers can either see these hard drives as
one large hard drive, or they can use the
hard drives to keep exact copies of the same data
on several drives. In case one fails, another
has the data ready to go. Make sure you have
access to all the server hard drives when carrying
out e-discovery.
All hard drives will eventually fail. It's the
curse of any machine that includes moving
parts. Most last for several years, but disaster
could be around the corner at any time. That's
why it's important to get to data as soon as
possible and keep backups of all important
data.
Tape backup
Tape backup is usually the last line of defense against
the loss or destruction of data. Magnetic tape is used
to record the data stored on the servers (usually once
every twenty-four hours, at night when network usage is
slow), and these tapes are then stored off-site to prevent
them from being destroyed in any traumatic event that
might physically destroy the servers. In the event that
they must be used, the correct tapes have to be found in
the correct order to restore the data. The data on a tape
can be stored in its raw format (the same size as the original) or
in a compressed format (compression technology is used to store
more information in the same amount of space).
The tapes also have to be used in the same type of
machine as the one that created them. You can't just
connect them to any computer and pull the data off of
them. For that reason, backups are usually handled by
either a company's IT department or an outside vendor
only. Outside vendors, such as RenewData, can extract
data from tapes quickly and easily, even if the tapes are
incomplete, out of order, or damaged.
Flash Drives: These tiny devices (also called
"thumb drives") use flash memory, a smaller type
of storage medium than the spinning disk in a
hard drive, to store information. They're highly
portable and pass easily from computer to
computer.
CompactFlash and other card memory: These
devices are often used in cameras, video
recorders, cell phones, and other portable devices
to store information such as pictures, music, or
other small files. These typically function with
specific devices, and they don't often show up in
PCs or laptops.
It's also important to produce the smallest possible
amount of relevant data. That may sound
counter-intuitive, but it actually aids the investigation
efforts. As you might expect with a keyword search
like this, a great amount of data might be returned,
including several duplicates of the same information.
E-discovery vendors and software minimize duplication
of information (and the time and effort needed to weed
through this information) to facilitate the presentation
of this information.
Metadata: As briefly explained in Part I, this term
translates to essentially "data about data." For
example, in the case of an e-mail about the price
fixing, the metadata would include Date Sent
and who was in the To: and From: fields. It may
even include the folder the data was kept in. If a
price-fixing e-mail were kept in a folder called
"CYA," that could be important in helping a judge
or a jury determine what a person's mental state
was when he or she kept a document. Metadata
may also include any tracked changes within a
document to prove when something was altered.
Native Data: If data hasn't been changed from its
original format or file type, it's known as native
data. The file hasn't been altered, and it's opened
up by the software that created it. Other variations
of data include paper (files printed out) or
semi-paper (files printed out with associated
metadata, like time and location of creation and
other details).
PST/NSF: A PST file is created in the Microsoft
Outlook e-mail program. It is basically a way to
compact a bunch of e-mails into one file. Users
and IT pros alike use PSTs to back up or archive
their e-mail folders. The Lotus Notes version of
this is called an NSF file. They are called PSTs and
NSFs because of the alphabet soup (the file extension)
that follows the filenames when they are first created.
For example, a PST filename looks like frazieremail.pst,
and an NSF filename looks like jakesemail.nsf.
These files are created automatically by the programs,
but the exact parameters of how e-mails
are stored in these files are up to the company.
PST and NSF files are the most common formats
used in corporate e-mails. However,
when you're using Web-based e-mail services
such as Gmail or Hotmail or other mail programs
such as Mac OS X Mail, you need to
take those into account and find out more
about these services. In some cases, you need
passwords or other methods of access.
TIFF (Tagged Image File Format): You're probably
used to seeing TIFF files of digital photos, say
from your last vacation. In e-discovery, this file
type contains an exact image of a printed document,
so that it exists as data viewed on a computer
screen and not on paper. For example, if
your side of a case has a Microsoft Word document
with a secret recipe on it that needs to be
redacted (blacked out) before sending to the
other side's lawyers, the document can be made
into a TIFF image, basically a photograph of the
page, so that the secret recipe can't be digitally
uncovered.
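To make "data about data" a little more concrete, here is a minimal Python sketch that pulls the kind of fields described above (Date Sent, To:, and From:) out of a single e-mail message using the standard library's email package. The sample message, the addresses, and the helper name extract_metadata are all made up for illustration. A real PST or NSF archive would first have to be exported by a specialized tool, which this sketch doesn't attempt.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message, invented for this illustration.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Date: Mon, 05 Mar 2007 09:30:00 -0500
Subject: Pricing for next quarter

Let's talk before the meeting.
"""


def extract_metadata(raw_text):
    """Return the header fields that count as metadata ("data about data")."""
    message = message_from_string(raw_text)
    return {
        "from": message["From"],
        "to": message["To"],
        "subject": message["Subject"],
        # Parse the Date header into a timezone-aware datetime object.
        "date_sent": parsedate_to_datetime(message["Date"]),
    }


metadata = extract_metadata(RAW_MESSAGE)
for field, value in metadata.items():
    print(f"{field}: {value}")
```

Notice that none of these fields appear in the body of the message itself, which is why a printed copy of the text alone can lose them.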
Part III
In This Part
Deciding what to keep and what to delete
Keeping the right records for the right
amount of time
Running a "meet and confer" session
Choosing the right person or company
to recover the data
communication. And although it's easier to store electronic
information than maintain a warehouse full of
paper documents, there's still a finite limit on how
much information can be stored (and how many hard
drives or tapes you can afford to maintain).
Federal regulations
Some professions, like the trading of stocks and securities,
are subject to federal regulation. In that case, the
rules for retaining data are clearly set forward (or as
clearly as federal law can be laid out). Stockbrokers are
required by the Securities and Exchange Commission to
retain all e-mails between their clients and themselves
for three or six years, depending on the content of
those e-mails.
Sarbanes-Oxley
So what to do with industries that aren't subject to these kinds
of federal regulation? The Securities Act of 1933 and the
Securities Exchange Act of 1934 set the foundation on how to
handle most of these cases, but Congress decided to punch these
rules up a bit with some more "teeth" in the wake of the scandals
that surrounded the implosion of companies like Enron and
WorldCom. Enter Sarbanes-Oxley, the law that provided those
teeth: jail time and other sanctions for the misuse
or destruction of data and other information. This obviously
raised the stakes on making sure that relevant data is maintained
and made available in case of litigation. Not only must
the owner of that data prove that all the data has been provided,
but it must also prove that none of the information was
intentionally deleted or removed before the proper time.
See SEC Rule 17a-4 for more information on this
subject.
Stockbrokers who are particularly attached to their
licenses will abide by these regulations and keep these
e-mails for the appointed time. There are similar regulations
for other industries like health care (HIPAA regulations)
and energy, and it's a good idea to refer to
those specific regulations for more information.
Litigation Holds
The policies discussed in the previous section go out
the window when any threat of litigation occurs. At
that point, anything related to future litigation must be
retained. Deleting that information can result in sanctions
against the company or individual. The rationale
for these sanctions is that you can't have a search for
the truth if the documents are destroyed and it becomes
impossible for one side to make its case.
It may be a little hard to attach an exact start date to the
preservation period, but it's always better to err on the
side of caution. In this case, it's vital to make sure that
not only is the document retained intact, but that all the
information surrounding that document (when and how
it was created and altered) is retained as well. It's not
enough to just notify all employees that they have to
keep these documents. The company must actively
move to retain these documents and make sure they
stay intact. (Part IV covers this in more detail.)
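The rule that a litigation hold trumps the normal retention schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record fields, the may_delete helper, the sample filenames, and the retention periods are all hypothetical, and a year is approximated as 365 days. Your actual retention rules come from your regulators and your lawyers, not from this sketch.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    name: str
    created: date
    retention_years: int      # assumed policy value, for example 3 or 6
    on_litigation_hold: bool  # set as soon as litigation is threatened


def may_delete(record, today):
    """Return True only if the retention period is over and no hold applies."""
    if record.on_litigation_hold:
        # A hold overrides the normal retention schedule entirely.
        return False
    # Approximate each year as 365 days to keep the sketch simple.
    expires = record.created + timedelta(days=365 * record.retention_years)
    return today >= expires


today = date(2007, 6, 1)
old_memo = Record("old_memo.doc", date(2001, 1, 15), 3, False)
held_memo = Record("held_memo.doc", date(2001, 1, 15), 3, True)
new_memo = Record("new_memo.doc", date(2006, 1, 1), 3, False)

for record in (old_memo, held_memo, new_memo):
    print(record.name, may_delete(record, today))
```

The hold check comes first on purpose: once a hold is in place, the age of the document no longer matters.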
Data formats
Part II explains what constitutes native, paper, and
semi-paper data. All have their advantages, and all lack
something another format is stronger in. For that
reason, you need to take into account all considerations
when deciding what format to render the data in.
If there's a concern about when and where the files
were created, you want to preserve all of the associated
metadata, so an electronic format might be more
appropriate. If everybody is just concerned over what
information is present inside the actual e-mail or document,
a paper copy might be appropriate. Electronic
images, paper, or semi-paper sources may also be beneficial
if data had to be extracted from a larger or
unwieldy source of information, such as a database or
encrypted server. Information like that can't just be
opened on a PC without considerable effort.
Choosing a Vendor
Chances are that, unless you're working with a firm
or office that's already had a great deal of experience
with e-discovery, the actual process of dealing with
electronic data might be beyond the scope of your
expertise.
Keep in mind that the consequences for mishandling
electronic data are quite severe. It's
important to make sure that whoever handles
the process does so correctly.
In most cases, it's a good idea to let somebody outside
of the corporation handle e-discovery. When a third
party handles the collection and processing, it can benefit
the investigation in two ways:
A vendor expert, not a corporate employee,
testifies in the case.
Any anomalies in the data can be explained with
more credibility than they could be by somebody
involved in the process.
Choosing a vendor is dependent on the needs of the
case. If the vendor can't handle backup tapes, then it
won't be able to help a case involving a multiple stream
server backup for a large corporation. Then again, a
vendor might be overkill if all that's involved is a few
e-mails and a Word document.
Part IV explains exactly what third-party vendors can
add to the security of the e-discovery process and how
to determine who pays for their services.
Part IV
In This Part
Understanding common e-discovery legal terms
Notifying involved parties about the need
to hold evidence
Taking possession of the electronic
media responsibly
Figuring out who foots the bill
Preservation: The preservation process is a critical
part of e-discovery. As soon as a party such as
a company realizes it is under a duty to ensure
that documents are not destroyed, it must undertake
preservation steps to avoid deleting data.
This may mean making a special copy of a
person's hard drive, or putting backup tapes in
safe storage in case they need to be searched
later. Again, this must be an active and affirmative
process. Just a passive warning to the employee
isn't enough.
Spoliation: When preservation is not done correctly,
spoliation occurs. Technically, spoliation is
a basis for a lawsuit when one party accuses the
other of allowing evidence to be changed or
destroyed. The party suing may allege that it
cannot make a case now, and should thus win the
case by default out of fairness. Avoiding spoliation includes
everything from maintaining the integrity of the
data to monitoring who has access to the evidence
and what they've done (or not done) to it.
Adverse Inference: When one party is alleging
that another party allowed evidence to be
changed or destroyed, it may ask the judge to
issue an "adverse inference" jury instruction. This
means the judge tells the jury they may infer that
the destroyed evidence was harmful to the party
who allowed the destruction. The law gives the
benefit of the doubt to the party who wasn't
responsible for the destruction.
Zooming in on the Significance of Zubulake
Laura Zubulake was a stockbroker who worked for UBS
Warburg. She alleged that UBS did not promote her due to
The seven opinions written by the judge during this case are
cell phone logs. These data sources may contain
not only actual information, but also
metadata on these files.
A good vendor takes all of these steps to ensure that
your data remains intact and admissible in court.
Remember, those individuals performing the work are
just as liable to be called in to testify about their work
as those involved in the original disagreement. A federal
test known as the Daubert Standard allows judges to
determine whether the testimony of an expert witness
is admissible. Taking the proper steps (and making sure
that both sides agree that the proper steps are being
taken) ensures that you get usable results.
Ability to pay
Which party will benefit more from e-discovery
Exactly what data is being requested
Cost of recovery versus the amount involved in
the controversy
The importance of the information in question
The likelihood of finding relevant data
The purpose of the requested data
Whether that data can come from another source
Part V
In This Part
Removing duplicate documents from the evidence
Evaluating the results of de-duplication and culling
Tagging relevant documents for later
Removing or redacting privileged information
produces necessary information with a high degree of
accuracy, making it a valuable use of time and money.
If you're sure that the focus of the investigation is
restricted to a single individual, this process will
get rid of a lot of information for you.
Within folders or directories of a target
user: This process is also called intra-folder de-duplication,
and it represents a more focused
effort to find information. If the focus of the investigation
is restricted to incriminating e-mails
received by or originating from a single user, this
process can further reduce the amount of data
you deal with by keeping the focus on the relevant
topics.
Global de-duplication: With this broad approach
to de-duplication, all users' e-mails and user files
are culled, and only one instance of each message/
user file will reside across all target users. This
does generate a large amount of data, but at least
you'll have some information to ferret out the more
useful documents.
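The difference between these de-duplication scopes is easier to see in a small example. The following Python sketch is an illustration only, not how any particular vendor's software works. It assumes that two items are duplicates when their content produces the same SHA-256 hash, and the Item fields, the scope names ("folder", "custodian", "global"), and the sample data are all hypothetical.

```python
import hashlib
from collections import namedtuple

Item = namedtuple("Item", ["custodian", "folder", "content"])


def fingerprint(content):
    """Hash the content so identical items produce identical fingerprints."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def deduplicate(items, scope):
    """Keep the first copy of each item within the chosen scope.

    scope is "folder", "custodian", or "global".
    """
    seen = set()
    kept = []
    for item in items:
        digest = fingerprint(item.content)
        if scope == "folder":
            key = (item.custodian, item.folder, digest)
        elif scope == "custodian":
            key = (item.custodian, digest)
        elif scope == "global":
            key = (digest,)
        else:
            raise ValueError(f"unknown scope: {scope}")
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


items = [
    Item("frazier", "Inbox", "Q3 prices attached"),
    Item("frazier", "Inbox", "Q3 prices attached"),  # same user, same folder
    Item("frazier", "CYA", "Q3 prices attached"),    # same user, other folder
    Item("jake", "Inbox", "Q3 prices attached"),     # different user
    Item("jake", "Inbox", "Lunch on Friday?"),
]

for scope in ("folder", "custodian", "global"):
    print(scope, len(deduplicate(items, scope)))
```

With this sample data, the wider the scope, the fewer copies survive: a duplicate that sits in a different folder or a different user's mailbox is only caught once the comparison reaches across folders or users.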
Reviewing Results
Now that the vast mountain of information has been
reduced to a few profitable veins of evidence, the concerned
parties can begin searching through the results
to see what actually goes into the case. Attorneys used
to sit in large rooms with these huge boxes of paper
and read each document, using magic markers and
stickers to code documents. Today, this is all done
online, either over the Internet or using software within
a law firm's or corporate network. It saves time and
effort, but it seems to have cut deeply into the profits
of various food delivery services. Somewhere, a pizza
shop owner is crying.
Establishing procedure
There are no exact standards for electronic review. The
tools available may use different methods of presenting
information, and some firms even refuse to perform
extensive reviews, citing cost and comfort issues.
Others just use the search functions loaded on their
computers' operating systems. If it's a small review,
that might be enough. If a legal team is sorting through
hundreds upon hundreds of pages or documents, it
may be time to consider a larger plan.
In the case of such larger plans, it's important to look
at the best way to maximize the time spent while establishing
your priorities. It's also important to set up an
appropriate schedule for this review to keep from
delaying matters, and to assure the relevance and quality
of the information produced. While there aren't
established standards, there are experts who can help
plan and execute this process. If you're feeling overwhelmed,
it may be time to look one up.
Privileged: If a document falls within the domain
of attorney-client privilege, it must be excluded or
redacted during the review process.
Confidential: This refers to trade secrets or other
information that shouldn't be revealed because
it could affect the livelihood of an individual or
company.
Remember that PDF or TIFF files are easier to
redact than native files. Native files may contain
tracked changes or metadata that could
inadvertently reveal confidential or privileged
information.
Available Tools
Whether you're doing e-discovery in-house or hiring a
vendor, you'll want to become familiar with the software
tools that are available to aid the review process and
reduce eyestrain from computer monitors and the
amount of time spent. Each one has its own advantages
and appropriate uses.
The following list gives an overview of four common
e-discovery applications:
Summation is one of the most common tools for
performing review. It allows for linear, or
document-by-document, review. This tool is most often
installed within the network of a law firm or corporation,
and it's ideal for smaller reviews.
Concordance is the major competitor to
Summation, and it also conducts a
document-by-document review. This tool is ideal for smaller
reviews, and the software can be housed easily
within a corporate network.
iCONECT helps users to do a "native review" of
the important documents, enabling them to view
the files themselves. This procedure can provide
more information than just a TIFF image of files.
This approach means that customers can save
money by converting into TIFFs only those documents
that will actually be produced and used in
the case, as opposed to all documents that need
to be reviewed.
Attenex is a state-of-the-art native review tool that
utilizes complex algorithms to categorize data and
interactive visual clustering to speed review and
improve analysis or case assessment. Imagine
seeing a keyword or tag on your screen, with
actual lines drawn to relevant information. Attenex
provides that kind of picture during the review
process. Attenex literally means "At Ten X" (where
the X stands for "times") because it can help conduct
a review at ten times the speed of a linear
document-by-document review.
Part VI
Making E-Discovery
Results Available
In This Part
Distributing the data to clients and investigators
Deciding between native files and images
Using additional resources in your e-discovery
process
Now that the hard work has been done and the data
is ready to be distributed to all involved parties,
it's important to decide how that data will be presented.
A lot of this has probably already been decided by this
point, because of decisions on whether to preserve
metadata and other information about documents aside
from the original content.
Many hosted online review tools now can easily be configured
to allow clients, investigators, and even adverse
parties to simply log in to the review platform to see
documents. Basically, the legal team's IDs are set up to
allow them to see all fields and documents, and the witness
or adverse party's ID is set up to be able to see
only small subsets of data. This approach cuts down on
the cost of smaller productions that happen throughout
discovery.
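The idea of giving different logins different views of the same document set can be sketched in Python as follows. This is a toy illustration, not the way any real review platform is configured: the role names, tags, field names, and sample documents are all hypothetical.

```python
# Hypothetical documents in a hosted review platform.
DOCUMENTS = [
    {"id": 1, "title": "memo.doc", "tag": "produced", "attorney_notes": "Key exhibit"},
    {"id": 2, "title": "draft.doc", "tag": "privileged", "attorney_notes": "Withhold"},
    {"id": 3, "title": "prices.xls", "tag": "produced", "attorney_notes": "Check dates"},
]

# Assumed roles: which tags and which fields each login may see.
ROLES = {
    "legal_team": {
        "tags": {"produced", "privileged"},
        "fields": {"id", "title", "tag", "attorney_notes"},
    },
    "adverse_party": {
        "tags": {"produced"},
        "fields": {"id", "title"},
    },
}


def visible_documents(role):
    """Return the documents, trimmed to the fields the given role may see."""
    rules = ROLES[role]
    return [
        {field: value for field, value in doc.items() if field in rules["fields"]}
        for doc in DOCUMENTS
        if doc["tag"] in rules["tags"]
    ]


for role in ROLES:
    print(role, visible_documents(role))
```

Here the legal team sees every document and every field, while the adverse party sees only the produced documents and none of the attorney notes.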
Knowing When to Go Native or Not
As discussed in Part II, native data simply means the
actual computer file in its original state. So for a
Microsoft Word document called memo.doc, a native
production would mean sending a CD with the file on it
rather than a TIFF image of that file. This approach is
advantageous because it saves money by not having to
convert the document to an image, and it preserves all
metadata. However, there are technical challenges
regarding being able to redact, label, or tag a native
document, because doing so would alter the metadata
such as "last modified date." Native data should be
used only when its attributes are specifically required.
Otherwise, it might be more advantageous to use a
format such as TIFF or PDF files, as explained in Part V.
These formats present an image of the document that contains
the information, and the image is easier to redact and control
once it's out of your possession.
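One common way to show that a native file hasn't been altered is to compare a hash (a digital fingerprint) of the file before and after it's handled. Hashing isn't discussed in this book, so treat the following Python sketch as a side illustration under stated assumptions: the filenames and file contents are stand-ins, and the helper name sha256_of is made up.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path):
    """Return the SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "memo.doc")
    with open(original, "wb") as handle:
        handle.write(b"Stand-in content for a native file")

    # copy2 copies the contents and tries to preserve timestamps as well.
    produced = os.path.join(workdir, "produced_memo.doc")
    shutil.copy2(original, produced)
    copies_match = sha256_of(original) == sha256_of(produced)

    # Any edit, such as adding a label, changes the fingerprint.
    with open(produced, "ab") as handle:
        handle.write(b" [CONFIDENTIAL]")
    edit_detected = sha256_of(original) != sha256_of(produced)

print(copies_match, edit_detected)
```

The second check shows why labeling or redacting a native file is tricky: even a small change to the file means it is no longer the same file you collected.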
No matter how it's presented, remember that
only the information that needs to be distributed
should be made available. Not only can
Further Resources
Despite the extraordinary introduction to e-discovery
you've received in this book, it can be helpful to check
the Internet for more information on this subject. Just
like technology itself, the laws and decisions regarding
technology are constantly changing and evolving, and
it's important to keep on top of it. These sites can help
you stay up to date.
EDRM
The Electronic Discovery Reference Model (EDRM) is
available online at http://www.edrm.net/, and it
represents a clear path to evaluate and maintain the
integrity of electronic data. The EDRM site helps formalize
what was once a muddied and unclear process. By
firmly identifying steps to be used in the e-discovery
process, this project helps retain the integrity of electronic
data and helps ensure that it will be admissible in
courts, no matter where the location. Current projects
include working on keeping the process relevant and
updated, setting forth a clear and voluntary code of
conduct for those involved in e-discovery, and documenting
the use of metrics data in e-discovery.
RenewData Web site
To find out more about e-discovery, you can visit the
e-discovery Resources section of the RenewData Web
site (http://www.renewdata.com). This section
contains dozens of archived expert panel webinars,
peer-reviewed articles from legal journals, and news
updates on e-discovery. You'll also find reference cards
and interactive risk calculators, and you can sign up
for an e-discovery newsletter. For a demonstration of
online review tools, including Attenex, you can send an
e-mail to info@renewdata.com.
Copyright 2007 Renew Data Corp. All rights reserved. RenewData, the sphere logo and ActiveVault are registered
trademarks of Renew Data Corp. All other company and product names may be trademarks of their respective owners.