
Exploring NTFS:

Forensic Extraction of Data


Jason Medeiros
Grayscale Research 2008

About This Presentation


This presentation is a complement to the
Grayscale white paper, NTFS Forensics:
A Programmers View of Raw Filesystem
Data Extraction.
This paper is freely available in the
Whitepapers and Research section of
www.grayscale-research.org.

Describing our Intent


Open a raw physical disk
Read through its raw file system structures
to find a file which is locked by the kernel.
Read the sectors off the disk into a file
which can be read later.

Why is this Useful


Certain files such as the page file are
locked by the kernel so that improper
accesses cannot occur.
This is typically done to prevent race conditions
and to keep hackers and other
malcontents from accessing portions of
the disk that the operating system does
not want accessed.

Full Access
By going through the raw disk instead of
the file system routines, we get full access
to any file on the disk as well as any
sectors on the disk that may be
inaccessible through file system locking
mechanisms.

Step-by-step process overview

1. Find the partition entry which holds the NTFS partition

2. Navigate to the beginning of the Master File Table

3. Find the root directory metafile entry in the MFT and extract its index allocation
attribute data.

4. Process the INDX records found within the index allocation attribute data one by
one recursively until you find a file that matches the one you are looking for.

5. In the index entry that matched, find the MFT record number and move to that
record position within the MFT.

6. Record the MFT entry and process its standard attribute headers one by one until
the data attribute is encountered.

7. Use the process of attribute data extraction in order to retrieve the data attribute,
which contains the contents of the file that is being accessed.

About NTFS
The NTFS file system has gone through several
iterations, each of which brought new features to
end users and application designers.
The current version (3.1) is the primary focus of
this document, although the technical details
remain the same for previous versions, so this
document applies to them as well.

About NTFS Continued


NTFS was created as a
high performance replacement for the
antiquated FAT file system.
This new file system featured robust journaling,
alternate data streams, disk quotas, sparse files,
reparse points, encryption, compression, and
several more features that are necessary in
modern filesystems.

Filesystems And B+ Trees


It should be noted that many filesystems utilize
tree structures in order to store data
hierarchically and efficiently.
There are many different tree types
available, but one of the most common types
utilized for file systems is the B+ tree.
Despite a common misconception, a B+ tree is not
a binary tree: each node can have many children.
NTFS at its core structure is represented by a
B+ tree.

B+ Trees
A B+ tree represents sorted data in a way
which allows for extremely efficient
insertion, removal, and retrieval of data.
All data is identified by a key within the
tree which serves the purpose of being a
multilevel index to nodes on the tree.

Explanation of B+ Operations
In a B+ tree, each node in the tree holds record identifiers,
which this presentation also calls hashes.
Each of these identifiers can point either to data on the
disk or to another node.
Each node being pointed to can have the same
properties as the node before it.
This hierarchical data storage mechanism allows for a
theoretically unlimited depth with fast lookup times, as long as
the storage algorithm keeps the data sorted while it is being
stored.

Figure of an Example B+

Beginning the Process of Raw Data Extraction


Use the platform file system routines in
order to open a physical device handle.
Set the file pointer to the beginning of the
device.
Read in 512 bytes from the start of the
disk.

Microsoft API caveat


When reading data from a raw disk on a
Microsoft operating system, it should be noted
that drive reads will fail unless they are made in
multiples of 512 bytes.
Additionally, the real
structures used for the operations
documented in this presentation are found in the
figures section of the accompanying white
paper.
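
The caveat above can be sketched in portable C. This is only a sketch under stated assumptions: it uses standard C stdio so that it also runs against a disk image file, whereas on Windows the device would normally be opened by a path such as \\.\PhysicalDrive0. The names round_up_to_sector and read_sectors are hypothetical helpers, not part of the white paper.

```c
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512u

/* Round a byte count up to the next multiple of the sector size. */
static size_t round_up_to_sector(size_t nbytes)
{
    return (nbytes + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
}

/* Read `count` whole sectors starting at sector number `lba` into `buf`.
 * `buf` must hold at least count * SECTOR_SIZE bytes.
 * Returns 0 on success and -1 on failure. */
static int read_sectors(FILE *disk, uint64_t lba, size_t count,
                        unsigned char *buf)
{
    /* Assumption: `long` is wide enough for the byte offsets in use.
     * A real tool would use a 64-bit seek call for large disks. */
    if (fseek(disk, (long)(lba * SECTOR_SIZE), SEEK_SET) != 0)
        return -1;
    return fread(buf, SECTOR_SIZE, count, disk) == count ? 0 : -1;
}
```

Reading the master boot record then amounts to read_sectors(disk, 0, 1, buffer) with a 512 byte buffer.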

The Master Boot Record


The 512 bytes which were just read in constitute
the first sector of the disk. This data contains the
master boot record (MBR).
The first 440 bytes of this master boot record are
used as the code area for the MBR and are of no
use to us for file extraction.
Past the code area you have a four byte optional
disk signature and two bytes of null padding.

Master boot record continued


Directly after the null padding, at offset 446
(0x1be), you will find the system partition table.
Each partition entry is
16 bytes long, with a maximum of four possible
entries.
These entries contain the data which will allow
us to identify what types of partitions we are
looking at.

Partition Table
Each 16 byte partition record stored in the MBR contains,
at byte offset 4, a one byte value which
represents one of the possible partition types which can
exist.
The citations and references section of the white paper
accompanying this presentation links to a list of all
of the different partition types which can be encountered.
However, for the purposes of this presentation it is only
important to know that if this value is the number seven (0x07),
it represents an NTFS partition.

Partition Table Continued


At byte offset 8 from the beginning of the
partition entry identified as NTFS, there exists a four byte
value which tells us the logical block
address at which the partition starts.
The byte offset of the partition is simply the size of
one disk sector (512 bytes) multiplied by this
logical block value.
As soon as this value has been calculated, set
the file pointer of your disk handle to that offset.
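
A minimal sketch of the partition table scan described on the last two slides, assuming the 512 byte MBR is already in memory. The function and constant names are hypothetical. The offsets (446 for the table, 4 for the type byte, 8 for the starting logical block) come from the slides; multi-byte values on disk are little-endian.

```c
#include <stddef.h>
#include <stdint.h>

#define MBR_PARTITION_TABLE_OFFSET 446   /* 0x1be */
#define MBR_PARTITION_ENTRY_SIZE   16
#define MBR_PARTITION_COUNT        4
#define PARTITION_TYPE_NTFS        0x07
#define SECTOR_SIZE                512u

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scan the four partition entries of a 512 byte MBR for type 0x07.
 * On success, store the partition's byte offset on the disk and return
 * the entry index (0 to 3). Return -1 if no such entry exists. */
static int find_ntfs_partition(const unsigned char mbr[512],
                               uint64_t *byte_offset)
{
    for (int i = 0; i < MBR_PARTITION_COUNT; i++) {
        const unsigned char *entry =
            mbr + MBR_PARTITION_TABLE_OFFSET + i * MBR_PARTITION_ENTRY_SIZE;
        if (entry[4] == PARTITION_TYPE_NTFS) {
            *byte_offset = (uint64_t)read_le32(entry + 8) * SECTOR_SIZE;
            return i;
        }
    }
    return -1;
}
```

For example, a starting logical block of 2048 gives a byte offset of 2048 * 512 = 1048576.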

The NTFS Boot Sector


After advancing your file pointer to the
correct sector offset, you'll be ready to read
in a structure known as the NTFS boot
sector.
This boot sector has several very
important pieces of information in it. Of
particular note, it contains a structure
called the BIOS parameter block.

About The BIOS Parameter Block


The BIOS parameter block contains various
pieces of information about an NTFS partition.
This information includes the number of bytes
per sector, the number of sectors per cluster, the
number of total sectors, and the volume serial
number of the partition in question.
Most importantly, however, it contains the logical
cluster number of the Master File Table.
The Master File Table


The Master File Table (MFT) is a series of
clusters designated contiguously on the disk for
the purpose of holding an extremely large
structure array, with each array entry 1024 bytes
long.
Each entry in this Master File Table represents
one file on the disk. Every single file on the disk
has a Master File Table entry.

Finding the MFT


To find the start of the MFT extract the
bytes per sector member of the BIOS
parameter block and multiply it by the
number of sectors per cluster also found in
the BIOS parameter block.
This gives you the number of bytes per
cluster, where clusters are the smallest
storage allocation size for the file system.

Finding The MFT Continued


Take our newly calculated bytes per cluster
value and multiply it by the logical cluster
number of the MFT. Logical cluster numbers are
counted from the start of the partition, so add the
partition's byte offset to the result to move to the starting
position of the Master File Table.
As a note, the majority of the calculations
required in processing raw NTFS are relatively
simple and are documented fully and algorithmically
within the accompanying white paper.
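
The arithmetic from the last two slides, written out as a small sketch. The function names are hypothetical, and the partition offset parameter reflects the assumption that the logical cluster number is counted from the start of the partition rather than the start of the disk.

```c
#include <stdint.h>

/* Bytes per cluster = bytes per sector * sectors per cluster. */
static uint64_t bytes_per_cluster(uint16_t bytes_per_sector,
                                  uint8_t sectors_per_cluster)
{
    return (uint64_t)bytes_per_sector * sectors_per_cluster;
}

/* Byte offset of the MFT measured from the start of the disk. */
static uint64_t mft_disk_offset(uint64_t partition_offset,
                                uint64_t mft_lcn,
                                uint64_t cluster_bytes)
{
    return partition_offset + mft_lcn * cluster_bytes;
}
```

As an illustration with made-up values: 512 bytes per sector and 8 sectors per cluster give 4096 bytes per cluster. An MFT cluster number of 786432 in a partition starting at byte 1048576 then places the MFT at byte 1048576 + 786432 * 4096 = 3222274048.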

MFT Record Numbers


When we were discussing B+ trees earlier
in this presentation we talked a bit about
nodes, and hash lookup/record numbers.
In terms of NTFS these lookup records are,
in essence, MFT record numbers.
The MFT record number itself is an array
index into the MFT.
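
Because a record number is an array index, turning it into a byte position is one multiplication. The sketch below assumes the 1024 byte record size stated earlier and assumes the record lies inside the first contiguous run of the MFT. Record number 5 is the well-known record of the root directory, which step 3 of the process overview needs. The names are hypothetical.

```c
#include <stdint.h>

#define MFT_RECORD_SIZE     1024u
#define MFT_RECORD_ROOT_DIR 5u   /* record number of the root directory */

/* Byte offset of MFT record `record_number`, given the byte offset at
 * which the MFT starts. Only valid while the record lies inside the
 * first contiguous run of the MFT. */
static uint64_t mft_record_offset(uint64_t mft_offset,
                                  uint64_t record_number)
{
    return mft_offset + record_number * MFT_RECORD_SIZE;
}
```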

Filesystem Data Storage Semantics


Data which is being stored on a filesystem is
stored in what are known as data runs, which in
essence are contiguous series of
clusters.
The MFT is no different in the sense that it
can occupy multiple data runs.
However, the offset we calculated in the
previous slide brings us to the start of the first
data run of the MFT, which has special significance.

MFT First Data Run and Metafiles


Metafiles are special files utilized by NTFS
in order to perform specific functions.
The first data run contains all pertinent
system metafiles as well as the root
directory metafile.
These metafiles are extremely useful for
finding files quickly and efficiently.

NTFS And The Existence Of Multiple Data Runs Per File


If a file created by the operating system is
relatively small, it will literally be stored inside of
its MFT record entry.
However, if its content does not fit in the space left
inside the 1024 byte record, it must
be stored in data runs on the disk.
The difference between these two paradigms is
called resident and nonresident data storage.

MFT Records And Data Runs


As was said earlier each MFT record represents
a unique file on the file system.
Each MFT record also contains the different data
runs and offsets that contain the actual relevant
file content.
That being said, the MFT itself has an MFT entry
which can be used to extract it in full.

Extracting Data From MFT Records: Theory


The concept of extracting data from MFT records relies
on what are known as MFT record attributes.
There are 16 total attribute types which you will
encounter when dealing with MFT entries. The data
attribute is the one most relevant to data extraction.
Extracting the data attribute and retrieving its
data runs yields the actual content of the file to which
the MFT record belongs.

About MFT Record Attributes


An MFT record entry is actually comprised of
multiple attribute headers. These attribute
headers are responsible for identifying where
the particular attribute resides on disk, as well as
what type of attribute it is pointing to.
The next slide has a table of all available
attribute header types possibly found in an MFT
record.

MFT Record Attributes Table


Attribute Name                   Hexadecimal Value

Unused                           0x00
Standard Information             0x10
File Name                        0x30
Object ID                        0x40
Security Descriptor              0x50
Volume Name                      0x60
Volume Information               0x70
Data                             0x80
Index Root                       0x90
Index Allocation                 0xa0
Bitmap                           0xb0
Reparse Point                    0xc0
EA Information                   0xd0
EA                               0xe0
Property Set                     0xf0
Logged Utility Stream            0x100
First User Defined Attribute     0x1000
End of Attributes (records)      0xffffffff

Accessing MFT Attributes


In order to access the MFT attributes, extract
one 1024 byte MFT record from the start of the
file entry in the MFT into a usable program
buffer.
The first structure at the beginning of
this buffer is the MFT file entry header.
The seventh member of this structure (wAttribOffset,
at byte offset 20) is the start of
attributes offset. This is where we can start
finding our relevant attribute structures.
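
A minimal sketch of locating the first attribute in a record buffer. Byte offset 0x14 (decimal 20) follows from the sizes of the header members that precede the attribute offset: 4 + 2 + 2 + 8 + 2 + 2 bytes. The check for the "FILE" signature and the function name are additions for illustration, not steps from the slides.

```c
#include <string.h>

#define MFT_RECORD_SIZE         1024
#define MFT_ATTRIB_OFFSET_FIELD 0x14   /* position of the attribute offset */

/* Return the offset of the first attribute inside a 1024 byte MFT
 * record, or -1 if the record does not start with "FILE" or the stored
 * offset points outside the record. */
static int first_attribute_offset(const unsigned char record[MFT_RECORD_SIZE])
{
    if (memcmp(record, "FILE", 4) != 0)
        return -1;
    /* The field is a 16-bit little-endian value. */
    int off = record[MFT_ATTRIB_OFFSET_FIELD] |
              (record[MFT_ATTRIB_OFFSET_FIELD + 1] << 8);
    if (off >= MFT_RECORD_SIZE)
        return -1;
    return off;
}
```

Reading the field by byte offset, rather than by casting the buffer to a struct, sidesteps any compiler padding concerns.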

MFT File Entry Header


/* WORD, DWORD and LONGLONG are the Windows types from <windows.h>. */
typedef struct _NTFS_MFT_FILE_ENTRY_HEADER {
    char     fileSignature[4];
    WORD     wFixupOffset;
    WORD     wFixupSize;
    LONGLONG n64LogSeqNumber;
    WORD     wSequence;
    WORD     wHardLinks;
    WORD     wAttribOffset;     /* start of attributes offset */
    WORD     wFlags;
    DWORD    dwRecLength;
    DWORD    dwAllLength;
    LONGLONG n64BaseMftRec;
    WORD     wNextAttrID;
    WORD     wFixupPattern;
    DWORD    dwMFTRecNumber;
} NTFS_MFT_FILE_ENTRY_HEADER, *P_NTFS_MFT_FILE_ENTRY_HEADER;

Navigating MFT Attributes


MFT attribute structures are dynamic in type
and purpose.
The internal structures inside the attribute
structure are combined in a union and
take on different purposes depending on
static flags at the beginning of the attribute
header.
The next slide shows the NTFS attribute's static
members.

NTFS Attribute Structure


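typedef struct _NTFS_ATTRIBUTE {
    DWORD dwType;          /* attribute type, e.g. 0x80 for Data */
    DWORD dwFullLength;    /* length of the whole attribute */
    BYTE  uchNonResFlag;   /* selects the Resident or NonResident member */
    BYTE  uchNameLength;
    WORD  wNameOffset;
    WORD  wFlags;
    WORD  wID;
    union ATTR { /* members shown on the next slide */ } Attr;
} NTFS_ATTRIBUTE, *P_NTFS_ATTRIBUTE;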
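
Step 6 of the process overview (processing attribute headers one by one until the data attribute is encountered) can be sketched with just the first two static members, dwType and dwFullLength. The function name and the bounds checks are assumptions added for illustration. The type values come from the attribute table.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_TYPE_DATA 0x80u
#define ATTR_TYPE_END  0xFFFFFFFFu

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the attribute headers in `record`, starting at `first_attr`.
 * Returns the offset of the first attribute whose type equals `wanted`,
 * or -1 if the end marker or the end of the record is reached first. */
static long find_attribute(const unsigned char *record, size_t record_size,
                           size_t first_attr, uint32_t wanted)
{
    size_t pos = first_attr;
    while (pos + 8 <= record_size) {
        uint32_t type   = read_le32(record + pos);      /* dwType */
        uint32_t length = read_le32(record + pos + 4);  /* dwFullLength */
        if (type == ATTR_TYPE_END)
            break;
        if (type == wanted)
            return (long)pos;
        if (length == 0 || length > record_size - pos)
            break;   /* malformed record: stop instead of looping */
        pos += length;
    }
    return -1;
}
```

The same walk finds the index allocation attribute of the root directory by passing 0xa0 instead of 0x80.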

NTFS Attribute Internal Union


union ATTR {
    struct RESIDENT {
        DWORD dwLength;
        WORD  wAttrOffset;
        BYTE  uchIndexedTag;
        BYTE  uchPadding;
    } Resident;
    struct NONRESIDENT {
        LONGLONG n64StartVCN;
        LONGLONG n64EndVCN;
        WORD     wDatarunOffset;
        WORD     wCompressionSize;
        BYTE     uchPadding[4];
        LONGLONG n64AllocSize;
        LONGLONG n64RealSize;
        LONGLONG n64StreamSize;
    } NonResident;
} Attr;

The Extraction Of Relevant Attribute Data


Earlier in this presentation it was said that if
the file was small enough it would be
stored in the MFT record itself.
Data attributes are just like any other
attribute: if they are marked resident,
they are stored in the MFT record.

Resident Data Extraction Continued


Extract the attribute header's resident union structure.
Extract the offset member (wAttrOffset) from that structure.
Move to the beginning of the attribute header and
advance that position by the number of bytes specified in
the offset.
From that position you can read in the attribute data
directly, with a length (dwLength) also found in the resident union
structure.
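
The resident extraction steps can be sketched as follows. The byte positions are derived from the structures shown earlier: the static attribute members occupy 16 bytes, so dwLength sits at offset 0x10 and wAttrOffset at 0x14, with the nonresident flag at 0x08. The sketch assumes the content offset is measured from the start of the attribute header. The function name and the bounds checks are hypothetical additions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy the content of the resident attribute that starts at `attr`
 * (with `attr_size` bytes available) into `out`.
 * Returns the number of bytes copied, or -1 if the attribute is
 * nonresident or its fields point outside the available bytes. */
static long extract_resident(const unsigned char *attr, size_t attr_size,
                             unsigned char *out, size_t out_size)
{
    if (attr_size < 0x18 || attr[0x08] != 0)   /* uchNonResFlag */
        return -1;
    uint32_t length = read_le32(attr + 0x10);  /* Resident.dwLength */
    uint16_t offset = (uint16_t)(attr[0x14] | (attr[0x15] << 8));
    if (offset > attr_size || length > attr_size - offset ||
        length > out_size)
        return -1;
    memcpy(out, attr + offset, length);
    return (long)length;
}
```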

Concluding Resident Data Extraction


That is all there is to extracting a resident
attribute from an MFT record.
Nonresident data extraction however is
much more complicated.

The Extraction Of Nonresident Attribute Data


Start by extracting the nonresident union
structure from the attribute structure in question.
Create a one byte value which will hold the
sizes of the length and offset fields.
Move the file pointer to the nonresident union
structure's data run offset and read the first
byte into our newly created one byte value.

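The Extraction Of Nonresident Data Continued


The actual byte value recorded is in reality
two four bit values.
You can split these with a union where the
low four bits give the size in bytes of the run
length field and the high four bits give the
size in bytes of the run offset field.

A Four Bit Union Structure Which Can Be Utilized


/* Assumes the compiler allocates bit-fields starting from the low-order
   bits, as common x86 compilers do. */
typedef struct _LEN_OFFS_BITFIELD {
    union {
        BYTE val;
        struct {
            unsigned char len:4;   /* low nibble: bytes in the length field */
            unsigned char offs:4;  /* high nibble: bytes in the offset field */
        } bitfield;
    };
} LEN_OFFS_BITFIELD, *P_LEN_OFFS_BITFIELD;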
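
Bit-field layout is left to the compiler, so a portable alternative is to split the byte with a mask and a shift. This is a sketch with a hypothetical function name. It follows the commonly documented NTFS run list convention in which the low four bits give the size of the length field and the high four bits give the size of the offset field.

```c
#include <stdint.h>

/* Split a data run header byte into its two four bit values. */
static void split_run_header(uint8_t header,
                             unsigned *length_size,
                             unsigned *offset_size)
{
    *length_size = header & 0x0Fu;          /* low four bits */
    *offset_size = (header >> 4) & 0x0Fu;   /* high four bits */
}
```

A header byte of 0x21, for example, announces a one byte length field followed by a two byte offset field.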

Continuing Extraction
Create an eight byte value for large integer calculations.
This value will hold a copy of the length of the first
data run.
Using the length member of the union bit field in the
previous slide, copy that number of bytes into the new
value. This is the length of the data run, in clusters, as a large
integer.
Advance the file pointer that number of bytes forward.

Still Continuing Extraction (almost done)


Once the read position has moved forward by the number of bytes
which you read, create another eight byte
value.
This next value is our offset placeholder.
Copy into this new variable the number of bytes
specified in the offset bit field from the previous union.
This eight byte value now contains the offset of
our first data run, expressed as a logical cluster number.

Finishing Up
If we were to now move our file pointer to the data run
offset and read in the amount of data specified by the
data run length, converting both values from clusters to bytes,
we would be able to extract a data run
from a file.
Considering that a file can have multiple data runs, and
that their records are all stored contiguously, it's often
best to read in all the data run variables before extracting
the actual run data.
However, the final implementation methodology is left to
the developer.
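
One possible way to read in all the data run variables first is sketched below. It is not the white paper's implementation. It assumes the commonly documented run list encoding: little-endian fields, a zero header byte as terminator, and offsets stored as signed values relative to the start of the previous run. Sparse runs (offset size zero) are rejected rather than handled. The data_run type and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int64_t  lcn;        /* first cluster of the run, partition-relative */
    uint64_t clusters;   /* run length in clusters */
} data_run;

/* Decode up to `max_runs` runs from the `n` byte run list at `p`.
 * Returns the number of runs decoded, or -1 on a truncated, oversized
 * or sparse entry. */
static int decode_runlist(const unsigned char *p, size_t n,
                          data_run *runs, int max_runs)
{
    size_t pos = 0;
    int64_t lcn = 0;
    int count = 0;

    while (pos < n && p[pos] != 0 && count < max_runs) {
        unsigned len_size = p[pos] & 0x0Fu;   /* low four bits */
        unsigned off_size = p[pos] >> 4;      /* high four bits */
        pos++;
        if (len_size > 8 || off_size == 0 || off_size > 8 ||
            n - pos < len_size + off_size)
            return -1;

        uint64_t length = 0;
        for (unsigned i = 0; i < len_size; i++)
            length |= (uint64_t)p[pos + i] << (8 * i);
        pos += len_size;

        uint64_t raw = 0;
        for (unsigned i = 0; i < off_size; i++)
            raw |= (uint64_t)p[pos + i] << (8 * i);
        /* Sign-extend: later runs may lie before the previous one. */
        if (off_size < 8 && (p[pos + off_size - 1] & 0x80))
            raw |= ~(uint64_t)0 << (8 * off_size);
        pos += off_size;

        lcn += (int64_t)raw;
        runs[count].lcn = lcn;
        runs[count].clusters = length;
        count++;
    }
    return count;
}
```

Each decoded run then maps to the disk as partition offset + lcn * bytes per cluster, with clusters * bytes per cluster bytes to read.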

Additional Information
Linux NTFS Driver Project, "NTFS Documentation",
Richard Russon and Yuval Fledel, 2005,
http://data.linux-ntfs.org/ntfsdoc.pdf
Wikipedia, "NTFS",
http://en.wikipedia.org/wiki/NTFS
Wikipedia, "BIOS Parameter Block",
http://en.wikipedia.org/wiki/BIOS_parameter_block
Wikipedia, "Master Boot Record",
http://en.wikipedia.org/wiki/Master_boot_record

Theory Application Demo
