
Weed File System

Simple and highly scalable distributed file system (NoFS)

Project Objectives

Yes:
1. Store billions of files!
2. Serve the files fast!

Not:
- Namespaces
- POSIX compliance

Design Goals

1. Separate volume metadata from file metadata.
2. Each volume can optionally have several replicas.
3. Flexible volume placement control, with several volume placement policies.
4. When saving data, the user can specify a replication factor or a desired replication policy.

Challenges for a common FS

- POSIX compliance costs space and is inefficient.
- One folder cannot store too many files, so a deep path must be generated.
- Reading one file requires visiting the whole directory path, and each level may require one disk seek.
- Moving deep directory trees across computers is slow.

Challenges for HDFS

- Stores large files, not lots of small files.
- Designed for streaming files, not on-demand random access.
- The name node keeps all metadata: a single point of failure (SPOF) and a bottleneck.

How are Weed-FS files stored?

- Files are stored in 32GB-sized volumes.
- Each volume server has multiple volumes.
- The master server tracks each volume's location and free space.
- The master server generates unique keys and directs clients to a volume server to store the file.
- Clients remember the fid.

Workflow

Master Node
- Generates unique keys
- Tracks volume status: <volume id, <url, free size>>
- Maintained via heartbeat
- Can restart

fid format

Sample file key: 3,01637037d6

Each key has 3 components:
- Volume ID = 3
- File key = 01
- File cookie = 637037d6 (4 bytes)

Volume Node
- Keeps several volumes
- Each volume keeps a map: Map<key, <offset, size>>

File Entry in Volume

Compared to HDFS

HDFS:
- Namenode stores all file metadata
- Namenode loss cannot be tolerated

WeedFS:
- MasterNode only stores volume locations
- MasterNode can be restarted fresh
- Easy to have multiple instances (TODO)

Serve Files Fast

- Each volume server maintains a map<key, <offset, size>> for each of its volumes.
- No disk read for file metadata.
- A file can usually be served with one disk read, O(1), unless:
  - the file is already in the buffer, or
  - the file on disk is not in one contiguous block (use XFS to store it in a contiguous block).

Automatic Compression

- Compresses the data based on mime types
- Transparent
- Works with browsers that accept gzip encoding

Volume Replica Placement

1. Each volume can have several replicas.
2. Flexible volume placement control, with several volume placement policies.
3. When saving data, the user can specify a replication factor or a desired replication policy.

Flexible Replica Placement Policies

1. No replication.
2. 1 replica on the local rack.
3. 1 replica in the local data center, but on a different rack.
4. 1 replica in a different data center.
5. 2 replicas: the first on the local rack on a random other server, the second in the local data center on a random other rack.
6. 2 replicas: the first on a random other rack in the same data center, the second in a different data center.

Future work
Tools to manage the file system
