Configuration Data
NDFS stores cluster configuration data using a very
small in-memory database backed by solid-state drives
(SSDs). Three copies of this configuration database are
maintained in the cluster at all times. Importantly, there
Metadata
The most important and complex part of a file system is its
metadata. In a scalable file system, the amount of metadata
can grow very large. Further complicating the task,
it is not possible to hold the metadata centrally on a few
designated nodes or in memory.
NDFS employs multiple NoSQL concepts to scale the
storage and management of metadata. For example, the
system uses a NoSQL database called Cassandra
to maintain key-value pairs, where the key is the offset in a
particular virtual disk and the value represents the physical
locations of the replicas of that data in the cluster.
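
As a rough illustration of the key-value layout described above, the sketch below models one metadata entry in Python. The names MetadataKey, ReplicaLocation, and MetadataStore, their fields, and the in-memory dictionary are all hypothetical. They stand in for the Cassandra-backed store and are not the actual NDFS schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MetadataKey:
    """Hypothetical key: an offset within a particular virtual disk."""
    vdisk_id: str
    offset: int


@dataclass(frozen=True)
class ReplicaLocation:
    """Hypothetical value element: where one replica physically lives."""
    node: str
    disk: str
    physical_offset: int


class MetadataStore:
    """In-memory stand-in for the distributed key-value store (assumption)."""

    def __init__(self) -> None:
        self._table: Dict[MetadataKey, List[ReplicaLocation]] = {}

    def put(self, key: MetadataKey, replicas: List[ReplicaLocation]) -> None:
        # Store a copy so later changes to the caller's list do not leak in.
        self._table[key] = list(replicas)

    def lookup(self, key: MetadataKey) -> List[ReplicaLocation]:
        # An unknown key yields an empty list rather than raising.
        return list(self._table.get(key, []))


store = MetadataStore()
key = MetadataKey(vdisk_id="vdisk-7", offset=1048576)
store.put(key, [
    ReplicaLocation(node="node-a", disk="ssd-0", physical_offset=4096),
    ReplicaLocation(node="node-c", disk="hdd-2", physical_offset=81920),
])
```

Looking up the same key returns the two replica locations, which is the information a read path would need to find the data.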
When a key needs to be stored, a consistent hash is used
to calculate the locations where the key and value will
be stored in the cluster. The consistent hash function
distributes the load of storing keys uniformly across the
cluster. As the cluster grows or shrinks, the consistent
hash ring self-heals and rebalances key storage
responsibility among the participating nodes, so that every
node is responsible for managing roughly the same
amount of metadata.
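
The placement scheme described above can be sketched as a small consistent hash ring. This is a generic illustration, not the NDFS or Cassandra implementation: the HashRing class, the use of SHA-256, the 64 virtual tokens per node, and the default of three replicas per key are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Dict, List


def ring_position(value: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent hash ring with virtual tokens per node."""

    def __init__(self, tokens_per_node: int = 64) -> None:
        self._tokens_per_node = tokens_per_node
        self._tokens: List[int] = []        # sorted ring positions
        self._owners: Dict[int, str] = {}   # ring position -> node name

    def add_node(self, node: str) -> None:
        for i in range(self._tokens_per_node):
            token = ring_position(f"{node}#{i}")
            if token in self._owners:
                continue  # skip the (very unlikely) position collision
            self._owners[token] = node
            bisect.insort(self._tokens, token)

    def remove_node(self, node: str) -> None:
        # Dropping a node's tokens hands its key ranges to the next owners.
        self._tokens = [t for t in self._tokens if self._owners[t] != node]
        self._owners = {t: n for t, n in self._owners.items() if n != node}

    def locate(self, key: str, replicas: int = 3) -> List[str]:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._tokens:
            return []
        start = bisect.bisect_right(self._tokens, ring_position(key))
        found: List[str] = []
        for step in range(len(self._tokens)):
            owner = self._owners[self._tokens[(start + step) % len(self._tokens)]]
            if owner not in found:
                found.append(owner)
                if len(found) == replicas:
                    break
        return found
```

Because each node owns many small arcs of the ring, adding or removing a node moves only the keys on that node's arcs, and the remaining nodes keep a roughly even share.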
the files backing the virtual disks of the VMs. Each virtual disk,
or any other large file, is converted into a Nutanix vDisk that
is managed as a first-class citizen in the file system.
I/O for a particular VM is served by the local Controller
VM that is running on the host. That local Controller VM
acquires the lock for all of the virtual disks backing the VM.
Because virtual disks are not typically shared with other
hosts, NDFS simply uses a vDisk-level lock. As such, there
are no invalidations or cache coherency issues. Even with
a very large number of VMs, NDFS effectively manages
locks with minimal overhead to ensure that the system can
still scale linearly.
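
The vDisk-level locking described above can be sketched as a table with one owner per vDisk. This is a single-process illustration only: the VDiskLockTable class and its method names are hypothetical, and a real cluster would need a distributed coordination mechanism that this sketch does not model.

```python
import threading
from typing import Dict, Iterable, Optional


class VDiskLockTable:
    """One exclusive owner per vDisk; no finer-grained locks are tracked."""

    def __init__(self) -> None:
        self._guard = threading.Lock()      # protects the table itself
        self._owner: Dict[str, str] = {}    # vdisk id -> Controller VM id

    def acquire_all(self, vdisk_ids: Iterable[str], controller: str) -> bool:
        """Take every vDisk backing a VM, or none if any is held elsewhere."""
        ids = list(vdisk_ids)
        with self._guard:
            if any(self._owner.get(v, controller) != controller for v in ids):
                return False
            for v in ids:
                self._owner[v] = controller
            return True

    def acquire(self, vdisk_id: str, controller: str) -> bool:
        return self.acquire_all([vdisk_id], controller)

    def release(self, vdisk_id: str, controller: str) -> None:
        with self._guard:
            # Only the current owner may release the lock.
            if self._owner.get(vdisk_id) == controller:
                del self._owner[vdisk_id]

    def owner(self, vdisk_id: str) -> Optional[str]:
        with self._guard:
            return self._owner.get(vdisk_id)


table = VDiskLockTable()
# The Controller VM on host 1 takes every vDisk backing its local VM.
table.acquire_all(["vm1-disk0", "vm1-disk1"], "cvm-host-1")
```

Since exactly one Controller VM owns a vDisk at a time, that owner can cache the vDisk's data without coordinating invalidations with other hosts, which is the property the text relies on.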