Ceph is a free software storage platform designed to present object, block, and file storage from a single distributed computer cluster. Ceph's main goals are to be completely distributed without a single point of failure, scalable to the exabyte level, and freely available. The data is replicated, making it fault tolerant.[3]
Ceph software runs on commodity hardware.
The system is designed to be both self-healing and self-managing and strives to reduce both administrator and budget overhead.
Ceph employs four distinct kinds of daemons:[4]
Cluster monitors (ceph-mon) that keep track of active and failed cluster nodes
Metadata servers (ceph-mds) that store the metadata of inodes and directories
Object storage devices (ceph-osd) that actually store the content of files. Ideally, OSDs store their data on a local Btrfs filesystem to leverage its built-in copy-on-write capabilities, though other local filesystems can be used instead.[5]
Representational state transfer (RESTful) gateways (ceph-rgw) that expose the object storage layer as an interface compatible with Amazon S3 or OpenStack Swift APIs
All of these are fully distributed, and may run on the same set of servers.
Clients directly interact with all of them.[6]
Ceph stripes individual files across multiple nodes to achieve higher throughput, similar to how RAID 0 stripes partitions across multiple hard drives.
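The RAID 0 analogy can be made concrete with a short sketch. The code below is a simplified round-robin model and not Ceph's actual placement logic; the names `locate`, `split_write`, `stripe_unit`, and `stripe_count` are assumptions chosen for illustration.

```python
def locate(offset, stripe_unit, stripe_count):
    """Map a byte offset in a file to (node index, offset on that node).

    Simplified RAID 0-style layout: consecutive stripe units are dealt
    round-robin across `stripe_count` nodes.
    """
    if offset < 0 or stripe_unit <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; sizes must be positive")
    unit_index = offset // stripe_unit      # which stripe unit overall
    node = unit_index % stripe_count        # node holding that unit
    row = unit_index // stripe_count        # full stripes that precede it
    return node, row * stripe_unit + offset % stripe_unit


def split_write(offset, length, stripe_unit, stripe_count):
    """Break one contiguous write into (node, node_offset, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        node, node_offset = locate(offset, stripe_unit, stripe_count)
        # A single piece never crosses a stripe-unit boundary.
        chunk = min(end - offset, stripe_unit - offset % stripe_unit)
        pieces.append((node, node_offset, chunk))
        offset += chunk
    return pieces
```

In this model, a write that spans several stripe units is split into pieces that land on different nodes, so they can be serviced in parallel. That parallelism is the source of the throughput gain.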
Adaptive load balancing is supported whereby frequently accessed objects are replicated over more nodes.[citation needed]
As of December 2014, the underlying filesystems recommended for production environments are ext4 (small-scale deployments) and XFS (large-scale deployments), while Btrfs and ZFS are recommended for non-production environments.[7]
Object storage
An architecture diagram showing the relations between components of the Ceph storage platform
Ceph implements distributed object storage. Ceph’s software libraries provide client applications with direct access to the reliable autonomic distributed object store (RADOS) object-based storage system, and also provide a foundation for some of Ceph’s features, including RADOS Block Device (RBD), RADOS Gateway, and the Ceph File System.
The librados software libraries provide access in C, C++, Java, Python and PHP. The RADOS Gateway also exposes the object store as a RESTful interface that is compatible with both the Amazon S3 and OpenStack Swift APIs.
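As an illustration of direct object access through the Python binding, the sketch below writes one object and reads it back. It assumes the `rados` Python module is installed and that a reachable cluster exists; the function names and the configuration path, pool name, and object name passed in by the caller are placeholders, not values from this article.

```python
def store_and_fetch(ioctx, name, payload):
    """Write `payload` as a single object, then read it back.

    `ioctx` is expected to behave like a librados I/O context,
    that is, an object offering write_full() and read().
    """
    ioctx.write_full(name, payload)          # replace the whole object
    return ioctx.read(name, len(payload))    # read back that many bytes


def run_against_cluster(conffile, pool, name, payload):
    """Connect to a real cluster and run the round trip."""
    import rados  # imported lazily so the sketch loads without Ceph installed

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            return store_and_fetch(ioctx, name, payload)
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Keeping the I/O logic in `store_and_fetch` separate from connection handling means it can be exercised with a stand-in object when no cluster is available.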
Block storage
Ceph’s object storage system allows users to mount Ceph as a thinly provisioned block device. When an application writes data to Ceph using a block device, Ceph automatically stripes and replicates the data across the cluster. Ceph's RADOS Block Device (RBD) also integrates with the Kernel-based Virtual Machine (KVM) hypervisor.
Ceph RBD interfaces with the same Ceph object storage system that provides the librados interface and the CephFS file system, and it stores block device images as objects. Since RBD is built on top of librados, RBD inherits librados's capabilities, including read-only snapshots and revert to snapshot. By striping images across the cluster, Ceph improves read access performance for large block device images.
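The idea of a thinly provisioned image stored as objects can be sketched with a toy model. The class below is an in-memory illustration only, not RBD's implementation; `ThinImage`, its methods, and the 4 MiB default object size used here are assumptions made for the example.

```python
class ThinImage:
    """Toy thinly provisioned block image backed by fixed-size objects."""

    def __init__(self, size, object_size=4 * 1024 * 1024):
        self.size = size
        self.object_size = object_size
        self.objects = {}   # object index -> bytearray; absent = never written

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside image bounds")
        pos = 0
        while pos < len(data):
            index, within = divmod(offset + pos, self.object_size)
            chunk = min(len(data) - pos, self.object_size - within)
            # Allocate the backing object only on first write.
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[within:within + chunk] = data[pos:pos + chunk]
            pos += chunk

    def read(self, offset, length):
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside image bounds")
        out = bytearray()
        while len(out) < length:
            index, within = divmod(offset + len(out), self.object_size)
            chunk = min(length - len(out), self.object_size - within)
            obj = self.objects.get(index)
            # Unwritten regions read back as zeros and consume no space.
            out += obj[within:within + chunk] if obj is not None else bytes(chunk)
        return bytes(out)
```

An image declared with a large `size` holds no objects until data is written, and a write that crosses an object boundary touches only the objects it overlaps.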
The block device is supported in virtualization platforms, including Apache CloudStack, OpenStack, OpenNebula, Ganeti, and Proxmox Virtual Environment. These integrations allow administrators to use Ceph's block device as the storage for their virtual machines in these environments.
File system
Ceph’s file system (CephFS) runs on top of the same object storage system that provides object storage and block device interfaces. The Ceph metadata server cluster provides a service that maps the directories and file names of the file system to objects stored within RADOS clusters. The metadata server cluster can expand or contract, and it can rebalance the file system dynamically to distribute data evenly among cluster hosts. This ensures high performance and prevents heavy loads on specific hosts within the cluster.
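The mapping from file names to stored objects can be pictured with a toy lookup table. The sketch below is a deliberately simplified stand-in for a metadata server and omits directories, permissions, and rebalancing; the class `ToyMetadataServer`, its methods, and the object naming scheme are hypothetical and chosen only for illustration.

```python
import itertools


class ToyMetadataServer:
    """Toy metadata service: maps file paths to inodes and object names."""

    def __init__(self, object_size=4 * 1024 * 1024):
        self.object_size = object_size
        self.inodes = {}                     # path -> (inode number, file size)
        self._next_ino = itertools.count(1)

    def create(self, path, size):
        """Register a file of `size` bytes and return its inode number."""
        if path in self.inodes:
            raise FileExistsError(path)
        ino = next(self._next_ino)
        self.inodes[path] = (ino, size)
        return ino

    def objects_for(self, path):
        """Return the names of the objects that would hold the file's data."""
        ino, size = self.inodes[path]
        count = -(-size // self.object_size)     # ceiling division
        # Hypothetical naming scheme: <inode hex>.<object index hex>
        return [f"{ino:x}.{idx:08x}" for idx in range(count)]
```

A client in this model asks the metadata service which objects back a path and then fetches those objects from the object store directly, so file data never passes through the metadata service itself.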
- published: 28 Jun 2015