The Zettabyte File System, or ZFS, is a free, open-source file system produced by Sun Microsystems for its Solaris operating system.
ZFS was announced in September 2004 [1]. Source code for the final product was integrated into the main trunk of Solaris development on October 31, 2005 [2] and released as part of build 27 of OpenSolaris on November 16, 2005.
ZFS was designed and implemented by a team at Sun led by Jeff Bonwick.
Capacity
ZFS is a 128-bit file system, giving it 18 billion billion (about 1.8 × 10^19) times the capacity of 64-bit systems. The limits of ZFS are designed to be so large that they will never be encountered in any practical operation. Some of the theoretical limits in ZFS are:
- 2^48: Number of snapshots in any file system (about 2.8 × 10^14)
- 2^48: Number of files in any individual file system (about 2.8 × 10^14)
- 16 exabytes: Maximum size of a file system
- 16 exabytes: Maximum size of a single file
- 16 exabytes: Maximum size of any attribute
- 3 × 10^23 petabytes: Maximum size of any zpool
- 2^56: Number of attributes of a file (actually constrained to 2^48, as that is the number of files in a ZFS file system)
- 2^56: Number of files in a directory (actually constrained to 2^48, as that is the number of files in a ZFS file system)
- 2^64: Number of devices in any zpool
- 2^64: Number of zpools in a system
- 2^64: Number of file systems in a zpool
As an example of how large these numbers are: a customer creating 1,000 files a second would need about 9,000 years to reach the limit on the number of files.
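The 9,000-year figure can be checked with a few lines of arithmetic. The sketch below assumes the per-file-system limit of 2^48 files and a year of 365.25 days; the variable names are only illustrative.

```python
# Back-of-the-envelope check of the file-count example.
MAX_FILES = 2 ** 48                        # files per file system (limit quoted in the list)
FILES_PER_SECOND = 1_000                   # creation rate used in the example
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumes an average year of 365.25 days

years_to_limit = MAX_FILES / FILES_PER_SECOND / SECONDS_PER_YEAR
print(f"{MAX_FILES:,} files at {FILES_PER_SECOND:,} per second: "
      f"roughly {years_to_limit:,.0f} years")
```

The result lands a little under 9,000 years, which agrees with the rounded figure in the text.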
Platforms
ZFS is part of Solaris for SPARC and Solaris for x86. Zpools and their associated ZFS file systems and zvols can be moved between SPARC and x86 systems. The complex block pointer format also allows file system metadata to be stored in an endian-adaptive way, so storage containing a ZFS pool can be moved between systems of different byte order. Individual metadata blocks are written in the native byte order of the system writing the block. When a block is read on a system whose endianness does not match, the metadata is byte-swapped in memory. Files appear to applications, as is usual in POSIX systems, as simple arrays of bytes, so applications remain responsible for any byte-swapping required within file data.
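The write-native, swap-on-read idea can be illustrated with a small sketch. It is not ZFS's on-disk format: the one-byte order flag stored in front of the data, the 64-bit word size, and the function names `write_block` and `read_block` are all assumptions made for the example.

```python
import struct
import sys

def write_block(words, byteorder=sys.byteorder):
    """Serialize 64-bit words in the writer's byte order and record that order."""
    flag = 1 if byteorder == "big" else 0          # hypothetical one-byte order flag
    fmt = (">" if byteorder == "big" else "<") + f"{len(words)}Q"
    return bytes([flag]) + struct.pack(fmt, *words)

def read_block(block, host_byteorder=sys.byteorder):
    """Load the words in host order, byte-swapping in memory only on a mismatch."""
    writer_order = "big" if block[0] == 1 else "little"
    payload = block[1:]
    fmt = (">" if host_byteorder == "big" else "<") + f"{len(payload) // 8}Q"
    words = list(struct.unpack(fmt, payload))      # raw load, as the host sees it
    if writer_order != host_byteorder:
        # Reverse the eight bytes of every word to recover the written values.
        words = [int.from_bytes(w.to_bytes(8, "little"), "big") for w in words]
    return words

# A block written on a big-endian machine and read on a little-endian one.
block = write_block([1, 0x0102030405060708], byteorder="big")
recovered = read_block(block, host_byteorder="little")
```

Writers never pay a conversion cost in this scheme, and readers pay it only when a pool has actually moved between architectures.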
Copy-on-write transactional model
ZFS uses a copy-on-write, transactional object model. All block pointers within the file system contain a 256-bit checksum of the target block, which is verified when the block is read. Blocks containing active data are never overwritten in place; instead, a new block is allocated, the modified data is written to it, and then any metadata blocks referencing it are similarly read, reallocated, and written. To reduce the overhead of this process, multiple updates are grouped into transaction groups, and an intent log is used when synchronous write semantics are required.
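A toy model may help make the update path concrete. The sketch below assumes a two-level tree (one metadata block pointing at data blocks), uses SHA-256 only because it yields a 256-bit digest, and invents the names `BlockStore`, `cow_update` and the 40-byte pointer layout; transaction groups and the intent log are left out.

```python
import hashlib

POINTER_SIZE = 8 + 32   # 8-byte address plus 256-bit checksum (illustrative layout)

class ChecksumError(Exception):
    """Raised when a block does not match the checksum held in its pointer."""

class BlockStore:
    """Append-only store: a block, once written, is never overwritten."""

    def __init__(self):
        self.blocks = {}      # address -> bytes
        self.next_addr = 0

    def allocate(self, data):
        """Write data to a fresh block and return an (address, checksum) pointer."""
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr, hashlib.sha256(data).digest()

    def read(self, pointer):
        """Read a block and verify it against the checksum in the pointer."""
        addr, checksum = pointer
        data = self.blocks[addr]
        if hashlib.sha256(data).digest() != checksum:
            raise ChecksumError(f"block {addr} failed verification")
        return data

def pack_pointers(pointers):
    return b"".join(addr.to_bytes(8, "big") + digest for addr, digest in pointers)

def unpack_pointers(raw):
    return [(int.from_bytes(raw[i:i + 8], "big"), raw[i + 8:i + POINTER_SIZE])
            for i in range(0, len(raw), POINTER_SIZE)]

def cow_update(store, root, index, new_data):
    """Replace one data block without touching any existing block."""
    pointers = unpack_pointers(store.read(root))
    pointers[index] = store.allocate(new_data)        # new data block
    return store.allocate(pack_pointers(pointers))    # new metadata block

store = BlockStore()
root_v1 = store.allocate(pack_pointers([store.allocate(b"alpha"),
                                        store.allocate(b"beta")]))
root_v2 = cow_update(store, root_v1, 1, b"gamma")
```

After the update, `root_v1` still describes the old contents and `root_v2` the new ones, and the unchanged block is referenced by both.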
Snapshots
Because ZFS does not overwrite data in place, taking a snapshot simply means not releasing the blocks used by the old version of the data. Snapshots are therefore very fast to take and space efficient, since they share unchanged data with the file system. Writable snapshots (known as clones) can also be created, resulting in two independent file systems that share a common set of blocks. As changes are made, the blocks of the file systems diverge, but common blocks are held only once no matter how many clones exist.
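The block sharing between a file system and its clones can be sketched as follows. Reference counting is used here purely as a simple way to model "not releasing" blocks and is an assumption of the example, as are the `Pool` and `FileSystem` classes; a snapshot corresponds to a clone that is never written to.

```python
class Pool:
    """Holds every block once, with a count of the file systems referencing it."""

    def __init__(self):
        self.blocks = {}      # block id -> data
        self.refs = {}        # block id -> number of referencing file systems
        self.next_id = 0

    def store(self, data):
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        self.refs[block_id] = 0
        return block_id

    def hold(self, block_id):
        self.refs[block_id] += 1

    def release(self, block_id):
        self.refs[block_id] -= 1
        if self.refs[block_id] == 0:      # nobody needs the block any more
            del self.blocks[block_id]
            del self.refs[block_id]

class FileSystem:
    def __init__(self, pool, block_map=None):
        self.pool = pool
        self.block_map = dict(block_map or {})    # file name -> block id
        for block_id in self.block_map.values():
            pool.hold(block_id)

    def write(self, name, data):
        """Copy-on-write: store a new block, then drop our hold on the old one."""
        old = self.block_map.get(name)
        new = self.pool.store(data)
        self.pool.hold(new)
        self.block_map[name] = new
        if old is not None:
            self.pool.release(old)

    def read(self, name):
        return self.pool.blocks[self.block_map[name]]

    def clone(self):
        """Copy only the block map; no data blocks are duplicated."""
        return FileSystem(self.pool, self.block_map)

pool = Pool()
fs = FileSystem(pool)
fs.write("a", b"1")
fs.write("b", b"2")
clone = fs.clone()          # instant: two file systems, still two blocks
clone.write("a", b"3")      # the file systems diverge by one block
```

Creating the clone costs only a copy of the block map, and the pool grows by one block per divergent write rather than by a full copy of the data.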
Dynamic striping
ZFS stripes dynamically across all devices to maximize throughput. As additional devices are added to the zpool, the stripe width automatically expands to include them, so all disks in a pool are used and the write load is balanced across them.
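A minimal sketch of a stripe that widens as devices are added is shown below. It assumes plain round-robin placement, which is a simplification chosen for the example and not a description of the actual ZFS allocator; `StripedPool` and the device names are hypothetical.

```python
from collections import Counter

class StripedPool:
    """Places each new block on the next device in turn."""

    def __init__(self, devices):
        self.devices = list(devices)
        self._cursor = 0

    def add_device(self, name):
        # The stripe width grows immediately; later writes include the new device.
        self.devices.append(name)

    def place(self):
        """Return the device that should receive the next block."""
        device = self.devices[self._cursor % len(self.devices)]
        self._cursor += 1
        return device

pool = StripedPool(["disk0", "disk1"])
before = [pool.place() for _ in range(4)]       # alternates between two disks
pool.add_device("disk2")
after = Counter(pool.place() for _ in range(6)) # now spread over three disks
```

No restriping step is needed in this model: blocks already written stay where they are, and only new writes take advantage of the wider stripe.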
Variable block sizes
ZFS uses variable-sized blocks of up to 128 KB. The currently available code allows the administrator to tune the maximum block size used, as certain workloads do not perform well with large blocks. Automatic tuning to match workload characteristics is contemplated. If compression is enabled, variable block sizes are also used: if a block can be compressed to fit into a smaller block size, the smaller size is used on disk, so less capacity is used and I/O throughput improves overall (at the cost of increased CPU overhead for compression and decompression).
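The interaction between compression and block size can be sketched like this. The 128 KB maximum comes from the text; the 512-byte minimum, the power-of-two size steps, the use of zlib, and the function name `on_disk_block_size` are assumptions made for illustration.

```python
import os
import zlib

MAX_BLOCK = 128 * 1024    # maximum block size stated in the text
MIN_BLOCK = 512           # assumed smallest block size for this sketch

def on_disk_block_size(data, compress=True):
    """Return the smallest assumed block size that holds the (compressed) data."""
    if len(data) > MAX_BLOCK:
        raise ValueError("data exceeds the maximum block size")
    payload = zlib.compress(data) if compress else data
    if len(payload) >= len(data):
        payload = data            # compression did not help; store as-is
    size = MIN_BLOCK
    while size < len(payload):
        size *= 2                 # step up through power-of-two block sizes
    return size

zeros = bytes(MAX_BLOCK)                  # highly compressible
noise = os.urandom(MAX_BLOCK)             # effectively incompressible
small = on_disk_block_size(zeros)
full = on_disk_block_size(noise)
plain = on_disk_block_size(zeros, compress=False)
```

Compressible data ends up in a much smaller block, while incompressible data falls back to the uncompressed size, so enabling compression never makes the on-disk block larger in this model.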
Storage pools
ZFS is built on top of virtual storage pools, unlike traditional file systems, which require a separate volume manager. The storage pools are built from virtual devices (vdevs), which can be in raw device, RAID-1 or RAIDZ format. The storage capacity of the vdevs is then available to all of the file systems in the zpool, unless file system quotas or reservations limit the capacity to specific file systems. Multiple ZFS file systems can be created in a zpool and share the zpool's resources. To limit the amount of space a file system can consume, a quota can be applied to it; to guarantee that space is available to a file system, it can be granted a reservation.
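How quotas and reservations carve up a shared pool can be modelled with simple accounting. The sketch below is an assumption-laden toy: capacities are abstract units, and the `Zpool` and `FileSystem` classes and their methods are invented for the example and do not mirror the real administrative commands.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileSystem:
    used: int = 0
    quota: Optional[int] = None   # upper bound on space used, if set
    reservation: int = 0          # space guaranteed to this file system

    @property
    def committed(self):
        """Space the pool must set aside: what is used or reserved, whichever is larger."""
        return max(self.used, self.reservation)

class Zpool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.filesystems = {}

    def _uncommitted(self):
        return self.capacity - sum(fs.committed for fs in self.filesystems.values())

    def create(self, name, quota=None, reservation=0):
        if reservation > self._uncommitted():
            raise ValueError("not enough free space to honour the reservation")
        self.filesystems[name] = FileSystem(quota=quota, reservation=reservation)

    def available_to(self, name):
        fs = self.filesystems[name]
        # Shared free space plus whatever remains of this file system's own reservation.
        space = self._uncommitted() + (fs.committed - fs.used)
        if fs.quota is not None:
            space = min(space, fs.quota - fs.used)
        return space

    def write(self, name, nbytes):
        if nbytes > self.available_to(name):
            raise ValueError(f"{name}: out of space")
        self.filesystems[name].used += nbytes

pool = Zpool(capacity=100)
pool.create("home", quota=30)         # may never use more than 30 units
pool.create("db", reservation=50)     # always has at least 50 units available
pool.write("home", 30)
```

In this model the quota stops `home` at 30 units even though the pool has free space, while the reservation keeps 50 units out of reach of every file system except `db`.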
Additional capabilities
- Explicit I/O priority with deadline scheduling
- Globally optimal I/O sorting and aggregation
- Multiple independent prefetch streams with automatic length and stride detection
- Parallel, constant-time directory operations
ZFS appears to applications as a standard POSIX file system; no application changes are needed to store data in ZFS.