In computer science, in the context of data storage and transmission, serialization is the process of converting an object into a series of bytes so that it can be saved to a storage medium (such as a file or a memory buffer) or transmitted across a network connection. That series of bytes can later be used to re-create an object whose internal state is identical to that of the original (in effect, a clone).
Serializing an object is also called deflating or marshalling it. The opposite operation, extracting a data structure from a series of bytes, is deserialization (also called inflating or unmarshalling).
Uses
Serialization has a number of advantages. It provides a way to persist objects, to issue remote procedure calls (for example, in SOAP), to distribute objects between software components (as in COM and CORBA), and to detect changes in time-varying data.
For some of these features to be useful, architecture independence must be maintained. For example, for maximal use of distribution, a computer with a different hardware architecture should be able to reliably reconstruct a serialized data stream, regardless of endianness. This means that the simpler and faster procedure of directly copying the memory layout of the data structure cannot work reliably on all architectures. Serializing the data structure in an architecture-independent format avoids the problems of byte ordering, memory layout, and differing representations of data structures in different programming languages.
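As a rough illustration of an architecture-independent encoding, the following Python sketch fixes the byte order explicitly instead of copying the in-memory layout. The function names serialize_point and deserialize_point are made up for this example; only the standard struct module is assumed.

```python
import struct
import sys
from typing import Tuple

def serialize_point(x: int, y: int) -> bytes:
    """Encode two 32-bit signed integers in big-endian ("network") order."""
    return struct.pack(">ii", x, y)

def deserialize_point(data: bytes) -> Tuple[int, int]:
    """Decode the 8-byte big-endian encoding produced by serialize_point."""
    x, y = struct.unpack(">ii", data)
    return x, y

payload = serialize_point(1, -2)
print(payload.hex())               # 00000001fffffffe on every host
print(deserialize_point(payload))  # (1, -2)
print(sys.byteorder)               # the host's own byte order, for comparison
```

Because the format string names the byte order, a little-endian and a big-endian machine produce and accept the same eight bytes.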
One issue that comes up in many serialization schemes is that, because the encoding of the data is serial, extracting one part of the serialized data structure requires that the entire object be read and reconstructed. Even on a single machine, primitive pointer objects are too fragile to save, because the objects to which they point may be reloaded at a different location in memory. To deal with this, the serialization process includes a step called unswizzling (or pointer unswizzling), which replaces pointers with references based on name or position, and the deserialization process includes the reverse step, called pointer swizzling.
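The swizzling idea can be sketched in Python, where object references stand in for pointers. This is a minimal sketch under stated assumptions: the Node class and the unswizzle and swizzle functions are hypothetical names, and list positions serve as the position-based references.

```python
import json

class Node:
    """A node that refers to another node by direct in-memory reference."""
    def __init__(self, name):
        self.name = name
        self.next = None

def unswizzle(nodes):
    """Serialize nodes, replacing each direct reference with a list index."""
    index_of = {id(node): i for i, node in enumerate(nodes)}
    records = [
        {"name": n.name,
         "next": None if n.next is None else index_of[id(n.next)]}
        for n in nodes
    ]
    return json.dumps(records)

def swizzle(text):
    """Rebuild the nodes, then turn stored indices back into references."""
    records = json.loads(text)
    nodes = [Node(r["name"]) for r in records]
    for node, record in zip(nodes, records):
        if record["next"] is not None:
            node.next = nodes[record["next"]]
    return nodes

a, b = Node("a"), Node("b")
a.next, b.next = b, a                        # a cycle: a -> b -> a
restored = swizzle(unswizzle([a, b]))
print(restored[0].next.name)                 # b
print(restored[0].next.next is restored[0])  # True
```

The restored nodes live at new addresses, but the cycle is intact because only indices, never addresses, were written out.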
Since both serializing and deserializing can be driven from common code (for example, the Serialize function in Microsoft Foundation Classes), it is possible for the common code to do both at the same time, and thus (1) detect differences between the objects being serialized and their prior copies, and (2) provide the input for the next such detection. It is not necessary to actually build the prior copy, since differences can be detected on the fly. This is one way to understand the technique called differential execution. It is useful in programming user interfaces whose contents vary over time: graphical objects can be created, removed, altered, or made to handle input events without separate code for each of those tasks.
Consequences
Serialization, however, breaks the opacity of an abstract data type by potentially exposing private implementation details. To discourage competitors from making compatible products, publishers of proprietary software often keep the details of their programs' serialization formats a trade secret. Some deliberately obfuscate or even encrypt the serialized data.
Yet interoperability requires that applications be able to understand each other's serialization formats. Therefore, remote method call architectures such as CORBA define their serialization formats in detail and often provide methods of checking the consistency of a serialized stream when converting it back into an object.
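One simple way to check a stream's consistency is to frame the payload with its length and a checksum. The sketch below is an illustration only and is not how CORBA does it; the frame and unframe functions and the 8-byte header layout are assumptions made for this example.

```python
import struct
import zlib

def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC-32, both big-endian."""
    return struct.pack(">II", len(payload), zlib.crc32(payload)) + payload

def unframe(data: bytes) -> bytes:
    """Return the payload, or raise ValueError if the stream is inconsistent."""
    if len(data) < 8:
        raise ValueError("truncated header")
    length, checksum = struct.unpack(">II", data[:8])
    payload = data[8:]
    if len(payload) != length:
        raise ValueError("length mismatch")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload

good = frame(b"hello")
print(unframe(good))              # b'hello'

damaged = bytearray(good)
damaged[-1] ^= 0xFF               # flip the bits of the last payload byte
try:
    unframe(bytes(damaged))
except ValueError as exc:
    print("rejected:", exc)       # rejected: checksum mismatch
```

A checksum of this kind detects accidental corruption only; it offers no protection against deliberate tampering.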
Human-readable serialization
In the late 1990s, a push began to provide an alternative to the standard serialization protocols: the XML markup language was used to produce a human-readable, text-based encoding. Such an encoding can be useful for persistent objects that may be read and understood by humans, or communicated to other systems regardless of programming language. Its disadvantage is that it gives up the more compact byte-stream encoding, which is generally more practical. A future solution to this dilemma could be transparent compression schemes (see binary XML).
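A flat record can be written as XML and read back with Python's standard xml.etree.ElementTree module, as in this minimal sketch. The element name employee, the sample fields, and the helper names to_xml and from_xml are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

def to_xml(record: dict) -> str:
    """Serialize a flat dict as child elements of an <employee> element."""
    root = ET.Element("employee")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(text: str) -> dict:
    """Rebuild the dict; every value comes back as a string."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}

record = {"name": "Ada", "id": "7"}
text = to_xml(record)
print(text)             # <employee><name>Ada</name><id>7</id></employee>
print(from_xml(text))   # {'name': 'Ada', 'id': '7'}
```

The text is readable without any tooling, but it carries no type information, so a number stored this way is restored as a string unless the reader converts it.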
XML is today often used for asynchronous transfer of structured data between client and server in Ajax web applications. An alternative for this use case is JSON, a more lightweight text-based serialization format that uses JavaScript syntax but is supported in numerous other programming languages as well.
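For comparison, a JSON round trip with Python's standard json module takes only a few lines. The sample data is made up for this sketch.

```python
import json

data = {"name": "Peter", "friends": ["John", "Bryan"], "age": 30, "active": True}

# sort_keys makes the output deterministic, which helps when comparing streams.
text = json.dumps(data, sort_keys=True)
print(text)
# {"active": true, "age": 30, "friends": ["John", "Bryan"], "name": "Peter"}

restored = json.loads(text)
print(restored == data)   # True
```

Unlike the XML sketch style of encoding, JSON distinguishes numbers, booleans, strings, lists, and objects, so these basic types survive the round trip.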
Programming language support
Several object-oriented programming languages directly support object serialization (or object archival), either through syntactic sugar or through a standard interface.
Some of these programming languages are Ruby, Smalltalk, Python, Objective-C, Java, and the .NET family of languages.
There are also libraries available that add serialization support to languages that lack native support for it.
.NET Framework
In the .NET languages, classes can be serialized and deserialized by adding the Serializable attribute to the class.

    ' VB example
    <Serializable()> Class Employee
    End Class

    // C# example
    [Serializable]
    class Employee { }

If new members are added to a serializable class, they can be tagged with the OptionalField attribute so that streams written by previous versions of the class can be deserialized without error. This attribute affects only deserialization: it prevents the runtime from throwing an exception if a member is missing from the serialized stream. A member can also be marked with the NonSerialized attribute to indicate that it should not be serialized, which allows the details of that member to be kept secret. To modify the default deserialization (for example, to automatically initialize a member marked NonSerialized), the class must implement the IDeserializationCallback interface and define the IDeserializationCallback.OnDeserialization method.
Objects may be serialized in binary format for deserialization by other .NET applications. The framework also provides the SoapFormatter and XmlSerializer objects to support serialization in human-readable, cross-platform XML.
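The version-tolerance idea behind OptionalField can be sketched in Python rather than .NET, using the standard pickle module: a __setstate__ hook supplies a default for a member that older streams lack. The Employee class, its department field, and the default value are hypothetical, and this is an analogy, not the .NET mechanism itself.

```python
import pickle

class Employee:
    def __init__(self, name, department="unassigned"):
        self.name = name
        self.department = department   # member added in a later version

    def __setstate__(self, state):
        # Streams written by the older version have no "department" entry;
        # supply a default instead of failing or leaving it undefined.
        state.setdefault("department", "unassigned")
        self.__dict__.update(state)

# Simulate a stream from the older class version: an instance whose
# state contains only "name".
old = Employee.__new__(Employee)
old.__dict__ = {"name": "Ada"}
stream = pickle.dumps(old)

restored = pickle.loads(stream)
print(restored.name, restored.department)   # Ada unassigned
```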
Objective-C
In the Objective-C programming language, serialization (most commonly known as archival) is achieved by overriding the write: and read: methods of the Object root class. (Note: this describes the GNU runtime variant of Objective-C. In the NeXT-style runtime, the implementation is very similar.)
Example
The following example demonstrates two independent programs: a "sender", which takes the current time (as per time in the C standard library), archives it, and prints the archived form to the standard output, and a "receiver", which decodes the archived form, reconstructs the time, and prints it out.
When compiled, we get a sender program and a receiver program. If we just execute the sender program, we will get out a serialization that looks like: GNU TypedStream 1D@îC¡ (with a NULL character after the 1). If we pipe the two programs together, as sender | receiver, we get received 1089356705 showing the object was serialized, sent, and reconstructed properly. In essence, the sender and receiver programs could be distributed across a network connection, providing distributed object capabilities.
Sender.h

    #import <objc/Object.h>
    #import <time.h>
    #import <stdio.h>

    @interface Sender : Object
    {
        time_t current_time;
    }
    - (id) setTime;
    - (time_t) time;
    - (id) send;
    - (id) read: (TypedStream *) s;
    - (id) write: (TypedStream *) s;
    @end

Sender.m

    #import "Sender.h"

    @implementation Sender
    - (id) setTime
    {
        // Set the time
        current_time = time(NULL);
        return self;
    }

    - (time_t) time
    {
        return current_time;
    }

    - (id) write: (TypedStream *) stream
    {
        /*
         * Write the superclass to the stream.
         * We do this so we have the complete object hierarchy,
         * not just the object itself.
         */
        [super write:stream];

        /*
         * Write the current_time out to the stream.
         * time_t is a typedef for an integer.
         * The second argument, the string "i", specifies the types to write
         * as per the @encode directive.
         */
        objc_write_types(stream, "i", &current_time);
        return self;
    }

    - (id) read: (TypedStream *) stream
    {
        /*
         * Do the reverse of write: - reconstruct the superclass...
         */
        [super read:stream];

        /*
         * And reconstruct the instance variables from the stream...
         */
        objc_read_types(stream, "i", &current_time);
        return self;
    }

    - (id) send
    {
        // Convenience method to do the writing. We open stdout as our byte stream
        TypedStream *s = objc_open_typed_stream(stdout, OBJC_WRITEONLY);

        // Write the object to the stream
        [self write:s];

        // Finish up - close the stream.
        objc_close_typed_stream(s);
        return self;
    }
    @end

Sender.c

    #import "Sender.h"

    int main(void)
    {
        Sender *s = [Sender new];
        [s setTime];
        [s send];
        return 0;
    }

Receiver.m

    #import "Receiver.h"

    @implementation Receiver
    - (id) receive
    {
        // Open stdin as our stream for reading.
        TypedStream *s = objc_open_typed_stream(stdin, OBJC_READONLY);

        // Allocate memory for, and instantiate the object from reading the stream.
        t = [[Sender alloc] read:s];
        objc_close_typed_stream(s);
        return self;
    }

    - (id) print
    {
        fprintf(stderr, "received %d\n", (int) [t time]);
        return self;
    }
    @end

Receiver.c

    #import "Receiver.h"

    int main(void)
    {
        Receiver *r = [Receiver new];
        [r receive];
        [r print];
        return 0;
    }

Java
Java provides automatic serialization, which requires that the object be marked by implementing the java.io.Serializable interface. Implementing the interface marks the class as "okay to serialize," and Java then handles serialization internally. There are no serialization methods defined on the Serializable interface, but a serializable class can optionally define methods with certain special names and signatures that, if defined, will be called as part of the serialization/deserialization process. The language also allows the developer to override the serialization process more thoroughly by implementing another interface, the Externalizable interface, which includes two special methods that are used to save and restore the object's state.
There are three primary reasons why objects are not serializable by default and must implement the Serializable interface to access Java's serialization mechanism.
- Not all objects capture useful semantics in a serialized state. For example, a Thread object is tied to the state of the current JVM. There is no context in which a deserialized Thread object would maintain useful semantics.
- The serialized state of an object forms part of its class's compatibility contract. Maintaining compatibility between versions of serializable classes requires additional effort and consideration. Therefore, making a class serializable needs to be a deliberate design decision and not a default condition.
- Serialization allows access to non-transient private members of a class that are not otherwise accessible. Classes containing sensitive information (for example, a password) should not be serializable or externalizable.
The standard encoding method uses a simple translation of the fields into a byte stream. Primitives as well as non-transient, non-static referenced objects are encoded into the stream. Each object that is referenced by the serialized object and not marked as transient must also be serialized; if any object in the complete graph of non-transient object references is not serializable, serialization fails. The developer can influence this behavior by marking fields as transient, or by redefining the serialization for an object so that some portion of the reference graph is truncated and not serialized.
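The same two behaviors, failure when the object graph contains something unserializable and exclusion of selected fields, can be sketched in Python rather than Java with the standard pickle module. Here __getstate__ plays a role loosely analogous to the transient keyword; the Counter and Naive classes are hypothetical.

```python
import pickle
import threading

class Naive:
    def __init__(self):
        self.lock = threading.Lock()      # locks cannot be pickled

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def __getstate__(self):
        # Copy the state and drop the lock, much like a transient field.
        state = self.__dict__.copy()
        del state["lock"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()      # recreate what was left out

# One unserializable object in the graph makes the whole dump fail.
naive_failed = False
try:
    pickle.dumps(Naive())
except TypeError as exc:
    naive_failed = True
    print("failed:", exc)

c = Counter()
c.value = 5
clone = pickle.loads(pickle.dumps(c))
print(clone.value)                        # 5
```

The restored object gets a fresh lock, which is the part of the reference graph that was deliberately truncated.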
It is possible to serialize Java objects through JDBC and store them in a database. [1]
While Swing components do implement the Serializable interface, it is important to remember that they are not portable between different versions of the Java Virtual Machine. As such, a Swing component, or any component which inherits it, may be serialized to an array of bytes, but it is not guaranteed that this storage will be readable on another machine.
Example

    import java.io.*;
    import java.util.*;

    public class Serialize {
        /**
         * @param obj Object - The object that is saved.
         * @param filename String - The filename of the file it is saved to.
         */
        public static void save(Object obj, String filename) throws IOException {
            // try-with-resources closes the stream even if writeObject throws
            try (ObjectOutputStream objstream =
                     new ObjectOutputStream(new FileOutputStream(filename))) {
                objstream.writeObject(obj);
            }
        }

        /**
         * @param filename String - The filename for the file to be loaded
         */
        public static Object load(String filename)
                throws IOException, ClassNotFoundException {
            try (ObjectInputStream objstream =
                     new ObjectInputStream(new FileInputStream(filename))) {
                return objstream.readObject();
            }
        }

        /**
         * @param args String[] - the command line arguments
         */
        @SuppressWarnings("unchecked")
        public static void main(String[] args) {
            /*
             * Trying to use serialization to save a Vector to a file.
             * The vector is read from file, one entry is added and then the
             * vector is written back to file, overwriting the previous contents.
             */
            Vector<String> v;
            try {
                v = (Vector<String>) load("friends.ser");
                System.out.println("Read: " + v);
            } catch (Exception e) {
                System.out.println("File not found. Creating it.");
                v = new Vector<>();
                v.addElement("Peter");
                v.addElement("John");
                v.addElement("Bryan");
                System.out.println("Created: " + v);
            }
            v.addElement("Friend" + v.size());
            try {
                save(v, "friends.ser");
                System.out.println("Saved: " + v);
            } catch (Exception e) {
                System.out.print("Error saving file: ");
                e.printStackTrace();
            }
        }
    }

ColdFusion
ColdFusion allows data structures to be serialized to WDDX with the <cfwddx> tag.
OCaml
OCaml's standard library provides marshalling through the Marshal module. While OCaml programs are statically type-checked, uses of the Marshal module may break type guarantees, as there is no way to check whether an unmarshalled stream represents objects of the expected type.
Perl
Several Perl modules available from CPAN provide serialization mechanisms, including Storable and FreezeThaw.
Storable includes functions to serialize and deserialize Perl data structures to and from files or Perl scalars.

    use Storable;

    # Create a hash with some nested data structures
    my %struct = ( text => 'Hello, world!', list => [1, 2, 3] );

    # Serialize the hash into a file (store takes a reference)
    store \%struct, 'serialized';

    # Read the data back later; retrieve returns a hash reference
    my $newstruct = retrieve 'serialized';

In addition to serializing directly to files, Storable includes the freeze function, which returns a serialized copy of the data packed into a scalar, and thaw, which deserializes such a scalar. This is useful for sending a complex data structure over a network socket or storing it in a database.
When serializing structures with Storable, there are network-safe functions that always store their data in a format that is readable on any computer, at a small cost in speed. These functions are named nstore, nfreeze, etc. There are no "n" functions for deserializing these structures: the regular thaw and retrieve deserialize structures serialized with either the "n" functions or their machine-specific equivalents.
C
The tpl library supports serializing C data structures into an efficient, native binary representation. The serialized data can be reversibly converted to a human-readable XML representation.
C++
Boost Serialization, libs11n, and Sweet Persist are libraries that provide support for serialization from within the C++ language itself. They all integrate well with the STL. Boost Serialization and Sweet Persist support serialization in XML and binary formats. The libs11n library supports serialization to and from several text formats (including three flavors of XML) as well as to and from sqlite3 and MySQL databases. The Microsoft Foundation Class Library has comprehensive support for serialization to a binary format. It does not support the STL but does support its own containers.
Alternatively, XML data binding implementations, such as an XML Schema to C++ data binding compiler, provide support for serialization to and from XML by generating C++ source code from an intermediate specification (e.g. an XML schema).
Ebenezer Enterprises provides a middleware service that writes C++ marshaling code.
Python
Python implements serialization through the standard pickle module and, to a lesser extent, the older marshal module. Unlike pickle, marshal can serialize Python code objects.
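The difference between the two modules can be seen in a short sketch. The sample data and the expression being compiled are arbitrary choices for this example.

```python
import marshal
import pickle

# pickle round-trips ordinary data structures, including nested ones.
data = {"text": "Hello, world!", "list": [1, 2, 3]}
print(pickle.loads(pickle.dumps(data)) == data)   # True

# marshal can serialize a code object...
code = compile("6 * 7", "<example>", "eval")
restored = marshal.loads(marshal.dumps(code))
print(eval(restored))                             # 42

# ...which pickle refuses to do by default.
try:
    pickle.dumps(code)
except (TypeError, pickle.PicklingError) as exc:
    print("pickle refused:", exc)
```

The marshal format is specific to the Python version that wrote it, so it is not suited to long-term storage or exchange between installations.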
PHP
PHP implements serialization through the built-in serialize and unserialize functions. PHP can serialize any of its data types except resources (file pointers, sockets, etc.).
For objects (as of at least PHP 4) there are two "magic methods" that can be implemented within a class, __sleep() and __wakeup(), which are called from within serialize() and unserialize(), respectively, and can clean up and restore an object. For example, it may be desirable to close a database connection on serialization and restore the connection on unserialization; this functionality would be handled in these two magic methods. The __sleep() method also lets the object pick which properties are serialized.
REBOL
REBOL will serialize to a file (save/all) or to a string! (mold/all). Strings and files can be deserialized using the polymorphic load function.
Ruby
Ruby includes the standard module Marshal with two methods, dump and load, akin to the standard Unix utilities dump and restore. These methods serialize to the standard class String; that is, the serialized form is effectively a sequence of bytes.
Some objects can't be serialized (doing so would raise a TypeError exception):
- bindings,
- procedure objects,
- instances of class IO,
- singleton objects
If a class requires custom serialization (for example, it requires certain cleanup actions on dumping or restoring), this can be done by implementing two methods: _dump and _load. The instance method _dump should return a String object containing all the information necessary to reconstitute objects of this class and all referenced objects up to a maximum depth given as an integer parameter (a value of -1 implies that depth checking should be disabled). The class method _load should take a String and return an object of this class.

    class Klass
      def initialize(str)
        @str = str
      end

      def sayHello
        @str
      end
    end

    o = Klass.new("hello\n")
    data = Marshal.dump(o)
    obj = Marshal.load(data)
    obj.sayHello   #=> "hello\n"
Squeak Smalltalk
There are several ways in Squeak Smalltalk to serialize and store objects. The easiest and most used method is shown below. Other classes of interest in Squeak for serializing objects are SmartRefStream and ImageSegment.
To store a Dictionary (sometimes called a hash map in other languages) containing some nonsense data of varying types in a file named "data.obj":

    | data rr |
    data := Dictionary new.
    data at: #Meef put: 25;
        at: 23 put: 'Amanda';
        at: 'Small Numbers' put: #(0 1 2 3 four).
    rr := ReferenceStream fileNamed: 'data.obj'.
    rr nextPut: data; close.

To restore the Dictionary object stored in "data.obj" and bring up an inspector containing the data:

    | restoredData rr |
    rr := ReferenceStream fileNamed: 'data.obj'.
    restoredData := rr next.
    restoredData inspect.
    rr close.

Other Smalltalk dialects
Object serialization is not part of the ANSI Smalltalk specification. As a result, the code to serialize an object varies by Smalltalk implementation. The resulting binary data also varies. For instance, a serialized object created in Squeak Smalltalk cannot be restored in Ambrai Smalltalk. Consequently, applications that run on multiple Smalltalk implementations and rely on object serialization cannot share data between those implementations. These applications include the MinneStore object database [2] and some RPC packages. A solution to this problem is SIXX [3], a package for multiple Smalltalks that uses an XML-based format for serialization.
Lisp
Generally, a Lisp data structure can be serialized with the functions read and print. A variable foo containing, for example, a list of arrays would be printed by (print foo). Similarly, an object can be read back from a stream s by (read s). These two parts of the Lisp implementation are called the printer and the reader. The output of print is human-readable; it uses lists delimited by parentheses, for example: (4 2.9 "x" y).
In many dialects of Lisp, including Common Lisp, the printer cannot represent every type of data because it is not clear how to do so. In Common Lisp, for example, the printer cannot print CLOS objects in a form that the reader can read back. Instead, the programmer may write a method on the generic function print-object, which will be invoked when the object is printed. This is somewhat similar to the method used in Ruby.
Lisp code itself is written in the syntax of the reader, called read syntax. Most languages use separate parsers for code and data; Lisp uses only one. A file containing Lisp code may be read into memory as a data structure, transformed by another program, and then possibly executed or written out.
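A loose Python analogue of the printer and reader pairing is repr together with ast.literal_eval, which reads back only literal syntax. This sketch is an analogy rather than a description of any Lisp implementation; the Point class is hypothetical and stands in for an object the printer cannot represent readably.

```python
import ast

value = [4, 2.9, "x", {"y": (1, 2)}]

text = repr(value)                   # "print": data -> source-like text
restored = ast.literal_eval(text)    # "read": text -> data, literals only
print(text)                          # [4, 2.9, 'x', {'y': (1, 2)}]
print(restored == value)             # True

class Point:
    pass

# The default repr of an arbitrary object is not valid literal syntax,
# so the "reader" cannot reconstruct it.
try:
    ast.literal_eval(repr(Point()))
    readable = True
except (ValueError, SyntaxError):
    readable = False
print(readable)                      # False
```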
Haskell
In Haskell, serialization is supported for types that are instances of the Read and Show type classes. Every type that is an instance of Read defines a function that extracts the data from the string representation of the dumped data. The Show class, in turn, contains the function show, which generates a string representation of the object. The programmer need not define these functions explicitly: merely declaring a type to be deriving Read or deriving Show, or both, makes the compiler generate the appropriate functions.
See also
- Differential execution
- Hibernate (Java)
- Persistor