Serialization
Serialization refers to converting a Java object into binary content, essentially a byte[] array.
Why serialize a Java object? Because after serialization, the byte[] can be saved to a file or transmitted over a network, effectively storing the Java object in a file or sending it over the network.
With serialization comes deserialization, which converts binary content (i.e., a byte[] array) back into a Java object. With deserialization, a byte[] stored in a file or read from the network can be turned back into a Java object.
Let’s look at how to serialize a Java object.
To be serializable, a Java object must implement a special interface called java.io.Serializable. Its definition is as follows:
java
public interface Serializable {
}
The Serializable interface does not define any methods; it is an empty interface. Such an empty interface is called a "marker interface." A class that implements a marker interface simply marks itself as having a property without gaining any methods.
To convert a Java object into a byte[] array, we use ObjectOutputStream, which writes a Java object to a byte stream:
java
import java.io.*;
import java.util.Arrays;

public class Main {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream output = new ObjectOutputStream(buffer)) {
            // Write int:
            output.writeInt(12345);
            // Write String:
            output.writeUTF("Hello");
            // Write Object:
            output.writeObject(Double.valueOf(123.456));
        }
        System.out.println(Arrays.toString(buffer.toByteArray()));
    }
}
ObjectOutputStream can write primitive types such as int and boolean, String values (writeUTF() uses a modified UTF-8 encoding), and objects that implement the Serializable interface.
Because writing an object also records a significant amount of type information alongside the field values, the output is considerably larger than the raw data.
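To make the role of the marker interface concrete, here is a minimal sketch. The classes Point and PlainPoint and the helper canSerialize() are hypothetical names invented for this illustration; the only difference between the two classes is whether they implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MarkerDemo {
    // Hypothetical class that opts in to serialization.
    static class Point implements Serializable {
        int x = 1;
        int y = 2;
    }

    // Hypothetical class with the same fields that does not implement Serializable.
    static class PlainPoint {
        int x = 1;
        int y = 2;
    }

    // Returns true if writeObject() accepts the object, false if it rejects it.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream output = new ObjectOutputStream(new ByteArrayOutputStream())) {
            output.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new Point()));      // true
        System.out.println(canSerialize(new PlainPoint())); // false
    }
}
```

The two classes have identical fields, so the different outcome comes entirely from the marker interface.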
Deserialization
Conversely, ObjectInputStream reads Java objects from a byte stream:
java
try (ObjectInputStream input = new ObjectInputStream(...)) {
    int n = input.readInt();
    String s = input.readUTF();
    Double d = (Double) input.readObject();
}
Besides reading primitive and String values, ObjectInputStream can read objects: calling readObject() returns an Object, which you must cast to the specific type you expect.
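As a minimal sketch of a complete round trip, the following writes the same three values used earlier into an in-memory buffer and reads them back in the same order. The class name RoundTripDemo and its helper methods are invented for this illustration, and a ByteArrayInputStream stands in for whatever file or network stream a real program would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class RoundTripDemo {
    // Serializes an int, a String, and a Double into a byte[].
    static byte[] write() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream output = new ObjectOutputStream(buffer)) {
            output.writeInt(12345);
            output.writeUTF("Hello");
            output.writeObject(Double.valueOf(123.456));
        }
        return buffer.toByteArray();
    }

    // Reads the values back; the read order must match the write order.
    static String read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream input = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int n = input.readInt();
            String s = input.readUTF();
            Double d = (Double) input.readObject(); // readObject() returns Object
            return n + " " + s + " " + d;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(read(write())); // 12345 Hello 123.456
    }
}
```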
The readObject() method may throw the following exceptions:
- ClassNotFoundException: The corresponding class was not found.
- InvalidClassException: The class does not match.
ClassNotFoundException is common when a Java program on one computer serializes an object, such as a Person object, and sends it over the network to a Java program on another computer that does not define the Person class, so the object cannot be deserialized.
InvalidClassException occurs when, for example, a Person object was serialized with an age field of type int, but by the time it is deserialized the Person class has changed age to type long, so the serialized data and the class are incompatible.
To guard against incompatibility caused by class definition changes, Java serialization lets a class define a special static variable, serialVersionUID, that identifies the serialization "version" of the class. An IDE can usually generate it automatically. When fields are added or modified in an incompatible way, change the serialVersionUID value; data serialized with the old value is then rejected with an InvalidClassException instead of being loaded into a mismatched class:
java
public class Person implements Serializable {
    private static final long serialVersionUID = 2709425275741743919L;
}
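One way to check which version number the serialization runtime associates with a class is java.io.ObjectStreamClass. The sketch below assumes a hypothetical Person class with name and age fields (the fields and the helper versionOf() are invented for this illustration) and simply reads back the declared value.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionDemo {
    // Hypothetical class; the fields are assumptions made for this example.
    static class Person implements Serializable {
        private static final long serialVersionUID = 2709425275741743919L;
        String name;
        int age;
    }

    // Returns the serialVersionUID that the serialization runtime uses for a Serializable class.
    static long versionOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(versionOf(Person.class)); // 2709425275741743919
    }
}
```

If a class does not declare serialVersionUID, the runtime computes a default from the class structure, so that default can change whenever the class definition changes.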
Important Deserialization Characteristics
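During deserialization, the JVM constructs the Java object directly, without invoking the constructors of the serializable class. Therefore, any code inside those constructors will not execute during deserialization. (If the class has a non-serializable superclass, that superclass's no-argument constructor is still called.)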
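This behavior can be observed with a small experiment. In the sketch below, the Person class, the constructorCalls counter, and the copyViaSerialization() helper are all hypothetical names invented for this illustration; the counter increases only when the constructor actually runs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ConstructorDemo {
    // Counts how many times the Person constructor has run.
    static int constructorCalls = 0;

    // Hypothetical class used only for this illustration.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;

        Person(String name) {
            this.name = name;
            constructorCalls++;
        }
    }

    // Serializes the object to a byte[] and deserializes it into a new instance.
    static Person copyViaSerialization(Person person) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream output = new ObjectOutputStream(buffer)) {
            output.writeObject(person);
        }
        try (ObjectInputStream input =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Person) input.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Person original = new Person("Alice");
        Person copy = copyViaSerialization(original);
        System.out.println(copy.name);        // Alice
        System.out.println(copy != original); // true
        System.out.println(constructorCalls); // 1
    }
}
```

A second Person instance exists after the round trip, yet the counter still reads 1: the field values were restored from the byte stream rather than assigned by the constructor.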
Security
Java's serialization mechanism poses a security risk because it allows an instance to be created directly from a byte[] array without going through the constructor. A carefully crafted byte[] array, when deserialized, can execute specific Java code, leading to severe security vulnerabilities.
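One mitigation that the text above does not cover, shown here only as a hedged sketch, is to restrict which classes a stream may instantiate. It assumes Java 9 or later, where java.io.ObjectInputFilter is available; the class FilterDemo, its helper methods, and the chosen pattern are invented for this illustration and are not a complete defense.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class FilterDemo {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream output = new ObjectOutputStream(buffer)) {
            output.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Returns true if the object deserializes under the filter, false if the filter rejects it.
    static boolean isAccepted(Object obj, String pattern) throws IOException, ClassNotFoundException {
        byte[] data = serialize(obj);
        try (ObjectInputStream input = new ObjectInputStream(new ByteArrayInputStream(data))) {
            // The filter must be set before the first object is read.
            input.setObjectInputFilter(ObjectInputFilter.Config.createFilter(pattern));
            input.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed allow-list: classes in package java.lang only; everything else is rejected.
        String pattern = "java.lang.*;!*";
        System.out.println(isAccepted(Double.valueOf(1.5), pattern));      // true
        System.out.println(isAccepted(new ArrayList<String>(), pattern)); // false
    }
}
```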
In fact, Java’s built-in object-based serialization and deserialization mechanisms have both security and compatibility issues. A better approach is to serialize to a universal data format such as JSON, which outputs only primitive types (including String) and does not store any code-related information.
Summary
- Serializable Java objects must implement the java.io.Serializable interface; empty interfaces like Serializable are called "marker interfaces."
- During deserialization, constructors are not called, and a serialVersionUID can be set as a version number (not mandatory).
- Java's serialization mechanism is only suitable for Java. To exchange data with other languages, a universal serialization method, such as JSON, should be used.