Characters And Strings
In Java, characters and strings are two different types.
Character Type
The character type char is a primitive data type; its name is short for "character". A char holds one Unicode character:
java
char c1 = 'A';
char c2 = '中';
Because Java always represents characters in memory as Unicode, an English letter and a Chinese character are each stored in a single char, and each occupies two bytes. To see the Unicode code of a character, simply assign the char to an int:
java
int n1 = 'A'; // The Unicode encoding for the letter "A" is 65
int n2 = '中'; // The Unicode encoding of the Chinese character "中" is 20013
You can also write a character as the escape sequence \u followed by its Unicode code in four hexadecimal digits:
java
// Note that the code is written in hexadecimal:
char c3 = '\u0041'; // 'A', because hex 0041 = decimal 65
char c4 = '\u4e2d'; // '中', because hex 4e2d = decimal 20013
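The conversions described above can be checked with a small sketch. The helper names codeOf and charOf are made up for this illustration. Note that going from int back to char requires an explicit cast, while char to int does not.

```java
public class Main {
    // Hypothetical helper: a char widens to int automatically, yielding its Unicode code.
    static int codeOf(char c) {
        return c;
    }

    // Hypothetical helper: int to char is a narrowing conversion, so a cast is required.
    static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));                    // 65
        System.out.println(codeOf('\u4e2d'));               // 20013
        System.out.println(charOf(65));                     // A
        System.out.println('\u0041' == 'A');                // true
        System.out.println(Integer.toHexString('\u4e2d'));  // 4e2d
    }
}
```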
String Type
Unlike the char type, the string type String is a reference type. A string is written in double quotes "...". A string can hold zero or more characters:
java
String s = ""; // Empty string, containing 0 characters
String s1 = "A"; // contains one character
String s2 = "ABC"; // Contains 3 characters
String s3 = "中文 ABC"; // Contains 6 characters, including a space
Because double quotes "..." mark the beginning and end of a string, what happens if the string itself contains a " character? For example, in "abc"xyz" the compiler cannot tell whether the middle quote is part of the string or marks its end. In this case we need the escape character \:
java
String s = "abc\"xyz"; // Contains 7 characters: a, b, c, ", x, y, z
Because \ is the escape character, two backslashes \\ represent a single \ character:
java
String s = "abc\\xyz"; // Contains 7 characters: a, b, c, \, x, y, z
Common escape sequences include:
\" represents the character "
\' represents the character '
\\ represents the character \
\n represents a newline
\r represents a carriage return
\t represents a tab
\u#### represents the character with the given Unicode code
For example:
java
String s = "ABC\n\u4e2d\u6587"; // Contains 6 characters: A, B, C, \n, 中, 文
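One way to confirm the character counts given in the comments above is to ask each string for its length(). This is a minimal sketch; the constant names are chosen only for this illustration.

```java
public class Main {
    // Each escape sequence in the source text produces exactly one character in the string.
    static final String QUOTED = "abc\"xyz";          // a, b, c, ", x, y, z
    static final String SLASHED = "abc\\xyz";         // a, b, c, \, x, y, z
    static final String MIXED = "ABC\n\u4e2d\u6587";  // A, B, C, newline, two Chinese characters

    public static void main(String[] args) {
        System.out.println(QUOTED.length());             // 7
        System.out.println(SLASHED.length());            // 7
        System.out.println(MIXED.length());              // 6
        System.out.println(SLASHED.charAt(3) == '\\');   // true
        System.out.println(MIXED.charAt(3) == '\n');     // true
    }
}
```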
String Concatenation
The Java compiler gives strings special treatment: the + operator can join a string with other strings or with values of any other data type, which makes string handling much easier. For example:
java
public class Main {
    public static void main(String[] args) {
        String s1 = "Hello";
        String s2 = "world";
        String s = s1 + " " + s2 + "!";
        System.out.println(s); // Hello world!
    }
}
If you use + to join a string with a value of another data type, that value is automatically converted to a string first and then joined:
java
public class Main {
    public static void main(String[] args) {
        int age = 25;
        String s = "age is " + age;
        System.out.println(s); // age is 25
    }
}
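The automatic conversion has one consequence worth seeing in code: + is evaluated from left to right, so numbers that appear before the first string are added as numbers, while numbers after it are appended as text. The sketch below illustrates this. The describe method and its parameters are hypothetical and exist only for this example.

```java
public class Main {
    // Hypothetical helper: int, double and boolean values are all converted to text by +.
    static String describe(String name, int age, double height, boolean member) {
        return name + " is " + age + ", " + height + " m, member: " + member;
    }

    public static void main(String[] args) {
        System.out.println(describe("Ann", 25, 1.7, true)); // Ann is 25, 1.7 m, member: true

        // Evaluation runs left to right, so the position of the first string matters.
        System.out.println(1 + 2 + "3");    // 33  (1 + 2 is computed first, then "3" is appended)
        System.out.println("1" + 2 + 3);    // 123 (everything after "1" is appended as text)
        System.out.println("1" + (2 + 3));  // 15  (parentheses force the addition first)
    }
}
```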
Multiline String
Writing a multi-line string by joining pieces with + is very inconvenient:
java
String s = "first line \n"
         + "second line \n"
         + "end";
Since Java 15 (and as a preview feature in Java 13 and 14), a string can be written between """...""" to represent a multi-line string (a text block). For example:
java
public class Main {
    public static void main(String[] args) {
        String s = """
                   SELECT * FROM
                     users
                   WHERE id > 100
                   ORDER BY name DESC
                   """;
        System.out.println(s);
    }
}
The multi-line string above is actually 5 lines long, because there is a \n after the final DESC, so the last line is empty. If we don't want a \n at the end of the string, we need to write it like this:
java
String s = """
           SELECT * FROM
             users
           WHERE id > 100
           ORDER BY name DESC""";
Note also that the leading whitespace shared by all lines of a multi-line string is removed. That is:
java
String s = """
...........SELECT * FROM
........... users
...........WHERE id > 100
...........ORDER BY name DESC
...........""";
The spaces marked with . are removed.
If the lines of a multi-line string are indented irregularly, the removed spaces are as follows:
java
String s = """
.........  SELECT * FROM
.........    users
.........WHERE id > 100
.........  ORDER BY name DESC
.........  """;
That is, the line with the least leading whitespace determines how much is removed from every line.
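The two rules above (the trailing \n and the removal of common leading whitespace) can be verified by comparing a text block with the equivalent ordinary string literal. The following is a minimal sketch that assumes a JDK with text blocks as a standard feature (Java 15 or later). The constant names are chosen only for this illustration.

```java
public class Main {
    // Closing delimiter on its own line: the string ends with a newline.
    static final String WITH_NEWLINE = """
            SELECT * FROM
              users
            WHERE id > 100
            """;

    // Closing delimiter directly after the last word: no trailing newline.
    static final String WITHOUT_NEWLINE = """
            SELECT * FROM
              users
            WHERE id > 100""";

    public static void main(String[] args) {
        // The 12 spaces shared by every line are stripped; the 2 extra spaces before "users" remain.
        System.out.println(WITH_NEWLINE.equals("SELECT * FROM\n  users\nWHERE id > 100\n"));   // true
        System.out.println(WITHOUT_NEWLINE.equals("SELECT * FROM\n  users\nWHERE id > 100"));  // true
    }
}
```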
Immutable Properties
Besides being a reference type, Java's String has another important property: strings are immutable. Consider the following code:
java
public class Main {
    public static void main(String[] args) {
        String s = "hello";
        System.out.println(s); // hello
        s = "world";
        System.out.println(s); // world
    }
}
Look at the output. Has the string s changed? In fact, what changed is not the string but what the variable s points to.
When String s = "hello"; is executed, the JVM first creates the string "hello" and then points the string variable s to it:
          s
          │
          ▼
┌───┬───────────┬───┐
│   │  "hello"  │   │
└───┴───────────┴───┘
Next, when s = "world"; is executed, the JVM first creates the string "world" and then points the string variable s to it:
          s ──────────────┐
                          │
                          ▼
┌───┬───────────┬───┬───────────┬───┐
│   │  "hello"  │   │  "world"  │   │
└───┴───────────┴───┴───────────┴───┘
The original string "hello" still exists, but we can no longer reach it through the variable s. The immutability of strings therefore means that the content of a string cannot change. A variable, on the other hand, can point to the string "hello" at one moment and to "world" at another.
Now that you understand how reference types "point" to objects, try to explain the output of the following code:
java
// String is immutable
public class Main {
    public static void main(String[] args) {
        String s = "hello";
        String t = s;
        s = "world";
        System.out.println(t); // Is t "hello" or "world"?
    }
}
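One way to reason about the question is to follow each assignment as a change of "pointing". The sketch below wraps the same three statements in a hypothetical method, finalValueOfT, so the result can be inspected. It also shows, as a related illustration, that a method such as toUpperCase() returns a new string instead of modifying the original.

```java
public class Main {
    // Hypothetical helper that mirrors the snippet above and returns what t refers to at the end.
    static String finalValueOfT() {
        String s = "hello";
        String t = s;   // t now points to the same "hello" object as s
        s = "world";    // only s is re-pointed; t still points to "hello"
        return t;
    }

    public static void main(String[] args) {
        System.out.println(finalValueOfT()); // hello

        // String methods never change the string they are called on; they return a new one.
        String original = "hello";
        String upper = original.toUpperCase();
        System.out.println(original); // hello
        System.out.println(upper);    // HELLO
    }
}
```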
Null Value
A variable of a reference type can hold the empty value null, which means that nothing is there: the variable does not point to any object. For example:
java
String s1 = null; // s1 is null
String s2 = s1; // s2 is also null
String s3 = ""; // s3 points to the empty string, not null
Be careful to distinguish the empty value null from the empty string "". The empty string is a valid string object and is not equal to null.
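The practical difference shows up as soon as a method is called: the empty string is an object with length 0, whereas null has no object behind it, so calling a method on it throws a NullPointerException. This is a minimal sketch, and the helper isEmptyButNotNull is a made-up name for this illustration.

```java
public class Main {
    // Hypothetical helper: true only for a real string object that contains zero characters.
    static boolean isEmptyButNotNull(String s) {
        return s != null && s.isEmpty();
    }

    public static void main(String[] args) {
        String s1 = null; // points to no object
        String s3 = "";   // points to a valid string object with no characters

        System.out.println(s3.length());           // 0
        System.out.println(isEmptyButNotNull(s3)); // true
        System.out.println(isEmptyButNotNull(s1)); // false

        try {
            s1.length(); // there is no object to call the method on
        } catch (NullPointerException e) {
            System.out.println("Calling a method on null throws NullPointerException");
        }
    }
}
```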
Exercise
Treat the following int values as the Unicode codes of characters and combine them into a string:
java
public class Main {
    public static void main(String[] args) {
        // Treat the following int values as the Unicode codes of characters and combine them into a string:
        int a = 72;
        int b = 105;
        int c = 65281;
        // FIXME:
        String s = a + b + c;
        System.out.println(s);
    }
}
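The FIXME line does not compile as written: a + b + c adds three int values, and the resulting int (65458) cannot be assigned to a String. One possible fix, offered as a sketch and not as the only answer, is to cast each value to char and start the expression with an empty string so that + performs concatenation. The method name join is hypothetical.

```java
public class Main {
    // Hypothetical helper: casts each code to char; the leading "" makes + concatenate text.
    static String join(int a, int b, int c) {
        return "" + (char) a + (char) b + (char) c;
    }

    public static void main(String[] args) {
        String s = join(72, 105, 65281);
        System.out.println(s);          // "Hi" followed by the full-width exclamation mark U+FF01
        System.out.println(s.length()); // 3
    }
}
```

Without the leading "", the expression (char) a + (char) b + (char) c would still be numeric addition, because adding two char values produces an int.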
Summary
Java's character type char is a primitive type, and the string type String is a reference type;
Variables of primitive types "hold" a value, while variables of reference types "point to" an object;
Variables of reference types can be null;
Be careful to distinguish the empty value null from the empty string "".