Appearance
Set
We know that a Map
is used to store key-value mappings. The objects used as keys must be unique and must correctly override the equals()
and hashCode()
methods.
If we only need to store unique keys without the need for a mapping value, we can use a Set
.
A Set
is used to store a collection of unique elements and primarily provides the following methods:
- Adding an element to
Set<E>
:boolean add(E e)
- Removing an element from
Set<E>
:boolean remove(Object e)
- Checking if an element is contained:
boolean contains(Object e)
Let’s look at some simple examples:
java
import java.util.*;
public class Main {
public static void main(String[] args) {
Set<String> set = new HashSet<>();
System.out.println(set.add("abc")); // true
System.out.println(set.add("xyz")); // true
System.out.println(set.add("xyz")); // false, adding failed because the element already exists
System.out.println(set.contains("xyz")); // true, the element exists
System.out.println(set.contains("XYZ")); // false, the element does not exist
System.out.println(set.remove("hello")); // false, removal failed because the element does not exist
System.out.println(set.size()); // 2, there are two elements in total
}
}
In fact, a Set
is equivalent to a Map
that only stores keys and does not store values. We often use a Set
to remove duplicate elements.
Since the elements placed in a Set
are similar to the keys in a Map
, they must correctly implement the equals()
and hashCode()
methods; otherwise, the element cannot be correctly placed in the Set
.
The most commonly used implementation class of Set
is HashSet
. In fact, HashSet
is simply a straightforward wrapper around HashMap
, and its core code is as follows:
java
public class HashSet<E> implements Set<E> {
// Holds a HashMap:
private HashMap<E, Object> map = new HashMap<>();
// Value to put in HashMap:
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT) == null;
}
public boolean contains(Object o) {
return map.containsKey(o);
}
public boolean remove(Object o) {
return map.remove(o) == PRESENT;
}
}
The Set
interface does not guarantee order, whereas the SortedSet
interface guarantees that elements are ordered:
HashSet
is unordered because it implements theSet
interface but does not implement theSortedSet
interface.TreeSet
is ordered because it implements theSortedSet
interface.
Here’s a diagram to illustrate:
┌───┐
│Set│
└───┘
▲
┌────┴─────┐
│ │
┌───────┐ ┌─────────┐
│HashSet│ │SortedSet│
└───────┘ └─────────┘
▲
│
┌─────────┐
│ TreeSet │
└─────────┘
Now let’s look at the output of HashSet
:
java
import java.util.*;
public class Main {
public static void main(String[] args) {
Set<String> set = new HashSet<>();
set.add("apple");
set.add("banana");
set.add("pear");
set.add("orange");
for (String s : set) {
System.out.println(s);
}
}
}
Note that the output order is neither the order of addition nor the sorted order of the String
. In different versions of the JDK, this order may also vary.
If we replace HashSet
with TreeSet
, the output will be ordered, following the sorting order of the elements:
java
import java.util.*;
public class Main {
public static void main(String[] args) {
Set<String> set = new TreeSet<>();
set.add("apple");
set.add("banana");
set.add("pear");
set.add("orange");
for (String s : set) {
System.out.println(s);
}
}
}
The requirements for using TreeSet
are the same as those for using TreeMap
; the elements added must correctly implement the Comparable
interface. If the Comparable
interface is not implemented, a Comparator
object must be passed when creating the TreeSet
.
Exercise
In a chat application, when the sender sends a message and encounters a network timeout, it will automatically resend, which means the receiver might receive duplicate messages. When displaying to the user, it is necessary to remove duplicates first. Please practice using a Set
to remove duplicate messages:
java
import java.util.*;
public class Main {
public static void main(String[] args) {
List<Message> received = List.of(
new Message(1, "Hello!"),
new Message(2, "Have you received your salary?"),
new Message(2, "Have you received your salary?"),
new Message(3, "Where to eat?"),
new Message(3, "Where to eat?"),
new Message(4, "Bye")
);
List<Message> displayMessages = process(received);
for (Message message : displayMessages) {
System.out.println(message.text);
}
}
static List<Message> process(List<Message> received) {
// TODO: Remove duplicate messages by sequence
return received;
}
}
class Message {
public final int sequence;
public final String text;
public Message(int sequence, String text) {
this.sequence = sequence;
this.text = text;
}
}
Summary
Set
is used to store a collection of unique elements:- The elements placed in a
HashSet
must meet the same requirements as keys in aHashMap
. - The elements placed in a
TreeSet
must meet the same requirements as keys in aTreeMap
.
- The elements placed in a
- Using
Set
can help remove duplicate elements. - Traversing a
SortedSet
iterates according to the sorted order of the elements, and a custom sorting algorithm can also be defined.