Set
A Set<E> is a collection that contains no duplicate elements — adding an element already present is a silent no-op. Sets are the natural choice for membership tests, de-duplication, and mathematical set algebra (union, intersection, difference). The three core implementations differ chiefly in ordering and performance.
HashSet vs LinkedHashSet vs TreeSet
| Implementation | Backing structure | Iteration order | add/contains/remove | null allowed |
|---|---|---|---|---|
HashSet | HashMap | Unordered (hash) | O(1) average | One null |
LinkedHashSet | LinkedHashMap | Insertion order | O(1) average | One null |
TreeSet | Red-black tree | Sorted (natural / comparator) | O(log n) | No null* |
* TreeSet rejects null because it must compare elements.
Choose
HashSetfor raw speed,LinkedHashSetwhen you need predictable insertion order, andTreeSetwhen you need elements kept sorted or range queries (first,last,headSet,subSet).
Set<String> hash = new HashSet<>(List.of("c", "a", "b"));
Set<String> linked = new LinkedHashSet<>(List.of("c", "a", "b"));
Set<String> tree = new TreeSet<>(List.of("c", "a", "b"));
System.out.println("hash : " + hash);
System.out.println("linked : " + linked);
System.out.println("tree : " + tree);
Output:
hash : [a, b, c]
linked : [c, a, b]
tree : [a, b, c]
The HashSet order is incidental — never depend on it. The LinkedHashSet preserves the order you inserted; the TreeSet sorts.
Uniqueness via equals and hashCode
A HashSet/LinkedHashSet decides duplication using both hashCode() and equals(). Two objects collide into the same bucket when their hash codes match, then equals confirms identity. If you override one without the other, your set will misbehave.
record Point(int x, int y) {} // record auto-generates equals + hashCode
Set<Point> pts = new HashSet<>();
pts.add(new Point(1, 2));
pts.add(new Point(1, 2)); // equal -> ignored
System.out.println(pts.size()); // 1
Output:
1
If Point were a plain class without overriding equals/hashCode, the two instances would be treated as distinct and size() would be 2. For a TreeSet, uniqueness is decided by compareTo/Comparator returning 0 — not by equals — so keep your comparator consistent with equals.
The contract: if
a.equals(b)thena.hashCode() == b.hashCode(). The reverse need not hold (collisions are allowed). Violating this breaks every hash-based collection.
Common Operations and Set Algebra
Set<Integer> a = new HashSet<>(Set.of(1, 2, 3, 4));
Set<Integer> b = Set.of(3, 4, 5, 6);
Set<Integer> union = new HashSet<>(a);
union.addAll(b); // {1,2,3,4,5,6}
Set<Integer> intersect = new HashSet<>(a);
intersect.retainAll(b); // {3,4}
Set<Integer> diff = new HashSet<>(a);
diff.removeAll(b); // {1,2}
System.out.println(union + " " + intersect + " " + diff);
Output:
[1, 2, 3, 4, 5, 6] [3, 4] [1, 2]
TreeSet additionally offers navigational methods: ceiling, floor, higher, lower, headSet, tailSet, and subSet for efficient range work.
Best Practices
- Default to
HashSet; upgrade toLinkedHashSetonly when deterministic order matters. - Use a
TreeSetfor sorted iteration or range queries — not by sorting aHashSetrepeatedly. - Always override
equalsandhashCodetogether (records and Lombok@EqualsAndHashCodedo this for you). - Never mutate fields used by
hashCodeafter inserting an element — the entry becomes unreachable. - Use
Set.of(...)for immutable, duplicate-free constants; it throws if you pass duplicates.