The Set
interface extends the Collection
interface and, by definition, forbids duplicates within the
collection. All the original methods are present and no new
methods are introduced. The concrete Set
implementation
classes rely on the equals()
method of the object added
to check for equality.
HashSet and TreeSet classes
The Collections Framework provides two general-purpose
implementations of the Set
interface: HashSet
and TreeSet
. More often than not, you will use a HashSet
for storing your duplicate-free collection. For efficiency,
objects added to a HashSet
need to implement the hashCode()
method in a manner that properly distributes the hash codes.
While most system classes override the default hashCode()
implementation in Object
, when creating your own classes
to add to a HashSet
remember to override hashCode()
.
The TreeSet
implementation is useful when you need to
extract elements from a collection in a sorted manner. In order
to work properly, elements added to a TreeSet
must be sortable. The Collections Framework
adds support for Comparable
elements
and will be covered in detail in "Comparable interface" in Sorting. For now, just assume a tree
knows how to keep elements of the java.lang
wrapper
classes sorted. It is
generally faster to add elements to a HashSet
, then
convert the collection to a TreeSet
for sorted
traversal.
To optimize HashSet
space usage, you can tune the
initial capacity and load factor. The TreeSet
has no
tuning options, as the tree is always balanced, ensuring log(n)
performance for insertions, deletions, and queries.
Both HashSet
and TreeSet
implement the Cloneable
interface.
Set usage example
To demonstrate the use of the concrete Set
classes,
the following program creates a HashSet
and adds a group
of names, including one name twice. The program then prints out
the list of names in the set, demonstrating the duplicate name
isn't present. Next, the program treats the set as a TreeSet
and displays the list sorted.
import java.util.*;
public class SetExample {
public static void main(String args[]) {
Set set = new HashSet();
set.add("Bernadine");
set.add("Elizabeth");
set.add("Gene");
set.add("Elizabeth");
set.add("Clara");
System.out.println(set);
Set sortedSet = new TreeSet(set);
System.out.println(sortedSet);
}
}
Running the program produces the following output. Notice that
the duplicate entry is only present once, and the second list
output is sorted alphabetically.
[Gene, Clara, Bernadine, Elizabeth]
[Bernadine, Clara, Elizabeth, Gene]
AbstractSet class
The AbstractSet
class overrides the equals()
and hashCode()
methods to ensure two equal sets return
the same hash code. Two sets are equal if they are the same size
and contain the same elements. By definition, the hash code for a
set is the sum of the hash codes for the elements of the set.
Thus, no matter what the internal ordering of the sets, two equal
sets will report the same hash code.