
Java Collections In Depth

Last modified: April 26, 2018  

You've already seen how we use lambda expressions to make working with
collections easier. In this tutorial, we'll look at Java collections in depth.

We'll introduce Java collections by explaining what they are, how the collection
hierarchy is organized, and the general benefits of using them. We'll then examine
each collection interface in turn, with in-depth explanations and examples for
each.

What are collections in Java?


A collection is an object that groups elements together into a single unit.
Collections make it easy to group similar elements together and perform
iterations and aggregations on groups of elements.

A basic example of a collection in Java:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        // ArrayList is a resizable-array implementation of List
        List<String> myList = new ArrayList<String>();
        myList.add("hello");
        myList.add("world");

        // for-each visits the elements in list order
        for (String item : myList)
            System.out.println(item);
    }
}

If you run this code, you'll get the following output:


hello

world

This example demonstrates how we use a collection to iterate over a list of
strings. We've created an ArrayList which implements the List interface.

Notice how we use for(String item : myList) to iterate over our
collection. In this example, we simply print out each item
via System.out.println(item).

The Collections Framework


In the last example, we used the List interface provided by the Collections
framework. It turns out that this List interface is one of several core
collection interfaces. This hierarchy of generic interfaces is the heart and soul of
the Java Collections framework...

Notice how Set, List, Queue, and Deque all extend the Collection interface.
We'll be looking at each one of these in more detail.

The Collection interface

The Collection interface represents a group of elements. It provides generic
functionality for converting collections of different types, returning size (or
number of elements), modifying its members, and more.

The Collection interface is the root of the collection hierarchy. With the
exception of the Map interface, every collection we'll look at in this tutorial
extends the Collection interface.

Basic Functionality of Java Collections:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        Collection<String> hashStrings = new HashSet<String>();
        hashStrings.add("hello");
        hashStrings.add("goodbye");

        // easily convert between different types of collections
        List<String> arrayStrings = new ArrayList<String>(hashStrings);

        arrayStrings.size();             // returns 2
        arrayStrings.isEmpty();          // returns false
        arrayStrings.contains("hello");  // returns true
        arrayStrings.contains("world");  // returns false

        // adds an element to the collection
        arrayStrings.add("another string");

        // removes every element that is also in hashStrings
        arrayStrings.removeAll(hashStrings);

        for (String item : arrayStrings)
            System.out.println(item);    // prints 'another string'
    }
}

Look closely at this example. It demonstrates a lot of the common
functionality included with the Collection interface.

Collections are generic, meaning they are parameterized by the data type of their
elements. This makes it easy to convert between different types of collections
whose elements have the same data type. Notice how we first define a HashSet (an
implementation of the Set interface) collection hashStrings. We then create a
separate ArrayList collection arrayStrings from the original hashStrings
collection.

This is possible because of the conversion constructor: by convention, every
general-purpose collection implementation has a constructor that takes a
Collection argument. Since hashStrings and arrayStrings both specify <String>, we
can create different types of collections from the same set of elements.

PLEASE NOTE: This does not mean we can convert the data type of the
collection's elements themselves. For example, we would not be able to create
an ArrayList of <Integer> using the original hashStrings collection because
it specifies a <String> data type.
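To make this restriction concrete, here is a minimal sketch. The class name ConversionDemo and its two helper methods are illustrative and not part of the original example. Copying between collection types reuses the conversion constructor, while changing the element type means transforming each element explicitly (String::length is an arbitrary choice for the transformation).

```java
import java.util.*;
import java.util.stream.*;

public class ConversionDemo {

    // Same element type: the conversion constructor copies the elements as-is.
    static List<String> toList(Collection<String> source) {
        return new ArrayList<String>(source);
    }

    // Different element type: each element has to be transformed explicitly.
    static List<Integer> toLengths(Collection<String> source) {
        return source.stream()
                     .map(String::length)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Collection<String> hashStrings = new HashSet<String>();
        hashStrings.add("hello");
        hashStrings.add("goodbye");

        System.out.println(toList(hashStrings).size());   // prints 2

        // new ArrayList<Integer>(hashStrings) would not compile

        List<Integer> lengths = toLengths(hashStrings);
        Collections.sort(lengths);
        System.out.println(lengths);                      // prints [5, 7]
    }
}
```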

Also notice how we add another element to our collection
via arrayStrings.add("another string"). We then remove all of the
original elements copied
from hashStrings via arrayStrings.removeAll(hashStrings).

This leaves us with just the element we added via the add() method.

Iterating over collections

There are several ways to iterate over a collection in Java...

1) Use for-each

We've already seen how for-each can be used to iterate over a collection:

Using for-each to loop through a collection:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<String> arrayStrings = new ArrayList<String>();
        arrayStrings.add("hello");
        arrayStrings.add("goodbye");

        // for-each works on anything that implements Iterable
        for (String item : arrayStrings)
            System.out.println(item);
    }
}

If you run this code, you'll see:

hello

goodbye

2) Use an Iterator

The Collection interface extends the Iterable interface. This makes it easy to
create an Iterator for any given collection:

Using an iterator to loop through a collection:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<String> arrayStrings = new ArrayList<String>();
        arrayStrings.add("hello");
        arrayStrings.add("goodbye");

        // keep only "hello"; strings are compared with equals(),
        // because != would compare object references
        for (Iterator<String> it = arrayStrings.iterator(); it.hasNext(); )
            if (!it.next().equals("hello"))
                it.remove();
    }
}

Notice how we create an Iterator by calling the iterator() method on
our arrayStrings collection. The hasNext() method returns true if the
iteration has more elements, and the next() method returns the next element in
the iteration. The remove() method removes the element most recently returned
by next() from the collection.

When should I use an Iterator instead of for-each?

Use an iterator when you want to remove elements from the collection while iterating. Iterators are
good for filtering data and iterating over multiple collections in parallel. The for-each construct
doesn't support removing elements because it hides the iterator used to implement it, so you
cannot call remove().
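The difference can be sketched as follows. RemovalDemo and its helper methods are hypothetical names used only for illustration. The sketch assumes an ArrayList, whose fail-fast iterator throws ConcurrentModificationException on a best-effort basis when the list is modified behind its back (it does so here because the removed element is the first of three).

```java
import java.util.*;

public class RemovalDemo {

    // Removes every element equal to target through the iterator.
    static List<String> removeWithIterator(List<String> source, String target) {
        List<String> copy = new ArrayList<String>(source);
        for (Iterator<String> it = copy.iterator(); it.hasNext(); ) {
            if (it.next().equals(target))
                it.remove();
        }
        return copy;
    }

    // Tries the same removal inside for-each; returns true if it failed.
    static boolean forEachRemovalFails(List<String> source, String target) {
        List<String> copy = new ArrayList<String>(source);
        try {
            for (String item : copy) {
                if (item.equals(target))
                    copy.remove(item);   // modifies the list behind the hidden iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("hello", "how are you", "goodbye");

        System.out.println(removeWithIterator(strings, "hello"));
        // prints [how are you, goodbye]

        System.out.println(forEachRemovalFails(strings, "hello"));
        // prints true
    }
}
```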

3) Use streams

Oracle recommends using streams to iterate over collections in JDK 8 or later.
Such iterations often leverage lambda expressions, which you can read more
about here.

Using streams to loop through a collection:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<String> arrayStrings = new ArrayList<String>();
        arrayStrings.add("hello");
        arrayStrings.add("goodbye");

        // skip "goodbye"; compare strings with equals() rather than !=
        arrayStrings.stream()
                    .filter(s -> !s.equals("goodbye"))
                    .forEach(s -> System.out.println(s));
    }
}

Notice how we call stream() to obtain a stream of
our arrayStrings collection. We can then call aggregate operations
like filter() and forEach() with lambda expressions to iterate over the
collection in a more expressive way.

Remember that the Collection interface is at the root of the collections
hierarchy. With the exception of Map, all of the interfaces we'll discuss next
extend the Collection interface.

Let's take a closer look at the other interfaces that make up the Collections
framework...

The List Interface


A List is an ordered Collection. A List can contain duplicate elements (unlike
some other collections we will look at).

Remember that List extends the generic Collection interface. It can do all the
same things a regular Collection can do with some added functionality,
including:

1) Positional access

A list can manipulate elements based on their numerical position:

Examples of positional access with lists

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<String> arrayStrings = new ArrayList<String>();

        // regular add method shared by all collection interfaces
        arrayStrings.add("hello");
        arrayStrings.add("goodbye");

        // inserts element at position 1
        arrayStrings.add(1, "how are you?");

        // removes element at position 0
        arrayStrings.remove(0);

        arrayStrings.stream()
                    .forEach(s -> System.out.println(s));
    }
}

Run this code and you will get the following output:

how are you?

goodbye

Notice how we can optionally specify a position as the first argument
to arrayStrings.add(). This allows us to add an element at a specified index.

We can do the same thing with remove(). Notice how we pass in an index
argument 0 rather than an Object.

Positional access applies to other methods like set(), get(), addAll(), etc.
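As a rough sketch of those other positional methods, the following example uses get(), set(), and the indexed form of addAll(). The class name PositionalDemo and the build() helper are made up for illustration.

```java
import java.util.*;

public class PositionalDemo {

    static List<String> build() {
        List<String> list = new ArrayList<String>();
        list.add("hello");
        list.add("goodbye");

        // set() replaces the element at index 1 (and returns the old element)
        list.set(1, "see you");

        // addAll(index, ...) inserts a whole collection starting at index 1
        list.addAll(1, Arrays.asList("how", "are", "you?"));

        return list;
    }

    public static void main(String[] args) {
        List<String> list = build();

        System.out.println(list.get(0));
        // prints hello

        System.out.println(list);
        // prints [hello, how, are, you?, see you]
    }
}
```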

2) Search

A list can search for a specified object and return its numerical position:

Examples of searching with lists

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<String> arrayStrings = new ArrayList<String>();
        arrayStrings.add("hello");
        arrayStrings.add("how are you");
        arrayStrings.add("goodbye");
        arrayStrings.add("hello");

        arrayStrings.indexOf("hello");       // returns 0
        arrayStrings.lastIndexOf("hello");   // returns 3
    }
}

Notice how we've added hello twice to our arrayStrings list.
While indexOf() returns the first index that matches the
argument, lastIndexOf() returns the last index that matches the argument.
Both return -1 if the list does not contain the argument.

3) Iteration

We already saw how the Collection interface provides an iterator() method.
The List interface provides an additional ListIterator which allows you to:

 traverse the list in either direction
 modify the list during iteration
 obtain current position of the iterator

Example of using the ListIterator:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<Character> arrayChars = new ArrayList<Character>();
        arrayChars.add('a');
        arrayChars.add('b');
        arrayChars.add('c');

        ListIterator<Character> litr = arrayChars.listIterator();

        // walk forward to the end of the list
        while (litr.hasNext()) {
            System.out.println(litr.next());
        }

        // walk backward, replacing each element just returned by previous()
        while (litr.hasPrevious()) {
            System.out.println(litr.previous());
            litr.set('a');
        }

        arrayChars.stream()
                  .forEach(System.out::println);
    }
}

Run this code and you will get the following output:

a
b
c
c
b
a
a
a
a

Notice how we create a new instance of ListIterator by
calling arrayChars.listIterator().

In addition to hasNext() and next(), the ListIterator also has access
to hasPrevious() and previous() methods. These work in the same manner
as hasNext() and next(), just in the opposite direction.

Also notice how we are able to modify the list during the iteration. By
calling litr.set('a'), we replace each element returned by previous() with 'a'.
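The third capability, obtaining the iterator's current position, can be sketched with nextIndex(). The class ListIteratorIndexDemo and its find() helper are hypothetical names chosen for this illustration.

```java
import java.util.*;

public class ListIteratorIndexDemo {

    // Returns the index of the first element equal to target, or -1 if absent.
    static int find(List<Character> list, char target) {
        ListIterator<Character> litr = list.listIterator();
        while (litr.hasNext()) {
            int index = litr.nextIndex();   // index of the element next() is about to return
            if (litr.next() == target)
                return index;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Character> arrayChars = new ArrayList<Character>();
        arrayChars.add('a');
        arrayChars.add('b');
        arrayChars.add('c');

        System.out.println(find(arrayChars, 'b'));   // prints 1
        System.out.println(find(arrayChars, 'z'));   // prints -1
    }
}
```

The companion method previousIndex() reports the index that previous() would return.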
4) Range-View Operations

The List interface also defines range-view operations:

Example of using the subList() method:

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<Character> arrayChars = new ArrayList<Character>();
        arrayChars.add('a');
        arrayChars.add('b');
        arrayChars.add('c');

        // easily operate on sublist only: index 0 inclusive, index 1 exclusive
        arrayChars.subList(0, 1).clear();

        arrayChars.stream()
                  .forEach(System.out::println);
    }
}

If you run this code, the output will be:

b
c

Notice how we use the subList() method to specify a sub range of
our arrayChars list. Please note that the first argument specified in the range
is inclusive while the last is exclusive. This is why the first element 'a' is
cleared, but not 'b'.
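The term "view" matters here: subList() does not copy elements, so changes made through the sub-list show up in the backing list. The following sketch illustrates this with set(); the class SubListDemo and its upperCaseMiddle() helper are invented for the example.

```java
import java.util.*;

public class SubListDemo {

    static List<Character> upperCaseMiddle() {
        List<Character> chars =
                new ArrayList<Character>(Arrays.asList('a', 'b', 'c', 'd'));

        // view of indexes 1 (inclusive) to 3 (exclusive): 'b' and 'c'
        List<Character> middle = chars.subList(1, 3);

        // writes through the view change the backing list
        middle.set(0, 'B');
        middle.set(1, 'C');

        return chars;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseMiddle());
        // prints [a, B, C, d]
    }
}
```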

List Implementations
There are several implementations of lists worth discussing:

1) ArrayList

All of the examples so far in this tutorial have used the ArrayList
implementation of the List interface. This is because the ArrayList is the most
commonly used and is usually the best option.

An ArrayList is implemented as a resizable array.

2) LinkedList

A LinkedList is implemented as a doubly linked list. Every element in a
LinkedList is wrapped in a node object. This makes a LinkedList better for
things like add() and remove() at the ends of the list or through an iterator,
but worse for positional get() and set(), which have to walk the list.

3) Vector

The Vector class implements a growable array of objects. Unlike an ArrayList,
a Vector is synchronized.

There are a few special-purpose List implementations (such as
CopyOnWriteArrayList), but these are used less frequently. The ArrayList is the
most commonly used implementation and is usually the best choice for your needs.

You can find more on these implementations at javatpoint. You can also find a
good explanation of the differences between these implementations here.

The Set Interface


A Set is a Collection that can't contain duplicate elements. The only major
difference between Set and Collection is the additional restriction Set puts on
duplicate elements.

Using Set to Find Duplicates

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<Character> myList = new ArrayList<Character>();
        Set<Character> singleEntries = new HashSet<Character>();
        Set<Character> dupEntries = new HashSet<Character>();

        myList.add('a');
        myList.add('b');
        myList.add('b');
        myList.add('c');
        myList.add('d');
        myList.add('d');

        for (char item : myList) {
            // add() returns false if the set already contains the element
            if (!singleEntries.add(item))
                dupEntries.add(item);
        }

        // destructive method: modifies singleEntries in place
        singleEntries.removeAll(dupEntries);

        System.out.println(singleEntries);   // prints [a, c]
        System.out.println(dupEntries);      // prints [b, d]
    }
}

In this example, we leverage the Set interface to separate the unique values of
the ArrayList myList from its duplicates.

Notice how we iterate through each element in myList with for-each. Since a
Set can only include unique values, add() returns false when the element is
already present, so we are able to separate unique values from
duplicates with if(!singleEntries.add(item)).

We can then remove duplicate entries from our singleEntries HashSet
with removeAll(). Note that this method is destructive: it modifies
singleEntries in place.

Set Implementations

There are three general-purpose Set implementations that are commonly used:

1) HashSet

The HashSet is the most commonly used. It's faster than TreeSet but does
not guarantee order. It's implemented using a hash table. The time complexity
of basic methods like add(), remove() and contains() is O(1) on average.

2) TreeSet

The TreeSet is implemented using a tree structure. The elements in a TreeSet
are sorted. Basic methods for a TreeSet have a time complexity of O(log(n)).

A TreeSet does not preserve the insertion order of its elements and is not
thread-safe. Instead, a TreeSet sorts its elements in ascending order. For a more
in-depth read on TreeSet, check out this Baeldung article.

3) LinkedHashSet

A LinkedHashSet is between a HashSet and a TreeSet. It's implemented as a
hash table but also has a linked list running through it. This allows
LinkedHashSet to preserve insertion order. The time complexity for basic
operations is O(1).

With a LinkedHashSet, the iteration order is guaranteed to match the order in
which elements were inserted.

When to use a HashSet vs TreeSet vs LinkedHashSet?

More often than not, you'll want to use HashSet. Consider using TreeSet if the value order matters.
An example would be a set of names that you always want sorted in alphabetical order. Use
LinkedHashSet when you want to preserve the order that elements were inserted but want similar
performance to HashSet.

Most TreeSet operations have log-time complexity whereas HashSet and LinkedHashSet operations
have constant-time complexity. This makes HashSet the preferred implementation for most use
cases.
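The ordering differences can be seen by adding the same elements to each implementation. This is a small sketch; SetOrderDemo and iterationOrder() are hypothetical names, and the three fruit names are arbitrary sample data.

```java
import java.util.*;

public class SetOrderDemo {

    // Adds the same three elements to the given set and returns them
    // in the order the set iterates over them.
    static List<String> iterationOrder(Set<String> set) {
        Collections.addAll(set, "pear", "apple", "fig");
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        System.out.println(iterationOrder(new TreeSet<String>()));
        // prints [apple, fig, pear] (sorted)

        System.out.println(iterationOrder(new LinkedHashSet<String>()));
        // prints [pear, apple, fig] (insertion order)

        System.out.println(iterationOrder(new HashSet<String>()));
        // order is not guaranteed
    }
}
```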

The Queue Interface


The Queue interface is used to hold elements prior to processing. A Queue
typically orders its elements in a FIFO (first-in-first-out) manner, although
implementations such as PriorityQueue order them differently.

Example of a Queue implementation and its methods

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        Queue<Character> myQueue = new LinkedList<Character>();
        myQueue.add('a');
        myQueue.add('b');
        myQueue.add('c');
        myQueue.add('d');

        myQueue.add('e');      // adds 'e' to end of myQueue
        myQueue.offer('f');    // adds 'f' to end of myQueue

        myQueue.remove();      // removes and returns 'a'
        myQueue.poll();        // removes and returns 'b'

        myQueue.element();     // returns 'c' without removing it
        myQueue.peek();        // returns 'c' without removing it

        System.out.println(myQueue);   // prints [c, d, e, f]
    }
}

Notice how we create a new Queue myQueue with a LinkedList. We can do
this because LinkedList implements the Deque interface (an extension of the
Queue interface).

There are six methods demonstrated in this example. You'll notice
that add() and offer() both add an element to the end of the queue.
The remove() and poll() methods both remove and return the first element
in the queue. The element() and peek() methods both return (but do not
remove) the first element in the queue.

So why have multiple methods doing the same thing? The answer is subtle and
has to do with what happens when the operation fails. While add(), remove(),
and element() throw exceptions, offer() returns false and poll() and peek()
return null if they fail.
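Here is a minimal sketch of both failure modes. QueueFailureDemo and its helper methods are illustrative names. Because a LinkedList has no capacity limit, the sketch borrows ArrayBlockingQueue from java.util.concurrent (not otherwise covered in this tutorial) purely to obtain a queue that can be full.

```java
import java.util.*;
import java.util.concurrent.ArrayBlockingQueue;

public class QueueFailureDemo {

    // remove() on an empty queue throws NoSuchElementException.
    static boolean removeThrowsOnEmpty(Queue<Character> queue) {
        try {
            queue.remove();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // add() on a full, capacity-restricted queue throws IllegalStateException.
    static boolean addThrowsWhenFull() {
        Queue<Character> bounded = new ArrayBlockingQueue<Character>(1);
        bounded.add('a');
        try {
            bounded.add('b');
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Queue<Character> empty = new LinkedList<Character>();
        System.out.println(empty.poll());                 // prints null
        System.out.println(empty.peek());                 // prints null
        System.out.println(removeThrowsOnEmpty(empty));   // prints true

        Queue<Character> bounded = new ArrayBlockingQueue<Character>(1);
        System.out.println(bounded.offer('a'));           // prints true
        System.out.println(bounded.offer('b'));           // prints false, queue is full
        System.out.println(addThrowsWhenFull());          // prints true
    }
}
```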

Queue Implementations

Remember that LinkedList implements the Queue interface so can be
considered an implementation. The PriorityQueue is another general
purpose implementation.

A PriorityQueue does not accept null elements. The ordering of a
PriorityQueue is based on a priority heap, and elements are ordered
according to their natural ordering or by a Comparator supplied when the
queue is constructed.

When should I use a PriorityQueue over a LinkedList?

A LinkedList preserves insertion order. A PriorityQueue does not: its head is always the least
element, and its iterator is not guaranteed to traverse elements in any particular order. Use a
PriorityQueue when you want to respect the natural ordering of elements in a
Queue and insertion order does not matter.
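A short sketch of the difference follows, assuming Integer elements and their natural ordering. The class PriorityQueueDemo and the drain() helper are hypothetical; drain() simply polls the queue until it is empty and records the order in which elements come out.

```java
import java.util.*;

public class PriorityQueueDemo {

    // Polls the queue until it is empty and returns the elements in poll order.
    static List<Integer> drain(Queue<Integer> queue) {
        List<Integer> out = new ArrayList<Integer>();
        while (!queue.isEmpty())
            out.add(queue.poll());
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(5, 1, 4);

        System.out.println(drain(new LinkedList<Integer>(input)));
        // prints [5, 1, 4] (insertion order)

        System.out.println(drain(new PriorityQueue<Integer>(input)));
        // prints [1, 4, 5] (natural ordering)
    }
}
```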

Remember that these implementations are not thread safe. For more on
synchronized Queues, check out the official documentation.

The Deque Interface


The Deque interface is a double-ended queue. This means you can insert and
remove elements at both ends.

The Deque interface extends the Queue interface and also supports stack
(last-in-first-out) operations such as push() and pop(). LinkedList
implements the Deque interface.
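Used as a stack, a Deque might look like the following sketch. StackDemo and pushThenPopAll() are made-up names; the example uses LinkedList only because it is the Deque implementation mentioned so far.

```java
import java.util.*;

public class StackDemo {

    // Pushes every item onto a stack, then pops them all off again.
    static List<Character> pushThenPopAll(char... items) {
        Deque<Character> stack = new LinkedList<Character>();
        for (char c : items)
            stack.push(c);          // push() adds to the front of the deque

        List<Character> popped = new ArrayList<Character>();
        while (!stack.isEmpty())
            popped.add(stack.pop()); // pop() removes from the front
        return popped;
    }

    public static void main(String[] args) {
        System.out.println(pushThenPopAll('a', 'b', 'c'));
        // prints [c, b, a] (last in, first out)
    }
}
```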

Example of a Deque implementation and some of its methods

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        Deque<Character> myQueue = new LinkedList<Character>();
        myQueue.add('a');
        myQueue.add('b');
        myQueue.add('c');
        myQueue.add('d');

        myQueue.addFirst('e');     // adds 'e' to start of myQueue
        myQueue.offerLast('f');    // adds 'f' to end of myQueue

        myQueue.removeFirst();     // removes 'e'
        myQueue.pollLast();        // removes 'f'

        myQueue.getFirst();        // returns 'a'
        myQueue.peekLast();        // returns 'd'

        // removes first occurrence of 'b'
        myQueue.removeFirstOccurrence('b');

        System.out.println(myQueue);   // prints [a, c, d]
    }
}

Notice how these methods are very similar to the Queue methods. The major
difference is the additional methods specifying either first or last. Since
Deque is a double-ended queue, it supports methods for both ends. For
example, while the example calls pollLast() there is also a
corresponding pollFirst() method.

The Deque interface includes a removeFirstOccurrence() method. This
simply removes the first occurrence of the specified object. If the object does
not exist, the Deque instance remains unchanged.
The removeLastOccurrence() method works in a similar fashion...

For a complete list of Deque methods, check out the official documentation.

Deque Implementations
We've already seen how LinkedList implements the Deque interface.
The ArrayDeque class is another general-purpose implementation of
Deque. The ArrayDeque is a resizable array implementation of the Deque
interface.

When to use LinkedList vs ArrayDeque

LinkedList is generally more flexible than ArrayDeque. While ArrayDeque is faster for add and
remove operations, LinkedList is better for removing the current element in an iteration.

LinkedList consumes more memory than ArrayDeque but implements optional List methods. Things
like null elements are allowed with LinkedList but not ArrayDeque.
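The null-handling difference is easy to check. In this sketch, NullElementDemo and acceptsNull() are hypothetical names; the helper simply reports whether adding null succeeded or was rejected with a NullPointerException.

```java
import java.util.*;

public class NullElementDemo {

    // Returns true if the deque accepted a null element.
    static boolean acceptsNull(Deque<String> deque) {
        try {
            deque.add(null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNull(new LinkedList<String>()));
        // prints true

        System.out.println(acceptsNull(new ArrayDeque<String>()));
        // prints false
    }
}
```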

It's important to remember that LinkedList and ArrayDeque do not support
concurrent access from multiple threads. Check out
the LinkedBlockingDeque class for a concurrent implementation of the Deque
interface.

The Map Interface


A Map object maps keys to values. Every key in a Map must be unique and a
key cannot map to more than one value.

Example of a Map implementation and some of its methods

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        Map<String, Integer> myMap = new HashMap<String, Integer>();
        myMap.put("Sam", 33);
        myMap.put("Erica", 49);
        myMap.put("Ted", 12);

        myMap.size();                 // returns 3
        myMap.get("Erica");           // returns 49

        // removes key/value pair from myMap
        myMap.remove("Ted");

        myMap.isEmpty();              // returns false
        myMap.containsKey("Erica");   // returns true
        myMap.containsValue(33);      // returns true
    }
}

Notice how we create a new Map instance by specifying Map<String,
Integer>. Remember that a Map is a collection of key/value pairs. <String,
Integer> indicates the data types associated with the keys and values.

Notice how put() adds a new key/value pair to the map.
While remove() deletes the key/value pair for the specified
key, containsKey() returns true/false based on the specified key's presence.
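Because keys are unique, calling put() with a key that already exists replaces the old value rather than adding a second entry. The sketch below illustrates this; MapPutDemo and build() are invented names, and getOrDefault() is a Java 8 Map method not used elsewhere in this tutorial.

```java
import java.util.*;

public class MapPutDemo {

    static Map<String, Integer> build() {
        Map<String, Integer> myMap = new HashMap<String, Integer>();
        myMap.put("Sam", 33);

        // same key again: the value is replaced, no second entry is created
        myMap.put("Sam", 34);

        return myMap;
    }

    public static void main(String[] args) {
        Map<String, Integer> myMap = build();

        System.out.println(myMap.size());                   // prints 1
        System.out.println(myMap.get("Sam"));               // prints 34
        System.out.println(myMap.get("Ted"));               // prints null, no such key
        System.out.println(myMap.getOrDefault("Ted", 0));   // prints 0
    }
}
```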

Iterating over Map objects

There are several ways to iterate over a Map object...

Example of iterating over a Map Object

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        Map<String, Integer> myMap = new HashMap<String, Integer>();
        myMap.put("Sam", 33);
        myMap.put("Erica", 49);
        myMap.put("Ted", 12);

        // iterate over the keys
        for (String key : myMap.keySet())
            System.out.println("Printing key " + key);

        // iterate over the values
        for (Integer value : myMap.values())
            System.out.println("Printing value " + value);

        // iterate over key/value pairs
        for (Map.Entry<String, Integer> e : myMap.entrySet())
            System.out.println(e.getKey() + ", " + e.getValue());
    }
}

Run this code to get output like the following (a HashMap does not guarantee
iteration order, so the order may differ):

Printing key Ted

Printing key Erica

Printing key Sam

Printing value 12

Printing value 49

Printing value 33

Ted, 12

Erica, 49

Sam, 33

This example demonstrates three different ways to iterate over a Map object.
Notice how we can use the keySet() method to return an iterable set of the
map's keys.

We can do the same thing with the map's values via the values() method.

We can also use the Entry interface along with the entrySet() method to
iterate over each entry in the map object. Notice how we
call getKey() and getValue() to retrieve the entry values.
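A fourth option on JDK 8 or later is the Map interface's own forEach() method, which takes a lambda receiving each key and value. The following is a minimal sketch; MapForEachDemo and describe() are hypothetical names.

```java
import java.util.*;

public class MapForEachDemo {

    // Builds one "key, value" line per entry using Map.forEach().
    static List<String> describe(Map<String, Integer> map) {
        List<String> lines = new ArrayList<String>();
        map.forEach((key, value) -> lines.add(key + ", " + value));
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> myMap = new HashMap<String, Integer>();
        myMap.put("Sam", 33);
        myMap.put("Erica", 49);
        myMap.put("Ted", 12);

        // one line per entry; a HashMap does not guarantee the order
        for (String line : describe(myMap))
            System.out.println(line);
    }
}
```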

Map Implementations

There are three general-purpose Map implementations that you will commonly
see. These are similar to the Set implementations (HashSet, TreeSet,
LinkedHashSet) and have similar use cases.

1) HashMap

The HashMap is among the most commonly used Map implementations.
Like a HashSet, it's the most efficient but does not guarantee order. The Map
examples you've seen in this tutorial have all used the HashMap
implementation.

2) TreeMap

Like TreeSet, the TreeMap sorts its entries in ascending order based on the
key values. Use a TreeMap when you want elements stored in a sorted order
based on key values.

3) LinkedHashMap

The LinkedHashMap is slightly less efficient than HashMap but maintains insertion
order (similar to LinkedHashSet). The LinkedHashMap has some extra
functionality, including the option to order elements by access rather than
insertion. When access ordering is enabled through the three-argument
constructor, looking up a key moves that key/value pair to the end of the map.

LinkedHashMap also provides a removeEldestEntry() method that you can override,
which makes it a good fit for implementing a cache.
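A least-recently-used cache built this way might look like the sketch below. LruCacheDemo, the nested LruCache class, and the maxEntries limit are all hypothetical names introduced for illustration; the initial capacity of 16 and load factor of 0.75 are simply HashMap's usual defaults.

```java
import java.util.*;

public class LruCacheDemo {

    // A small cache that evicts the least recently used entry when full.
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        LruCache(int maxEntries) {
            super(16, 0.75f, true);   // true switches to access order
            this.maxEntries = maxEntries;
        }

        // Called by put(); returning true removes the eldest entry.
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries;
        }
    }

    static List<String> demo() {
        LruCache<String, Integer> cache = new LruCache<String, Integer>(2);
        cache.put("Sam", 33);
        cache.put("Erica", 49);
        cache.get("Sam");        // Sam is now the most recently used entry
        cache.put("Ted", 12);    // cache is over capacity, so Erica is evicted
        return new ArrayList<String>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println(demo());
        // prints [Sam, Ted]
    }
}
```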

Other Map Implementations

There are other special-purpose Map implementations such
as EnumMap, WeakHashMap, and IdentityHashMap. Be sure to check
out the official documentation for more specifics on these implementations.

Object Ordering
If a list's elements implement the Comparable interface, you can sort
it with the static Collections.sort() method...

Example of sorting a collection with Collections.sort()

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<Integer> myList = new ArrayList<Integer>();
        myList.add(3);
        myList.add(1);

        // sorts myList in ascending order
        Collections.sort(myList);

        System.out.println(myList);   // prints [1, 3]
    }
}

This works for our ArrayList myList because its elements are of the Integer
type. More specifically, the Integer class implements the Comparable interface.

The Comparable interface declares a single compareTo() method. This method
returns a negative integer, 0, or a positive integer depending on whether the
object is less than, equal to, or greater than the object it is compared with.
You can make your own classes comparable by implementing the Comparable
interface and writing the compareTo() method yourself.
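A custom Comparable might be sketched as follows. The Person class, its fields, and the choice to order people by age are all assumptions made for this illustration.

```java
import java.util.*;

public class ComparableDemo {

    // A hypothetical class whose natural ordering is by age.
    static class Person implements Comparable<Person> {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // negative if this person is younger, 0 if same age, positive if older
        @Override
        public int compareTo(Person other) {
            return Integer.compare(age, other.age);
        }
    }

    static List<String> sortedNames() {
        List<Person> people = new ArrayList<Person>();
        people.add(new Person("Sam", 33));
        people.add(new Person("Erica", 49));
        people.add(new Person("Ted", 12));

        Collections.sort(people);   // uses Person.compareTo()

        List<String> names = new ArrayList<String>();
        for (Person p : people)
            names.add(p.name);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames());
        // prints [Ted, Sam, Erica]
    }
}
```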

The SortedSet and SortedMap Interface

A TreeSet is an implementation of the SortedSet interface. A TreeMap is an
implementation of the SortedMap interface. Remember that both TreeSet
and TreeMap sort elements in their ascending natural order. These interfaces
build on that ordering to provide additional functionality like:

 Range view - allows arbitrary range operations on the sorted set/map
 Endpoints - returns first or last elements in set/map
 Comparator access - returns the Comparator used to sort the set/map

Example of SortedSet implemented as a TreeSet

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        SortedSet<Integer> myList = new TreeSet<Integer>();
        myList.add(3);
        myList.add(1);
        myList.add(6);

        // elements come back sorted regardless of insertion order
        System.out.println(myList);   // prints [1, 3, 6]
    }
}

Notice how the elements are sorted in their natural ascending order. For a
deeper dive on the added functionality provided by SortedSet and SortedMap,
check out this as well as the official documentation.
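The range-view, endpoint, and comparator-access operations listed earlier can be sketched on a SortedSet like this. SortedSetDemo and sample() are illustrative names, and the four integers are arbitrary sample data.

```java
import java.util.*;

public class SortedSetDemo {

    static SortedSet<Integer> sample() {
        SortedSet<Integer> set = new TreeSet<Integer>();
        Collections.addAll(set, 3, 1, 6, 9);
        return set;
    }

    public static void main(String[] args) {
        SortedSet<Integer> set = sample();

        // endpoints
        System.out.println(set.first());        // prints 1
        System.out.println(set.last());         // prints 9

        // range views: lower bound inclusive, upper bound exclusive
        System.out.println(set.headSet(6));     // prints [1, 3]
        System.out.println(set.tailSet(6));     // prints [6, 9]
        System.out.println(set.subSet(3, 9));   // prints [3, 6]

        // comparator access: null means natural ordering is in use
        System.out.println(set.comparator());   // prints null
    }
}
```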

Algorithms
The Collections class provides a number of static methods used throughout
the Java platform.

Examples of algorithms that come from the Collections Class

import java.util.*;

public class MainClass {

    public static void main(String[] args) {
        List<Integer> myList = new ArrayList<Integer>();
        myList.add(3);
        myList.add(1);
        myList.add(6);

        Collections.reverse(myList);
        System.out.println(myList);   // prints [6, 1, 3]

        Collections.shuffle(myList);
        System.out.println(myList);   // prints a random order, e.g. [1, 6, 3]
    }
}

While the reverse() method reverses the order of the list,
the shuffle() method randomly changes the order of the list.

The majority of such methods operate on List implementations, however some
apply to all Collection instances. For more on working with these algorithms,
check out the official documentation.
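As a sketch of algorithms that accept any Collection, the following example calls Collections.max(), Collections.min(), and Collections.frequency() on a Set and a List. The class CollectionAlgorithmsDemo and its helper methods are hypothetical names.

```java
import java.util.*;

public class CollectionAlgorithmsDemo {

    // Works for any Collection, not just lists.
    static int largest(Collection<Integer> values) {
        return Collections.max(values);
    }

    static int smallest(Collection<Integer> values) {
        return Collections.min(values);
    }

    // Counts how many elements are equal to target.
    static int countOf(Collection<Integer> values, int target) {
        return Collections.frequency(values, target);
    }

    public static void main(String[] args) {
        Set<Integer> mySet = new HashSet<Integer>(Arrays.asList(3, 1, 6));
        System.out.println(largest(mySet));       // prints 6
        System.out.println(smallest(mySet));      // prints 1

        List<Integer> myList = Arrays.asList(3, 1, 3);
        System.out.println(countOf(myList, 3));   // prints 2
    }
}
```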

Conclusion
The Collections framework is a cornerstone of the Java programming
language. In this tutorial, you saw how the Collections framework represents a
hierarchy of core collection interfaces.

These interfaces both inherit common functionality (like for-each) from the
Collection interface and add their own. This makes them more appropriate for
certain use cases. We saw how the List interface adds positional access to a
collection and how the Set interface puts restrictions on duplicate elements.
We saw how the Queue interface gives a FIFO order to elements and how the
Map interface lets you represent elements as key/value pairs.
We also saw how the Collections class provides algorithms used throughout
the Java language via static methods like shuffle() and reverse().
