Java: The Misuse of a TreeSet

Articles —> Java: The Misuse of a TreeSet

The Set is defined as a Collections class containing unique objects. This is useful in many context in which one needs unique collections of objects, for instance to find unique names in a list of names. A TreeSet extends off of this principle, but does so by ordering the objects it contains, doing so based upon an underlying binary tree type of data structure.

A question on a Java Programming Forum peaked my interest because the question provided evidence for some unexpected behavior when a TreeSet was used out of context. The issue pertained to the sorting of objects using a TreeSet: in this scenario the programmer was using a TreeSet for the purpose of sorting their objects based upon a field variable of the objects. Yet uniqueness of objects was not required, a situation which is often better suited for a List of array./p>

Below is a modified SSCCE of the problem and the output of running the code:


import java.util.Comparator;

import java.util.TreeSet;

 

/**

 * Test class to demonstrate misuse of a TreeSet to sort objects. 

 * @author Greg Cope

 *

 */

public class TreeSetTest {

 

	/**

	 * TreeSet with a custom Comparator, defining comparisons upon object equality and 

	 * the z field variable of the object. 

	 */

	private static final TreeSet<Wrapper> wrapperSet = new TreeSet<Wrapper>(new Comparator<Wrapper>() {

		public int compare(Wrapper o1, Wrapper o2) {

			if (o1.equals(o2)) {

				return 0;

			}

			if (o1.getZ() >= o2.getZ()) {

				return 1;

			}

			return -1;

		}

	});

 

	/**

	 * Main entry method.

	 * @param args

	 */

	public static void main(String[] args) {

		Wrapper wrapper1 = new Wrapper();

		Wrapper wrapper2 = new Wrapper();

		Wrapper wrapper3 = new Wrapper();

 

		wrapper1.z = 0;

		wrapper2.z = 0;

		wrapper3.z = 1;

 

		wrapperSet.add(wrapper1);

		wrapperSet.add(wrapper2);

		wrapperSet.add(wrapper3);

		

		System.out.println("Size of Set before attempt to remove wrapper1: " + wrapperSet.size());

		System.out.println("Is wrapper1 removed? : " + wrapperSet.remove(wrapper1));

		System.out.println("Size of Set before attempt to remove wrapper1: " + wrapperSet.size());

	}

 

	

	/**

	 * Inner class used as a wrapper class to a value

	 * @author Greg Cope

	 *

	 */

	private static class Wrapper {

 

		private int z;

 

		public int getZ() {

			return z;

		}

	}

 

}

Output:

The above code adds three objects to a TreeSet, sorting the objects based upon their z value: two object with a z value of 0 and one with a z value of 1. The code then attempts to remove the first object. When ran it produces this output:



Size of Set before attempt to remove wrapper1: 3

Is wrapper1 removed? : false

Size of Set before attempt to remove wrapper1: 3


The code suggests that after addition of the three wrapper objects, the first cannot be removed: the remove method returns false and the size of the TreeSet remains the same (3). What's going on here?

If we drill into the implementation, a TreeSet is a Set: a Collection of unique objects. The Comparator here defines equality based upon both object equality and equality of the z value. This presents a problem in that it can cause conflicts with how the TreeSet is implemented.

A TreeSet is backed by a red-black tree to sort the objects. A red black tree is self-balancing, and given the comparator an altogether feasible way the tree might be organized is the following:


     3(1)

    /

   2(0)

    \

     1(0)

The above diagram shows the object number with its z value represented in parenthesis. Now consider how a search might be performed for item 1: The the z-value is less than that for the root so it goes down the left node (wrapper2). The z value is equal to this node, and given the Comparator defines z value equality to be 'greater than' (1 in the returned value of the Comparator) it attempts to traverse down the right child of node 2. Since no right child exists, it deems the search object is not contained within the Set.

The lesson here: read the API. A Set is clearly defined as containing unique objects - a TreeSet defines uniqueness based upon the compareTo or compare methods of Comparable or Comparator, respectively. Attempt to use a TreeSet exclusively for sorting - in which the sort value and equality are defined as being different - and unpredictable things can happen, as the above example demonstrates.



There are no comments on this article.

Back to Articles


© 2008-2022 Greg Cope