ppkwok: programming

Monday, 3 December 2012

Java Cafe 3 : WeakReferences and the caveats of WeakHashMap

WeakReference

First of all, let's see what a WeakReference is. A normal reference to an object is a strong reference, e.g.:

List<Integer> list = new LinkedList<Integer>();

list.add(1);

list is a strong reference to a LinkedList object. It prevents the garbage collector from removing the LinkedList from memory for as long as list remains reachable from the running program. You can also wrap the object in a WeakReference:

WeakReference<List<Integer>> list = new WeakReference<List<Integer>>(new LinkedList<Integer>());

List<Integer> myList = list.get();
myList.add(1);

You now have a WeakReference (list) to a LinkedList, and you can obtain the LinkedList using the WeakReference.get() method. The purpose of a WeakReference is to let the garbage collector reclaim the referenced object once no strong references to it remain; after that, get() returns null. Consider the code below:

WeakReference<List<Integer>> list = new WeakReference<List<Integer>>(new LinkedList<Integer>());

list.get().add(1);

int i = 0;
while (i < 10) {
    System.gc();
    i++;
    if (list.get() == null) {
        System.out.println("GC-ed at iteration " + i);
        break;
    }
}

On my JVM, it prints out:

GC-ed at iteration 1

That is, the LinkedList was garbage-collected after the first System.gc() call! This behaviour is not guaranteed, because System.gc() is only a request to the JVM, but the example does show that a WeakReference allows the object it refers to to be garbage-collected.
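Because the referenced object can disappear between two calls, chaining a call directly onto get(), as in list.get().add(1) above, risks a NullPointerException. Here is a minimal sketch of a safer pattern: copy the result of get() into a local strong reference and check it for null before using it. The names WeakGetExample and addIfAlive are made up for this illustration, and clear() is used only to simulate a collection.

```java
import java.lang.ref.WeakReference;
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper class for illustration; not part of the JDK.
final class WeakGetExample {

    // Adds item to the list behind ref if the list is still alive.
    // Returns the new size of the list, or -1 if the reference was cleared.
    static int addIfAlive(WeakReference<List<Integer>> ref, int item) {
        List<Integer> strong = ref.get(); // strong reference while this method runs
        if (strong == null) {
            return -1;
        }
        strong.add(item);
        return strong.size();
    }

    public static void main(String[] args) {
        List<Integer> held = new LinkedList<Integer>();
        WeakReference<List<Integer>> ref = new WeakReference<List<Integer>>(held);

        System.out.println(addIfAlive(ref, 1)); // 1, because held keeps the list alive
        System.out.println(held.size());        // 1

        ref.clear(); // simulates what the GC does once the list is unreachable
        System.out.println(addIfAlive(ref, 2)); // -1
    }
}
```

The local variable matters: once get() has returned a non-null value into strong, the list cannot be collected until that variable goes out of use.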

WeakHashMap

The java.util.WeakHashMap class is one of the least frequently used classes in the Collections Framework, and it uses WeakReferences internally. It is meant to serve as a caching map whose entries can be removed once their keys are no longer referenced anywhere else; unlike a SoftReference-based cache, this does not depend on the JVM running low on memory. It accomplishes this by holding the keys in the map through weak references, also called weak keys. For instance, if you have a WeakHashMap which maps Strings to Integers:

String key = new String("one");

Integer value = new Integer(1);

WeakHashMap<String,Integer> weakMap = new WeakHashMap<String,Integer>();

weakMap.put(key, value);

key = null;

value = null;

If there are no other references to key, then the map entry (key, value) can (but will not necessarily) be removed the next time GC happens. If that happens, the size of the WeakHashMap drops from 1 to 0, and the key object is garbage-collected:

String key = new String("one");

Integer value = new Integer(1);

WeakHashMap<String,Integer> weakMap = new WeakHashMap<String,Integer>();

weakMap.put(key, value);

key = null;

int i = 0;

while (i < 100) {
    System.gc();
    i++;
    if (weakMap.size() == 0) {
        System.out.println("GC-ed at iteration " + i);
        break;
    }
}

In my JVM, weakMap becomes empty at iteration 6. If there are no other references to the object value, it will be garbage-collected too.
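The converse also holds: while some other strong reference to the key exists, the entry stays in the map no matter how often the GC runs. The sketch below illustrates this; the class name WeakMapHeldKey and the method entrySurvivesGc are hypothetical, and the caller is assumed to keep its own reference to the key.

```java
import java.util.WeakHashMap;

// Hypothetical demo class for illustration.
final class WeakMapHeldKey {

    // Returns true if the entry is still present after a GC request
    // made while the key is strongly reachable.
    static boolean entrySurvivesGc(String key) {
        WeakHashMap<String, Integer> weakMap = new WeakHashMap<String, Integer>();
        weakMap.put(key, 1);

        System.gc(); // only a request, but it cannot clear a strongly reachable key

        // key is still used here, so it was strongly reachable during the GC
        return weakMap.size() == 1 && weakMap.containsKey(key);
    }

    public static void main(String[] args) {
        String key = new String("one");
        System.out.println(entrySurvivesGc(key)); // true
    }
}
```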

One important fact to know about the WeakHashMap is that the value objects are held using strong references. Therefore, if you mistakenly write:

WeakHashMap<String,String> weakMap = new WeakHashMap<String,String>();

weakMap.put(key, key);

The entry for key will never be garbage-collected, because key is also held as the value object using a strong reference. If you need to use the same object as the value, it is recommended that you wrap it in a WeakReference and declare the map's value type accordingly, like:

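WeakHashMap<String,WeakReference<String>> weakMap = new WeakHashMap<String,WeakReference<String>>();

weakMap.put(key, new WeakReference<String>(key));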

This way, the map itself holds no strong reference to key, allowing it to be garbage-collected.
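Reading from such a map then takes two steps: fetch the WeakReference, then unwrap it with get(). Either step can yield null. Below is a minimal sketch, assuming the map is declared as WeakHashMap<String, WeakReference<String>>; the class WeakValueLookup and its lookup method are hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical helper class for illustration.
final class WeakValueLookup {

    // Returns the value stored for key, or null if there is no entry
    // or the wrapped value has already been garbage-collected.
    static String lookup(Map<String, WeakReference<String>> map, String key) {
        WeakReference<String> ref = map.get(key);
        return (ref == null) ? null : ref.get();
    }

    public static void main(String[] args) {
        Map<String, WeakReference<String>> weakMap =
                new WeakHashMap<String, WeakReference<String>>();
        String key = new String("one");
        weakMap.put(key, new WeakReference<String>(key));

        System.out.println(lookup(weakMap, key));       // one (key is still strongly held)
        System.out.println(lookup(weakMap, "missing")); // null
    }
}
```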

Saturday, 24 November 2012

Java Cafe 1 : Never write NaN == NaN (they're not equal)

I see this Java bug time and time again:

NaN == NaN

So what's wrong with this?

The Float and Double classes in java.lang each define a constant holding a Not-a-Number (NaN) value, of type float and double respectively. NaN represents a mathematically undefined result, such as zero divided by zero, or an unrepresentable one, such as the square root of a negative number, which is imaginary and so cannot be represented as a real floating-point number. For instance:

System.out.println(0.0f / 0.0f);
System.out.println(Math.sqrt(-1.0f));

prints out:

NaN
NaN

Sometimes programmers initialize a class field to NaN to indicate that it has not been assigned a value. Later on in the program, they check if that field has been assigned a value by checking if it is equal to NaN, using the == operator, e.g.:

public class NaNTest {
    private float value = Float.NaN;

    public void setValue(float newValue) {
        if (value == Float.NaN) // wrong, never do this!
            value = newValue;
    }

    public float getValue() { return value; }
}

Unfortunately, value will never be set to newValue in the setValue() method, because (Float.NaN == Float.NaN) always evaluates to false. In fact, if you look at the JDK implementation of Float.isNaN(), a value is not-a-number exactly when it is not equal to itself (which makes sense, because every ordinary number is equal to itself). The same holds for Double.NaN.
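A quick way to convince yourself is to run the comparisons side by side. The sketch below uses a made-up class name, NaNComparisons, with two small helper methods. It also shows one related detail that is easy to trip over: the boxed Float.equals() method, unlike the == operator on primitives, treats two NaN values as equal.

```java
// Hypothetical demo class for illustration.
final class NaNComparisons {

    // Primitive comparison: false whenever either operand is NaN.
    static boolean sameByOperator(float a, float b) {
        return a == b;
    }

    // Boxed comparison: Float.equals() considers NaN equal to NaN.
    static boolean sameByBoxedEquals(float a, float b) {
        return Float.valueOf(a).equals(Float.valueOf(b));
    }

    public static void main(String[] args) {
        float nan = 0.0f / 0.0f;

        System.out.println(sameByOperator(nan, nan));    // false
        System.out.println(nan != nan);                  // true
        System.out.println(Float.isNaN(nan));            // true
        System.out.println(sameByBoxedEquals(nan, nan)); // true
    }
}
```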

This error is easy to make, because the == operator is what you will normally use to compare numbers and primitive types. The bug can go unnoticed for a long time, with potentially disastrous consequences. For instance, suppose the code that uses the value returned by getValue() performs the same faulty equality check before some critical operation:

NaNTest test = new NaNTest();
test.setValue(4.0f);           // does not set it to 4.0f
float value = test.getValue(); // returns Float.NaN
float result = 0.0f;
if (value != Float.NaN) {
    result = value;
}
System.out.println(result);    // prints NaN

Although not immediately obvious, the printed value will always be NaN, not 4.0. This is because value has the value Float.NaN, and (Float.NaN != Float.NaN) is always true!

The correct way to check if a number is NaN is to use Float.isNaN() and Double.isNaN(). For example, continuing with the NaNTest class:

public void setValue(float newValue) {
    if (Float.isNaN(value))
        value = newValue;
}

Equivalently, this will also work:

public void setValue(float newValue) {
    // works but don't do this
    if (value != value)    // yes, this check is weird!
        value = newValue;
}

but you should use Float.isNaN() and Double.isNaN() because they make clear the intention of the check, and they will work regardless of any changes to the underlying floating-point implementation of Java.
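Putting the pieces together, here is one way the whole class could look once both the setter and the caller use Float.isNaN(). This is only a sketch: the class is renamed FixedNaNTest to keep it apart from the broken version, and hasValue() is an added convenience method that does not appear in the original class.

```java
// Sketch of a corrected NaNTest; hasValue() is an addition for illustration.
final class FixedNaNTest {
    private float value = Float.NaN; // NaN marks "not yet assigned"

    // Assigns newValue only if no value has been assigned yet.
    public void setValue(float newValue) {
        if (Float.isNaN(value)) {
            value = newValue;
        }
    }

    // True once a non-NaN value has been stored.
    public boolean hasValue() {
        return !Float.isNaN(value);
    }

    public float getValue() {
        return value;
    }

    public static void main(String[] args) {
        FixedNaNTest test = new FixedNaNTest();
        test.setValue(4.0f);
        float result = test.hasValue() ? test.getValue() : 0.0f;
        System.out.println(result); // prints 4.0
    }
}
```

Keeping the NaN check inside the class means callers never have to compare against Float.NaN themselves.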

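Finally, a similar recommendation applies to checking for positive and negative infinity: prefer Float.isInfinite() and Double.isInfinite(). Unlike NaN, the infinities do compare equal to themselves with ==, but a single isInfinite() call covers both signs and makes the intention of the check clear.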
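As a small illustration, the sketch below classifies a float using only the library predicates. The class InfinityChecks and its classify method are hypothetical names chosen for this example.

```java
// Hypothetical demo class for illustration.
final class InfinityChecks {

    // Labels a float as NaN, +Infinity, -Infinity or finite.
    static String classify(float x) {
        if (Float.isNaN(x)) {
            return "NaN";
        }
        if (Float.isInfinite(x)) {
            return (x > 0) ? "+Infinity" : "-Infinity";
        }
        return "finite";
    }

    public static void main(String[] args) {
        System.out.println(classify(1.0f / 0.0f));  // +Infinity
        System.out.println(classify(-1.0f / 0.0f)); // -Infinity
        System.out.println(classify(0.0f / 0.0f));  // NaN
        System.out.println(classify(4.0f));         // finite
    }
}
```

The NaN check comes first only for readability; Float.isInfinite() already returns false for NaN.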