Thursday, March 14, 2013

Java Generics Tutorial - Part III - Wildcards




The previous posts introduced us to the basics of Java generics and their subtyping relations. In this post we'll introduce wildcards and see how covariant and contravariant subtyping relations can be established with generics.

Wildcards

As we've seen in the previous post, the subtyping relation of generic types is invariant. Sometimes, though, we'd like to use generic types in the same way we can use ordinary types:
  • Narrowing a reference (covariance).
  • Widening a reference (contravariance).

Covariance

Let's suppose, for example, that we've got a set of boxes, each one holding a different kind of fruit. We'd like to be able to write methods that could accept any of them. More formally, given a subtype A of a type B, we'd like to find a way to use a reference (or a method parameter) of type C<B> that could accept instances of C<A>.

To accomplish this task we can use a wildcard with extends, such as in the following example:

List<Apple> apples = new ArrayList<Apple>();
List<? extends Fruit> fruits = apples;

? extends reintroduces covariant subtyping for generic types: Apple is a subtype of Fruit and List<Apple> is a subtype of List<? extends Fruit>.

Contravariance

Let's now introduce another wildcard: ? super. Given a supertype B of a type A, C<B> is a subtype of C<? super A>:

List<Fruit> fruitList = new ArrayList<Fruit>();
List<? super Apple> fruits = fruitList;

How Can Wildcards Be Used?

Enough theory for now: how can we take advantage of these new constructs?

? extends
Let's go back to the example we used in Part II when introducing Java array covariance:

Apple[] apples = new Apple[1];
Fruit[] fruits = apples;
fruits[0] = new Strawberry(); 

As we saw, this code compiles but results in a runtime exception when trying to add a Strawberry to an Apple array through a reference to a Fruit array.

Now we can use wildcards to translate this code to its generic counterpart: since Apple is a subtype of Fruit, we will use the ? extends wildcard to be able to assign a reference of a List<Apple> to a reference of a List<? extends Fruit>:

List<Apple> apples = new ArrayList<Apple>();
List<? extends Fruit> fruits = apples;
fruits.add(new Strawberry());

This time, the code won't compile! The Java compiler now prevents us from adding a strawberry to a list of fruits. We detect the error at compile time and we don't even need a runtime check (such as in the case of array stores) to ensure that we're adding a compatible type to the list. The code won't compile even if we try to add a Fruit instance to the list:

fruits.add(new Fruit());

No way. It turns out that, indeed, you can't put anything (other than null) into a structure whose type uses the ? extends wildcard.

The reason is pretty simple, if we think about it: the ? extends T wildcard tells the compiler that we're dealing with a subtype of the type T, but we cannot know which one. Since there's no way to tell, and we need to guarantee type safety, you won't be allowed to put anything inside such a structure. On the other hand, since we know that whichever type it might be, it will be a subtype of T, we can get data out of the structure with the guarantee that it will be a T instance:

Fruit get = fruits.get(0);
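To make the read-only behavior concrete, here is a minimal sketch. The Fruit and Apple classes and the name() method are assumptions made up for this demo, not code from the earlier posts.

```java
import java.util.ArrayList;
import java.util.List;

class ExtendsReadDemo {
    // Hypothetical hierarchy; name() is an assumption added for this demo.
    static class Fruit {
        String name() { return "fruit"; }
    }
    static class Apple extends Fruit {
        @Override String name() { return "apple"; }
    }

    static String firstName(List<? extends Fruit> fruits) {
        // fruits.add(new Apple()); // does not compile: the element type is unknown
        // fruits.add(new Fruit()); // does not compile either
        Fruit first = fruits.get(0); // safe: every element is at least a Fruit
        return first.name();
    }

    public static void main(String[] args) {
        List<Apple> apples = new ArrayList<Apple>();
        apples.add(new Apple()); // writing through the precise type still works
        System.out.println(firstName(apples)); // prints "apple"
    }
}
```

Note that the list is filled through the List<Apple> reference and only read through the wildcard one.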

? super
What's the behavior of a type that's using the ? super wildcard? Let's start with this:

List<Fruit> fruitList = new ArrayList<Fruit>();
List<? super Apple> fruits = fruitList;

We know that fruits is a reference to a List of something that is a supertype of Apple. Again, we cannot know which supertype it is, but we know that Apple and any of its subtypes will be assignment compatible with it. Indeed, since such an unknown type will be a supertype of both Apple and GreenApple, we can write:

fruits.add(new Apple());
fruits.add(new GreenApple());

If we try to add an instance of any Apple supertype, the compiler will complain:

fruits.add(new Fruit());
fruits.add(new Object());

Since we cannot know which supertype it is, we aren't allowed to add instances of any of them.

What about getting data out of such a type? It turns out that the only thing you can get out of it will be Object instances: since we cannot know which supertype it is, the compiler can only guarantee that it will be a reference to an Object, since Object is the supertype of any Java type.
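The following sketch puts the write and read sides of ? super together. The class names are assumed from the fruit hierarchy used in this series, and fillAndPeek is a hypothetical helper written only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SuperWriteDemo {
    // Hypothetical hierarchy assumed from the series' fruit example.
    static class Fruit {}
    static class Apple extends Fruit {}
    static class GreenApple extends Apple {}

    static Object fillAndPeek(List<? super Apple> fruits) {
        fruits.add(new Apple());      // allowed: Apple fits any supertype of Apple
        fruits.add(new GreenApple()); // allowed: so does any Apple subtype
        // fruits.add(new Fruit());   // does not compile
        // Apple a = fruits.get(0);   // does not compile
        return fruits.get(0);         // Object is the only safe static type
    }

    public static void main(String[] args) {
        List<Fruit> fruitList = new ArrayList<Fruit>();
        Object first = fillAndPeek(fruitList);
        System.out.println(fruitList.size());       // prints 2
        System.out.println(first instanceof Apple); // prints true
    }
}
```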

The Get and Put Principle or the PECS Rule

Summarizing the behavior of the ? extends and the ? super wildcards, we draw the following conclusion:

Use the ? extends wildcard if you need to retrieve objects from a data structure.
Use the ? super wildcard if you need to put objects in a data structure.
If you need to do both things, don't use any wildcard.

This is what Maurice Naftalin calls The Get and Put Principle in his Java Generics and Collections and what Joshua Bloch calls The PECS Rule in his Effective Java.

Bloch's mnemonic, PECS, comes from "Producer Extends, Consumer Super" and is probably easier to remember and use.
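A small sketch of the rule in action: copyAll is a hypothetical helper whose source list is a producer (extends) and whose destination list is a consumer (super). The standard library's Collections.copy declares its two list parameters with the same pair of wildcards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PecsDemo {
    // src produces T values (extends); dst consumes them (super).
    static <T> void copyAll(List<? extends T> src, List<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<Number>();
        copyAll(ints, numbers);
        System.out.println(numbers); // prints [1, 2, 3]
    }
}
```

Without the wildcards, copying a List<Integer> into a List<Number> would be rejected, because no single T could match both parameters.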

Next Steps

In the next post (coming soon), we will put it all together in some examples to clarify how generics can be used to help us write cleaner, clearer and more type-safe code.

Part I - The Basics
Part II - Subtyping
Part III - Wildcards
Part IV - Bounded Type Variables

Java Generics Tutorial - Part IV - Wildcards in Method Signatures and Bounded Type Variables

Long time no hear. In the previous parts of this series we learned what generic classes and methods are, how they behave in subtyping relations and how wildcards can be used to provide covariant and contravariant subtyping for generic types.
In this part of this series we will learn what bounded type variables are and the flexibility they provide.

Wildcards in Method Signatures

As seen in Part II of this series, in Java (as in many other typed languages), the Substitution principle holds: a subtype can be assigned to a reference of any of its supertypes.

This applies to the assignment of any reference, that is, even when passing parameters to a function or storing its result. One of the advantages of this principle, then, is that when defining class hierarchies, "general purpose" methods can be written to handle entire sub-hierarchies, regardless of the class of the specific object instances being handled at the time. In the Fruit class hierarchy we've used so far, a function that accepts a Fruit as a parameter will accept any of its subtypes (such as Apple or Strawberry).

As seen in the previous post, wildcards restore covariant and contravariant subtyping for generic types: using wildcards, then, lets the developer write functions that can take advantage of the benefits presented so far.

If, for example, a developer wanted to define a method eat that accepted a List of any kind of fruit, they could use the following signature:

void eat(List<? extends Fruit> fruits);

Since a List of any subtype of the class Fruit is a subtype of List<? extends Fruit>, the previous method will accept any such list as a parameter. Note that, as explained in the previous part, the Get and Put Principle (or the PECS Rule) will allow you to retrieve objects from such a list and assign them to a Fruit reference.

On the other hand, if you wanted to put instances on the list passed as a parameter, you should use the ? super wildcard:

void store(List<? super Fruit> container);

This way, a List of any supertype of Fruit could be passed in to the store function and you could safely put any Fruit subtype into it.
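Here is a minimal sketch of both signatures at their call sites. The method bodies and the nested fruit classes are assumptions made up for this demo; only the two signatures come from the text above.

```java
import java.util.ArrayList;
import java.util.List;

class FruitMethodsDemo {
    // Hypothetical hierarchy assumed from the series' fruit example.
    static class Fruit {}
    static class Apple extends Fruit {}
    static class Strawberry extends Fruit {}

    // Reads only: any list of Fruit or of a Fruit subtype is accepted.
    static void eat(List<? extends Fruit> fruits) {
        for (Fruit fruit : fruits) {
            System.out.println("Eating a " + fruit.getClass().getSimpleName());
        }
    }

    // Writes only: any list of Fruit or of a Fruit supertype is accepted.
    static void store(List<? super Fruit> container) {
        container.add(new Apple());
        container.add(new Strawberry());
    }

    public static void main(String[] args) {
        List<Apple> apples = new ArrayList<Apple>();
        apples.add(new Apple());
        eat(apples);       // a List<Apple> is fine here
        // store(apples);  // does not compile: Apple is not a supertype of Fruit

        List<Object> box = new ArrayList<Object>();
        store(box);        // a List<Object> is fine here
        System.out.println(box.size()); // prints 2
    }
}
```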

Bounded Type Variables

The flexibility of generics is greater than this, though. Type variables can be bounded, pretty much in the same way wildcards can be (as we've seen in Part III). However, type variables cannot be bounded with super, but only with extends. Look at the following signature:

public static <T extends I<T>> void name(Collection<T> t);

It takes a collection of objects whose type is bounded: it must satisfy the T extends I<T> condition. Using bounded type variables may not seem more powerful than wildcards at first, but we'll detail the differences in a moment.

Let's suppose some, but not all, fruits in your hierarchy can be juicy as in:

public interface Juicy<T> {
    Juice<T> squeeze();
}

Juicy fruits will implement this interface and publish the squeeze method.

Now, you write a library method that takes a bunch of fruits and squeezes them all. The first signature you could write might be:

<T> List<Juice<T>> squeeze(List<Juicy<T>> fruits);

Using bounded type variables, you would write the following (which, indeed, has the same erasure as the previous method):

<T extends Juicy<T>> List<Juice<T>> squeeze(List<T> fruits);

So far, so good. But limited. We could use the very same arguments used in the previous posts and discover that the squeeze method is not going to work, for example, with a list of red oranges when:

class Orange extends Fruit implements Juicy<Orange> { /* squeeze() body omitted */ }
class RedOrange extends Orange { }

Since we've already learned about the PECS principle, we're going to change the method with:

<T extends Juicy<? super T>> List<Juice<? super T>> squeezeSuperExtends(List<? extends T> fruits);

This method accepts a list of objects whose type T satisfies T extends Juicy<? super T>. In other words, there must exist a type S such that T extends Juicy<S> and S is a supertype of T.
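The difference between the two signatures can be tried out with the sketch below. Juice is given an empty hypothetical definition here, since its real definition is not shown in this post, and the method bodies are assumptions written only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SqueezeDemo {
    // Hypothetical Juice type; its real definition is not shown in the post.
    static class Juice<T> {}
    interface Juicy<T> { Juice<T> squeeze(); }
    static class Fruit {}
    static class Orange extends Fruit implements Juicy<Orange> {
        public Juice<Orange> squeeze() { return new Juice<Orange>(); }
    }
    static class RedOrange extends Orange {}

    // Strict version: T must be juicy as itself.
    static <T extends Juicy<T>> List<Juice<T>> squeeze(List<T> fruits) {
        List<Juice<T>> juices = new ArrayList<Juice<T>>();
        for (T fruit : fruits) {
            juices.add(fruit.squeeze());
        }
        return juices;
    }

    // Relaxed version: T may be juicy as one of its supertypes.
    static <T extends Juicy<? super T>> List<Juice<? super T>> squeezeSuperExtends(
            List<? extends T> fruits) {
        List<Juice<? super T>> juices = new ArrayList<Juice<? super T>>();
        for (T fruit : fruits) {
            juices.add(fruit.squeeze());
        }
        return juices;
    }

    public static void main(String[] args) {
        List<RedOrange> reds = new ArrayList<RedOrange>();
        reds.add(new RedOrange());
        // squeeze(reds); // does not compile: RedOrange is a Juicy<Orange>, not a Juicy<RedOrange>
        System.out.println(squeezeSuperExtends(reds).size()); // prints 1
    }
}
```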

Recursive Bounds

Maybe you feel like relaxing the T extends Juicy<? super T> bound. This kind of bound is called a recursive bound because the bound that the type T must satisfy depends on T itself. You can use recursive bounds when needed and also mix and match them with other kinds of bounds.

Thus you can, for example, write generic methods with such bounds:

<A extends B<A,C>, C extends D<T>>

Please remember that these examples are only given to illustrate what generics can do. The bounds you're going to use always depend on the constraints you're putting into your type hierarchy.
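A familiar recursive bound from the standard library involves Comparable. The max helper below is a hypothetical sketch, not a library method, and it assumes a non-empty list.

```java
import java.util.Arrays;
import java.util.List;

class RecursiveBoundDemo {
    // T must be comparable to itself or to one of its supertypes.
    // Assumes a non-empty list; an empty one makes get(0) throw.
    static <T extends Comparable<? super T>> T max(List<? extends T> items) {
        T best = items.get(0);
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Integer biggest = max(Arrays.asList(3, 9, 4));
        String last = max(Arrays.asList("fig", "pear", "apple"));
        System.out.println(biggest); // prints 9
        System.out.println(last);    // prints pear
    }
}
```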

Using Multiple Type Variables

Let's suppose you want to relax the recursive bound we put on the last version of the squeeze method. Let's then suppose that a type T might extend Juicy<S> although T itself does not extend S. The method signature could be:

<T extends Juicy<S>, S> List<Juice<S>> squeezeSuperExtendsWithFruit(List<? extends T> fruits);

This signature is pretty much equivalent to the previous one (since we're only using T in the method arguments) but has one slight advantage: since we've declared the type variable S, the method can return List<Juice<S>> instead of List<Juice<? super T>>, which can be useful in some situations, since the compiler will help you identify which type S is according to the method arguments you've passed. Since you're returning a list, chances are you want your caller to be able to get something useful from it and, as you've learned in the previous part, you can only get Object instances from a list such as List<? super T>.

You can obviously add more bounds to S, if you need them, such as:

<T extends Juicy<S>, S extends Fruit> List<Juice<S>> squeezeSuperExtendsWithFruit(List<? extends T> fruits);
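As a sketch of what the extra type variable buys the caller, the example below assigns the result to a precisely typed list. Juice is an empty hypothetical class and the method body is an assumption made for this demo.

```java
import java.util.ArrayList;
import java.util.List;

class TwoVariablesDemo {
    // Hypothetical Juice type; its real definition is not shown in the post.
    static class Juice<T> {}
    interface Juicy<T> { Juice<T> squeeze(); }
    static class Fruit {}
    static class Orange extends Fruit implements Juicy<Orange> {
        public Juice<Orange> squeeze() { return new Juice<Orange>(); }
    }
    static class RedOrange extends Orange {}

    static <T extends Juicy<S>, S extends Fruit> List<Juice<S>> squeezeSuperExtendsWithFruit(
            List<? extends T> fruits) {
        List<Juice<S>> juices = new ArrayList<Juice<S>>();
        for (T fruit : fruits) {
            juices.add(fruit.squeeze()); // squeeze() returns a Juice<S>
        }
        return juices;
    }

    public static void main(String[] args) {
        List<RedOrange> reds = new ArrayList<RedOrange>();
        reds.add(new RedOrange());
        // S is resolved to Orange, so the caller gets a precisely typed list.
        List<Juice<Orange>> juices = squeezeSuperExtendsWithFruit(reds);
        System.out.println(juices.size()); // prints 1
    }
}
```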

Multiple Bounds

What if you want to apply multiple bounds to the same type variable? It turns out that each type variable can be declared only once, with a single bound clause. The following bounds are thus illegal:

<T extends A, T extends B> // illegal

The compiler will fail with a message such as:

T is already defined in...

Multiple bounds must be expressed with a different syntax, which turns out to be a pretty familiar notation:

<T extends A & B>

The previous bound means that T extends both A and B. Please take into account that, according to the Java Language Specification, Chapter 4.4, a bound is either:
  • A type variable.
  • A class or interface type, optionally followed by further interface types.
This means that only the first bound may be a class: every additional bound must be an interface type. There's no way of using type variables in a multiple bound and the compiler will fail with a message such as:

A type variable may not be followed by other bounds.

This is not always clear in the documentation I've read.
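To close, here is a minimal sketch of a multiple bound that uses two interface types from the standard library. The longest helper is hypothetical and assumes a non-empty list.

```java
import java.util.Arrays;
import java.util.List;

class MultipleBoundsDemo {
    // T must satisfy both bounds, so both length() and compareTo() are available.
    // Assumes a non-empty list; an empty one makes get(0) throw.
    static <T extends CharSequence & Comparable<T>> T longest(List<T> items) {
        T best = items.get(0);
        for (T item : items) {
            boolean longer = item.length() > best.length();
            boolean tieWon = item.length() == best.length() && item.compareTo(best) > 0;
            if (longer || tieWon) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String winner = longest(Arrays.asList("fig", "pear", "plum"));
        System.out.println(winner); // prints plum
    }
}
```

String works here because it implements both CharSequence and Comparable<String>.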