How are generic types inferred in Java?

2022-01-22 00:00:00 type-inference java generics java-stream

Function.identity() returns a function of type Function<T, T> that always returns its input argument (i.e. the identity function).

But since it is a static method that takes no arguments, how does it know which concrete type to use in place of the type parameter T?

Illustration of my thought process:

Map idToPerson = people.collect(Collectors.toMap(person -> person.getID(), Function.identity()));

Question: So how does the compiler figure out that Function.identity() is supposed to return Function<element of 'people' stream, element of 'people' stream> despite having no inputs?


According to the OpenJDK source, the implementation is something like:

static <T> Function<T, T> identity() {
    return t -> t;
}

An attempt to narrow down my question:
How does Function.identity() know what the concrete type of t is in t -> t (by the way, this lambda is the Function<T, T>)?

Solution

The Java type inference algorithm is based on the resolution of constraint formulas on inference variables. It is described in detail in Chapter 18 of the Java Language Specification. It's a bit involved.
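Before the nested case, a much smaller sketch may help show that the type argument of identity() comes from the context the call appears in rather than from any argument. The class and method names below are hypothetical and only serve the illustration.

```java
import java.util.function.Function;

// Hypothetical demo class, not part of the original question.
class TargetTypingDemo {
    static String describe() {
        // Same call, different target types: here T is inferred as String...
        Function<String, String> onStrings = Function.identity();
        // ...and here T is inferred as Integer.
        Function<Integer, Integer> onIntegers = Function.identity();
        return onStrings.apply("abc") + ":" + onIntegers.apply(42);
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "abc:42"
    }
}
```

In both assignments the only source of constraints on T is the declared type on the left-hand side.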

Informally, for the example in the question the reasoning would go roughly as follows:

We have an invocation of Function.<T>identity(). Because most type parameters are named T, and consistently with the JLS, I'll use Greek letters to denote inference variables. So in this initial expression T :: α. What constraints do we have on α?

Well, identity() returns an instance of Function<α,α>, which is used as the second argument (valueMapper) to toMap.

static <T,K,U> Collector<T,?,Map<K,U>>  toMap(Function<? super T,? extends K> keyMapper, 
   Function<? super T,? extends U> valueMapper)

So now we have the constraints {α :> T, α <: U} (where :> means "is a supertype of" and <: means "is a subtype of"), because identity() is passed as valueMapper. This now requires us to infer T and U in this expression, which we'll refer to as β and γ, so: {α :> β, α <: γ}. To avoid getting bogged down in details, let's work through β only.

toMap then returns a collector that is passed as the argument to Stream.collect, which provides us with another source of constraints:

collect(Collector<? super T,A,R> collector)

So now we know that {β :> T}. But here T also needs to be inferred, so it becomes an inference variable, and we have {β :> δ}.

This is where it starts unfolding, because the type parameter T for method collect refers to the parameter T in Stream<T>. So assuming the stream was defined as Stream<Person>, now we have {δ=Person} and we can reduce as follows:

  • {β :> δ} => {β :> Person} (β is a supertype of Person);
  • {α :> β} and {β :> Person} => {α :> Person} (α is a supertype of Person);
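The lower bound on its own can be checked by handing the compiler a supertype explicitly. The following is a minimal sketch, assuming a hypothetical Person class with an int-returning getID() (the question does not show its definition); it pins α to Object with a type witness, which still satisfies {α :> Person}.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original question.
class SupertypeWitnessDemo {
    // Hypothetical stand-in for the question's Person type.
    static final class Person {
        private final int id;
        Person(int id) { this.id = id; }
        int getID() { return id; }
    }

    static Map<Integer, Object> demo() {
        List<Person> people = Arrays.asList(new Person(1), new Person(2));
        // The witness <Object> fixes identity()'s T to Object, a supertype of Person.
        // The value type of the resulting map therefore becomes Object as well.
        return people.stream()
                .collect(Collectors.toMap(person -> person.getID(), Function.<Object>identity()));
    }

    public static void main(String[] args) {
        System.out.println(demo().size()); // prints 2
    }
}
```

This compiles only because the declared map value type is Object; with Map<Integer, Person> as the return type the same witness would be rejected.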

So through the process of inference we figured out that the type variable for Function.identity needs to be Person or a supertype of Person. A similar process for α <: γ would yield {α <: Person} (if the return type is specified). So we have two constraints:

  • α needs to be Person or a supertype of Person;
  • α needs to be Person or a subtype of Person;

Clearly the only type that satisfies all these constraints is Person.
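To make the result concrete, here is a sketch that writes out by hand the type arguments the reasoning above arrives at and compares them with the fully inferred call. The Person class, the Integer key type, and all names are assumptions made for the illustration; only getID() appears in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original question.
class IdentityInferenceDemo {
    // Hypothetical stand-in for the question's Person type.
    static final class Person {
        private final int id;
        Person(int id) { this.id = id; }
        int getID() { return id; }
    }

    static final List<Person> PEOPLE = Arrays.asList(new Person(1), new Person(2));

    // Everything is inferred: T = Person, K = Integer, U = Person, and identity()'s T = Person.
    static Map<Integer, Person> inferred() {
        return PEOPLE.stream()
                .collect(Collectors.toMap(person -> person.getID(), Function.identity()));
    }

    // The same call with the type arguments spelled out as explicit witnesses.
    static Map<Integer, Person> explicit() {
        return PEOPLE.stream()
                .collect(Collectors.<Person, Integer, Person>toMap(
                        person -> person.getID(), Function.<Person>identity()));
    }

    public static void main(String[] args) {
        System.out.println(inferred().equals(explicit())); // prints true
    }
}
```

The declared return type Map<Integer, Person> is what supplies the upper bound on α here; it plays the role of the "return type is specified" condition mentioned above.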
