Swift: Associated Types


http://www.russbishop.net/swift-associated-types

 

Associated Types Series

Sometimes I think type theory is deliberately obtuse and all those functional programming hipster kids are just bullshitting as if they understood one word of it. Really? You have a 5,000 word blog post about [insert random type theory concept]? And it doesn't explain a) why anyone should care and b) what problems are solved by this wonderful concept's introduction? I want to tie you up in a sack, throw the sack in a river, and hurl the river into space.

Where was I? Oh yeah, Associated Types.

When I first saw Swift's implementation of generics, the use of Associated Types jumped out at me as strange.

In this post I'm going to work through the type theory and some practical considerations, mostly as an attempt to cement the concepts in my own mind. (If I make any mistakes, please let me know!)

Generics

If I want to abstract over a type (aka create a generic thingy) in Swift, I use this syntax for classes:

class Wat<T> { ... }

Similarly, a generic struct:

struct WatWat<T> { ... }

Or a generic enum:

enum GoodDaySir<T> { ... }

But if I want an abstract protocol:

protocol WellINever {
    typealias T
}

Huh?

Basics

Unlike classes, structs, and enums, protocols don't support generic type parameters. Instead they support abstract type members; in Swift terminology, associated types. Though you can accomplish a lot with either system, there are some benefits to associated types (and currently some drawbacks).

An associated type in a protocol says "I don't know what exact type this is; some concrete class/struct/enum that adopts me will fill in the details".

"Great!" you cry, "so how is that different from a type parameter?" Good question. Type parameters force everyone to know the types involved and specify them repeatedly (when you compose with them it can also lead to an explosion in the number of type parameters). They're part of the public interface. The code that uses the concrete thing (class/struct/enum) makes the decision about what types to select.

By contrast an associated type is part of the implementation detail. It's hidden, just like a class can hide its internal ivars. The abstract type member is for the concrete thing (class/struct/enum) to provide later. You select the actual type when you adopt the protocol, not when you instantiate the class/struct. It leaves control over which types to select in a different set of hands.
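To make the contrast concrete, here is a minimal sketch. It is written in current Swift, where an associated type in a protocol is declared with the `associatedtype` keyword (this post predates that change and uses `typealias`). The names `Box`, `Container`, and `IntStack` are made up for illustration.

```swift
// Type parameter: whoever *uses* Box picks T, every time they name the type.
struct Box<T> {
    var value: T
}

// Associated type: whoever *adopts* Container picks Item, once.
protocol Container {
    associatedtype Item
    mutating func add(_ item: Item)
    var count: Int { get }
}

// IntStack fixes Item to Int. Swift infers that from the signature of add(_:).
struct IntStack: Container {
    var items: [Int] = []
    mutating func add(_ item: Int) { items.append(item) }
    var count: Int { items.count }
}

let box = Box<String>(value: "hay")  // the type is chosen at the use site
var stack = IntStack()               // nothing to spell out here
stack.add(1)
stack.add(2)
```

Notice that code using `IntStack` never mentions `Item` at all, while code using `Box` has to know about `T`.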

Usefulness

Martin Odersky, creator of Scala, discusses an example in this interview. In Swift terms, without associated types if you have a base class or protocol Animal with a method eat(f:Food), then the class Cow has no way to specify that Food can only be Grass. You certainly can't do it by overloading the method: covariant parameter types (making a parameter more specific in a subclass) aren't supported in most languages and are unsafe anyway, since casting to the base class would let you feed in unexpected values.
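A small sketch of that hole, in current Swift syntax. The `Meat` type and the returned strings are invented for the example; the point is only that once a Cow is viewed as a plain Animal, the compiler has no grounds to reject the wrong food, so the check has to happen at run time.

```swift
protocol Food {}
struct Grass: Food {}
struct Meat: Food {}   // hypothetical second kind of food

class Animal {
    func eat(_ f: Food) -> String { "ate something" }
}

class Cow: Animal {
    // The override must keep accepting any Food; narrowing the parameter
    // to Grass is not allowed, so all we can do is check dynamically.
    override func eat(_ f: Food) -> String {
        guard f is Grass else { return "cow refuses" }
        return "cow ate grass"
    }
}

let animal: Animal = Cow()        // the upcast hides that this is a Cow
let result = animal.eat(Meat())   // compiles fine; only caught at run time
```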

If Swift protocols did support type parameters it might look like this:

protocol Food { }
class Grass : Food { }

protocol Animal<F:Food> {
    func eat(f:F)
}

class Cow : Animal<Grass> {
    func eat(f:Grass) { ... }
}

Great. What happens when you need to track more than just food?

protocol Animal<F:Food, S:Supplement> {
    func eat(f:F)
    func supplement(s:S)
}

class Cow : Animal<Grass, Salt> {
    func eat(f:Grass) { ... }
    func supplement(s:Salt) { ... }
}

The increasing number of type parameters is unfortunate but that's not our only problem. We're leaking the implementation details all over the place, requiring us to re-specify the type parameters repeatedly. The type of var c = Cow() is actually Cow<Grass,Salt>. A doCowThings function would be func doCowThings(c:Cow<Grass,Salt>). What if we want to work with all animals that eat grass? We have no way to express that we don't care about the Supplement type parameter either.

When we derive from Cow to create specific breeds, our class definitions are just idiotic: class Holstein<Food:Grass, Supplement:Salt> : Cow<Grass,Salt>.

Worse, how about a function to buy food and feed the animal: func buyFoodAndFeed<T,F where T:Animal<Food,Supplement>>(a:T, s:Store<F>). Besides being really ugly and verbose, we have no way to link the type of F to Food. If we rewrite the function definition we can work around that: func buyFoodAndFeed<F:Food,S:Supplement>(a:Animal<Food,Supplement>, s:Store<Food>). But it won't work anyway: Swift will complain that "'Grass' is not identical to 'Food'" when we try to pass a Cow<Grass,Salt>. Again, notice that this method doesn't care about the Supplement but it has to deal with it.

Now let's see how associated types help us:

protocol Animal {
    typealias EdibleFood
    typealias SupplementKind
    func eat(f:EdibleFood)
    func supplement(s:SupplementKind)
}
class Cow : Animal {
    func eat(f: Grass) { ... }
    func supplement(s: Salt) { ... }
}
class Holstein : Cow { ... }

func buyFoodAndFeed<T:Animal, S:StoreType
    where T.EdibleFood == S.FoodType>(a:T, s:S)
{ ... }

The type signatures are much cleaner now. Swift infers the associated types just by looking at Cow's method signatures. Our buyFoodAndFeed method can clearly express the requirement that the store sells the kind of food the animal eats. The fact that Cow requires a specific kind of food is an implementation detail of the Cow class, but that information is still known at compile time.
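For readers on a recent toolchain, here is a rough, runnable version of the same idea. It is a sketch under a few assumptions: it uses the current `associatedtype` keyword and a trailing `where` clause, the `StoreType` protocol and `GrassStore` are filled in hypothetically (the post never shows them), and the `meals` and `licks` counters exist only so there is something to observe.

```swift
struct Grass {}
struct Salt {}

protocol Animal {
    associatedtype EdibleFood
    associatedtype SupplementKind
    func eat(_ f: EdibleFood)
    func supplement(_ s: SupplementKind)
}

// Hypothetical store protocol: a store sells exactly one kind of food.
protocol StoreType {
    associatedtype FoodType
    func buyFood() -> FoodType
}

class Cow: Animal {
    var meals = 0
    var licks = 0
    // EdibleFood == Grass and SupplementKind == Salt are inferred here.
    func eat(_ f: Grass) { meals += 1 }
    func supplement(_ s: Salt) { licks += 1 }
}
class Holstein: Cow {}

struct GrassStore: StoreType {
    func buyFood() -> Grass { Grass() }
}

// The constraint says: the store must sell what the animal eats.
// SupplementKind never appears because this function doesn't care about it.
func buyFoodAndFeed<T: Animal, S: StoreType>(_ a: T, _ s: S)
    where T.EdibleFood == S.FoodType
{
    a.eat(s.buyFood())
}

let holstein = Holstein()
buyFoodAndFeed(holstein, GrassStore())
```

Passing a store whose `FoodType` is anything other than `Grass` should be rejected at compile time, which is the whole point.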

Getting Real

Enough with the animals for a minute; let's look at Swift's CollectionType.

Note: As an implementation detail, a number of Swift protocols have nested protocols with leading underscores; CollectionType -> _CollectionType or SequenceType -> _Sequence_Type -> _SequenceType. For brevity, I'm going to flatten that hierarchy when I talk about those protocols. So when I say that CollectionType has ItemType, IndexType, and GeneratorType associated types you won't find those on the CollectionType protocol itself.

Obviously we need the type of the elements T, but we also need the type of the index and the generator/enumerator so we can handle subscript(index:S) -> T { get } and func generate() -> G<T>. If we were just using type parameters, the only way a generic Collection protocol could work is by specifying T, S, and G in a hypothetical CollectionOf<T,S,G>.
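As an illustration, here is a tiny custom collection. Note the naming assumption: in current Swift the protocol is spelled `Collection` and the associated types are `Element`, `Index`, and `Iterator`, rather than the names used in this post. `Countdown` and `firstPair` are hypothetical.

```swift
// A collection that counts down from `start` to 0.
// Index and Element are both inferred as Int from the members below;
// the iterator type is left to the standard library's default.
struct Countdown: Collection {
    let start: Int
    var startIndex: Int { 0 }
    var endIndex: Int { start + 1 }
    func index(after i: Int) -> Int { i + 1 }
    subscript(position: Int) -> Int { start - position }
}

// Generic code reaches the associated types through the constraint,
// without any extra type parameters for index or element.
func firstPair<C: Collection>(_ c: C) -> (index: C.Index, element: C.Element)? {
    guard let element = c.first else { return nil }
    return (c.startIndex, element)
}

let values = Array(Countdown(start: 3))
let pair = firstPair(Countdown(start: 3))
```

The adopter supplies three types, yet `firstPair` only has one type parameter.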

What about other languages? C# doesn't have abstract type members. It handles this, firstly, by supporting only open-ended indexing: the type system says nothing about whether an index can only move in one direction, supports random access, etc. Numeric indexes are just integers and that's all the type system knows about them.

Secondly, for generators, IEnumerable<T> spits out an IEnumerator<T>. The difference seems very subtle at first, but the C# solution is using an interface (protocol) to indirectly abstract over the generator, allowing it to avoid having to specify the generator type as a parameter to IEnumerable<T>.

Swift aims to be a traditionally-compiled (non-VM, non-JIT) systems programming language so requiring that kind of dynamic behavior is not a great idea for performance. The compiler would really prefer to know the types of your index and generator so it can do fancy things like inlining and knowing how much memory to allocate. The only way that can happen is to run all these generics through the sausage grinder at compile time. If you force it to defer to runtime, that means indirection, boxing, and other such tricks which are nice when you need them but aren't free.
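The two styles can be put side by side in current Swift. This is only a sketch of the shape of the trade-off, not a benchmark; `total` and `totalErased` are made-up names, and `AnySequence` is the standard library's type-erased sequence wrapper.

```swift
// Static: S and its iterator type are known at compile time, so the
// compiler is free to specialize and inline this for each concrete S.
func total<S: Sequence>(_ s: S) -> Int where S.Element == Int {
    var sum = 0
    for x in s { sum += x }
    return sum
}

// Dynamic: AnySequence hides the concrete sequence behind a wrapper,
// so iteration goes through a layer of indirection.
func totalErased(_ s: AnySequence<Int>) -> Int {
    var sum = 0
    for x in s { sum += x }
    return sum
}

let direct = total([1, 2, 3])
let erased = totalErased(AnySequence([1, 2, 3]))
```

Both give the same answer; the difference is in how much the compiler knows while producing it.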

The Ugly Truth

There's a major "gotcha" with abstract type members: Swift won't actually let you use a protocol that has them as a variable or parameter type, because that would be useless. The only place you can use a protocol with associated types is as a generic constraint.

In our Animal example from earlier, it isn't safe to call Animal().eat because it just takes an abstract EdibleFood and we don't know what that might be.

In theory, the code below should work since the generic constraint on the function enforces that the animal eats what the store sells, but in practice I was seeing some EXC_BAD_ACCESS crashes when testing it so I'm not sure this scenario is fully-baked.

func buyFoodAndFeed<T:Animal,S:StoreType
    where T.EdibleFood == S.FoodType>(a:T, s:S) {
    a.eat(s.buyFood()) //crash!
}

The inability to use these kinds of protocols as parameters or variable types is the true kicker. It just requires jumping through far too many unnecessary hoops. This is an area where Swift can (and hopefully will) improve in the future. I want the ability to declare variables or types like this:

typealias GrassEatingAnimal =
    protocol<A:Animal where A.EdibleFood == Grass>

var x:GrassEatingAnimal = ...

Note: this use of typealias is actually creating a type alias, not an associated type in a protocol. Confusing, I know.

This syntax would let me declare a variable that holds some kind of Animal where the animal's associated EdibleFood is Grass. It might also be useful to allow this automatically if the associated types in the protocol are themselves constrained, but it seems like you could get into unsafe situations so that would require some more careful thought. One thing you'll run into if you do constrain the associated types in the protocol definition is that the compiler will not be able to satisfy those constraints on any generic method (see below).

Currently you have to work around this by creating a wrapper struct that "erases" the associated type, exchanging it for a type parameter. Fair warning: It's ugly.

struct SpecificAnimal<F,S> : Animal {
    let _eat:(f:F)->()
    let _supplement:(s:S)->()

    init<A:Animal where A.EdibleFood == F, A.SupplementKind == S>(var _ selfie:A) {
        _eat = { selfie.eat($0) }
        _supplement = { selfie.supplement($0) }
    }

    func eat(f:F) {
        _eat(f:f)
    }
    func supplement(s:S) {
        _supplement(s:s)
    }
}
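Here is roughly the same wrapper as it might be written in current Swift, as a sketch rather than a drop-in replacement. The assumptions: `associatedtype` instead of `typealias`, a trailing `where` clause on the initializer, and a hypothetical `Cow` with a `log` array so the forwarding is observable.

```swift
struct Grass {}
struct Salt {}

protocol Animal {
    associatedtype EdibleFood
    associatedtype SupplementKind
    func eat(_ f: EdibleFood)
    func supplement(_ s: SupplementKind)
}

final class Cow: Animal {
    var log: [String] = []
    func eat(_ f: Grass) { log.append("grass") }
    func supplement(_ s: Salt) { log.append("salt") }
}

// Trades the associated types for type parameters by capturing the
// wrapped animal's methods in closures. The concrete animal type A
// disappears from SpecificAnimal's own type.
struct SpecificAnimal<F, S>: Animal {
    private let _eat: (F) -> Void
    private let _supplement: (S) -> Void

    init<A: Animal>(_ animal: A) where A.EdibleFood == F, A.SupplementKind == S {
        _eat = { animal.eat($0) }
        _supplement = { animal.supplement($0) }
    }

    func eat(_ f: F) { _eat(f) }
    func supplement(_ s: S) { _supplement(s) }
}

// Now "some animal that eats grass and licks salt" is an ordinary type
// that can be stored in a variable or an array.
let cow = Cow()
let herd: [SpecificAnimal<Grass, Salt>] = [SpecificAnimal(cow)]
for animal in herd {
    animal.eat(Grass())
    animal.supplement(Salt())
}
```

The cost is the one described above: every call now goes through a stored closure instead of being resolved at compile time.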

If you ever wondered why Swift's standard library includes GeneratorOf<T>:GeneratorType, SequenceOf<T>:SequenceType, and SinkOf<T>:SinkType... now you know!

The bug I mentioned above is that if Animal specified typealias EdibleFood:Food then this struct can't be compiled even if you define it as struct SpecificAnimal<F:Food,S>:Animal. Swift will complain that F is not a Food even though the constraint on the struct clearly means that it is. Filed as rdar://19371678.

Wrap It Up

As we've seen, associated types allow the adopter of a protocol to provide multiple concrete types at compile time, without polluting the type definition with a bunch of type parameters. They're an interesting solution to the problem and a different kind of abstraction (abstract members) from generic type parameters (parameterization).

I do wonder if taking the Scala approach and simply supporting both type parameters and associated types for classes, structs, enums and protocols would be a better long-term approach. I haven't given it a lot of thought so there may be some major gotchas lurking. That's part of what is so exciting about a new language - watching it evolve and improve over time.

Now go forth and dazzle your colleagues and coworkers with fancy terms like Abstract Type Member. Then you too can lord it over them and render comprehension impossible.

Just stay away from sacks.

And rivers.

Not space. Space is awesome.
