Friday, November 3, 2017

Continuations and type systems

How do we get a programming language which combines all the type features we have seen? Linear types, dependent types, subtypes, continuations rather than functions, jumbo types,and so on?

First we must recall the purpose of a type system: at least at a low level, types are calling conventions. The type of a function, module, etc. is its interface, its API. A function may have multiple types and hence multiple calling conventions; but at runtime presumably only a few of these will be used. For efficiency reasons it makes sense to have only one true type and construct all the other types as wrappers, but this is by no means necessary.

When we have a C type, for example int main(int argc, char **argv), this is actually a very strong type; the C standard and implementation specify many details of memory layout, stack usage, etc. that are assumed to hold. The interface with the kernel is a cleaner type; execve simply loads the proper memory layout and jumps to the starting address. To represent this API we must use continuations instead of functions, as they never return a value. After looking at IL we might denote this as the type ¬kernel. Expanding the type kernel out it might look more like ¬(stack pointer -> {argc, argv, env}), at least on Linux. Continuations map nicely to the power of assembly language, where jumping to an unknown code segment gives up all control. If a C 'function' calls exit() it will never return. In IL, there is also a 0 type, the type of 'enterable locations'. From the rules the only values of type 0 look like (λ x.F) T, where F is another value of type 0, and the only values of type ¬T look like λ x.(E1 E2). It's not clear what this is useful for.

At a high level, though, the idea of contracts remains. We have properties of our programs that we wish to remain true, like memory safety. Linear types are one method of enforcing these; we reject programs that don't use resources linearly. Dependent types are another form of contract, allowing us to write pre- and post- conditions on values.

Following the IL paper, and its related cousin TACC, we can define types of continuations - roughly, the type of a continuation is the argument layout it expects to have. We extend our data types to the Jumbo Connectives of arbitrary records and sum types (the Pi type can be ignored, continuations are more powerful than functions as an intermediate language). And also we have to generalize negation to delimited continuations, following Zeilberger and a more recent paper. We can implement these on LLVM in two ways: first, by translating them away into standard function calls, and second by using a special calling convention, jump-with-argument.

How do we combine linear types with continuations? Following this paper we might introduce multiple types of continuations, e.g. the paper's lowered negation ↓¬ and boxed negation . Then linear function types translate as !¬(τ1 & ↓¬τ2) while normal function types translate as !¬(τ1 & !¬τ2). But this hides the Rig (semiring) structure of the context. Maybe instead we could use a numbered continuation, , where k is from our Rig. Then adding dependent types is easy, and subtypes are just polarizing all the information flow. Simple!

Papers by Appel (1, 2 , 3, 4) describe a closure strategy analysis, wherein known functions are compiled using registers but unknown functions use the heap. They do not use the stack because it inefficiently holds on to values even after they are not being used; deallocation would require making holes in the stack. But of course nothing in life is free; as soon as one gives up the stack, one needs it again for interfacing with other (C) code.

No comments:

Post a Comment