Wednesday, February 21, 2007

Dependent types & security

During my last semester in grad school, before I left physics with a Master's, I took the graduate course on OS design. That was probably the second most fun class I ever took, second only because the professor was not an awesome old man we could call Yoda.

In any case, in this OS class I worked on a design project for a kernel written in Haskell that employs higher order types to create a static capability system. Beyond simply being a capability system, however, we wanted to encode actual secrecy and integrity levels into the type system as well and attach that data to all system resources. Of course, we ran into the limitations of even Haskell's Sexy Types(tm) and found that what we really wanted were true dependent types. Not having the time to examine the feasibility of a kernel design in Epigram, though what a kernel it would be, we ended up tagging system resources with phantom types to represent the capabilities themselves and stored the secrecy and integrity levels as actual data. We called it Matrioshka and I would love to build a real prototype on top of House in the not too distant future.

Now, the fun part is what happens when you try to express something like a capability in a dependently typed language. It turns out to be rather simple.

Let's take, as a simple example, capabilities for an arbitrary resource with only a secrecy level. This is going to be a bit of a half-baked example, because I'm still trying to digest Epigram, but hopefully it will be somewhat illustrative anyway.

So let's consider a process as just some type with an attached secrecy level. Then we should be able to evaluate whether or not we can use a read capability to a resource, and the bit of epigram code that follows allows just that.


data Nat : * where zero : Nat ; succ : Nat -> Nat
-------------------------------------------------
data Bool : * where false : Bool ; true : Bool
-------------------------------------------------
( x, y : Nat !
let !---------------!
! le x y : Bool )

le x y <= rec x
{ le x y <= case x
{ le zero y => true
le (succ x) y <= case y
{ le (succ x) zero => false
le (succ x') (succ x) => le x' x
}
}
}
---------------------------------------------------
inspect le (succ zero) zero => false : Bool
---------------------------------------------------
data Read : * where read : Read ; illit : Read
---------------------------------------------------
( n : Nat ; X : * ! ( x : X !
data !------------------! where !-----------------!
! Cap n X : * ) ! cap x : Cap n X )
---------------------------------------------------
( n : Nat !
data !------------! where (---------------!
! Proc n : * ) ! proc : Proc n )
----------------------------------------------------
( p : Proc n ; c : Cap m Read !
let !------------------------------!
! canRead _n _m p c : Bool )

canRead _ n _ m p c <= case c
{ canRead _ n _ m p (cap c) <= case c
{ canRead _ n _ m p (cap read) => le m n
canRead _ n _ m p (cap illit) => false
}
}
---------------------------------------------------


Now there might be a serious question as to why one would ever want to do this. After all, if you're just analyzing the secrecy & integrity levels at runtime anyway then why not just store everything as data and forget all these fancy-pants type systems?

There are two major reasons, in my opinion, of why you would want to handle all security at the type level: first, so that you can use the type system to help prove that the system is working correctly. Second, to improve performance!

Performance is important in an OS, and by using the type system we can save quite a bit of work. First, analysis of types should be in principle faster than actually accessing data stored in the capability itself. Second, it should be possible to designate a working set of capabilities that a process can name and access securely in accordance with the secrecy and integrity lattices. After this set has been calculated once, then the kernel will no longer have to interpose itself for security checks until the working set changes again.

Hopefully, with this potential performance gain we can have our cake and eat it too when it comes to using a functional language for the kernel design. Of course, these claims remain just that until a prototype is completed. As I already alluded, House is probably going to be the most feasible system in which to implement a prototype but it will likely be quite a bit of work.

1 comment:

Antoine said...

What is "House"? A kernel, I presume?