Security types are a niche area of type theory research that lets a type system express what a computation is allowed to know. This has obvious applications to security and privacy, but security type theory has also been used in many other settings, such as binding-time analysis, garbage collection, non-termination with dependent types, and more.
The initial insight in the field is that information security is monadic. To illustrate this, consider a value \(v\) of type \(a\) that is secret. We'll use a functor \(T\) for secrecy, so we say that \(v\) has type \(Ta\). We can compute with \(v\) as long as the result of our computation is also secret: for \(f:a\to b\), we get \(Tf:Ta\to Tb\). That's all normal functor stuff, but here's the monadic part: if the result of our computation is already secret, we don't make it "double secret," that would be silly. So if we have \(f:a\to Tb\), then \(Tf:Ta\to T(Tb)\), and we use \(\mu_b:T(T(b))\to Tb\) to squash the nested secrecy. This doesn't reveal the secret, because the result is still secret: we have no way to go from \(Tb\) to \(b\). The other monadic piece is \(\eta_a:a\to Ta\), because we can always make something secret if we want to.
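Here's a minimal Haskell sketch of this secrecy monad. The names (`Secret`, `classify`, `join'`) are my own for illustration; the key point is that in a real library the `MkSecret` constructor would be hidden behind a module boundary, so no function \(Tb\to b\) is available to untrusted code.

```haskell
-- Sketch of a secrecy monad. In practice MkSecret would not be exported,
-- so there is no way to get from `Secret a` back to `a`.
newtype Secret a = MkSecret a

classify :: a -> Secret a                 -- eta_a : a -> T a
classify = MkSecret

instance Functor Secret where
  fmap f (MkSecret x) = MkSecret (f x)    -- T f : T a -> T b

join' :: Secret (Secret a) -> Secret a    -- mu_a : T (T a) -> T a
join' (MkSecret s) = s

instance Applicative Secret where
  pure = classify
  MkSecret f <*> MkSecret x = MkSecret (f x)

instance Monad Secret where
  s >>= f = join' (fmap f s)              -- bind = mu . T f
```

Note that `>>=` is exactly the "compute with a secret, get a secret" operation: it applies \(f:a\to Tb\) under the functor and then squashes with \(\mu\).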
DCC, the Dependency Core Calculus, was really the birth of this field, taking the first step with the \(T\) monad. It generalizes to a predetermined set of security levels, with a monad for each. The set of levels is a lattice, meaning that any two levels \(l_1\) and \(l_2\) have a least upper bound \(l_1\lor l_2\) and a greatest lower bound \(l_1\land l_2\). Furthermore, since it's a lattice, there's a partial order over the levels, that is, a reflexive, transitive, antisymmetric relation \(l_1\le l_2\). As you might expect, higher security levels can depend on information at lower security levels, but not vice versa.
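To make the lattice structure concrete, here's a sketch of a small hypothetical "diamond" lattice of my own invention (the level names are made up, not from DCC): `Public` at the bottom, `TopSecret` at the top, and two incomparable levels in between.

```haskell
-- A hypothetical four-point security lattice. Personnel and Finance are
-- incomparable: neither may see the other's data.
data Level = Public | Personnel | Finance | TopSecret
  deriving (Eq, Show)

-- The order l1 <= l2: information may flow from l1 up to l2.
flowsTo :: Level -> Level -> Bool
flowsTo a b | a == b = True       -- reflexive
flowsTo Public _     = True       -- Public is the bottom
flowsTo _ TopSecret  = True       -- TopSecret is the top
flowsTo _ _          = False

-- Least upper bound: the lowest level allowed to see both inputs.
lub :: Level -> Level -> Level
lub a b
  | a `flowsTo` b = b
  | b `flowsTo` a = a
  | otherwise     = TopSecret     -- e.g. Personnel \/ Finance

-- Greatest lower bound: the highest level both inputs may depend on.
glb :: Level -> Level -> Level
glb a b
  | a `flowsTo` b = a
  | b `flowsTo` a = b
  | otherwise     = Public        -- e.g. Personnel /\ Finance
```

The least upper bound is what you use when combining data: a value computed from both `Personnel` and `Finance` inputs must live at `lub Personnel Finance`, which is `TopSecret`.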
But what if we didn't make a secrecy functor \(T\)? What if values were secret by default, and there was a publicity functor \(P\)? In this case, \(P\) is a comonad: \(\varepsilon_a:Pa\to a\) means public values can be made secret, and \(\delta_a:Pa\to P(Pa)\) means that if something is public, then the public knows that it's public. This matters when we take a public value \(v:Pa\) and apply a function \(f:Pa\to b\) that returns a secret value: the input was already made public, so the public knows about it, and the result depends only on public information, so it's not really secret. We can therefore produce a value of type \(Pb\), via \(Pf\circ\delta_a:Pa\to Pb\), which proves that \(Pb\) can be deduced from \(Pa\) and \(Pa\to b\). You can think of this as saying that you can talk in secret about public information, but the conclusions you reach aren't really secret, because anyone could have reached them. These ideas are explored more in this paper.
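The dual sketch in Haskell, again with illustrative names (`Public`, `extract'`, `duplicate'`, `extend'`) rather than any particular library's API. As before, the constructor would be hidden in real code, so only trusted code could declare a value public:

```haskell
-- Sketch of a publicity comonad, dual to the secrecy monad. In practice
-- MkPublic would not be exported: only trusted code may publish values.
newtype Public a = MkPublic a

instance Functor Public where
  fmap f (MkPublic x) = MkPublic (f x)

-- epsilon_a : P a -> a. Forget that a value was public, treating it as
-- an ordinary (secret-by-default) value.
extract' :: Public a -> a
extract' (MkPublic x) = x

-- delta_a : P a -> P (P a). If something is public, then the fact that
-- it's public is itself public knowledge.
duplicate' :: Public a -> Public (Public a)
duplicate' p = MkPublic p

-- P f . delta : given f : P a -> b, the result depends only on public
-- information, so it is public too.
extend' :: (Public a -> b) -> Public a -> Public b
extend' f = fmap f . duplicate'
```

`extend'` is the "talking in secret about public information" operation: the function returns a bare `b`, but because its input was public, the conclusion comes out wrapped in `Public` anyway.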
Note that when I say "made secret" I don't mean redacted, I literally mean forgetting that it was public. The key idea of security type theory is that once something is shown to the public, the public can remember, use, and depend on that information potentially forever. So the monadic aspect keeps information in a secrecy monad to show that secret things can only lead to secret things, and the comonadic aspect lets you remember that public things are still public; they're two different ways of looking at the same information-flow behavior.
The very brilliant reader may have noticed that these rules parallel the rules of S4 modal logic. The secrecy monad is like the possibility modality (which is indeed a monad) and the publicity comonad is like the necessity modality (which is indeed a comonad). For example, from a proof of \(\Box p\Rightarrow q\) we can derive \(\Box p\Rightarrow\Box q\), because \(\Box p\Rightarrow\Box\Box p\) (axiom 4, the comonad's \(\delta\)) and necessitation plus axiom K turn the proof of \(\Box p\Rightarrow q\) into \(\Box\Box p\Rightarrow\Box q\). I have a blog post all about this here, and I think it's really cool!
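The parallel with \(Pf\circ\delta\) can be spelled out as a short S4 derivation, a sketch using the standard axiom names:

```latex
\begin{align*}
&\vdash \Box p \Rightarrow q
  && \text{given (a closed proof, like } f : Pa \to b\text{)}\\
&\vdash \Box(\Box p \Rightarrow q)
  && \text{necessitation}\\
&\vdash \Box\Box p \Rightarrow \Box q
  && \text{axiom K (this step is } Pf\text{)}\\
&\vdash \Box p \Rightarrow \Box\Box p
  && \text{axiom 4 (this step is } \delta\text{)}\\
&\vdash \Box p \Rightarrow \Box q
  && \text{composing the last two (} Pf \circ \delta\text{)}
\end{align*}
```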
One important perspective on security types is "indistinguishability," the idea that observers at one security level can't tell the difference between values at higher security levels. Denotational semantics for security types are often given from this perspective, with an equivalence relation on each type for each security level (for example in this paper). Stephanie Weirich has explored a dependent type theory whose notion of equality is based on this idea of indistinguishability.
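To illustrate the idea, here's a toy sketch (my own example, not from the papers mentioned): a store with one field per level, and a level-indexed equivalence relation saying two stores look the same to an observer who can only see the fields at or below their level.

```haskell
-- A toy two-level setting: Low observers see only lowField.
data Level = Low | High deriving (Eq, Show)

data Store = Store { lowField :: Int, highField :: Int } deriving Show

-- Two stores are indistinguishable at level l when they agree on every
-- field visible at l.
indist :: Level -> Store -> Store -> Bool
indist Low  s1 s2 = lowField s1 == lowField s2
indist High s1 s2 = lowField s1 == lowField s2
                 && highField s1 == highField s2

-- Noninterference, phrased with this relation: a well-typed program maps
-- Low-indistinguishable stores to Low-indistinguishable stores. This
-- program qualifies because it never reads highField into lowField.
okProgram :: Store -> Store
okProgram s = s { lowField = lowField s + 1 }
```

Noninterference proofs in this style show that every well-typed program preserves `indist l` for every level `l`, so secrets can never influence what a lower observer sees.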
Functional programmers typically use parametricity to constrain information flow, instead of security types. So there's a line of research papers that try to define the primitives of DCC in System F. The first success story was Noninterference For Free, by William J. Bowman and Amal Ahmed.
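To give a feel for why parametricity helps (this is just the underlying intuition, not the paper's actual translation): a function that must work uniformly at every type can't inspect its argument, so its output can't depend on a secret passed in polymorphically.

```haskell
{-# LANGUAGE RankNTypes #-}

-- An "attacker" must produce a Bool from a value of *any* type. By
-- parametricity, the only total functions of type `forall a. a -> Bool`
-- are the constant functions, so the answer cannot depend on the secret.
leakAttempt :: (forall a. a -> Bool) -> Int -> Bool
leakAttempt attacker secret = attacker secret

constTrue :: forall a. a -> Bool
constTrue _ = True   -- the best an attacker can do: ignore the input
```

The System F encodings of DCC's monad turn this observation into a real noninterference theorem, using type abstraction to keep secret data opaque in exactly this way.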
Building on that work, Simple Noninterference from Parametricity by Maximilian Algehed and Jean-Philippe Bernardy brought in ideas from the comonadic perspective to discover new primitives that simplified the translation to System F.
The other major direction of security type research is in calculi that add more cybersecurity-based guarantees. For example, FLAC by Owen Arden et al. builds on DCC by distinguishing between integrity (writing) and confidentiality (reading), and correctly handles changing security levels at runtime without leaking secret information, with a careful notion of declassification.
Here's a paper by Priyanka Mondal, Algehed, and Arden that extends the guarantees of FLAC to a distributed setting with replication, failure, and quorum consensus.
Another direction is dynamic taint tracking, which tags information at runtime and crashes the program if a secret is about to get leaked. I connect this idea to type theory and logic in the blog post I linked to above. Dynamic taint tracking is popular in industry because it can be added to existing languages to apply to existing codebases; for example, Amazon uses it in C++ and Java codebases right now.
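A toy version of taint tracking is easy to sketch (this is my own illustration, not any particular industrial system): values carry a taint bit at runtime, computation propagates the bit, and sinks refuse tainted data.

```haskell
-- Toy dynamic taint tracker. The taint bit travels with the value and
-- propagates through every computation that touches it.
data Tainted a = Tainted { taint :: Bool, payload :: a } deriving Show

source :: a -> Tainted a    -- data from a secret or untrusted source
source = Tainted True

trusted :: a -> Tainted a   -- data that is safe to release
trusted = Tainted False

-- Combining values propagates taint, like a least upper bound on a
-- two-point lattice: the result is tainted if either input was.
combine :: (a -> b -> c) -> Tainted a -> Tainted b -> Tainted c
combine f (Tainted t1 x) (Tainted t2 y) = Tainted (t1 || t2) (f x y)

-- A sink (think: a network write) crashes the program on tainted data.
sink :: Show a => Tainted a -> IO ()
sink (Tainted True  _) = error "taint violation: secret about to leak!"
sink (Tainted False x) = print x
```

The static approaches earlier in this post catch these violations at compile time; the runtime check here is the price of retrofitting the guarantee onto an existing untyped-for-security codebase.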
© 2024 Ryan Brewer.