
Security Fiasco: Why small is necessary but not sufficient for security

2014/12/23

“Small is beautiful” is nowhere more true than it is in security. The smaller a system’s trusted computing base (TCB), the more feasible it is to achieve a reasonable degree of security. The TCB is the part of the system that must be trusted to perform correctly in order to have any security in the system. It includes at least parts of the hardware (processor, memory, some devices) and the most privileged part of the software, i.e. the operating system (OS) kernel, and, for virtualised systems, the hypervisor.

This is one of the strongest justifications for microkernels: you want to absolutely minimise the OS part of the TCB, by using a kernel (the part of the OS running in privileged mode) that is kept as small as possible. Kernel size is minimised by only including the minimum functionality, i.e. fundamental mechanisms, and leaving implementation of other functionality, including all policy, to (unprivileged) user-mode software.

We have built the seL4 microkernel on this insight, and taken it to its logical conclusion, by mathematically proving the correctness (“bug freedom”) of the implementation, as well as general security properties. These proofs are critically enabled by seL4’s small size (about 10k SLOC).

But while (barring dramatic technological advances which are many years away) security requires small size, this doesn’t mean that a kernel is secure just because it’s a microkernel!

A case in point here is the Fiasco.OC microkernel (also called the L4Re Microkernel – Fiasco.OC is the kernel and L4Re is the userland programming environment built on top). Some promote Fiasco.OC as the core of a national security competency and advocate it as the basis for developing a national security infrastructure. Sounds like a good idea; after all, it’s a microkernel, right?

Turns out, it’s not. Turns out, Fiasco.OC (besides being not all that “micro”, weighing in at about 20–35k SLOC depending on configuration) isn’t secure after all, and most likely can’t be made secure without a comprehensive re-design.

Details are provided in a paper by researchers from the TU Berlin (and interestingly the lead author is a former member of the Fiasco.OC team). It shows that Fiasco.OC’s memory management functionality provides covert storage channels of significant bandwidth.

The mechanisms underlying those channels are the result of design flaws of Fiasco.OC’s memory management, which, on the surface, is driven by security needs. Specifically, Fiasco.OC provides an abstraction of per-partition memory pools. However, these are part of a global management structure that breaks isolation. In short: Fiasco.OC’s “partitioned” memory pools ain’t.
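
To make the mechanism concrete, here is a minimal user-level sketch of how such a channel could be exercised in principle. It assumes nothing about the actual Fiasco.OC/L4Re system-call interface: try_alloc(), release_all() and next_slot() are hypothetical stand-ins for whatever kernel-object allocation and synchronisation primitives a system offers. The point is only that a globally shared pool lets one partition’s allocations become observable from another.

/*
 * Sketch of a covert storage channel through a globally shared kernel
 * memory pool.  The extern functions are hypothetical stand-ins, NOT
 * real Fiasco.OC/L4Re APIs: try_alloc() allocates one kernel object and
 * fails once the shared pool is exhausted, release_all() frees what the
 * caller allocated, next_slot() waits for the next agreed signalling slot.
 */
#include <stdbool.h>

#define PROBES 1024                 /* allocation attempts per slot */

extern bool try_alloc(void);
extern void release_all(void);
extern void next_slot(void);

/* Sender (partition A): encode a 1 by draining the shared pool, a 0 by
 * leaving it alone. */
void send(const bool *bits, int n)
{
    for (int i = 0; i < n; i++) {
        if (bits[i])
            while (try_alloc())
                ;                   /* exhaust the pool */
        next_slot();
        release_all();              /* reset for the next bit */
    }
}

/* Receiver (partition B): probe how many allocations succeed; if most
 * fail, the sender has drained the pool, i.e. transmitted a 1. */
void receive(bool *bits, int n)
{
    for (int i = 0; i < n; i++) {
        int got = 0;
        for (int j = 0; j < PROBES; j++)
            got += try_alloc();
        release_all();
        bits[i] = (got < PROBES / 2);
        next_slot();
    }
}

If the pools were truly partitioned, the receiver’s allocations could never fail as a consequence of the sender’s behaviour and the channel would disappear; that is exactly the isolation property which, as the TUB paper shows, Fiasco.OC’s global management structure breaks.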

The deeper cause for this breakdown of security is that Fiasco.OC violates a core microkernel principle: that of policy-mechanism separation. The microkernel is supposed to provide policy-free mechanisms, with which policies are implemented at user level. In contrast, Fiasco.OC’s memory management encapsulates a fair amount of policy, and it is exactly those policies which the TUB researchers exploited.

This is a nice example demonstrating that deviating from the core principle will eventually cost you. Usually the cost comes in the form of restricting the generality of the system; here it comes in the form of a security breakdown. (Ok, for purists, that’s also a form of restricting generality: Fiasco.OC is restricted to use cases where security doesn’t matter much. Not very reassuring for those promoting it as a national security kernel!)

Given the size of Fiasco.OC (2–3 times that of seL4) and the lack of understanding of security issues that seems to have affected the design of its memory management, one must suspect that there are more skeletons hiding in the security closet.

This all is in stark contrast to seL4, which has been designed for isolation from the ground up, including radically eliminating all policy from memory management. This approach has enabled a mathematical proof of isolation in seL4: In a so-called non-interference proof that applies to the executable binary (not just a model of the kernel), folks in my group have shown that a properly-configured seL4 system enforces isolation. The proof completely rules out attacks of the sort the TUB researchers have levelled against Fiasco.OC!
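
For readers who want a feel for what such a proof states, the classical (Goguen-Meseguer) formulation of noninterference, a simplification of the intransitive variant actually proved for seL4, can be written as:

\forall \alpha, d:\quad \mathrm{out}_d\big(\mathrm{run}(s_0, \alpha)\big) = \mathrm{out}_d\big(\mathrm{run}(s_0, \mathrm{purge}_d(\alpha))\big)

Here \mathrm{run}(s_0, \alpha) executes the action sequence \alpha from the initial state s_0, \mathrm{out}_d(s) is what domain (partition) d can observe of state s, and \mathrm{purge}_d(\alpha) deletes from \alpha all actions of domains that are not allowed to influence d. In words: a partition’s view of the system is exactly the same whether or not the partitions that must not interfere with it do anything at all, which rules out any storage channel from them to it.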

To be fair, seL4’s security proofs only rule out storage channels, while leaving the potential for timing channels. This is, to a degree, unavoidable: while it is, in principle, possible to prevent storage channels completely (as has been done with seL4), it is considered impossible to completely prevent timing channels. At least seL4 has a (still incomplete) story there (see our recent paper on assessing and mitigating timing channels), with more work in progress, while Fiasco.OC has known unmitigated timing channels (in addition to the storage channels discussed above).
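
For contrast, here is a minimal sketch of the kind of channel such proofs do not rule out: a timing channel via cache contention. The buffer size, the 64-byte stride and the cycle threshold are illustrative assumptions, not measured values, and next_slot() is again a hypothetical synchronisation stand-in.

/*
 * Sketch (x86, gcc/clang) of a cache-contention timing channel: the
 * sender modulates cache pressure, the receiver measures how long its
 * own memory sweep takes.  No shared storage state is involved, only
 * time.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <x86intrin.h>               /* __rdtsc() */

#define BUF_SIZE (8 * 1024 * 1024)   /* assumed larger than the last-level cache */
static volatile uint8_t sender_buf[BUF_SIZE];
static volatile uint8_t receiver_buf[BUF_SIZE];

extern void next_slot(void);         /* wait for the next agreed time slot */

/* Sender: to transmit a 1, touch a large buffer and evict the
 * receiver's data from the shared cache; to transmit a 0, stay idle. */
void send_bit(bool bit)
{
    if (bit)
        for (size_t i = 0; i < BUF_SIZE; i += 64)
            sender_buf[i]++;
    next_slot();
}

/* Receiver: time a sweep over its own buffer; a slow sweep means the
 * sender evicted it, i.e. a 1 was sent. */
bool receive_bit(void)
{
    uint64_t start = __rdtsc();
    for (size_t i = 0; i < BUF_SIZE; i += 64)
        (void)receiver_buf[i];
    uint64_t cycles = __rdtsc() - start;
    next_slot();
    return cycles > 1000000;          /* threshold is an illustrative guess */
}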

Building security solutions on secure microkernel technology is good, and I am advocating it constantly. But it will only lead to a false sense of security if the underlying microkernel is inherently insecure. seL4, in contrast, is the only OS kernel with provable security. Use it as the base and you have at least a chance to succeed!

4 Comments
  1. Sebastian permalink

    … unfortunately this: https://www.linkedin.com/pulse/fiascooc—real-fiasco-jean-pierre-seifert?trk=prof-post marks a new low point. Especially coming from a guy who seemed to be involved in the Trust2Core project and, correct me if I am wrong, a while ago these very same people were kind of proud to get a BSI approval. Constructive critique (you) vs. (I don’t even know what to call that). Nothing I want to see around, and while I have many issues with Fiasco, this kind of behavior is way below my standards and, I hope, that of the L4 community as well.

  2. Hi Sebastian,

    Yes, I wouldn’t have written that blog you’re referring to (at least not like it’s been written), and thanks for calling my comments “constructive critique”.

    Also, there are significant issues with the paper that blog (as well as mine above) references, ranging from style (in places very unscientific) to an utterly unconvincing threat model. And I told the authors that. It is certainly not how I would write a paper.

    Nevertheless, the paper makes some serious points. The core issues raised are
    1) an availability (denial of service) attack
    2) a covert storage channel

    (2) is arguably unconvincing: The “new” covert channel, as presented in their threat model, is along an overt (although encrypted) channel (the VPN). That could be trivially exploited for sending covert information, without requiring any further channels. This is where the paper’s threat model breaks down. However, I suspect that you can develop a more convincing threat model where their identified channel breaks security.

    (1) is clearly serious: A DoS attack across security-partition boundaries is a clear breakdown of security.

    Gernot

    PS: I don’t know much about Trust2Core, but could well imagine that it was that work that led them to identify the security weaknesses in Fiasco.

  3. Tom permalink

    Hi Prof. Gernot,

    I recently discovered your blog, and reading your posts has given me great pleasure and much to think about. Thanks for your writing!

    I’ve got a question here. I understand that the seL4 kernel is small and verified. But I wonder: is the kernel equivalent to the TCB? According to the wiki page on the TCB, it can include hardware, firmware, the kernel and trusted processes (let’s ignore hardware and firmware for now). This definition makes sense to me because it seems that the security of a networked process depends not only on the kernel, but also on the trustworthiness of the network driver/stack, regardless of whether that network component is in the kernel or pushed out of it.

    So I guess my question is: is a verified small kernel really equivalent to a verified TCB of a system?

    Looking forward to your reply.
    Regards,
    Tom.

    • Hi Tom,

      Sorry for taking so long to approve the comment, it got buried during some hectic weeks.

      You’re right, the TCB is bigger than the microkernel, and each part of the TCB can, in principle, be attacked. Hardware is certainly a big headache, but it is other people’s domain; I can only encourage the hardware folks to work on verifying what they are doing, and to make the verification artefacts checkable. The same goes for firmware, although in principle it should be possible to remove that from the TCB.

      Having said that, we often refer to the software part only when talking about the TCB, as it is the “malleable” part of the system. But one should always remember that there are other bits too.
