Skip to content

Dishonest and Contradictory: Fact-Checking CSIRO’s Communications about the Trustworthy Systems Group

A few months back I fact-checked the CSIRO CEO’s evidence to a Senate Estimates Committee hearing. At the time I promised more fact-checking of CSIRO’s communication about the abandoning of Trustworthy Systems (TS). Well, it has taken much longer than I hoped, mostly because I’ve been working 60-hour weeks to undo the damage caused by CSIRO to the TS group, and seL4 and its ecosystem. Six months down the track I’m winning, but it’s taking all my time and energy.

So, somewhat delayed, here we go…

CSIRO Statements

There are at least six ways in which CSIRO communicated about the dismantling of TS:

  1. Statements made by management to all Data61 staff
  2. Statements made by management/HR to TS staff
  3. Statements made by management to unions
  4. Statements made by CSIRO to the media
  5. Statements made by CSIRO to industry partners
  6. Statements made by the CSIRO CEO to the Australian Parliament

Let’s look at them in turn.

1. What CSIRO told all Data61 staff

Exactly 6 months ago, on 17 May 2021, CSIRO’s Data61 held an all-staff meeting where management informed staff about the new Data61 strategy. While there were some extremely vague statements about the new strategy earlier in the year, this was the first time things became concrete.

Some of the statements made there by management are (and I point out that none of the following are literal quotes, as I do not have access to a recording/transcript or the slides presented, only my sparse notes):

  1. Data61 will be doing “fewer, bigger things”
  2. Data61 will be all about AI
  3. Data61 will continue to do cybersecurity, but only
    1. cybersecurity for AI, and
    2. AI for cybersecurity
  4. Data61 will exit “less aligned areas”
  5. There will be “minimal impact” on university collaborations (so-called CRP agreements)

We’ll refer to those statements as S1.1–S1.5 in the following.

2. What CSIRO told TS staff

On 20 May, TS staff received email from CSIRO HR stating (among others):

In relation to the Trustworthy Systems Group, the review has identified that while this group has achieved impact in the past, the capability area is not aligned with the three new strategic goals of the Data61 Strategy and fewer and bigger things identified. In particular, we will be focusing on the intersection of AI and Cyber, not just any cybersecurity and related research.  

This literal quote barely qualifies as English, but let’s try to separate it into individual statements (referenced as S2.1–S2.3 below):

  1. the capability area is not aligned with the three new strategic goals of the Data61 Strategy
  2. [TS is not aligned with doing] fewer, bigger things
  3. [Data61] will be focusing on the intersection of AI and Cyber, not just any cybersecurity and related research

3. What CSIRO told unions

On 21 May, CSIRO management sent an Advice to Unions, stating (among others):

Data61 will pivot away from areas that lack critical mass, are not at the cutting edge scientifically, and/or do not have sufficient market demand.

Separating the sentence into individual claims S3.1–S3.3, we get:

  1. pivot away from areas that lack critical mass
  2. [pivot away from areas that are] not cutting edge scientifically
  3. [pivot away from areas that] do not have sufficient market demand

4. What CSIRO told the media

On 21 May, a CSIRO spokesperson was quoted as:

The Trustworthy Systems group is focused on the area of formal methods for design, implementation, and verification of software systems. It is [sic] mature area of technology that CSIRO has invested in over a number of years and is now well supported outside the organisation

We separate the claims into:

  1. [what TS is doing] is a mature area of technology
  2. [TS technology] is now well supported outside the organisation

5. What CSIRO told industry partners

An email sent to industry partners (which reached me at my UNSW address via third parties not encumbered by NDAs) said:

  1. Data61 is committed to deliver to our partners, such as [company], especially on current projects and any work-in-progress.

6. What CSIRO’s CEO told parliament

In my earlier blog I fact-checked the testimony Dr Marshall, CSIRO’s CEO, gave to the Australian Senate. The conclusion was that some of it was factually incorrect and relevant information was omitted (and my clarifications/corrections to Dr Mashall were not passed on the parliament). I won’t go through all of it again, but am extracting the bits that are relevant for contrasting other statements made by CSIRO:

  1. [TS technology is] very mature and it’s open source, so it’s difficult to see an opportunity to build an industry in Australia or to derive a national benefit from that technology.

Let’s examine these statements in turn

Statements to all staff

S1.1: “Data61 will do fewer, bigger things.”

TS was unique inside Data61 in tackling large problems with critical mass and a unique combination of skills. Our strategy explicitly said for years “we solve problems no-one else can” – and we did.

Assessment: This is exactly what TS is known for, so that’s not a reason for destroying TS, to the contrary.

S1.2 + S1.3: Data61 will be all about AI, plus cybersecurity as long as it’s for AI systems.

It’s obviously management’s prerogative to define what to include or exclude in their research strategy (if you can call a bunch of buzzwords a “strategy”). However, “cybersecurity for AI systems” would clearly include the work of TS. Here are a few proof points:

  • The DARPA HACMS and CASE programs explicitly use TS technology to protect autonomous systems from cyber-attacks.
  • seL4 Foundation member Ghost stated: “There is more research required into how to architect, construct and integrate an AI application on a trusted system of this complexity, and how we achieve this at scale. Investing in AI research without investing in trustworthy systems research will greatly diminish the impact and applicability of AI to real-world products.” Clearly, TS technology ensures cybersecurity of AI systems.
  • Another seL4 Foundation member, Horizon Robotics, said: “To address the challenges of safety, security and realtime in autonomous software, a fundamental high quality state-of-the-art microkernel is needed. We are looking forward to working with members of seL4 Foundation to build mixed-critical platform and solution for next-generation autonomous driving vehicles.” Again, TS technology providing cybersecurity for AI systems.

There are more examples of this. Clearly, if you’re serious about cybersecurity for AI systems, then TS’s is the go-to technology, so ruling TS out of scope on that base is dishonest or clueless.

But it gets worse. Data61 seems to keep investing in blockchain technology. There’s nothing inherently wrong with that, but how does it fit the criteria that supposedly rule TS research out of scope? Blockchain may be a popular buzzword, but it’s not AI. And even calling it cybersecurity for AI is a huge stretch. It’s much less obviously aligned with the Data61 “strategy” than TS technology is. The same can be said about Data6’s IoT-Cloud Security work.

Assessment: Ruling TS technology out – while ruling blockchain and IoT-Cloud security research in – based of the Data61 strategy is inconsistent and dishonest.

S1.4: Data61 will exit “less aligned areas”

As above: TS research is clearly aligned with “cybersecurity for AI systems”, and is better aligned than blockchain research and at least as well as “IoT-Cloud Security”, yet they “exit” TS but, apparently, not the others.

Assessment: inconsistent and dishonest.

S1.5: There will be “minimal impact” on university collaborations (CRPs)

Well, the TS CRP with UNSW was one of the biggest (if not the biggest). It was terminated early as a consequence of the “exiting” of TS. Do you call this “minimal impact”?

Assessment: Looks pretty dishonest to me.

Statements to TS staff

S2.1: [TS] is not aligned with the three new strategic goals of the Data61 Strategy

As explained above, TS is actually aligned with “cybersecurity for AI”, as independently evidenced by external sources. Yet, the clearly less-aligned blockchain area seems to continue.

Assessment: dishonest.

S2.2: [TS] is not aligned with “fewer, bigger things”

Assessment: debunked above (S1.1).

S2.3: we will be focusing on the intersection of AI and Cyber, not just any cybersecurity and related research

Assessment: debunked above (S1.3).

Statement to Unions

These were a number of “and/or” conditions, so notionally the combination was true if at least one of them was true. Let’s look at them one by one.

S3.1: exiting areas lacking critical mass

TS used to be the by far biggest group in Data61, there’s no doubt there was critical mass. By the time of the decision, this had been withered down to about 14 CSIRO full-time staff by not replacing departures and not renewing fixed-term contracts (leading to projects being understaffed and failing to deliver on time – partners getting screwed as a result). In addition there were about 6 FTE UNSW research and engineering staff. 20 FTE is still a pretty good group size, so this claim does not hold for TS.

S3.2: exiting areas “not cutting edge scientifically”

There is plenty of work on-going in Data61 that is nowhere near “cutting edge scientifically”, but TS wasn’t one of them. Just to give one example, our work on time protection as a principled means for completely eliminating information leakage through timing channels is without doubt cutting-edge, I’d even claim it’s peerless. Just one indication is that out of 4 publications in the 2 years prior to the decision to abandon TS, that work has won three best-paper awards. I challenge anyone to publicly state that this work is not cutting edge – they’ll open themselves up to ridicule. So, this claim also doesn’t hold for TS.

S3.3: exiting areas that do “not have sufficient market demand”

In the context of TS, this one is directly contradicted by CSIRO’s own public statements, specifically S4.2, as well as the strong uptake of seL4 and the growth of the seL4 Foundation in the past 6 months. So, this claim doesn’t hold for TS.

Assessment: neither of these individual claims hold for TS, and hence using them (even in combination) as a justification for “pivoting away” from TS research is dishonest.

Statements to the media

The astute reader will notice that the complete quote avoids associating TS with cybersecurity in any way. That in itself is interesting, given that the importance of cybersecurity is much more widely understood than that of software verification, even though you can’t have real cybersecurity without verification. This looks very much like spin.

But let’s look at the individual statements:

S4.1: TS technology is mature and doesn’t require on-going support

I wrote a whole blog debunking this nonsense. I’m not going to repeat it here, but the summary is while seL4 is mature enough to be deployed in the real world, there’s plenty of fundamental research work left on seL4 itself (such as the award-winning time protection work), and there is far more research left on how to achieve real-world trustworthy computer systems.

Assessment: Utterly wrong.

Statements to industry partners

S5.1: Data61 is committed to deliver to our partners

So, they claimed they are “committed” to delivering to customers, just as they decided to get rid of all the people who have the skills for such delivery? Making such a statement to partners who are handing over large sums to CSIRO in exchange for getting research done which CSIRO is now obviously incapable of delivering is just stunning. And management can’t pretend they didn’t know that, the uniqueness of TS’s skillsets are widely known in the organisation. It’s a shame none of the affected partners sued them, they would have had a good case I reckon.

What happened in the end is that one partner terminated their multi-million-dollar project (after having handed over the majority of that cash). The others we transferred to UNSW, where they are served by the remaining TS team. However, that possibility only existed because UNSW offered to fund the team to the end of the year – without that offer (which was made after CSIRO’s announcement, of which UNSW was given no prior warning) there would be no TS team left. So if CSIRO were to make any claims (which were at least implied in some of the statements) that all is good because we’re continuing at UNSW that would add to the dishonesty.

Assessment: Blatantly dishonest.

Statements to the Australian parliament

I already fact-checked Dr Marshall’s evidence to the Estimates Committee of the Australian Senate, and concluded that it omits important facts, and especially with regard to a senator’s question about work on protecting autonomous cars from cyber-attacks (of which CSIRO has none after demolishing TS). I pointed out that the claim of “difficult to see an opportunity to build an industry in Australia or to derive a national benefit from that technology” cannot be reconciled with the fact that there is a growing number of Australian companies whose business is based on TS technology. (And to my knowledge, there is no research remaining on Data61 that can make a similar claim.)

A further dis-proof of this claim is the Laot project, funded by Australian Defence, which deploys TS technology for protecting Australia’s critical infrastructure. And silly me thinks that would be in the national interest? Particularly after Australian Home Affairs Secretary Mike Pezullo’s warning about attacks on critical infrastructure?

Moreover, Dr Marshall failed to correct the record after I provided the information to him, so he has no place to hide.

Assessment: Omits important facts.

Comparing CSIRO statements

Amazingly, CSIRO’s statements aren’t even consistent with each other. Specifically, CSIRO makes claims that TS technology:

  • “does not have sufficient market demand” (S3.3);
  • “is now well supported outside the organisation” (S4.3);
  • “[no] opportunity to build an industry in Australia or to derive a national benefit” (S6.1).

I can’t see how to reconcile these statements with each other, especially “no market demand” on one side and at the same time “well-supported outside” (by industry!)

Assessment: contradictory.


The seemingly inescapable conclusion from the above is that CSIRO’s communications about TS are full of half-truths, untruths and contradictions. This is a pretty sorry state of affairs for our national science agency, and specifically its Data61 arm that is all about the future-critical information technology. Note that this is an organisation that lists “trusted” as one of its values. How can you trust an organisation that is so loose with the truth?

Unfortunately, this is consistent with the long and slow decline of CSIRO, resulting from budget cuts and political interference. I am actually amazed that there is still great science being done in parts of CSIRO, despite the oppressive culture (likened by some to the Soviet Union) created by management. Data61, despite all the propaganda about excellence etc, based on my observations (and that of others), is an organisation that fosters mediocrity and is hostile to excellence.

Four years ago, Data61’s then Chief Scientist Prof Bob Williamson FAA conducted an internal review of Data61 science. It identified 6 groups/teams as being top-class, TS being one of them. (Another one was the Legal Informatics team that got disbanded at the same time as TS.) Not much is left of those, the top people left or their teams got abolished. What is left is mostly mediocre.

The same review also identified areas that were sub-par. It was particularly scathing about one team:

This is a weak team that does not know it. Unfortunately they appear to have been favoured by the program director who has a very poor sense of scientific quality. The interview and documentation was disappointing. They are actually a reputational risk to the organisation. They should be disbanded. I doubt there is talent here that is worth retaining.

That team is continuing in Data61, and its leader has been given extra responsibilities. It’s all you need to know.

Data61 management is likely to counter this with the external “science review”, which contained a fair amount of uncomplimentary material, some of which was apparently aimed at TS, especially comments about living off the past. The problem is that we were supposed to talk only about impact, of which TS has plenty, but, of course, that’s inherently a rear-mirror view. Any reference to our on-going research and vision was censored from the group leader’s talk by management and comms people! Talk about setting world leaders up for failure!

My final conclusion is that Data61 is an institutionalised mediocrity and a waste of taxpayers’ money. Eventually, people will realise this and shut it down (or completely revamp it). While inevitable, this may take up to five years – plenty more tax dollars will be wasted until then. Depressing prospects for the tax payer.

But on the bright side: TS is over the hill and is regrowing its strength. We aren’t done yet with producing excellence and impact – stay tuned!

Where is seL4 Heading?

Following CSIRO’s abandoning of Trustworthy Systems (TS) and the seL4 technology TS developed, the seL4 Foundation has seen a significant increase in Foundation membership, including a number of new Premium Members that get a seat on the Board. Among others, this has increased the Foundation’s budget ten-fold – a great shot in the arm for the Foundation and its role of growing the seL4 ecosystem.

Does that mean a change of control of seL4?

This is a concern I have come across, mostly from people who do not (yet) understand how an open-source community such as seL4 works. (And I have been informed that people behind some of the competing, yet inferior, technologies out there are actually feeding this concern; classical FUD as it has been used against open source for decades. If someone presents you with this FUD then please point them here.)

In short, there’s no basis to this concern.

The way the seL4 community operates has not changed since we set up the seL4 Foundation in April 2020 (other than having overcome much growing pain). We have a clearly-defined and open governance structure which consists of the seL4 Board (responsible mainly for Foundation finances and recruitment) and the Technical Steering Committee (TSC) for technical leadership. I’ll explain these below, after addressing some specific concerns people have raised.

Could some organisation undermine seL4’s security?

In a word: No.

To start with, like in any other functioning open-source project, contributions to seL4 are made openly, via pull requests on the public Github, and anyone subscribed sees each of them. Any pull request is carefully vetted and discussed publicly. It cannot be merged until it passes regression tests (both traditional testing and proofs) and is approved by at least one committer, a trusted expert blessed by the TSC. You can’t buy committer status, you can only earn it, by earning the trust of the TSC. And only after this approval, the PR can be merged by one of the committers. So far, so standard for open source.

However, seL4 is special, as we all know. And what makes it special is its formal (mathematical), machine-checked proofs. With other projects, regression tests are just that: tests. They exercise the software on carefully chosen inputs and the outputs are compared to expectations. As all testing, this is inherently incomplete and cannot fully protect against bugs (or malicious code) being introduced.

The seL4 proofs are different: They are complete in a strong sense; code that is proved correct against a specification cannot have bugs as far as the specification is concerned. And by re-running proofs before merging changes, we can be certain that the code is still bug-free. That’s the really unique thing about seL4: even if the reviewers and committers fail to spot a bug that’s been introduced, the proofs will catch it (by failing).

The proofs do not yet cover all aspects of seL4, for example the 64-bit Arm version is not yet verified, multicore operation is not yet verified, nor is the boot code, cache management, and the very small amount of assembler code. Hence there is still potential for bugs in those unverified bits (and occasionally we find some). But the unverified configurations share much code with verified ones, and overall the kernel is small, so that the risk is far reduced compared to almost all other software. Deliberately introducing a vulnerability even into those parts would be extremely hard. And it would be detected as soon as that code gets verified. But, as we always point out, when running an unverified configuration you have no guarantee. A good reason to work together to raise funds for verifying those missing parts!

Also, to repeat, the contribution process has not changed since we set up the Foundation. Members cannot buy committer status, and they certainly cannot buy changes to the code.

How about Board membership?

According to the seL4 Foundation Fund Charter, the seL4 Board manages the seL4 Foundation Fund, and the purpose of the seL4 Foundation Fund is “to raise, budget and spend funds in support of the seL4 technical project carried out by [the seL4 Foundation]”. In other words, the Board’s role is to raise money and decide how to spend it for the benefit of seL4. It has no say on technical decisions (other than by prioritising specific activities through directing money there).

The Charter also stipulates that the Board “will strive to make decisions based on consensus.” This is real: We are yet to experience a case where a Board decision was not unanimous (other than members abstaining for conflict-of-interest reasons). In fact, our experience is that Board members very much trust the seL4 Founders (June, Gerwin and myself, who have guaranteed Board seats for five years renewable). A diverse Board representing a breadth of stakeholders yet unified in trust is a great asset for the seL4 community!

What if that trust broke down?

A single rogue Board member cannot do much damage (other than making the Board’s life harder), and the fixed representation of the founders acts as a further stabiliser. If a significant number of Board members tried to pull into a different direction, then that would be a problem, but would be a reflection of serious issues in the seL4 community as a whole. I do not have any concerns in this respect. Our Board members, whether elected by members or representing Premium Members share a common interest: they are there because they are vested into the technology; their companies are developing their products around seL4. Their interest is to see seL4 flourish, not undermined.

How about the TSC?

The role of the Technical Steering Committee is defined by the seL4 Foundation Technical Charter. Its role is defining the rules governing development of seL4 and the core open-source technologies around it, enable and foster technical collaboration and provide technical oversight.

Who is on the TSC?

Its members are the original committers, plus further trusted contributors the TSC decides to invite. So it’s really the trusted technical leadership – you can’t buy your way in there. (Note that this is different from, e.g., RISC-V International; Premium Members get representation on the RISC-V TSC.)

The seL4 TSC also blesses new reviewers and committers, ensuring that only people who have earned the trust of the community can authorise changes to the mainline code (but never on their own).

Will CSIRO’s abandonment slow seL4 development?

This is an obvious concern, and there was a very serious risk that this could happen.

Financially, the impact is minimal. In CSIRO we had some base funding that helped us maintain the technology. While this has disappeared, it was quickly picked up by the community. Community contributions have increased massively since we moved out of CSIRO, and the Foundation membership has grown (increasing its budget ten-fold). Those two developments together are easily making up for the loss. And all major developments have been externally funded ever since TS became part of CSIRO. CSIRO’s commitment to seL4 was never more than superficial.

The biggest risk right after CSIRO’s move was to lose most of TS’s people. Mountains of money are no use if you don’t have the skilled people to spend it on. And that risk was very real: After word of CSIRO’s actions got out, many people had job offers within days. Had not UNSW, within a day(!), stepped up to support the team for half a year, the result might well have been catastrophic for seL4.

Thanks to UNSW, that has been avoided. We still lost many people, but most stayed in the community and keep contributing as part of their new job. This is healthy, as it broadens the community and helps it mature to the point where it is no longer dependent on a single organisation. And UNSW gave us the buffer we needed to line up new funding to continue the TS team and grow it back. We’re in the middle of this, and it’s going well – you’ll hear some announcements soon. For now I can just say how extremely grateful we all are to Aaron Quigley, Head of School of Computer Science & Engineering at UNSW, for moving quickly and decisively to protect TS and seL4.

How will we fund major projects?

There are a number of major projects we’re looking to fund, mostly the “big ticket” verification projects (AArch64 and multicore), some medium-sized verification projects (completing the RISC-V verification story and completing verification of the MCS kernel), as well as a number of systems projects (such as multicore VMM, device virtualisation, an actual seL4-based OS). We need external funding for those, and that would not have been different with CSIRO.

But we’re in a much stronger position now. The community has realised that these things won’t happen on their own, they need to contribute. And the community, and especially the Foundation, has grown, which helps. And we have a much stronger (and growing) ecosystem of companies providing services around seL4, which itself strengthens the ecosystem and will lead to further investment. One of these companies is, of course, Proofcraft, the provider of verification services founded by the leaders of the seL4 verification.

The future

Personally I’m much more bullish about seL4’s future than I was a year ago (and those who heard me talk about seL4 then will understand that this means something!) CSIRO was, in many ways, a weight that slowed us down. When I bring money into UNSW, I remain in control of the funds, and can hire as needed. At CSIRO, in contrast, we were always at the mercy of management, much of which are people who have never achieved anything of scientific significance in their life, but think they know better than those who have – definitely a poor fit.

Since the divorce, things have been moving faster in the right direction than I anticipated, and by its anniversary we will be in a far stronger position than before. You’ll see some indications of that in the near future.

Stay tuned, stay calm, and trust your kernel (and the team behind it!)

seL4 Integrity Enforcement Proved for RISC-V

Great news: Ryan Barry from the Trustworthy Systems verification team has just completed the access-control proofs of seL4.

What does this mean?

In more detail: the proof shows that seL4 will only allow a thread to access an object or memory resource if the access is explicitly authorised by a capability. Specifically, user code cannot write to memory for which it does not hold a write capability (nor will the kernel perform such a write on the user’s behalf).

This establishes the critical integrity property: A process cannot overwrite another process’s memory without explicit authorisation. In other words: user processes on seL4 are strongly isolated and cannot interfere with each other.

But it means more. As the proof guarantees that there is no access to objects or memory resources unless explicitly authorised, it also implies availability of the memory resource: A process cannot interfere with another’s resource access.

The integrity proof does not talk about read accesses directly, but it does predict which user threads can at most have read access to which memory regions.  This is a very useful property, even if it stops short of the stronger notion of confidentiality (the third of the classical “CIA properties” of security). This is because preventing read access is not sufficient for preventing leakage of secrets.

Where does that leave us?

proof chain
seL4 proof chain for RISC-V.

We had previously proved confidentiality (including freedom from covert storage channels) for the 32-bit Armv7 architecture. For RISC-V, this final security proof still needs to be done (and we’re working on it).

However, we already have by far the most comprehensive verification story of any OS for RISC-V, and really for any OS for a 64-bit architecture. Specifically, we now have for RISC-V:

  • Proof of functional correctness, meaning that the C implementation is proved to conform to the specification and, as such, is free of bugs in a very strong sense;
  • Proof of translation correctness, meaning that the binary code produced by the compiler and linker is correct. This extends functional correctness to the executable binary;
  • And now proof of integrity and availability enforcement, or, more general, that the kernel enforces the access-control model. Because of the other proofs, we know that this property, proved about the formal specification of the kernel, applies to the actual kernel executable.

This degree of assurance is only surpassed by the proof chains of seL4 for Armv7. It means we are getting close to RISC-V becoming the best-supported architecture for seL4.

Where does this leave Arm?

When we did the original functional correctness proof of seL4 12 years ago, Armv7 didn’t even exist, we did it for Armv6 and later adapted to v7. 32-bit Arm was then the only architecture with a verified kernel. By now, of course, 32-bit processors are not exactly hot any more, the world of mobile devices which Arm dominates has long moved to 64-bit Armv8 processors. So there is now an OS with an unparalleled verification story for an obsolescent version of the architecture.

I would encourage Arm, as well as major users of Arm processors, to consider this situation, where they are effectively being overtaken by RISC-V.

We would love to talk to you about rectifying this. There are plenty of major players with a strong interest in security on Arm processors. Each of them alone (including Arm itself) could easily afford funding the verification of seL4 on AArch64, but if a handful of them get together, the cost to each becomes a fraction of their marketing budget.

Think about it!

There’s more to it, Dr Marshall!

On 3 June 2021, Dr Larry Marshall, CEO of CSIRO, gave evidence in an Estimates hearing in the Australian Senate. The official transcript is available from the Australian Parliament’s web site. Senator Chisholm asked a number of questions regarding CSIRO’s decision to abandon the Trustworthy Systems Group (TS), one of them Dr Marshall took on notice. The answer was published last week (Question 71), so I thought I might do a little bit of fact checking.

Dr Marshall’s oral evidence

Referring to an article in InnovationAus, Sen Chisholm asked about Data61’s job cuts in general, and then starts homing in on TS:

Why has CSIRO announced it will discontinue funding for the world-leading Trustworthy Systems team?

To which Dr Marshall replied:

I thought that was what you were referring to earlier. This is the team that developed the seL4 kernel, which is a microkernel that’s very good for security of mobile phones. That breakthrough was made back in the early 2000s by NICTA and UNSW. The team that you’re referring to is actually on campus at UNSW in Kensington. Unfortunately, that technology was licensed to I think Qualcomm—don’t hold me to that—for a one-time fee. I say ‘unfortunately’ because that technology has gone through two billion mobile devices. Unfortunately, there was no ongoing royalty arrangement with the deal that was done at that time. NICTA did spin out a company to try to commercialise that technology—OK Labs—but it closed its doors in 2012. The NICTA team continued to develop the seL4 kernel from about 2009. When we acquired NICTA we continued to work with them.

Not everything here is correct. In particular, seL4 has no more to do with mobile phones than with autonomous aircraft, autonomous cars, medical devices, critical infrastructure protection, defence systems – it’s used in all of these. Dr Marshall confuses seL4 with the earlier L4 kernel that became OKL4, which we last touched in research in 2006, and which ended up on billions of Qualcomm chips and iOS devices.

And the comment about a breakthrough in the early 2000s? This sounds to me like he’s trying to say “these guys did some great work 20 years ago but not much since.” Such an implication would be blatantly wrong. Our first breakthrough was the verification of seL4 in 2009, the first proof of implementation correctness of an operating system, something people had been trying for 30 years. But that was not the end of it: We followed it up with further world-firsts: proving that seL4 enforces integrity and availability, later proving that it enforces confidentiality, extending the correctness proofs to the binary code (no longer having to trust the compiler), and the first sound and complete worst-case execution-time analysis of a protected-mode operating system. And our latest flagship project, time protection, presents a principled, systematic approach for preventing information leakage through timing channels – a serious real-world problem almost everyone puts into the too-hard basket or tries to combat with patches and workarounds. While only at the beginning, this project has already won three Best Paper awards. I’m yet to see any project in Data61 that is comparable in ambition and potential real-world impact, yet credible given the track record of the people behind it. So any implication that we’re resting on our laurels is just plain BS.

But maybe I’m just paranoid and he didn’t mean to say what can be read into his statement. Besides, we can’t reasonably expect Dr Marshall to be on top of all the detail of what’s going on in Data61, so I’m giving him the benefit of the doubt.

Dr Marshall continues:

There is a challenge with that technology. It’s very mature and it’s open source, so it’s difficult to see an opportunity to build an industry in Australia or to derive a national benefit from that technology. Given our priority is artificial intelligence, we chose to pursue that and focus our resources where we thought we could derive greater national benefit.

Very mature? Well, it’s mature enough to be deployed in real-world systems, but that does not mean there is no research left – to the contrary, as I had pointed out in a recent blog. It’s as if solid-state electronics was considered “mature” and no longer in need of research once the first transistor radio shipped. We wouldn’t have laptops or mobile phones with that attitude.

The argument that because it’s open source “it’s difficult to see an opportunity to build an industry in Australia or to derive a national benefit from that technology” is outright bogus. We’re building up an ecosystem around seL4 that has already delivered a lot of economic impact, including multiple new companies, development for Australian defence, and projects using seL4 to protect Australia’s critical infrastructure. In fact, CSIRO’s abandoning of seL4 is particularly stunning given that less than two weeks earlier, Home Affairs Department secretary Mike Pezzullo was quoted in the media with truly alarming statements about threats to Australia’s critical infrastructure. seL4 is the best foundation for providing strong protections here – how can investing in it not further the national benefit? I find that statement truly mind-boggling. Can CSIRO management really be that oblivious to the power of its own technology?

Sen Chisholm:

Are there any outstanding contracts requiring the ongoing services of that team?

Dr Marshall:

That team has a number of contracts, which is good, because it made it easier for UNSW to work with them based on that external revenue. It would be great if they continued to do that and even better if they were able to figure out how to create a company around that. That would be a great outcome. Our conclusion was that that’s not really feasible in Australia, which is why we chose to discontinue the work.

Now this is where we’re getting into dangerous territory as far as the truth is concerned. For one, Data61 told contract partners:

Data61 is committed to deliver to our partners, such as [company], especially on current projects and any work-in-progress. [1]

Note, this was before UNSW stepped in to rescue the team! Without UNSW’s action, the team would be gone by now, and the delivery on the contracts simply impossible, as CSIRO has no-one else familiar enough with the technology. Which raises an interesting legal question: How can a team be declared redundant, when it is needed to deliver on commitments to funding partners? But I digress.

It is worth noting that Dr Marshall offers two reasons why Trustworthy Systems and the seL4 technology was abandoned: (1) it’s not AI (true, but see below…) and (2) you can’t turn it into a startup. The second one is rather curious: if the ability to become a startup is a necessary condition for research in Data61, then they’ll have shut down most of the joint. While TS does not have to be shy about its economic impact, there’s not a lot in Data61 that can match it. But then, measuring economic impact isn’t easy and I’m not an economist. I’ll leave this point for another day.

However, Sen Chisholm homes in on reason (1):

Okay. AI has been identified as a sector that is going to be really important in autonomous cars and so forth. What funding is being directed to ensuring that we have a system that’s safe from hackers and is robust, I suppose, in its use?

CSIRO takes this question on notice.

CSIRO’s written answer

CSIRO’s response has just been published. I quote it in full:

Artificial intelligence (AI) and cybersecurity are growing areas of investment and collaboration for CSIRO, with a specific focus on the use of AI to protect against cyber-attacks as well as mitigating against new forms of AI-enabled cyber-attacks.

Work to ensure security of systems is an integral part of larger CSIRO projects. Thus, it is not possible to provide accurate funding figures for this specific area. CSIRO projects of relevance include:

1) cutting edge machine learning and AI technologies to generate realistic computer systems and assets for the purposes of deceiving intruders

2) advanced AI-driven anti-phishing techniques to defend systems from social engineering or advanced AI-driven phishing attacks

3) technologies for detecting vulnerabilities in AI systems and software applications, and corresponding countermeasures

4) advanced technologies to make security solutions usable/human-centric, combining the benefits of AI and human intelligence

5) advanced AI-driven solutions for preserving data privacy

6) advanced methods to detect fake news and protect social media users using AI

7) advanced techniques to increase the trustworthiness of AI systems

8) AI and cybersecurity projects in partnership with Defence Science and Technology under the Next Generation Technologies Fund to realise new capabilities to support Australia’s Defence.

[Note that CSIRO’s response has a list of bullet-point items here, I numbered them for easier reference.]

The astute reader will notice that this does not actually answer Sen Chisholm’s question at all when he talks about protecting autonomous cars! At best items (3) and (7) are loosely related, the others are utterly irrelevant to the question.

How does this relate to seL4?

What Dr Marshall did not say is that seL4 is not only obviously useful for protecting AI systems, but is actually being deployed in many places for exactly that purpose. I give a number of examples.

DARPA: In the DARPA HACMS program we worked with project partners on protecting a military autonomous helicopter from cyber attacks – successfully. We are presently participating in DARPA’s successor program CASE, which is about scaling up the technology and is still centered around autonomous aircraft. And just for the record (so Dr Marshall cannot claim that this is about aircraft, while Sen Chisholm asked about cars): In HACMS, seL4 also secured autonomous army trucks – successfully.

Autonomous passenger cars: But it’s not just the military, the automotive industry is also recognising that seL4 is the best technology for protecting autonomous vehicles from cyber attacks.

  • In my previous blog I quoted Dr Daniel Potts of of autonomous driving company Ghost Locomotion. Ghost is a member of the seL4 Foundation, because they think that the seL4 technology is required for keeping the AI system secure. “Investing in AI research without investing [in] trustworthy systems research will greatly diminish the impact and applicability of AI to real-world products“, he said.
  • seL4 Foundation member Li Auto, makers of premium electric cars, say: “Li Auto’s team will develop a secure highly extensible real-time open platform for next generation self-driving vehicles based on the micro-kernel OS seL4. The platform will enable an ecosystem for third party application developers.
  • seL4 Foundation member NIO, also a premium electric-car maker, says: “The Digital Systems department at NIO is missioned to develop the most advanced software platform for the next-generation autonomous driving vehicles in the industry from the ground up. This platform is internally named NIO Vehicle Operating System (NVOS) and based off seL4.
  • And Horizon Robotics, also a member of the seL4 Foundation, calls themselves a “a global leader in the development of artificial intelligence computing platforms.” And they go on to state: “To address the challenges of safety, security and realtime in autonomous software, a fundamental high quality state-of-the-art microkernel is needed. We are looking forward to working with members of seL4 Foundation to build mixed-critical platform and solution for next-generation autonomous driving vehicles.

The last three quotes can all be found on the seL4 news page.

So, when it’s about protecting AI systems from cyber-attacks, industry goes for seL4. The very technology CSIRO thinks is not relevant to their agenda.

An honest mistake?

Maybe Dr Marshall didn’t know that when he gave his evidence to the Senate?

Quite possible he didn’t know at the time he appeared before the Senate, he cannot be expected to know everything that’s going on in CSIRO. But he definitely knew by the time he signed off on CSIRO’s answer to the question on notice.

How do I know?

I told him. In an email to Dr Marshall I said:

I can help you with [the answer to Sen Chisholm’s question]: this funding has just been dramatically slashed. seL4 and the rest of the Trustworthy Systems (TS) technology is proven to protect autonomous vehicles from cyber attacks, ever since the DARPA HACMS program demonstrated four years it could thwart such attacks against an autonomous helicopter and autonomous army trucks (see Since then, TS technology is being designed into military as well as civilian autonomous vehicles. (Besides a lot of other uses, such as secure communication devices, including those built by Canberra-based Penten and in daily use in the Australian Army.)

[Note that the link in the quote may no longer work by the time you click on it, due to our web site migrating to UNSW. The new link is]

Dr Marshall acknowledged that email. So he cannot pretend he didn’t know.

In the same email I also challenged his comments on startups:

I also found your claim that TS couldn’t be turned into a startup to be quite misleading, given that it has already generated multiple startups, the latest only a few weeks ago.

I’m looking forward to reading your explanation/clarification in the Hansard, and am happy to provide more detail as needed.

He obviously passed on the opportunity to set the facts straight.


There’s clearly more to the story than Dr Marshall and CSIRO said in their answers to Sen Chisholm.

Does that constitute misleading parliament? I don’t know, I’m not a lawyer. But I wonder what Sen Chisholm thinks.

And I never thought I had to fact-check our national science agency. Not fact-checking the science, “just” the management. Still, truly saddening.

Stay tuned for more fact-checking of CSIRO’s communication about the abandoning of Trustworthy Systems.

[1] Just so no-one can claim I am revealing sensitive CSIRO internals: This information about CSIRO promises made to project partners came to me from third parties to my UNSW address, unencumbered by non-disclosure agreements.

“Trustworthy Systems Research is Done” – Are You Kidding, CSIRO?

CSIRO, Australia’s national research agency, has just decided to disband the Trustworthy Systems (TS) team, the creators of seL4, the world’s first operating system (OS) kernel proved correct and secure. TS is widely regarded and admired as the leader in the use of formal methods (mathematical proof techniques) to real-world software systems, and arguably the team that put CSIRO’s Data61 on the map internationally.


Why would they cut down their shining example of research excellence, with a rare track record coming up with fundamental solutions to real problems, and taking those solutions to the real world?

One of the reasons given by CSIRO is that “seL4 [is] now a mature technology that is ‘well supported’ outside the organisation’.”

This claim, that seL4 is a “mature technology” that needs no more research and has sufficient support is stunning on so many levels. For one, the group is not accidentally called “Trustworthy Systems” (and not, say, the “seL4 Research Group”). seL4 is only the starting point for achieving trustworthiness in computer systems. It’s as if over 100 years ago people said combustion engines are a solved problem once it was shown they could power a car.

Fact is that, while seL4 is mature enough to be deployed in the real world, there’s plenty of fundamental research work left on seL4 itself, and there is far more research left on how to achieve real-world trustworthy computer systems. It’s not that just sprinkling a bit of seL4 fairy dust over a system will make it trustworthy. More on both points below.

In this context it’s interesting to note that the Head of Australia’s Department of Home Affairs warns that the threat of cyber attacks to Australia’s critical infrastructure is “immediate”, “realistic” and “credible”, and could take down the nation’s electricity network, just days after we learn that CSIRO shuts down the research that specifically aims to stop such attacks (and is arguably the best approach to achieving such protection). Great timing.

Work to do: seL4

It is true, the seL4 kernel is mature in many ways, good enough to be deployed in real-world systems. It is already in daily use in the real world, and is being designed into many more systems. But that doesn’t mean it’s “done”.

Right now, seL4 solves a number of fundamental security problems, and it provides the best possible solution to these problems. In particular, it provides the strongest possible spatial isolation, in that it guarantees that memory cannot be accessed without explicit authorisation. It also provides strictly controlled communication between subsystems, in that two subsystems (provably) cannot communicate through system calls or memory unless explicitly authorised. And it does this with unbeaten performance. This is more than any other real-world OS can give you.

What seL4 cannot (yet) do, and no other OS can either, is to provide temporal isolation guarantees. This comes in two guises, the integrity and the confidentiality aspect.

Here, integrity means the ability to guarantee timeliness of real-time systems, especially mixed-criticality systems (MCS), where critical, high-assurance real-time tasks operate concurrently to untrusted code. seL4’s new MCS model provides temporal integrity to a significant class of MCS, and its verification is on-going. However, it does not yet fully solve the problem. Specifically, we found that there are important use cases for which the present MCS model is not sufficient. On-going research is addressing this, leading to further improvements of the model.

Furthermore, we have not yet developed the formal framework for reasoning about timing guarantees on top of the MCS model. This is, of course, what is needed for making high-assurance MCS a reality, and is a significant research challenge, which is presently unfunded. Again, while we’re ahead of any other system, the world’s emerging cyberphysical systems need more.

Much more work remains on the confidentiality side: Here the problem is to guarantee that there is no information leakage through covert timing channels; this kind of leakage is a serious real-world problem, as demonstrated in the Spectre attacks. Timing channels have long been put into the too-hard basket by most people. Triggered by Spectre there is now a flurry of activity, but most are band-aid solutions addressing symptoms. In contrast, we are working on a principled, fundamental approach to a complete prevention of timing channels. We call this approach time protection, in analogy to the established memory protection. The feedback from the research community has been strong: the work has already won three best-paper awards, yet we are only at the beginning of this line of work.

Specifically we have designed some basic OS mechanisms for providing time protection, and have shown that they can be effective on the right hardware, but also that contemporary hardware is deficient. Presently, with support by the Australian Research Council and the US Air Force, we are working on proving that these mechanisms are effective on suitable hardware. This work, having progressed well, is now under threat as CSIRO took the unusual step of returning the Air Force funding.

We are also working with the RISC-V community on defining appropriate hardware support to allow time protection to do its job. But much more research is needed on the OS side, as so far we have some basic mechanisms, that work in very restricted use cases. It’s far from having an OS model that addresses the large class of systems where timing channels are a security threat. This work is presently unfunded.

And finally, we have not yet solved the problem of verifying seL4 for multicore platforms. While there exist kernels with a multicore verification story, these kernels have performance that makes them unsuitable for real-world use. Thanks to our past research we now understand how to verify multicore seL4, but we need funding to do it.

So much about seL4 research being “done”. seL4 does define the state of the art, but the state of the art is still a fair bit behind the needs of the real world.

Work to do: Scaling trustworthiness to full systems

Beyond seL4, there’s the wider Trustworthy Systems agenda: creating a societal shift  towards mainstream adoption of software verification, as the TS home page has been saying for years. We have made some progress here, with verification uptake increasing in academia and industry, but it’s far from mainstream.

To enable this shift, the team has more concrete research goals. These include:

  • Lower-cost approaches for verifying the non-kernel parts of the trusted computing base, such as device drivers, file and network services, but also the actual applications. So far, verified software is still more expensive to produce than the usual buggy stuff (although life-cycle cost is probably already competitive). TS’s declared aim is to produce verified software at a cost that’s at par with traditionally engineered software;
  • Proofs of high-level security properties of a complete system (as opposed to “just” the underlying microkernel);
  • Proofs of timeliness of a complete real-time system built on seL4;
  • Design of a general-purpose operating system that is as broadly applicable as Linux, but where it is possible to prove security enforcement.

These are all research challenges that remain unsolved, are of high importance for the security and safety of real-world systems and which TS is in a prime position to address. DARPA is throwing many millions at scaling up trustworthiness. So much about “research done”.

Tackling big problems was always core to the TS approach. As you can see, we’re nowhere running out of big problems to solve! And we have the track record and credibility to deliver, but we need funding to do it.

Speaking of AI

AI systems are increasingly used in life- and mission-critical settings, autonomous cars are a great example. But can we trust our life on an AI system, if a hacker can bypass or influence its decision? Clearly not, and the TS research agenda is very much about enabling such trust.

I can support this with an industry quote, from Dr Daniel Potts of autonomous driving company (and seL4 Foundation member) Ghost Locomotion:

“Ghost is building a trustworthy AI system that will deliver safe self-driving for consumer cars. The company is using formal methods to achieve the reliability required to deploy AI to millions of cars with the guarantee that no harm will be done. AI can only be safely deployed in the field if the underlying system is trustworthy. 

“There is more research required into how to architect, construct and integrate an AI application on a trusted system of this complexity, and how we achieve this at scale. Investing in AI research without investing trustworthy systems research will greatly diminish the impact and applicability of AI to real-world products.”

Clearly, few are better placed than TS to do this research.

Moving forward

The reaction of the community to this crisis facing TS has been incredible supportive, and there are many discussions about supporting TS and its research agenda. Clearly, the seL4 Foundation is key, and we are encouraging the community to support the Foundation, by joining or by providing direct financial and in-kind support, or engage with TS directly.

For now I’m very happy to announce that UNSW’s School of Computer Science & Engineering has committed to support TS to the end of the year!

This is great news, as it gives us the time to line up more pathways and support to ensure the future of TS and its research and tech-transfer agenda. With the community’s help we’ll get there!

seL4 on RISC-V Verified to Binary Code

… and seL4 and RISC-V Foundations form an alliance

In June 2020 we announced that the seL4 microkernel, the world’s first operating system (OS) kernel with a machine-checked proof of implementation correctness, has now also been verified for the RV64 architecture, making it the first formally verified OS for RISC-V. We are pleased to announce that this verification has now been extended to the executable binary, meaning that the machine code running on the processor is proved to be correct against the kernel’s specification. RISC-V is the first 64-bit architecture for which this has been achieved.

What does this mean?

The previously announced proof means that, according to the semantics of the C language in which seL4 is implemented, the kernel will always behave as specified. Among others this means that seL4 cannot be attacked with stack overflows, malformed inputs or other forms of code injection or control-flow hijacking – it is provably secure in a very strong sense. However, there is still the risk of security holes resulting from a buggy (or compromised) C compiler, or from the compiler and kernel developers interpreting the C semantics differently.

The newly completed binary verification completely removes these risks – it guarantees that properties we prove about the C code hold for the executable code, and thus that the executable kernel binary behaves as required by the kernel’s formal specification. 

More than porting to a different ISA

The binary verification toolchain. Dark blue arrows are proved.

While seL4’s implementation correctness proofs use interactive theorem proving, with hundreds of thousands of (mostly hand-written but machine-checked) lines of proof, the binary verification uses an automated tool chain (see the seL4 White Paper for details). The tool chain converts both the C code as well as the binary into an intermediate language that represents the control flow of the program. It then uses SMT solvers to prove equivalence of the two programs, one short code sequence at a time. 

SMT solvers prove properties by a very efficient exploration of the state space, using state compression techniques to make the problem tractable. We had previously built the binary-verification toolchain for the 32-bit Armv7 architecture. As the state space grows exponentially with the word size, taking the step to a 64-bit architecture requires overcoming significant scalability challenges – which the ingenuity of our team around Matt Brecknell and Zoltan Kocsis could overcome in the end.

Further implications

This work represents a significant step for both the RISC-V and seL4 ecosystems. No 64-bit architecture other than RISC-V presently has an OS with such a comprehensive verification and security story. And seL4 has with RISC-V the ideal base for driving further innovation in computer system security, especially for our work on the systematic prevention of information leakage through timing channels, based on the approach we call time protection. Stay tuned for more exciting results to come!

We are now formalising the link between the two ecosystems by announcing that RISC-V International and the seL4 Foundation are joining each other as Associate Members.

No Safety without (Cyber-)Security!

It’s a common experience: I talk to people developing safety-critical embedded systems, be it cars or medical devices, and, while clearly serious about product safety, they show little interest in security. A great example was when, as a part of an delegation of Australian academicians, I visited an automotive testing facility in another country. They showed us their impressive driving range, which featured all imaginable road conditions, from unsealed tracks via city roads with pedestrians and public transport to 200 km/h freeways. They proudly talked about all the safety recalls they triggered, and how they were now focussing on autonomous driving. I asked how much of the testing was about cyber-security. The answer was “nothing”.

This is crazy, given the spade of car-hacking attacks over the past number of years, which are obviously a huge safety problem! But those safety testers are not alone – the attitude is common across the car industry, and not much different in the space of medical implants. Most of these systems have serious security vulnerabilities, which are typically a result of mushrooming functionality and hence complexity. Complexity is always the arch-enemy of security, as most security failures result from unforeseen interactions of features. At the same time, features drive sales, so the more of them the better.

This even applies to the kind of devices where we do not typically think of “features”, as in medical implants. Consider a heart pacemaker. Decades ago that was an (electronically) simple device, that had a sensor or two, a control loop of probably at most a few 100 source lines of code (SLOC) and an electrode (actuator). That was enough to keep a person alive as long as they behaved carefully, within a relatively constrained operating envelope. But there was a clear motivation for doing better, to expand the operating envelope (for instance, allowing the wearer to exercise). So the number of sensors increased, as did the complexity of the control logic. But the expanded envelope required tuning for patient specifics, meaning adjusting operating parameters. That requires an external interface to allow the physician to access performance data and update settings. This needs to happen without opening up the patient, so it needs a radio interface.

Before you know it, you have device drivers for radios, and complex protocol stacks for providing communication over a lossy wireless medium. We are now easily into 100s of kSLOC in software complexity. This communications software will have bugs that can be exploited by an attacker. And once the attacker is on the device, they can interfere with its operation and harm, possibility kill the patient (while leaving no traces).

But these devices have to undergo rigorous evaluations to be certified for safety, right?

Well, sort-of. Reality is that these safety certification regimes do not really consider systematic attacks (remember the automotive testing facility?) How is that possible, given the strong and sustained safety focus in areas where devices can maim or kill?

I believe that at the root of the problem are attitudes, ways of looking at the problem. Traditional safety engineering is about avoiding faults and minimising their impact, and is based on some core assumptions:

  1. faults are random, and the risk they pose can be expressed as a probability;
  2. faults are rare, i.e. the probability can be made small;
  3. faults are independent, meaning that the probability of multiple faults is the product of the probabilities, and therefore very small.

These assumptions are appropriate for physical artefacts. The problem is that, when dealing with software, these assumptions are all wrong!

Firstly, software behaviour is never random, it’s highly deterministic. Which means if a certain sequence of events trigger a fault, the same sequence will trigger the fault again. If an attacker knows that sequence, they can trigger the fault at will.

Secondly, software faults are not rare at all. The rule of thumb is 1-5 faults per kSLOC, with (unusually) well engineered software it may get as low as 0.3/kSLOC. But with 100s of kSLOC in your system, you have dozens, more likely 100s of faults.

And software faults are definitely not independent. If you know how to trigger one fault, you can combine it with the next one you find. And that means you can daisy-chain exploits to get deeper and deeper into the system. This also applies to systems that, on first glance, seem to employ physical separation. For example, cars have multiple networks (traditional CAN busses as well as wireless networks) for different purposes, where the critical control functionality is on a separate network from infotainment and other less critical parts. However, there are devices that connect to multiple busses and therefore act as gateways. Once an attacker has intruded an externally-facing network, they can then attack the gateway and work their way to the place where they can take over the vehicle. These days, physical separation is basically an illusion, except in some military settings.

So, the bottom line is that with systems containing significant amounts of software (that’s about all of them nowadays), safety must be looked at as a security problem. This requires a radical change of developer mindset. Instead of thinking about the likelihood of a fault, one has to assume that there is a fault, unless there’s conclusive proof of the opposite. And when we are looking at a 100 kSLOC network stack, we know there is a fault (almost certainly many). And, instead of thinking that the fault has some random effect (such as overwriting a random memory location) we have to assume it has the worst possible effect — overwriting the most critical memory location with the most damaging value. Because that’s exactly what an attacker will do.

So the focus must be on preventing those faults from becoming fatal. In particular, we have to assume the enemy is on the device, and must keep them away from the really critical assets. In the pacemaker example, we have to assume that the attacker hijacks the network stack and can execute arbitrary code, and we must therefore prevent interference with the life-critical control loop.

Obviously, the fundamental requirement here is isolation, enabling security by architecture: Components that are not fully trustworthy (which means anything that is more than a few 100 SLOC or has not undergone a thorough security assurance process) must only be able to interact with critical (and presumably highly assured) components via well-defined and small interfaces. And the isolation must be provided by an OS that is as highly assured as the critical components: if you can compromise the OS, then the interfaces can be bypassed.

People who know me will not be surprised when I say that the seL4 microkernel is a great choice here: it is the world’s first OS kernel with a mathematical proof of implementation correctness (and security properties based on that). And it is by far the best-performing microkernel (verified or not).

In any case, you need an OS that you can trust to enforce isolation, and an architecture that uses the OS-enforced isolation to protect the critical parts of the system. In other words, a security architecture enforced by a secure OS. Without that, there will be no safety.

This post originally appeared on 2020-11-05 on the ACM SIGBED blog.

seL4 is verified on RISC-V!

Sounds great! But what does it mean?


seL4 (pronounced ess-e-ell-four) is arguably the world’s most secure operating system (OS) kernel. 

The OS kernel is the lowest level of software running on a computer system. It is the code that executes in privileged mode (S-mode in RISC-V; M-mode is reserved for microcode/firmware).  The kernel is ultimately responsible for the security of a computer system. 

seL4 is a microkernel. The idea of a microkernel is to minimise the trusted computing base – the part of the system for which there is no Plan B if it fails. The Linux and Windows kernels consist of tens of millions of lines of code, and contain literally thousands (more likely tens of thousands) of bugs – a huge attack surface. A well-designed microkernel, such as seL4, has about ten thousand lines – inherently more trustworthy.

What sets seL4 aside from all other OS kernels, including other microkernels, is its verification story. It was the world’s first OS kernel with a machine-checked, mathematical proof of the functional correctness of its C implementation (winning us a Hall of Fame Award as a result). This means that it is proved to be bug-free relative to a specification formulated in a mathematical logic. And by now it has proofs about further security properties (which show that the specification has the right properties) and functional correctness extending down to the binary code. And it has the most advanced support for hard real-time systems. And it is the world’s fastest microkernel. It’s best in class by any definition.

We originally verified seL4 for 32-bit Arm processors. We then extended that to 64-bit x86 processors. And now to RISC-V RV64 processors. Which now covers all the important ISAs.

seL4 on RISC-V

But the combination of seL4 and RISC-V is special.

RISC-V is an open instruction-set architecture (ISA). seL4 is an open-source OS microkernel. It’s a match made in heaven. Especially in terms of security. 

When we verify seL4, we have to assume that the hardware operates correctly (i.e. as specified). That assumes that there is an unambiguous specification in the first place, which is not the case for all hardware. But even where there is such a specification, and it is formal (meaning written in a mathematical formalism that supports mathematical reasoning about its properties), how do we know that it actually captures the behaviour of the hardware? Reality is we can be pretty sure that it does not. Hardware is no different from software in that both have bugs.

But having an open ISA has advantages that go beyond being free of royalties. One is that it allows having open-source hardware implementations. An example of this is the CVA6 (formerly Ariane) core developed at ETH Zurich. With an open-source implementation, you see what you get, and can check for yourself whether it has security-critical bugs. This is what German company HENSOLDT Cyber GmbH did: they produced a chip, based on Ariane, with a strong supply-chain security story. And, to complement this with a secure OS, funded the verification of seL4 on RISC-V. [Disclaimer: I have an interest in HENSOLDT Cyber.]

The most exciting aspect of an open ISA with open-source hardware implementations is the prospect of verifying the implementations. This sounds like a big ask that will be hard to realise. But the same was said about a verified OS kernel – until we did it. I know there are multiple groups working towards this goal, and sooner or later one of them will succeed. This will be a revolutionary step towards achieving real security, and I bet it’s only a few years away!

Hardware-software co-design for security

Open-source hardware with a verified, open-source OS is a great thing for security, but there is more.

The combination of open ISA, open-source OS and open-source hardware enables innovation at the hardware-software interface that was previously reserved for hardware manufacturers (and thus happened at a snail’s pace). This is really important for security.

We are all aware of Spectre attacks, which use side effects of speculative execution to steal secrets. What is less understood is that speculative execution on its own is not enough. Spectre needs a covert timing channel to get the information out. Without the covert timing channel there is no Spectre attack.

Preventing these covert timing channels happens to be something we’ve been working on for a few years. We developed OS mechanisms for preventing them, but only to realise that they don’t quite work. Digging deeper, we found that this is because the hardware doesn’t give us the right mechanisms. In other words, most existing hardware is broken in terms of security! And the underlying cause is the ISA, which is our hardware-software contract. The ISA specifies functionality of the hardware, but intentionally abstracts away anything to do with time. But timing can be used to leak information, and the ISA does not provide the means for stopping this. As such, the ISA is insufficient for ensuring security, and must be improved.

This would normally be hopeless endeavour: trying to convince hardware manufacturers to agree to a new contract that imposes additional restriction on what they can do. (Trust me, I’ve tried!) Or even provide the OS with an extra instruction it can use to defeat covert channels – as long as the house is not on fire, they are unlikely to listen. And the Spectre fire is apparently not burning hot enough yet.

Open hardware changes this. It enables us to innovate without waiting for manufacturers. It enables us to control both sides of the fence. It enables true co-design, for the benefit of security.

Partnering with the creators of Ariane at ETH Zurich, we did exactly that: We explored how we can amend the hardware-software contract just a bit to give the OS the right means to defeat covert channels. And we could demonstrate that it takes very little to be highly effective.

The relevant working groups in the RISC-V Foundation are discussing these mechanisms right now. Stay tuned for RISC-V setting the benchmark in processor security, complementing the world’s most secure OS kernel – seL4.

What to know more?

The seL4 web site explains seL4, and has a whitepaper, written for non-experts, that explains seL4 and its verification story in detail.

[Note: This blog was originally written for the RISC-V blog site.]

Update 2020-07-30: fixed CVA6 link

The seL4 Foundation – What and Why

We have created the seL4 Foundation!

But what is the seL4 Foundation, and why did we create it?

In a nutshell, these are the reasons (I’ll expand on them below):

  1. Provide for the longevity of seL4 beyond support from any specific organisation
  2. grow and integrate the seL4 ecosystem
  3. protect and promote the seL4 brand
  4. provide a platform for funding on-going engineering as well as sharing the cost of big-ticket verification items.

Why seL4 Foundation

Ensure longevity of seL4

seL4 is a game-changer for safety- or security-critical systems.  Being an OS microkernel, seL4 is at the bottom of any software stack. To use it properly and to achieve the strong security properties enabled by seL4, the system must be engineered for seL4. This may require a long-term commitment, requiring significant resources. You therefore want to make sure seL4 stays around, is well-supported, and continues to provide those guarantees.

The viability of seL4 is strongly tied to its developers, the Trustworthy Systems (TS) team, and especially the technical leadership of TS. This in itself is not a problem, because everyone knows how committed we are to the success of seL4. And because there are four technical leaders for the kernel (Kevin, Gerwin, June and myself), we have a degree of redundancy (although we do work best as a team!)

We believe that as an independent, nonprofit organisation, the seL4 Foundation can make the seL4 story even stronger, and an even better choice for developers who build trustworthy systems. This is especially true if you consider a time-scale of decades, which is necessary for many of the early adopters of seL4 in the infrastructure, medical, and defence domains. By setting up an organisational framework that is supported by (and accountable to) adopters (subject to providing financial support, of course) we can plan for the future. Our Board of Directors will represent the voices of significant adopters, which will help keep the technical leadership aware and responsive to the needs of adopters.

Grow and integrate the seL4 ecosystem

Right now, the biggest barrier to uptake of seL4 is probably the lack of system components and tools. This includes board-support packages (BSPs), device drivers, file systems, networks stacks, high-level programming environments (something at the abstraction level of POSIX, but preferably based on a more suitable model), and configuration and development tools. Plus debuggers, IDEs etc.

Without some supporting organisation, many of those missing bits will have to be (re)built by every adopter. This would a huge waste of effort better spent developing new trustworthy systems. Furthermore, I believe that there are plenty of people out there who would be happy to contribute if there were more guidance, active community engagement, and a clear shared direction forward, so that the contributor can be confident their input will make a difference for the long term.

We think that the way to address this is having a forum where the community, including developers, adopters, and researchers can meet and engage. The seL4 Foundation as an open and neutral entity, where the broader community is represented, and strategy is discussed openly, seems the right way to do this.

Protect and promote the seL4 brand

seL4 has become a recognised brand that is strongly associated with its unique properties – the world’s fastest and most advanced microkernel, with a powerful assurance story. Even ten years after we concluded the original proofs, there is still no other general-purpose, non-toy OS kernel with a proof of implementation correctness. And on top of all this, seL4 is open source.

These properties drive the adoption of seL4, providing a competitive advantage to the developers of seL4-based products. But, of course, this creates expectations: a system based on “verified seL4” is expected to be more trustworthy than others. This trust could be undermined by adopters who modify the kernel. Let me explain why.

Mathematical proof provides the strongest assurance possible. But the guarantees it provides need to be thoroughly understood, in order to avoid being misleading the people who rely on them. Even mathematical proof can never cover everything: It always relies on a specific scope (eg. the assumption that the hardware behaves correctly). If these conditions are misunderstood, you might overestimate the guarantees that you get. In particular, the proofs are for very specific versions of the kernel and against a very specific specification. A seemingly trivial code change may invalidate the assumptions and thus void any assurance.

Adopters need clarity on this. Also, a company that sees a competitive advantage in having a verified system does not want this advantage undermined by others taking shortcuts and making unsubstantiated verification claims.

The Foundation will help by officially “blessing” the status of verified versions and precisely identifying the conditions under which they hold. This should help provide clear provenance, so seL4 adopters will be able to assign liability resulting from someone hacking an unverified version to where it belongs: to those providing such unblessed code.

In addition, it should be more visible what changes will affect proofs. That can informally happen with documentation, but a better solution will be a framework where adopters can run the proofs themselves, on the code base they are deploying, to confirm that the proofs still hold. Such a framework will, of course, also help contributors to identify whether adopting their changes will be straightforward (if they do not invalidate the proofs).

All this must go hand-in-hand with the promotion of seL4 and its benefits. The best way to achieve this is through a broad membership base. If you’re reading this, you clearly have some interest in seL4 – please join!

Fund on-going engineering and big-ticket items

seL4 is the result of big investments. Firstly by the Australian tax payers, who (through NICTA) funded its creation, and (through NICTA and then CSIRO’s Data61) continued supporting it. Over the past 6 years, US taxpayers (mostly through DARPA, but also other parts of the DoD as well as DHS) invested a lot in completing and extending the verification story, as well as deploying on real-world systems. And most recently, HENSOLDT Cyber funded verification of the RISC-V port of the kernel. [Disclosure: I have an interest in HENSOLDT Cyber.]

These are awesome expressions of confidence in seL4 and the team behind it, and we are immensely grateful for this support. Yet, we need more.

On the one hand we need a revenue stream to fund on-going support of the community, maintenance of existing versions, improvements of the infrastructure (in particular the automation of the seL4 verification framework). We need support for what we call strategic engineering – evolving the kernel and, importantly, developing the ecosystem. While we hope that much of the latter will eventually be done by the community, we will for now have to keep showing the way, by providing reference designs and sample components. This is obviously needed to make seL4 an attractive platform for commercial systems.

On the other hand, we have a number of big-ticket items that are crucial to our aim of making seL4 the Linux of the embedded world. The biggest of these are:

  • Verification of the 64-bit Arm kernel
  • Verification of the multicore kernel

There are plenty of companies keen to see both of these (and a few others) done, and several of them are willing to put money on the table to make it happen. However, the investment required for these two is above the pain limit of what most companies can justify for an open-source project, where they cannot walk away with a bunch of proprietary IP. But most can see the value of enabling it as an open-source platform.

So this is the third major goal of the Foundation: Provide on-going base funding, and bring these interested parties together and divide the required investment in a way that keeps individual contributions below the pain limit.

Together we can do it – we can change the world of critical systems, to achieve real security!

What will the seL4 Foundation do?

Obviously, the core function of the Foundation is to address the above four points, which were the rationale for setting it up in the first place. But how will this work in practice?

Community engagement

This is really the central mission: Engage and grow the community of seL4 developers and adopters, and direct and standardise the evolution of seL4 and its ecosystem.

There are several aspects to this. On the one hand there is the seL4 kernel and its proofs, on the other hand there is all the stuff around it that is essential to using it. And everyone who has had a serious look at seL4 will appreciate that the 10kLOC kernel is just the foundation – you need so much more to build a system on it!

As mentioned above, this is really the biggest barrier to uptake at the moment. There is not a lot of seL4-based userland that is ready to use, and there is very limited support through tools and development environments. Reality is that anyone who wants to deploy seL4 in a real-world system ends up developing much of this themselves.

This is obviously a waste! It will be to everyone’s benefit if at least the core components and tools are shared. Things like device drivers, network stacks, or core system services contain no IP of value but bear a significant cost. We need to reduce the cost by sharing those artefacts. There are two aspects here: Those who are developing such artefacts should share them with the community, and members of the community should adopt and maintain them.


The seL4 developer community.

The (highly simplified) diagram above indicates this with the “Core userland” box – these are basic system services everyone needs. In fact, these typically correspond to components which Linux would have deep inside the kernel. We (TS) see it as our job to provide some of the core userland infrastructure, such as the CAmkES component framework, and maintain it at least initially. And we need to provide at least examples of others, such as well-integrated device drivers. However, we want the community to contribute to this, and hope that an increasing number of such parts will be adopted by maintainers from the developer community. The lower in the stack a component resides, and the more unique it is, the higher will be the bar for such adoption. This is because such core components must be designed right, meaning that maintaining them requires a very thorough understanding of how seL4 and seL4-based systems operate. Ideally, such core components should also be verified, but it will be a while until that becomes reality.

Things get easier with more high-level components, developer tools, libraries, or drivers for yet another NIC, once a performing example of a similar NIC driver exists. Some are still low-level enough for their Linux equivalents to live inside the Linux kernel (device drivers, crypto libraries) or are part of the “OS environment” (programming-language runtimes, shells, etc). But their dependence on the seL4 model is less, or there are good templates. Furthermore, there may be benefits from competing components with similar functionality (eg libraries optimised for small IoT systems vs desktops vs servers). Here we see our (TS’s) responsibility as mostly “showing the way” by providing sample code or designs and putting them out for community adoption.

The kernel itself will for the foreseeable future continue to depend strongly on our expertise. Besides ensuring purity of the design, this has a lot to do with verification: Very few people understand how a (seemingly innocent) change to the kernel will affect verification, and even our experts sometimes grossly underestimate the cost of changes. And, of course, when the kernel has changed, the proofs need to be updated, and that again is something few people can do at the moment. We do hope to get to the point where more non-TS community members can contribute proofs, but for the time being are committed to remain central to the kernel’s evolution.

We certainly welcome platform ports, and we are working on ways to make verification of a platform port easier, maybe semi-automatic. But that is still research!

And we do commit to an open process for evolving the kernel! This will be part of the duty of the technical steering committee of the Foundation, with representation of developers, which will run an open process for kernel evolution.

How does the seL4 Foundation operate?


The seL4 Foundation is set up as a Project under the Linux Foundation (LF) and follows the established structure of LF. This has the advantage that it is a familiar setup that is used by hundreds of open-source projects. It also means that we benefit from LF’s existing infrastructure (organisational and legal) as well as their existing networks for recruiting members. The structure is shown in the figure.


The structure of the seL4 Foundation under the Linux Foundation.




In line with LF, we clearly separate the Foundation’s governance from technical leadership and contributions. The Foundation has a Board that is responsible for the governance. In particular, it controls the “directed fund”, which is where membership fees go (minus a tax to LF). The directed fund can also receive extra contributions from members, it will especially be used to pool funds for the “big ticket” items I discussed above.

The technical activities are underneath the legal body of the “LF Projects LLC”, LF’s nonprofit company. Under it, the “seL4 Series LLC” is a virtual company for the seL4 Foundation. It holds the rights to the seL4 trademark and the domain name, and it hosts the GitHub source repos. Importantly, contributions to the source code no longer require a contributors license agreement (CLA). Instead contributors will provide a standard developer certificate of origin (DCO), as is standard with LF projects, removing the most bureaucratic part from contributions.

Membership and Board

Similar to the LF itself, the seL4 Foundation has different classes of membership, that model the LF’s. Note that to become a member of the seL4 Foundation, an individual or organisation must first be a member of good standing in LF (and pay their membership fees as well).

Fees for regular members are tiered by size, similar to LF, although our fees are lower than LF’s for small companies and somewhat higher for large ones. Premium members (who can be regular members of LF) pay extra in exchange for the privilege of having a guaranteed Board seat. Individuals and non-profit organisations (such as universities or other open-source foundations) can join for free as associate members. The Trustworthy Systems team, as the creators of seL4, have a special status as premium members without paying membership fees for five years.


seL4 Foundation membership and Board.

The Board is composed of three members from TS, and one from each premium member. In addition, all regular members together elect a further Board Member. Finally there is the Technical Steering Committee, which is composed of committers and Technical Leaders (such as the non-committing leaders of TS) and is responsible for setting technical directions for the seL4 ecosystem. It has a chair, who is an ex-officio member of the Board.

The high representation of TS on the Board shows our commitment to making the Foundation work. We are not merely dumping this on the community and moving back to research. We are staying engaged, and have committed our resources for a minimum of 5 years. We chose this model in response to specific feedback from adopters.

Contributors, Committers and Technical Leaders

The Contributors make up the seL4 community – people who contribute code, documentation, tools etc to the seL4 ecosystem. A selected subset of Contributors are designated as Committers, these are the ones who can directly commit to the repository and accept pull requests.

Initially, the Committers are TS members who commit to seL4 code, proofs, docs, or userland code. We anticipate (in fact sincerely hope!) that we will soon be able to appoint a significant number of Committers from the community for non-kernel components. These will be approved by the Technical Steering Committee (TSC), which is made up of Committers and Technical Leaders.

One particularity of seL4 is that it has Technical Leaders who don’t commit code, but provide leadership on design and implementation issues (this includes yours truly). This is a very small number of people, who have the same say as Committers, including being part of the TSC, except the right to commit code.

The TSC elects a Chair, who is an ex-officio member of the seL4 Board, to represent the technical community.

Does this mean research on/for seL4 is done?

Not at all! We at TS will continue to do research that will feed into seL4, to ensure it remains the most advanced OS technology. In particular, our work on time protection as a principled way to prevent information leakage through timing channel has still a fair bit to go until it is production-ready and can make it into the mainline kernel.

This is in addition to research on architecture of secure systems, and tools and languages that make it easier to build high-assurance (ideally verifiable) components on top of seL4. This cluster of research will be an enabler for achieving our vision of not only a verified kernel, but verified core system components. One aim for the next few years will be to be able to build at least some simple systems with a completely verified trusted computing base.

We see this research as a central part of our on-going contributions to the seL4 ecosystem. And we hope that the Foundation will also lead to more collaboration in seL4-related research.

Ready to get involved?

The interim Foundation web site provides more details on how to join, including the full set of legal documents and a link to the LF interface for becoming a member.

But we need not only members, we also need active contributors, whether they are companies or individuals. If you want to be one of them, look at the Contribute page for projects looking for help, which can range from ports via enhancements to people becoming committers.

And if you want to deploy seL4 to secure your product, check out retrofitting and transitioning to seL4 for ways of doing this and the list of available userlevel infrastructure. And contact us if you are looking for commercial support for seL4, we can link you up with one of the growing number of companies providing this.

seL4 Design Principles

seL4 has been our team’s greatest achievement, but it didn’t fall out of the sky: it was the result of 15 years of research, and has evolved further for the past 10 years.

From the beginning, the design of seL4 has been driven by a number of principles. But a recent internal discussion about some fine points of the spec (as well as some discussions with externals) reminded me that some of these principles are in the minds of the designers but not really documented. This can lead to people (internal as well as external to Trustworthy Systems) arguing for APIs that are not in the spirit of seL4. Hence I’ll try to write up these principles.

The core drivers of the seL4 API are:

  1. Verification
  2. Minimality
  3. Policy freedom
  4. Performance
  5. Security
  6. Don’t pay for what you don’t use

Then there are a few non-goals:

  1. Stopping you from shooting yourself in the foot
  2. Ease of use
  3. Hardware abstraction

I’ll discuss them in turn.


These are the drivers of seL4’s design.


This is most obvious, and is seL4’s core claim to fame: we have a proof that the implementation adheres to the spec (functional correctness) as well as proofs of higher-level security properties. And with any changes to the kernel, the proof needs to be updated to re-establish functional correctness.

Verification has a significant impact on some design decisions – it shifts the trade-off between alternative designs. For example, concurrent systems are notoriously hard to verify, so we do everything to keep concurrency out of the kernel. Examples include the virtual TCB array and “long IPC” that was used in the original L4. These designs led to exceptions happening while in kernel mode. Such nested exceptions introduce concurrency, and are not acceptable for seL4, which is designed (and proved) never to trigger an exception inside the kernel. Interestingly, these problematic designs were already abandoned before we did seL4 (for simplicity and performance reasons), so verification did not force an inferior design. But it is an important factor in every design consideration, and certainly forces us to think carefully about every design choice. And this careful consideration has frequently led to a cleaner design.

A core challenge during kernel evolution is that re-verification takes much longer than the code changes that triggers it. We need to minimise the re-verification cost, so it is critical that code changes don’t make the life of our verifiers harder than necessary. In many cases, the impact of a code change on the re-verification effort is hard to predict, even for our experienced kernel engineers (and much harder for externals, which can sometimes lead to frustration, if externals suggest a change we have to decline as it would create a verification headache).

We handle this by keeping verification engineers in the loop when proposing changes to the kernel. This helps a lot, but does not eliminate surprises. Many verification challenges only show up when trying to re-prove kernel invariants, and frequently these lead to changes in the code, sometimes even the API, to make verification easier.


This principle pre-dates seL4 by many years, and is the core principle of the original L4 microkernel, the ancestor of seL4 (and the whole L4 microkernel family). It was expressed by Liedtke in his seminal Hall-of-Fame paper On μ-kernel construction as:

A concept is tolerated inside the µ-kernel only if moving it outside the kernel, i.e., permitting competing implementations, would prevent implementation of the system’s required functionality.

This is a fairly categorical aim, and really an ideal, in that every kernel I’ve seen violates it at least a bit. But it is an excellent design driver, that has served the L4 microkernel family well for over a quarter of a century. With seL4 it has taken on additional significance: Verification cost grows roughly with the square of the code size, an excellent motivation for keeping it small!

Aspects of minimality are that we have no device drivers in the kernel (other than the interrupt-controller and a timer), and seL4’s extreme approach to memory management, where even kernel memory is managed by user-level code.


This is also an original aim of L4 (and microkernels in general): be a (trustworthy) foundation for a very large class of system designs. It is the principle that stops minimality from reducing the kernel size to zero.

In reality, there have always been use cases that have been better supported by the kernel than others. And much of our research over the years has been driven by addressing this: evolving the kernel to support a growing class of use cases well. For example, the biggest change to seL4’s functionality has been the introduction of a new scheduling model, with the aim of supporting a wider class of real-time (RT) systems, especially mixed-criticality systems (MCS), where critical RT code co-exists with untrusted code. As it turns out, the MCS kernel cleans up a number of other issues with the API, which make it an improvement even if your use case isn’t an MCS.

Policy freedom

This principle predates even the original L4 by decades. Policy-mechanism separation was explicitly stated as a principle in a 1975 paper about the Hydra OS, but the idea is already clearly contained in Brinch Hansen’s 1970 Nucleus paper, which describes the arch-ancestor of all microkernels. It really is a consequence of minimality and generality: if you want to keep your kernel minimal yet general, you must focus on the basic mechanisms, and build everything else on top.

seL4’s memory-management model takes a big chunk of (kernel memory-management) policy out of the kernel. Similarly for the MCS model, which introduces principled, capability-authorised user-level management of time as just another first-class resource.


Performance, especially of the critical IPC operation, has always been a core driver of L4 kernels: IPC performance is the master, as Liedtke expressed it in his ’93 paper Improving IPC by kernel design. The reason was that attempts to build usable systems on earlier microkernels, especially Mach, failed because the kernel introduced too much overhead. Liedtke’s original kernel showed that microkernels can be fast, and allowed the construction of performant systems on top.

From the beginning, seL4 was designed to be suitable for real-world use, and we therefore considered an uncompromising design for performance as essential. My message to the team at the beginning of the project was “I will not consider the project a success if we lose more than 10% performance compared to our fastest kernel to date.” At the time of the original seL4 paper, we were right at that 10% limit. But further optimisation got us to the point where seL4 became faster than all our earlier kernels.

This became the tagline for seL4: Security is no excuse for bad performance! Besides its verification, performance is what sets seL4 apart from all other microkernels – it really sets the benchmark.

Performance: Focus on the hot code paths

This performance aspect results from the observation that in any system, some operations are used more frequently than others, and overall performance can be maximised by shifting cost from the frequently used “hot” operations to infrequently-used ones.

The prime example of a hot operation is IPC: in a microkernel-based system, all system services are provided by user-level servers that are invoked by IPC, so this is the predominant operation. Notifications, which are semaphore-style synchronisation primitives, are also frequently used. Handling of exceptions and interrupts is also frequently performance-critical (eg exceptions are used for emulation/virtualisation), but exceptions are mapped onto IPC and interrupts onto Notifications, so optimising those will benefit exception and interrupt handling.

But not all of seL4’s IPC functionality is equally performance-critical. IPC allows transferring capabilities, which is an inherently more costly operation (requires Cspace manipulations). If the “simple”, function-call-style IPC can be made faster while slightly penalising the more complex variants, then this is a winner. It’s the idea behind the IPC fast path, which does the minimal checking that all its preconditions are true, and then executes a very tight code sequence.

Where is the boundary between “hot” and “cold” code? A good way to think of this is in term of changes to kernel state, i.e. the set of data structures held by the kernel. Every system call changes kernel state (even if only to update accounting information). But there are simple invocation sequences that return the kernel into what I’ll call the same logical state as before the sequence began, meaning that the kernel states only differ in accounting information (execution time charged to threads).

For example, the basic RPC-like server invocation (aka protected procedure call) results in the following changes of logical kernel state, assuming we start with the server blocked on its request endpoint in the receive phase of seL4_ReplyRecv():

  1. client performs seL4_Call(), kernel changes client state to blocked
  2. kernel sets reply object state
  3. kernel moves scheduling context from client to server
  4. kernel unblocks server, which executes the request
  5. having handled the request, server performs seL4_ReplyRecv() on the reply object to respond to the client, kernel blocks server
  6. kernel returns scheduling context to client
  7. kernel unblocks client, which continues executing.

As far as the kernel is concerned, we’re now back at the initial logical state, and state change was temporary, and should be made fast. This is in contrast to typical Cspace manipulations: these are reversible too (eg. a capability can be transferred and later revoked), but not in a simple (in the user view primitive) operation such as a round-trip IPC. An IPC that transfers a capability will leave the receiver’s Cspace changed, and that change is typically long-lived. It’s also inherently more expensive than a basic IPC.

Such logical-state–preserving operations may be accelerated by a degree of laziness, as exemplified in the scheduler optimisation called Benno scheduling: When unblocking a thread on an IPC receive, we don’t immediately insert it into the ready queue, as it is likely to be blocked again soon, which would undo the queue operation just performed. Instead we only insert the presently running thread into the ready queue if it gets preempted. This avoids any queue-manipulation operations during the logical-state–preserving IPC round-trip.

Performance: Don’t pay for what you don’t use

While much of the above principles are discussed in several places, including our 20-years experience paper, this aspect is somewhat buried inside the brains of the designers, but is important for understanding some design decisions. A feature that may benefit a particular use case may have a (small but noticeable) cost to other, frequent use cases.

An example is the “long IPC” that was a feature of original seL4. It supported copying of large message buffers by the kernel, which seems like a good optimisation for sharing bulk data. But the only actual use case was Posix emulation (specifically broken aspects of Posix). But there is a cost even if you don’t use it: the performance cost of extra checks on the IPC path, and the complexity cost of the kernel having to deal with nested exceptions (page faults happening during long IPC). In the end we decided it was a bad design, and a gross violation of minimality (the copying can be done by a trusted user-level server).

A similar argument is behind not supporting scalability of the kernel for large number of cores. On closer consideration, this is a bad idea: migrating even a single cache line from one end of a manycore chip to the other takes 100s of cycles, of the order of a the cost of a complete IPC. So, if you really want this, you should implement it at user level (i.e. a multikernel design). Sharing a kernel image makes sense where the cost of using a multicore configuration would be significantly higher when implementing it at user level, as is the case in a closely-coupled multicore, where the cores share an L2 cache. This is the case we support, while we have no plans scaling the single-kernel-image design to loosely-coupled multicores (not sharing an L2).


Security was always high on the list of requirements for L4 kernels, although in the past came in many cases with a degradation in performance and flexibility. Even before seL4 it was the reason for adopting capabilities for access control, first in our OKL4 microkernel.

In seL4 security is a core principle, the kernel is fundamentally designed for providing the strongest possible isolation. It is the main driver behind seL4’s radical approach to resource management: after booting, the kernel never allocates any memory, it has no heap. Instead, when performing an operation that requires kernel memory (e.g. creating an address space, which requires memory for page tables) the caller must provide this memory explicitly to the kernel (by a process called retyping, which converts user-controlled Untyped, i.e. free, memory into kernel memory).

The model makes management of kernel memory the responsibility of (trusted and protected) usermode processes. The model not only enhances minimality and policy-freedom, it is the core enabler of our proofs of security enforcement. Its aim is to make it possible to reason (formally) about the security properties of seL4-based systems.

Most recently we have taken a step further in extending isolation (and thus security) to timing properties, as a principled way of eliminating timing channels. This is still very much a research topic and not yet ready for the production kernel.

Security: Least privilege (POLA)

An important part of (design for) security is the principle of least privilege, also known as the principle of least authority (with the catchy acronym POLA). It means that any component should only ever have the privileges (power) it needs to do its job, but not more. The fine-grained access control provided by capabilities is a great enabler of POLA, which is why we switched to caps even before seL4.

Security: Delegation and revocation

Least privilege is potentially at odds with performance, as a simple way to minimise authority given to a component is to have it to obtain explicit approval from an external (trusted) security monitor for everything it does – clearly not a good approach. Furthermore, a security-oriented design is likely hierarchical: a subsystem has limited privileges, and it contains sub-sub-systems with even more limited privileges. Enforcing such internal boundaries in subsystems should not be the duty of the top-level (most privileged) component.

This calls for mechanisms for delegating the exercise of (reduced) privilege. If a component has rights to a resource, such as physical memory, it should be able to create an (isolated) subcomponent that with the same or reduced rights to that resource. And it must be able to revoke the delegation.

Capabilities, as seL4 provides them, support effective delegation of (partial) privilege. For example, if you want to have a subsystem that manages a subset of physical resources autonomously then you can supply it with some Untyped memory, which it can then manage without interfering with other components’ memory use. If you want to restrict the component (eg an OS personality) further, you can instead supply it with with caps to pools of TCBs, frames, address spaces etc. It can then manage the pools, but would have to appeal to a higher authority to move memory between the pools. In short, the authority give to the subsystem can be made to match the system’s resource management policy.


There are a number of properties many people would like from the API, but we are explicitly not providing them for reasons I will explain.

Stopping you from shooting yourself in the foot: not the kernel’s job

In fact, the kernel does its best not to limit the size of guns you can use; aiming the gun is your problem (see policy-freedom). Specifically, it’s the system-designer’s job to ensure that dangerous tools can only be used by those who are trusted (ideally proved) to use them responsibly.

However, stopping someone without explicit authorisation from shooting you in the back is the kernel’s job; that’s a core aspect of security.

Ease of use: not a goal of a good mirokernel API

seL4 is a minimal wrapper around hardware that provides just enough so you can build secure and performant systems on top. Consequently, you shouldn’t expect it to be easier to use than bare metal.

Usability is important for building practical systems, of course, but the way to achieve it is by providing higher-level abstraction layers. These are inevitably optimised for particular classes of use cases, introduce policy and reduce generality. This is fine, as the abstraction layer can be replaced if you want to implement a different use case (see the definition of Minimality).

Hardware abstraction: not a goal of the seL4 API

This should be obvious from the above: HW abstraction supports ease of use and introduces overhead. But it will also hide details that are important in some cases.

An example is memory management. Different architectures specify different page-table formats. Trying to fit them under the same abstraction would inevitably lose detail, and prevent seL4-based systems from making full use of whatever hardware support there may be. An obvious example would be architectures that use hashed page tables, such as certain versions of the Power architecture. Forcing a hierarchical page-table structure would imply inefficiencies.

All presently supported architectures use hierarchical page tables, so we could be tempted to ignore that particular aspect. But even on our supported architectures (x64, Arm and RISC-V) there are subtle difference in the semantics of page table which would be unwise to hide.


Like just about everything in systems, our principles are frequently in conflict, meaning it is necessary to find a design that represents the best trade-off between them. Especially performance and verifiability frequently run against an otherwise attractive choice. Conversely, performance considerations occasionally lead to non-minimal features (and increased pain for our verifiers).

Clearly, these principles, other than security and verification, are not absolute. In this sense, good microkernel design is still an art, and understanding the trade-offs is key to good results. But these principles are excellent and time-honoured drivers of good design.