Clarification on seL4 media reports

The open-sourcing of seL4 has created a bit of media coverage, generally very positive, good to see.

Some of it, however, is inaccurate and potentially misleading. I’ve commented where I could, but some sites don’t allow comments.

One frequent error is the (explicit or implicit) claim that seL4 was developed for or under the DARPA HACMS program, stated eg at Security Affairs, and seemingly originated at The Register. This is incorrect: seL4 pre-dates HACMS by several years, and was, in fact, a main motivation for HACMS. We are a “performer” under this project, and, together with our partners, are building an seL4-based high-assurance software stack for a drone, please see the SMACCM Project page for details. [Note 31/7/14: The Register has now corrected their article.]

LinuxPro states that DARPA and NICTA released the software, the release was by General Dynamics (who own seL4) and NICTA. Also, seL4 wasn’t developed for drones, it’s designed as a general-purpose platform suitable for all sorts of security- and safety-critical systems. I for one would like to see it making pacemakers secure, I may need one eventually and with present technology would feel rather uncomfortable about this.

I’ll keep adding if I find more…

2014/07/06

Sense and Nonsense of Conference Rankings

The CORE Rankings

Some may know that a few years ago, the Australian Research Council (ARC) had a ranking of publication outlets produced. For computer science, the exercise was outsourced to CORE, the association of Australian and NZ CS departments (Oz equivalent of CRA). It categorised conferences (and journals) into A*, A, B and C venues.

I have in the past stated what I think of that list: very little. In short, I think it’s highly compromised and an embarrassment for Australian computer science. And I’m outright appalled when I see that other countries are adopting the “CORE Rankings”!

The ARC disendorsed the rankings in 2011. Yet, in 2013, CORE decided to maintain and update it. I argued that updating with a similar process as the original one will not improve the list.

The Fellows Letter

Now, some senior colleagues (Fellows of the Australian Academy of Science) have written an open letter, denouncing not only the CORE list, but basically any use of publication venues as an indicator of research quality.

The letter was, apparently, written by Prof Bob Williamson from the ANU, and fellow group leader at NICTA. Bob is a guy I have a lot of respect for, and we rarely disagree. Here we do completely. I also highly respect the other Fellows (one of them is my boss).

The Fellows essentially argue (with more clarification by Bob) that looking at where a person has published is useless, and the right way to judge a researcher’s work is to read their papers.

What I think

With all respect, I think this is just plain nonsense:

These rankings exist, like it or not. In fact, we all use them all the time. (Ok, I cannot prove that the “all” bit is strictly true, some, like Bob, may not, the rest of us do.) When I look at a CV, the first thing I look for is where did they publish. And I claim that is what most people do. And I claim it makes sense.

Fact is that we know what the “good” venues are in our respective disciplines. This is where we send our papers to, this is where we tell our students and ECRs they need to get their papers accepted. They are the yardsticks of the community, like it or not, it is where you publish to have impact. Publishing in the right venues leads to high citations, publishing in the wrong ones doesn’t.

Of course, we really only understand the venues in our own sub-disciplines, and may be a few neighbouring ones. So, collecting and documenting these top venues across all of CS isn’t a bad thing, it creates clarity.
The idea that someone can judge a person’s work simply by reading some of their papers (even the self-selected best ones), with respect, borders on arrogance. In effect, what this is saying is that someone from a different sub-discipline can judge what is good/significant/relevant work!

If this was true, then we as a community could reduce our workload drastically: We’d stop having conference PCs where everyone has to read 30 papers, and every paper gets at least half a dozen reviews before being accepted (as at OSDI, where I’m presently struggling to get all my reviews done). Instead, every conference would simply convene a handful of Bobs, divide the submissions between them, and each decides which one of their share of the papers should be accepted.

Of course, things don’t work like this, for good reasons. I’ve served on enough top-tier conference PCs to have experienced plenty of cases where the reviews of discipline experts diverge drastically on multiple papers. In my present OSDI stack of 29 papers this is true for about 35% of papers: 10 papers have at least one clear reject and one clear accept! And it is the reason why each paper gets at least 6 reviews: we get the full spectrum, and then at the PC meeting work out who’s right and who’s wrong. The result is still imperfect, but vastly superior to relying on a simple opinion.

Now these reviewers are the discipline experts (in this case, leading researchers in “systems”, incorporating mostly operating systems and distributed systems). If you get such a diversity of opinions within such a relatively narrow subdiscipline, how much would you get across all of computer science? I certainly would not claim to be able to judge the quality of a paper in 80% of computer science, and someone thinks they can, then my respect for them is taking a serious hit.

In summary, I think the idea that someone, even one of the brightest computer scientists, can judge an arbitrary CS paper for its significance is simply indefensible. An expert PC of a top conference accepting a paper has far more significance than the opinion of a discipline outsider, even a bright one!
Of course, that doesn’t justify using the publication outlets as the only criterion for promotion/hiring or anything else. That’s why we do interviews, request letters etc. Also, I definitely believe that citations are a better metric (still imperfect). But citations are a useless measure for fresh PhDs, and mostly of not much use for ECRs.
Nor do I want to defend the present CORE list in any way. I said that before, but I’m repeating for completeness: the present CORE list is the result of an utterly broken process, is completely compromised, and an embarrassment for Australian computer science. And any attempt to fix it by using the existing process (or some minor variant of it) is not going to fix this. The list must either be abandoned or re-done from scratch, using a sound, robust and transparent process.
My arguments only are about top venues. A track record of publishing in those means something, and identifying across all of CS what those top venues are has a value. By the same token I believe trying to categorise further (i.e. B- and C-grade venues, as done in the CORE list) is a complete waste of time. Publishing in such venues means nothing (other than positively establishing that someone has low standards). So, if we bother to have a list, it should only be a list of discipline top venues, nothing more.

2013/05/03

Closing the gap: Real OS security finally in reach!

A few years ago we formally verified our seL4 microkernel down to C code. That proof, unprecedented and powerful as it is, didn’t actually establish the security (or safety) of seL4. But the latest achievement of NICTA’s Trustworthy Systems team actually do provide a proof of security.

What the original seL4 verification achieved was a proof that the kernel’s C code correctly implements its specification. The specification is a formal (mathematical) model of the kernel, and the proof showed that the C implementation is “correct”, in the sense that all possible behaviours, according to a semantics of C, are allowed by the spec.

This is both very powerful and limited. It’s powerful in the sense that it completely rules out any programming errors (aka bugs). It’s still limited in three important respects:

Ok, the code adheres to the spec, but who say’s the spec is any good? How do you know it’s “right”, i.e. does what you expect it to, is it “secure” in any real sense?
The C standard is not a formal specification, and is, in fact, ambiguous in many places. In order to conduct the proof, we had to pick a subset which is unambiguous, and we proved that our kernel implementation stays within that subset. However, we have no guarantee that the compiler, even on our subset, assumes the same semantics!
Everyone knows compilers are buggy. So, even if we agree with the compiler about the semantics of our C subset, we can still lose if the compiler produces incorrect code.

Two sets of recent results remove these limitations. One set of work addresses the first limitation, and another the remaining two.

Integrity and confidentiality proofs

What does it mean for the kernel spec to be “good”? It means that the kernel can be used to build the kind of systems we want to build, with the right properties. The specific properties we are after are the classical security properties: Integrity, confidentiality and availability. (For safety a further property is required, timeliness. seL4 has this too, but I’ll discuss this another time.)

Availability is actually a non-issue in seL4: the kernel’s approach to resource management is not to do any, but to leave it to user-level code. The exact model is beyond the level of technicality of this blog, I refer the interested reader to the paper describing the general idea. The upshot is that denial of kernel services is impossible by construction. (This does not mean that availability is necessarily easy to enforce in seL4-based systems, only that you can’t DOS the kernel, or a userland partition segregated by correct use of kernel mechanisms.)

Integrity, at the level of the microkernel, means that the kernel will never allow you to perform a store (write) operation to a piece of memory for which you haven’t been explicitly been given write privilege; including when the kernel acts on behalf of user code. The NICTA team proved this safety property for seL4 about two years ago (see the integrity paper for details). In a nutshell, the proof shows that whenever the kernel operates, it will only write to memory if it has been provided with a write “capability” to the respective memory region, and it will only make a region accessible for writing by user-code (eg by mapping it into an address space) if provided with a write capability.

Confidentiality is the dual to integrity: it’s about read accesses. The figure at the left illustrates this, by showing allowed (black) and forbidden (red) read accesses. However, it turns out that confidentiality is harder to prove than integrity, for two reasons. 1) violation of confidentiality is unobservable in the state of the entity whose confidentiality has been violated 2) confidentiality can be violated not only by explicit reads (loads or instruction fetches) but also by side channels, which leak information by means outside the system’s protection model.

A NICTA team led by researcher Toby Murray has just recently succeeded in proving that seL4 protects confidentiality. They did this by proving what is called “non-interference”. The details are highly technical, but the basic idea is as follows: Assume there is a process A in the system, which holds a secret X. Then they look at any arbitrary execution traces of an arbitrary second process B. If the initial state of the system of any two such traces is identical except for the value of X, they proved that the two traces must be indistinguishable to B. This implies that B cannot infer X, and thus A’s confidentiality is maintained. The technical paper on this work will be presented at next month’s IEEE “Oakland” security conference.

Binary verification

In order to remove from our trusted computing base the C compiler, as well as our assumptions on the exact semantics of C, the team proved that the kernel binary (i.e., the output of the compilation and linking process) is consistent with the formal model of the kernel implementation. This formal model is actually what had earlier been proved to be correct with respect to the kernel spec (it had been obtained by formalising the C code using the formal C semantics). By verifying the binary against the formal model, the formalisation process (and thus our assumptions on the C semantics) are taken out of the trusted computing base.

Binary verification workflow and proof chain

The verification of the binary went in multiple phases, as indicated by the figure on the right. First, the binary was formalised. This was done with the help of a formal model of the ARMv6/v7 architecture. That formalisation of the architecture had previously been done at Cambridge University, in close collaboration with ARM, and is extensively validated. It’s considered to be the best and most dependable formalisation of any main-stream processor.

The formalised binary, which is now a complex expression in mathematical logic inside the theorem prover HOL4, was then “de-compiled” into a more high-level functional form (the left box called “function code” in the figure). This translation is completely done in the theorem prover, and proved correct. Similarly, the formal kernel model is transformed, using a set of “rewrite rules” which perform a similar operation as the compiler does, into a functional form at a similar abstraction level as the output of the above de-compilation. Again, this operation is done in the theorem prover, by application of mathematical logic, and is proved to be correct.

We now have two similar representations of the code, one obtained by de-compilation of the binary, the other by a partial compilation process of the formalised C. What is left is to show their equivalence. Readers with a CS background will now say “but that’s equivalent to the halting problem, and therefore impossible!” In fact, showing the equivalence of two arbitrary programs is impossible. However, we aren’t dealing with two arbitrary programs, but with two different representations of the same program (modulo compiler bugs and semantics mismatch). The operations performed by a compiler are well-understood, and the re-writing attempts to retrace them. As long as the compiler doesn’t change the logic too much (which it might do at high optimisation levels) there is hope we can show the equivalence.

And indeed, it worked. The approach chosen by Tom Sewell, the PhD student who did most of the work, was to compare the two programs chunk-by-chunk. For each chunk he used essentially a brute-force approach, by letting an SMT solver loose on it. Again, some trickery was needed, especially to deal with compiler optimisations of loops, but in the end, Tom succeeded. Details will be presented at the PLDI conference in June.

The resulting tool is fairly general, in that it is not in any way specific to seL4 code, although it has limitations which are ok for seL4 but might be show-stoppers for other code. Specifically, the tools can’t deal with nested loops (but seL4 doesn’t have any, nor should any microkernel!) At present, the tools grok the complete seL4 binary when compiled by gcc with the -O1 optimisation flag. We normally compile the kernel with -O2, and this introduces a few optimisations which presently defeat the tool, but we’re hopeful this will be fixed too.

This is actually quite a significant result. You may have heard of Ken Thompson’s famous Turing Award Lecture “Reflections on trusting trust“, where he describes how to hack a compiler so it will automatically build a back door into the OS. With this tool, seL4 is no longer susceptible to this hack! If the compiler had built in a back door, the tool would fail to verify the correctness of the binary. But regular compiler bugs would be found as well, of course.

Where are we now?

We now have: 1) Proofs that our abstract kernel spec provides integrity and confidentiality; 2) A proof that the formalised C code is a correct implementation of the kernel spec; and 3) a proof that the kernel binary is a correct translation of the formalised C code. Together this means that we have a proof that the kernel binary has the right security properties of enforcing confidentiality and integrity!

There are still some limitations, though. For example, we haven’t yet verified the kernel’s initialisation (boot-strap) code, this is work in progress (and is nearing completion). There are also bits of unverified assembler code in the kernel, and we don’t yet model the operation of the MMU in detail. So there are still some corners in which bugs could, in theory, be hiding, but we’re getting closer to sweeping out the last refuges of potential bugs.

A more fundamental issue regards the side channels mentioned above. These are traditionally classified into two categories: storage channels and timing channels. The difference is essentially whether a common notion of time is required to exploit them.

In principle our non-interference proof covers storage channels, up to the point to which hardware state that could constitute a channel is exposed in the hardware model that is used in the proof. Presently that model is fairly high-level, and therefore might hide potential storage channels. It does cover things like general-purpose CPU registers, so we know that the kernel doesn’t forget to scrub those in a context switch, but modern CPUs have more complex state that could potentially create a channel. Nevertheless, the approach can be adapted to an arbitrarily-detailled model of the hardware, which would indeed eliminate all storage channels.

Timing channels are a different matter. The general consensus is that it is impossible to close all timing channels, so people focus on limiting their bandwidth. So far our proofs have no concept of time, and therefore cannot cover timing channels. However, we do have some promising research in that area. I’ll talk about that another time.

2013/04/28

When “persistent” storage isn’t

A couple of weeks ago we published our RapiLog paper. It describes how to leverage formal verification to perform logically synchronous disk writes at the performance of asynchronous I/O (see the previous blog on the implications of this work).

A frequent comment from reviewers of the paper was: “why don’t you just use an SSD and get better performance without any extra work?”

My answer generally was “SSDs are more expensive [in terms of $/GiB], and that’s not going to change any time soon.” In fact, the RapiLog paper shows that we match SSD performance using normal disks.

What I didn’t actually know at the time is that, unlike disks, SSDs tend to be really crappy as “persistent” storage media. A paper by Zheng et al. that appeared at the Usenix FAST conference in February studies the behaviour of a wide range of different SSDs when electrical power is cut – exactly the scenario we cover with RapiLog. They find that of the 15 different SSD models tested, 13 exhibit data loss or corruption when power is lost! In other words: SSDs are a highly unreliable storage medium when there is a chance of power loss. They also tested normal disks, and found out that cheap disks aren’t completely reliable either.

They did their best to stress their storage devices, by using concurrent random writes. In RapiLog, we ensure that writes are almost completely sequential, which helps to de-stress the media. And, of course, we give the device enough time to write its data before the power goes.

I quote from the summary of Zheng et al.: “We recommend system builders either not use SSDs for im- portant information that needs to be durable or that they test their actual SSD models carefully under actual power failures beforehand. Failure to do so risks massive data loss.”

For me, the take-away is that putting the database log on an SSD undermines durability just as running the database on a normal disk in unsafe mode (asynchronous logging or write cache enabled). In contrast, using RapiLog with a normal disk gives you the performance of an SSD with the durability of a normal disk. Why would you want to use anything else?

2013/03/18

Green computing through formal verification?

This may seem like a long bow to draw, and I admit there’s a bit of marketing in the title, but there’s substance behind it: Verification can save energy in data centres, by significantly speeding up transaction processing.

How does that work?

The key is the defensive way many systems are built to cope with failure. Specifically database systems, which are behind e-commerce and many other web sites, are designed to protect against data loss in the case of system crashes. This is important: if you do a transaction over the web, you want to make sure that you aren’t charged twice, that if they charge you, they are actually going deliver etc. That wouldn’t be ensured (even if the vendor is trustworthy) if the server crashed in the middle of the transaction.

Databases avoid losses by ensuring that transactions (which is now a technical term) have the so-called ACID properties of atomicity, consistency, integrity and durability. The ones relevant to this discussion are atomicity (meaning that a transaction is always either completed in totality or not at all) and durability (meaning that once a transaction is completed successfully, its results cannot be lost). Databases achieve atomicity by operating (at least logically) on a temporary copy of the data and making all changes visible at once, without ever exposing intermediate state.

Durability is generally ensured by write-ahead logging: The database appends the intended effect of the transaction to a log, which is kept in persistent storage (disk), before letting the transaction complete and making its results visible to other transactions. If the system (the database itself or the underlying operating system) crashes halfway through, the database at restart goes through a recovery process. It will replay operations from the log, as long as the log records a complete transaction, and discards incomplete transaction logs. This ensures that only complete transaction effects are visible (atomicity).

Durability is only ensured if it can be guaranteed that the complete log entry of any completed transaction can be recovered. This requires that the database only continues transaction processing once it is certain that the complete transaction log is actually safely recorded on the disk.

This certainty does not normally exist for write operations to disk. Because disks are very slow (compared to RAM), the operating system does its best to hide the latency of writes. It caches writes to files in buffers in RAM (the so-called block cache), and performs the physical writes asynchronously, that is it overlaps them with continuing execution. This is fine, except if you want to ensure durability of your transactions!

Databases therefore perform log writes synchronously. They use an OS system call called “sync()” (to be precise, it’s “fsync()” in Unix/Linux systems). This call simply blocks the caller until any pending writes on the respective file have finished. The database then knows that the data are on disk, and can proceed, knowing that durability is maintained.

What’s the problem with this?

There’s a significant cost, in terms of latency. Disk writes are slow, and the database has to wait until completion. In some cases, it can spend the majority of the time waiting and twiddling thumbs.

What does formal verification have to do with this?

If the underlying operating system is guaranteed not to crash, then we don’t need to block. The database can continue processing, and the OS does the write asynchronously.

This is essentially the thought I had while sitting in a database seminar a few years ago. I was thinking “you guys go though such a lot of trouble to ensure you don’t lose data. We could do that so much easier!”

In fact, that’s what we’re doing now, in a system we call RapiLog. We’re using the stability guarantees of our formally verified seL4 microkernel to get the best of both worlds: the performance advantage of asynchronous I/O, and the durability guarantees of synchronous writes. And does it ever work! We’re observing a doubling of database throughput under the right circumstances!

And increased throughput means that you can process more requests with the same hardware. This means that your data centre can be smaller, which doesn’t only reduce capital cost, but also running cost, most of which is the cost of electricity for powering the computers and for air conditioning. That’s the link to green computing.

And the cool thing is that we can do this with unmodified legacy systems: We run an unmodified database binary (including commercial databases where we have no source) on an unmodified Linux system. We just put it into a virtual machine and let an seL4-based “virtual disk” do all the magic.

How about power blackouts?

Good question! What happens if mains power suddenly goes down?

For the traditional database setup that isn’t a problem. Either the transaction is recorded in a log entry that is safely on disk when power goes, in which case the transaction is durable and will be recovered at restart time. Or there is no (complete) log entry, in which case the transaction was never completed, and is discarded at restart. But with seL4?

Turns out we can deal with that too. Computer power supply have capacitors which store some electrical energy. When mains power is cut, that energy can run the computer for a bit. Not a long time, only some 100 milliseconds. But that is plenty of time to flush out any remaining log data to the disk. Turns out that all we need is about 20 milliseconds, which is very comfortably inside the window given by the capacitors. All we need is an immediate warning (in form of an interrupt raised by hardware).

Implications on system complexity

Maybe even more important than the performance improvement is the potential for reducing system complexity. Complex systems are expensive (in terms of development and maintenance). They are also error prone, meaning more likely to fail. Reducing complexity is always a good thing. And we have shown that some of the complexity (in this case in database systems) can be avoided if we can provide guarantees about the reliability of the software. Which is what our Trustworthy Systems work is all about.

Want to know more?

Read the paper. It will be presented at next months’ EuroSys Conference in Prague.

2012/10/02

Giving it away? Part 2: On microkernels and the national interest

In my previous blog I addressed a number of misconceptions which were contained in Nick Falkner’s blog on the OK Labs sale, and the newspaper article it was based on.

Note that as far as the newspaper article reported on facts it got them essentially right. However, it drew the wrong conclusions, based on an incorrect understanding of the situation and the realities of commercialisation, and these incorrect conclusions triggered Nick’s blog. My previous blog addressed issues around commercialisation of ICT IP. Now I’d like to address some of the specifics of the OK Labs situation, and the NICTA IP involved.

Before delving deeper, I must say that there are severe limits to what I can reveal. I was a director of OK Labs, and as such bound by law to confidentiality with regard to information which I obtained as a director. Beyond that there are confidentiality clauses affecting the main shareholders (which includes myself as well as NICTA). I also was an employee of OK Labs from early 2007 until mid 2010. Essentially I have to restrict my comments to what’s on the public record or was known to me before the respective agreements were signed.

A tale of three microkernels and four names

First there is the issue of the three kernels, the first one appearing under two names, which continues to create confusion, even though the facts were essentially correctly reported in the newspaper article.

L4

Before OK Labs there was NICTA’s version of the L4 microkernel. This was an evolution of the open-source Pistachio microkernel, originally mostly developed at Karlsruhe University in Germany. We had ported it to a number of architectures, including ARM, had optimised it for use in resource-constrained embedded systems, and had designed and implemented some really cool way of doing context switches really fast (factor 50 faster than Linux). We had also ported Linux to run on top (i.e. used L4 as a hypervisor to support a virtualised Linux). Thanks to the fast context-switching technology, that virtualized Linux ran faster than native.

As I said, this microkernel started off as open-source (BSD license), and remained open-source. While the BSD license would have allowed us to fork a closed-source version (while acknowledging the original authors) this would have been a stupid thing to do. We wanted our research outcomes to be used as widely as possible.

Qualcomm

In 2004, that L4 microkernel caught the attention of Qualcomm. They had two specific (and quite different) technical problems for which they were looking for solutions. One required a fast, real-time capable kernel with memory protection. The other required virtualization of Linux on ARM. Our L4 provided both, and nothing else out there came close.

Qualcomm engaged NICTA in a consulting arrangement to help them deploy L4 on their wireless communication chips. The initial evaluations and prototyping went well, and they decided to use L4 as the basis of their firmware.

This was all before OK Labs was founded. In fact, at the time we created OK Labs, the first phones with L4 inside were already shipping in Japan! And all based on the open-source kernel.

OKL4 microkernel

The engagement with Qualcomm grew to a volume where it was too significant a development/engineering effort to be done inside the research organisation. In fact, the consulting revenue started to threaten NICTA’s tax-free status! Furthermore, we saw a commercial opportunity which required taking business risks, something you can’t do with taxpayer $$. This is why we decided to spin the activity out as OK Labs. OK Labs marketed L4 under the name “OKL4 microkernel”, and continued its development into a commercial-grade platform.

OK Labs initially operated as a services business, serving Qualcomm, but also other customers. Note that they didn’t even need NICTA’s permission to do this, they took an open-source release and supported it. Anyone could have done this (but, of course, the people who had created the technology in the first place were best placed for it). Among others, this meant that there was never any question of royalties to NICTA.

Also, it is important to note that Qualcomm would almost certainly not have adopted L4 if it wasn’t open source. Their style is to do stuff in-house, and it would have been their natural approach to just re-do L4. The engagement with us was unusual for them, but it led to NICTA technology being deployed in over 1.5 billion devices.

OKL4 Microvisor

OK Labs later decided to become a product company, and seek VC investment to enable this. They developed their own product, the OKL4 Microvisor. This is the successor of the OKL4 microkernel, and was developed by OK Labs from scratch, NICTA (or anyone else) has no claim to it. It is licensed (and is shipping) on a royalty basis, which is exactly what you expect from a product company.

seL4

Then there is the third microkernel, seL4. This was developed from scratch by NICTA, and its implementation mathematically proved correct with respect to a specification.

International headlines

The correctness proof was the big-news event that made headlines around the world. It is truly groundbreaking, but primarily as a scientific achievement: something people had tried since the ’70s and later put into the too-hard basket. But, as per my atomic-bomb metaphor in the previous blog, once people know it’s possible they can figure out how to do it themselves. Particularly since we had published the basics of the approach (after all, doing research is NICTA’s prime job, and it’s not research if it isn’t published). And it’s seL4’s development (and all the stuff that made its verification possible) that took 25 person years. This is the effort behind the biggest ICT research success that came out of Australia in a long time. It’s fair to say that this has put NICTA on the map internationally.

Commercialising seL4

seL4 is nevertheless something that can be turned into an exciting product, but that needs work. As argued in the previous blog, that’s not something you do in a research lab, it’s company business. That’s why NICTA needed a commercialisation channel.

The way they decided to do it was to license seL4 exclusively to OK Labs, with a buy-out option (i.e. the option to acquire the IP outright) on achieving certain milestones (for the reasons explained in the previous blog). In exchange, NICTA took equity (i.e. a shareholding) in OK Labs, as a way to get returns back if commercialisation succeeded. Using OK Labs as the commercialisation vehicle was an obvious choice: Firstly, OK Labs was developing the market and distribution for this kind of technology. Secondly, OK Labs does all its engineering in Australia, and any alternative would have been overseas. A reasonable deal.

How about national benefit?

The (more or less clearly stated) implication from the commentators that NICTA made a mistake is totally unfounded. And that should not come as a surprise: the people involved in the decision knew what they were doing. The director of the NICTA Lab where the work happened was Dr Terry Percival. He happens to be the person whose name is on the much-lauded CSIRO wifi patent! And NICTA’s CEO at the time was Dr David Skellern. He was the co-founder or Radiata, which implemented CSIRO’s wifi invention in hardware, and got sold for big bucks to CISCO! Those guys knew a bit about how to commercialise IP!

There are comments about the “non-discussion of how much money changed hands”. Well, that happens to be part of the stuff I can’t talk about, for the reasons listed at the beginning.

Also, national benefits aren’t simply measured in money returned to NICTA. There are other benefits here. For one, there is a publicly stated intent by OK Labs’ purchaser, General Dynamics (GD), to not only maintain but actually expand the engineering operation in Australia. Also, one thing we learn is that technology like seL4 isn’t trivial to commercialise, it requires a big investment. GD has the resources to do this, and is active in the right markets, so has the distribution channels. Finally, there is a lot of on-ging research in NICTA which builds on seL4, and is building other pieces which will be required to make best use of seL4. NICTA owns all this, and is certain to stay in the loop. Furthermore, we now have a lot of credibility in the high-security and safety-critical space. This has already shown tangible outcomes, some of which will be announced in the next few weeks.

Did we make all the best decisions? This is hard to say even with the benefit of hindsight. We certainly made good decisions based on the data at hand. The only sensible alternative (both then and with the benefit of hindsight) would have been to open-source seL4, as we had done with L4 earlier. This might or might not have been the best for maximising our long-term impact and national benefit.

We can’t really tell, but what we do know is that we’re in a very strong position of being able to do more high-impact research. In fact, we have already changed the way people think about security/safety-critical systems, and we are likely to completely change they way future such systems will be designed, implemented and verified.

2012/09/27

Giving it away? Part 1: On Commercialisation of ICT IP

There has been some publicity here in Oz about the sale of OK Labs and its implications. Unfortunately not all of that publicity was particularly well-informed. One particular instance is a blog which in turn refers to an article in The Australian newspaper.

The blog, and, to a degree, the newspaper article exhibit some significant misconceptions about how IP commercialisation works, as well as a number of mis-interpretations of publicly-stated facts about the OK Labs sale.

Commercialisation models

To address the headline issue (quite literally) first: why are there no royalties from the sale? Because a sale is exactly that, ownership changes and there is no continuing obligation to the seller. That shouldn’t be a surprise to anyone. If you sell a house you own, you don’t expect to continue to receive a share of the rent, do you?

In that sense, the comparison with CSIRO’s wifi patent is an apples-vs-oranges comparison: they are two different models. Let me explain.

If someone develops some IP, then there are different options for how to exploit it commercially. One can keep ownership of the IP and license it (resulting in a royalty stream). One can sell it to a company which is in a good position to commercialise it. Or one can create a startup company specifically for commercialising the IP.

Which one is best? If there was a clear answer to this, then we wouldn’t talk about different models — there would only be the one and the others would only exist in theory. The fact is, all three are used (and are used by NICTA as well as CSIRO), because what is best depends a lot on the technology in question.

Patents

The nature of the IP is an important factor in this trade-off. CSIRO’s wifi IP was about an invention that allowed doing high-speed wireless, most of which is incorporated in hardware. Being a method, the obvious thing was to patent it. This way, the method is published, but anyone who uses it has to pay royalties, which is what happened with the wifi patent. (But note that most companies just implemented the method without obtaining a license from CSIRO, and it took many years of lawsuits and a massive investment in legal costs before CSIRO saw any money. Not too many organisations have the kind of cash in the kitty to last through something like this, and many IP owners in the past have been bankrupted through this kind of process.)

In the case of NICTA’s “groundbreaking technologies”, the IP was software artefacts. While there is a lot of patenting of software going on, not much of it survives a challenge by cashed-up multinational corporations. We didn’t consider the patenting route appropriate for our technology, because there wasn’t a core “invention”. Hence it is software (program code and mathematics in machine-readable form) which was our IP.

This kind of IP is a bit like the atomic bomb: For a long time it wasn’t clear it could be built, and many experts doubted it could work, but once the first one was exploded, the secret out, and building one is mostly engineering (although of a very sophisticated and expensive kind). Similar here: once we published the fact that we had done what people had tried in the ’70s and ’80s (and then given up), it was clear it could be done. And there’s no way to stop others from doing it too. Although we now have a big lead, and are continuing to innovate to stay ahead. And, for a number of reasons, anyone interested in using this kind of approach is best placed to do it by working with us, so we’ll stay in the loop, even though the original IP is sold. More about that in the sequel.

Products

Another critical factor is what is needed to turn the IP into a product. Remember, NICTA isn’t a software business, it’s a national research lab, which produces world-class research and then gets the results out into the real world for the benefit of the nation.

Commercialising software artefacts requires turning a research prototype into something that is industrial-strength, easily to use, well supported by tools, well documented, professionally serviced, etc. Furthermore, once the software has reached commercial grade, there needs to be a distribution network, a sales force, etc. That’s a company’s business, not a research lab’s, and the obvious thing is to give it to a company to do.

This could be an existing company, with existing (complementary) products, engineering teams, worldwide sales force, or it could be a startup. The latter step is riskier, but has definite advantage. For example, the startup does its engineering locally, and creates local jobs, while handing it to an existing business will almost certainly mean that the value-adding is done overseas, and if the startup is a huge success, then there’s a big upside.

Startups

Which gets us to how startups work, and here both the blog and the newspaper article have some serious misconceptions.

There are essentially two kinds of startups: slow-growing services businesses and venture-backed product companies. The first kind provides work-for-hire and is constantly dependent on getting the next services contract. Such a business rarely grows beyond a dozen staff or so, although there are notable exceptions (such as the big Indian outsourcers). The other ramps up quickly, but requires significant investment (i.e. millions of $$) to hire staff and build a product which can then be licensed (and create a royalty stream). It’s the second model which is favoured by startups built around some IP. (Icons like Google, Microsoft, Apple, Facebook started that way.)

Venture Capital

The investment comes from venture capital (VC) companies (their money is the fuel that fires the Silicon Valley startup scene). Venture deals vary from case to case, but they tend to have some characteristics in common. They are built around a high-risk, high-gain model. More than 80% of startups simply die, which means the investors lose all their money. Another 10% or so doesn’t produce a big return. Which means the VCs need to make more than a ten-fold return on the rest, else they’d be better off by simply putting their money into a bank.

IP Purity

This has a number of implications. One is that the VCs need to ensure that the return is maximised in the case the startup is sold. The most important consideration for a potential buyer is “purity of IP”: that the startup owns the critical IP without strings attached. If this cannot be ensured, they won’t invest, period.

This was the reason that the IP given by NICTA to OK Labs was exclusively licensed with a buy-out option on meeting certain milestones. It allowed OK to go to VCs as a credible place to invest their money to grow the company. There would not have been investors otherwise.

Proceeds from sale of company

Another characteristic of venture deals is how the proceeds from a sale (or IP) are distributed. The standard model (which is what I described to the journalist of the Australian, without going into any OK Labs specifics) is that on a sale, the proceeds are first used to cover the cost of the sale (which can be substantial, in order to get the best return to shareholders, a bank specialising in acquisitions is typically engaged). Then the investors get their investment back. This is a typical condition in their investment, nothing unusual about this. What’s left is distributed to shareholders according to their share.

A final point about VC deals is that for their investment VCs obtain “preference shares”. These give them control over the company, even if they don’t hold the majority of shares. In particular, they can decide to sell the company and the other (“common”) shareholders have no option but to agree. So, any claims that “NICTA sold OK Labs” are actually wrong, these decisions are made by the investors.

This is simply the standard VC investment model. If you don’t like it, don’t ask for venture capital!

I’ll try to address confusions about the various bits of IP in another blog in the next few days.

2012/06/16

Retracted papers – what happened at USENIX ATC’12?

At the opening session of this year’s USENIX Annual Technical Conference I mentioned that accepted papers were withdrawn, and commended the authors on their integrity. This created some interest, including people tweeting about it. I think the episode deserves some publicity.

There were, in fact, two papers which were withdrawn. Paper A was a “full” paper which was conditionally accepted by the program committee (PC), while Paper B was a “short” paper which was accepted. What happened?

Paper A

Paper A is actually a good example of the scientific peer-review process working well. The paper investigated a particular performance effect in storage devices, and at the PC meeting there was intense discussions of its merits.

The experts debated whether the result could be true and there was significant concern that the effect might be a fluke, potentially an issue with a particular set of devices. On the other hand, one PC member with access to confidential manufacturer data from a similar (yet unreleased) device had evidence that it might be real. If the effect was real, the work would be quite interesting, and definitely worth publishing.

After much debate, we decided to accept the paper conditionally, subject to the authors providing satisfactory answers to critical questions from the shepherd (who was an expert on these devices and the main sceptic).

I should add that, like most systems conferences, ATC uses shepherding of accepted papers. The shepherd is a PC member who needs to approve the final (camera-ready) version of the paper. Specifically, the shepherd ensures that all issues raised by the reviewers are dealt with by the authors. In this sense, all acceptances are somewhat conditional, but this case was different: there were fundamental doubts on whether the paper was sound, and the shepherd’s job was to investigate and either put the doubts to rest or kill the paper.

As it turned out, the reviewers (PC members) had indeed touched a sore point, as the authors noted when trying to provide answers to the shepherd’s questions. In the end, they found that the student doing the experiment had stuffed up, and misrepresented a crucial number by two orders of magnitude! And, for good measure, the student had made another mistake of similar magnitude, which had confused things further. (This is really bad, of course, but it’s a student, still learning the ropes, and it’s the professors’ responsibility to ensure that everything is correct.)

To their credit, the authors reacted quickly, and within less than a week admitted that they had stuffed up, apologised profoundly, and withdrew the paper. Btw, the “sceptic” on the PC still thinks that the work is likely to lead to publishable results, so the authors should definitely complete their investigation!

Paper B

Paper B’s story was a bit more surprising. It was also about a performance issue, this time with networking equipment. Since it was a “short” paper, no earth-shaking results were expected. Yet it was somewhat borderline, with most reviewers being luke-warm at best if not negative about it, thinking it didn’t really account to much of significance. It got up because one reviewer championed it, arguing that it was a surprising result coupled with some nice performance tuning work by the authors. In the end, the paper was accepted, subject to standard shepherding.

When preparing the final version, and re-running experiments to address issues raised by reviewers, the authors found that they could not reproduce the original results. This meant they had no real story that they could back with solid experimental data. So they had to withdraw the paper, and did so without any attempt to bullshit themselves out of the situation.

Lessons

Of course, neither team should have got into this situation in the first place. In both cases there was a failure of the authors’ internal process which allowed (potentially) incorrect results to get out.

However, nobody is perfect, and shit happens. And more of it happens when you’re under time pressure (as we almost always are working towards a paper deadline). And shit happens in the best of families (one of the paper was from very well-known and respected authors).

But it is important to be alert, particularly in a university environment, where inevitably the experimental work, and the low-level evaluation, is done by students, who have limited experience. The profs must be sure they know what’s going on and that they can be confident of the integrity of the experimental data.

The important thing is what both teams did once they realised they had a problem: if you have stuffed up, own up to it rather than making it worse by trying to hide it! Retracting an accepted paper is very painful and highly embarrassing, but that is the smaller evil compared to publishing nonsense. Someone is bound to find out, and then you’re looking really bad! In this case, it was only a few people on the program committee who knew about it, and they weigh the integrity of the authors’ actions higher than their mistakes.

So, let this be an example of what to avoid, but also of how to recover in an honourable way from a bad situation. I commend both teams on their actions.

2012/02/15

Where Andy is wrong about L4

Andy Tanenbaum, in a recent interview with LinuxFr.org, made some comments about L4. Unfortunately, they were mostly wrong.

Make no mistake, Andy is a guy I hold in high regard, he’s done great things over the years. But that doesn’t mean he should get away with making incorrect statements about the main competitor to his Minix system! And I’m not fussed about the outdated number of 300 million L4 deployments (the official figure is 1.5 billion). There are far more substantial issues. So, let’s have a look at what he said.

Andy Quote 1: “The biggest difference is that we [Minix] run the entire operating system as a collection of servers in userland. They [L4] mostly run Linux as a whole unit in userland. This means that a single bug in Linux can crash Linux. While L4 will survive this, all that does is make the reboot faster. It doesn’t solve the reliability problem.”

While it is true that L4 is being used as a virtualization platform to host Linux (something my colleague Hermann Härtig in Dresden has been doing since 1997), and there are phones out there that run Linux on top of L4, that isn’t the whole story. In fact, if all we did was put L4 underneath Linux, we’d do nothing than adding overhead. Clearly, that would be a pointless exercise.

There are several reasons why you’d want to run Linux on top of L4. One is the classical virtualization use case of processor consolidation: running multiple systems side-by-side on the same processor, using virtual-machine encapsulation to isolate one from the other. This is the configuration that is indeed used on some of the phones that ship with L4.

The second reason is legacy support: If you think you can just introduce a new OS API (even if it’s POSIX-compliant) and the world will adapt all its software, you’re dreaming. People want to not only keep re-using their old software, they even want to build new software to the old environments. But that shouldn’t stop you from providing a better environment for safety- and security-critical components.

This trend is indeed very strong in the embedded world, even (or particularly) in the domain of safety- or security-critical devices. I rarely get to talk to someone considering deploying seL4 who doesn’t have a need for a Linux or Windows environment. The important point is that a highly trustworthy microkernel underneath allows this to co-exist safely with the critical functionality.

Andy Quote 2: “There are lots of technical differences between MINIX and L4, but the focus of the projects is very different. We [Andy] have a microkernel so we can run a multiserver system. The entire focus is on multiserver and the ability to restart OS components independently. L4 is being used to run Linux and on phones. This has nothing to do with technical differences. We could have used L4 and they could have used MINIX but it is easier to use your own since you can change it as needs arise.”

Notwithstanding my comments above, the reality is that many of the 1.5 billion L4 deployments have nothing whatsoever to do with Linux! They run L4 on the baseband processor of a smartphone, with Linux (or Windows or iOS or BBOS or xyz) running on a separate applications processor. And the baseband stack is in fact a multi-server design! It consists of multiple (each multithreaded) tasks that communicate with each other. Exactly what Andy claims he does and we don’t…

The statement that “they” (i.e. us) could have used Minix is actually quite wrong. Minix, owing to its design and strong emphasis on simplicity, is much slower than L4. (Their papers typically tell you that throughput is only degraded by a few %. However, that’s typically compared to Minux-2, which isn’t the world’s fastest system either, while a reasonable baseline would be Linux. Furthermore, they don’t tell you how much CPU usage increased – I’ve discussed this with Andy, and unless things have changed a lot recently, then CPU overhead is pretty significant. Pretending that throughput degradation measures overhead while ignoring CPU load is one of my favourite benchmarking crimes!)

Performance happens to be pretty important on a mobile device (whether it’s a phone or an implanted medical device). Throwing away cycles means throwing away battery life, and manufacturers tend to care about this. In my experience, overheads of more than 5% (measured in terms of CPU cycles) are considered unacceptable in this space. Minix certainly wouldn’t be useful in a mobile environment at all!

Andy Quote 3: “They certainly did a nice job [on formal verification of seL4] but it is really hard to verify if it works since the hardware itself is poorly defined and the OS has to deal with the hardware a lot.”

C’mon, Andy, that one’s really cheap! seL4 is the first (and so far only) OS kernel which has a formal (mathematical and machine-checked) proof that its implementation follows the specification. That means nothing less than that the implementation is bug-free with respect to the spec. Something that just cannot be achieved by simplicity and any amount of code inspection or testing!

Andy seems to imply that because the hardware isn’t exactly defined, the verification is somewhat worthless. Of course, if the hardware’s behaviour isn’t defined well enough, then that hurts Minix just as well. But the argument is actually bogus. The verification of seL4 was done for ARM processors, and (unlike x86) they have a very precisely-defined ISA. It’s so well defined that Anthony Fox and Magnus Myreen at Cambridge could turn it into a formal (mathematical) model. And we’re in the process of linking this model to our seL4 code proof.

In reality, the functional correctness proof of seL4 is very powerful. It means that seL4’s behaviour is very precisely defined (there is a formal spec of the kernel). And the implementation adheres to that spec. Which means that in order to prove safety or security properties of a system built on top, it is generally sufficient to prove that these properties hold with respect to the seL4 spec, and by implication they hold for the system. (I’m glossing over a few details here, this argument requires more than functional correctness. However, all the other bits required are either already in place for seL4, or we’re working on them and they should be in place this year!)

So, how about the multi-server issue? This is the classical microkernel approach: you structure the microkernel-based system into multiple components (or servers) and the microkernel enforces isolation between them. This can lead to a more fault-tolerant system, as Andy keeps pointing out (correctly). However, if he seems to imply that this somehow is different from what we do, then that’s simply wrong. In fact, our Trustworthy Systems research agenda uses exactly this component approach. The difference is that we can prove (security or safety or other) properties of our system. No-one else can do this!

On top of that we have a few other cool bits. Such as device drivers which are synthesised from formal specs, and are as such correct by construction! Perfect complement to a formally-verified kernel.

Andy Quote 4 (not actually about L4 but about capability-based OSes like Coyotos): “Capability systems have been around for 40 years. At the recent SOSP conference, M.I.T. professor Jack Dennis got an award for inventing them in 1967. They definitely have potential. My first distributed system, Amoeba, used them. Coyotos, however, is going nowhere.”

Am I to read this as an argument that because CoyotOS is going nowhere, capability systems aren’t?

seL4 is a capability-based system too. And, unlike CoyotOS, it is going places! In fact, capabilities are a core part of the safety/security story in seL4: they allow us to reason precisely about access rights (and information flow) in the system, in a way that is impossible to do with an access-control-list based approach as Minix uses it.

Summary

The two approaches are similar as far as OS structure is concerned. The core difference is that with Minix, Andy is trying to get things right by keeping them really simple. He cops a significant performance overhead, yet in the end can only use traditional quality-assurance methods (testing and code inspection) to make a case about the behaviour of his system. Nor can he formally reason about the security/safety properties of the complete system.

In contrast, with seL4 we have a microkernel that is blindingly fast (as fast as any L4 kernel, and no other microkernel has ever been able to beat L4 in performance), yet we can make actual guarantees about its behaviour and that of systems built on top, with the strength of mathematical proof!

Everything else is simply not playing in the same league.

2012/02/04

Automotive: the next frontier for virtualization

I have been predicting for years that there is a need for virtualization in cars. Guess what, it’s about to happen!

There are two main drivers (excuse the pun): On the one hand, features are mushrooming in cars just as in consumer electronics. And in cars, traditionally each new feature has resulted in yet another microcontroller (or four). The barrier of 100 ECUs (as micros are called in cars) has been broken years ago. Besides the problems of increasing complexity of this distributed computing system, there are very solid reasons why this approach doesn’t scale any more: cars are a very hostile environment for electronics. It must be packaged up to be heat resistant, dust resistant, vibration resistant, water proof, grease proof, acid proof, the lot. This leads to expensive and bulky packaging, which is creating space and weight problems.

On the other hand, there is an increasing need for integration of very dissimilar systems. Classical automotive (control and convenience) functionality is subject to standards set by the automotive industry. But much of the infotainment functionality uses parts from the consumer-electronics space (eg wifi, Bluetooth, Linux, Windows). These are made to standards (or not) that are out of control of the auto industry. And as such it can be hard to predict how such parts interact with automotive subsystems.

An obvious way of getting around the issue of mushrooming functionality is to consolidate on fewer processors. This requires a protected multitasking operating system to isolate functions. However, while classical automotive functionality runs on automotive OSes, such as AUTOSAR, entertainment functionality expects more powerful OS platforms. The automotive industry has reconised this, as is obvious in GENIVI, which is a Linux-based OS. But two different OSes on the same processors only works if they run in separate virtual machines (VMs). Even if they run on different cores of a multicore processor, multiple cores aren’t isolated from each other unless there’s a hypervisor underneath.

Note that this consolidation is not only needed to save parts/space/weight. Increasingly tight integration of functionalities is inevitable. For example, infotainment is becoming tightly integrated with more classical automotive functions, such as navigation and driver assistance. (Example: information on road conditions, obtained from ad-hoc networks formed with other cars, may be used to change driving behaviour, say to stabilise the car when there is a chance of roads being slippery).

However, such integration cannot lead to misbehaviour of consumer-electronics parts spreading to other functionality. If some infotainment component crashes (or worse, catches a virus) you want to be damn sure this doesn’t spread to other functions.

There are other interesting aspects. Automotive components are expected to be ready and operational very quickly when the ignition key is activated. For example, nodes in the automotive CAN and MOST networks have a requirement to start up within 50 ms. GENIVI, being a Linux OS, cannot boot up anywhere near that quickly, yet it must be able to talk to the CAN/MOST bus and is subject to its requirements. The way out is either have the CAN/MOST controller driver on a separate processor (and chip), or, much cheaper, have it in a different virtual machine on the same chip.

All this is nothing compared to having to run GENIVI and AUTOSAR on the same processor, because of integration fo functionalities (eg. the instrument cluster, driven by AUTOSAR, is to display date that comes from the infotainment world). You’re really stuck there without virtualization.

Don’t believe me? Just wait, you’ll see it happening quite soon.