Sam Thursfield

2 months ago

Status update, 22/09/2025

For the first time in many years I can talk publicly about what I’m doing at work: a short engagement funded by Endless and Codethink to rebuild Endless OS as a GNOME OS derivative, instead of a Debian derivative.

There is nothing wrong with Debian, of course, just that today GNOME OS aligns more closely with the direction the Endless OS team want to go in. A lot of the innovations from earlier versions of Endless OS over the last decade were copied and re-used in GNOME OS, so in a sense this is work coming full circle.

I’ll tell you a bit more about the project but first I have a rant about complexity.

Complexity

I work for a consultancy and the way consultancy projects work is like this: you agree what the work is, you estimate how long the work will take, you agree a budget, and then you do the work.

The problem with this approach is that in software engineering, most of your work is research. Endless OS is the work of thousands of different people, and hundreds of millions of lines of code. We reason and communicate about the code using abstractions, and there are hundreds of millions of abstractions too.

If you ask me “how long will it take to change this thing in that abstraction over there”, I can research those abstractions and come up with an estimate for the job. How long to change a lightbulb? How long to rename a variable? How long to add an option in this command line tool? Some hours of work.

Most real world tasks involve many abstractions and, by the time you’ve researched them all, you’ve done 90% of the work. How long to port this app to Gtk4? How long to implement this new optimization in GCC? How long to write a driver for this new USB beard trimmer device? Some months or years of work.

And then you have projects where it’s not even possible to research the related abstractions. So much changed between Debian 12 and GNOME OS 48 that you’d be a year just writing a comprehensive changelog. So, how can you possibly estimate the work involved when you can’t know in advance what the work is?

Of course, you can’t, you can only start and see what happens.

But, allocating people to projects in a consultancy business is also a hard problem. You need to know project start and end dates because you are lining up more projects in advance, and your clients want to know when their work will start.

So for projects involving such a huge number of abstractions, we have to effectively make up a number and hope for the best. When people say things like “try to do the best estimation you can”, it’s a bit like saying “try to count the sand on this beach as best as you can”.

Another difficulty is around finding people who know the right abstractions. If you’re adding a feature to a program written in Rust, management won’t assign someone who never touched Rust before. If they do, you can ask for extra time to learn some Rust as part of the project. (Although since software is largely a cowboy industry, there are always managers who will tell you to just learn by doing.)

But what abstractions do you need to know for OS development and integration? These projects can be harder than programming work, because the abstractions involved are larger, more complicated and more numerous. If you can code in C, can you be a Linux integrator? I don’t know, but can a bus driver fly a helicopter?

If a project is so complex that you can’t predict in advance which abstractions are going to be problematic and which ones you won’t need to touch, then even if you wanted to include teaching time in your estimation you’ll need a crystal ball to know how much time the work will take.

For this project, my knowledge of BuildStream and Freedesktop SDK is proving valuable. There’s a good reference manual for BuildStream, but no tutorials on how to use it for OS development. How do we expect people to learn it? Have we solved anything by introducing new abstractions that aren’t widely understood — even if they’re genuinely better in some use cases?

Endless OS 7

Given I’ve started with a rant you might ask how the project is going. Actually, quite some good progress. Endless OS 7 exists, it’s being built and pushed as an ostree from eos-build-meta to Endless’ ostree server. You can install it as an update to eos6 if you like to live dangerously — see the “Switch master” documentation. (You can probably install it on other ostree based systems if you like to live really dangerously, but I’m not going to tell you how). I have it running on an IBM Thinkpad laptop. Actually my first time testing any GNOME OS derivative on hardware!

For a multitude of reasons the work has been more stressful than it needed to be, but I’m optimistic for a successful outcome. (Where success means, we don’t give up and decide the Debian base was easier after all). I think GNOME OS and Endless OS will both benefit from closer integration.

The tooling is working well for me: reliability and repeatability were core principles when BuildStream was being designed, and it shows. Once you learn it you can do integration work fast. You don’t get flaky builds. I’ve never deleted my cache to fix a weird problem. It’s an advanced tool, and in some ways it’s less flexible than its friends in the integration tool world, but it’s a really good way to build an operating system.

I’ve learned a bunch about some important new abstractions on this project too. UEFI and Secure Boot. The systemd-sysusers service and userdb. Dracut and initramfs debugging.

I haven’t been able to contribute any effort upstream to GNOME OS so far. I did contribute some documentation comments to Freedesktop SDK, and I’m trying to at least document Endless OS 7 as clearly as I can. Nobody has ever had much time to document how GNOME OS is built or tested, so hopefully the documentation in eos-build-meta is a useful step forwards for GNOME OS as well.

As always the GNOME OS community are super helpful. I’m sure it’s a big part of the success of GNOME OS that Valentín is so helpful whenever things break. I’m also privileged to be working with the highly talented engineers at Endless who built all this stuff.

Abstractions

Broadly, the software industry is fucked as long as we keep making an infinite number of new abstractions. I haven’t had a particularly good time on any project since I returned to software engineering five years ago, and I suspect it’s because we just can’t control the complexity enough to reason properly about what we are doing.

This complexity is starting to inconvenience billionaires. In the UK the entire car industry has been stopped for weeks because system owners didn’t understand their work well enough to do a good job of securing systems. I wonder if it’s going to occur to them eventually that simplification is the best route to security. Capitalism doesn’t tend to reward that way of thinking — but it can reward anything that gives you a business advantage.

I suppose computing abstractions are like living things, with a tendency to boundlessly multiply until they reach some natural limit, or destroy their habitat entirely. Maybe the last year of continual security breaches could be that natural limit. If your system is too complex for anyone to keep it secure, then your system is going to fail.

#codethink #gnome

Inside the Jaguar Land Rover hack: stalled smart factories, outsourced cybersecurity and supply chain woes

Being a carmaker where ‘everything is connected’ has left JLR unable to isolate its plants or functions, forcing a shutdown of most systems

Jasper Jolly (The Guardian)
