Fixpoint

2019-11-29

Introducing Gales Linux, a cross-bootstrapped, do-it-yourself, fully-static, discriminatory distribution

Filed under: Software — Jacob Welsh @ 20:57

Motto: Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.(i)

Gales Linux is a new operating system distribution, navigating the stormy seas of software since 2017 and to be released shortly. While it is composed mainly of existing components, the selection of those components, the build and install processes, and the glue are of my own design and implementation; they are not derived from existing distributions, though certainly informed by them. It provides a bulwark against the seemingly uncontrollable growth of accidental complexity and technological churn that characterizes the modern Linux scene, harking back to a time of more comprehensible computing while aiming to incorporate some of the better ideas that have come along since. That said, it is not intended to be for all comers or all purposes. It's been an experiment in applying certain design elements and seeing how far they can go. I find it useful already; if you do too, then great; and if you're inclined to build your own system you may find it a useful starting point or resource to draw parts from.

It includes a Linux kernel (that you will need to configure yourself to fit your hardware), a GCC 4.7.4(ii) toolchain supporting C and C++, the musl C library, BusyBox utilities and a custom pdksh-derived shell.(iii) At present the environment is text only. Notably, the bootstrap procedure is documented and does not require an existing Gales system or matching architecture; in theory it can work from any reasonably POSIX-like system, which so far has been demonstrated on Gentoo, OpenBSD and Gales itself.

I've had three main guiding principles in the design process. First, the system should loyally serve the operator; for example, the act of installing, reinstalling, upgrading, patching or otherwise changing a program should not "helpfully" modify live configuration or daemon process state. Second, the system should preserve meaning: while the ideal of direct execution from human-readable source code may not be presently practical, it should be the preference, and full reproduction of the system from source should be regarded as a primary necessity. Third is the old "Keep It Simple, Stupid" - perhaps better formulated as "fits in head".

History

I had made some earlier attempts to take control of an OS starting in 2016. At the time I was running Fedora, Debian and OpenBSD, having been soured by the constantly broken builds in Gentoo after years of using it.(iv) The idea was to adapt the Linux From Scratch process to bootstrap a musl-based system from source, then use the existing Red Hat Package Manager for applications. The effort was unsuccessful but instructive. I then tried Alpine Linux; it appeared to be elegant and developed by knowledgeable musl people, but I was alarmed to find that despite a claimed focus on security, its package management tool was written in C and "secured" by HTTPS. Next I tried the experimental "Gentoo hardened musl" project, using its "stage3" binary as a base but building further libraries and applications manually, thus forcing myself to inspect the upstream offerings, read the READMEs and run the ./configures. This went fairly well and I built up a personal archive of sources, patches and recipes to document my steps. I planned to deal with the remaining mystery blob of the stage3 by reproducing it from some pre-existing system; in the sort of surprise that by then was becoming unsurprising, I found that Gentoo's bootstrapping tool, "catalyst", did not support cross builds,(v) nor had the "hardened musl" project left any documentation on how they'd seeded their image.

Around the same time I had started studying the work of Daniel J. Bernstein aka djb. Some revelations were that much of what I had understood as "package management" was an unnecessary result of poor filesystem layout; that bug-free code wasn't such an unrealistic thing to aim for, but required questioning established interfaces; and that the more enticing aspects from a sysadmin's standpoint of the "systemd" abomination had been available at least a decade prior and with vastly less code. With Gentoo appearing to have minimal value left to offer, I set out to revive my from-scratch process and build a full system around these ideas.

Key design decisions

1. No separate "package database." A package-major hierarchy plus symbolic links is enough.

2. Fully static linking. The operator should be free to modify libraries, even keep multiple versions around, without risk of breaking existing programs. Thanks to musl the cost of object code duplication is low in most cases; in theory, program load time and memory consumption can even improve compared to traditional GNU/Linux and without extra caching mechanisms. Questions of how to build a given thing static or dynamic or both become unnecessary. In combination with 1, packages can be updated or rolled back more-or-less atomically.

3. Minimal PID 1 (init), as it occupies a position with special privileges and reliability requirements. Use external scripts for the boot process and daemontools for service management.

4. Static device management. No layers of "tweak the udev rules to tell the daemon how to regenerate the nodes" - that's what the filesystem is for. If you need to tweak /dev, you just do it.

5. Use initramfs for install and rescue environments and ensure its contents are easily customized. One result is a "viral" property that installing the system does not require physical boot media, merely an existing Linux-compatible bootloader (though of course a bootloader can be installed on external media).

6. Lightly automated build and install tools for additional software ported to the system, working based on build definitions including metadata, source checksums, patches, and build scripts.

7. A conservatively curated library of such software, known as the gports tree.

8. No effort to keep up with churning data sets such as Unicode, time zones, or message translations. News from the OS should pertain only to the functioning of the OS; definitionally unstable databases are the operator's business.

9. Config files protected without any special mechanism: shipped configs are installed exclusively to /etc/examples, from which you can copy or diff at leisure. This does mean you sometimes need to check for such examples for things to work as expected.

10. Simplifying third-party build systems as a gradual effort, typically replacing autotools spew with static config.h and Makefile, which often provides a noticeable build speedup and greatly eases investigation of questions like "wtf code am I even running?!"

11. Few libraries visible in standard search paths. To link with a non-standard library you use the -I and -L compiler flags to indicate its installed path. Thus linkage becomes much more explicit even in the presence of magical build systems that try everything they see.

12. Self-extracting shell archives, allowing precise and deterministic specification of metadata for trees of text or binary files without inheriting the complexities of the "tar" formats (plural!).

13. Deterministic build for the base so as to truly factor out the bootstrap host. While much progress was made here to the extent that results were bitwise-reproducible from two Linux systems, the goal of extending this to any host remains elusive, particularly in GCC and the kernel.

14. HTTP mirror for third-party source tarballs, including base and ports, with script to replicate and efficiently synchronize without allowing existing files to change or disappear.(vi)

15. Original sources (including documentation, scripts, base config files, gports and patches) kept in a single relatively lightweight repository suitable for management with V.
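
As a rough illustration of points 1 and 2, here is a minimal sketch of how a package-major hierarchy plus symbolic links can stand in for a package database. The pkg/NAME-VERSION layout, the activate function and the hello package are hypothetical names chosen for the example, not the actual Gales tree or tooling, and the -T option assumes a GNU-style mv.

```sh
#!/bin/sh
# Sketch only: hypothetical layout and names, not the actual Gales tooling.

# activate ROOT NAME VERSION
# Make ROOT/pkg/NAME a symlink to NAME-VERSION. The new link is created
# under a temporary name and renamed over the old one, so a reader sees
# either the old version or the new one.
# The -T option (assumed: GNU-style mv) stops mv from following the old
# link and dropping the new one inside the old version's directory.
activate() {
    root=$1 name=$2 version=$3
    if [ ! -d "$root/pkg/$name-$version" ]; then
        echo "activate: $name-$version is not installed" >&2
        return 1
    fi
    ln -s "$name-$version" "$root/pkg/$name.new" &&
        mv -T "$root/pkg/$name.new" "$root/pkg/$name"
}

# Demo in a scratch directory: two versions side by side, then a rollback.
demo=$(mktemp -d)
mkdir -p "$demo/pkg/hello-1.0" "$demo/pkg/hello-1.1"
activate "$demo" hello 1.1 && readlink "$demo/pkg/hello"   # hello-1.1
activate "$demo" hello 1.0 && readlink "$demo/pkg/hello"   # hello-1.0
rm -rf "$demo"
```

Because every version keeps its own directory and statically linked programs carry their libraries with them, switching the link is the whole "upgrade"; nothing else needs to be recorded or rewritten.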

Over the next few days I will be dusting off the repository and publishing the code and some stats, so stay tuned!

  i. R. Kelsey, W. Clinger, J. Rees (eds.), The Revised^5 Report on the Algorithmic Language Scheme.
  ii. The last GCC release series that can be bootstrapped purely from C.
  iii. IMHO providing a good compromise between comfort, code size and standards compliance. Bash is available as an option.
  iv. For one thing, it tries to take no stands and be adaptable to any purpose through a system of USE flags controlling how programs are built; for another, it has a "rolling release" model and generally accepts upstream updates. The result is a combinatorial explosion such that nothing really gets tested and every Gentoo system becomes unique, uncharted territory.
  v. Which raises doubts as to what extent it really builds from sources rather than importing artifacts from the host system, something that can easily happen by accident given the complexity of the toolchain.
  vi. A present flaw is that the sync script doesn't allow subdirectories - validating server-provided paths in a shell script is tricky! - yet the mirror has one. Manual intervention required for now.
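
To give a flavor of the path validation mentioned in the sync script footnote, here is a minimal sketch under stated assumptions: safe_name and want_file are hypothetical helpers, not the actual Gales sync script, and the whitelist of allowed characters is my own guess at a conservative policy. Like the script described, it refuses anything containing a slash, and it never reports an existing file as wanted.

```sh
#!/bin/sh
# Sketch only: hypothetical helpers, not the actual Gales sync script.

# Character ranges in patterns depend on the locale; pin it down.
LC_ALL=C
export LC_ALL

# safe_name NAME: succeed only for a plain file name that is safe to
# create in the current directory.
safe_name() {
    case $1 in
        ''|.*|-*) return 1 ;;            # empty, dot names, option-like names
        *[!A-Za-z0-9._+-]*) return 1 ;;  # outside the whitelist, which excludes "/"
    esac
    return 0
}

# want_file DIR NAME: succeed if NAME is safe and nothing by that name
# (file or symlink) exists in DIR yet, so existing files are never replaced.
want_file() {
    safe_name "$2" && [ ! -e "$1/$2" ] && [ ! -L "$1/$2" ]
}
```

A sync loop would then read each server-provided name, call want_file on it, and fetch only the names that pass.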

15 Comments »

  1. [...] understand GCC 4.9 coming into Republican use as the last GCC prior to version 5 wreckage. Gales Linux uses 4.7.4 in part because it's the last GCC that doesn't require any C++ to compile. What are the [...]

    Pingback by Implementing TMSR OS « Dorion Mode — 2019-11-29 @ 21:16

  2. The link at footnote i is broken already, you might want to mirror it really. A quick search seems to have turned out https://www.schemers.org/Documents/Standards/R5RS/HTML/ as your intended target perhaps?

    Sounds good otherwise and as mentioned before, I'd rather give it a spin already.

    Comment by Diana Coman — 2019-11-29 @ 22:17

    > The link at footnote i is broken already, you might want to mirror it really.

    Ah, thanks, it worked via my hosts file apparently (212.110.186.28 readscheme.org library.readscheme.org repository.readscheme.org) where I put it last time their domain broke, at which time I indeed grabbed a mirror. Guess I get to be the scheme library now huh!

    An audience certainly provides good motivation, this one was wasting away on the todo list for quite some time.

    Comment by Jacob Welsh — 2019-11-29 @ 22:34

  4. [...] I am pleased to present, at long last, an initial public release of my Gales Linux [...]

    Pingback by Gales Linux initial release « Fixpoint — 2019-12-01 @ 21:05

  5. > 2. Fully static linking. The operator should be free to modify libraries, even keep multiple versions around, without risk of breaking existing programs.

    "Free to modify libraries" reads to me that multiple C libraries can be installed for use. Similar to how you have gcc64 in gports, could you also have glibc as a gport and build/install dynamically linked programs with it? Such as nvidia proprietary drivers:

    > jfw: I expect that'd be quite difficult, its libGL is a glibc-based .so

    Though I don't believe I've linked with non-standard libraries in my Gales usage, the following reads to be a way forward in curing the static/dynamic linking divide. Fully static by default with dynamic as optional add on.

    > 11. Few libraries visible in standard search paths. To link with a non-standard library you use the -I and -L compiler flags to indicate its installed path. Thus linkage becomes much more explicit even in the presence of magical build systems that try everything they see.

    What are some of the top difficulties you'd expect to encounter?

    Comment by Robinson Dorion — 2019-12-09 @ 16:29

  6. [...] ebuild example. Going forward I will be turning my attention away from Portage and towards a Gales install + report. My hope is that Gales will be a bit easier for me to digest and also will force [...]

    Pingback by A brief look into Portage and ebuilds « Krankendenken — 2019-12-23 @ 07:23

  7. [...] I had the opportunity to give a first-pass read both bvt's install report as well as some of jfw's writings on the topic. My earlier concerns about UEFI were validated once I read spyked's recent article as [...]

    Pingback by ejb plan: week 3 (Jan 6 - Jan 12) « Young Hands Club — 2020-01-05 @ 19:30

  8. @Robinson Dorion:

    > "Free to modify libraries" reads to me that multiple C libraries can be installed for use.

    Though indeed this static linking benefit applies to libc as to any other, the possibility is not quite realized in present Gales. We have it installed in /lib, as opposed to say /gales/pkg/musl-xyz/lib, making it the only exception to the package-based filesystem layout besides busybox. This could be changed, indeed probably should; I reckon all the "base" components should also be buildable via gports anyway so as to modify and rebuild after initial bootstrap. (Though this does seem to slide further from the BSD-style core/periphery distinction; if that's an important thing it'll need some clarification.) The active musl version would then be symlinked into /lib so gcc finds it by default unless told to look elsewhere.

    > Though I don't believe I've linked with non-standard libraries in my Gales usage,

    The sense of "standard" I meant here (point 11) was simply libraries in the compiler's standard search path, that is, /lib, namely musl (and possibly ncurses, a compromise to make "menuconfig" work without patching). That is, you have used "non-standard" libs, though perhaps unawares as the linker paths were handled by gports.
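
    For illustration, the explicit search paths of point 11 might be spelled out as in the sketch below. The search_flags helper and the /gales/pkg/... prefixes in the usage note are hypothetical, chosen for the example rather than taken from gports.

```sh
#!/bin/sh
# Sketch only: a hypothetical helper, not part of gports.

# search_flags PREFIX...: print the -I and -L flags that point the
# compiler and linker at each package prefix given.
search_flags() {
    flags=
    for prefix in "$@"; do
        flags="$flags -I$prefix/include -L$prefix/lib"
    done
    # Strip the leading space left by the first iteration.
    echo "${flags# }"
}
```

    A build would then name its dependencies outright, along the lines of gcc $(search_flags /gales/pkg/zlib) prog.c -lz, and anything not listed is simply not found.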

    But to the point: I'd say there are two divides of interest actually: static vs. dynamic, and Felker vs. Drepper.

    That is, dynamic musl might be an option. I noticed there is some glibc binary compatibility layer; I don't know how far it goes, but hacking on that might be preferable in comparison to doing anything with glibc itself.

    I'm not certain if you can just swap in a dynamic libc with the existing compiler or if we'd have to build a whole separate toolchain. Either way, we'd need to make sure the default does stay static, that is, either the dynamic libc doesn't go in /lib or the default toolchain doesn't search there. The first is much more in the Gales spirit; one potential difficulty is that dynamic executables contain a fixed path to the dynamic linker that 'bootstraps' them, e.g.:

    $ readelf -l /bin/ls
    ...
    [Requesting program interpreter: /lib/ld-musl-x86_64.so.1]

    and this is considered in some sense ABI. Might not be a real problem if we don't care about executable as opposed to library blobs.
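
    A small sketch of checking for that fixed path mechanically, assuming the readelf output format shown above; interp_of and linkage are hypothetical helper names.

```sh
#!/bin/sh
# Sketch only: hypothetical helpers built around `readelf -l` output.

# interp_of: read `readelf -l` output on stdin and print the requested
# program interpreter, or nothing if no such line is present.
interp_of() {
    sed -n 's/.*\[Requesting program interpreter: \([^]]*\)].*/\1/p'
}

# linkage FILE: say which dynamic linker FILE asks for, if any.
# No interpreter is what a static executable looks like, though a shared
# library has none either.
linkage() {
    interp=$(readelf -l "$1" | interp_of)
    if [ -n "$interp" ]; then
        echo "interpreter: $interp"
    else
        echo "no interpreter requested"
    fi
}
```

    Running linkage over a directory of binaries gives a quick view of which ones depend on a dynamic linker at a fixed path.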

    I'm not sure I've entirely answered the question though, probably because there's still a lot of unknowns for me here.

    Comment by Jacob Welsh — 2020-01-18 @ 22:17

  9. Btw, I took a fresh look at the NVIDIA driver architecture. It consists of some glibc-based .so's (multiplied since last I saw, including addition of a supposed multi-vendor support layer), a kernel module with open source part (GPL subterfuge layer) and binary part, and some configurator and control-panel type programs.

    Comment by Jacob Welsh — 2020-01-19 @ 01:26

  10. In fact, the subterfuge nvidia came up with to subvert the original FOSS notions, once accepted by Linus, became the standard path then trod by utter nonsense such as "UEFI" & friends.

    More generally, the problem of trying to "protect" a houseful of whores by building walls around it is that... well... a hole's always found. Nothing's easier than a pore opening through wall, by the willing holes within.

    Comment by Mircea Popescu — 2020-01-19 @ 19:43

  11. I believe it!

    Comment by Jacob Welsh — 2020-01-20 @ 21:37

  12. [...] familiarity with jfw's Gales Linux was encouraging and since he'd just published it, having Republican eyes look it over is a good [...]

    Pingback by TMSR OS, January 2020 Statement « Dorion Mode — 2020-01-27 @ 04:12

  13. [...] denominated in units of BusyBox. He tried to get a point through to jfw and dorion that their Gales project didn't exist as a thing worth naming, by reason of building on far larger components, yet [...]

    Pingback by From the forum log, 27-29 December 2019 « Fixpoint — 2020-01-29 @ 00:36

  14. [...] is... well, better to let previous writings to speak for themselves; more precisely, JFW's introduction to Gales Linux; the initial release; and Bvt's installation report are so far the foremost pieces on the subject, [...]

    Pingback by A journey through the Gales installation process « The Tar Pit — 2020-01-31 @ 15:54

  15. [...] is the thing to break if you want to make Jacob cross and otherwise a fully static linux distribution of delightfully small size and clear setup that works reportedly ~everywhere, from Panama to the [...]

    Pingback by Thinkpad in Gales « Ossa Sepia — 2020-02-07 @ 14:46

Powered by MP-WP. Copyright Jacob Welsh.