
Nix Modules the Hard Way

·4851 words·23 mins
Nix Kepler Unrefined
Author
Jonas A. Hultén
Opinionated engineer who rambles semi-coherently about DevOps, type systems, math, music, keyboard layouts, security, and misc.

Having now delved into first figuring out how Nix works and getting Home Manager off the ground, the next step is to work out how modules work. Using modules strikes me as the best way to implement sane defaults while still exposing overrides through Nix options.

The question is… how?

A Hello World Module
#

# nix/system/pkgs/hello.nix

{ config, pkgs, ... }:
{
  environment.systemPackages = with pkgs; [ hello ];
}

Quick and dirty, but we should be able to bash that into our root flake and system configuration and we’ll get hello installed. Skipping the suspense: yes we do.

Now we want to make it optional. Can I hide the hello program behind an option which I have to enable elsewhere — it’ll be the root flake — in order for it to be installed?

Making options optional
#

After some serious digging in the NixOS manual — looking at things like options declarations, taking a detour via some type system deep dives, and finally getting to option definitions — here’s a module:

{
  config,
  pkgs,
  lib,
  ...
}:
with lib;
{
  options = {
    kepler.hello = mkEnableOption "hello";
  };

  config = {
    environment.systemPackages = with pkgs; if config.kepler.hello then [ hello ] else [ ];
  };
}

Without changing anything in the root flake, I rebuilt my system. hello vanished.

I added the line kepler.hello = true; to an ad-hoc module in the config and rebuilt. hello reappeared.

Eureka!

Sidebar: what’s happening here #

While my fingers are practically itching with a desire to go forth and modularize, let’s actually explain what the heck is going on in this module. After all, this whole learning-by-teaching schtick I’ve got going on this blog requires that I sometimes pause for explanation.

Function declaration
#

{ config, pkgs, lib, ... }:

This is what the first six lines of the module above look like before the formatter has its way. We’ve seen this syntax before; we’re declaring a function which takes a set… Actually, what’s a set?

Stack push: Sets
#

A set — “attribute set” if you’re being a pedant — is Nix fancy-talk for what you might know as a map, dict, or hash — a mapping of keys, which are usually strings, to values, which can be just about any other data type, including other sets.

There’s some nuance about what set keys can be in Nix. The manual notes it as: “An attribute name can be an identifier or a string.” This implies that identifiers aren’t strings, but a secret second thing. What this means in actuality is that an identifier is an unquoted name which starts with a letter or an underscore and contains only letters, digits, underscores, dashes, or, weirdly, single quotes. Any other string (including arbitrary Unicode like 💀) can be an attribute name as long as it is double-quoted.

The fact that emoji work as strings and, consequently, attribute names isn’t a documented feature. That being said, I tried it and it works, so if you want foo."🐶"."🐱" in your config, no one can stop you.
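To make the identifier-versus-string distinction concrete, here is a small sketch that can be pasted into the REPL. The attribute names are made up for illustration; the only assumption is that quoted emoji keep working the way they did in my test above.

```nix
let
  s = {
    plain-name' = 1;     # an identifier: dashes and single quotes are fine unquoted
    "with space" = 2;    # not a valid identifier, so it has to be double-quoted
    "🐶"."🐱" = 3;        # quoted names nest like any other attribute path
  };
in
[
  s.plain-name'
  s."with space"
  s."🐶"."🐱"
]
```

This should evaluate to [ 1 2 3 ]: quoted and unquoted names end up as ordinary keys of the same set.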

Stack pop
#

Back on track. We declare a function which takes a set, which we expect to contain the keys config, pkgs, and lib. The magic ... just means that other keys are also allowed — if we omitted it, we’d require a set with exactly these three keys, whereas now we allow any set so long as those three keys are in there somewhere.
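A minimal sketch of what the ellipsis changes, using two hypothetical functions (strict and open are illustration names, not anything from nixpkgs):

```nix
let
  # Hypothetical functions, only here to show the effect of the ellipsis.
  strict = { a, b }: a + b;
  open = { a, b, ... }: a + b;
in
{
  fromOpen = open { a = 1; b = 2; c = 3; };   # the extra key c is simply ignored
  fromStrict = strict { a = 1; b = 2; };      # exactly the expected keys
  # strict { a = 1; b = 2; c = 3; } would abort evaluation with an error
  # about the unexpected argument c, so it is left commented out.
}
```

Both attributes evaluate to 3. Modules always get called with more than they ask for, which is why practically every module header ends in the ellipsis.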

As a matter of fact, I don’t really know what’s in the set that’s passed to a module. It’s probably in the documentation somewhere, but we’ll get there when we get there, I guess.

The with construct
#

with lib;

This line notably comes before the set that makes up the function body, which seems a little weird. But, again, it’s nothing we haven’t seen before: recall every time you’ve seen with pkgs; before a list of packages to install.

The documentation describes a with expression as bringing something into lexical scope for the next expression. That’s technically correct — the best kind of correct — but it doesn’t really explain what it does. Its function is to — within the next expression only (more on that later) — take everything in the given set and shunt it into scope. In the with pkgs; example, it means the code

environment.systemPackages = [ pkgs.hello pkgs.cowsay ];

can be written

environment.systemPackages = with pkgs; [ hello cowsay ];

Now, this doesn’t mean anything not inside pkgs is inaccessible: the contents of pkgs have just been added to the scope; they haven’t replaced it.

In our module, with lib; { ... } brings the contents of lib into scope for the definition of the set we’re about to create. Oh yeah, the function that is our module takes a set and returns a set. That’s how modules work. We’ll see why that’s useful (and, insofar as I can tell, standard practice) for modules in a second.
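Here is a small sketch of the scoping rules, using a stand-in set rather than the real nixpkgs lib. It also shows one detail worth knowing: a name bound by let wins over the same name brought in by with.

```nix
let
  lib = { a = 1; b = 2; };  # stand-in set, not the real nixpkgs lib
  a = 10;                   # an ordinary let binding with the same name
in
with lib; [ a b ]
```

This evaluates to [ 10 2 ]: b is only available through with, while a resolves to the let binding. So with adds names to the scope but never overrides ones that are already lexically bound.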

Stepping into options
#

  options = {
    kepler.hello = mkEnableOption "hello";
  };

This chunk comes directly after {, opening the set we’re defining. It is, however, unlike any module we’ve dealt with previously — this options attribute is special to category three modules. Remember those?

Stack push: Module Categories
#

In my previous post, I rambled on about three categories of modules. This isn’t something that exists officially — I made them up to try to delineate modules I understood from modules I didn’t.

  1. Just a set which contains config settings, without the function header, e.g.

    { time.timeZone = "Europe/Stockholm"; }
    

    This is, ostensibly, the simplest type of module, and what I kept calling an “ad-hoc” module when defined in the root flake modules list.

  2. A function which returns a set, e.g.

    { config, pkgs, ... }:
    {
      time.timeZone = "Europe/Stockholm";
    }
    

    This is supposedly what category 1 modules are, under the syntactic sugar of just not having to define a function. The example above could just as easily be written as a category 1 module, since it doesn’t use config or pkgs or, indeed, any input it gets from being a function. However, if we wanted to add packages to the installation, we would need a category 2 module, since we need pkgs.

  3. A function which returns a specific set, e.g.

    { config, pkgs, ... }:
    {
      imports = [];
      options = {};
      config = {
        time.timeZone = "Europe/Stockholm";
      };
    }
    

    This is supposedly what category 2 modules really are — any configuration not inside the config set is just assumed to belong in there. This category also exposes the really powerful constructs of the module system; the ability to both create and define options.

The imports list isn’t technically unique to category 3 modules (indeed, it’s usable in all modules), but it didn’t really make sense to include more dead code in the “simpler” categories. Imports, as the name suggests, allow importing other modules into this one. This is actually the key part of the whole module system, as it allows breaking the configuration apart across multiple files.
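As a sketch of what that looks like in practice, here is a hypothetical module that pulls in the hello module from earlier and sets its option. The file name default.nix is an assumption for illustration; the relative path assumes it sits in nix/system next to the pkgs directory.

```nix
# nix/system/default.nix (hypothetical file name)
{ ... }:
{
  imports = [
    ./pkgs/hello.nix   # the module from the top of this post
  ];

  # Setting the option that the imported module declares.
  kepler.hello = true;
}
```

Note that imports sits alongside plain configuration here, without an explicit config set, so this is still a category 2 module.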

Stack pop: Back to options
#

Here’s the real clincher about these advanced modules. The options set allows us to declare options which can then be set/used in other modules. In our example, we create kepler.hello as an option (we’ll get to exactly how we do that in a second) and just by putting that in the options set of our module, the kepler.hello option is available in all other modules in our configuration.

Think about this for a second, and how powerful this can be. This is how all Nix(OS) modular configuration is built.

Consider:

# <nixpkgs/nixos/modules/services/networking/ssh/sshd.nix>
...
  options = {
    services.openssh = {
      enable = mkOption {
        type = types.bool;
        default = false;
        description = ''
          Whether to enable the OpenSSH secure shell daemon, which
          allows secure remote logins.
        '';
      };

That’s the declaration of the services.openssh.enable option which is part of NixOS. Now, how setting that actually enables SSH… that’s what we’ll get to in a moment.

Let’s first look at the mkEnableOption and mkOption functions. First of all, these are why we want with lib; around our set, otherwise we’d have to call them as lib.mkEnableOption and so on. Maybe you prefer the second style, since it’s more explicit about where functions come from, but I prefer the shorter syntax.

… for now. My code style and ideas about “proper” syntax are subject to change without warning.

These two functions are how we declare an option for Nix to use. What they actually do is way out of scope for this post, so let’s focus on how we use them. The “real” function is mkOption; mkEnableOption is a utility function which wraps mkOption with commonly used settings for making a simple boolean switch. openssh above does not use mkEnableOption for its enable option, though. Why?

While I can’t know for certain, it’s probably because they want to set their description manually, which mkEnableOption doesn’t let you do. Or, well, not really.

According to the documentation

mkEnableOption "hello"

is equivalent to

mkOption {
  type = types.bool;
  default = false;
  example = true;
  description = "Whether to enable hello.";
}

So I guess openssh could’ve defined their enable option as

mkEnableOption "the OpenSSH secure shell daemon, which allows secure remote logins"

but I can also see why that’s clunky.
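For anything that isn’t a plain on/off switch, mkOption is the only choice anyway. As a sketch, here is what a string option could look like. The option name kepler.greeting is hypothetical and isn’t used anywhere else in this post.

```nix
{ lib, ... }:
{
  options.kepler.greeting = lib.mkOption {
    type = lib.types.str;            # only strings are accepted
    default = "Hello, world!";       # used when no module defines the option
    example = "Hej, världen!";       # shown in generated documentation
    description = "Text for a greeting (hypothetical option, for illustration only).";
  };
}
```

This one spells out lib.mkOption and lib.types instead of leaning on with lib;, mostly to show what the longer style looks like.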

Moving on: config
#

  config = {
    environment.systemPackages = with pkgs; if config.kepler.hello then [ hello ] else [ ];
  };

First of all: the config we’re defining here is somehow not the same thing as the config we’re getting from the function call. I have no idea how or why they’re different, but they are. Or maybe they are the same, and there’s some groovy merging happening behind the scenes…? Nix still has some mysteries.

Anyway, this config set is the part we should be most used to — it behaves just like a category 1 or 2 module, as discussed above. While we, in the options set, created options which other modules can use, in the config set we define options from other modules. In the unique case of category 3 modules, we can make those definitions dependent on the settings of our own options from the same module.

In our simple module, we’re looking at the value of the kepler.hello option — which we read from the config set — to determine if we should add hello to our system packages. Since a mkEnableOption is false by default, unless we explicitly define kepler.hello = true; somewhere in our config, our conditional fails, and nothing is added to the systemPackages list.

Refactor
#

Having accidentally looked at how openssh defines its module, let’s refactor ours using some better practices.

{
  config,
  pkgs,
  lib,
  ...
}:
with lib;
let
  cfg = config.kepler.hello;
in
{
  options = {
    kepler.hello = {
      enable = mkEnableOption "hello";
    };
  };
  config = mkIf cfg.enable { environment.systemPackages = with pkgs; [ hello ]; };
}

There are three major changes.

First, we add a let block and define cfg = config.kepler.hello; so that we can use cfg as a shorthand later. How can we reference the option before we’ve defined it? Lazy evaluation, or something.

Second, we shunt the enable option to kepler.hello.enable. The only reason I didn’t do this initially is laziness.

Finally, we wrap the entire config set in a mkIf call. According to the docs — which I only barely understand — this is supposedly a neater way of doing the if-then clause we did earlier, but without causing an infinite recursion wherein we reference config while constructing it. So I guess that does mean that the config in our module is the same as the config we get from the function header. Weird.

I tried looking at the source code for the mkIf function and… nope. It seems to create some sort of set of its own which is, in turn, used in the module configuration merging logic. There’s some serious functional programming voodoo afoot here, and I am nowhere near tired/buzzed/baked enough to understand it.
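One way to poke at this without rebuilding a whole system is to run the module system by hand with lib.evalModules. The following is a sketch under a few assumptions: <nixpkgs> has to be resolvable on NIX_PATH, and the packages option is a made-up stand-in for environment.systemPackages so that no NixOS modules are needed.

```nix
let
  lib = import <nixpkgs/lib>;   # assumes <nixpkgs> is on NIX_PATH

  helloModule = { config, lib, ... }: {
    options.kepler.hello.enable = lib.mkEnableOption "hello";
    # Stand-in for environment.systemPackages (hypothetical option).
    options.packages = lib.mkOption {
      type = lib.types.listOf lib.types.str;
      default = [ ];
    };
    # config here is the merged result of all modules, including this one.
    config = lib.mkIf config.kepler.hello.enable { packages = [ "hello" ]; };
  };

  eval = lib.evalModules {
    modules = [
      helloModule
      { kepler.hello.enable = true; }   # an ad-hoc module flipping the switch
    ];
  };
in
eval.config.packages
```

If the assumptions hold, this should evaluate to [ "hello" ], and to an empty list once the ad-hoc module is removed. The config argument that helloModule receives is the final merged configuration, which fits the conclusion above that it is the same config the module contributes to.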

For my next trick: Home Manager
#

Okay, we’ve made it work under the system scope. Can we get it to work under the Home Manager scope?

As it turns out, that was relatively easy:

# nix/home/pkgs/hello.nix

{
  config,
  pkgs,
  lib,
  ...
}:
with lib;
let
  cfg = config.kepler.hello;
in
{
  options = {
    kepler.hello = {
      enable = mkEnableOption "hello";
    };
  };
  config = mkIf cfg.enable { home.packages = with pkgs; [ hello ]; };
}

and then add

  kepler.hello.enable = true;

to the user’s configuration file.

There’s not really that much of a difference, except that we’re adding the package to home.packages instead of environment.systemPackages. Which, honestly, makes sense.

There’s some nuance here that bears explaining though: the config and options in the Home Manager module are not the same as in a system module. I understand why you’d keep the same nomenclature to keep modules looking relatively the same, but it was a bit of a mental hurdle to get over, initially. Though, that does explain why nixosConfig is passed in to the Home Manager modules as well, since the system config is no longer reachable via just config.

Having dug through the system config via the REPL, the system config and options end up at <hostname>.config and <hostname>.options respectively. That makes sense. The Home Manager configuration ends up in <hostname>.config.home-manager and, more specifically, the <hostname>.config.home-manager.users.<username> set is the one that’s passed in as config in a Home Manager module. What about the options? Great question.

I’ll tell you this much: it’s not <hostname>.options.home-manager. That’s for the core Home Manager options, and the users option defines the users map itself. So… Where on earth do per-user options go?

Sidebar: Evil Recursive REPL Hacking #

Here goes some next level Nix hacking. The goal: to enumerate every key in the system configuration, then filter that down to every occurrence of the word “kepler”.

First off, let’s assume that m always contains my system configuration. This is achieved via

nix-repl> :lf . # Load the root Kepler flake
nix-repl> m = outputs.nixosConfigurations.mercury

Now, listing the keys of the m set isn’t hard. We can just type m and slam Tab, or use the builtins.attrNames function:

nix-repl> builtins.attrNames m
[ "_module" "_type" "class" "config" "extendModules" "extraArgs" "lib" "options" "pkgs" "type" ]

Now’s the fun bit: we want to recurse into each of those keys and find its keys, then start building a list of keys, somehow.

Functionally non-functional
#

In a normal programming language, this is a job for loops — loop over the list of keys, then call some (recursive) function on each key. Unfortunately, Nix is functional, so we don’t really have access to loops. We need to think using recursion from square one.

The builtins.map function allows us to apply a function to each element of a list, which is a nice way of dealing with the list, but it isn’t good enough. Experimentation has taught me that if we recurse using just map we’ll end up with a list of lists of lists of lists of… since map runs a recursive function on each element of the list, each element could potentially be replaced with a list, recursively. What we’d really want is for the _module key to be fully, recursively evaluated before we move on to the _type key. So we need a function which can generate all the leaf-level keys of a key and return a flat list.
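To illustrate the nesting problem, here is a sketch of the naive map-only recursion on some made-up test data. keysNested is a hypothetical helper, not something from builtins.

```nix
let
  sample = { a = 1; b = { c = 2; d = 3; }; };  # made-up test data

  # Naive recursion with map alone: every set turns into a list of results.
  keysNested = prefix: value:
    if builtins.isAttrs value then
      map (name: keysNested "${prefix}.${name}" value.${name}) (builtins.attrNames value)
    else
      prefix;
in
keysNested "root" sample
```

Fully evaluated, this gives [ "root.a" [ "root.b.c" "root.b.d" ] ]. The keys are all there, but each level of nesting in the set becomes a level of nesting in the list, which is exactly what we want to avoid.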

Folding lists and other nonsense phrases
#

Enter builtins.foldl', the scarier sibling of map. As the name suggests foldl' folds a list. The l suggests we do this from the left — this is important if we’re dealing with very long or infinite lists. The ' is there for artistic flair, I imagine.

Further analysis shows that it’s named foldl' rather than foldl to match the name and behavior of the Haskell function by the same name. Specifically, foldl is lazy in its accumulation (if you know functional programming, this makes sense, I promise) whereas foldl' is strict in its evaluation. The Nix builtins, it seems, do not offer a lazy fold, so foldl' it is.

But what does it actually mean to fold a list? The docs “helpfully” illustrate this as foldl' op nul [x0 x1 x2 ...] = op (op (op nul x0) x1) x2) .... What this means, in cleartext, is that the function takes some binary operation (op), i.e. a function with two inputs, and applies it to the initial value (nul) and the first element of the list (x0). Then it applies the same operation to the result of the previous operation and the next value in the list (x1). And so on.

A simple example is

builtins.foldl' (acc: elem: acc + elem) 0 [ 1 2 3 ]

Here we declare the operation to be addition. I know it looks more complex, but that’s really what we’re doing. acc: elem: acc + elem declares a function with two inputs, acc and elem, which returns the sum of those two inputs. We then declare the initial value to be 0 and provide a list [ 1 2 3 ] to iterate over.

Expanding this computation looks like this:

((0 + 1) + 2) + 3

Consequently, the result is 6. Easy as.
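The accumulator doesn’t have to be a number. As a sketch of the shape we’ll need shortly, here is a fold that starts from an empty list and appends to it on every step:

```nix
let
  # The operation: append the doubled element to the accumulated list.
  appendDoubled = acc: elem: acc ++ [ (elem * 2) ];
in
builtins.foldl' appendDoubled [ ] [ 1 2 3 ]
```

Expanded by hand this is (([ ] ++ [ 2 ]) ++ [ 4 ]) ++ [ 6 ], so the result is [ 2 4 6 ].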

Code origami
#

So, how do we fold? Our fold can’t be a simple summation or something like that — the fold operation itself has to be recursive, since we need to recurse into each key in the set. Consequently, the definition of the op part will be where we need to focus our attention. The nul value will be a list — initialized as an empty list — in order to accumulate all the keys. Finally, the list to iterate over will be the result of attrNames s for some set s — initially m, of course.

This is going to be an exercise in function design which is… a very special beast when it comes to functional programming. So, let’s start by considering the two situations our function has to handle:

  • The next element to fold is a set…
  • or it’s anything else

Nuance, again: In the earlier paragraph I stated that we’d be folding over the list from attrNames, but we can’t assume, because of recursion, that any given element will be a set. Consequently, we can’t assume that we can use attrNames. So, we need to look at this from the higher level of asking if a given element is even a set.

The latter case is the “simplest”, wherein we want to add our current key to the accumulator list, but that requires that we know the current key. We can’t just gain this information from looking at the current element, so it has to be passed in as an argument to the function. This already makes things difficult, since the folding function is binary — we know that the only arguments it can have are the accumulator and the next element. Can’t sneak a string in there, so we have to do it elsewhere.

That all being said, here’s a rough draft of what we’ve got so far:

foldop = with builtins; acc: elem:
  if isAttrs elem
  then
    # set handling goes here
  else acc ++ [ _keyFromSomewhere ];

It’s not much, honestly. We’re using the isAttrs function from builtins to check if we’re dealing with a set — recall that they’re actually called attribute sets — and then not doing much else. The then block will contain the recursion into the set, and the else block depends on a key we don’t have. Great start.

From where I’m sitting, it seems we need to wrap foldl' in order to sneak in an additional value: the key.

Haskell Curry, you’ve done it again
#

All Nix functions are curried. What this means is, in a gross simplification, that all functions only take one argument. To create a multi-argument function, you create a function which takes an argument and returns another function which takes an argument, and so on. This allows us to do some very sneaky things.
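A tiny sketch of what that looks like, with made-up function names:

```nix
let
  add = a: b: a + b;   # really: a function of a which returns a function of b
  addFive = add 5;     # partial application, still waiting for b
in
{
  full = add 1 2;                  # 3
  partial = addFive 10;            # 15
  mapped = map (add 1) [ 1 2 3 ];  # [ 2 3 4 ]
}
```

Calling add with one argument isn’t an error; it just hands back a function that remembers the first argument.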

As we’ve seen, our folding function needs to have an additional argument. So, let’s introduce one:

foldop = with builtins; key: acc: elem:
  if isAttrs elem
  then
    # set handling goes here
  else acc ++ [ key ];

Easy as. We’ve got our key. The problem now is that we can no longer use this function as a folding function since it’s no longer binary. That is, unless we make it binary. Currying to the rescue.

If we call our foldop function with just the first argument, it doesn’t return an error like it would in most languages. It returns a function which still needs two arguments — the acc and elem — perfect for folding. The problem then becomes: how do we construct the key?

Stack push: Trial by REPL
#

Can we make sure this runs, as is? For now, just plug in a throw instruction in the then block — we’ll deal with recursion in a moment.

# evil.nix
{
foldop = with builtins; key: acc: elem:
  if isAttrs elem
  then throw "later problem"
  else acc ++ [ key ];
}

I originally wanted to plug this code straight into the REPL, without need for loading an external file, but it was way easier to do it this way. More steps, but I think it’ll handle recursion better than the REPL has during my testing.

nix-repl> :l evil.nix
Added 1 variables.

nix-repl> foldop "" [] 1
[ "" ]

nix-repl> foldop "bar" [] 1
[ "bar" ]

nix-repl> foldop "bar" [] {}
error:
while calling the 'throw' builtin

         at /[redacted]/evil.nix:4:8:

            3|   if isAttrs elem
            4|   then throw "later problem"
             |        ^
            5|   else acc ++ [ key ];

       error: later problem

Good enough.

Back to regularly scheduled currying
#

We can actually make one very quick change already: we know that we must call something recursively on our element, then append whatever it returns to the accumulator list. So:

foldop = with builtins; key: acc: elem:
  if isAttrs elem
  then acc ++ _recursionGoHere
  else acc ++ [ key ];

Here’s where things get interesting. Let’s think about what we want to call foldl' — a binary operation, an initial value, and a list to iterate over. The last two are simple.

_recursionGoHere = foldl' _someOp [] (attrNames elem)

Recall that we are, in this branch, certain that elem is a set, so attrNames elem will evaluate to a list of strings — the names of the attributes in the set.

Our _someOp will need to take the accumulator and the attribute name, extract the value of elem at that name, then feed a new key, the accumulator, and the extracted value into foldop. At least, I think that’s what I need to do.

_someOp = acc: name: foldop "${key}.${name}" acc (getAttr name elem)

I think that’s it. Now, we’re doing a bunch of shady things here, since we’re referencing variables outside of our scope — specifically, we’re using key here to get the key that was originally fed to foldop as well as the elem, where we’re using getAttr to get the value from the set.

Turns out we didn’t need to use currying after all. Oh well.

Time to start going back up the abstraction chain.

_recursionGoHere = foldl' (acc: name: foldop "${key}.${name}" acc (getAttr name elem)) [ ] (
  attrNames elem
);

And finally…

foldop =
  with builtins;
  key: acc: elem:
  if isAttrs elem then
    acc ++ (foldl' (acc: name: foldop "${key}.${name}" acc (getAttr name elem)) [ ] (attrNames elem))
  else
    acc ++ [ key ];

This formatting is brought to you by the nix fmt command.

Trying it out
#

# evil.nix

rec {
  foldop =
    with builtins;
    key: acc: elem:
    if isAttrs elem then
      acc ++ (foldl' (acc: name: foldop "${key}.${name}" acc (getAttr name elem)) [ ] (attrNames elem))
    else
      acc ++ [ key ];
}

Here’s what I actually had to put into evil.nix to make it work. Note that the set is now recursive, via the rec keyword — this is required, or else the foldop call inside won’t be recognized.
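For what it’s worth, rec isn’t the only way to get the self-reference. Bindings in a let block can also refer to themselves, so the following layout is a sketch of an alternative that should behave identically when loaded with :l.

```nix
# evil.nix, alternative layout (a sketch, not what I actually used)
let
  # let bindings are recursive, so foldop can call itself here.
  foldop =
    with builtins;
    key: acc: elem:
    if isAttrs elem then
      acc ++ (foldl' (acc: name: foldop "${key}.${name}" acc (getAttr name elem)) [ ] (attrNames elem))
    else
      acc ++ [ key ];
in
{
  inherit foldop;   # expose it as the only attribute of the file's set
}
```

The upside of this layout is that helper bindings can live in the let block without leaking into the REPL scope.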

So, does it work?

nix-repl> foldop "" [] "foo"
[ "" ]

nix-repl> foldop "bar" [] "foo"
[ "bar" ]

nix-repl> foldop "bar" [] {}
[ ]

Nothing new there, aside from the fully empty list when calling it on an empty set.

nix-repl> foldop "bar" [] { a = 1; b = 2; }
[ "bar.a" "bar.b" ]

Now, that is interesting. Let’s see how it handles recursion.

nix-repl> foldop "bar" [] { a = 1; b = 2; c = { q = 1; r = 3;}; }
[ "bar.a" "bar.b" "bar.c.q" "bar.c.r" ]

That’s exactly what I was hoping for.

Since this all started with me wanting to try and find the “kepler” keys in our system config, let’s see if we can filter the resultant list.

nix-repl> keyfold = foldop "root" []

Oh look! We got to do some currying after all!

nix-repl> builtins.filter (x: (builtins.match "^.*c.*$" x) != null ) (keyfold { a = 1; b = 2; c = { q = 1; r = 3;}; })
[ "root.c.q" "root.c.r" ]
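As an aside on the pattern used in that filter: builtins.match only succeeds if the regular expression covers the whole string, and it returns a list of capture groups rather than a boolean. A small sketch of the three cases:

```nix
{
  partial = builtins.match "c" "root.c.q";         # null: the pattern must cover the whole string
  whole = builtins.match ".*c.*" "root.c.q";       # [ ]: a match, but no capture groups
  captured = builtins.match ".*(c).*" "root.c.q";  # [ "c" ]: one capture group
}
```

That is why the filter compares against null, and why the leading and trailing .* are needed to get a “contains” check. The ^ and $ in my pattern are harmless but redundant.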

Yep, that seems to behave. So, after some additional declarations and currying…

nix-repl> keplerMatch = x: (builtins.match "^.*kepler.*$" x) != null

nix-repl> keplerFilter = builtins.filter keplerMatch

… let’s throw it at the wall and see what sticks.

nix-repl> keplerFilter (keyfold m)
error: stack overflow (possible infinite recursion)

Well, shit.

Overcoming the overflow
#

There are a couple of options here. Either the search space simply got too large and I should’ve put the filtering logic inside the fold, rather than outside it, or we actually did hit infinite recursion. We do know that Nix sets are allowed to be recursive and self-referential. I just hadn’t expected to actually find that in the system configuration.

But actually… let’s look again at what’s in m — i.e. our system configuration.

nix-repl> builtins.attrNames m
[ "_module" "_type" "class" "config" "extendModules" "extraArgs" "lib" "options" "pkgs" "type" ]

pkgs? Is that all of nixpkgs? If so, there’s really no need for us to dig into it. The same goes for lib and probably everything that isn’t config or options. So let’s try again, with less dumb.

nix-repl> keplerFilter (keyfold m.config)
trace: Obsolete option `boot.binfmtMiscRegistrations' is used. It was renamed to `boot.binfmt.registrations'.
trace: Obsolete option `boot.bootMount' is used. It was renamed to `boot.loader.grub.bootDevice'.
error:
while calling the 'filter' builtin

         at «string»:1:1:

            1| keplerFilter (keyfold m.config)
             | ^

while calling the 'foldl'' builtin

         at /home/sldr/src/kepler/evil.nix:6:15:

            5|     if isAttrs elem then
            6|       acc ++ (foldl' (acc: name: foldop "${key}.${name}" acc (getAttr name elem)) [ ] (attrNames elem))
             |               ^
            7|     else

       (stack trace truncated; use '--show-trace' to show the full trace)

       error: The option `boot.loader.grub.bootDevice' can no longer be used since it's been removed.

Well that’s something different, to be sure.

Dicking around with the REPL, I notice I get an error as soon as I access m.config.boot.loader.grub.bootDevice, but not m.config.boot.loader.grub. Can we do something about error-handling this in our code? Specifically, we want to guard the getAttr call against errors.

Thankfully, there’s a builtin function called tryEval which appears to do what we want.

nix-repl> builtins.tryEval m.config.boot.loader.grub.bootDevice
{ success = false; value = false; }

nix-repl> builtins.tryEval m.config.boot.loader.grub
{ success = true; value = { ... }; }
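A small sketch of how tryEval behaves on values that don’t depend on my system config. Two things are worth noting: it only catches errors raised by throw and assert, and it only evaluates the outermost layer of its argument.

```nix
{
  caught = builtins.tryEval (throw "boom");   # { success = false; value = false; }
  fine = builtins.tryEval 42;                 # { success = true; value = 42; }
  # The set itself evaluates fine; the throw inside is never forced.
  shallow = (builtins.tryEval { inner = throw "boom"; }).success;   # true
}
```

The shallow behaviour is fine for our purposes, since the fold guards each attribute lookup one level at a time.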

So, we need to wrap our getAttr in tryEval and only conditionally return the value. Question is, what should we return if we get an error? Well, we know from previous dicking around with the REPL that calling foldop on an empty set will just return acc unaltered, so let’s go for that.

let _guard = tryEval (getAttr name elem);
in if _guard.success then _guard.value else {};

Something like that. Let’s see if we can wrangle that into the function.

# evil.nix

rec {
  foldop =
    with builtins;
    key: acc: elem:
    if isAttrs elem then
      acc
      ++ (foldl' (
        acc: name:
        foldop "${key}.${name}" acc (
          let
            guard = tryEval (getAttr name elem);
          in
          if guard.success then guard.value else { }
        )
      ) [ ] (attrNames elem))
    else
      acc ++ [ key ];
}

I love functional programming declarations. They’re so cursed.

Well, let’s see if that solves the issue…

nix-repl> keyfold m.config.boot.loader.grub
trace: Obsolete option `boot.loader.grub.timeout' is used. It was renamed to `boot.loader.timeout'.
error: stack overflow (possible infinite recursion)

Double shit.
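One idea I haven’t tried against the real system config is to cap how deep the fold is allowed to go. The sketch below adds a hypothetical maxDepth argument in front of the existing ones; the file name and the parameter are assumptions for illustration, and I make no claim that this fixes the overflow above.

```nix
# evil-depth.nix (hypothetical file; maxDepth is an added parameter)
rec {
  foldopDepth =
    with builtins;
    maxDepth: key: acc: elem:
    # Only descend into sets while there is depth budget left.
    if isAttrs elem && maxDepth > 0 then
      acc
      ++ (foldl' (
        inner: name:
        foldopDepth (maxDepth - 1) "${key}.${name}" inner (
          let
            guard = tryEval (getAttr name elem);
          in
          if guard.success then guard.value else { }
        )
      ) [ ] (attrNames elem))
    else
      # Leaves, and sets at the depth limit, are recorded by their key.
      acc ++ [ key ];
}
```

On a toy input, foldopDepth 1 "bar" [] { a = 1; c = { q = 1; }; } gives [ "bar.a" "bar.c" ]: the nested set is listed by its key instead of being entered. With a limit in place, the worst case is at least bounded by the depth rather than by whatever self-references the set contains.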


I’ve spent too much time on this rabbit trail now. While it annoys me greatly that I can’t find the Home Manager options anywhere in the config, I think the tack to take is to trawl through Home Manager’s source code, rather than trying to search through the system config.

Nevertheless, this has been a really good exercise in coding Nix and dealing with the system configuration. Oh, and modules. That was kind of the point. I think.