My workflow with NixOS: how I work with it

12 Sep 2022 · Seven minute read · on Gianluca's blog


Some context

Coding is fun when you can figure out the right workflow. There is nothing fun about writing software in a way that is not sustainable or that does not spark joy.

I started to use Nix and NixOS almost two years ago, in a previous job in a totally different context.

Back then we had to provision operating systems, build software and so on, quickly and often. Since I moved back to writing software, and to writing Rust, I have to admit that building my code or shipping operating systems is not something I have to do very often, but I decided to keep learning and fighting with NixOS because it fits my mindset.

Recently I dug out a few NUCs I kept in a box, because everybody deserves a home lab, and a good home lab deserves some netbooting, so it was time to play with NixOS on something that is not my workstation or my laptop.

The workflow

Nix is code, finally. It means that there are libraries, you can import them, run tests, and execute the code. In my experience YAML and JSON become a limitation at some point, or they create friction, and you end up with an easy-to-break template engine.
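To make "Nix is code" a bit more concrete, here is a minimal sketch that generates one attribute set per machine from a plain list, with no template engine involved. The host names, the sshEnabled field and the mkHost helper are made up for illustration. Only builtins are used, so it should evaluate with nix-instantiate --eval --strict.

```nix
# hosts.nix (hypothetical file): build one entry per machine from a plain list.
let
  # Made-up host names, only for illustration.
  hosts = [ "nuc1" "nuc2" "nuc3" ];

  # A function from a host name to the name/value pair that listToAttrs expects.
  mkHost = name: {
    inherit name;
    value = {
      hostName = name;
      sshEnabled = true;
    };
  };
in
# Evaluates to an attribute set with the keys nuc1, nuc2 and nuc3.
builtins.listToAttrs (map mkHost hosts)
```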

I decided to invest some time to figure out how to use flakes. This is where I am so far:

{
  description = "A generic and minimal netbooting OS for my homelab";

  inputs =
    {
      nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.05";
    };

  outputs = { self, nixpkgs, ... }:
    let
      system = "x86_64-linux";
    in
    {
      nixosConfigurations = {
        generic = nixpkgs.lib.nixosSystem {
          inherit system;
          modules = [
            ./configuration.nix
          ];
        };
      };
      packages.${system}.netboot = nixpkgs.legacyPackages.${system}.symlinkJoin {
        name = "netboot";
        paths = with self.nixosConfigurations.generic.config.system.build; [
          netbootRamdisk
          kernel
          netbootIpxeScript
        ];
        preferLocalBuild = true;
      };
    };
}

I am not the right person to tell you what all of this does because I am not an expert and it is the outcome of many videos on YouTube, questions on discourse.nixos.org, articles and beers, a lot of beers.

The outputs part describes what I want to build, and as you can see there are two outcomes. One is nixosConfigurations: it can contain more than one NixOS description, but right now I have a single one called generic, which imports a module called configuration.nix. You can see it as a ready-to-go NixOS provisioned the way I want. It is 99% a copy-paste of a traditional configuration.nix file as you may know it. The one I use comes from the “Netbooting Wiki” on NixOS.org.

{ config, pkgs, lib, modulesPath, ... }: with lib; {
  imports = [
    (modulesPath + "/installer/netboot/netboot-base.nix")
  ];
  users.users.root.openssh.authorizedKeys.keys = [
    "ssh-sfdbsrbs"
  ];

  ## Some useful options for setting up a new system
  services.getty.autologinUser = mkForce "root";

  environment.systemPackages = [ pkgs.tailscale ];

  networking.dhcpcd.enable = true;

  services.openssh.enable = true;
  services.tailscale.enable = true;

  hardware.cpu.intel.updateMicrocode =
    lib.mkDefault config.hardware.enableRedistributableFirmware;

  systemd.services.tailscale-autoconnect = {
    description = "Automatic connection to Tailscale";

    # make sure the tailscale daemon is running before trying to connect;
    # the unit defined by the NixOS module is tailscaled.service
    after = [ "network-pre.target" "tailscaled.service" ];
    wants = [ "network-pre.target" "tailscaled.service" ];
    wantedBy = [ "multi-user.target" ];

    # set this service as a oneshot job
    serviceConfig.Type = "oneshot";

    # have the job run this shell script
    script = with pkgs; ''
      # wait for tailscaled to settle
      sleep 2

      # check if we are already authenticated to tailscale
      status="$(${tailscale}/bin/tailscale status -json | ${jq}/bin/jq -r .BackendState)"
      if [ "$status" = "Running" ]; then # if so, then do nothing
        exit 0
      fi

      # otherwise authenticate with tailscale
      ${tailscale}/bin/tailscale up -authkey tskey-really
    '';
  };

  networking.firewall = {
    checkReversePath = "loose";
    enable = true;
    trustedInterfaces = [ "tailscale0" ];
    allowedUDPPorts = [ config.services.tailscale.port ];
  };

  system.stateVersion = "22.05";
}

The only difference compared with a traditional non-flake configuration is the import:

  imports = [
    (modulesPath + "/installer/netboot/netboot-base.nix")
  ];

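modulesPath is an argument that NixOS passes to every module. It points to the nixos/modules directory of the nixpkgs that evaluates the configuration, which here is the nixpkgs declared as flake input. It replaces the <nixpkgs/...> lookup path a traditional configuration would use.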
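For comparison, a sketch of the two ways to write that import. The angle-bracket form is what a channel-based configuration.nix would typically use. It is resolved through NIX_PATH, which a pure flake evaluation does not allow, so it appears here only as a comment.

```nix
{ modulesPath, ... }:
{
  imports = [
    # Traditional form, resolved through NIX_PATH (channels); comment only:
    # <nixpkgs/nixos/modules/installer/netboot/netboot-base.nix>

    # Flake-friendly form: relative to the nixpkgs that evaluates this module.
    (modulesPath + "/installer/netboot/netboot-base.nix")
  ];
}
```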

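This OS does a few simple things: it authorizes my SSH key for root, logs root in automatically on the console, enables OpenSSH and Tailscale, connects to Tailscale at boot, and configures the firewall to trust the tailscale0 interface.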

The output nixosConfigurations is used via nixos-rebuild. It took me some time to figure out that nixos-rebuild, used in the right way, does not replace my current operating system. Do not run nixos-rebuild switch if you don’t want to mess up your local NixOS installation! Instead you can build this operating system in the ./result directory via:

$ nixos-rebuild build --flake .#generic

A single flake can describe several NixOS systems, which is why you have to name the one you want to build with .#generic.
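As a sketch of what "more than one" could look like, the outputs below add a second system next to generic. The nuc1 name and the ./hosts/nuc1.nix module are hypothetical; they are not part of the flake shown earlier.

```nix
{
  description = "Sketch: two NixOS systems in one flake";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-22.05";

  outputs = { self, nixpkgs, ... }:
    let
      system = "x86_64-linux";
    in
    {
      nixosConfigurations = {
        # The existing netboot system.
        generic = nixpkgs.lib.nixosSystem {
          inherit system;
          modules = [ ./configuration.nix ];
        };
        # Hypothetical second system: same base plus host-specific settings.
        nuc1 = nixpkgs.lib.nixosSystem {
          inherit system;
          modules = [ ./configuration.nix ./hosts/nuc1.nix ];
        };
      };
    };
}
```

With something like this in place, nixos-rebuild build --flake .#nuc1 would select the second system.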

The second output builds the same OS but shapes the content of the ./result directory the way I want it (I am not sure I need it, but this is what the NixOS netbooting wiki does, so far so good).

To build it you can use nix build:

$ nix build .#netboot

Pretty cool! I can tar.gz that and ship it where I want. Straightforward.

How to run this VM

Do you know how boring and time consuming it is to test a new operating system?

If you want to do it on real hardware you have to set it up, and if you want to use QEMU you have a few days in front of you to remember all the flags you need, how to bridge the guest with the host and who knows what. I tried for a few days and I failed, until I discovered:

$ nixos-rebuild build-vm --flake .#generic
building the system configuration...

Done.  The virtual machine can be started by running /nix/store/dk4i22xmacnxxdmgvjhlyain5spb11yn-nixos-vm/bin/run-nixos-vm

Pure gold! If you run the run-nixos-vm script a QEMU virtual machine will appear ready for you to test your operating system. Kind of cool! I can even see it showing up in the Tailscale admin console!

A zero-friction experience that boosts my ability to try what I am working on.
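If the default VM turns out to be too small, the resources QEMU gets can be tuned from Nix as well. This is a sketch that has not been verified against this flake. It assumes the pinned nixpkgs provides the virtualisation.vmVariant option (to my knowledge it appeared around the 22.05 release), and the numbers are arbitrary.

```nix
# vm.nix (hypothetical file), to be added to the modules list
# next to ./configuration.nix.
{ ... }:
{
  # Settings under vmVariant are meant to apply only to the VM built by
  # `nixos-rebuild build-vm`, not to the regular system configuration.
  virtualisation.vmVariant = {
    virtualisation.memorySize = 2048; # MiB of RAM for the guest
    virtualisation.cores = 2;         # virtual CPUs for the guest
  };
}
```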

Integration tests

NixOS provides a testing framework, but I only started to use it recently. It spins up one or more virtual machines and asserts that they work as expected. I wrote a test that looks for the Tailscale network interface:

let
  nixpkgs = fetchTarball "https://github.com/NixOS/nixpkgs/archive/0f8f64b54ed07966b83db2f20c888d5e035012ef.tar.gz";
  pkgs = import nixpkgs { };
in
pkgs.nixosTest
  ({
    system = "x86_64-linux";

    nodes.machine = import ./configuration.nix;

    testScript = ''
      start_all()
      machine.succeed("sleep 5")
      machine.succeed(
          "ifconfig | grep tailscale0",
      )
    '';
  })

This test uses the same configuration.nix I used to generate my netbooting NixOS. It starts a node called machine and, via a Python test script, runs the shell command ifconfig | grep tailscale0. I am sure I can do better than sleep 5 but, as I said, I am far from being good at this.
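One possible way to drop the sleep 5, as a sketch rather than a tested replacement: the test driver has helpers that poll until a unit is active or a command succeeds. It assumes the daemon unit is called tailscaled.service and that the ip command is available on the machine. The test name is made up.

```nix
let
  nixpkgs = fetchTarball "https://github.com/NixOS/nixpkgs/archive/0f8f64b54ed07966b83db2f20c888d5e035012ef.tar.gz";
  pkgs = import nixpkgs { };
in
pkgs.nixosTest {
  name = "tailscale-interface"; # hypothetical name

  nodes.machine = import ./configuration.nix;

  testScript = ''
    machine.start()
    # Block until systemd reports the daemon as active (assumed unit name).
    machine.wait_for_unit("tailscaled.service")
    # Retry until the interface shows up instead of sleeping a fixed time.
    machine.wait_until_succeeds("ip link show tailscale0")
  '';
}
```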

You can use this approach to run assertions on multiple nodes; there is an example in the nix.dev tutorial “Integration testing using virtual machines (VMs)”.
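A minimal two-node sketch of the idea, not taken from the nix.dev tutorial: the server and client node names and their tiny configurations are invented for illustration, and the nixpkgs pin is the one from the test above.

```nix
let
  nixpkgs = fetchTarball "https://github.com/NixOS/nixpkgs/archive/0f8f64b54ed07966b83db2f20c888d5e035012ef.tar.gz";
  pkgs = import nixpkgs { };
in
pkgs.nixosTest {
  name = "two-nodes"; # hypothetical name

  nodes = {
    # Each attribute becomes a VM; its name is also its hostname in the test network.
    server = { ... }: {
      services.openssh.enable = true;
    };
    client = { ... }: { };
  };

  testScript = ''
    start_all()
    # Assertions can target each node independently.
    server.wait_for_unit("sshd.service")
    server.wait_for_open_port(22)
    # The client should reach the server by hostname.
    client.wait_until_succeeds("ping -c 1 server")
  '';
}
```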

Steep learning curve

Everyone agrees that Nix and NixOS are not easy technologies to pick up, and I can confirm. There are articles, blogs and dotfiles available everywhere, but they all look different and it is hard to figure out whether they are new or old, or how to apply them to your use case.

Flakes are an attempt from the community to standardize all of that, and much more. We will see!

It is also true that motivation and context can flatten the curve. My plan is to write more about this topic since I am trying to spin up and automate a home lab.

I have to figure out how to do secret management, but as soon as I have it sorted out I will share my homelab configuration, as I share my laptop's configuration in my dotfiles.

Stay tuned.
