this post was submitted on 18 Feb 2025
5 points (100.0% liked)

Nix / NixOS

1963 readers
5 users here now

Main links

Videos

founded 2 years ago
MODERATORS
 

I'm rebuilding my home server in nixos.

Rather that configuring the various services natively in nixos, I decided to run containers via virtualisation.oci-containers whenever possible, mostly to be able to independently update the system and the various services.

Everything is going smoothly, but whenever I (for whatever reason) do nixos-rebuild boot and reboot after adding a container instead of nixos-rebuild switch, I run into this issue where podman isn't able to resolve the host (below you see the docker hub host, but it also happened with ghcr.io):

podman-apprise-start[1352]: Trying to pull docker.io/caronc/apprise:1.1.8...
podman-apprise-start[1352]: Pulling image //caronc/apprise:1.1.8 inside systemd: setting pull timeout to 5m0s
podman-apprise-start[1352]: Error: initializing source docker://caronc/apprise:1.1.8: pinging container registry registry-1.docker.io: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io: no such host

I thought that my podman-* services were missing a dependency on network-online and that they were started before the network was available, but it is't the case:

# systemctl list-dependencies podman-apprise.service 
podman-apprise.service
● ├─system.slice
● ├─network-online.target
● │ └─systemd-networkd-wait-online.service
● └─sysinit.target
●   ├─dev-hugepages.mount
[...snip...]

Do you happen to know what the issue is?

PS: Manually running systemctl start podman-whatever once fixes the issue, of course, but I wonder if there's a more robust solution?


update:

After investigating based on balsoft input below, the issue seems to be that systemd-networkd-wait-online doesn't behave as expected (by me).

Basically, systemd-networkd-wait-online waits for network interfaces to have a carrier (working ethernet cable) and an IP address. This is what in systemd-networkd docs is called the "degraded" state (no, it doesn't mean that something got worse than before... don't think too much of what "degraded" implies in English).

In my case, I have an interface that is setup via DHCP and that also has static IPs assigned:

$ cat /etc/systemd/network/00-lan1.network 
[Match]
Name=lan1

[Network]
DHCP=ipv4
IPv6AcceptRA=no
LinkLocalAddressing=no

[Address]
Address=192.168.10.10/24

[Address]
Address=192.168.10.99/24

If you are wondering, the reason I do this is that I want static IPs for my dns server and reverse proxy, but I also want my home server to use DHCP to fetch some network-wide configuration which, critically, includes the default route.

Back to the issue: IIUC, since the interface has a non-link-local address (which systemd-networkd confusingly calls a "routable" address), it is immediately considered "routable" (a state that is moar better than "degraded") and so not only it's basically ignored by the default systemd-networkd-wait-online configuration, but even adding

[Link]
RequiredForOnline=routable

to /etc/systemd/network/00-lan1.network doesn't make a difference whatsoever.

For now, my stopgap solution is to explicitly set the default route for the "lan1" network:

[Network]
Gateway=192.168.10.1

this seems to solve the issue with podman and, while the system still thinks to be "online" before being fully configured, it will suffice until I find a more elegant/robust way (ping me in a while if you are interested).

refs:
systemd-networkd-wait-online man page
systemd-networkd docs on "RequiredForOnline"
networkctl man page

top 3 comments
sorted by: hot top controversial new old
[–] balsoft@lemmy.ml 1 points 2 days ago (1 children)

The issue is that network-online.target does not necessarily mean that you actually have internet access, it usually just means that you have an IP address. I've found that it can take a dozen seconds to actually get internet connectivity, depending on your setup. As a hack, you can add while ! ping -c1 ghcr.io; do sleep 1; done or similar to PreStart of your container services.

[–] gomp@lemmy.ml 2 points 2 days ago (1 children)

Thanks that was really helpful!

In my case, the system did not have a default route - I've updated the post with details.

[–] balsoft@lemmy.ml 1 points 2 days ago

Glad to hear you got it working!