I’ve recently been working on setting up Drone CI on the tilde.team machine. However, there’s been something strange going on with the networking on there.
Starting up Drone with docker-compose didn’t seem to be working: netstat -tulpn showed the port binding properly to 127.0.0.1:8888, but I was completely unable to get anything from it (using curl or the nginx proxy that was to come).
I ended up scrapping Docker on the ~team box itself and moving it into an LXD container (pronounced “lex-dee”) with nesting enabled.
This got us into another problem, one that had been seen before when using nginx to proxy to apps running in other containers. Requests were dropped intermittently, sometimes hanging for upwards of 30 seconds.
Getting frustrated with this error, I tried to reproduce it on another host. Both the docker-proxy and nginx->LXD proxies worked on the first try, yielding no clues as to where things were going wrong.
In a half-awake stupor last Saturday evening, I decided to try to rule out IPv6 by disabling it system-wide. As is expected for sleepy work, it didn’t fix the problem and created more in the process.
Feeling satisfied that the problem didn’t lie with IPv6, I re-enabled it, only to find that I was unable to bind nginx to my allocated /64. I may or may not have ranted a bit about this on IRC, but I was able to get it back up and running by restarting systemd-networkd.
One step forwards broke something and now we’re back to where we started with the original problem of the intermittent hangups to the LXD container.
Seeing my troubles on IRC, jchelpau offered to help dig into the problem with a fresh set of eyes. He noted right away that pings over IPv6 to the containers worked fine, but pings over IPv4 did not.
We ended up looking at the firewall configurations, only to find that one of the subnets I blocked after November’s nmap incident included lxdbr0’s subnet (lxdbr0 being the bridge device used by LXD).
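This kind of overlap can be checked for before it bites. Here's a minimal sketch in Python, using the standard ipaddress module, that reports which blocked ranges cover a bridge's subnet. The function name overlapping_blocks and every address below are made-up placeholders for illustration, not the real firewall rules or the real lxdbr0 range.

```python
import ipaddress


def overlapping_blocks(bridge_cidr, blocked_cidrs):
    """Return the blocked networks that overlap the bridge's subnet."""
    # strict=False lets us pass the bridge's own address (host bits set),
    # e.g. "10.150.19.1/24", and still get the enclosing network.
    bridge = ipaddress.ip_network(bridge_cidr, strict=False)
    hits = []
    for cidr in blocked_cidrs:
        net = ipaddress.ip_network(cidr, strict=False)
        # Only compare networks of the same IP version.
        if net.version == bridge.version and net.overlaps(bridge):
            hits.append(net)
    return hits


# Hypothetical blocklist and bridge address, for illustration only.
blocked = ["10.0.0.0/8", "192.168.0.0/16", "2001:db8::/32"]
print(overlapping_blocks("10.150.19.1/24", blocked))
```

With these placeholder values, the 10.0.0.0/8 block is the one that swallows the bridge's /24. On a real host, the bridge address would come from something like the output of ip addr for lxdbr0, and the blocklist from whatever the firewall is actually configured with.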
Now that I’ve made the exception for lxdbr0, everything is working as expected!