Overview
I want a simple way to create unprivileged Linux containers ad-hoc, without needing the complexity of LXD.
The documentation that surfaced easily online was not fully up to date, so I am documenting what worked for me for future reference.
I am currently running Arch Linux with kernel version 6.5.9-arch2-1.
See also:
- Arch Wiki Linux Containers article
- man 5 lxc.container.conf
Install LXC
sudo pacman -S lxc
Copy the user config file
The default.conf file specifies the default configuration applied to newly created
containers. Because we want to create unprivileged containers, our user needs a copy of
this file.
mkdir ~/.config/lxc/
cp /etc/lxc/default.conf ~/.config/lxc/
Set uid mappings
The uids inside the container need to map to actual values outside the container. We assign a range of high numbers for the mapping to avoid conflict with processes running on the host.
In both /etc/subgid and /etc/subuid, enter the following line (replace user with
your username).
user:100000:65536
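To double-check the entry, here is a minimal sketch of a helper that tests whether a subuid/subgid file delegates a large enough range to a user. The function name subid_covers and its arguments are my own, for illustration; it assumes the usual name:start:count format. A container mapping of 65536 ids starting at 100000 needs a count of at least 65536.

```shell
#!/bin/sh
# subid_covers FILE USER FIRST COUNT
# Succeeds if FILE (lines of "name:start:count") has an entry for USER
# whose range contains the ids FIRST .. FIRST+COUNT-1.
# Hypothetical helper name; not part of LXC or shadow-utils.
subid_covers() {
    awk -F: -v user="$2" -v first="$3" -v count="$4" '
        $1 == user && ($2 + 0) <= (first + 0) &&
            ($2 + $3) >= (first + count) { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

For example, subid_covers /etc/subuid "$(id -un)" 100000 65536 && echo ok, and the same again for /etc/subgid.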
We need to update ~/.config/lxc/default.conf to instruct the container how to map the
uids, by appending these lines.
...
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
Network configuration
There are multiple ways to configure networking with containers, but for a simple
scenario like this we can use the lxc-net script/service which ships with lxc. This
will create a bridge device which the containers can connect to via virtual ethernet
interfaces.
Configure lxc-net
If you peek inside /etc/default/lxc you should see that it sources the file
/etc/default/lxc-net if it exists, which is the file we can use to configure
lxc-net. There are multiple configuration options available, but the minimum
configuration necessary is simply to enable the lxc bridge.
Create /etc/default/lxc-net and add the line:
USE_LXC_BRIDGE="true"
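If you prefer to script this step, a small idempotent helper avoids adding the line twice on repeated runs. This is only a sketch; ensure_line is a name I made up, and writing under /etc requires root, so it would need to run in a root shell.

```shell
#!/bin/sh
# ensure_line FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# Creates FILE if it does not exist. Hypothetical helper name.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

As root: ensure_line /etc/default/lxc-net 'USE_LXC_BRIDGE="true"'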
Each container needs to create a veth interface to connect to the bridge, but creating the interfaces is a privileged operation. For use with unprivileged containers, we need to tell lxc-net that it should create the interfaces for the user.
To do this, we create/edit /etc/lxc/lxc-usernet to specify a quota of devices that a
user is allowed to create.
user veth lxcbr0 10
This specifies that lxc-net will create up to 10 veth devices for the specified user,
and connect them to the lxcbr0 bridge.
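The entry has four whitespace-separated fields: user, interface type, bridge, and quota. As a sketch, this helper prints the line for a given user so it can be appended to the file; usernet_entry is my own name, and the defaults simply mirror the values used in this post.

```shell
#!/bin/sh
# usernet_entry USER [BRIDGE] [QUOTA]
# Prints an lxc-usernet line of the form "user veth bridge quota".
# BRIDGE defaults to lxcbr0 and QUOTA to 10, matching the example above.
usernet_entry() {
    printf '%s veth %s %s\n' "$1" "${2:-lxcbr0}" "${3:-10}"
}
```

For example: usernet_entry "$(id -un)" | sudo tee -a /etc/lxc/lxc-usernet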
Update the default container configuration
We again need to ensure that ~/.config/lxc/default.conf has the correct configuration,
so that created containers know how to connect to the network. Ensure you have the
following lines.
lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx
The hardware address should be the same as what is configured for lxc-net; the default
can be found inside /usr/lib/lxc/lxc-net in the LXC_BRIDGE_MAC variable.
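A quick way to confirm that nothing was missed is to list any of the four keys the config file does not set. This is a rough sketch: missing_net_keys is a hypothetical helper, and it only checks that each key is present, not that its value is sensible.

```shell
#!/bin/sh
# missing_net_keys FILE
# Prints each expected lxc.net.0.* key that FILE does not assign.
# Prints nothing if all four keys are present.
missing_net_keys() {
    for key in lxc.net.0.type lxc.net.0.link lxc.net.0.flags lxc.net.0.hwaddr; do
        # Compare the text left of "=" (whitespace stripped) with the key.
        awk -F= -v k="$key" '
            { name = $1; gsub(/[ \t]/, "", name); if (name == k) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$1" || printf '%s\n' "$key"
    done
}
```

Running missing_net_keys ~/.config/lxc/default.conf should print nothing once the file is complete.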
Enable the systemd service
Now we can enable and start the systemd service.
sudo systemctl enable --now lxc-net
You should be able to see the bridge device created, for example with the ip link
command.
...
3: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
...
Firewall configuration
I have ufw installed and configured to allow outgoing, but deny incoming and routed
connections.
$ sudo ufw status verbose
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
...
This causes a problem: packets from the container arrive on lxcbr0 as incoming traffic, and the host then has to route them out via its main network connection. Both are denied by the defaults above.
We need to configure ufw to allow the container connections.
$ sudo ufw allow in on lxcbr0
$ sudo ufw route allow in on lxcbr0
AppArmor woes
LXC has support for constraining containers via AppArmor profiles, which should be able to provide some level of protection for the host against malicious activities inside a container.
The AppArmor setting appears to affect results even with the apparmor
service disabled, though I haven't tested what happens if I remove lsm=apparmor from my grub
kernel options.
The problem I faced is that my unprivileged containers would not connect to the network unless they were configured to be unconfined. I tried various LXC/AppArmor configs, but there is something I am missing. I might come back to it later, but for now I just configured my containers to be unconfined. At this stage I am not running anything untrusted in them.
# ~/.config/lxc/default.conf
...
lxc.apparmor.profile = unconfined
See man 5 lxc.container.conf for more information about AppArmor configuration.
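When debugging this, it helps to know whether AppArmor is actually active in the running kernel, independent of the apparmor service. The sketch below assumes securityfs is mounted at /sys/kernel/security, where the kernel exposes a comma-separated list of active security modules. The lsm_active name and the optional file argument are my own.

```shell
#!/bin/sh
# lsm_active NAME [FILE]
# Succeeds if NAME appears in the comma-separated LSM list in FILE.
# FILE defaults to /sys/kernel/security/lsm (assumes securityfs is mounted).
# Returns 2 if the list cannot be read.
lsm_active() {
    lsm_file="${2:-/sys/kernel/security/lsm}"
    [ -r "$lsm_file" ] || return 2
    # Split the list into one module per line and match NAME exactly.
    tr ',' '\n' < "$lsm_file" | grep -qx -- "$1"
}
```

For example: lsm_active apparmor && echo "AppArmor is loaded"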
Create and run a container
At this point we should be able to create and run containers, with the network connected inside.
Various LXC guides, including the official ones, talk about cgroup delegation and
specify that lxc-create needs to be invoked with a systemd-run ... command with
various options set.
However, as of 2023-11-01, I find this is not necessary: the commands work without it.
Create a container
We can use the lxc-create command to create a container, which will download and
extract an image for us. Use the -n option to specify the name you want.
lxc-create -n mycontainer -t download -- --dist debian --release bookworm --arch amd64
When this completes you should be able to see the container in the list.
$ lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
mycontainer STOPPED 0 - - - true
Start and attach
Use the lxc-start command to start the container in the background.
$ lxc-start mycontainer
$ lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
mycontainer RUNNING 0 - 10.0.3.102 - true
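If you want the address in a script, for example to ssh into the container, one option is to pull the IPV4 column out of that table. This is a sketch only: container_ipv4 is a hypothetical helper, and it assumes the container has a single address, so that every column is one whitespace-free field.

```shell
#!/bin/sh
# container_ipv4 NAME
# Reads `lxc-ls -f` style output on stdin and prints the IPV4 column
# of the row whose first column is NAME.
container_ipv4() {
    awk -v name="$1" '
        # Header row: remember which column is labelled IPV4.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "IPV4") col = i; next }
        $1 == name && col { print $col }
    '
}
```

For example: lxc-ls -f | container_ipv4 mycontainer. A stopped container prints "-", as in the table.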
Use the lxc-attach command to open a shell inside the container.
$ lxc-attach mycontainer
root@mycontainer:/#
Test the network
root@mycontainer:/# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=6.22 ms
^C
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 6.215/6.215/6.215/0.000 ms
Success!
Conclusion
After a small amount of system configuration we can create Linux containers and open a shell inside them. If we are not managing a complex hypervisor or orchestrating many services, this works well without LXD or any more complex container manager.