Overview

I want a simple way to create unprivileged Linux containers ad-hoc, without needing the complexity of LXD.

The documentation that easily surfaced online was not fully up to date, so I want to document what worked for me so I can refer back in future.

I am currently running Arch Linux with kernel version 6.5.9-arch2-1.

Install LXC

sudo pacman -S lxc

Copy the user config file

The default.conf file specifies the default configuration applied to newly created containers. Because we want to create unprivileged containers, our user needs a copy of this file.

mkdir -p ~/.config/lxc/
cp /etc/lxc/default.conf ~/.config/lxc/

Set uid mappings

The uids and gids inside the container need to map to real ids on the host. We assign a range of high numbers for the mapping so that they do not collide with the ids of users and processes on the host.

In both /etc/subgid and /etc/subuid, enter the following line (replace user with your username).

user:100000:65536

We also need to update ~/.config/lxc/default.conf so that LXC knows how to map the uids and gids, by appending these lines.

...
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
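
The range granted in /etc/subuid and /etc/subgid has to be at least as large as the count requested by lxc.idmap, otherwise the mapping is refused when the container starts. A minimal sketch of a check for this follows; subid_covers is a hypothetical helper name of my own, not part of LXC or shadow.

```shell
#!/bin/sh
# subid_covers FILE USER START COUNT  (hypothetical helper, not an LXC tool)
# Succeeds if FILE, in /etc/subuid format ("name:start:count"), grants USER
# a range that contains the host ids START .. START+COUNT-1.
subid_covers() {
    file=$1 user=$2 want_start=$3 want_count=$4
    awk -F: -v u="$user" -v s="$want_start" -v c="$want_count" '
        $1 == u && $2 <= s && ($2 + $3) >= (s + c) { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}

# Example use against the real files:
#   subid_covers /etc/subuid "$(id -un)" 100000 65536 && echo "subuid ok"
#   subid_covers /etc/subgid "$(id -un)" 100000 65536 && echo "subgid ok"
```

The helper only reads the file it is given, so it is safe to run as a normal user.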

Network configuration

There are multiple ways to configure networking with containers, but for a simple scenario like this we can use the lxc-net script/service which ships with lxc. This will create a bridge device which the containers can connect to via virtual ethernet interfaces.

Configure lxc-net

If you peek inside /etc/default/lxc you should see that it sources the file /etc/default/lxc-net if it exists, which is the file we can use to configure lxc-net. There are multiple configuration options available, but the minimum configuration necessary is simply to enable the lxc bridge.

Create /etc/default/lxc-net and add the line:

USE_LXC_BRIDGE="true"
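
For reference, a fuller version of this file might look like the sketch below. Only USE_LXC_BRIDGE is required. The other variable names and values are, as far as I can tell, the defaults set near the top of the lxc-net script, so treat them as assumptions and check your own copy of the script before relying on them.

```shell
# /etc/default/lxc-net (sourced as shell, so no spaces around "=")
USE_LXC_BRIDGE="true"

# Everything below is optional and assumed to mirror the script defaults.
LXC_BRIDGE="lxcbr0"                    # bridge device name
LXC_ADDR="10.0.3.1"                    # address of the host on the bridge
LXC_NETMASK="255.255.255.0"
LXC_NETWORK="10.0.3.0/24"
LXC_DHCP_RANGE="10.0.3.2,10.0.3.254"   # addresses handed out to containers
LXC_DHCP_MAX="253"
```

Changing the subnet is mainly useful if 10.0.3.0/24 clashes with a network the host is already connected to.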

Each container needs to create a veth interface to connect to the bridge, but creating the interfaces is a privileged operation. For unprivileged containers, LXC ships a setuid helper, lxc-user-nic, which creates the interfaces on the user's behalf.

To do this, we create/edit /etc/lxc/lxc-usernet to specify a quota of devices that a user is allowed to create.

user veth lxcbr0 10

This allows the specified user to create up to 10 veth devices connected to the lxcbr0 bridge.
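
Each line of the file has four whitespace-separated fields: user, interface type, bridge, and count. As a quick sanity check, the sketch below looks up the quota for a given user and bridge. usernet_quota is a hypothetical helper of my own, and it only handles plain user entries of type veth.

```shell
#!/bin/sh
# usernet_quota FILE USER BRIDGE  (hypothetical helper, not an LXC tool)
# Prints the veth quota for USER on BRIDGE from a file in lxc-usernet
# format, or 0 when no matching entry exists.
usernet_quota() {
    awk -v u="$2" -v b="$3" '
        $1 == u && $2 == "veth" && $3 == b { q = $4 }
        END { print q + 0 }
    ' "$1"
}

# Example use:
#   usernet_quota /etc/lxc/lxc-usernet "$(id -un)" lxcbr0
```

A result of 0 means the user has no allowance on that bridge, in which case starting a networked unprivileged container can be expected to fail.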

Update the default container configuration

We again need to ensure that ~/.config/lxc/default.conf has the correct configuration, so that created containers know how to connect to the network. Ensure you have the following lines.

lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx

The xx placeholders in the hardware address are replaced with random values for each container. The 00:16:3e prefix matches what is configured for lxc-net; the default can be found inside /usr/lib/lxc/lxc-net in the LXC_BRIDGE_MAC variable.
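
Since default.conf has now been edited in a few places, it can be handy to confirm that nothing was missed. The sketch below prints whichever of the four network lines are absent from a given file. missing_net_lines is a hypothetical helper of my own, and it matches lines verbatim, so different spacing around "=" is reported as missing.

```shell
#!/bin/sh
# missing_net_lines FILE  (hypothetical helper, not an LXC tool)
# Prints every expected network line that FILE does not contain verbatim.
# It only reads FILE and does not detect conflicting settings.
missing_net_lines() {
    for line in \
        'lxc.net.0.type = veth' \
        'lxc.net.0.link = lxcbr0' \
        'lxc.net.0.flags = up' \
        'lxc.net.0.hwaddr = 00:16:3e:xx:xx:xx'
    do
        # -x matches whole lines only, -F treats the pattern as a fixed string
        grep -qxF -- "$line" "$1" || printf '%s\n' "$line"
    done
}

# Example use:
#   missing_net_lines ~/.config/lxc/default.conf
```

No output means all four lines are present.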

Enable the systemd service

Now we can enable and start the systemd service.

sudo systemctl enable --now lxc-net

You should be able to see the bridge device created, for example with the ip link command.

...
3: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
...

Firewall configuration

I have ufw installed and configured to allow outgoing, but deny incoming and routed connections.

$ sudo ufw status verbose
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
...

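This causes a problem: packets from the container arrive on lxcbr0 as incoming traffic (for example DHCP and DNS requests to the host), and anything destined for the outside world must then be routed via the main network connection of the machine. Both are denied by the defaults above.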

We need to configure ufw to allow the container connections.

$ sudo ufw allow in on lxcbr0
$ sudo ufw route allow in on lxcbr0

AppArmor woes

LXC has support for constraining containers via AppArmor profiles, which should be able to provide some level of protection for the host against malicious activities inside a container.

The AppArmor setting appears to affect the results even with the apparmor service disabled, though I haven't tested what happens if I remove lsm=apparmor from my kernel options in GRUB.

The problem I faced is that my unprivileged containers would not connect to the network unless they were configured to be unconfined. I tried various LXC/AppArmor configs, but there is something I am missing. I might come back to it later, but for now I just configured my containers to be unconfined. At this stage I am not running anything untrusted in them.

# ~/.config/lxc/default.conf
...
lxc.apparmor.profile = unconfined

See man 5 lxc.container.conf for more information about AppArmor configuration.

Create and run a container

At this point we should be able to create and run containers, with the network connected inside.

Various LXC guides, including the official ones, talk about cgroup delegation and specify that lxc-create needs to be invoked with a systemd-run ... command with various options set.

However, as of 2023-11-01, I find this is not necessary: the commands below work without any issues when run directly.

Create a container

We can use the lxc-create command to create a container, which will download and extract an image for us. Use the -n option to specify the name you want.

lxc-create -n mycontainer -t download -- --dist debian --release bookworm --arch amd64

When this completes you should be able to see the container in the list.

$ lxc-ls -f
NAME        STATE   AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED 
mycontainer STOPPED 0         -      -    -    true

Start and attach

Use the lxc-start command to start the container in the background.

$ lxc-start mycontainer
$ lxc-ls -f
NAME        STATE   AUTOSTART GROUPS IPV4       IPV6 UNPRIVILEGED 
mycontainer RUNNING 0         -      10.0.3.102 -    true
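
The IPV4 column may stay empty for a moment after lxc-start while the container requests an address, so a script that uses the container straight away may want to wait for it. A minimal sketch, assuming lxc-info with the -i (addresses) and -H (raw output) options; wait_for_ipv4 and the LXC_INFO_CMD override are hypothetical names of my own, the latter existing only so the function can be tried without LXC.

```shell
#!/bin/sh
# wait_for_ipv4 NAME [TRIES]  (hypothetical helper, not an LXC tool)
# Polls once per second until the container reports an IPv4 address,
# prints it and returns 0. Returns 1 after TRIES attempts (default 30).
wait_for_ipv4() {
    name=$1 tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # Keep only the first line that looks like a dotted-quad address.
        ip=$(${LXC_INFO_CMD:-lxc-info} -n "$name" -iH 2>/dev/null |
            grep -m1 -E '^[0-9]+(\.[0-9]+){3}$' || true)
        if [ -n "$ip" ]; then
            printf '%s\n' "$ip"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example use:
#   lxc-start mycontainer && wait_for_ipv4 mycontainer
```

If the function times out, the firewall and AppArmor settings are the first things I would recheck.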

Use the lxc-attach command to open a shell inside the container.

$ lxc-attach mycontainer
root@mycontainer:/# 

Test the network

root@mycontainer:/# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=6.22 ms
^C
--- 1.1.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 6.215/6.215/6.215/0.000 ms

Success!

Conclusion

After a small amount of system configuration we can create Linux containers and open a shell inside them. If we are not managing a complex hypervisor or orchestrating many services, this is easily done without LXD or any more complex container manager.