Issue Description
This is a continuation of two Toolbx issues about the nested pseudo-terminal devices created by Podman: containers/toolbox#568 and containers/toolbox#1016. I am afraid that, despite my best efforts, this might conflate several different issues.
At some point, specifically commit 494007b6cadc5fe3 or containers/toolbox#581, Toolbx started mounting a separate devpts file system inside the container's mount and user namespace using:
podman create --mount type=devpts,destination=/dev/pts --userns keep-id ...
This fixed the group ownership of the pseudo-terminal devices created under /dev/pts, as mentioned in the commit message; i.e., it changed from nobody to tty. That's good.
However, the devices' user ownership remained root and didn't change to my $UID. Why is that?
I suppose this is due to the specifics of how the OCI runtime (crun(1) in this case) sets up the devices. However, I am getting lost in the weeds trying to track this through parseVolumes() and SpecGenToOCI() in Podman, to the spec or config created by crun(1), to what actually gets sent to the kernel.
This is how the mounts look inside the container:
⬢[rishi@toolbox ~]$ mount | grep devpts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/c9ddcc6e8dd98c1aa6a01b7f04f79bd10b2df8b71e6f85b516162f21ef2c878b/merged/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",gid=100005,mode=620,ptmxmode=666)
The first three entries are the host's devpts; the host's / is bind mounted at /run/host inside the container. Notice how Podman or crun(1) plugs in non-default values for the gid, mode and ptmxmode options.
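A small sketch for comparing only the ownership-related options across these entries. The helper name devpts_opts is hypothetical, and the parsing assumes the mount(8) output format shown above, with no spaces in mount points:

```shell
# devpts_opts: read mount(8)-style lines on stdin and, for each devpts entry,
# print the mount point followed by only its uid, gid, mode and ptmxmode options.
devpts_opts() {
    awk '$5 == "devpts" {
        opts = $6
        gsub(/[()]/, "", opts)          # strip the surrounding parentheses
        n = split(opts, kv, ",")
        out = $3                        # the mount point
        for (i = 1; i <= n; i++)
            if (kv[i] ~ /^(uid|gid|mode|ptmxmode)=/)
                out = out " " kv[i]
        print out
    }'
}

# Two of the entries from the listing above:
devpts_opts <<'EOF'
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",gid=100005,mode=620,ptmxmode=666)
EOF
# prints:
# /dev/pts gid=5 mode=620 ptmxmode=000
# /dev/pts gid=100005 mode=620 ptmxmode=666
```

Inside the container, piping the output of mount into devpts_opts gives the same summary for the live mounts.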
At this point, I tried specifying a uid:
podman create --mount type=devpts,destination=/dev/pts,uid=1000 --userns keep-id ...
That fixed the devices' user ownership, but the group ownership changed from tty to root, because it looks like those non-default options got reset to their defaults in the kernel. Is that expected? Compare this with the above:
⬢[rishi@toolbox ~]$ mount | grep devpts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/6261546a819230f91b10a628b8668b02636b8df2b024b3e7c0f176b70759c455/merged/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",uid=1000,mode=600,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/6261546a819230f91b10a628b8668b02636b8df2b024b3e7c0f176b70759c455/merged/dev/pts type devpts (rw,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",uid=1000,mode=600,ptmxmode=000)
Again, the first three entries are from the host's devpts, but this time we have two entries for the container's own devpts. I wonder if that extra entry is worth ironing out.
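One way to check whether two entries are the same devpts instance showing up at two mount points is to group the mounts by device number. A rough sketch, assuming the /proc/self/mountinfo layout documented in proc(5); devpts_instances is a hypothetical helper name:

```shell
# devpts_instances: list "major:minor mount-point" for every devpts mount.
# Entries that share a major:minor pair belong to the same devpts instance.
# Reads /proc/self/mountinfo unless another file is given as $1.
devpts_instances() {
    awk '{
        fstype = ""
        # The file system type is the first field after the "-" separator.
        for (i = 7; i <= NF; i++)
            if ($i == "-") { fstype = $(i + 1); break }
        if (fstype == "devpts") print $3, $5
    }' "${1:-/proc/self/mountinfo}" | LC_ALL=C sort
}

# Usage, inside the container:
#   devpts_instances
```

If the two container entries print the same major:minor pair, the extra line is the same file system reached through a second mount point rather than a second devpts instance.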
At this point, I considered not going through podman create --mount ... and instead using the container's entry point to mount the devpts file system. If nothing else, this has the advantage of being able to modify pre-existing containers, which is very helpful for long-lived Toolbx containers. Something like:
mount --types devpts \
--options gid=tty,uid=1000,mode=620,ptmxmode=000 \
devpts \
/dev/pts
This seems to work. The devices are owned by rishi:tty, as they are on the host.
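A minimal sketch of how the ownership can be checked from a shell inside the container. It assumes GNU stat(1) from coreutils, and pty_owner is a hypothetical helper name:

```shell
# pty_owner: print "uid:gid user:group mode" for a device node.
# Defaults to the terminal attached to standard input, as reported by tty(1).
pty_owner() {
    stat -c '%u:%g %U:%G %a' -- "${1:-$(tty)}"
}

# Usage:
#   pty_owner               # the current pseudo-terminal
#   pty_owner /dev/pts/0    # a specific device
```

Printing the numeric IDs next to the names helps when an ID has no mapping in the user namespace and the name shows up as nobody.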
Before committing to this approach, I want to ask: is it expected that the devices' user ownership would be root unless a uid is specified? Note that we are using --userns keep-id.
This still leaves the question of the problem faced by @dustymabe in containers/toolbox#568.
It seems to me that the gpg-agent and pinentry processes running on the host really want access to the secondary end of the nested pseudo-terminal device inside the container's mount and user namespaces (i.e., the output of tty(1) or the /dev/pts/N device inside the container). Note how the gpg-agent(1) manual says that the GPG_TTY environment variable should always reflect the output of tty(1).
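For reference, the variable that the gpg-agent(1) manual asks to keep in sync with tty(1) is GPG_TTY, set from the shell's startup file. A minimal sketch, wrapped in a hypothetical helper set_gpg_tty; the guard for a non-terminal standard input is an addition here, not something taken from the manual:

```shell
# set_gpg_tty: export GPG_TTY as the path printed by tty(1),
# but only when standard input is actually a terminal.
set_gpg_tty() {
    if tty -s; then
        GPG_TTY=$(tty)
        export GPG_TTY
    fi
}

# Usage, e.g. from ~/.bashrc:
#   set_gpg_tty
```

This only records the path. It does not by itself make the container's /dev/pts/N visible to a gpg-agent running on the host, which is the open question here.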
Do you have any suggestions? Is this something that has already been solved elsewhere?
Steps to reproduce the issue
see above
Describe the results you received
see above
Describe the results you expected
see above
podman info output
host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 99.07
    systemPercent: 0.16
    userPercent: 0.78
  cpus: 16
  distribution:
    distribution: fedora
    variant: workstation
    version: "36"
  eventLogger: journald
  hostname: topinka
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.14-100.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 20129562624
  memTotal: 33553547264
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 7h 54m 25.00s (Approximately 0.29 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/rishi/.config/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 4
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/rishi/.local/share/containers/storage
  graphRootAllocated: 1695606808576
  graphRootUsed: 264996347904
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 15
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/rishi/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1677919610
  BuiltTime: Sat Mar 4 09:46:50 2023
  GitCommit: ""
  GoVersion: go1.18.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
No response
Additional information
No response