Oddities in nested PTY and devpts handling #1158

@debarshiray

Issue Description

This is a continuation of these two Toolbx issues about the nested pseudo-terminal devices created by Podman: containers/toolbox#568 and containers/toolbox#1016. I am afraid that, despite my best efforts, this report might conflate several different issues.

At some point, specifically commit 494007b6cadc5fe3 or containers/toolbox#581, Toolbx started mounting a separate devpts file system inside the container's mount and user namespace using:

podman create --mount type=devpts,destination=/dev/pts --userns keep-id ...

This fixed the group ownership of the pseudo-terminal devices created under /dev/pts, as mentioned in the commit message, i.e., it changed from nobody to tty. That's good.

However, the devices' user ownership remained root and didn't change to my $UID. Why is that?

I suppose this is due to the specifics of how the OCI runtime (crun(1) in this case) sets up the devices. However, I am getting lost in the weeds trying to track this through parseVolumes() and SpecGenToOCI() in Podman to the spec or config created by crun(1) to what actually gets sent to the kernel.
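For anyone trying to reproduce the ownership check, here is a minimal sketch. The helper name pty_owner is hypothetical, and it assumes GNU stat(1) is available inside the container:

```shell
# Print "owner:group mode path" for a device node.
# With no argument, inspect the terminal attached to stdin.
# (pty_owner is a hypothetical helper; requires GNU stat.)
pty_owner() {
  stat --format '%U:%G %a %n' "${1:-$(tty)}"
}
```

Running pty_owner without arguments in an interactive shell inside the container shows who owns the current /dev/pts/N device.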

This is how the mounts look inside the container:

⬢[rishi@toolbox ~]$ mount | grep devpts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/c9ddcc6e8dd98c1aa6a01b7f04f79bd10b2df8b71e6f85b516162f21ef2c878b/merged/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",gid=100005,mode=620,ptmxmode=666)

The first three entries are the host's devpts -- the host's / is bind mounted at /run/host inside the container. Notice how Podman or crun(1) plugs in non-default values for the gid, mode and ptmxmode options.

At this point, I tried specifying a uid:

podman create --mount type=devpts,destination=/dev/pts,uid=1000 --userns keep-id ...

That fixed the devices' user ownership, but the group ownership changed from tty to root, apparently because the non-default options (gid, mode and ptmxmode) were reset to the kernel's defaults. Is that expected? Compare this with the above:

⬢[rishi@toolbox ~]$ mount | grep devpts
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/6261546a819230f91b10a628b8668b02636b8df2b024b3e7c0f176b70759c455/merged/dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000)
devpts on /dev/pts type devpts (rw,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",uid=1000,mode=600,ptmxmode=000)
devpts on /run/host/home/rishi/.local/share/containers/storage/overlay/6261546a819230f91b10a628b8668b02636b8df2b024b3e7c0f176b70759c455/merged/dev/pts type devpts (rw,relatime,context="system_u:object_r:container_file_t:s0:c1022,c1023",uid=1000,mode=600,ptmxmode=000)

Again, the first three entries are from the host's devpts, but this time there are two entries for the container's own devpts. I wonder if that extra entry is worth ironing out.
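To compare individual options between the two container mounts without reading the full lines, something like the following might help. It is only a sketch: mount_opt is a hypothetical helper, and it splits naively on commas, so it is not suitable for values that contain commas themselves, such as the SELinux context.

```shell
# Print the value of option KEY from a comma-separated mount option
# string, or nothing if the option is absent.
# Usage: mount_opt OPTIONS KEY
# (Naive split on commas; fine for uid, gid, mode and ptmxmode.)
mount_opt() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n "s/^$2=//p"
}
```

For example, mount_opt 'rw,relatime,uid=1000,mode=600,ptmxmode=000' gid prints nothing, which matches the missing gid in the second listing, while the same call against the first listing's options prints 100005.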

At this point, I considered using the container's entry point to mount the devpts file system instead of going through podman create --mount .... If nothing else, it has the advantage of being able to modify pre-existing containers, which is very helpful for long-lived Toolbx containers. Something like:

mount --types devpts \
  --options gid=tty,uid=1000,mode=620,ptmxmode=000 \
  devpts \
  /dev/pts

This seems to work. The devices are owned by rishi:tty as they are on the host.
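To avoid hard-coding the IDs, the option string could be built at run time. This is only a sketch: devpts_opts is a hypothetical helper, it assumes getent(1) is present in the image, and it mirrors the mode and ptmxmode values from the command above. If the entry point runs as a different user than the one who should own the devices, the UID has to be passed explicitly.

```shell
# Build the devpts option string.
# Usage: devpts_opts [UID [GID]]
# Defaults: the invoking user's UID and the numeric GID of the tty group.
# (devpts_opts is a hypothetical helper, not part of Toolbx or Podman.)
devpts_opts() {
  printf 'uid=%s,gid=%s,mode=620,ptmxmode=000\n' \
    "${1:-$(id -u)}" \
    "${2:-$(getent group tty | cut -d: -f3)}"
}
```

It would then be used as mount --types devpts --options "$(devpts_opts 1000)" devpts /dev/pts.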

Before committing to this approach, I want to ask: is it expected that the devices' user ownership is root unless a uid is specified? Note that we are using --userns keep-id.

This still leaves the question of the problem faced by @dustymabe in containers/toolbox#568.

It seems to me that the gpg-agent and pinentry processes running on the host really want access to the secondary end of the nested pseudo-terminal device inside the container's mount and user namespaces (i.e., the output of tty(1) or the /dev/pts/N device inside the container). Note how the gpg-agent(1) manual says that the GPG_TTY environment variable should always reflect the output of tty(1).
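For reference, the setup the gpg-agent(1) manual recommends for a shell start-up file looks roughly like this. The guard for the non-terminal case is my addition and not part of the manual:

```shell
# Point gpg-agent at the current terminal; leave GPG_TTY unset when
# stdin is not a terminal (tty(1) exits non-zero in that case).
if GPG_TTY=$(tty); then
  export GPG_TTY
else
  unset GPG_TTY
fi
```

Inside the container this yields the container's own /dev/pts/N path, which is the path the processes on the host then try to open.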

Do you have any suggestions? Is this something that has already been solved elsewhere?

Steps to reproduce the issue

see above

Describe the results you received

see above

Describe the results you expected

see above

podman info output

host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 99.07
    systemPercent: 0.16
    userPercent: 0.78
  cpus: 16
  distribution:
    distribution: fedora
    variant: workstation
    version: "36"
  eventLogger: journald
  hostname: topinka
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.14-100.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 20129562624
  memTotal: 33553547264
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 7h 54m 25.00s (Approximately 0.29 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/rishi/.config/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 4
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/rishi/.local/share/containers/storage
  graphRootAllocated: 1695606808576
  graphRootUsed: 264996347904
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 15
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/rishi/.local/share/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1677919610
  BuiltTime: Sat Mar  4 09:46:50 2023
  GitCommit: ""
  GoVersion: go1.18.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response
