Restrictions that come with capabilities

Posted by Marcus Folkesson on Tuesday, September 10, 2024

This weekend I debugged an interesting problem concerning the impact capabilities can have on a running process in a Linux system.

I already knew that there are some security restrictions for applications that are setuid/setgid or have capabilities set. One example is that LD_LIBRARY_PATH is silently ignored for an application with capabilities. You are simply not allowed to link in whatever you like for privileged applications, which is a good thing.

But there are more such restrictions as we will see.

Background

I had my application, my-application, which needs to execute with the CAP_NET_RAW capability in order to create raw packet sockets. Besides the raw socket handling, the application also does some cool image processing stuff with OpenCV.

If I just start the application:

$ ./my-application
# Success

Everything is fine. But if I give the application some capabilities:

$ sudo setcap cap_net_raw+eip ./my-application

Things turn nasty:

$ ./my-application

******************************************************************
* FATAL ERROR:                                                   *
* This OpenCV build doesn't support current CPU/HW configuration *
*                                                                *
* Use OPENCV_DUMP_CONFIG=1 environment variable for details      *
******************************************************************

Required baseline features:
NEON - NOT AVAILABLE
OpenCV(3.4.1) Error: Assertion failed (Missing support for required CPU baseline features. Check OpenCV build configuration and required CPU/HW setup.) in initialize

terminate called after throwing an instance of 'cv::Exception'

NEON is missing because I gave the application capabilities? How come?

Debugging

One good way to debug such things is with strace. At first the application ran without the error, which I thought was strange, but then I realized that I must start strace with --user to preserve the capabilities, something I learned [3] the hard way.

$ sudo strace --user=marcus ./my-application

strace produces the following output. As we can see, the openat() system call fails to open /proc/self/auxv due to a permission error:

uname({sysname="Linux", nodename="Marcus-board", ...}) = 0
openat(AT_FDCWD, "/proc/self/auxv", O_RDONLY) = -1 EACCES (Permission denied)
write(2, "\n*******************************"..., 403) = 403
write(2, "\nRequired baseline features:\n", 29) = 29
write(2, "NEON - NOT AVAILABLE\n", 21)  = 21
write(2, "OpenCV(3.4.1) Error: Assertion f"..., 265) = 265
write(2, "terminate called after throwing "..., 48) = 48
write(2, "cv::Exception", 13)           = 13
write(2, "'\n", 2)                      = 2
write(2, "  what():  ", 11)             = 11
write(2, "OpenCV(3.4.1) /home/marcus/git"..., 251) = 251
write(2, "\n", 1)                       = 1
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid()                                = 5221
getpid()                                = 5221
tgkill(5221, 5221, SIGABRT)             = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=5221, si_uid=105} ---
+++ killed by SIGABRT +++

So what is this /proc/self/auxv file anyway?

The Auxiliary Vector

To give some context, I'm going to briefly describe what the auxiliary vector is.

The auxiliary vector, auxv [4], exposes information about the environment in which the process executes. Examples of the information it contains are the pointer to the system call entry point in memory (AT_SYSINFO), the system page size (AT_PAGESZ), the real UID (AT_UID) and much more.

The complete list of entries can be found in auxvec.h [5]:

/* Symbolic values for the entries in the auxiliary table
   put on the initial stack */
#define AT_NULL   0	/* end of vector */
#define AT_IGNORE 1	/* entry should be ignored */
#define AT_EXECFD 2	/* file descriptor of program */
#define AT_PHDR   3	/* program headers for program */
#define AT_PHENT  4	/* size of program header entry */
#define AT_PHNUM  5	/* number of program headers */
#define AT_PAGESZ 6	/* system page size */
#define AT_BASE   7	/* base address of interpreter */
#define AT_FLAGS  8	/* flags */
#define AT_ENTRY  9	/* entry point of program */
#define AT_NOTELF 10	/* program is not ELF */
#define AT_UID    11	/* real uid */
#define AT_EUID   12	/* effective uid */
#define AT_GID    13	/* real gid */
#define AT_EGID   14	/* effective gid */
#define AT_PLATFORM 15  /* string identifying CPU for optimizations */
#define AT_HWCAP  16    /* arch dependent hints at CPU capabilities */
#define AT_CLKTCK 17	/* frequency at which times() increments */
   /* AT_* values 18 through 22 are reserved */
#define AT_SECURE 23   /* secure mode boolean */
#define AT_BASE_PLATFORM 24	/* string identifying real platform, may
				 * differ from AT_PLATFORM. */
#define AT_RANDOM 25	/* address of 16 random bytes */
#define AT_HWCAP2 26	/* extension of AT_HWCAP */
#define AT_RSEQ_FEATURE_SIZE	27	/* rseq supported feature size */
#define AT_RSEQ_ALIGN		28	/* rseq allocation alignment */
#define AT_HWCAP3 29	/* extension of AT_HWCAP */
#define AT_HWCAP4 30	/* extension of AT_HWCAP */

#define AT_EXECFN  31	/* filename of program */

The primary consumer of this information is the dynamic linker, ld-linux.so [6]. auxv is simply an efficient shortcut that allows the kernel to communicate a certain set of standard information that the dynamic linker needs. But the information is not only for the dynamic linker; applications may use it as well.

Get the information

There are several ways to get the information in the auxiliary vector:

By LD_SHOW_AUXV

The auxiliary vector supplied to a program can be viewed by setting the LD_SHOW_AUXV environment variable when running a program:

$ LD_SHOW_AUXV=1 sleep 1

AT_SYSINFO_EHDR:      0x7e31fa0af000
AT_MINSIGSTKSZ:       3632
AT_HWCAP:             bfebfbff
AT_PAGESZ:            4096
AT_CLKTCK:            100
AT_PHDR:              0x65444f267040
AT_PHENT:             56
AT_PHNUM:             13
AT_BASE:              0x7e31fa0b1000
AT_FLAGS:             0x0
AT_ENTRY:             0x65444f269640
AT_UID:               1000
AT_EUID:              1000
AT_GID:               1000
AT_EGID:              1000
AT_SECURE:            0
AT_RANDOM:            0x7ffedf7e7149
AT_HWCAP2:            0x2
AT_EXECFN:            /usr/bin/sleep
AT_PLATFORM:          x86_64
AT_??? (0x1b): 0x1c
AT_??? (0x1c): 0x20

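By library call

The getauxval() function, declared in <sys/auxv.h>, retrieves values from the auxiliary vector. Note that it is a C library function rather than a system call: it reads the copy of the vector that the kernel placed in the process's own memory at execve().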

By procfs

The auxiliary vector can be obtained via /proc/self/auxv.

auxv and OpenCV

What has this to do with OpenCV? For those not familiar with OpenCV, it is a library for computer vision. Computer vision often requires heavy matrix calculations, and such operations are accelerated by SIMD (Single Instruction, Multiple Data) instructions. NEON is an architecture extension for ARM that provides a set of SIMD instructions.

There is no generic way to determine the hardware features of the platform. Support for NEON is a hardware capability that can be read from the AT_HWCAP field in the aux vector.

Let's see how OpenCV v3.4.1 verifies the hardware features on an ARM based Linux platform:

    #elif defined __arm__
        int cpufile = open("/proc/self/auxv", O_RDONLY);

        if (cpufile >= 0)
        {
            Elf32_auxv_t auxv;
            const size_t size_auxv_t = sizeof(auxv);

            while ((size_t)read(cpufile, &auxv, size_auxv_t) == size_auxv_t)
            {
                if (auxv.a_type == AT_HWCAP)
                {
                    have[CV_CPU_NEON] = (auxv.a_un.a_val & 4096) != 0;
                    have[CV_CPU_FP16] = (auxv.a_un.a_val & 2) != 0;
                    break;
                }
            }

            close(cpufile);
        }
    #endif

So OpenCV does read the aux vector from /proc/self/auxv. All files in /proc/self/ are owned by the user running the process, right?

This is easily verified from the shell:

$ ls -al /proc/self/auxv
-r-------- 1 marcus marcus 0  9 sep 20.18 /proc/self/auxv

Ownership is indeed set to marcus:marcus.

The "dumpable" attribute

All processes in Linux have a "dumpable" attribute that determines whether they should produce a core dump or not. The manpage for PR_SET_DUMPABLE(2) [2] describes it as follows:

PR_SET_DUMPABLE (since Linux 2.3.20)
       Set the state of the "dumpable" attribute, which determines whether core dumps are produced for the calling
       process upon delivery of a signal whose default behavior is to produce a core dump.

       Up to and including Linux 2.6.12, arg2 must be either 0 (SUID_DUMP_DISABLE, process is not dumpable)
       or 1 (SUID_DUMP_USER, process is dumpable).  Between Linux 2.6.13 and Linux 2.6.17, the value 2 was
       also permitted, which caused any binary which normally would not be dumped to be dumped  readable  by
       root only; for security reasons, this feature has been removed.
       (See also the description of /proc/sys/fs/suid_dumpable in proc(5).)

       Normally, the "dumpable" attribute is set to 1.  However, it is reset to the current
       value contained in the file /proc/sys/fs/suid_dumpable (which by default has the value 0), in
       the following circumstances:

       •  The process's effective user or group ID is changed.

       •  The process's filesystem user or group ID is changed (see credentials(7)).

       •  The process executes (execve(2)) a set-user-ID or set-group-ID program, resulting
         in a change of either the effective user ID or the effective group ID.

       •  The process executes (execve(2)) a program that has file capabilities
         (see capabilities(7)), but only if the permitted capabilities gained exceed those already permitted for the process.

       Processes that are not dumpable can not be attached via ptrace(2) PTRACE_ATTACH.

       If a process is not dumpable, the ownership of files in the
       process's /proc/pid directory is affected as described in proc(5).

The last circumstance is of greatest interest:

  • The process executes (execve(2)) a program that has file capabilities, but only if the permitted capabilities gained exceed those already permitted for the process.

So a process that gained capabilities through file capabilities is, by default, not able to create core dumps. But what has that to do with the permissions of /proc/self/auxv? Let's have a look in the manpage for proc [1]:

/proc/[pid]
       There  is  a  numerical  subdirectory for each running process; the subdirectory is
       named by the process ID.

       Each /proc/[pid] subdirectory contains the pseudo-files and  directories  described
       below.  These files are normally owned by the effective user and effective group ID
       of the process.  However, as a security measure, the ownership is made root:root if
       the  process's "dumpable" attribute is set to a value other than 1.  This attribute
       may change for the following reasons:

       *  The attribute was explicitly set via the prctl(2) PR_SET_DUMPABLE operation.

       *  The attribute was reset to the  value  in  the  file  /proc/sys/fs/suid_dumpable
          (described below), for the reasons described in prctl(2).

       Resetting  the "dumpable" attribute to 1 reverts the ownership of the /proc/[pid]/*
       files to the process's real UID and real GID.

This bit is especially interesting:

However, as a security measure, the ownership is made root:root if the process's "dumpable" attribute is set to a value other than 1.

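So there we have it: the process has a capability that the shell used to launch it did not have, so the dumpable attribute was reset to the value in /proc/sys/fs/suid_dumpable (0 by default), and therefore the files under /proc/self/ were owned by root:root. Since the process still runs as an unprivileged user and /proc/self/auxv is only readable by its owner, the open fails with EACCES.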

How to solve this?

According to the proc(5) manpage:

Resetting the "dumpable" attribute to 1 reverts the ownership of the /proc/[pid]/* files to the process's real UID and real GID.

And this is probably the best way to solve this. The "dumpable" attribute can be set with prctl():

prctl(PR_SET_DUMPABLE, 1);

Be aware that there are good reasons why applications with extra privileges are not allowed to create core dumps.

Another solution, which is even worse, is to bypass the permission checks by setting the CAP_DAC_OVERRIDE and/or CAP_DAC_READ_SEARCH capabilities. It works, but that's not something I'm advocating at all.

CAP_DAC_OVERRIDE
        Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

CAP_DAC_READ_SEARCH
        -  Bypass file read permission checks and directory read and execute permission checks;
        -  invoke open_by_handle_at(2);
        -  use the linkat(2) AT_EMPTY_PATH flag to create a link to a file referred to by a file descriptor.

Summary

Giving applications extra privileges can have some unexpected side effects. Some of them are:

  • LD_LIBRARY_PATH is silently ignored for applications with file capabilities
  • The "dumpable" process attribute is reset to the value of /proc/sys/fs/suid_dumpable (0 by default) for applications that gain capabilities
  • A side effect for non-dumpable processes is that the files in /proc/self/ are owned by root:root

These are good side effects, but you have to be aware of them.