by John Simpson
Vulnerabilities in the Linux kernel are not uncommon. There are roughly 26 million lines of code, with 3,385,121 lines added and 2,512,040 lines removed in 2018 alone. The sheer complexity of that much code means that vulnerabilities are bound to exist. However, what is not at all common is the existence of unauthenticated remote code execution (RCE) vulnerabilities — a critical issue that every system administrator hopes to avoid.
On May 8, 2019, the National Vulnerability Database (NVD) published details for a Linux kernel vulnerability, CVE-2019-11815, with a Common Vulnerability Scoring System (CVSS) 3.0 base score of 8.1. The published metrics include an attack vector of “network,” no privileges required, and administrative-level code execution; that is, the confidentiality, integrity, and availability (CIA) impacts are all “high.” At first glance, this seems like a worst-case scenario. But assessing a vulnerability’s potential impact goes beyond the attack vector, privileges, and CIA impact of the CVSS base score.
One component of the CVSS 3 base score is attack complexity, for which this vulnerability has a rating of “high” as well. This means that a successful attack is dependent on a very specific set of circumstances that is hard to achieve. According to the CVSS 3.0 standard, this rating means that “a successful attack depends on conditions beyond the attacker’s control” and “a successful attack cannot be accomplished at will, but requires the attacker to invest in some measurable amount of effort in preparation or execution against the vulnerable component before a successful attack can be expected.”
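To see how much weight the attack complexity rating carries, the base score can be recomputed from the metrics. The following is a minimal sketch of the CVSS 3.0 base score formula for the scope-unchanged case only. The article does not state the user interaction or scope metrics, so the values used here (no user interaction, scope unchanged) are assumptions, chosen because they are consistent with the published 8.1.

```python
import math

# Metric weights from the CVSS v3.0 specification (scope-unchanged values).
ATTACK_VECTOR = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
ATTACK_COMPLEXITY = {"L": 0.77, "H": 0.44}
PRIVILEGES_REQUIRED = {"N": 0.85, "L": 0.62, "H": 0.27}
USER_INTERACTION = {"N": 0.85, "R": 0.62}
CIA_IMPACT = {"H": 0.56, "L": 0.22, "N": 0.0}


def round_up(value):
    """Smallest number with one decimal place that is >= value.

    Plain float arithmetic is adequate for the inputs used here; a
    production scorer would need to guard against floating-point error.
    """
    return math.ceil(value * 10) / 10


def base_score(av, ac, pr, ui, c, i, a):
    """CVSS v3.0 base score, assuming scope is unchanged."""
    iss = 1 - (1 - CIA_IMPACT[c]) * (1 - CIA_IMPACT[i]) * (1 - CIA_IMPACT[a])
    impact = 6.42 * iss
    exploitability = (
        8.22
        * ATTACK_VECTOR[av]
        * ATTACK_COMPLEXITY[ac]
        * PRIVILEGES_REQUIRED[pr]
        * USER_INTERACTION[ui]
    )
    if impact <= 0:
        return 0.0
    return round_up(min(impact + exploitability, 10))


# Metrics described in the article, with attack complexity "high".
as_published = base_score("N", "H", "N", "N", "H", "H", "H")
# The same metrics with attack complexity "low", for comparison.
if_low_complexity = base_score("N", "L", "N", "N", "H", "H", "H")
```

Under these assumptions, as_published evaluates to 8.1 and if_low_complexity to 9.8, so the “high” attack complexity rating is the only thing separating this score from one near the top of the scale.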
Looking at the vulnerability itself in some detail will reveal why the scoring is technically correct, especially when taking the attack complexity rating into account, but is not completely representative of the actual risk to enterprises and users.
Breaking down the vulnerability
The description of the vulnerability from the NVD states that the issue was “discovered in rds_tcp_kill_sock in net/rds/tcp.c in the Linux kernel before 5.0.8,” and that there is “a race condition leading to a use-after-free, related to net namespace cleanup.” This is an accurate and concise description of the vulnerability from a code perspective, but the lack of some critical information may lead to alarm given the mention of TCP, or Transmission Control Protocol.
The first major component of this vulnerability is Reliable Datagram Sockets (RDS), a socket interface and protocol developed by Oracle that allows a single transport socket to send to and receive from a very large number of different endpoints. This vulnerability involves RDS when TCP is used as the underlying transport protocol: the RDS header and application data are encapsulated and sent via TCP, typically to port 16385, where they are decapsulated and passed to the RDS socket.
Beyond Oracle’s documentation and a very short Wikipedia page, there is not much information about RDS or where it’s typically used. The obscurity of this protocol, combined with the existence of previous local privilege escalation vulnerabilities, has led most popular Linux distributions such as Ubuntu to blacklist kernel modules relating to RDS for many years. This immediately reduces the potential harm of such a vulnerability by a large margin.
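An administrator can confirm whether a given host is even a candidate by checking for the two modules. The sketch below parses text in the format of /proc/modules and of modprobe configuration files; the function names are hypothetical. It recognizes only plain “blacklist” directives, and distributions may disable module autoloading by other means (for example, alias or install directives) that it does not inspect.

```python
RDS_MODULES = ("rds", "rds_tcp")


def loaded_modules(proc_modules_text):
    """Return the module names listed in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def blacklisted_modules(modprobe_conf_text):
    """Return module names that appear in 'blacklist <module>' lines."""
    names = set()
    for line in modprobe_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) == 2 and fields[0] == "blacklist":
            # modprobe treats '-' and '_' in module names as equivalent.
            names.add(fields[1].replace("-", "_"))
    return names


def rds_status(proc_modules_text, modprobe_conf_text):
    """Summarize loaded and blacklisted state for the RDS modules."""
    loaded = loaded_modules(proc_modules_text)
    blocked = blacklisted_modules(modprobe_conf_text)
    return {
        name: {"loaded": name in loaded, "blacklisted": name in blocked}
        for name in RDS_MODULES
    }
```

On a live system, the first argument would be the contents of /proc/modules and the second the concatenated contents of the files under /etc/modprobe.d. A host on which neither module is reported as loaded is not exposed to the code path described in the rest of this article.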
What if the rds and rds_tcp kernel modules are enabled?
When using RDS over TCP, the underlying TCP transport is completely managed by the kernel. This means that when a client establishes a new RDS socket, the TCP socket is opened by the kernel in rds_tcp_conn_path_connect() in tcp_connect.c, which is called by the worker thread function rds_connect_worker() in threads.c.
Figure 1. rds_connect_worker() in threads.c calling rds_tcp_conn_path_connect()
The RDS-specific portion of the vulnerability arises when the underlying TCP client-side socket continually fails to connect. When TCP connect() fails, the rds_tcp_restore_callbacks() function is called, and sets the t_sock pointer in the rds_tcp_connection structure to NULL, which is completely reasonable behavior.
Figure 2. rds_tcp_conn_path_connect() calling rds_tcp_restore_callbacks()
Figure 3. t_sock set to NULL in rds_tcp_restore_callbacks()
The problem arises when we introduce the second major component of the vulnerability: network namespaces. Network namespaces allow for the use of a separate set of interfaces and routing tables for a given namespace, where traditionally the entire operating system shares the same interfaces and routing tables as every other process. This namespace functionality is used by platforms such as Docker to provide network isolation for containers.
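On Linux, each process exposes its network namespace as a symbolic link at /proc/<pid>/ns/net, whose target has the form net:[inode]. Two processes are in the same network namespace when those inode numbers match. The sketch below, with hypothetical helper names, shows how that identifier can be read and compared.

```python
import os
import re

NS_LINK = re.compile(r"^net:\[(\d+)\]$")


def parse_net_ns(link_target):
    """Extract the namespace inode number from a 'net:[N]' link target."""
    match = NS_LINK.match(link_target)
    if match is None:
        raise ValueError(f"not a network namespace link: {link_target!r}")
    return int(match.group(1))


def net_ns_of(pid="self"):
    """Read the network namespace inode of a process (Linux only)."""
    return parse_net_ns(os.readlink(f"/proc/{pid}/ns/net"))


def same_net_ns(link_a, link_b):
    """True if two link targets name the same network namespace."""
    return parse_net_ns(link_a) == parse_net_ns(link_b)
```

A process inside a container will typically report a different inode from one running directly on the host, which is the isolation the paragraph above describes.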
When an RDS-TCP socket is initialized in rds_tcp_init(), the network namespace function register_pernet_device() is called with a pointer to a pernet_operations structure, rds_tcp_net_ops. This structure contains the initialization and exit functions to run when a network namespace is initialized or removed while the socket is active.
Figure 4. register_pernet_device() called to register network namespace device
Figure 5. rds_tcp_exit_net() as the exit function for the network namespace device
The exit function rds_tcp_exit_net() will call rds_tcp_kill_sock(), which is used to perform cleanup of various parts of the RDS-TCP socket. Part of the process is the creation of a list of connections to be cleaned up, called the tmp_list.
One of the checks performed on each connection is whether the t_sock pointer for the underlying TCP socket is NULL. If it is, the t_tcp_node is not added to the “cleanup list.” As a result, rds_conn_destroy() is not called for those nodes and much of the “cleanup” is not performed.
Figure 6. rds_tcp_kill_sock() skipping cleanup if t_sock is NULL
Most importantly, the rds_connect_worker() thread is not stopped and will continue to try reconnecting. Eventually, the underlying net structure is freed as part of the namespace cleanup and may then be used by a still-running rds_connect_worker(), triggering a use-after-free. Technically, this flaw is as described: no privileges are required, and administrative-level code execution is possible if it is exploited.
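The sequence of events can be illustrated with a toy model. This is not kernel code: the classes and the skip_null_sock flag are hypothetical, and the model is deterministic, whereas the real flaw is a race between threads. It shows only the logic error, namely that connections with a NULL t_sock are left out of the cleanup list and so keep a worker that still references the freed net structure. The skip_null_sock=False variant, in which such connections are cleaned up as well, is included as an illustrative assumption about how the skip could be avoided, not as a description of the upstream patch.

```python
from dataclasses import dataclass


@dataclass
class Net:
    """Stand-in for the kernel's per-namespace net structure."""
    freed: bool = False


@dataclass
class Connection:
    """Stand-in for one RDS-TCP connection and its reconnect worker."""
    net: Net
    t_sock: object = None  # None models the NULL t_sock pointer
    worker_running: bool = True

    def destroy(self):
        # Models rds_conn_destroy(): among other cleanup, stop the worker.
        self.worker_running = False

    def worker_tick(self):
        # Models one reconnect attempt by rds_connect_worker().
        if not self.worker_running:
            return "stopped"
        return "use-after-free" if self.net.freed else "retrying"


def kill_sock(net, connections, skip_null_sock=True):
    """Models rds_tcp_kill_sock(), followed by the namespace being freed."""
    tmp_list = [
        conn
        for conn in connections
        if conn.net is net and not (skip_null_sock and conn.t_sock is None)
    ]
    for conn in tmp_list:
        conn.destroy()
    net.freed = True  # namespace cleanup releases the net structure
    return tmp_list


# One healthy connection and one whose TCP connect() keeps failing.
net = Net()
healthy = Connection(net, t_sock=object())
failing = Connection(net)  # t_sock is None after the failed connect()
kill_sock(net, [healthy, failing])
outcome = (healthy.worker_tick(), failing.worker_tick())
```

In this model, the healthy connection's worker is stopped, while the failing connection's worker goes on to touch the freed structure. Calling kill_sock() with skip_null_sock=False stops both workers.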
The fix for the issue is straightforward: system administrators need to ensure that either the vulnerable modules are disabled or an updated kernel is installed.
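As a rough first check on the kernel side, the running kernel's release string can be compared with 5.0.8, the upstream version named in the NVD description. The sketch below uses hypothetical function names and gives only a hint: distribution kernels often backport security fixes without changing the upstream version number, so a result of True does not prove a host is vulnerable, and the vendor's advisory remains the authority.

```python
import re

FIXED_IN = (5, 0, 8)  # upstream release named in the NVD description


def parse_release(release):
    """Parse the leading 'major.minor[.patch]' of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def predates_upstream_fix(release):
    """True if the upstream version number is lower than 5.0.8."""
    return parse_release(release) < FIXED_IN
```

The release string of the running kernel is available from platform.release() in the Python standard library, so predates_upstream_fix(platform.release()) gives the hint for the current host.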
The real risks posed by CVE-2019-11815
Given the characteristics of CVE-2019-11815, what does this mean for users? A potential victim would first need to have the commonly blacklisted rds and rds_tcp modules loaded; if they are not loaded, the attack cannot proceed. Even if an attacker finds such a rare target, the TCP connect() is performed only by RDS-TCP clients, not servers, so the attacker would then have to entice the target into connecting to an attacker-controlled RDS-TCP socket from within a network namespace.
The attacker’s next job would be to cause a failure on the underlying TCP connection and, at the same time, cause the target user’s network namespace to be cleaned up, a task that a remote attacker has practically no chance of performing. To make matters even harder, race conditions (flaws in which the outcome depends on the unexpected timing of events) are notoriously difficult to exploit and would likely require a large number of attempts.
With all these conditions taken into consideration, the chances of this vulnerability being “remotely exploitable without authentication” are essentially zero. There is a very small chance that this could be used as a local privilege escalation, but that would require that the commonly blacklisted rds and rds_tcp modules are loaded.
Although the CVSS score of this vulnerability is technically correct, users should be aware that actual risk also depends on how likely a successful attack is, given its complexity and the conditions an attacker must meet. The circumstances in which this attack would be feasible are unlikely to ever be seen in a real production environment. The vast majority of Linux servers are simply not vulnerable in a remote context.