Back in February, we published a discussion of the vmsplice() exploit which showed how the failure to check permissions for a read operation led to a buffer overflow within the kernel. Subsequently, a linux-kernel reader pointed out that the article stopped short of a complete explanation: this is not an ordinary buffer overflow exploit. This article picks up where the last one left off and describes how the vmsplice() exploit makes use of this buffer overflow to take over the system.
When vmsplice() is being used to feed data from memory into a pipe, the function charged with making it all happen is vmsplice_to_pipe(), found in fs/splice.c. It declares a couple of arrays of interest:
struct page *pages[PIPE_BUFFERS];
struct partial_page partial[PIPE_BUFFERS];
PIPE_BUFFERS, remember, is 16 on exploitable configurations. Both of these arrays are passed into get_iovec_page_array(), which, as described in the previous article, makes a call to get_user_pages() to fill in the pages array. As a result of the failure to check whether the calling application is allowed to read the requested region of memory, get_user_pages() will overflow the pages array, writing far more than PIPE_BUFFERS pointers into it. These are, however, pointers to legitimate kernel data structures; it remains to be seen how this overflow enables the attacker to take control of the system.
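To get a feel for the scale of the overflow, here is a minimal userspace sketch; it is not kernel code. It counts how many page pointers a requested region would need and how many of them would not fit in a PIPE_BUFFERS-sized array if nothing limited the request. The 4096-byte page size and the helper names pages_spanned() and excess_pointers() are assumptions made for illustration; only the value 16 for PIPE_BUFFERS comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIPE_BUFFERS 16          /* value given in the article */
#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* The offset mask below only works for a power-of-two page size. */
static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/*
 * Hypothetical helper: number of pages touched by the region
 * [base, base + len). Arithmetic wraparound for a huge len is
 * deliberately ignored in this sketch.
 */
size_t pages_spanned(uintptr_t base, size_t len)
{
    size_t off = (size_t)(base & (SKETCH_PAGE_SIZE - 1));
    return (off + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: how many page pointers would land past the end
 * of a PIPE_BUFFERS-entry array if all npages pointers were stored.
 */
size_t excess_pointers(size_t npages)
{
    return npages > PIPE_BUFFERS ? npages - PIPE_BUFFERS : 0;
}
```

Under those assumptions, a page-aligned request covering 64 pages gives pages_spanned(0, 64 * 4096) == 64 and excess_pointers(64) == 48: 48 pointers would be written beyond the end of the array.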
The partial array is also passed into get_iovec_page_array(); it describes the portion of each page which should be written into the pipe. To that end, a loop like this is run immediately after returning from get_user_pages():