Run-time Solutions to the Stack Overflow Problem

A Proof-of-concept Kernel Space Solution

This work is described in the following publication: Toward Preventing Stack Overflow Using Kernel Properties (by Benjamin Teissier and Stefan D. Bruda, in Proceedings of the 9th International Conference on Software Engineering and Applications, August 2014, Vienna, Austria).

An extended version of this publication is available locally. The code described in the paper is also available.

This solution is implemented at the kernel level. A system implementing our method patches every program at launch time so that each call and ret instruction also performs an appropriate system call. The system call attached to a call stores a copy of the new top of the stack in kernel space. The system call attached to the matching ret then compares this copy with the current top of the stack. This solution is a good proof of concept but is not production ready (see the paper above for a discussion on the matter).

A Complete Solution Based on ptrace

This work is described in the following publication: Counter-Measures against Stack Buffer Overflows in GNU/Linux Operating Systems (by Erick Leon and Stefan D. Bruda, Procedia Computer Science 83, 2016, pp. 1301–1306). An extended version of this paper is the following MSc thesis: The "ptrace" Solution to Stack Integrity Attacks in GNU/Linux Systems (Erick Leon, May 2015).

This code offers a reactive measure against stack pointer attacks in GNU/Linux operating systems, such as stack buffer overflow and code injection.

The tool launches a target process, stops it at every instruction, checks for call and ret opcodes, validates the stack pointer, and continues to the next instruction.

The validation of the stack pointer consists of:

  1. When a call is found, the stack pointer is stored in a separate buffer.
  2. When a ret is found, the top value in the separate buffer is compared with the current value on the stack. If the two are the same, execution continues. If they differ, a security incident is assumed: the target process is killed and a message with information about the incident is sent over the network to the system administrator.
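
The two validation steps above can be sketched as a small shadow stack. This is only an illustrative model written in Python for brevity, not the C code shipped with the tools: the names ShadowStack, StackViolation and run_trace are hypothetical, the exact value that is saved and compared is left abstract, and treating a ret with no recorded call as a violation is an assumption of this sketch.

```python
class StackViolation(Exception):
    """Raised when a ret does not match the value saved at the matching call."""


class ShadowStack:
    """Separate buffer holding one saved value per outstanding call."""

    def __init__(self):
        self._saved = []

    def on_call(self, value):
        # Step 1: remember the value observed when the call executes.
        self._saved.append(value)

    def on_ret(self, value):
        # Step 2: compare the most recently saved value with the current one.
        if not self._saved:
            # Assumption of this sketch: an unmatched ret counts as a violation.
            raise StackViolation("ret without a matching call")
        expected = self._saved.pop()
        if expected != value:
            raise StackViolation(f"expected {expected:#x}, found {value:#x}")


def run_trace(events):
    """Replay (kind, value) events; return True if the trace is clean."""
    shadow = ShadowStack()
    try:
        for kind, value in events:
            if kind == "call":
                shadow.on_call(value)
            elif kind == "ret":
                shadow.on_ret(value)
    except StackViolation:
        # The real tool would kill the target and notify the administrator here.
        return False
    return True
```

In this model a trace whose returns match its calls in reverse order is reported as clean, while a ret that sees an overwritten value ends the replay at that point.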

We also include a fast version that operates in the same manner, except that it verifies the stack pointer only when a write system call is intercepted. This approach addresses the running time of the full version, which checks every single instruction and is therefore significantly slower. The justification is empirical: in practice we observed that write system calls were issued at key points during the stack overflow, so the fast version works only under a specific set of conditions. This is further detailed in the thesis.
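
The difference in cost between the two versions can be illustrated by counting how often each one would run its check on the same sequence of stops. The sketch below is a simplified Python model, not the tools' code: the event format and the count_checks function are hypothetical. The syscall numbers for write() (1 on x86-64 Linux, 4 on 32-bit x86 Linux) are the standard ones.

```python
SYS_WRITE = {"x86_64": 1, "i386": 4}  # Linux syscall numbers for write()


def count_checks(events, mode, arch="x86_64"):
    """Count how many stops would trigger a stack check.

    events: iterable of ("insn", None) or ("syscall", number) tuples
            (a hypothetical trace format used only in this sketch).
    mode:   "full" checks at every instruction stop,
            "fast" checks only at write() syscall stops.
    """
    checks = 0
    for kind, number in events:
        if mode == "full" and kind == "insn":
            checks += 1
        elif mode == "fast" and kind == "syscall" and number == SYS_WRITE[arch]:
            checks += 1
    return checks


# A made-up trace: many instructions, a read() (syscall 0) and two write() calls.
trace = (
    [("insn", None)] * 1000
    + [("syscall", 1)]
    + [("insn", None)] * 500
    + [("syscall", 0), ("syscall", 1)]
)
full_checks = count_checks(trace, "full")
fast_checks = count_checks(trace, "fast")
```

On this made-up trace the full mode checks once per instruction, while the fast mode checks only at the two write() stops. The fast mode can therefore detect an attack only if a write() happens to occur while the corruption is observable.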

Files

The complete source code is available. The following are the executable programs that constitute our solution:

Name        Description
64full      Compiled from 64full.c. Single-steps through the target process at every instruction executed, for 64-bit machines. To enable injection, see the commented line in the source.
64fast      Compiled from 64fast.c. Single-steps through the target process whenever a write() system call is detected, for 64-bit machines. To enable injection, see the commented line in the source.
32full      Compiled from 32full.c. Single-steps through the target process at every instruction executed, for 32-bit machines. To enable injection, see the commented line in the source.
Loop/Loo    Compiled from loop.c or loo.c. A simple loop used for testing on 64-bit and 32-bit machines respectively.
Vul/Vuln    Compiled from vul.c or vuln.c. Code that exploits a buffer overflow, obtained from the internet and used for testing on 64-bit and 32-bit machines respectively (see the included PDF for further details).
Client.py   Python script for the client messaging socket.
Server.py   Python script for the server messaging socket.

Instructions

The directories named “64” and “32” store the files associated with the 64-bit and 32-bit versions respectively. To compile and run the files, gcc and Python must be installed on the system.

To compile:
gcc -o <NameOfOutputFile> <NameOfCodeFile.c>
example:
gcc -o 64full 64full.c

To execute:
./<NameOfOutputFile> <NameOfTargetProcess>
example:
./64full /usr/bin/cal

Both Python files (client.py and server.py) need the proper permissions, for example:
chmod 777 client.py
chmod 777 server.py

Limitations and open problems

  • The c2 opcode (ret with an immediate operand) behaves unexpectedly on 32-bit systems (documented in the thesis). This is probably related to the way 32-bit libraries were developed. We work around the issue by detecting only c2 opcodes followed by zero values. This could potentially be refined if other factors are taken into consideration, but our solution works as it is.
  • The rex.w + ff form of the call opcode appears only when certain conditions are met, so its use may vary from distribution to distribution. To simplify this problem we intercept opcodes with any two values followed by ff, which can in turn also intercept certain jmp opcodes. This can be optimized given a high level of expertise in the matter.
  • While all three versions work without a problem with command-line applications, they do not work with GUI applications. The cause is probably related to the way threads are created in the target process, and addressing it requires a high level of expertise. This falls outside the scope of our work, so we leave it as an open problem.
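
As a possible refinement of the second limitation above: in the x86 encoding, the ff opcode is shared by several instructions, and bits 5..3 of the following ModRM byte select which one is meant (2 and 3 are calls, 4 and 5 are jumps). The Python sketch below illustrates that test. It is not what the tools currently do: the function is_indirect_call is hypothetical, and the sketch assumes at most one REX prefix and ignores other legacy prefixes.

```python
def is_indirect_call(code, bits=64):
    """Return True if code starts with an FF-encoded call (FF /2 or FF /3).

    code: bytes read at the instruction pointer of the traced process.
    bits: 64 or 32, the mode the target runs in.
    """
    i = 0
    # In 64-bit mode a single REX prefix (0x40-0x4F) may precede the opcode.
    # In 32-bit mode those byte values are inc/dec instructions, not prefixes.
    if bits == 64 and i < len(code) and 0x40 <= code[i] <= 0x4F:
        i += 1
    # Need both the opcode byte and the ModRM byte that follows it.
    if i + 1 >= len(code) or code[i] != 0xFF:
        return False
    # Bits 5..3 of the ModRM byte select the operation behind opcode FF.
    reg = (code[i + 1] >> 3) & 0b111
    return reg in (2, 3)  # 2: near call, 3: far call; 4 and 5 are jumps
```

For example, the bytes ff d0 (call rax) and 48 ff d0 (the same call with a rex.w prefix) are accepted, while ff e0 (jmp rax) is rejected even though it shares the ff opcode.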