eBPF: The Next Generation of Linux

According to experts, eBPF is a revolutionary technology. They also consider it the next generation of Linux because of the capabilities it offers developers. But what are these capabilities, and how can they help solve day-to-day problems?

What is eBPF?

To understand what it is and how it works, we need to review the origins of this technology first.

eBPF Origins

You are probably familiar with BPF (Berkeley Packet Filter). Developed by Steven McCanne and Van Jacobson and released in 1992, “BPF originated as a technology for optimizing packet filters. If you run tcpdump with an expression (matching on a host or port), it gets compiled into optimal BPF bytecode which is executed by an in-kernel sandboxed virtual machine”, as Brendan Gregg explains in one of his posts. Elsewhere, Gregg summarizes how Linux tracing has improved over the years and describes the limitations he faced when working with BPF and the kernel.

To work around these limitations, Alexei Starovoitov first proposed a rewrite of BPF. He then developed eBPF, or extended Berkeley Packet Filter, together with Daniel Borkmann in 2014.

Enhancing and Extending BPF

Nowadays, its developers present eBPF as “a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in an operating system kernel. It can be used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules”. This makes it possible “to run on events other than packets, and do actions other than filtering”, as Gregg puts it.

With eBPF, it is feasible to build new solutions for SDN management, DDoS mitigation and intrusion detection through early packet drop, network performance improvement, load balancing, observability, and more.

How Does eBPF Work?

As Brendan Gregg’s book BPF Performance Tools (published by Addison-Wesley) points out, eBPF is “a flexible and efficient technology composed of an instruction set, storage objects, and helper functions. It can be considered a virtual machine due to its virtual instruction set specification. These instructions are executed by a Linux kernel BPF [referring to eBPF] runtime, which includes an interpreter and a JIT compiler for turning BPF instructions into native instructions for execution. BPF instructions must first pass through a verifier that checks for safety, ensuring that the BPF program will not crash or corrupt the kernel (it doesn’t, however, prevent the end user from writing illogical programs that may execute but not make sense).”
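The verify-then-execute idea in this quote can be modelled in a few lines. The sketch below is a toy under stated assumptions: the four opcodes (mov_imm, add_reg, jeq_imm, exit) and the names Insn, verify, and run are invented for illustration and are not the real eBPF instruction set or verifier, which are far more elaborate. It only borrows two real conventions: the input arrives in r1 and the result is returned in r0. The toy verifier accepts forward jumps only, which is a simple way to guarantee that every accepted program terminates.

```python
from dataclasses import dataclass
from typing import List

NUM_REGS = 4       # toy register file: r0..r3
MAX_INSNS = 64     # arbitrary size limit for this sketch
OPS = {"mov_imm", "add_reg", "jeq_imm", "exit"}


@dataclass(frozen=True)
class Insn:
    op: str
    dst: int = 0   # destination register
    src: int = 0   # source register
    imm: int = 0   # immediate operand
    off: int = 0   # jump offset, relative to the next instruction


def verify(prog: List[Insn]) -> None:
    """Raise ValueError unless the program is safe to run in this toy model."""
    if not prog or len(prog) > MAX_INSNS:
        raise ValueError("program is empty or too long")
    if prog[-1].op != "exit":
        raise ValueError("last instruction must be exit")
    for pc, insn in enumerate(prog):
        if insn.op not in OPS:
            raise ValueError(f"unknown opcode at {pc}: {insn.op}")
        if not (0 <= insn.dst < NUM_REGS and 0 <= insn.src < NUM_REGS):
            raise ValueError(f"invalid register at {pc}")
        if insn.op == "jeq_imm":
            target = pc + 1 + insn.off
            # Forward-only jumps mean the program counter always grows.
            if insn.off < 0 or target >= len(prog):
                raise ValueError(f"jump at {pc} is backward or out of bounds")


def run(prog: List[Insn], arg: int) -> int:
    """Verify, then interpret. The input is placed in r1; r0 is returned."""
    verify(prog)
    regs = [0] * NUM_REGS
    regs[1] = arg
    pc = 0
    while True:
        insn = prog[pc]
        if insn.op == "exit":
            return regs[0]
        if insn.op == "mov_imm":
            regs[insn.dst] = insn.imm
        elif insn.op == "add_reg":
            regs[insn.dst] += regs[insn.src]
        elif insn.op == "jeq_imm" and regs[insn.dst] == insn.imm:
            pc += insn.off
        pc += 1


# Returns 1 when the input equals 80, otherwise 0.
IS_PORT_80 = [
    Insn("mov_imm", dst=0, imm=0),          # r0 = 0 (default result)
    Insn("jeq_imm", dst=1, imm=80, off=1),  # if r1 == 80, skip the next exit
    Insn("exit"),
    Insn("mov_imm", dst=0, imm=1),          # r0 = 1
    Insn("exit"),
]

# Rejected by verify(): the jump goes backward, so it could loop forever.
BACKWARD_JUMP = [
    Insn("jeq_imm", dst=1, imm=0, off=-1),
    Insn("exit"),
]
```

Calling run(IS_PORT_80, 80) executes the program only after verify accepts it, while verify(BACKWARD_JUMP) raises ValueError before a single instruction runs. As in the quote, the check is about safety, not about whether the program does anything sensible.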

eBPF Programs

eBPF does nothing on its own: it needs eBPF programs to run. Here are some considerations about these programs:

They are event-driven.
They run when the kernel or an application passes a certain hook point.
Pre-defined hooks include system calls, function entry/exit, kernel tracepoints, network events, and several others.
If a predefined hook does not exist for a particular need, it is possible to create a kernel probe (kprobe) or user probe (uprobe) to attach eBPF programs almost anywhere in kernel or user applications.
It is possible to write them indirectly (via projects like Cilium, bcc, or bpftrace which provide an abstraction on top of eBPF) or directly (when no higher-level abstraction exists).
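The event-driven model in this list can be pictured as a registry of hook points, each with a list of attached programs. The sketch below is a plain Python analogy, not kernel code: HookRegistry, attach, fire, and count_opens are hypothetical names, and the hook is identified by an illustrative string. In a real system the kernel fires the hook; here the caller does it by hand.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# A "program" in this model is any function that receives an event context.
Program = Callable[[dict], None]


class HookRegistry:
    """Maps hook names to the programs attached to them."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Program]] = defaultdict(list)

    def attach(self, hook: str, program: Program) -> None:
        """Attach a program so that it runs whenever the hook is passed."""
        self._hooks[hook].append(program)

    def fire(self, hook: str, ctx: dict) -> int:
        """Simulate execution passing a hook; return how many programs ran."""
        programs = self._hooks.get(hook, [])
        for program in programs:
            program(ctx)
        return len(programs)


# State updated by the example program: number of open calls per process ID.
open_counts: Dict[int, int] = {}


def count_opens(ctx: dict) -> None:
    """Example program: count events per PID found in the context."""
    pid = ctx["pid"]
    open_counts[pid] = open_counts.get(pid, 0) + 1
```

Attaching count_opens to a hook named "syscalls:sys_enter_openat" and firing that hook with {"pid": 42} increments open_counts[42]. Firing a hook with nothing attached runs no program at all, which is the point of the model: programs stay idle until their event occurs.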

Other Basics

Loader and verification architecture: the program passes through verification and JIT compilation before being attached to the requested hook.
Maps: the documentation indicates that “eBPF programs can leverage the concept of maps to store and retrieve data in a wide set of data structures”.
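Helper calls: eBPF programs cannot call arbitrary kernel functions. Instead, they make function calls into helper functions, a well-defined API offered by the kernel.
Tail and function calls: function calls allow defining and calling functions within an eBPF program, while tail calls can call and execute another eBPF program and replace the execution context.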
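Maps, helpers, and tail calls fit together naturally, so a single sketch can show all three. What follows is a conceptual Python model, not the kernel API: packet_counts stands in for a hash map, prog_array for a table of programs, helper_map_increment for a helper function, and TailCall, entry, parse_tcp, parse_udp, and run_with_tail_calls are hypothetical names. The limit of 8 chained tail calls is an arbitrary value chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

MAX_TAIL_CALLS = 8  # arbitrary limit for this sketch


@dataclass(frozen=True)
class TailCall:
    """Returned by a program that wants another program to take over."""
    index: int


Result = Union[str, TailCall]
Program = Callable[[dict], Result]

packet_counts: Dict[str, int] = {}   # plays the role of a hash map
prog_array: Dict[int, Program] = {}  # plays the role of a program table


def helper_map_increment(counts: Dict[str, int], key: str) -> None:
    """Stand-in for a helper: the only way programs here touch a map."""
    counts[key] = counts.get(key, 0) + 1


def parse_tcp(ctx: dict) -> Result:
    helper_map_increment(packet_counts, "tcp")
    return "DROP" if ctx["dport"] == 23 else "PASS"


def parse_udp(ctx: dict) -> Result:
    helper_map_increment(packet_counts, "udp")
    return "PASS"


def entry(ctx: dict) -> Result:
    """Entry program: hand over to a protocol-specific program if one exists."""
    slot = {"tcp": 0, "udp": 1}.get(ctx["proto"])
    return "PASS" if slot is None else TailCall(slot)


prog_array[0] = parse_tcp
prog_array[1] = parse_udp


def run_with_tail_calls(program: Program, ctx: dict) -> str:
    """Run a program; on a tail call the target replaces the caller entirely."""
    result = program(ctx)
    hops = 0
    while isinstance(result, TailCall):
        if hops == MAX_TAIL_CALLS:
            raise RuntimeError("tail call limit exceeded")
        hops += 1
        # The caller never resumes: only the target's result is kept.
        result = prog_array[result.index](ctx)
    return result
```

Running run_with_tail_calls(entry, {"proto": "tcp", "dport": 23}) hands control to parse_tcp, which records the packet in packet_counts and decides the verdict. The map outlives each invocation, so a separate reader (user space, in the real system) can inspect the accumulated counts later.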

Development Toolchains

Toolchains help in the development and management of eBPF programs. Some of the most widely used are:

BCC: a framework for writing Python programs with eBPF programs embedded inside them.
bpftrace: a high-level tracing language for Linux eBPF.
eBPF Go Library: a library that decouples the process of getting to the eBPF bytecode from the loading and management of eBPF programs.
libbpf C/C++ Library: a library that decouples the loading of eBPF object files, generated by the clang/LLVM compiler, into the kernel.

Safety

Because eBPF programs run inside the kernel, any error in the implementation can affect vital components of the software infrastructure, and the open-source nature of the technology can widen the impact of such errors. Safety is therefore key to its success. To address this, eBPF offers multi-layer security that includes required privileges, a verifier, a hardening process, and an abstracted runtime context.

Why Use eBPF?

Programmability: it enables “adding additional protocol parsers and easily program any forwarding logic to meet changing requirements”, according to its creators.
Networking: programmability makes it easier to create networking solutions for any kind of project, including packet processing.
Tracing and profiling: by attaching programs to tracepoints and to kernel and user application probe points, it is possible to gain visibility and generate useful insights.
Observability and monitoring: metrics, histograms, and events can be collected from many different sources, which increases visibility.
Security: visibility into network operations makes it possible to detect issues and increase control through system call filtering, network-level filtering, and process context tracing.
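As an illustration of the histograms mentioned under observability, the sketch below groups samples (for example, latencies in microseconds) into power-of-two buckets, a compact summary often shown by tracing tools. It is ordinary Python run on sample data, not an eBPF program, and the names log2_bucket, bucket_range, and build_histogram are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def log2_bucket(value: int) -> int:
    """Bucket index for a non-negative integer.

    Bucket 0 holds only 0; bucket b >= 1 holds 2**(b-1) .. 2**b - 1.
    """
    if value < 0:
        raise ValueError("samples must be non-negative")
    return value.bit_length()


def bucket_range(bucket: int) -> Tuple[int, int]:
    """Inclusive (low, high) value range covered by a bucket."""
    if bucket == 0:
        return (0, 0)
    return (1 << (bucket - 1), (1 << bucket) - 1)


def build_histogram(samples: Iterable[int]) -> Dict[int, int]:
    """Count samples per bucket; only the counts are kept, not the samples."""
    hist: Dict[int, int] = {}
    for sample in samples:
        bucket = log2_bucket(sample)
        hist[bucket] = hist.get(bucket, 0) + 1
    return hist
```

The design choice worth noting is aggregation at the source: however many samples arrive, the result is a handful of counters, which keeps the data that has to be read out small.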

eBPF Foundation

The number and diversity of implementations and use cases for eBPF have increased substantially since its launch. This is a good sign, as eBPF is already helping to create real-life solutions. However, the documentation for these projects is not always the best, and information is scattered across different sources and platforms. Additionally, there are no standardized parameters. As Alexei Starovoitov remarked on eBPF’s 7th anniversary, “it leads to a confusing user experience. BPF implementations compete with each other.”

In response, the eBPF Foundation and the eBPF Steering Committee were created “to optimize collaboration between projects and ensure that the core of eBPF is well maintained and equipped with a clear roadmap and vision for the bright future ahead of eBPF. This is where the eBPF Foundation comes in, and establishes an eBPF steering committee to take care of the technical direction and vision of eBPF”.

Developers working with this technology can also contribute to this effort through its communities.

Summary

eBPF “has grown from a Linux curiosity to a cornerstone of the way many technologies are built, but that hasn’t come without a fair amount of growing pain”, reflects its creator Alexei Starovoitov. It is being extensively used to drive a wide variety of use cases.

These range from security and observability to load balancing, the development of new applications, and more. Companies large and small in the tech world have already incorporated eBPF into their stack, and they are reporting positive results.