As we explained in a previous article, eBPF enables new solutions in several areas. Some relate to SDN management, DDoS mitigation, and intrusion detection through early packet drop. Others help improve network performance, load balancing, observability, and more.
Now you can discover some use cases and success stories from real-world projects.
eBPF: an overview
Although BPF (Berkeley Packet Filter) emerged in 1992 as a way to optimize packet filters, it had some limitations. To work around them, Alexei Starovoitov initially proposed a rewrite of BPF. He then developed eBPF, or extended Berkeley Packet Filter, with Daniel Borkmann in 2014.
Today, its creators present eBPF as “a revolutionary technology with origins in the Linux kernel that can run sandboxed programs in an operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring to change kernel source code or load kernel modules”. This makes it possible “to run on events other than packets, and do actions other than filtering”, as Brendan Gregg puts it.
eBPF Use Cases
Previously, we listed 5 reasons to use eBPF. More than reasons, they are 5 areas where eBPF can improve and boost your project: programmability, networking, tracing and profiling, observability and monitoring, and security.
Facebook’s load balancer
Facebook’s servers process millions and millions of visits every day. So how can the company optimize traffic and guarantee a reliable, safe and fast user experience? Its engineers use Katran. It “creates a software-based solution to load balancing with a reengineered forwarding plane that takes advantage of recent innovations in kernel engineering”. These innovations are eXpress Data Path (XDP) and the eBPF virtual machine, as explained by Nikita Shirokov and Ranjeeth Dasineni.
Facebook uses a network load balancer (also called a layer 4 load balancer, or L4LB). It operates on packets rather than serving application-level requests. To do this, the load balancer announces a virtual IP address (VIP) “to the internet at each location. Packets destined to the VIP are then seamlessly distributed among the backend servers” by the distribution algorithm. This happens across a globally distributed network of points of presence (PoPs), which also act as proxies for Facebook’s data centers.
However, the first generation L4LB, based on the IPVS kernel module, presented some challenges related to backends. “In the second iteration, we leveraged the eXpress Data Path (XDP) framework and the new BPF virtual machine (eBPF) to run the software load balancer together with the backends on a large number of machines”, added the engineers.
Comparing generations, “both are software load balancers running on backend servers. Katran (right) allows us to colocate the load balancer with backend application, thus increasing the load balancer capacity”. Additionally, “Katran is deployed today on backend servers in Facebook’s points of presence (PoPs), and it has helped us improve the performance and scalability of network load balancing and reduce inefficiencies such as busy loops when there are no incoming packets”.
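As a rough illustration of how an L4 load balancer can distribute packets destined to a VIP among backends, the sketch below hashes a flow's 5-tuple and uses the result to pick a backend, so that every packet of a connection reaches the same server. It is a user-space Python model, not Katran's code and not an eBPF program: the names `FlowKey` and `pick_backend`, the sample addresses, and the plain modulo hashing are all assumptions made for illustration.

```python
import hashlib
from typing import NamedTuple, Sequence


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple identifying one connection."""
    src_ip: str
    src_port: int
    dst_ip: str      # the VIP announced by the load balancer
    dst_port: int
    protocol: int    # 6 = TCP, 17 = UDP


def pick_backend(flow: FlowKey, backends: Sequence[str]) -> str:
    """Map a flow to a backend deterministically.

    The same flow always hashes to the same index, so all packets
    of one connection are forwarded to the same backend.
    """
    if not backends:
        raise ValueError("no backends configured")
    # Serialize the 5-tuple into bytes and hash it.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce
    # it modulo the number of backends (simplifying assumption).
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Calling `pick_backend(FlowKey("203.0.113.9", 51000, "198.51.100.1", 443, 6), ["10.0.0.1", "10.0.0.2", "10.0.0.3"])` repeatedly returns the same backend for that flow. A production balancer would typically prefer a consistent-hashing scheme, so that adding or removing a backend remaps only a small share of flows; modulo hashing is used here only to keep the sketch short.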
Facebook also uses eBPF to enforce encryption policies within its network. After weighing different options and scenarios for transparent enforcement, the team decided to develop and deploy SSLWall. It’s “a system that cuts off non-SSL connections across various boundaries”, as explained in this blog post. This approach requires work in the kernel context, which is where engineers take advantage of eBPF capabilities such as tc-bpf, kprobes, and maps.
The eBPF programs are managed through a daemon, which also sends logs to Scribe. “This makes management of releases easier to deal with, as we only have one software unit to monitor instead of needing to track a daemon and eBPF release. Additionally, we can modify the schema of our BPF tables, which both user space and kernel space consult, without compatibility concerns between releases.” Proxies are part of the final infrastructure too.
Cloudflare’s Magic Firewall
Cloudflare is one of the leaders in the cloud computing market. As a provider for companies around the world, it must offer a service that is both reliable and safe enough to protect their assets. To that end, Cloudflare used eBPF to build programmable packet filtering for its Magic Firewall product.
“Magic Firewall allows custom packet-level rules, enabling customers to deprecate hardware firewall appliances and block malicious traffic at Cloudflare’s network”, according to the company. With cyberattacks becoming more frequent and sophisticated every day, it was necessary to shield Cloudflare’s network and services.
How does eBPF enhance Magic Firewall?
To achieve this goal, the engineering team is using eBPF capabilities. “With eBPF, you can insert packet processing programs that execute in the kernel, giving you the flexibility of familiar programming paradigms with the speed of in-kernel execution (…) We wanted to find a way to use eBPF to extend our use of nftables in Magic Firewall. This means being able to match, using an eBPF program within a table and chain as a rule. By doing this we can have our cake and eat it too, by keeping our existing infrastructure and code, and extending it further”.
Alongside its use of iptables and nftables, Cloudflare built an eBPF program, loaded it into an existing nftables table and chain, and integrated it into its tooling through Cilium. As a result, Magic Firewall is more flexible and powerful. In addition to having an integrated solution, Cloudflare affirmed they can “look deeper into packets and implement more complex matching logic than nftables alone could provide. Since our firewall is running as software on all Cloudflare servers, we can quickly iterate and update features”.
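To give a feel for what looking deeper into packets can mean, here is a minimal user-space sketch of a match function that inspects an IPv4/UDP packet and reports a match only when the destination port and the first payload bytes both fit a rule. It is written in Python for readability and is not Cloudflare's program or an eBPF program; the function names, the sample rule, and the packet builder are hypothetical, and the checksum is left at zero because the sketch does not validate it.

```python
import struct

UDP = 17  # IP protocol number for UDP


def matches(packet: bytes, dst_port: int, magic: bytes) -> bool:
    """Return True when an IPv4/UDP packet targets dst_port and its
    payload starts with the given magic bytes; False otherwise."""
    if len(packet) < 20:                 # too short for an IPv4 header
        return False
    version_ihl = packet[0]
    if version_ihl >> 4 != 4:            # not IPv4
        return False
    ihl = (version_ihl & 0x0F) * 4       # IPv4 header length in bytes
    if ihl < 20 or len(packet) < ihl + 8:
        return False                     # malformed or no room for UDP
    if packet[9] != UDP:                 # byte 9 holds the protocol
        return False
    # UDP header: source port, destination port, length, checksum.
    _src, port, _length, _checksum = struct.unpack_from("!HHHH", packet, ihl)
    if port != dst_port:
        return False
    payload = packet[ihl + 8:]
    return payload.startswith(magic)


def build_ipv4_udp(dst_port: int, payload: bytes) -> bytes:
    """Build a bare IPv4/UDP packet for experimenting (checksums unset)."""
    udp = struct.pack("!HHHH", 40000, dst_port, 8 + len(payload), 0) + payload
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp),          # version/IHL, TOS, total length
        0, 0,                            # identification, flags/fragment
        64, UDP, 0,                      # TTL, protocol, checksum
        bytes([192, 0, 2, 1]),           # source address (documentation range)
        bytes([198, 51, 100, 7]),        # destination address
    )
    return ip + udp
```

For example, `matches(build_ipv4_udp(53, b"\x12\x34abc"), 53, b"\x12\x34")` evaluates to `True`, while the same packet checked against port 123 does not match. Note how every read is preceded by a length check: an in-kernel program would need equivalent bounds checks before touching packet data.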
Regarding observability and monitoring tasks, “eBPF enables the collection & in-kernel aggregation of custom metrics and generation of visibility events based on a wide range of possible sources”, according to the official eBPF site. In this way, it extends the depth of visibility and generates histograms and data structures that facilitate analysis. Currently, there are several open-source plugins and applications you can orchestrate with your cloud infrastructure.
In recent years, Netflix has been “using eBPF to understand what software is doing, what the software is blocking in ways we couldn’t see before in production”, says Brendan Gregg, senior performance architect at the company. “We can log whenever machines talk to other machines, and we can use that for capacity planning and security analysis. It’s enabling us to use technologies in Linux that we couldn’t use before, kprobes, uprobes in production”.
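The idea of aggregating metrics at the source instead of exporting every raw event can be pictured with a power-of-two histogram. The sketch below is a user-space Python model under stated assumptions: the bucketing rule (based on `int.bit_length`) and the function names are illustrative choices, not the layout used by any particular eBPF tool.

```python
from collections import Counter
from typing import Iterable, Tuple


def log2_bucket(value: int) -> int:
    """Return the power-of-two bucket for a non-negative sample.

    Bucket 0 holds the value 0; bucket k (k >= 1) holds values in
    the range [2**(k-1), 2**k - 1].
    """
    if value < 0:
        raise ValueError("samples must be non-negative")
    return value.bit_length()


def bucket_range(bucket: int) -> Tuple[int, int]:
    """Return the inclusive (low, high) range covered by a bucket."""
    if bucket == 0:
        return (0, 0)
    return (1 << (bucket - 1), (1 << bucket) - 1)


def aggregate(samples: Iterable[int]) -> Counter:
    """Count samples per bucket, keeping only the counters rather
    than the individual samples."""
    histogram: Counter = Counter()
    for sample in samples:
        histogram[log2_bucket(sample)] += 1
    return histogram
```

With latencies in microseconds as input, `aggregate` keeps a handful of counters no matter how many samples arrive, which is the property that makes aggregation at the source cheap compared with shipping every event to user space.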
Netflix’s new data flow
Several products, technologies and services compose Netflix’s cloud infrastructure, which poses some challenges for overall observability. To solve these problems, the company deployed Cloud Network Insight. In the Netflix team’s words, it’s “a suite of solutions that provides both operational and analytical insight into the cloud network infrastructure to address the identified problems”. In Cloud Network Insight, different sources (such as VPC Flow Logs, ELB Access Logs, and eBPF flow logs on the instances) collect the data. In the specific case of eBPF, “the Flow Exporter is a sidecar that uses eBPF tracepoints to capture TCP flows at near real time on instances that power the Netflix microservices architecture”.
Flow Collector consumes two data streams, and the data goes through Keystone, which routes it to the datastores. Finally, the data feeds “various use cases within Netflix like network monitoring and network usage forecasting available via Lumen dashboards and machine learning based network segmentation. The data is also used by security and other partner teams for insight and incident analysis.”
According to the team, this solution has proven scalable: it manages billions of eBPF flow logs per hour while providing visibility.
The number of open-source applications and tools based on eBPF is increasing. Growing adoption of the technology is driving this process and confirms its usefulness in real-life projects. Big and small companies are benefiting from it in different ways. It seems we are going to be hearing and reading a lot more about eBPF in the coming months and years.
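To make the flow-log pipeline above more concrete, the sketch below shows one simple thing a collector could do with captured flow records: sum the bytes exchanged per source and destination pair. It is an assumption-laden Python illustration, not Netflix's Flow Collector; the `FlowLog` fields and the `bytes_per_pair` function are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class FlowLog:
    """Hypothetical record describing one captured TCP flow."""
    src: str          # source address or service name
    dst: str          # destination address or service name
    bytes_sent: int   # bytes transferred from src to dst


def bytes_per_pair(flows: Iterable[FlowLog]) -> Dict[Tuple[str, str], int]:
    """Sum bytes_sent for every (src, dst) pair seen in the flow logs."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for flow in flows:
        totals[(flow.src, flow.dst)] += flow.bytes_sent
    return dict(totals)
```

Per-pair totals of this kind are the sort of summary that could feed network monitoring or usage forecasting; a real pipeline would also need time windows and enrichment with service metadata, which are left out here.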