Apache Kafka

Apache Kafka is an open-source distributed event streaming platform used for data pipelines, data integration, mission-critical applications, and streaming analytics. Originally developed at LinkedIn and open-sourced in January 2011, it is one of the most popular projects of the Apache Software Foundation. It aims to provide low-latency, real-time data feeds. Kafka is built around a commit log: producers publish records to the log, and any number of systems or real-time applications can subscribe to it and read those records.
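The commit-log idea can be illustrated with a small in-memory sketch. This is a toy model, not Kafka's API: the `CommitLog` and `Subscriber` classes and the record names are hypothetical and exist only to show that records are appended in order and that each subscriber keeps its own read position (offset).

```python
class CommitLog:
    """Toy append-only log: records are appended and never modified."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Publish a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Subscriber:
    """Tracks its own offset, so subscribers read independently of each other."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self):
        """Fetch records not yet seen by this subscriber and advance its offset."""
        batch = self._log.read(self.offset)
        self.offset += len(batch)
        return batch


log = CommitLog()
log.append("order-created")
log.append("order-paid")

analytics = Subscriber(log)
search_indexer = Subscriber(log)

first_batch = analytics.poll()         # the two records published so far
log.append("order-shipped")
second_batch = analytics.poll()        # only the newly published record
indexer_batch = search_indexer.poll()  # all three records, read from offset 0
```

Because reading does not remove records, adding another subscriber does not affect the existing ones. A real Kafka deployment additionally partitions and replicates the log across servers, which this sketch leaves out.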

Project Background

  • Platform: Apache Kafka 
  • Author: LinkedIn
  • Released: January 2011
  • Type: Stream processing
  • License: Apache License 2.0
  • Support: Cross-platform
  • Language: Written in Java and Scala; clients available for many languages
  • GitHub: apache/kafka

Applications

  • Stateful computations over in-memory state
  • Graph processing
  • Machine Learning
  • Complex Event Processing
  • Event-driven analytics
  • Real-time search index building
  • Continuous ETL in e-commerce

Summary

  • Top users include banks, manufacturers, insurers, and telecom companies.
  • It integrates with many event sources and event sinks, for example PostgreSQL, JMS, Elasticsearch, and AWS S3.
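  • A Kafka cluster is fault-tolerant: because data is replicated across brokers, it can keep running when individual servers fail.
  • Ingest pipelines can be used to replicate events.
  • Kafka offers high throughput at low latency and, with suitable replication and acknowledgment settings, can be run with high availability and without losing acknowledged data.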
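
The availability and durability points above depend on how the cluster and its clients are configured. As a rough sketch, the settings usually involved are shown below as plain Python dictionaries. The names `acks`, `enable.idempotence`, and `min.insync.replicas` are real Kafka configuration names; the broker address, the chosen values, the `replication_factor` key, and the helper `tolerated_broker_failures` are illustrative assumptions rather than a recommendation.

```python
# Producer-side settings (key names follow Kafka's producer configuration).
producer_settings = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "acks": "all",               # wait until all in-sync replicas have the record
    "enable.idempotence": True,  # avoid duplicate records when a send is retried
}

# Topic-side settings. "replication_factor" is a hypothetical key standing in
# for the replication factor chosen when the topic is created.
topic_settings = {
    "replication_factor": 3,     # copies of each partition across brokers
    "min.insync.replicas": 2,    # replicas that must confirm an acks=all write
}


def tolerated_broker_failures(topic):
    """Brokers that can be down while acks=all writes are still accepted."""
    return topic["replication_factor"] - topic["min.insync.replicas"]


failures = tolerated_broker_failures(topic_settings)
```

With three replicas and a minimum of two in sync, writes continue with one broker down. Raising the minimum strengthens durability but reduces how many failures the topic can absorb while staying writable.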