Kafka Overview

Kafka - a publish/subscribe (pub/sub) messaging system built around a centralized broker.

Topics

Topics are broken down into a number of partitions, which provide redundancy and scalability.

Producer And Consumer

Producers create new messages; consumers read messages.

Producer

In general, a producer does not care which partition a specific message is written to (but in some cases, the producer can direct messages to specific partitions).

Partition selection: message key + partitioner (the partitioner hashes the key and maps the hash to a specific partition).
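The key-to-partition mapping can be sketched in a few lines. This is only an illustration: `choose_partition` is a hypothetical helper, and `zlib.crc32` is a stand-in hash chosen for simplicity, not the hash a real Kafka client uses.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # crc32 is a stand-in hash for illustration only (an assumption).
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition.
p_first = choose_partition(b"user-1", 4)
p_again = choose_partition(b"user-1", 4)
```

Because the mapping depends on the partition count, changing the number of partitions would generally change where a given key lands in this sketch.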

Consumer

Key point: a consumer keeps track of which messages it has already consumed by tracking message offsets.
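A toy model of offset tracking, assuming a partition can be represented as a Python list whose index is the offset. `SimpleConsumer` and its methods are hypothetical names for this sketch, not a Kafka client API.

```python
class SimpleConsumer:
    """Toy consumer that remembers the next offset to read from one partition."""

    def __init__(self, log, start_offset=0):
        self.log = log                 # list of messages; list index == offset
        self.position = start_offset   # offset of the next message to read

    def poll(self, max_messages=10):
        """Return up to max_messages unread messages and advance past them."""
        batch = self.log[self.position:self.position + max_messages]
        self.position += len(batch)
        return batch


log = ["m0", "m1", "m2", "m3", "m4"]
consumer = SimpleConsumer(log)
first = consumer.poll(max_messages=3)   # reads offsets 0..2
saved = consumer.position               # the offset to store somewhere durable

# A restarted consumer resumes from the saved offset instead of from 0.
restarted = SimpleConsumer(log, start_offset=saved)
rest = restarted.poll()
```

Storing `saved` somewhere durable is what would let the consumer pick up where it left off after a restart.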

Consumer Group

Each partition is consumed by only one member of the group, and a consumer can stop and restart without losing its place.

The mapping of a consumer to a partition is called ownership.

If a single consumer fails, the remaining members of the group will rebalance the partitions among themselves.
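Ownership and rebalancing can be sketched as recomputing an assignment whenever group membership changes. The round-robin strategy and the `assign_partitions` helper below are assumptions made for illustration; real clients may use other assignment strategies.

```python
def assign_partitions(partitions, members):
    """Round-robin assignment: every partition gets exactly one owner."""
    if not members:
        raise ValueError("a consumer group needs at least one member")
    ordered = sorted(members)
    ownership = {member: [] for member in ordered}
    for i, partition in enumerate(sorted(partitions)):
        ownership[ordered[i % len(ordered)]].append(partition)
    return ownership


partitions = [0, 1, 2, 3]
before = assign_partitions(partitions, ["c1", "c2", "c3"])

# Consumer c2 fails; the remaining members take over all partitions.
after = assign_partitions(partitions, ["c1", "c3"])
```

In both assignments each partition appears under exactly one member, which is the property the group relies on.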

Broker and Cluster

Broker

A single Kafka server is called a broker.

  • Producer

    Receives messages from producers, assigns offsets to them, and commits the messages to storage on disk.

  • Consumer

    Responds to fetch requests with messages that have been committed to disk.
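The two broker duties above can be sketched with a toy append-only log. `PartitionLog` is a hypothetical class, and an in-memory list stands in for storage on disk.

```python
class PartitionLog:
    """Toy append-only log for one partition (in memory, not on disk)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return the offset assigned to it."""
        offset = len(self._messages)
        self._messages.append(message)
        return offset

    def fetch(self, offset, max_messages=100):
        """Return (offset, message) pairs starting at the requested offset."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        end = min(offset + max_messages, len(self._messages))
        return [(o, self._messages[o]) for o in range(offset, end)]


partition_log = PartitionLog()
offsets = [partition_log.append(m) for m in ("a", "b", "c")]
fetched = partition_log.fetch(1)
```

Offsets here are just positions in the list, so they grow by one per appended message and are never reused.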

Cluster

A partition is owned by a single broker; that broker is called the leader of the partition.

A partition may be assigned to multiple brokers - this is partition replication.

All consumers and producers operating on that partition must connect to the leader.
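One way to picture leaders and replicas is a table from partition to brokers. The placement rule below (consecutive brokers, first replica acts as leader) and the names `place_replicas` and `broker_for_client` are assumptions for illustration, not how Kafka itself decides placement.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Give each partition replication_factor brokers; the first is the leader."""
    if not 1 <= replication_factor <= len(brokers):
        raise ValueError("replication_factor must be between 1 and len(brokers)")
    placement = {}
    for p in range(num_partitions):
        replicas = [brokers[(p + r) % len(brokers)]
                    for r in range(replication_factor)]
        placement[p] = {"leader": replicas[0], "replicas": replicas}
    return placement


def broker_for_client(placement, partition):
    """Producers and consumers of a partition talk to its leader."""
    return placement[partition]["leader"]


placement = place_replicas(3, ["b1", "b2", "b3"], replication_factor=2)
target = broker_for_client(placement, 1)
```

Spreading leaders across brokers in this way also spreads client traffic, since each client connects to the leader of the partition it works with.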

Retention

Kafka retains messages for some period of time (e.g., 7 days) or until the topic reaches a certain size in bytes (e.g., 1 GB). Once a limit is reached, the oldest messages are expired and deleted.

Individual topics can also be configured with their own retention settings so that messages are stored for only as long as they are useful.
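A rough sketch of the two retention limits, applied per message for simplicity. `apply_retention` and its parameters are hypothetical names, and a real broker manages retention on stored log files rather than on a Python list.

```python
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000


def apply_retention(messages, now_ms, retention_ms=None, retention_bytes=None):
    """messages: list of (timestamp_ms, payload_bytes) pairs, oldest first."""
    kept = list(messages)
    if retention_ms is not None:
        # Time limit: drop anything older than retention_ms.
        kept = [(ts, p) for ts, p in kept if now_ms - ts <= retention_ms]
    if retention_bytes is not None:
        # Size limit: drop the oldest messages until the total size fits.
        total = sum(len(p) for _, p in kept)
        while kept and total > retention_bytes:
            _, payload = kept.pop(0)
            total -= len(payload)
    return kept


now = 8 * 24 * 60 * 60 * 1000  # pretend "now" is day 8
messages = [(0, b"old"), (now - 1000, b"aaaa"), (now, b"bbbb")]
by_time = apply_retention(messages, now, retention_ms=SEVEN_DAYS_MS)
by_both = apply_retention(messages, now, retention_ms=SEVEN_DAYS_MS,
                          retention_bytes=4)
```

Passing different limits per call mirrors the idea that each topic can carry its own retention settings.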


Why Kafka

  • Multiple producers
  • Multiple consumers
  • Disk-based retention
  • Scalable - cluster
  • High performance

Use Case

Activity tracking

Messaging

Metrics and logging

Stream processing - e.g., data pipelines, with Kafka as an intermediary