Kafka Internals
Controller
The controller is one of the Kafka brokers that, in addition to the usual broker functionality, is responsible for electing partition leaders.
When the controller notices that a broker left the cluster (by watching the relevant ZooKeeper path), it knows that all the partitions that had a leader on that broker will need a new leader.
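That re-election step can be sketched against a toy model in which each partition carries a replica list and an ISR set (elect_new_leader and every name here are illustrative, not Kafka's actual code):

```python
def elect_new_leader(replicas, isr, dead_broker):
    """Pick a new leader for a partition whose leader was on dead_broker.

    Prefer the earliest replica in the replica list that is still in the
    ISR; return None if no in-sync replica survives (an offline partition).
    """
    candidates = [b for b in replicas if b in isr and b != dead_broker]
    return candidates[0] if candidates else None

# broker 1 (the old leader) died; broker 2 is the first surviving ISR member
print(elect_new_leader(replicas=[1, 2, 3], isr={1, 2}, dead_broker=1))  # → 2
```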
Replication
Data in Kafka is organized by topics; each topic is split into partitions, and each partition can have multiple replicas.
Two types of replica:
Leader Replica
Each partition has a single replica designated as the leader. All produce and consume requests go through the leader.
Follower Replica
Any replica that is not the leader is called a follower; a follower's only job is to replicate messages from the leader and stay up to date.
ISR: in-sync replicas, the only replicas eligible to be elected partition leader (membership is governed by replica.lag.time.max.ms).
request processing
If a broker receives a produce or fetch request for a specific partition and the leader for that partition is on a different broker, it responds with a "Not a Leader for Partition" error.
How do the clients know where to send the requests?
metadata request
- all brokers have the cluster metadata cached
metadata.max.age.ms
Because the cluster can change, a client that receives a "Not a Leader" error refreshes its metadata before retrying.
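The refresh-on-error loop can be sketched against a toy cluster stub (FakeCluster and all names below are hypothetical stand-ins for a real client's metadata cache, not any Kafka client API):

```python
class FakeCluster:
    """Toy cluster: partition 0's leader moved from broker 1 to broker 2,
    but the client's cached metadata still points at broker 1."""
    def __init__(self):
        self.leaders = {0: 2}   # current truth
        self.stale = {0: 1}     # what the client cached earlier

    def cached_metadata(self):
        return dict(self.stale)

    def fetch_metadata(self):
        return dict(self.leaders)

    def send(self, broker, partition, payload):
        # a broker only accepts the request if it is the partition's leader
        return broker == self.leaders[partition]

def send_with_metadata_refresh(partition, payload, cluster, retries=3):
    """Route a request; on a 'not leader' failure, refresh the cached
    metadata and retry against the new leader."""
    metadata = cluster.cached_metadata()
    for _ in range(retries):
        broker = metadata[partition]
        if cluster.send(broker, partition, payload):
            return broker
        metadata = cluster.fetch_metadata()  # leader moved: refresh and retry
    raise RuntimeError("no leader found after retries")

print(send_with_metadata_refresh(0, b"msg", FakeCluster()))  # → 2
```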
produce msg
acks - controls how Kafka determines that a write has succeeded: how many replica acknowledgments the producer requires before considering the message written.
Notes: the messages are written to the filesystem cache and there is no guarantee about when they will be written to disk. Kafka does not wait for the data to get persisted to disk—it relies on replication for message durability.
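The meaning of the acks setting can be summarized as a small predicate (a simplification for illustration, not broker code; write_succeeded is a made-up name):

```python
def write_succeeded(acks, isr_acks, isr_size):
    """Has a write met the producer's acks requirement?

    acks=0     -> fire-and-forget: always counted as success
    acks=1     -> the leader has written the message
    acks='all' -> every in-sync replica has written the message
    """
    if acks == 0:
        return True
    if acks == 1:
        return isr_acks >= 1
    if acks == "all":
        return isr_acks >= isr_size
    raise ValueError(f"unknown acks setting: {acks!r}")

print(write_succeeded("all", isr_acks=2, isr_size=3))  # → False
```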
Consume msg
(acks = all): consumers can only read a message after all the in-sync replicas have received it, so if replication between brokers is slow for some reason, it will take longer for new messages to reach consumers.
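The rule "consumers only see fully replicated messages" comes from the high watermark: the minimum log-end offset across the in-sync replicas. A minimal sketch of the rule (function names are illustrative, not broker code):

```python
def high_watermark(isr_log_end_offsets):
    """Offsets below this value have been replicated to every ISR member."""
    return min(isr_log_end_offsets)

def visible_to_consumers(offset, isr_log_end_offsets):
    """Consumers may only fetch messages below the high watermark."""
    return offset < high_watermark(isr_log_end_offsets)

# leader is at offset 10, but the slowest in-sync follower is at 7,
# so consumers can read only offsets 0..6
print(high_watermark([10, 7, 9]))           # → 7
print(visible_to_consumers(8, [10, 7, 9]))  # → False
```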
replica.lag.time.max.ms - the amount of time a replica can be delayed in replicating new messages while still being considered in-sync.
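That membership check can be sketched as follows (a simplification; the 30 s default below matches recent Kafka versions but is an assumption here):

```python
def is_in_sync(last_caught_up_ms, now_ms, replica_lag_time_max_ms=30_000):
    """A follower stays in the ISR only if it has fully caught up with the
    leader within the last replica.lag.time.max.ms milliseconds."""
    return now_ms - last_caught_up_ms <= replica_lag_time_max_ms

print(is_in_sync(last_caught_up_ms=0, now_ms=10_000))  # → True
print(is_in_sync(last_caught_up_ms=0, now_ms=45_000))  # → False
```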
Retention
Kafka does not keep data forever, nor does it wait for all consumers to read a message before deleting it.
segment
Because finding the messages that need purging in a large file and then deleting a portion of the file is both time-consuming and error-prone, we instead split each partition into segments. By default, each segment contains either 1 GB of data or a week of data, whichever is smaller. As a Kafka broker is writing to a partition, if the segment limit is reached, we close the file and start a new one.
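The default rolling rule above can be sketched as a predicate (should_roll_segment is an illustrative name; in the real broker the limits come from log.segment.bytes and log.roll.ms / log.roll.hours):

```python
ONE_GIB = 1 << 30                      # default log.segment.bytes (~1 GiB)
ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000  # default log.roll.hours = 168

def should_roll_segment(segment_bytes, segment_age_ms,
                        max_bytes=ONE_GIB, max_age_ms=ONE_WEEK_MS):
    """Close the active segment once it holds ~1 GiB of data or a week's
    worth, whichever limit is hit first."""
    return segment_bytes >= max_bytes or segment_age_ms >= max_age_ms

print(should_roll_segment(segment_bytes=ONE_GIB, segment_age_ms=0))  # → True
print(should_roll_segment(segment_bytes=512, segment_age_ms=1000))   # → False
```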