
The log and the record

The log

Kafka’s core structure is the commit log: an ordered, append-only sequence of records. A topic is a named log divided into partitions, each of which is an independent ordered log. Order is guaranteed only within a partition, where each record is assigned an offset that increases monotonically as records are appended; there is no ordering guarantee across the partitions of a topic. Partitioning is what lets a topic scale horizontally and be consumed in parallel.
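As a rough illustration of per-partition ordering, here is a toy in-memory model of a partitioned topic. It is a sketch, not Kafka's implementation: the `ToyTopic` class and its methods are hypothetical names, and the CRC32-based partitioner is an assumption standing in for the murmur2 hash that Kafka's default partitioner applies to keyed records.

```python
import zlib


class ToyTopic:
    """Toy model: a topic is a fixed set of independent append-only lists."""

    def __init__(self, num_partitions: int) -> None:
        # Each partition is its own ordered log of (key, value) pairs.
        self.partitions: list[list[tuple[bytes, bytes]]] = [
            [] for _ in range(num_partitions)
        ]

    def partition_for(self, key: bytes) -> int:
        # Assumption: CRC32 modulo partition count. Any stable hash works
        # for the illustration; equal keys always map to the same partition.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, key: bytes, value: bytes) -> tuple[int, int]:
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        offset = len(log)  # the next offset is the current length of the log
        log.append((key, value))
        return partition, offset


topic = ToyTopic(num_partitions=3)
placements = [
    topic.append(key, value)
    for key, value in [
        (b"user-1", b"login"),
        (b"user-2", b"login"),
        (b"user-1", b"logout"),
    ]
]
```

Both `user-1` records land in the same partition, and the second receives a higher offset than the first, so their relative order is preserved. Nothing in the model relates the position of the `user-2` record to either of them unless it happens to share their partition.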

The record

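A Kafka record is a self-contained unit of data in motion. It carries a key, a value, optional headers, the offset it was assigned within its partition, and a timestamp.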

Identity, payload, contextual headers, order, and time — that is what a datum needs to travel and be interpreted away from its source.
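One way to read those five properties is as the fields of a small immutable structure. The sketch below is illustrative only: `ToyRecord` and its field names are assumptions chosen to echo common client vocabulary, not a type taken from any Kafka library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyRecord:
    """Hypothetical shape of a record as a consumer might see it."""

    key: Optional[bytes]        # identity: which entity the record is about
    value: Optional[bytes]      # payload: the data itself
    headers: tuple[tuple[str, bytes], ...] = ()  # context: name/bytes pairs
    partition: int = 0          # which partition's log holds the record
    offset: int = 0             # order: position within that partition
    timestamp_ms: int = 0       # time: milliseconds since the Unix epoch


record = ToyRecord(
    key=b"order-42",
    value=b'{"status": "shipped"}',
    headers=(("trace-id", b"abc123"),),
    partition=0,
    offset=17,
    timestamp_ms=1_700_000_000_000,
)
```

Headers are modelled as a tuple of pairs rather than a dictionary because a header name may appear more than once on the same record.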

Reading

Consumers are not pushed records; they pull, each tracking its own offset. Because records persist (by time- or size-based retention), a consumer can start at the beginning, resume where it left off, or replay history. This is the departure from a traditional queue, where a message is consumed and gone: in Kafka the log is the durable record and reading is non-destructive.
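The pull model can be sketched with a toy consumer that keeps nothing but a position into a shared log. This is a minimal in-memory illustration, not a real client: `ToyConsumer` is a hypothetical class, and its `poll` and `seek` methods only borrow their names from the operations Kafka clients expose.

```python
class ToyConsumer:
    """Toy consumer: reads from a shared log and tracks its own position."""

    def __init__(self, log: list[str], start_offset: int = 0) -> None:
        self.log = log
        self.position = start_offset  # offset of the next record to read

    def poll(self, max_records: int = 10) -> list[str]:
        # Reading copies records out of the log; nothing is removed from it.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        # Move to any retained offset, for example 0 to replay history.
        self.position = offset


log = ["a", "b", "c", "d"]
fast = ToyConsumer(log)
slow = ToyConsumer(log)

first = fast.poll(max_records=3)    # fast reads three records
second = slow.poll(max_records=1)   # slow is unaffected by fast's progress

fast.seek(0)                        # rewind and replay from the beginning
replayed = fast.poll(max_records=10)
```

The two consumers advance independently, and after the rewind `fast` sees all four records again, because reading never removed anything from `log`.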

The log as a unifying abstraction

Kreps’ argument — the 2013 essay “The Log,” expanded as the short book I ❤ Logs — is that this same structure underlies database replication, state-machine replication, and data integration: an ordered log of changes is what keeps independent systems consistent and lets new ones be built by replaying it. Kafka makes that log a first-class, shared piece of infrastructure.
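The replay idea can be made concrete with a small sketch: if every replica applies the same ordered changes with the same deterministic function, every replica arrives at the same state. The change format and the names `apply_change` and `replay` are assumptions made for this example, not something taken from the essay or from Kafka.

```python
def apply_change(state: dict[str, int], change: tuple[str, str, int]) -> None:
    """Apply one change to the state, deterministically and in place."""
    op, key, amount = change
    if op == "set":
        state[key] = amount
    elif op == "add":
        state[key] = state.get(key, 0) + amount
    elif op == "delete":
        state.pop(key, None)


def replay(changes: list[tuple[str, str, int]]) -> dict[str, int]:
    """Rebuild state from scratch by applying the log in order."""
    state: dict[str, int] = {}
    for change in changes:
        apply_change(state, change)
    return state


change_log = [
    ("set", "alice", 100),
    ("add", "alice", -30),
    ("set", "bob", 5),
    ("delete", "bob", 0),
]

replica_a = replay(change_log)
replica_b = replay(change_log)  # a system that joins later replays the same log
```

Both replicas end up holding only `alice` with a balance of 70. The order of the log matters: applying the `add` before the `set` would give a different result, which is why an ordered log, rather than an unordered set of changes, is what keeps the copies consistent.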

Sources