Producers & Consumers
Producers and consumers are the client applications that write to and read from Kafka. Their configuration — especially around acknowledgements and offset commits — determines the delivery guarantees your system actually provides. This page covers the concepts and shows working Java clients.
How producers work
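A producer publishes events to a topic. It batches records, optionally compresses them, selects a partition, and sends each batch to that partition’s leader broker. Records with a key are assigned by hashing the key, so the same key always lands on the same partition; keyless records are spread across partitions (round-robin in older clients, one batch at a time in newer ones). The producer then waits for an acknowledgement governed by the acks setting.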
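Key-based partition selection can be sketched in a few lines. This is a minimal illustration, not Kafka's code: the real default partitioner hashes the serialized key with murmur2, while this sketch substitutes Arrays.hashCode as a stand-in. The class name PartitionSketch and the partition count of 3 are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PartitionSketch {
    // Stand-in for Kafka's murmur2-based key hashing (assumption: any stable hash shows the idea).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Math.floorMod keeps the result in [0, numPartitions) even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 3; // hypothetical partition count
        for (String key : new String[] {"customer-42", "customer-7", "customer-42"}) {
            System.out.printf("key=%s -> partition %d%n", key, partitionFor(key, partitions));
        }
    }
}
```

Both lines for customer-42 report the same partition, which is what preserves per-key ordering. Changing the partition count changes the mapping, so keys can move when a topic is repartitioned.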
The acks config is the central durability lever:
- acks=0: fire and forget. The producer does not wait; fastest, but events can be lost.
- acks=1: wait for the leader to write. Events can be lost if the leader fails before followers replicate.
- acks=all: wait for all in-sync replicas. Strongest durability; survives broker loss.
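Combine acks=all with enable.idempotence=true (the default in modern clients) to get exactly-once producing per partition: retries cannot create duplicates, and durability is not sacrificed. This covers the producer's writes only; atomic writes across partitions and the consumer side need the techniques described under delivery semantics.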
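To see why idempotence removes retry duplicates, it helps to model the broker side. With idempotence enabled, each batch carries a producer ID and a sequence number, and the partition leader discards a sequence it has already accepted. The sketch below is a deliberately simplified model of that check (the real broker also tracks producer epochs and rejects out-of-order sequences); the class IdempotentAppendSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdempotentAppendSketch {
    private final List<String> log = new ArrayList<>();
    // Highest sequence number accepted so far, per producer id.
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    /** Appends the record unless this producer already delivered the same sequence. */
    boolean append(long producerId, int sequence, String value) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // duplicate caused by a retry: do not append again
        }
        log.add(value);
        lastSequence.put(producerId, sequence);
        return true;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        IdempotentAppendSketch partition = new IdempotentAppendSketch();
        partition.append(1L, 0, "order-a");
        partition.append(1L, 1, "order-b");
        partition.append(1L, 1, "order-b"); // retry after a lost acknowledgement
        System.out.println(partition.log()); // prints [order-a, order-b]
    }
}
```

The retried record is acknowledged without being written twice, so the producer can retry freely without affecting what consumers see.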
Delivery semantics
The end-to-end guarantee depends on producer settings and how the consumer commits offsets.
| Semantic | Meaning | How to achieve |
|---|---|---|
| At-most-once | Events may be lost, never duplicated | acks=0/1; commit offset before processing |
| At-least-once | Events never lost, may be duplicated | acks=all + retries; commit offset after processing |
| Exactly-once | Each event processed once, no loss or dupes | Idempotent producer + transactions, or transactional read-process-write |
At-least-once is the common default. Exactly-once across read-process-write pipelines uses Kafka transactions (transactional.id) or the idempotent consumer pattern with deduplication in your store.
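The idempotent consumer pattern mentioned above can be sketched as a store that remembers which event IDs it has applied. This is an in-memory illustration under assumptions: the class DedupingStoreSketch, the event ID format, and the charge logic are all hypothetical, and a real store would need to record the ID and apply the change in one atomic transaction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class DedupingStoreSketch {
    private final Set<String> processedEventIds = new HashSet<>();
    private final Map<String, Double> balanceByCustomer = new HashMap<>();

    /** Applies the charge once per eventId; returns false for a redelivered event. */
    boolean applyCharge(String eventId, String customer, double amount) {
        // Set.add returns false when the id is already present.
        if (!processedEventIds.add(eventId)) {
            return false;
        }
        balanceByCustomer.merge(customer, amount, Double::sum);
        return true;
    }

    double balance(String customer) {
        return balanceByCustomer.getOrDefault(customer, 0.0);
    }

    public static void main(String[] args) {
        DedupingStoreSketch store = new DedupingStoreSketch();
        store.applyCharge("evt-1", "customer-42", 99.50);
        store.applyCharge("evt-1", "customer-42", 99.50); // redelivery after a crash
        System.out.println(store.balance("customer-42")); // prints 99.5
    }
}
```

With this in place, at-least-once delivery produces an effectively-once result: the second delivery is detected and skipped.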
How consumers work
A consumer subscribes to topics and polls for records in a loop. It belongs to a consumer group identified by group.id; Kafka assigns partitions across the group’s members. The consumer periodically commits the offset of records it has processed so it can resume after a restart or rebalance.
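Disable auto-commit and commit manually after successful processing. Auto-commit runs on a timer inside poll(), so it can advance the offset past records your handler has not finished (for example, when processing is handed to another thread), turning a crash into silent data loss.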
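The effect of commit ordering can be shown without a broker. The sketch below simulates a consumer that crashes once while handling a record and then restarts from its committed offset. It is a toy model, not Kafka client code: the class CommitOrderSketch, the run method, and the crash position are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CommitOrderSketch {
    /**
     * Replays a consumer that crashes once while handling the record at crashIndex,
     * then restarts from the committed offset. Returns the records handled successfully.
     */
    static List<String> run(List<String> records, int crashIndex, boolean commitBeforeProcessing) {
        List<String> handled = new ArrayList<>();
        int committed = 0;        // next offset to read after a restart
        boolean crashed = false;
        while (committed < records.size()) {
            int offset = committed;
            if (commitBeforeProcessing) {
                committed = offset + 1;   // offset advances before the handler runs
            }
            if (offset == crashIndex && !crashed) {
                crashed = true;           // simulated crash inside the handler
                continue;                 // restart: resume from the committed offset
            }
            handled.add(records.get(offset));
            if (!commitBeforeProcessing) {
                committed = offset + 1;   // offset advances only after success
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        List<String> records = List.of("r0", "r1", "r2");
        System.out.println(run(records, 1, true));  // prints [r0, r2]
        System.out.println(run(records, 1, false)); // prints [r0, r1, r2]
    }
}
```

Committing first loses r1; committing after processing recovers it. In this model the crash happens before the handler has any effect. If it happened after the effect but before the commit, the record would be handled twice on restart, which is the duplicate side of at-least-once.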
A Java producer
```java
import org.apache.kafka.clients.producer.*;
import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Durable, duplicate-free writes: wait for all in-sync replicas and dedupe retries.
        props.put("acks", "all");
        props.put("enable.idempotence", "true");

        // try-with-resources closes the producer, which flushes any buffered records.
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("orders", "customer-42", "{\"amount\": 99.50}");
            // send() is asynchronous; the callback runs once the broker responds.
            producer.send(record, (metadata, ex) -> {
                if (ex != null) {
                    ex.printStackTrace();
                } else {
                    System.out.printf("Sent to partition %d at offset %d%n",
                        metadata.partition(), metadata.offset());
                }
            });
        }
    }
}
```
Output (the partition and offset depend on your cluster):
Sent to partition 1 at offset 7
A Java consumer
```java
import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.*;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "billing");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Offsets are committed explicitly below, after processing.
        props.put("enable.auto.commit", "false");
        // With no committed offset for this group, start from the beginning of the topic.
        props.put("auto.offset.reset", "earliest");

        try (Consumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            // Runs until the process is stopped.
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("key=%s value=%s partition=%d offset=%d%n",
                        r.key(), r.value(), r.partition(), r.offset());
                }
                consumer.commitSync(); // commit AFTER processing this batch
            }
        }
    }
}
```
Run multiple instances with the same group.id to scale out — Kafka splits the partitions across them automatically, and rebalances when instances come and go.
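How partitions are shared within a group can be illustrated with a simple dealing scheme. This is not one of Kafka's built-in assignors, only a round-robin sketch of the idea; the class GroupAssignmentSketch, the member names, and the partition count are hypothetical, and the member list is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupAssignmentSketch {
    /** Deals partitions 0..numPartitions-1 to the members in turn. */
    static Map<String, List<Integer>> assign(List<String> members, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String member : members) {
            assignment.put(member, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = members.get(partition % members.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("billing-1", "billing-2"), 3));
        // prints {billing-1=[0, 2], billing-2=[1]}
        System.out.println(assign(List.of("billing-1", "billing-2", "billing-3", "billing-4"), 3));
        // prints {billing-1=[0], billing-2=[1], billing-3=[2], billing-4=[]}
    }
}
```

The second call shows the practical limit on scaling out: each partition has one owner within a group, so instances beyond the partition count sit idle.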
Best Practices
- Use acks=all with the idempotent producer for durable, duplicate-free writes.
- Design consumers to be idempotent so at-least-once redelivery is harmless.
- Commit offsets manually, only after processing succeeds.
- Set a sensible auto.offset.reset (earliest for completeness, latest for live-only).
- Tune max.poll.records and max.poll.interval.ms so slow processing does not trigger rebalances.
- Reserve exactly-once (transactions) for pipelines that genuinely need it; it adds overhead.
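One way to reason about the two poll settings is to size the interval from the batch size and a worst-case per-record processing time. The sketch below does that arithmetic and returns the overrides as plain Properties. The helper pollTuning, the doubling safety margin, and the sample numbers are assumptions for illustration; only the two config keys come from Kafka.

```java
import java.util.Properties;

class PollTuningSketch {
    /** Consumer overrides sized so one full batch fits inside the poll interval. */
    static Properties pollTuning(int maxPollRecords, long worstCaseMillisPerRecord) {
        // Time to process one full batch, doubled as a safety margin (assumption).
        long intervalMs = 2L * maxPollRecords * worstCaseMillisPerRecord;
        Properties props = new Properties();
        props.put("max.poll.records", Integer.toString(maxPollRecords));
        props.put("max.poll.interval.ms", Long.toString(intervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical workload: batches of 100 records, up to 200 ms each.
        Properties props = pollTuning(100, 200);
        System.out.println(props.getProperty("max.poll.interval.ms")); // prints 40000
    }
}
```

If the consumer does not call poll() again within max.poll.interval.ms, it is removed from the group and its partitions are reassigned, so either lower the batch size or raise the interval when handlers are slow.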