Apache Kafka core 3 min read

Producers & Consumers

Producers and consumers are the client applications that write to and read from Kafka. Their configuration — especially around acknowledgements and offset commits — determines the delivery guarantees your system actually provides. This page covers the concepts and shows working Java clients.

How producers work

A producer publishes events to a topic. It batches records, optionally compresses them, selects a partition (by key hash or round-robin), and sends them to the partition’s leader broker. The producer waits for an acknowledgement governed by the acks setting.

The acks config is the central durability lever:

acks=0 — fire and forget. The producer does not wait; fastest, but events can be lost.
acks=1 — wait for the leader to write. Lost if the leader fails before followers replicate.
acks=all — wait for all in-sync replicas. Strongest durability; survives broker loss.

Combine acks=all with enable.idempotence=true (the default in modern clients) to get exactly-once producing — no duplicates from retries — without sacrificing durability.

Delivery semantics

The end-to-end guarantee depends on producer settings and how the consumer commits offsets.

Semantic	Meaning	How to achieve
At-most-once	Events may be lost, never duplicated	`acks=0/1`; commit offset before processing
At-least-once	Events never lost, may be duplicated	`acks=all` + retries; commit offset after processing
Exactly-once	Each event processed once, no loss or dupes	Idempotent producer + transactions, or transactional read-process-write

At-least-once is the common default. Exactly-once across read-process-write pipelines uses Kafka transactions (transactional.id) or the idempotent consumer pattern with deduplication in your store.

How consumers work

A consumer subscribes to topics and polls for records in a loop. It belongs to a consumer group identified by group.id; Kafka assigns partitions across the group’s members. The consumer periodically commits the offset of records it has processed so it can resume after a restart or rebalance.

Disable auto-commit and commit manually after successful processing. Auto-commit can advance the offset before your handler finishes, turning a crash into silent data loss.

A Java producer

import org.apache.kafka.clients.producer.*;
import java.util.Properties;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");
        props.put("enable.idempotence", "true");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("orders", "customer-42", "{\"amount\": 99.50}");

            producer.send(record, (metadata, ex) -> {
                if (ex != null) ex.printStackTrace();
                else System.out.printf("Sent to partition %d at offset %d%n",
                    metadata.partition(), metadata.offset());
            });
        }
    }
}

Output:

Sent to partition 1 at offset 7

A Java consumer

import org.apache.kafka.clients.consumer.*;
import java.time.Duration;
import java.util.*;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "billing");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("enable.auto.commit", "false");
        props.put("auto.offset.reset", "earliest");

        try (Consumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("key=%s value=%s partition=%d offset=%d%n",
                        r.key(), r.value(), r.partition(), r.offset());
                }
                consumer.commitSync();   // commit AFTER processing this batch
            }
        }
    }
}

Run multiple instances with the same group.id to scale out — Kafka splits the partitions across them automatically, and rebalances when instances come and go.

Best Practices

Use acks=all with the idempotent producer for durable, duplicate-free writes.
Design consumers to be idempotent so at-least-once redelivery is harmless.
Commit offsets manually, only after processing succeeds.
Set a sensible auto.offset.reset (earliest for completeness, latest for live-only).
Tune max.poll.records and max.poll.interval.ms so slow processing does not trigger rebalances.
Reserve exactly-once (transactions) for pipelines that genuinely need it — it adds overhead.