Interview Prep
Interview Questions on Kafka — Topics, Partitions, Consumer Groups, and What Backend Interviews Actually Test
Apache Kafka is the backbone of event-driven architecture at companies like Flipkart, Swiggy, Razorpay, and most major Indian fintechs. If you are interviewing for a backend or data engineering role, expect Kafka questions. Here is what they ask.

Kafka processes trillions of messages daily at companies like LinkedIn, Uber, and Netflix. In India, nearly every large fintech and e-commerce company uses it.
Why Kafka Is on Every Backend Interview
Apache Kafka is a distributed event streaming platform used for real-time data pipelines and event-driven microservices. In India, it is used by most major product companies — Flipkart for order processing, Swiggy for delivery tracking, Razorpay for payment events, and PhonePe for transaction streaming. Service companies working with these clients also need Kafka expertise.
Kafka interviews test your understanding of distributed messaging, partitioning strategy, consumer group mechanics, and how to handle failures. This guide covers the questions that actually get asked — from architecture basics to production scenarios.
Kafka Architecture
Q1: What is Kafka? Explain its core components.
Kafka = distributed event streaming platform.

Core components:
- Producer → publishes messages to topics
- Consumer → reads messages from topics
- Broker → Kafka server that stores messages
- Topic → category/feed name for messages
- Partition → subdivision of a topic (parallelism)
- Offset → unique ID of a message within a partition
- ZooKeeper/KRaft → cluster coordination (metadata)

Architecture:

```
┌──────────┐      ┌─────────────────────┐      ┌──────────┐
│ Producer │ ──→  │    Kafka Cluster    │ ──→  │ Consumer │
│ Producer │ ──→  │ Broker 1 │ Broker 2 │ ──→  │ Consumer │
│ Producer │ ──→  │ Broker 3 │ Broker 4 │ ──→  │ Consumer │
└──────────┘      └─────────────────────┘      └──────────┘
```

Key characteristics:
- Distributed (runs on multiple servers)
- Fault-tolerant (data replicated across brokers)
- High throughput (millions of messages/second)
- Persistent (messages stored on disk)
- Pull-based (consumers pull, not push)
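The append-only log, per-partition offsets, and pull model can be sketched in a few lines. This is a toy in-memory model for intuition, not the real client API — `MiniTopic` and its methods are invented names:

```python
from collections import defaultdict

class MiniTopic:
    """Toy Kafka topic: each partition is an append-only log, and a
    message's offset is simply its index in that log."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self.partitions: dict[int, list[str]] = defaultdict(list)

    def produce(self, partition: int, message: str) -> int:
        assert 0 <= partition < self.num_partitions
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1  # offset of the appended message

    def fetch(self, partition: int, offset: int) -> list[str]:
        """Pull-based read: the consumer asks for messages from an offset."""
        return self.partitions[partition][offset:]

topic = MiniTopic(num_partitions=3)
topic.produce(0, "order-created")
topic.produce(0, "order-paid")

print(topic.fetch(0, offset=0))  # ['order-created', 'order-paid']
print(topic.fetch(0, offset=1))  # ['order-paid']
```

Note that fetching does not delete anything: a second consumer can read the same partition from offset 0, which is what makes replay possible.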
Q2: What are topics and partitions? Why partition?
Topic = logical channel for messages. Examples: "orders", "payments", "user-events".

Partition = physical subdivision of a topic.

Topic "orders" with 3 partitions:

```
Partition 0: [msg0, msg3, msg6, msg9...]
Partition 1: [msg1, msg4, msg7, msg10...]
Partition 2: [msg2, msg5, msg8, msg11...]
```

Why partition?
1. Parallelism — multiple consumers read simultaneously
2. Scalability — partitions spread across brokers
3. Ordering — messages within a partition are ordered (but NOT across partitions)

The partition key determines which partition a message goes to:
- Same key → same partition (ordering guaranteed)
- No key → round-robin (load balanced)

Example: partition by user_id
- All events for user_123 go to the same partition
- Guarantees order for that user's events
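The "same key → same partition" rule can be demonstrated with a short sketch. Kafka's default partitioner hashes the key with murmur2; `zlib.crc32` is used here purely as a dependency-free, deterministic stand-in:

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Stand-in for Kafka's key hashing (really murmur2 in the default
    # partitioner): hash the key, modulo the partition count.
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

partitions = defaultdict(list)
for event in ["login", "add-to-cart", "checkout"]:
    partitions[partition_for("user_123")].append(event)

# All of user_123's events landed in one partition, in produce order.
print(dict(partitions))
assert len(partitions) == 1
```

Because the hash of a given key never changes, every message with that key maps to the same partition, which is exactly how per-user ordering is preserved.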
Q3: What are consumer groups? How does rebalancing work?
Consumer Group = set of consumers that share the work.

Topic with 4 partitions, consumer group with 2 consumers:

```
Consumer A reads: Partition 0, Partition 1
Consumer B reads: Partition 2, Partition 3
```

Rules:
- Each partition is consumed by EXACTLY ONE consumer in a group
- One consumer can read multiple partitions
- If consumers > partitions, extra consumers are idle

Rebalancing happens when:
- A new consumer joins the group
- A consumer leaves (crash or shutdown)
- New partitions are added to the topic

During rebalancing:
- All consumers stop reading briefly
- Partitions are reassigned
- Consumers resume from the last committed offset

This is why you should not have more consumers than partitions — the extras just sit idle.
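The assignment rules above can be simulated with a round-robin assignor (one of the strategies Kafka's group coordinator supports; the consumer names and counts below are illustrative):

```python
def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Round-robin assignment: each partition goes to exactly one
    consumer in the group; consumers may own several partitions."""
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 4 partitions, 2 consumers: each consumer owns 2 partitions.
print(assign_partitions(4, ["A", "B"]))            # {'A': [0, 2], 'B': [1, 3]}

# A third consumer joining triggers a rebalance: just recompute.
print(assign_partitions(4, ["A", "B", "C"]))

# 5 consumers for 4 partitions: one consumer ends up with nothing.
print(assign_partitions(4, ["A", "B", "C", "D", "E"]))
```

Running the last call shows consumer E assigned an empty list, which is the "extra consumers sit idle" rule in action.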
Reliability and Delivery Guarantees
Q4: What are the delivery semantics in Kafka?
At-most-once:
- Message may be lost, never duplicated
- Consumer commits offset BEFORE processing
- If processing fails, the message is skipped
- Use case: metrics, logs (loss is acceptable)

At-least-once (default):
- Message is never lost, may be duplicated
- Consumer commits offset AFTER processing
- If the commit fails, the message is reprocessed
- Use case: most applications (handle duplicates)

Exactly-once:
- Message is processed exactly once
- Requires idempotent producer + transactional API
- Supported by Kafka since version 0.11
- Use case: financial transactions, billing

Producer acks setting:
- acks=0 → fire and forget (fastest, least safe)
- acks=1 → leader acknowledges (balanced)
- acks=all → all in-sync replicas acknowledge (safest, slowest)
Q5: What is the replication factor? How does Kafka handle broker failure?
Replication factor = number of copies of each partition.

Topic "orders" with replication factor 3:

```
Partition 0: Broker 1 (leader), Broker 2, Broker 3
Partition 1: Broker 2 (leader), Broker 3, Broker 1
```

Leader: handles all reads and writes for a partition.
Followers: replicate data from the leader.

ISR (In-Sync Replicas):
- Followers that are caught up with the leader
- If the leader fails, a new leader is elected from the ISR

Broker failure scenario:
1. Broker 1 goes down
2. Partitions where Broker 1 was leader need new leaders
3. Kafka elects new leaders from the ISR
4. Producers/consumers automatically redirect
5. When Broker 1 comes back, it catches up as a follower

Best practice: replication factor = 3, which tolerates 1 broker failure without data loss.
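The failure scenario can be sketched as a small simulation. This is a simplification: in real Kafka the cluster controller performs the election and picks the first surviving replica that is still in the ISR, which is roughly what the code below does:

```python
def elect_leader(isr: list[int], failed_broker: int) -> tuple[int, list[int]]:
    """Drop the failed broker from the ISR and promote the first
    surviving in-sync replica to leader."""
    new_isr = [b for b in isr if b != failed_broker]
    if not new_isr:
        raise RuntimeError("no in-sync replica left; partition is offline")
    return new_isr[0], new_isr

# Partition 0: replicas on brokers 1, 2, 3; broker 1 is the current leader.
leader, isr = elect_leader([1, 2, 3], failed_broker=1)
print(leader, isr)   # broker 2 takes over; ISR shrinks to [2, 3]
```

With replication factor 3 the partition stays available through this failure; only when the ISR empties out does the partition go offline, which is why losing two of three replicas at once is dangerous.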

Kafka's partition and replication model is what makes it fault-tolerant. Understanding this is critical for senior interviews.
Practical and Scenario Questions
Q6: How do you decide the number of partitions for a topic?
Factors:
1. Expected throughput — more partitions = more parallelism
2. Number of consumers — partitions should be ≥ consumers in the largest consumer group
3. Message ordering requirements — if you need global ordering, use 1 partition (but lose parallelism)
4. Broker capacity — each partition uses memory and file handles

Rule of thumb: start with number of brokers × number of consumers, then tune based on load testing.
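A commonly cited throughput-based heuristic combines factors 1 and 2: take the partition count needed on the produce side, the count needed on the consume side, and the size of the largest consumer group, and use the maximum. The per-partition MB/s figures below are placeholder assumptions you would replace with your own load-test numbers:

```python
import math

def estimate_partitions(target_mb_s: float,
                        produce_mb_s_per_partition: float,
                        consume_mb_s_per_partition: float,
                        max_consumers: int) -> int:
    """Starting-point estimate: enough partitions to hit the target
    throughput on both sides, and at least one per consumer in the
    largest consumer group."""
    by_produce = math.ceil(target_mb_s / produce_mb_s_per_partition)
    by_consume = math.ceil(target_mb_s / consume_mb_s_per_partition)
    return max(by_produce, by_consume, max_consumers)

# e.g. 100 MB/s target, ~10 MB/s producing and ~20 MB/s consuming
# per partition, largest consumer group has 6 consumers:
print(estimate_partitions(100, 10, 20, 6))   # 10
```

Treat the result as a floor to tune from, not a final answer — partitions are cheap to add later but cannot be removed, and keyed ordering breaks if you repartition.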
Q7: Kafka vs RabbitMQ — when to use which?
| Kafka | RabbitMQ |
| ----- | -------- |
| Log-based (append-only) | Queue-based (message deleted) |
| Pull model | Push model |
| High throughput (millions/s) | Lower throughput (thousands/s) |
| Messages persist on disk | Messages deleted after consume |
| Replay possible | No replay |
| Better for streaming | Better for task queues |
| Consumer groups | Competing consumers |

Use Kafka when:
- Event streaming / event sourcing
- High throughput is needed
- You need to replay messages
- Multiple consumers read the same data

Use RabbitMQ when:
- Task queue / work distribution
- Complex routing is needed
- Request-reply pattern
- Lower latency per message matters
Q8: How do you handle message ordering in Kafka?
Key insight: Kafka guarantees ordering WITHIN a partition, not across partitions. To ensure ordering for related messages, use the same partition key. Example: for an e-commerce order, use order_id as the partition key — all events for that order (created, paid, shipped, delivered) go to the same partition and are consumed in order.
If you need global ordering across all messages, use a single partition — but this eliminates parallelism and becomes a bottleneck at scale. The trade-off between ordering and throughput is a common interview discussion point.
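The trade-off can be made concrete with the e-commerce example above: key the four lifecycle events by order_id and they stay together and in order; send them unkeyed (round-robin) and they scatter across partitions with no cross-partition ordering. As before, `zlib.crc32` stands in for Kafka's actual key hash:

```python
import zlib

NUM_PARTITIONS = 4
events = [("order_42", "created"), ("order_42", "paid"),
          ("order_42", "shipped"), ("order_42", "delivered")]

# Keyed by order_id: every event lands in one partition, in order.
keyed = [[] for _ in range(NUM_PARTITIONS)]
for key, event in events:
    keyed[zlib.crc32(key.encode()) % NUM_PARTITIONS].append(event)

# Unkeyed: round-robin spreads the events across partitions.
round_robin = [[] for _ in range(NUM_PARTITIONS)]
for i, (_, event) in enumerate(events):
    round_robin[i % NUM_PARTITIONS].append(event)

print([p for p in keyed if p])        # [['created', 'paid', 'shipped', 'delivered']]
print([p for p in round_robin if p])  # [['created'], ['paid'], ['shipped'], ['delivered']]
```

In the unkeyed case, four consumers reading four partitions could easily process "delivered" before "paid", which is exactly the bug the partition key prevents.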
How to Prepare
Kafka Interview — Priority by Role
Backend Developer
- Producer/consumer basics
- Topics and partitions
- Consumer groups
- Delivery semantics
- Kafka vs RabbitMQ
Data Engineer
- Kafka Connect
- Kafka Streams
- Schema Registry (Avro)
- Exactly-once semantics
- Data pipeline design
DevOps / SRE
- Cluster management
- Replication & ISR
- Monitoring (lag, throughput)
- Broker failure handling
- Performance tuning