Resetting a Consumer Group in Kafka
I’ve been using Replicator as a powerful way to copy data from my Kafka rig at home onto my laptop’s Kafka environment. It means that when I’m on the road I can continue to work with the same set of data and develop pipelines etc. With a VPN back home I can even keep them in sync directly if I want to.
I hit a problem the other day where Replicator was running, but I had no data in my target topics on my laptop. After a bit of head-scratching I realised that my local Kafka environment had been rebuilt (I use Docker Compose so complete rebuilds to start from scratch are easy), hence no data in the topic. But, even after restarting the Replicator Kafka Connect worker, I still had no data loaded into the empty topics. What was going on? Well Replicator acts as a consumer from the source Kafka cluster (on my home server), and so far as that Kafka cluster was concerned, Replicator had already read the messages. It thought that because even though I’d rebuilt everything on my laptop, Replicator was using the same connector name as before, and the connector name is used as the Consumer group name - which is how the source Kafka cluster keeps track of the offsets. So my "new" Kafka environment was going back to the source, which viewed it as the existing "old" one, which had already received the messages.