Prompted by a question on StackOverflow I thought I’d take a quick look at setting up ksqlDB to ingest CDC events from Microsoft SQL Server using Debezium. Some of this is based on my previous article, Streaming data from SQL Server to Kafka to Snowflake ❄️ with Kafka Connect.
Setting up the Docker Compose I like standalone, repeatable, demo code. For that reason I love using Docker Compose and I embed everything in there - connector installation, the kitchen sink - the works.
I use Hugo for my blog, hosted on GitHub pages. One of the reasons I’m really happy with it is that I can use Asciidoc to author my posts. I was writing a blog recently in which I wanted to include some code that’s hosted on GitHub. I could have copied & pasted it into the blog but that would be lame!
With Asciidoc you can use the include:: directive to include both local files:
There’s ways, and then there’s ways, to count the number of records/events/messages in a Kafka topic. Most of them are potentially inaccurate, or inefficient, or both. Here’s one that falls into the potentially inefficient category, using kafkacat to read all the messages and pipe to wc which with the -l will tell you how many lines there are, and since each message is a line, how many messages you have in the Kafka topic:
Google Chrome automagically adds sites that you visit which support searching to a list of custom search engines. For each one you can set a keyword which activates it, so based on the above list if I want to search Amazon I can just type a<tab> and then my search term
At the beginning of all this my aim was to learn something new (Go), and use it to write a version of a utility that I’d previously hacked together in Python that checks your Apache Kafka broker configuration for possible problems with the infamous advertised.listeners setting. Check out a blog that I wrote which explains all about Apache Kafka and listener configuration.
You can find the code at https://github.
So far I’ve been running all my code either in the Go Tour sandbox, using Go Playground, or from a single file in VS Code. My explorations in the previous article ended up with a a source file that was starting to get a little bit unwieldily, so let’s take a look at how that can be improved.
Within my most recent code, I have the main function and the doProduce function, which is fine when collapsed down:
Having ticked off the basics with an Apache Kafka producer and consumer in Go, let’s now check out the AdminClient. This is useful for checking out metadata about the cluster, creating topics, and stuff like that.
Last time I looked at creating my first Apache Kafka consumer in Go, which used the now-deprecated channel-based consumer. Whilst idiomatic for Go, it has some issues which mean that the function-based consumer is recommended for use instead. So let’s go and use it!
Instead of reading from the Events() channel of the consumer, we read events using the Poll() function with a timeout. The way we handle events (a switch based on their type) is the same:
I looked last time at the very bare basics of writing a Kafka producer using Go. It worked, but only with everything lined up and pointing the right way. There was no error handling of any sorts. Let’s see about fixing this now.
In the previous exercise I felt my absence of a formal CompSci background with the introduction of Binary Sorted Trees, and now I am concious of it again with learning about mutex. I’d heard of them before, mostly when Oracle performance folk were talking about wait types - TIL it stands for mutual exclusion!
I’ve been playing around with the new SerDes (serialisers/deserialisers) that shipped with Confluent Platform 5.5 - Protobuf, and JSON Schema (these were added to the existing support for Avro). The serialisers (and associated Kafka Connect converters) take a payload and serialise it into bytes for sending to Kafka, and I was interested in what those bytes look like. For that I used my favourite Kafka swiss-army knife: kafkacat.