rmoff's random ramblings

Kafka Connect - Deep Dive into Single Message Transforms

Published Jan 4, 2021 by Robin Moffatt in Kafka Connect, Single Message Transform, TwelveDaysOfSMT at https://rmoff.net/2021/01/04/kafka-connect-deep-dive-into-single-message-transforms/

KIP-66 was added in Apache Kafka 0.10.2 and brought new functionality called Single Message Transforms (SMT). Using SMT you can modify the data and its characteristics as it passes through a Kafka Connect pipeline, without needing an additional stream processor. For things like manipulating fields, changing topic names, conditionally dropping messages, and more, SMT are a great fit. Because an SMT only ever sees one message at a time, once you get to things like aggregation, joining streams, and lookups, SMT are probably not the right tool and you should head over to Kafka Streams or ksqlDB instead.
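To make that a bit more concrete, here's a rough sketch of how an SMT chain is expressed in a connector's configuration: a `transforms` key listing aliases in the order they are applied, plus `transforms.<alias>.type` and per-transform options. The helper `with_transforms`, the aliases, the file path, and the topic name are all made up for illustration; the class names and option keys are the ones that ship with Apache Kafka as far as I'm aware, but do check them against your version.

```python
import json


def with_transforms(base_config, transforms):
    """Return a copy of base_config with an SMT chain added.

    `transforms` is an ordered list of (alias, class_name, options) tuples.
    This helper is hypothetical; Kafka Connect only sees the resulting keys.
    """
    config = dict(base_config)
    # Aliases are listed in the order the transforms are applied.
    config["transforms"] = ",".join(alias for alias, _, _ in transforms)
    for alias, class_name, options in transforms:
        prefix = f"transforms.{alias}."
        config[prefix + "type"] = class_name
        for key, value in options.items():
            config[prefix + key] = value
    return config


# Placeholder sink connector config (file path and topic are assumptions).
base = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "file": "/tmp/transactions.txt",
    "topics": "transactions",
}

config = with_transforms(
    base,
    [
        # Add a static field to every record value.
        (
            "addSource",
            "org.apache.kafka.connect.transforms.InsertField$Value",
            {"static.field": "source", "static.value": "kafka-connect"},
        ),
        # Then rewrite the topic name on the record.
        (
            "rename",
            "org.apache.kafka.connect.transforms.RegexRouter",
            {"regex": "(.*)", "replacement": "$1-processed"},
        ),
    ],
)

print(json.dumps(config, indent=2))
```

The printed JSON is the sort of thing you would put in the `config` section of a request to the Kafka Connect REST API.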

I recently completed a twelve-day exercise of digging into many of the Single Message Transforms that are available - almost all of them ship with Apache Kafka itself. For each one I recorded a video, wrote up a blog post detailing the SMT, and built a test environment in Docker so that you can go and try it out too :-)

✨ The Highlights ✨

SMT as a concept are a highlight of Kafka Connect in themselves, but here are a handful of the ones that I thought were particularly neat:

  • Adding the timestamp of a field to the topic name

  • Filtering out null records

  • Conditionally renaming fields based on the topic name

  • Changing the topic name to which a source connector writes

  • Changing the data type of fields as they pass through Kafka Connect
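As a hedged sketch of what four of these look like in practice (filtering nulls, conditional renames, topic routing, and casting), here are config fragments using the transforms and predicates bundled with Apache Kafka. The aliases, field names, and topic patterns are invented for illustration, and `merge` is a hypothetical helper rather than anything Kafka Connect provides. Predicates need Apache Kafka 2.6 or later.

```python
# Drop tombstone (null-value) records using Filter plus a predicate.
drop_nulls = {
    "transforms": "dropNulls",
    "transforms.dropNulls.type": "org.apache.kafka.connect.transforms.Filter",
    "transforms.dropNulls.predicate": "isTombstone",
    "predicates": "isTombstone",
    "predicates.isTombstone.type":
        "org.apache.kafka.connect.transforms.predicates.RecordIsTombstone",
}

# Rename a field, but only on topics whose name matches a pattern.
rename_on_topic = {
    "transforms": "renameField",
    "transforms.renameField.type":
        "org.apache.kafka.connect.transforms.ReplaceField$Value",
    "transforms.renameField.renames": "txn_date:transaction_date",
    "transforms.renameField.predicate": "isTransactions",
    "predicates": "isTransactions",
    "predicates.isTransactions.type":
        "org.apache.kafka.connect.transforms.predicates.TopicNameMatches",
    "predicates.isTransactions.pattern": "transactions-.*",
}

# Change the topic name that a source connector writes to.
reroute = {
    "transforms": "stripPrefix",
    "transforms.stripPrefix.type":
        "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.stripPrefix.regex": "mysql-(.*)",
    "transforms.stripPrefix.replacement": "$1",
}

# Change the data type of fields in the record value.
cast_types = {
    "transforms": "castTypes",
    "transforms.castTypes.type":
        "org.apache.kafka.connect.transforms.Cast$Value",
    "transforms.castTypes.spec": "cost:float64,units:int32",
}


def merge(*fragments):
    """Combine fragments, joining the `transforms` and `predicates` lists."""
    merged = {}
    for fragment in fragments:
        for key, value in fragment.items():
            if key in ("transforms", "predicates") and key in merged:
                merged[key] = merged[key] + "," + value
            else:
                merged[key] = value
    return merged


smt_config = merge(drop_nulls, rename_on_topic, reroute, cast_types)
for key in sorted(smt_config):
    print(f"{key}={smt_config[key]}")
```

Order matters when you chain transforms like this: here the rename predicate is evaluated before the topic is rerouted, so its pattern has to match the original topic name.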

🎥 Videos Playlist

[Image: smtplaylist]

👾 Code

You can grab the Docker Compose and tutorial files on GitHub.

📝 The Complete List

Here are links to the blogs and videos of each Single Message Transform:

  • Community Transformations

  • Predicate and Filter

  • ReplaceField

  • Cast

  • TimestampConverter

  • TimestampRouter

  • InsertField II

  • MaskField

  • RegexRouter

  • Flatten

  • ValueToKey and ExtractField

  • InsertField (timestamp)


Robin Moffatt

Robin Moffatt is a Principal DevEx Engineer at LakeFS. He likes writing about himself in the third person, eating good breakfasts, and drinking good beer.
