
Stumbling into AI: Part 3—RAG
A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents.
Having poked around MCP and Models, next up is RAG.
RAG has been one of the buzzwords of the last couple of years, with any vendor worth its salt finding a way to crowbar it into its product. I’d been sufficiently put off it by the hype to steer away from actually understanding what it is. In this blog post, let’s fix that, because if I’ve understood it correctly, it’s a pattern that’s not scary at all.
Stumbling into AI: Part 2—Models
A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents.
Having poked around MCP and got a broad idea of what it is, I want to next look at Models. What used to be as simple as "I used AI" actually breaks down into several discrete areas, particularly when one starts using LLMs for more than writing a rap about Apache Kafka in the style of Monty Python and starts using them to build agents (like the Flink Agents that prompted this exploration in the first place).
Stumbling into AI: Part 1—MCP
A short series of notes for myself as I learn more about the AI ecosystem as of September 2025. The driver for all this is understanding more about Apache Flink’s Flink Agents project, and Confluent’s Streaming Agents.
The first thing I want to understand better is MCP.
Interesting links - August 2025
Not got time for all this? I’ve marked 🔥 for my top reads of the month :)
Kafka to Iceberg - Exploring the Options
You’ve got data in Apache Kafka.
You want to get that data into Apache Iceberg.
What’s the best way to do it?

Perhaps inevitably, the answer is: IT DEPENDS. But fear not: here is a guide to help you navigate your way to choosing the best solution for you 🫵.
Connecting Apache Flink SQL to Confluent Cloud Kafka broker
This is a quick blog post to remind me how to connect Apache Flink to a Kafka topic on Confluent Cloud. You may wonder why you’d want to do this, given that Confluent Cloud for Apache Flink is a much easier way to run Flink SQL. But, for whatever reason, you’re here and you want to understand the necessary incantations to get this connectivity to work.
Interesting links - July 2025
Not got time for all this? I’ve marked 🔥 for my top reads of the month :)
Keeping your Data Lakehouse in Order: Table Maintenance in Apache Iceberg
Iceberg nicely decouples storage from ingest and query (yay!). When we say "decouples" it’s a fancy way of saying "doesn’t do". Which, in the case of ingest and query, is really powerful. It means that we can store data in an open format, populated by one or more tools, and queried by the same, or other tools. Iceberg gets to be very opinionated and optimised around what it was built for (storing tabular data in a flexible way that can be efficiently queried). This is amazing!
But, what Iceberg doesn’t do is any housekeeping on its data and metadata. This means that getting data in and out of Apache Iceberg isn’t where the story stops.
Writing to Apache Iceberg on S3 using Kafka Connect with Glue catalog
Without wanting to mix my temperature metaphors, Iceberg is the new hawtness, and getting data into it from other places is a common task. I wrote previously about using Flink SQL to do this, and today I’m going to look at doing the same using Kafka Connect.
Kafka Connect can send data to Iceberg from any Kafka topic. The source Kafka topic(s) can be populated by a Kafka Connect source connector (such as Debezium), or a regular application producing directly to it.