Interesting links - November 2025

Published by in Interesting Links at https://rmoff.net/2025/11/26/interesting-links-november-2025/

Welcome to the 10th edition of Interesting Links. I’ve got over a hundred links for you this month—all of them, IMHO, interesting :)

I’ll start off by shamelessly plugging the articles that I published this month:


RFC 🔗

For you youngsters: Request For Comments

This newsletter has grown, both in audience and number of links. Back in February there were fewer than two dozen links. This month, there’s nearly 150 😲.

I’d love to hear from you whether you would like to see fewer links, or if the current amount is about right. Also let me know if there are areas of which you want to see more (or less).

Use the comment section at the end of this article to feedback, or find me on Twitter, LinkedIn, etc.

Email? 🔗

Would you prefer to read this as an email? If there’s the appetite I’m happy to set something up, either just x-posting to Substack, or perhaps something self-hosted like ListMonk.

Again - leave a comment below, or find me online :)

Call for Papers - Current 2026 🔗

The Call for Papers for both Current London and Current Bengaluru are open, closing on December 22nd.

Tip

If you need a hand with writing your abstract, you might find these articles that I’ve written helpful:

And if you’re a speaker, check out the excellent article titled "The Silent Crowd" from Sam Harris which includes this important point (amongst others):

To change slides every thirty seconds is to be rendered nearly invisible by the apparatus. Having too many images can also force you to race to the end of your talk. A final flurry of slides and apologies depresses everyone.

Not got time for all this? I’ve marked 🔥 for my top reads of the month :)

Kafka and Event Streaming 🔗

Stream Processing 🔗

Data Platforms, Architectures, and Modelling 🔗

Data Engineering, Pipelines, and CDC 🔗

  • Is it too meta, in a list of interesting links, to link to a list of links? Regardless, this list from Faruk Tufekci of resources for analytics engineers is really useful.

  • Detailed articles from Jan Zedníček looking at how to use dbt to handle and implement SCD2.

  • Cutting over from historical to realtime data in a pipeline can be a tricky problem—Nicoleta Lazar from Fresha has a nice article detailing how they do it with Snowflake, Flink, and Airflow.

  • I’m a fan of the Write-Audit-Publish (WAP) pattern, and enjoyed this article from Soumil Shah showing how to do WAP with Amazon S3 Tables.

  • 🔥 An excellent roundup of the Q&A that Simon Späti, Mehdi Ouazza, Julien Hurault, and Ben Rogojan did based on common questions from Reddit’s r/dataengineering. Lots of useful content here.

  • LinkedIn’s Gaojie Liu and Jialin Liu explain how the ingestion pipeline for Venice works.

  • Hans-Peter Grahsl has published a nice Docker Compose to spin up Flink, Fluss, and LanceDB. The README has a good overview of how and why you might want to experiment with the particular stack.

  • The TinyETL project from Alex Nemeth looks interesting for simple full-load data movement between standard formats and RDBMS.

  • 🔥 Excellent detailed post from Andrew Zhang and Sanketh Balakrishna at Datadog explaining how they use Kafka Connect and Debezium to replicate from Postgres to Elasticsearch and Iceberg, including handling schemas and more.

  • 🔥 If the above article from Datadog whet your appetite for what you can build with Kafka Connect, you’ll love this practical and clear introduction to Kafka Connect and its components and concepts from Stefan Kecskes.

Open Table Formats (OTF), Catalogs, Lakehouses etc. 🔗

Lots of links in this category this month! I’ve split out some of the technology-specific stuff into their own sections below.

Apache Fluss 🔗

It’s not a table format…it’s not a lakehouse…it’s…Fluss ¯\_(ツ)_/¯ (If you’ve got a better category or mental-model for me to bucket it into, let me know in the comments below!)

  • Giannis Polyzos and Jark Wu have details of the Fluss 0.8 release

  • A useful overview from Alibaba of Fluss and Paimon; what they do, where they overlap, how to decide if they fit your requirements.

  • Real-world details of Fluss in action in this blog from Xinyu Zhang and Lilei Wang at TaoBao, looking in detail at why they adopted it and how they use it.

  • 🔥 The Future Data Systems Seminar Series from Carnegie Mellon University Database Research Group is a very cool free resource with weekly deep-dives from experts in the industry. The lecture on 8th December is from the original creator of Fluss, Jark Wu. All the talks are recorded and available online afterwards.

Apache Iceberg 🔗

Apache Hudi 🔗

RDBMS 🔗

General Data Stuff 🔗

AI 🔗

I warned you previously…this AI stuff is here to stay, and it’d be short-sighted to think otherwise. As I read and learn more about it, I’m going to share interesting links (the clue is in the blog post title) that I find—whilst trying to avoid the breathless hype and slop.

AI in the Enterprise 🔗

Coding with AI 🔗

Agents and MCP 🔗

  • 🔥 I love this practical example from Thomas Ptacek that demonstrates what an Agent actually is : You Should Write An Agent .

  • 🔥 12 Factor Agents is a very practical guide from Dex Horthy (modelled on the idea of 12 Factor Apps) looking at all the practical considerations you should have when designing and productionising LLM applications.

  • A useful list of Agentic Patterns from Philipp Schmid.

  • Viktor Gamov recently did an excellent talk looking at How MCP Bridges LLMs and Data Streams

  • What’s the difference between prompt engineering and context engineering? And what is context engineering and why does it matter so much? The team at Anthropic have written a good blog post looking at these questions and more.

And finally… 🔗

Nothing to do with data, but stuff that I’ve found interesting or has made me smile.

Think 🔗

Rant 🔗

Watch 🔗

Nerd 🔗


Note

Just a reminder - leave a comment 👇 🔗

  • Is the current amount of links in this newsletter about right, or would you like to see fewer?

  • Are there any areas of which you want to see more (or less)?

  • Would you prefer to read this as an email?

Leave a comment below, or find me online :)


TABLE OF CONTENTS