Kafka Connect Change Log Level and Write Log to File
By default Kafka Connect sends its output to stdout, so you’ll see it on the console, Docker logs, or wherever. Sometimes you might want to route it to a file, and you can do this by reconfiguring log4j. You can also get more (or less) detail in the logs by changing the log level in the same configuration.
Finding the log configuration file
The configuration file is called connect-log4j.properties and is usually found in etc/kafka/connect-log4j.properties.
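As a rough sketch of what that reconfiguration can look like, the snippet below writes a sample log4j 1.x properties file that sends output to both the console and a rolling log file. The output path, the appender name file, and the log location /tmp/connect-worker.log are assumptions for illustration. Newer Kafka versions may ship a different log4j configuration format, so compare it against your own connect-log4j.properties before copying anything across.

```shell
# Sketch only: write an example config next to (not over) the real one.
# The target path below is an illustrative assumption.
conf="${TMPDIR:-/tmp}/connect-log4j.properties.example"

cat > "$conf" <<'EOF'
# Change INFO to DEBUG or TRACE for more detail, WARN or ERROR for less
log4j.rootLogger=INFO, stdout, file

# Keep logging to the console
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c:%L)%n

# Also log to a file, rolling it over at 10MB and keeping 5 old copies
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/tmp/connect-worker.log
log4j.appender.file.MaxFileSize=10MB
log4j.appender.file.MaxBackupIndex=5
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=[%d] %p %m (%c:%L)%n
EOF
```

The root logger line is the one to touch for the log level. The appender list after it decides where the output goes.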
Replacing UTF8 non-breaking-space with bash/sed on the Mac
A script I’d batch-run on my Markdown files had inserted a UTF-8 non-breaking space between the Markdown heading indicator and the text, which meant that # My title actually got rendered as that literal text, instead of an H3 title.
Looking at the file contents, I could see it wasn’t just a space between the # and the text, but a non-breaking space.
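Here is a minimal sketch of one way to do the replacement with sed. The function name strip_nbsp and the filenames in the usage line are made up for illustration. It relies on the UTF-8 encoding of a non-breaking space being the two bytes 0xC2 0xA0.

```shell
# Build the two-byte UTF-8 non-breaking space (0xC2 0xA0) portably
nbsp=$(printf '\302\240')

# Print the given file with every non-breaking space replaced by a
# plain space. LC_ALL=C makes sed match raw bytes whatever the locale.
strip_nbsp() {
  LC_ALL=C sed "s/${nbsp}/ /g" "$1"
}
```

Usage would be something like strip_nbsp post.md > post.fixed.md. Writing to a new file sidesteps the differences in how in-place editing (-i) works between the sed on the Mac and GNU sed.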
How KSQL handles case
KSQL is generally case-sensitive. Very sensitive, at times ;-)
KSQL REST API cheatsheet
Full reference is here
Confluent Schema Registry REST API cheatsheet
The Schema Registry supports a REST API for finding out information about the schemas within it. Here’s a quick cheatsheet with REST calls that I often use.
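To give a flavour, here is a small sketch of wrappers around three of the read-only Schema Registry endpoints. The function names and the SCHEMA_REGISTRY_URL variable are made up for this example, and the default of http://localhost:8081 is an assumption about where your registry is listening.

```shell
# Hypothetical helper: build a Schema Registry URL from a path.
# SCHEMA_REGISTRY_URL is an assumed variable, defaulting to localhost:8081.
sr_url() {
  printf '%s%s\n' "${SCHEMA_REGISTRY_URL:-http://localhost:8081}" "$1"
}

# List all subjects
sr_subjects() {
  curl -s "$(sr_url /subjects)"
}

# List the version numbers registered under a subject
sr_versions() {
  curl -s "$(sr_url "/subjects/$1/versions")"
}

# Fetch the latest schema registered under a subject
sr_latest() {
  curl -s "$(sr_url "/subjects/$1/versions/latest")"
}
```

For example, sr_latest my-topic-value would fetch the newest schema for a subject called my-topic-value (a made-up name).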
What to Do When Docker on the Mac Runs Out of Space
I use Docker and Docker Compose a lot. Like, every day. It’s a fantastic way to build repeatable demos and examples, that can be torn down and spun up in a repeatable way. But…what happens when the demo that was working is spun up and then tail spins down in a blaze of flames?
Quick Thoughts on Not Writing a Crap Abstract
I’ve reviewed a bunch of abstracts in the last couple of days. Here are some common suggestions I made:
- No need to include your company name in the abstract text. Chances are I’ve not heard of your company, and even if I have, what does it add to my comprehension of your abstract and what you’re going to talk about? A possible exception would be the "hot" tech companies where people will see a talk just because it’s Netflix etc.
- I really don’t want just to read your project documentation/summary. It makes me worry your talk will be death by PowerPoint of the minutiae of something that’s only relevant in your company.
- Following on from above, I want to see that there are going to be things you’ll share that are useful for other people in a similar situation. Something that’s specific to your project or your company doesn’t translate to mass-usefulness. Something that other people will hit, whether it’s technical or org-cultural, now that is interesting and is going to be useful.
- If my eyes start to glaze over reading the abstract intro, already I’m assuming that your talk will make me bored too. Read it back out loud to yourself…make sure each word justifies its place in the text. Boilerplate filler and waffle should be left on the cutting room floor.
- You need to strike a balance between giving enough detail about the contents of your talk that I am convinced you have interesting things to share, but without listing every nut and bolt of detail. Too much detail and it just becomes a laundry list. You need to whet people’s appetite for the actual meal, not put them off their food.
- For heaven’s sake, proofread! If you can’t be arsed to use a spell checker, then I definitely wouldn’t trust you to prepare a talk of any quality. I’ve recently started using Grammarly and it’s excellent.
Moving from Ghost to Hugo
Why?
I’ve been blogging for quite a few years now, starting on Blogger, soon onto WordPress, and then to Ghost a couple of years ago. Blogger was fairly lame, WP yucky, but I really do like Ghost. It’s simple and powerful and was perfect for my needs. My needs being, an outlet for technical content that respected formatting, worked with a markup language (Markdown), and didn’t f**k things up in the way that WP often would in its WYSIWYG handling of content.
Pull new version of multiple Docker images
Tiny little snippet this one. Given a list of images:
$ docker images|grep confluent
confluentinc/cp-enterprise-kafka 5.0.0 d0c5528d7f99 3 months ago 600MB
confluentinc/cp-kafka 5.0.0 373a4e31e02e 3 months ago 558MB
confluentinc/cp-zookeeper 5.0.0 3cab14034c43 3 months ago 558MB
confluentinc/cp-ksql-server 5.0.0 691bc3c1991f 4 months ago 493MB
confluentinc/cp-ksql-cli 5.0.0 e521f3e787d6 4 months ago 488MB
…Now there’s a new version available, and you want to pull down all the latest ones for it:
docker images|grep "^confluentinc"|awk '{print $1}'|xargs -Ifoo docker pull foo:5.1.0
Docker Tips and Tricks with Kafka Connect, ksqlDB, and Kafka
A few years ago a colleague of mine told me about this thing called Docker, and I must admit I dismissed it as a fad…how wrong was I. Docker and Docker Compose are among my key tools of the trade. With them I can build self-contained environments for tutorials, demos, conference talks etc. Tear it down, run it again, without worrying that somewhere a local config changed and will break things.
Streaming data from Oracle into Kafka
This is a short summary discussing what the options are for integrating Oracle RDBMS into Kafka, as of December 2018 (refreshed June 2020). For a more detailed background to why and how at a broader level for all databases (not just Oracle) see this blog and this talk.
What techniques & tools are there?
Franck Pachot has written up an excellent analysis of the options available here.
Tools I Use: iPad Pro
I’ve written recently about how I create the diagrams in my blog posts and talks, and from discussions around that, a couple of people were interested more broadly in how I use my iPad Pro. So, on the basis that if two people are interested maybe others are (and if no-one else is, I have a copy-and-paste answer to give to those two people) here we go.
Kit
- iPad Pro 10.5" (2018)
- 256GB model
- Apple Pencil
- Apple Keyboard
- iPad wallet/protector
- Matte screen protector
Background
I travel quite a lot for work, so want something with a decent battery life for stuff like:
So how DO you make those cool diagrams?
I write and speak lots about Kafka, and get a fair few questions from this. The most common question is actually nothing to do with Kafka, but instead:
How do you make those cool diagrams?
So here’s a short, and longer, answer!
Update July 2019
I’ve moved away from Paper -> read more here
tl;dr
An iOS app called Paper, from a company called FiftyThree
So, how DO you make those cool diagrams?
Disclaimer: This is a style that I have copied straight from my esteemed colleagues at Confluent, including Neha Narkhede and Ben Stopford, as well as others including Martin Kleppmann.
Get mtr working on the Mac
Install
Not sure why the brew doesn’t work as it used to, but here’s how to get it working:
brew install mtr
sudo ln /usr/local/Cellar/mtr/0.92/sbin/mtr /usr/local/bin/mtr
sudo ln /usr/local/Cellar/mtr/0.92/sbin/mtr-packet /usr/local/bin/mtr-packet
(If you don’t create the two links with ln you’ll get mtr: command not found or mtr: Failure to start mtr-packet: Invalid argument. Note that ln without -s creates hard links rather than symbolic ones.)
Run
sudo mtr google.com
Kafka Connect CLI tricks
I do lots of work with Kafka Connect, almost entirely in Distributed mode, even just with 1 node -> makes scaling out much easier when/if needed. Because I’m using Distributed mode, I use the Kafka Connect REST API to configure and manage it. Whilst others might use GUI REST tools like Postman etc, I tend to just use the command line. Here are some useful snippets that I use all the time.
I’m showing the commands split with a line continuation character (\) but you can of course run them on a single line. You might also choose to get fancy and set the Connect host and port as environment variables etc, but I leave that as an exercise for the reader :)
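For what it’s worth, here is one rough way that exercise could go. The function names and the CONNECT_HOST and CONNECT_PORT variables are made up for this sketch. The defaults assume a worker on localhost listening on port 8083.

```shell
# Hypothetical helper: build a Kafka Connect REST URL from a path.
# CONNECT_HOST and CONNECT_PORT are assumed variables with local defaults.
connect_url() {
  printf 'http://%s:%s%s\n' \
    "${CONNECT_HOST:-localhost}" \
    "${CONNECT_PORT:-8083}" \
    "$1"
}

# List the names of all connectors
connect_list() {
  curl -s "$(connect_url /connectors)"
}

# Show the state of one connector and its tasks
connect_status() {
  curl -s "$(connect_url "/connectors/$1/status")"
}

# Remove a connector
connect_delete() {
  curl -s -X DELETE "$(connect_url "/connectors/$1")"
}
```

With those defined, CONNECT_HOST=connect-1 connect_status my-sink would check a connector called my-sink on a worker called connect-1 (both names invented here).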
ERROR: Invalid interpolation format for “command” option in service…
Doing some funky Docker Compose stuff, including:
Flatten CDC records in KSQL
The problem - nested messages in Kafka
Data comes into Kafka in many shapes and sizes. Sometimes it’s from CDC tools, and may be nested like this:
Exploring JMX with jmxterm
- Check out the jmxterm repository
- Download jmxterm from https://docs.cyclopsgroup.org/jmxterm
Accessing Kafka Docker containers’ JMX from host
See also docs.
To help future Googlers… with the Confluent docker images for Kafka, KSQL, Kafka Connect, etc, if you want to access JMX metrics from within, you just need to pass two environment variables: <x>_JMX_HOSTNAME and <x>_JMX_PORT, prefixed by a component name.
- <x>_JMX_HOSTNAME - the hostname/IP of the JMX host machine, as accessible from the JMX client. This is used by the JMX client to connect back into JMX, so must be accessible from the host machine running the JMX client.
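As a small illustration of the naming pattern, this sketch prints the two variables for a given component prefix in KEY=VALUE form, which is the format that docker run --env-file reads. The function name jmx_env and the port 9991 in the usage line are assumptions made for the example.

```shell
# Print the two JMX environment variables for a component.
# Arguments: component prefix (e.g. KAFKA), hostname, port
jmx_env() {
  printf '%s_JMX_HOSTNAME=%s\n%s_JMX_PORT=%s\n' "$1" "$2" "$1" "$3"
}
```

Running jmx_env KAFKA localhost 9991 prints KAFKA_JMX_HOSTNAME=localhost and KAFKA_JMX_PORT=9991 on separate lines. You would still need to publish the JMX port from the container so that the client on the host can reach it.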
Sending multiline messages to Kafka
You can use kafkacat to send messages to Kafka that include line breaks. To do this, use its -D option to specify a custom message delimiter (in this example /):
kafkacat -b kafka:29092 \
-t test_topic_01 \
-D/ \
-P <<EOF
this is a string message
with a line break/this is
another message with two
line breaks!
EOF
Note that the delimiter must be a single byte - multi-byte chars will end up getting included in the resulting message. See issue #140.