KSQL REST API cheatsheet
Full reference is here
The Schema Registry supports a REST API for finding out information about the schemas within it. Here’s a quick cheatsheet with REST calls that I often use.
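For instance, a couple of the calls I reach for most often, wrapped as little shell functions — a minimal sketch, assuming the registry is listening on localhost:8081 (override SR for your own environment):

```shell
# Base URL of the Schema Registry -- override SR to point at your own instance
SR=${SR:-http://localhost:8081}

# List every subject that has a schema registered
list_subjects() { curl -s "$SR/subjects"; }

# Fetch the latest version of the schema for a given subject
latest_schema() { curl -s "$SR/subjects/$1/versions/latest"; }

# Example usage:
#   list_subjects
#   latest_schema my-topic-value
```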
I use Docker and Docker Compose a lot. Like, every day. It’s a fantastic way to build repeatable demos and examples, that can be torn down and spun up in a repeatable way. But…what happens when the demo that was working is spun up and promptly goes down in a blaze of flames?
I’ve reviewed a bunch of abstracts in the last couple of days; here are some common suggestions I made:
No need to include your company name in the abstract text. Chances are I’ve not heard of your company, and even if I have, what does it add to my comprehension of your abstract and what you’re going to talk about? A possible exception would be the "hot" tech companies where people will see a talk just because it’s Netflix etc.
I really don’t want just to read your project documentation/summary. It makes me worry your talk will be death by PowerPoint of the minutiae of something that’s only relevant in your company.
Following on from the above, I want to see that there are going to be things you’ll share that are useful for other people in a similar situation. Something that’s specific to your project, your company, doesn’t translate to mass-usefulness. Something that other people will hit, whether it’s technical or org-cultural, now that is interesting and is going to be useful.
If my eyes start to glaze over reading the abstract intro, already I’m assuming that your talk will make me bored too. Read it back out loud to yourself…make sure each word justifies its place in the text. Boilerplate filler and waffle should be left on the cutting room floor.
You need to strike a balance between giving enough detail about the contents of your talk that I am convinced you have interesting things to share, but without listing every nut and bolt of detail. Too much detail and it just becomes a laundry list. You need to whet people’s appetite for the actual meal, not put them off their food.
For heaven’s sake, proofread! If you can’t be arsed to use a spell checker, then I definitely wouldn’t trust you to prepare a talk of any quality. I’ve recently started using Grammarly and it’s excellent.
I’ve been blogging for quite a few years now, starting on Blogger, soon onto WordPress, and then to Ghost a couple of years ago. Blogger was fairly lame, WP yucky, but I really do like Ghost. It’s simple and powerful and was perfect for my needs. My needs being, an outlet for technical content that respected formatting, worked with a markup language (Markdown), and didn’t f**k things up in the way that WP often would in its WYSIWYG handling of content.
Tiny little snippet this one. Given a list of images:
$ docker images|grep confluent
confluentinc/cp-enterprise-kafka 5.0.0 d0c5528d7f99 3 months ago 600MB
confluentinc/cp-kafka 5.0.0 373a4e31e02e 3 months ago 558MB
confluentinc/cp-zookeeper 5.0.0 3cab14034c43 3 months ago 558MB
confluentinc/cp-ksql-server 5.0.0 691bc3c1991f 4 months ago 493MB
confluentinc/cp-ksql-cli 5.0.0 e521f3e787d6 4 months ago 488MB
…Now there’s a new version available, and you want to pull down all the latest ones for it:
docker images|grep "^confluentinc"|awk '{print $1}'|xargs -Ifoo docker pull foo:5.1.0

A few years ago a colleague of mine told me about this thing called Docker, and I must admit I dismissed it as a fad…how wrong was I. Docker and Docker Compose are now among my key tools of the trade. With them I can build self-contained environments for tutorials, demos, conference talks etc. Tear it down, run it again, without worrying that somewhere a local config changed and will break things.
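If you want to sanity-check what that pipeline will do before actually pulling anything, stick an echo in front of the docker pull. Here’s a dry run of the same pipeline against some simulated docker images output (the image lines are just sample data):

```shell
# Simulated `docker images` output -- image name is in column 1
printf 'confluentinc/cp-kafka 5.0.0 373a4e31e02e\nconfluentinc/cp-zookeeper 5.0.0 3cab14034c43\n' |
  grep "^confluentinc" |
  awk '{print $1}' |
  xargs -Ifoo echo docker pull foo:5.1.0
```

This prints the docker pull commands that would be run, one per image, without touching the Docker daemon.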
This is a short summary discussing what the options are for integrating Oracle RDBMS into Kafka, as of December 2018 (refreshed June 2020). For a more detailed background to why and how at a broader level for all databases (not just Oracle) see this blog and this talk.
Franck Pachot has written up an excellent analysis of the options available here.
I’ve written recently about how I create the diagrams in my blog posts and talks, and from discussions around that, a couple of people were interested more broadly in how I use my iPad Pro. So, on the basis that if two people are interested maybe others are (and if no-one else is, I have a copy-and-paste answer to give to those two people) here we go.
I travel quite a lot for work, so want something with a decent battery life for stuff like:
I write and speak lots about Kafka, and get a fair few questions from this. The most common question is actually nothing to do with Kafka, but instead:
How do you make those cool diagrams?
So here’s a short, and longer, answer!
I’ve moved away from Paper; read more here
An iOS app called Paper, from a company called FiftyThree
Disclaimer: This is a style that I have copied straight from my esteemed colleagues at Confluent, including Neha Narkhede and Ben Stopford, as well as others including Martin Kleppmann.
Not sure why the brew install doesn’t work as it used to, but here’s how to get it working:
brew install mtr
sudo ln /usr/local/Cellar/mtr/0.92/sbin/mtr /usr/local/bin/mtr
sudo ln /usr/local/Cellar/mtr/0.92/sbin/mtr-packet /usr/local/bin/mtr-packet
(If you don’t create the two links (ln) you’ll get mtr: command not found or mtr: Failure to start mtr-packet: Invalid argument)
sudo mtr google.com
I do lots of work with Kafka Connect, almost entirely in Distributed mode—even with just one node, since it makes scaling out much easier when/if needed. Because I’m using Distributed mode, I use the Kafka Connect REST API to configure and manage it. Whilst others might use GUI REST tools like Postman etc, I tend to just use the commandline. Here are some useful snippets that I use all the time.
I’m showing the commands split with a line continuation character (\) but you can of course run them on a single line. You might also choose to get fancy and set the Connect host and port as environment variables etc, but I leave that as an exercise for the reader :)
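As one example of the kind of thing I mean: checking connector and task states from the status endpoint. In real life the JSON comes from curl -s http://localhost:8083/connectors/<name>/status; the connector name and payload below are made up for illustration so the snippet is self-contained:

```shell
# Abridged sample of what GET /connectors/<name>/status returns
STATUS='{"name":"sink-jdbc-01","connector":{"state":"RUNNING"},"tasks":[{"id":0,"state":"FAILED"}]}'

# Pull out the connector/task states -- jq is nicer, but grep works anywhere
echo "$STATUS" | grep -o '"state":"[A-Z]*"'
```

A FAILED in that output is your cue to go and look at the task’s stack trace.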
Doing some funky Docker Compose stuff, including:
Data comes into Kafka in many shapes and sizes. Sometimes it’s from CDC tools, and may be nested like this:
See also docs.
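By way of illustration (this payload and stream are hypothetical, not from the original post), here’s a CDC-style nested record and one way to pull fields out of it in KSQL with EXTRACTJSONFIELD:

```sql
-- Hypothetical nested CDC payload:
--   {"op":"u","after":{"ID":42,"NAME":"foo"}}
-- Assuming a stream SRC with the raw JSON in a VARCHAR column MSG:
SELECT EXTRACTJSONFIELD(MSG, '$.after.ID')   AS ID,
       EXTRACTJSONFIELD(MSG, '$.after.NAME') AS NAME
  FROM SRC;
```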
To help future Googlers… with the Confluent Docker images for Kafka, KSQL, Kafka Connect, etc, if you want to access JMX metrics from them, you just need to pass two environment variables: <x>_JMX_HOSTNAME and <x>_JMX_PORT, where <x> is the component prefix (e.g. KAFKA).
<x>_JMX_HOSTNAME - the hostname/IP of the JMX host machine, as accessible from the JMX Client.
This is used by the JMX client to connect back into JMX, so must be accessible from the host machine running the JMX client.
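For example, for the Kafka broker image the prefix is KAFKA, so a Docker Compose fragment might look like this (the hostname and port here are just placeholders for your own values):

```yaml
environment:
  KAFKA_JMX_HOSTNAME: "broker-host.example.com"   # must be reachable from the JMX client
  KAFKA_JMX_PORT: "9581"
```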
You can use kafkacat to send messages to Kafka that include line breaks. To do this, use its -D operator to specify a custom message delimiter (in this example /):
kafkacat -b kafka:29092 \
-t test_topic_01 \
-D/ \
-P <<EOF
this is a string message
with a line break/this is
another message with two
line breaks!
EOF
Note that the delimiter must be a single byte—multi-byte chars will end up getting included in the resulting message. See issue #140.
KSQL provides the ability to create windowed aggregations. For example, count the number of messages in a 1 minute window, grouped by a particular column:
CREATE TABLE RATINGS_BY_CLUB_STATUS AS \
SELECT CLUB_STATUS, COUNT(*) AS RATING_COUNT \
FROM RATINGS_WITH_CUSTOMER_DATA \
WINDOW TUMBLING (SIZE 1 MINUTES) \
GROUP BY CLUB_STATUS;
How KSQL (and Kafka Streams) stores the window timestamp associated with an aggregate has recently changed. See #1497 for details.
Whereas previously the Kafka message timestamp (accessible through the
KSQL ROWTIME system column) stored the start of the window for which
the aggregate had been calculated, this changed in July 2018 to instead
be the timestamp of the latest message to update that aggregate value.
This was in Apache Kafka 2.0 and Confluent Platform 5.0, and back-ported
to previous versions.
There’s lots going on in the next few months :-)
I’m particularly excited to be speaking at several notable conferences for the first time, including JavaZone, USENIX LISA, and Devoxx.
As always, if you’re nearby then hope to see you there, and let me know if you want to meet for a coffee or beer!
(This was cross-posted on the Confluent.io blog)
This question comes up on StackOverflow and such places a lot, so here’s something to try and help.
tl;dr : You need to set advertised.listeners (or KAFKA_ADVERTISED_LISTENERS if you’re using Docker images) to the external address (host/IP) so that clients can correctly connect to it. Otherwise they’ll try to connect to the internal host address–and if that’s not reachable then problems ensue.
Put another way, courtesy of Spencer Ruport:
LISTENERS are what interfaces Kafka binds to. ADVERTISED_LISTENERS are how clients can connect.
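To make that concrete, here’s the shape of a dual-listener Docker Compose config: internal traffic connects via the kafka hostname, external clients via localhost. The listener names and ports are illustrative; adjust them for your environment:

```yaml
environment:
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
```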