Comment by smittywerben

Comment by smittywerben 2 months ago

Kafka is a write-ahead log, not a queue per se. It handles transactions to the disk, not across the network.

RabbitMQ is neat out of the box. But I went with ZeroMQ at the time.

ZeroMQ is cool, but these days I'd only use it to learn from their excellent documentation. Coming from Python, it taught me about Berkeley sockets and the process of building cross-language messaging patterns. After a few projects, I realized I didn't need ZeroMQ to begin with; I could make my own! If ZeroMQ's Hintjens were still with us, I'd still be using it.

It's like the documented incremental process of designing a messaging queue to fit your problem domain, plus a thin wrapper easing some of the lower-level socket nastiness. At least that's my experience using it over the years. Me talking about it won't do it justice.

NATS does the lower-level socket wrapper part very nicely. It's a bit more modern too. Its server is written in Go, which is designed to be like C with a slightly nicer syntax, so it would make sense that it's high performance and sturdy. So it's similar to ZeroMQ there.

I'm not sure if either persists to disk out of the box. So either of these is going to be simpler and faster than Kafka.

The DB people are probably trying too hard to cater to the queues. Ideally I'd have normalized the data and modeled the relations such that transactions don't lock up the whole table. Then I started questioning why I needed a queue at all when databases (sans SQLite, which is fast enough as is) are made for pooling client access.

Kafka supports pipelining to a relational database, but this part is where you kind of have to be experienced to not footgun, and I'm not at that level. I think using it as a queue, where you short-circuit it from the relational database pipeline, is non-standard for Kafka. I suspect that's where a lot of the Kafka hate is from. I could understand if the distributed transactions part is hell, but at that point it's like, why'd you skip the database then? Trying to get that free lunch, I assume.

I have an alternative. Try inserting everything into a SQLite file. Running into concurrency issues? Use a second SQLite file. Two computers? Send it over the network. More issues? Since it's SQL, just switch to a real database that will pool the clients. Or switch to five of them. SQL is sorta cool that way. I assume that would avoid reimplementing half of the JVM to sync across computers, where you get Oracle Java showing up to sell you their database halfway into making your galactic-scale software or whatever.
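A minimal sketch of the "insert everything into a SQLite file" idea, using only Python's standard `sqlite3` module. The `jobs` table and the `open_queue`, `enqueue`, and `dequeue` helpers are hypothetical names made up for illustration; they are not from any library, and this is one possible shape rather than a recommended design.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a SQLite file holding a single 'jobs' table."""
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are started and ended explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def enqueue(conn, payload):
    """Insert one message; in autocommit mode the INSERT commits immediately."""
    conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))


def dequeue(conn):
    """Claim the oldest unfinished job, or return None if there is none."""
    # BEGIN IMMEDIATE takes the write lock up front, so two workers
    # sharing the same file cannot both claim the same row.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[1]
```

Because the helpers are plain SQL underneath, the same statements should carry over to a client/server database with little change, which is roughly the "just switch to a real database" step described above.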

I must be stressed today. Thanks for asking.

smittywerben 2 months ago

I already spent time proofreading it, but I forgot that I couldn't edit it after so much time. Unless the merciful admin lets me roll the 12-sided dice to replace the above cringe with the below piece I carved out of soap.

---

For any messaging system, I always check for maintained libraries in my programming languages. Bindings in many languages are consistent across Kafka, ZeroMQ, and NATS.

Kafka is a write-ahead log, not a queue per se. It handles transactions to the disk. The networking is a simple broadcast, not a shared queue. You also can't (canonically, at least) pop/insert/delete rows. It's append-only. It can do basic seeking, like replaying from the start.
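To make the "append-only log, not a shared queue" distinction concrete, here is a toy in-memory model in plain Python. It is not Kafka and does not use any Kafka client API; `AppendOnlyLog` and `Consumer` are hypothetical names invented for this sketch. It only illustrates the semantics described above: records are appended, never popped, and each reader tracks its own position.

```python
class AppendOnlyLog:
    """Toy log: records are only ever appended, never popped or deleted."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Add a record at the end and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records starting at offset; the log is unchanged."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Each consumer keeps its own offset, so every consumer sees every record."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_records=10):
        """Read the next batch and advance only this consumer's offset."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek_to_beginning(self):
        """Replay from the start by resetting the offset."""
        self.offset = 0
```

Reading does not remove anything, which is why two consumers both receive the same records and why replaying is just a matter of moving an offset back.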

ZeroMQ is a good choice for learning from its excellent documentation, and for programmers interested in C programming. Probably a good lead into Beej's networking guide. ZeroMQ is the odd one as it has no central broker ("Zero" for zero broker); you copy your favorite broker.py pattern from the ZeroMQ guide.

Dropping anchor to throw in the POSIX standard sockets, the BSD kqueue, the Linux epoll, newer io_uring, and libuv for boring cross-platform asynchronous I/O.

https://zguide.zeromq.org/docs/preface/

https://beej.us/guide/bgnet/html/split/

https://pubs.opengroup.org/onlinepubs/9699919799/functions/p...

https://pubs.opengroup.org/onlinepubs/9699919799/functions/s...

https://man.freebsd.org/cgi/man.cgi?kqueue

https://man7.org/linux/man-pages/man7/epoll.7.html

https://man7.org/linux/man-pages/man7/io_uring.7.html

https://docs.libuv.org/en/v1.x/
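For a taste of what those links cover without leaving Python, the standard `selectors` module picks a readiness API such as epoll or kqueue when the platform offers one. The sketch below is only an illustration under that assumption; `echo_once` is a hypothetical helper, and it uses a local `socket.socketpair()` instead of a real network connection.

```python
import selectors
import socket


def echo_once(payload: bytes) -> bytes:
    """Send payload through a local socket pair and read it back via a selector."""
    # DefaultSelector chooses the most efficient mechanism available
    # on the current platform (for example epoll or kqueue).
    sel = selectors.DefaultSelector()
    left, right = socket.socketpair()
    try:
        right.setblocking(False)
        sel.register(right, selectors.EVENT_READ)
        left.sendall(payload)

        received = b""
        while len(received) < len(payload):
            events = sel.select(timeout=1.0)
            if not events:
                break  # nothing became readable in time; give up
            for key, _mask in events:
                received += key.fileobj.recv(4096)
        return received
    finally:
        sel.close()
        left.close()
        right.close()
```

The same register-then-wait loop is the core of most event-driven socket servers; libraries like libuv wrap it with cross-platform plumbing.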


j45 2 months ago

It’s less about DB people or not.

Many of us have used both sides and settled on one area to start.

Kafka et al. are amazing. Also almost always overkill in the first x months or years.

It’s not too much of a stretch to model your queue first in something like Postgres, which oddly offers things a little beyond a traditional RDBMS, and when the model implementation in the domain reveals itself… it can shine a nice light in the direction of a Kafka, etc.


smittywerben 2 months ago

Sir, your reply is more coherent than mine. I'll give you props for that.

Still, I disagree that Kafka is always overkill.

When god opened a datagram socket on your computer, you needed to have been capturing this data X months ago, but weren't paying attention. You need to build warbot.py and put it into production before you have the chance to deal with cold storage. Kafka is my go-to if you can do this before you run out of disk space.

I frequently append JSON lines to a "data.json" file in Python. Add a socket server, compression, and typed bindings for 20+ languages. Boom. Kafka. Don't oversell it. Need to delete a row? Congratulations, you selected the wrong tool. It's appending JSON lines to a file. Kafka is a write-ahead log, not a queue.
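
The "append JSON lines to a file" habit could look roughly like this in Python, using only the standard library. `append_event` and `replay` are hypothetical helper names for this sketch, and it leaves out the socket server, compression, and bindings mentioned above.

```python
import json


def append_event(path, event):
    """Append one event to the file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def replay(path):
    """Yield every event from the start of the file, in the order written."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                yield json.loads(line)
```

There is deliberately no delete or update helper here: the only operations are "add to the end" and "read from the start", which is the point being made about picking the right tool.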
To your point about Postgres, I've found Postgres has fantastic JSONB support and awesome developers who have been very influential in my life and whom I admire. Postgres is my preferred cold storage, which I connect to Kafka. It feels like swimming upstream because RDBMSs are traditionally for normalized data, not denormalized JSON lines that make XML look hip again.

If you have a choice in DB, Postgres' JSONB has helped me avoid unnecessary normalization steps. It's good to have options.

ZeroMQ would call this the Titanic pattern and mic drop because the guide has a section on it. That's why I like ZeroMQ.

Edit: Apologies for typos/brevity. I have an ancient phone that only works with 20% of the web and phone apps. There are no apps or LLMs to help this dyslexic soul.

Reference for the Titanic pattern. The guide's author is cynical about me shoving spinning rust in the middle, but it doesn't say no. https://zguide.zeromq.org/docs/chapter4/#Disconnected-Reliab...
