Ask HN: What's your go-to message queue in 2025?

66 points by enether 7 months ago


The space is confusing to say the least.

Message queues are usually a core part of any distributed architecture, and the options are endless: Kafka, RabbitMQ, NATS, Redis Streams, SQS, ZeroMQ... and then there's the “just use Postgres” camp for simpler use cases.

I’m trying to make sense of the tradeoffs between:

- async fire-and-forget pub/sub vs. sync RPC-like point to point communication

- simple FIFO vs. priority queues and delay queues

- intelligent brokers (e.g. RabbitMQ, NATS with filters) vs. minimal brokers (e.g. Kafka’s client-driven model)

There's also a fair amount of ideology/emotional attachment - some folks root for underdogs written in their favorite programming language, others reflexively dismiss anything that's not "enterprise-grade". And of course, vendors are always in the mix trying to steer the conversation toward their own solution.

If you’ve built a production system in the last few years:

1. What queue did you choose?

2. What didn't work out?

3. Where did you regret adding complexity?

4. And if you stuck with a DB-based queue — did it scale?

I’d love to hear war stories, regrets, and opinions.

speedgoose 7 months ago

I played with most message queues and I go with RabbitMQ in production.

Mostly because it has been very reliable for years in production at a previous company, and doesn’t require babysitting. Its recent versions also have new features that make it a decent alternative to Kafka if you don’t need to scale to the moon.

And the logo is a rabbit.


swyx 7 months ago

Datadog too. I often wonder why more companies don't pick cute mascots. It gives you a logo, gives everyone warm fuzzies immediately, and creates pun opportunities.
inb4 "oh but you won't be taken seriously" well... Datadog.

- DonsDiscountGas 7 months ago
  
  Hugging face clearly shares the same philosophy
  
  
  taskforcegemini 7 months ago
  
  but usually I only see the name "huggingface" written, and I think of headcrabs from Half-Life instead
  
aitchnyu 7 months ago

Just used it as a Celery (job queue) backend. How is it a Kafka alternative?

- speedgoose 7 months ago
  
  RabbitMQ streams: https://www.rabbitmq.com/docs/streams
  

KingOfCoders 7 months ago

NATS.io because I'm using Go, and I can just embed it for one server [0], one binary to deploy with Systemd, but able to split it out when scaling the MVP.

[0] https://www.inkmi.com/blog/how-i-made-inkmi-selfhealing


adamcharnock 7 months ago

I would highlight a distinction between Queues and Streams, as I think this is an important factor in making this choice.

In the case of a queue, you put an item in the queue, and then something removes it later. There is a single flow of items. They are put in. They are taken out.

In the case of a stream, you put an item in the stream, and it can then be read multiple times by any other process that cares to do so. This may be called 'fan out'.

This is an important distinction and really affects how one designs software that uses these systems. Queues work just fine for, say, background jobs. A user signs up, and you put a task in the 'send_registration_email' queue.[1]

However, what if some _other_ system then cares about user sign ups? Well, you have to add another queue, and the user sign-up code needs to be aware of it. For example, an 'add_user_to_crm' queue.

The result here is that choosing a queue early on leads to a tight-coupling of services down the road.

The alternative is to choose streams. In this case, instead of saying what _should_ happen, you say what _did_ happen (past tense). Here you replace 'send_registration_email' and 'add_user_to_crm' with a single stream called 'user_registered'. Each service that cares about this fact is then free to subscribe to that stream and get its own copy of the events (it does so via a 'consumer group', or something of a similar name).

This results in a more loosely coupled system, where you potentially also have access to an event history should you need it (if you configure your broker to keep the events around).
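
A minimal in-memory sketch of that difference, in Python. The `EventStream` class and the `email` and `crm` group names are made up for illustration, and this is not how any particular broker implements consumer groups; it only shows that reading does not remove events and that each group keeps its own position.

```python
from collections import defaultdict


class EventStream:
    """Append-only list of events with one read cursor per consumer group."""

    def __init__(self):
        self._events = []                 # events are never removed on read
        self._cursors = defaultdict(int)  # group name -> index of next unread event

    def publish(self, event):
        self._events.append(event)

    def read(self, group):
        """Return the events this group has not seen yet and advance its cursor."""
        start = self._cursors[group]
        batch = self._events[start:]
        self._cursors[group] = len(self._events)
        return batch


# The sign-up code states what happened; it does not know who is listening.
user_registered = EventStream()
user_registered.publish({"user_id": 1})
user_registered.publish({"user_id": 2})

# Each service reads through its own group and gets its own copy of the events.
email_batch = user_registered.read("email")
crm_batch = user_registered.read("crm")
```

Adding a third subscriber later would only mean a new group name on the consumer side; the publishing code stays as it is.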

--

This is where Postgresql and SQS tend to fall down. I've yet to hear of an implementation of streams in Postgresql[2]. And SQS is inherently a queue.

I therefore normally reach for Redis Streams, but mostly because it is what I am familiar with.

Note: This line of thinking leads into Domain Driven Design, CQRS, and Event Sourcing. Each of which is interesting and certainly has useful things to offer, although I would advise against simply consuming any of them wholesale.

[1] Although this is my go-to example, I'm actually unconvinced that email sending should be done via a queue. Email is just a sequence of queues anyway.

[2] If you know of one please tell me!


thruflo 7 months ago

There are lots of options to stream data out of Postgres, including:
- https://electric-sql.com (disclaimer: co-founder)
- https://feldera.com
- https://materialize.com
- https://powersync.com
- https://sequinstream.com
- https://supabase.com/docs/guides/realtime/broadcast
- https://zero.rocicorp.dev
Etc.

- adamcharnock 7 months ago
  
  I think these all relate to streaming data. Not streams in the sense of the data-structure for message passing (a la Kafka, Redis Streams, etc)
  
  
  empthought 7 months ago
  
  The logical replication of the transaction log is basically a stream of data change events, so the difference between those senses isn’t very big.
  
j45 7 months ago

While someone’s use case would have to be verified, the below is to show that there are streaming options in Postgres.
Would be interesting to get your take on queues vs streams on the below.
I consider myself a little late to the Postgres party after time with other NoSQL and RDBMS systems, but it seems more and more an OK place to consider beginning from.
For streaming:
- Supabase has some Kafka-stream-type examples that cover change data capture: https://supabase.com/blog/postgres-wal-logical-replication
Tables can also do some amount of stream-like behaviour with visibility and timeout behaviours:
- pg-boss: durable job queues with visibility timeouts and retries.
- Zilla: supports Postgres as a source, using CDC to act as a stream.
- ElectricSQL: uses Postgres replication and CRDTs for reactive sync (great for frontend state as a stream).
Streaming inside Postgres also has some attention from:
- Postgres as Event Store (https://eventmodeling.org): this can combine event sourcing with Postgres for stream modeling.
- pgmq, from Tembo: a minimal message queue built on Postgres using an append-only design. Effectively works as a persistent stream with ordered delivery.

- adamcharnock 7 months ago
  
  I suspect this comment is LLM generated. There is a 404-ing URL, discussion of queues, and some discussion of Postgres CDC which I believe is Postgres logical replication. Neither of which is a streams implementation on Postgres.
  
vlvdus 7 months ago

What makes Postgres (or any decent relational DB) fall down in this case?

- adamcharnock 7 months ago
  
  It is simply that I’m unaware of a streams implementation for postgresql. Although another comment is mentioning them, so I’ll read that in some more detail shortly.
  I’ve always felt that streams should be implementable via stored procedures, and that it would be a fun project. I’ve just never quite had the driving force to do it.
  
ryandvm 7 months ago

Great comment. I'm disappointed that I had to scroll this far down to see someone pointing out that queues and streams ARE NOT THE SAME.


bilinguliar 7 months ago

I am using Beanstalkd, it is small and fast and you just apt-get it on Debian.

However, I have noticed that oftentimes devs are using queues where Workflow Engines would be a better fit.

If your message processing time is in tens of seconds – talk to your local Workflow Engine professional (:


janstice 7 months ago

In that case, any suggestions if the answer was looking for workflow engines? Ideally something that will work for no-person-in-the-middle workloads in the tens of seconds range as well as person-making-a-decision workflows that can live for anywhere between minutes and months?

- bilinguliar 7 months ago
  
  Temporal if you do not want vendor locks.
  AWS Step Functions or GCP Workflows if you are on the cloud.
  
  
  mdaniel 7 months ago
  
  https://github.com/temporalio/temporal/tree/v1.27.2 (MIT)
  It has been submitted quite a few times but I don't readily see any experiences (pro or con) https://news.ycombinator.com/from?site=github.com/temporalio
  
dkh 7 months ago

A classic. Not something I personally use these days, but I think just as a piece of software it is an eternally good example of something simple, powerful, well-engineered, pleasant to use, and widely-compatible, all at the same time


wordofx 7 months ago

Postgres. Doing ~ 70k messages/second average. Nothing huge but don’t need anything dedicated yet.


lawn 7 months ago

I'm curious on how people use Postgres as a message queue. Do you rely on libraries or do you run a custom implementation?

- ericaska 7 months ago
  
  We also use Postgres but we don't have many jobs. It's usually 10-20 schedules that create hourly to monthly jobs, and they are mostly independent. Currently it's a custom-made solution, but we are going to update it to use SKIP LOCKED and LISTEN/NOTIFY plus an interval to handle jobs. There is a really good video about it on YouTube called: "Queues in PostgreSQL Citus Con."
  
- padjo 7 months ago
  
  You can go an awfully long way with just SELECT … FOR UPDATE … SKIP LOCKED
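
A rough sketch of that pattern from Python. It assumes a hypothetical `jobs(id, payload)` table and a psycopg-style DB-API connection with autocommit off (hence the `%s` placeholder); none of these names come from the thread. The row lock lasts until the transaction ends, so other workers skip the row while it is being handled, and a crashed worker's row becomes visible again when its transaction is rolled back.

```python
# Claim the oldest job that no other transaction currently has locked.
CLAIM_SQL = """
SELECT id, payload
FROM jobs
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED
"""


def process_one(conn, handler):
    """Handle at most one job; return True if a job was processed."""
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        if row is None:
            conn.rollback()  # nothing claimable; end the transaction
            return False
        job_id, payload = row
        handler(payload)  # runs while the row lock is held
        cur.execute("DELETE FROM jobs WHERE id = %s", (job_id,))
        conn.commit()  # the delete and the lock release happen together
        return True
    except Exception:
        conn.rollback()  # the job stays in the table for another attempt
        raise
```

The trade-off is that the transaction stays open for as long as `handler` runs, which is fine for short tasks and less comfortable for long ones.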
  
  
  Spivak 7 months ago
  
  I've never found a satisfying way to not hold the lock for the full duration of the task that is resilient to workers potentially dying. And postgres isn't happy holding a bunch of locks like that. You end up having to register and track workers with health checks and a cleanup job to prune old workers so you can give jobs exclusivity for a time.
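
One common way to trade that bookkeeping for a simpler rule is a lease (the "visibility timeout" idea from SQS): claiming a job only stamps it with an expiry time in a short transaction, and a job whose lease has run out can be claimed again. The sketch below uses SQLite so it runs anywhere; the `jobs` table, `locked_until` column, and 30-second lease are assumptions, and in Postgres the `SELECT` would typically add `FOR UPDATE SKIP LOCKED` instead of relying on `BEGIN IMMEDIATE`.

```python
import sqlite3

LEASE_SECONDS = 30  # assumed lease length; should exceed the longest expected task


def open_db(path=":memory:"):
    # isolation_level=None: autocommit mode, transactions are started explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " locked_until REAL NOT NULL DEFAULT 0)"
    )
    return conn


def claim(conn, now):
    """Lease the oldest job with no active lease; return (id, payload) or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE locked_until <= ? ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET locked_until = ? WHERE id = ?",
                (now + LEASE_SECONDS, row[0]),
            )
        conn.execute("COMMIT")  # the lock is released here, before any work starts
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def complete(conn, job_id):
    """Delete a finished job so it is never handed out again."""
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
```

A worker would call `claim(conn, time.time())`, do the work, then call `complete`. If it dies, nothing needs cleaning up: the lease simply expires. The cost is that a slow worker can outlive its lease, so tasks should be idempotent or the worker should extend the lease while it runs.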
  
- j45 7 months ago
  
  Built right in using a group of pg functions, or with a library, or with a Python-based tool that happens to use pg for the queue.
  
- digianarchist 7 months ago
  
  pgmq https://github.com/pgmq/pgmq
  
- wordofx 7 months ago
  
  Just SELECT ... FOR UPDATE SKIP LOCKED. The table is partitioned to keep the unprocessed set small.
  
iamcalledrob 7 months ago

Curious what kind of hardware you're using for that 70K/s?

- wordofx 7 months ago
  
  It’s an r8g instance in aws. Can’t remember the size but I think it’s over provisioned because it’s at like 20% utilisation and only spikes to 50.
  
aynyc 7 months ago

What’s your batch size?


lmm 7 months ago

SQS is great if you're already on AWS - it works and gets out of your way.

Kafka is a great tool with lots of very useful properties (not just queues, it can be your primary datastore), but it's not operationally simple. If you're going to use it you should fully commit to building your whole system on it and accept that you will need to invest in ops at least a little. It's not a good fit for a "side" feature on the edge of your system.


mstaoru 7 months ago

Redis Streams is a "go-to" for me, mostly because of operational simplicity and performance. It's also dead simple to write consumers in any language. If I had more stringent durability requirements, I would probably pick Redpanda, but Kafka-esque (!) processing semantics can be daunting sometimes.

I didn't have anything but bad experiences with RabbitMQ, maybe I cannot "cook" it, but it would always go split-brain, or last issue I had, a part of clients connected to certain clustered nodes just stopped receiving messages. Cluster restart helped, but all logs and all metrics were green and clean. I try to avoid it if I can.

ZeroMQ is more like a building block for your applications. If you need something very special, it could be a good fit, but for a typical EDA-ish bus architecture Redis or Kafka/Redpanda are both very good.


jolux 7 months ago

Kafka is fairly different from the rest of these — it’s persistent and designed for high read throughput to multiple simultaneous clients at the same time, as some other commenters have pointed out.

We wanted replayability and multiple clients on the same topic, so we evaluated Kafka, but we determined it was too operationally complex for our needs. Persistence was also unnecessary as the data stream already had a separate archiving system and existing clients only needed about 24hr max of context. AWS Kinesis ended up being simpler for our needs and I have nothing but good things to say about it for the most part. Streaming client support in Elixir was not as good as Kafka but writing our own adapter wasn’t too hard.


AznHisoka 7 months ago

Sidekiq, Sidekiq, Sidekiq (or just Postgres if I'm dealing with something trivial)


vanbashan 7 months ago

I prefer Pulsar. Elegant modular design and fully open source ecosystem.

Performance is at least as good as Kafka.

For simpler workloads, beanstalkd could be a good fit, too.


atombender 7 months ago

Pulsar's feature set is amazing, but it looks like a beast to operate? Especially compared to lighter-weight systems like NATS or Redpanda.
You need both Bookkeeper and Pulsar, which are both stateful, and both require ZooKeeper. (You can apparently configure Bookkeeper to use Etcd, not sure about Pulsar.) So three applications, each of which has several types of processes that probably demand a dedicated operator if running on Kubernetes.


crmd 7 months ago

The US Federal Reserve uses IBM MQ for the FedNow interbank settlement service that went live last year.

Architecture info: https://explore.fednow.org/resources/technical-overview-guid...


kev009 7 months ago

Likely implies z/OS is common on both sides. Given the stakes and availability needs not a bad choice.

- crmd 7 months ago
  
  In the linked doc they say it’s MQ on AIX but I believe it’s interchangeable with Z.
  

j45 7 months ago

After using more than a few, in 2025 I've been trying to start with Postgres for everything, to keep the number of moving parts down.

Database functions can remain independent of stack or programming changes.

Complexity comes on its own; there's often little need to pile it on from the start and tie one's hands early for relatively simple solutions.


matt_s 7 months ago

Google PubSub is what we use as our message queue, mostly for communicating change data capture via messages to other internal systems. It's typically being consumed by some job system polling on an interval and then doing CRUD to sync changes.

It's not very complex and feels like we're running a lot of compute resources to just sync data between systems. Admittedly there isn't good separation of concerns so there is overlap that requires data syncs.

I've been looking at things like Kafka, etc., thinking there might be some magic there that makes us use less compute or makes data syncs a little easier to deal with, but I wonder what scale of data throughput is the tipping point where a service like that is really needed. If it turns out it's just a different service with the same timeliness of data sync and similar compute resources, I struggle to see what benefits it would provide.

I'd love something like a levels.fyi-style site where people could anonymously report things like this: the tech stacks being used, throughput of data, amount of compute in play, and ratings/comments on their overall solution ("would do again", "don't recommend", "overkill", "resume filler"). It feels much like other areas of technology where a use case comes out of a huge company, RDD (resume-driven development) takes hold, and now there are people out there doing the equivalent of souping up a 1997 Honda Accord like it's a race car when it's only driving grandma to her appointments.


mdaniel 7 months ago

I love hearing "watch out for" stories, because I feel that allows me to be extra vigilant about some aspect when running my own PoC
That said, my suspicion about any such aggregation project like that is that context is everything and trying to capture "this sucks" for all the input criteria which produced that outcome is going to be a wall of text that few will write and even fewer will read (ahem, LLM "tl;dr it for me" aside)

- mdaniel 7 months ago
  
  Replying to myself so it can be downvoted separately, but in this mythical new AI gonna take our jobs world, who exactly is supposed to carry out the spikes required to know if technology X is a good fit for problem Y in company Z's culture? Vibe Kafka into place and yolo?
  

lolc 7 months ago

We use Apache Artemis (ActiveMQ). Mainly because it was the only system that would route large messages. It's from the Java ecosystem, which is alien to us. So integration was not smooth, but now it hums along fine.


MyOutfitIsVague 7 months ago

For my extremely specialized case, I use a SQLite database as a message queue. It absolutely wouldn't scale, but it doesn't need to. It works extremely well for what I need it to do.


pdimitar 7 months ago

Have you written up about it? I'd love to read it if so. Thought of using SQLite several times like this but never mustered the courage to try.

- RedShift1 7 months ago
  
  I use SQLite as an offline buffer for telemetry data, basically one thread does INSERT of the payloads and another thread does just SELECT and then DELETE when it has successfully transmitted the payload.
  
justsomehnguy 7 months ago

I would join in asking for more details.
I have an idea for a project where even MySQL/MariaDB is too much of an admin burden.

- MyOutfitIsVague 7 months ago
  
  There's very little to it, really, just a messages table with an id INTEGER PRIMARY KEY AUTOINCREMENT column (autoincrement to prevent id reuse, which is otherwise legal), a payload TEXT NOT NULL column (which is usually JSON encoded), and, in my case, a TEXT json annotations column, with some computed indexes. A publisher just pushes rows in, and a subscriber takes an exclusive lock to grab the first row matching the annotations that it cares about (I use DELETE RETURNING, but you can make it work however you need).
  You can use `PRAGMA data_version` on a dedicated thread to watch for changes and notify other waiters via a condition variable. It's not the nicest solution, because it's just a loop around a query, but it gets the job done.
  A req-rep pattern can be done by doing an `INSERT ... RETURNING id` and having the other side re-push into the same or a different message queue with an annotation referring to that id. Alternatively, you could have a table with a req, rep, and status column to coordinate it all.
  It's far from everything you'd need from a complete, robust message broker, but for a small single- or multi-process message queue with a max of a few dozen readers and writers, it gets the job done nicely. In a single process, you can even replace the data_version loop thread with `sqlite3_commit_hook` on writers to notify readers that something has changed via the condition_variable.
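
A small Python sketch of the table and the push/pop steps described above, with two simplifications that are assumptions rather than the setup in the comment: a plain `topic` column stands in for the JSON annotations, and the pop uses `SELECT` plus `DELETE` inside `BEGIN IMMEDIATE` rather than `DELETE ... RETURNING`, so it also runs on SQLite builds older than 3.35.

```python
import json
import sqlite3


def open_queue(path=":memory:"):
    # isolation_level=None: autocommit mode, transactions are started explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # AUTOINCREMENT: ids are never reused
        " topic TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    return conn


def push(conn, topic, message):
    """Append a JSON-encoded message and return its id."""
    cur = conn.execute(
        "INSERT INTO messages (topic, payload) VALUES (?, ?)",
        (topic, json.dumps(message)),
    )
    return cur.lastrowid


def pop(conn, topic):
    """Remove and return the oldest message for a topic, or None if there is none."""
    conn.execute("BEGIN IMMEDIATE")  # write lock first, so two readers cannot take the same row
    try:
        row = conn.execute(
            "SELECT id, payload FROM messages WHERE topic = ? ORDER BY id LIMIT 1",
            (topic,),
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else json.loads(row[1])
```

The id returned by `push` is the value a reply message could refer to in the req-rep variant.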
  

micvbang 7 months ago

I got tired of the pricing and/or complexity of running message queues/event brokers, so decided to play around with implementing my own. It utilizes S3 as the source of truth, which makes it orders of magnitude easier to manage and cheaper to run. There's an ongoing blog series on the implementation: https://github.com/micvbang/simple-event-broker


Jemaclus 7 months ago

For large applications in a service-oriented architecture, I leverage Kafka 100% of the time. With Confluent Cloud and Amazon MSK, infra is relatively trivial to maintain. There's really no reason to use anything else for this.

For smaller projects of "job queues," I tend to use Amazon SQS or RabbitMQ.

But just for clarity, Kafka is not really a message queue -- it's a persistent structured log that can be used as a message queue. More specifically, you can replay messages by resetting the offset. In a queue, the idea is once you pop an item off the queue, it's no longer in the queue and therefore is gone once it's consumed, but with Kafka, you're leaving the message where it is and moving an offset instead. This means, for example, that you can have many many clients read from the same topic without issue.

SQS and other MQs don't have that persistence -- once you consume the message and ack, the message disappears and you can't "replay it" via the queue system. You have to re-submit the message to process it. This means you can really only have one client per topic, because once the message is consumed, it's no longer available to anyone else.

There are pros and cons to either mechanism, and there's significant overlap in the usage of the two systems, but they are designed to serve different purposes.

The analogy I tend to use is that Kafka is like reading a book. You read a page, you turn the page. But if you get confused, you can flip back and reread a previous page. An MQ like RabbitMQ or Sidekiq is more like the line at the grocery store: once the customer pays, they walk out and they're gone. You can't go back and re-process their cart.

Again, pros and cons to both approaches.

"What didn't work out?" -- I've learned in my career that, in general, I really like replayability, so Kafka is typically my first choice, unless I know that re-creating the messages is trivial, in which case I am more inclined to lean toward RabbitMQ or SQS. I've been bitten several times by MQs where I can't easily recreate the queue, and I lose critical messages.

"Where did you regret adding complexity?" -- Again, smaller systems that are just "job queues" (versus service-to-service async communication) don't need a whole lot of complexity. So I've learned that if it's a small system, go with an MQ first (any of them are fine), and go with Kafka only if you start scaling beyond a single simple system.

"And if you stuck with a DB-based queue -- did it scale?" -- I've done this in the past. It scales until it doesn't. Given my experience with MQs and Kafka, I feel it's a trivial amount of work to set up an MQ/Kafka, and I don't get anything extra by using a DB-based queue. I personally would avoid these, unless you have a compelling reason to use it (eg, your DB isn't huge, and you can save money).


michaelmior 7 months ago

> This means you can really only have one client per topic, because once the message is consumed, it's no longer available to anyone else.
It depends on your use case (or maybe what you mean by "client"). If I just have a bunch of messages that need to be processed by "some" client, then having the message disappear once a client has processed it is exactly what you want.

- Jemaclus 7 months ago
  
  Absolutely, if you only ever have one client, SQS or a message queue is perfectly fine!
  
mlhpdx 7 months ago

We build applications very differently. SQS queues with 1000s of clients have been a go to for me for over a decade. And the opposite as well — 1000s of queues (one per client device, they’re free). Zero maintenance, zero cost when unused. Absurd scalability.

- Jemaclus 7 months ago
  
  Certainly. There are many paths to victory here.
  One thing to consider is whether you _want_ your producers to be aware of the clients or not. If you use SQS, then your producer needs to be aware of where it's sending the message. In event-driven architecture, ideally producers don't care who's listening. They just broadcast a message: "Hey, this thing just happened." And anyone who wants to subscribe can subscribe. The analogy is a radio tower -- the radio broadcaster has no idea who's listening, but thousands and thousands of people can tune in and listen.
  Contrast to making a phone call, where you have to know who it is that you're dialing and you can only talk to one person at a time.
  There are pros and cons to both, but there's tremendous value in large applications for making the producer responsible for producing, but not having to worry about who is consuming. Particularly in organizations with large teams where coordinating that kind of thing can be a big pain.
  But you're absolutely right: queues/topics are basically free, and you can have as many as you want! I've certainly done it the SQS way that you describe many times!
  As I mentioned, there are many paths to victory. Mine works really well for me, and it sounds like yours works really well for you. That's fantastic :)
  
- matt_s 7 months ago
  
  Hey I'm curious how the consumers of those queues typically consume their data, is it some job that is polling, another piece of tech that helps scale up for bursts of queue traffic, etc. We're using the google equivalent and I'm finding that there are a lot of compute resources being used on both the publisher and subscriber sides. The use cases here I'm talking about are mostly just systems trying to stay in sync with some data where the source system is the source of record and consumers are using it for read-only purposes of some kind.
  
  
  mlhpdx 7 months ago
  
  On the producer side I’d expect to see change data capture being directed to a queue fairly efficiently, but perhaps you have some intermediary that’s running between the system of record and the queue? The latter works, but yeah it eats compute.
  On the consumer side the duty cycle drives design. If it’s a steady flow then a polling listener is easy to right size. If the flow is episodic (long periods of idle with unpredictable spikes of high load) one option is to put an alarm on the queue that triggers when it goes from empty to non-empty, and handle that alarm by starting the processing machinery. That avoids the cost of constantly polling during dead time.
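
The empty-to-non-empty trigger can be illustrated with a toy in-memory queue. The `WakeOnFirstMessage` class and its `on_wake` callback are hypothetical stand-ins for a managed queue and its alarm; this is a single-threaded sketch of the idea, not how any cloud provider wires it up.

```python
from collections import deque


class WakeOnFirstMessage:
    """Queue that calls `on_wake` only when it goes from empty to non-empty."""

    def __init__(self, on_wake):
        self._items = deque()
        self._on_wake = on_wake

    def put(self, item):
        was_empty = not self._items
        self._items.append(item)
        if was_empty:
            self._on_wake()  # stand-in for the alarm that starts the consumers

    def drain(self):
        """Consumer side: take everything that is currently queued."""
        items = list(self._items)
        self._items.clear()
        return items
```

A burst of a thousand messages fires the callback once, and nothing runs at all while the queue stays empty.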
  

austin-cheney 7 months ago

I have so far gotten by plenty well writing my own queue systems to fit the needs of the consuming application. Normally the only place where I need queue systems is in distributed systems with rapid fire transmissions to ensure messages hit the network in time sequence order. The additional benefit is that network traffic is saved in order when the current network socket fails so that nothing is lost but time.


coolcase 7 months ago

1. Never do greenfield. But I've usually seen systems set up with the cloud "house white" queue: SQS, or the Azure queue, whatever it's called.

2. Nothing. It all worked out.

3. Nowhere. Generally used them for queue-y things.

4. Not done this. Even back in the 2000s, when queues weren't so well known, there'd be a queue-like system. Polling FTP, for example!


dmazin 7 months ago

No one ever seems to use it, but for AMQP I like Beanstalkd. It’s fast, stable and has not failed me under high RPS.


bilinguliar 7 months ago

This is my go-to solution as well. It is great, but utilizes just one CPU core. But if this is the problem, then your business is already booming.


csomar 7 months ago

Another option to consider: Cloudflare Workers. They have a simple queue but you'll need to patch it with a Worker anyway. This means you can programmatically manage the queue through the worker and also it makes it easy to send/receive HTTP requests.


stephenr 7 months ago

I've used Qless for several years;

For those unfamiliar, it's a Lua library that gets executed in Redis using one of the various language bindings (which are essentially wrappers around calling the Lua methods).

With our multi-node redis setup it seems to be quite reliable.


a_void_sky 7 months ago

Kafka for communication between microservices, and MQTT (VerneMQ) for IoT devices


toomuchtodo 7 months ago

What are your thoughts on Apache Pulsar vs Kafka?

oulipo 7 months ago

I'm hesitating with EMQx, have you tried it? why did you choose VerneMQ?


clark-kent 7 months ago

SQS. For Ruby apps I use Shoryuken with SQS.


z3ugma 7 months ago

Does Google Cloud Tasks count?


yesnomaybe 7 months ago

Been on Kafka (MSK) for a couple of years. I find the programming model and getting everything perfectly set up to be sitting behind a steep learning curve, to my surprise. For example, at some point I had a timestamp header but only very much later realised that it all ends up as number[] on the consumer side. So I lost data. My fault, but still. I came to the realisation that the programming model especially in MSK is rather unintuitive.
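
For context on the header problem: Kafka header values are raw bytes, so producer and consumer have to agree on an encoding themselves. The sketch below shows one possible convention, epoch milliseconds packed as an 8-byte big-endian integer; the convention and the function names are assumptions for illustration, not what was used in the setup described here.

```python
import struct


def encode_timestamp_header(epoch_millis: int) -> bytes:
    """Pack epoch milliseconds as an 8-byte big-endian signed integer."""
    return struct.pack(">q", epoch_millis)


def decode_timestamp_header(raw) -> int:
    """Accept bytes or a list of byte values (how some clients surface headers)."""
    (value,) = struct.unpack(">q", bytes(raw))
    return value


raw = encode_timestamp_header(1_700_000_000_000)
as_numbers = list(raw)  # what a consumer may see: a plain list of byte values
restored = decode_timestamp_header(as_numbers)
```

Whatever convention is chosen, writing it down next to the producer code makes it harder for the consumer side to misread the bytes later.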

I found it hard to shift mentally from MSK and its event triggers back to regular consumers spun up in containers etc., but that is also more about MSK than Kafka.

I am currently swapping out the whole pub/sub layer to MongoDB change streams, which I have found to be working really well. For queuing it attempts to lock on read so I can scale consumers with retry / backoff etc. Broadcast is simple and without locking, auto delete in Mongo.

I will have to see how it really scales, and I'm sure I'm trading one problem for another, but it will definitely help to remove a moving part. Overall, the app is rather low volume with the occasional spike. I would have stayed with Kafka if there were, let's say, >100 rpm on the core functions.


tacostakohashi 7 months ago

UUCP


RedShift1 7 months ago

People will call me crazy but why not SMTP for message queueing?

- mdaniel 7 months ago
  
  Because it's only INSERT not SELECT nor DELETE? Maybe you meant IMAP which does have APPEND <https://www.rfc-editor.org/rfc/rfc3501.html#section-6.3.11> to insert a new message into a folder, and a bazillion SELECT options, plus of course DELETE
  I still would call that crazy, because of the mental tax of explaining to every new employee "wait, you're using IMAP for what?" but if it works for you, then great
  

MichaelMoser123 7 months ago

using zeebe/Camunda at work. The system gives you a way of designing and partitioning message-based workflows. It has a very thorough design.


kabes 7 months ago

We had a lot of reliability issues with Zeebe/Camunda (granted we started using it at version 0.10), and now they have also rug-pulled the free version. So I would never go near that company again.

- MichaelMoser123 7 months ago
  
  reliability is much better now, as far as i can tell.
  

kabes 7 months ago

Maybe start by explaining what you want to use it for?


nop_slide 7 months ago

Solid queue in rails

Reply View 0 replies

ok1984 7 months ago

Surprised nobody is mentioning ActiveMQ!

Reply View 0 replies

DonHopkins 7 months ago

What do people think of Supabase?

Reply View 1 reply

RedShift1 7 months ago

It's not a message queue?

Reply View | 0 replies

smittywerben 7 months ago

Kafka is a write-ahead log, not a queue per se. It handles transactions to the disk. Not across the network.

RabbitMQ is neat out of the box. But I went with ZeroMQ at the time.

ZeroMQ is cool, but in the current year I'd only use it to learn from their excellent documentation. Coming from Python, it taught me about Berkeley sockets and the process of building cross-language messaging patterns. After a few projects, it's like realizing I didn't need ZeroMQ to begin with; I could make my own! If ZeroMQ's Hintjens were still with us I'd still be using it.

It's like the documented incremental process of designing a messaging queue to fit your problem domain, plus a thin wrapper easing some of the lower-level socket nastiness. At least that's my experience using it over the years. Me talking about it won't do it enough justice.
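
To make the "lower level socket nastiness" concrete, here is a minimal sketch of one piece such a wrapper hides: message framing over a stream socket. This is not ZeroMQ's wire protocol; the 4-byte length prefix and the names `send_msg`, `recv_msg` and `demo` are assumptions made up for illustration.

```python
import socket
import struct

# Assumed framing: a 4-byte big-endian length, then the payload bytes.
HEADER = struct.Struct("!I")


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message as <length><payload> so the receiver can find its end."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one whole message previously written by send_msg."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)


def demo() -> list:
    """Push three messages through a connected socket pair and read them back."""
    a, b = socket.socketpair()
    try:
        send_msg(a, b"hello")
        send_msg(a, b"")  # empty messages survive framing too
        send_msg(a, b"world")
        return [recv_msg(b), recv_msg(b), recv_msg(b)]
    finally:
        a.close()
        b.close()
```

A stream socket has no message boundaries of its own, so without something like the length prefix the three sends could arrive as one blob.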

NATS does the lower level socket wrapper part very nicely. It's a bit more modern too. Golang's designed to be like a slightly nicer C syntax, so it would make sense that it's high performance and sturdy. So it's similar to ZeroMQ there.

I'm not sure if either persists to disk out of the box. So either of these is going to be simpler and faster than Kafka.

The DB people are probably trying too hard to cater to the queues. Ideally I'd have normalized the data and modeled the relations such that transactions don't lock up the whole table. Then I started questioning why I needed a queue at all when databases (sans SQLite which is fast enough as is) are made for pooling access to a database.

Kafka supports pipelining to a relational database but this part is where you kind of have to be experienced to not footgun and I'm not at that level. I think using it as a queue in that you're short-circuiting it from the relational database pipeline is non-standard for Kafka. I suspect that's where a lot of the Kafka hate is from. I could understand if the distributed transactions part is hell but at that point it's like why'd you skip the database then? Trying to get that free lunch I assume.

I have an alternative. Try inserting everything into a SQLite file. Running into concurrency issues? Use a second SQLite file. Two computers? Send it over the network. More issues? Since it's SQL, just switch to a real database that will pool the clients. Or switch to five of them. SQL is sorta cool that way. I assume that would avoid reimplementing half of the JVM to sync across computers, where you get Oracle Java showing up to sell you their database halfway into making your galactic-scale software or whatever.
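
For what "insert everything into a SQLite file" might look like as a queue, here is a minimal sketch using Python's built-in `sqlite3`. The `jobs` table, its columns, and the `open_queue`, `enqueue`, `claim` and `ack` helpers are hypothetical names for illustration, not anything from a real library.

```python
import sqlite3
import time


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the queue. The `jobs` table is an assumed schema."""
    # isolation_level=None means autocommit; transactions are opened explicitly below.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               payload TEXT NOT NULL,
               claimed_at REAL
           )"""
    )
    return conn


def enqueue(conn: sqlite3.Connection, payload: str) -> int:
    """Append a job and return its id."""
    cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim(conn: sqlite3.Connection):
    """Claim the oldest unclaimed job as (id, payload), or None if there is none."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE claimed_at IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE jobs SET claimed_at = ? WHERE id = ?", (time.time(), row[0])
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def ack(conn: sqlite3.Connection, job_id: int) -> None:
    """Delete a finished job."""
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
```

Because the queries are plain SQL, moving to "a real database" later mostly means replacing the `BEGIN IMMEDIATE` claim step with that database's own row-locking mechanism.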

I must be stressed today. Thanks for asking.

Reply View 3 replies

smittywerben 7 months ago

I already spent time proofreading it, but I forgot that I couldn't edit it after so much time. Unless the merciful admin lets me roll the 12-sided dice to replace the above cringe with the below piece I carved out of soap.
---
I always check for maintained libraries for my programming languages for any messaging library. Bindings in many languages are consistent across Kafka, ZeroMQ, and NATS.
Kafka is a write-ahead log, not a queue per se. It handles transactions to the disk. The networking is a simple broadcast, not a shared queue. You also can't (canonically, at least) pop/insert/delete rows. It's append-only. It can do basic seeking, like replaying from the start.
ZeroMQ is a good choice for learning from its excellent documentation, and programmers interested in C programming. Probably a good lead into Beej's networking guide. ZeroMQ is the odd one as it has no central broker ("Zero" for zero broker); you copy your favorite broker.py pattern from the ZeroMQ guide.
Dropping anchor to throw in the POSIX standard sockets, the BSD kqueue, the Linux epoll, newer io_uring, and libuv for boring cross-platform asynchronous I/O.
https://zguide.zeromq.org/docs/preface/
https://beej.us/guide/bgnet/html/split/
https://pubs.opengroup.org/onlinepubs/9699919799/functions/p...
https://pubs.opengroup.org/onlinepubs/9699919799/functions/s...
https://man.freebsd.org/cgi/man.cgi?kqueue
https://man7.org/linux/man-pages/man7/epoll.7.html
https://man7.org/linux/man-pages/man7/io_uring.7.html
https://docs.libuv.org/en/v1.x/
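
The "write-ahead log, not a queue" point above can be sketched in a few lines. This is a toy, not Kafka: the `AppendOnlyLog` class, the JSON-lines file and the line-number offsets are assumptions made up for illustration.

```python
import json
import os
import tempfile


class AppendOnlyLog:
    """Toy append-only log: one JSON document per line, offset = line number."""

    def __init__(self, path: str):
        self.path = path
        # "a+" creates the file if it is missing and lets us count existing records.
        with open(path, "a+", encoding="utf-8") as f:
            f.seek(0)
            self._next_offset = sum(1 for _ in f)

    def append(self, record: dict) -> int:
        """Append a record and return its offset. There is no pop or delete."""
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
        offset = self._next_offset
        self._next_offset += 1
        return offset

    def read_from(self, offset: int = 0):
        """Yield (offset, record) pairs starting at the given offset."""
        with open(self.path, encoding="utf-8") as f:
            for i, line in enumerate(f):
                if i >= offset:
                    yield i, json.loads(line)


def demo():
    """Two readers with independent positions over the same log."""
    path = os.path.join(tempfile.mkdtemp(), "data.jsonl")
    log = AppendOnlyLog(path)
    for n in range(3):
        log.append({"n": n})
    replay = [rec["n"] for _, rec in log.read_from(0)]  # replay from the start
    tail = [rec["n"] for _, rec in log.read_from(2)]    # seek to a later offset
    return replay, tail
```

Readers never remove anything; each one just remembers how far it has read, which is why replaying from the start is cheap and deleting a row is not on the menu.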

Reply View | 0 replies
j45 7 months ago

It’s less about DB people or not.
Many of us have used both sides and settled on one area to start.
Kafka et al. are amazing. Also almost always overkill in the first x months or years.
It’s not too much of a stretch to model your queue first in something like Postgres, which oddly offers things a little beyond a traditional RDBMS, and when the model implementation in the domain reveals itself… it can shine a nice light in the direction of a Kafka, etc.

Reply View | 1 reply
- smittywerben 7 months ago
  
  Sir, your reply is more coherent than mine. I'll give you props for that.
  Still, I disagree that Kafka is always overkill.
  When god opened a datagram socket on your computer, you needed to have been capturing this data X months ago, but weren't paying attention. You need to build warbot.py and put it into production before you have the chance to deal with cold storage. Kafka is my go-to if you can do this before you run out of disk space.
  I frequently append JSON lines to a "data.json" file in Python. Add a socket server, compression, and typed bindings for 20+ languages. Boom. Kafka. Don't oversell it. Need to delete a row? Congratulations, you selected the wrong tool. It's appending JSON lines to a file. Kafka is a write-ahead log, not a queue.
  To your point about Postgres, I've found Postgres has fantastic JSONB support and awesome developers who have been very influential in my life and whom I admire. Postgres is my preferred cold storage, which I connect to Kafka. It feels like swimming upstream because RDBMSs are traditionally for normalized data, not denormalized JSON lines that make XML look hip again.
  If you have a choice in DB, Postgres' JSONB has helped me avoid unnecessary normalization steps. It's good to have options.
  ZeroMQ would call this the Titanic pattern and mic drop because the guide has a section on it. That's why I like ZeroMQ.
  Edit: Apologies for typos/brevity. I have an ancient phone that only works with 20% of the web and phone apps. There are no apps or LLMs to help this dyslexic soul.
  Reference for the Titanic pattern. The guide's author is cynical about me shoving spinning rust in the middle, but it doesn't say no. https://zguide.zeromq.org/docs/chapter4/#Disconnected-Reliab...
  
  Reply View | 0 replies

atombender 7 months ago

It's important to distinguish between the use cases. Queues, streams, logs, databases, etc. are different kinds of tools you can use, and what the right tool is depends on your semantics.

For example: Message queues are good for work that must be done in strict order where you want to deal with one message at a time. They aren't such a great fit for large batch movement of data, like logs or high volume events, because having a per-message acknowledgement state requires a lot of round trips over the network that simply aren't needed; you want to treat the flow in bulk and carve out big chunks of it, because CPUs and networks and disks are more efficient when doing the same operation over large amounts of data in one go.

If you are executing "tasks" (like image processing, ML inference, webhooks), ordering by insertion order might not be the right choice, either. Sometimes you want to coalesce (dedupe by key). Sometimes you want to ensure the processing for a key (e.g. a customer ID) is done in the same process and not randomly distributed over all your workers. Sometimes you want delivery to be strictly sequential, requiring an exclusive worker rather than massively parallel fan-out. And so on.
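
Two of the task semantics above, coalescing by key and pinning a key to one worker, can be sketched like this. The helper names (`coalesce`, `worker_for`, `route`) and the choice of CRC32 as the hash are assumptions for illustration, not how any particular broker does it.

```python
import zlib
from collections import OrderedDict


def coalesce(tasks):
    """Dedupe by key, keeping only the latest payload per key.

    The result is ordered by each key's first appearance.
    """
    latest = OrderedDict()
    for key, payload in tasks:
        latest[key] = payload  # overwriting keeps the key's original position
    return list(latest.items())


def worker_for(key: str, num_workers: int) -> int:
    """Stable key -> worker index.

    CRC32 is used because Python's built-in hash() for str is randomized
    per process, which would break the mapping across workers.
    """
    return zlib.crc32(key.encode("utf-8")) % num_workers


def route(tasks, num_workers: int):
    """Split tasks into per-worker lanes, preserving order within each key."""
    lanes = [[] for _ in range(num_workers)]
    for key, payload in tasks:
        lanes[worker_for(key, num_workers)].append((key, payload))
    return lanes
```

With `route`, every task for a given customer ID lands in the same lane in its original order, so one exclusive worker per lane gives per-key sequential processing while different keys still run in parallel.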

Where I work, we use a mix of things depending on the application. I am a big fan of NATS. It's not itself a message queue, but its primitives can be combined to handle all sorts of behaviors. Core NATS is more like ephemeral pub/sub, while Jetstream gives you durable, highly available Kafka-like streams.

I like combining queues with database state. Use the queue as an efficient way to order items (like jobs or events) for massively scalable distribution, and use the database to store the current state of things.

For example, imagine you're delivering webhook messages. We first store the message in the database with the state "pending", then write an event to the queue about it. The worker receives the event, double-checks its state is still "pending", then executes it. If delivered, mark as "done" and ack the message. Otherwise, mark as "failed" and create a new queue message to retry. This way, you have durable state in a solid database, and the queue is an efficient way to coordinate the workers. (There's a bit more work here to ensure consistency, but this is the gist of it.)
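
A minimal sketch of that webhook flow, with SQLite standing in for the database and an in-process `queue.Queue` standing in for the broker. The `webhooks` table, the `deliver` callback and `max_attempts` are hypothetical, and one simplification is assumed: a row stays "pending" while retries remain and only becomes "failed" once they run out.

```python
import queue
import sqlite3


def setup():
    """Create the state store and the queue (both in-process stand-ins)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE webhooks ("
        "id INTEGER PRIMARY KEY, url TEXT, state TEXT, attempts INTEGER DEFAULT 0)"
    )
    return db, queue.Queue()


def submit(db, q, url: str) -> int:
    """Store the webhook as 'pending' first, then publish an event about it."""
    cur = db.execute("INSERT INTO webhooks (url, state) VALUES (?, 'pending')", (url,))
    db.commit()
    q.put(cur.lastrowid)
    return cur.lastrowid


def work_one(db, q, deliver, max_attempts: int = 3):
    """Handle one queue event. Returns the new state, or None if nothing was done."""
    try:
        hook_id = q.get_nowait()
    except queue.Empty:
        return None
    row = db.execute(
        "SELECT url, state, attempts FROM webhooks WHERE id = ?", (hook_id,)
    ).fetchone()
    if row is None or row[1] != "pending":
        return None  # stale or duplicate event: the database is the source of truth
    url, _, attempts = row
    attempts += 1
    if deliver(url):
        state = "done"
    elif attempts < max_attempts:
        state = "pending"
        q.put(hook_id)  # new queue message to retry later
    else:
        state = "failed"
    db.execute(
        "UPDATE webhooks SET state = ?, attempts = ? WHERE id = ?",
        (state, attempts, hook_id),
    )
    db.commit()
    return state
```

The state check is what makes duplicate or replayed queue messages harmless: a second event for an already "done" row is dropped without calling `deliver` again.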

Core NATS is fantastic as a communication primitive between ephemeral processes. You can use it for RPC, for lightweight broadcasts (e.g. reload config everywhere), even for things like leases or caching or similar. Jetstream is like Kafka but more flexible; for example, each message has a subject that consumers can filter on with wildcards, so different consumers can very efficiently filter a big, commingled stream by interest. In Jetstream streams, messages have per-consumer ack/nack state in addition to a position, so you're not limited to Kafka's linear "position". Overall, a superb data model, and very easy to manage as infra.
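
To illustrate the subject filtering, here is a small pure-Python re-implementation of NATS-style subject matching, where `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. It is a sketch of the semantics only, not the NATS client API; `subject_matches` and `filter_stream` are made-up names.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a literal subject matches a NATS-style wildcard pattern."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least one token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # pattern is longer than the subject
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


def filter_stream(messages, pattern: str):
    """Keep only the (subject, data) pairs a consumer with this filter would see."""
    return [(s, d) for s, d in messages if subject_matches(pattern, s)]
```

For example, a consumer filtering on `orders.*.created` would see `orders.eu.created` but not `orders.eu.cancelled`, while `orders.>` would see both from the same commingled stream.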

One weak point with NATS is a maximum message size of 10MB. This means that you sometimes have to invent your own chunking if your application needs to send larger payloads. Doing this opens up some cans of worms, so I honestly wouldn't recommend it. For large batch stuff, Redpanda is a better option.

Reply View 0 replies

mlhpdx 7 months ago

SQS

Reply View 0 replies

catkitcourt 7 months ago

Pulsar

Reply View 0 replies

varbhat 7 months ago

NATS

Reply View 3 replies

RedShift1 7 months ago

I use NATS too! It has worked very well for me, using it to collect data from IoT devices. I don't really like all the other bits they tacked on like jetstream and object store, that seems beyond its scope. Subject authorization is also painful to implement. But runtime behaviour has been flawless for me.

Reply View | 2 replies
- pdimitar 7 months ago
  
  Do you have any links explaining the subject authorization? I have recommended NATS for a project that got scrapped.
  
  Reply View | 1 reply
  
  RedShift1 7 months ago
  
  Docs: https://docs.nats.io/running-a-nats-service/configuration/se...
  Example: https://natsbyexample.com/examples/auth/callout/java
  
  Reply View | 0 replies

wetpaws 7 months ago

[dead]

Reply View 0 replies

revskill 7 months ago

A cron job did the work.

Reply View 0 replies