In software design, it’s not uncommon to run into the need for a queue. Perhaps you have certain tasks that should be scheduled for background processing. Maybe a type of request needs to be handled in batches.
Why the Flak?
Regardless of the reason, we’ve seen customers and fellow engineers squirm at the suggestion of using a database table to handle the storage of queue items. Here’s an incomplete list of reasons we generally hear:
- “We should use a proper queueing technology, like AWS SQS, RabbitMQ, ActiveMQ, Kafka, or IronMQ. MySQL and PostgreSQL aren’t designed for this! This feels wrong!”
- “The performance will be horrible! We should at least be using something that will be in-memory!”
- “If anyone sees a design like this, they’ll think we’re idiots!”
- “This won’t scale well when our growth curve starts double hockey sticking!”
- “Will it even be transaction-safe? What about handling dead letters? Why even bother?”
Some of these are good points. Performance may matter, eventually. A queueing service may allow for a better eventual architecture. Someone may think you’re an idiot. But let’s look at the bigger picture.
A Case For The RDBs
Software architecture can be a challenging process. This is because most software is built to address a problem whose best solution will only be fully realized over a long span of time. Your understanding of what the problem is and how it will best be solved will be quite different at the end than it was at the start, even with adequate discovery efforts, planning, and experience.
You may start out thinking “Hey, I have a bunch of commands that are going to be consumed by a worker process. I should put them in a queue!” So, you install a queue service on your server and configure a queue. It works great. Look at how fast it is since it’s all in memory! Grabbing an item off of the queue takes 2 ms!
Some time passes and you add more worker processes. Some of these worker processes run on a server on the west coast and some on the east coast, to increase uptime in case of a regional outage. Great! But you’re seeing that performance isn’t scaling linearly. Every time you add a worker, it seems that you’re only getting around 60% of the gain you would expect. It turns out that many of the commands in the queue process more quickly when they are grouped together and handled by the same worker rather than being interspersed across workers at different locations. Perhaps it has to do with some sort of context/memory switching that the worker has to perform. OK, no problem, we’ll somehow group those items together, right?
Wrong. You’re using a queueing technology that only allows FIFO operations, so there’s no way for you to see further into the queue without popping items off of it. So, bad news.
Another issue arises. Some queue items aren’t processable when you pop them off the queue. You want an hour to pass before retrying them. But wait, you can’t have the worker just sit there and hold onto the item. You also don’t want to set up another queue just for these delayed items. Even worse, you may be using a technology like RabbitMQ that doesn’t really allow for delayed visibility of queue items without a plugin or some strange dead-letter queue hacking. More bad news!
Yes, there are architectures that can be implemented to get around both of these situations. But… that’s not the point.
What’s the Point, Old Man?
The point is that a relational database, like MySQL for instance, doesn’t have these limitations. It’s not a high-performance queueing technology, but it is a very general, flexible tool that will let you do just about whatever you need, unlike a more specialized technology such as a queueing or messaging service.
You think you know what you want to accomplish, and you may be right, but you may not be. I like the saying “build first with wood, then steel”. If you’re unsure, still discovering the right solution, and still proving that your software is actually going to get used, work with malleable tools that let you make changes and adjustments with ease. If you see, down the road, that performance is really hurting you, make the switch to a queueing system with confidence.
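To make that concrete, here is a minimal sketch of a table-backed queue that handles both problems from the story above: grouping related items for one worker, and hiding an item for an hour before a retry. Everything in it is an assumption made for illustration: the queue_items table, its columns (group_key, visible_at, claimed_by), and the helper functions are hypothetical names, and Python's built-in sqlite3 stands in for MySQL or PostgreSQL so the example runs anywhere. It uses a single connection; with several concurrent workers on MySQL 8.0+ or PostgreSQL you would likely add row locking such as SELECT ... FOR UPDATE SKIP LOCKED to the claim step.

```python
import sqlite3

# Hypothetical schema: one row per queue item.
SCHEMA = """
CREATE TABLE queue_items (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    group_key  TEXT    NOT NULL,  -- items that process faster together
    payload    TEXT    NOT NULL,
    visible_at INTEGER NOT NULL,  -- epoch seconds; hidden from workers until then
    claimed_by TEXT               -- NULL while nobody is working on the item
);
"""

def enqueue(conn, group_key, payload, now):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO queue_items (group_key, payload, visible_at) "
            "VALUES (?, ?, ?)",
            (group_key, payload, now),
        )

def claim_group(conn, worker, now):
    """Claim all visible, unclaimed items sharing the oldest item's group."""
    available = "claimed_by IS NULL AND visible_at <= ?"
    with conn:
        head = conn.execute(
            f"SELECT group_key FROM queue_items WHERE {available} "
            "ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if head is None:
            return []  # nothing is ready right now
        rows = conn.execute(
            f"SELECT id, payload FROM queue_items WHERE {available} "
            "AND group_key = ? ORDER BY id",
            (now, head[0]),
        ).fetchall()
        conn.executemany(
            "UPDATE queue_items SET claimed_by = ? WHERE id = ?",
            [(worker, item_id) for item_id, _ in rows],
        )
        return rows

def retry_later(conn, item_id, now, delay_seconds=3600):
    """Release an item and hide it until the delay has passed."""
    with conn:
        conn.execute(
            "UPDATE queue_items SET claimed_by = NULL, visible_at = ? "
            "WHERE id = ?",
            (now + delay_seconds, item_id),
        )

def complete(conn, item_id):
    with conn:
        conn.execute("DELETE FROM queue_items WHERE id = ?", (item_id,))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for group, payload in [("a", "a1"), ("b", "b1"), ("a", "a2")]:
    enqueue(conn, group, payload, now=0)

# worker-1 gets both "a" items even though "b1" was enqueued between them.
batch = claim_group(conn, "worker-1", now=0)
# "a1" could not be processed, so it goes back in the queue, hidden for an hour.
retry_later(conn, batch[0][0], now=0)
```

Neither feature needed a second queue or a new piece of infrastructure. Grouping is a WHERE clause and the retry delay is a timestamp column, which is the kind of adjustment that stays cheap while you are still learning what the queue has to do.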