bitly’s nsq has some good ideas


NSQ is a realtime message processing system designed to operate at bitly’s scale, handling billions of messages per day.

It promotes distributed and decentralized topologies without single points of failure, enabling fault tolerance and high availability coupled with a reliable message delivery guarantee.

No SPOFs and reliable message delivery, without relying on something like ZooKeeper, is a big claim. They have some novel approaches to these problems.

First, they run a directory daemon, nsqlookupd, alongside the actual queue servers (nsqd). The queue servers register themselves with the lookup daemons, and consumers ask the lookup daemons what to connect to, so consuming applications never need to be configured with the addresses of the actual queue servers. They then run multiple lookup daemons, which don’t coordinate with each other and don’t need to agree in order for the system to operate properly.
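Here is a rough sketch of why the lookup daemons don't need to agree: a consumer can simply ask all of them and take the union of the answers. This is not NSQ's client code; the `discover` function and the three `lookup_*` stand-ins are made up for illustration, and no real network calls are involved.

```python
from typing import Callable, Iterable, Set, Tuple

Address = Tuple[str, int]  # (host, port) of a queue server


def discover(lookups: Iterable[Callable[[str], Iterable[Address]]],
             topic: str) -> Set[Address]:
    """Union the answers of every reachable lookup daemon for `topic`."""
    found: Set[Address] = set()
    for lookup in lookups:
        try:
            found.update(lookup(topic))
        except OSError:
            # One lookup daemon being down is fine; the others still answer.
            continue
    return found


# Hypothetical stand-ins for three lookup daemons with differing views.
def lookup_a(topic):
    return [("queue1", 4150)]


def lookup_b(topic):
    return [("queue1", 4150), ("queue2", 4150)]


def lookup_c(topic):
    raise ConnectionError("lookup daemon unreachable")


servers = discover([lookup_a, lookup_b, lookup_c], "clicks")
print(sorted(servers))
```

Because the result is a set union, a stale or missing answer from one daemon only costs you a queue server that daemon hadn't heard about yet, and nobody has to run an election to decide who is right.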

Reliable message delivery is provided with at-least-once delivery semantics: a message can show up more than once, so all consumers are required to de-duplicate messages or restrict themselves to idempotent operations. Not exactly legacy friendly, as many applications are coded with the assumption of a closed, one-shot world. But. Idempotence: I highly recommend it if you have the means.
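To make the consumer side concrete, here is a minimal sketch of de-duplication on top of at-least-once delivery. The `DedupingHandler` class is hypothetical, and it assumes every message carries a stable ID and that an in-memory set is good enough; a real consumer would want something persistent with expiry.

```python
class DedupingHandler:
    """Wraps a handler so each message ID is processed at most once."""

    def __init__(self, handle):
        self._handle = handle
        self._seen = set()  # in-memory only; an assumption for this sketch

    def on_message(self, message_id, body):
        if message_id in self._seen:
            return False  # duplicate delivery, safely ignored
        self._handle(body)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can still be retried on redelivery.
        self._seen.add(message_id)
        return True


totals = []
handler = DedupingHandler(totals.append)

handler.on_message("m1", 10)
handler.on_message("m2", 5)
handler.on_message("m1", 10)  # redelivered by the queue

print(sum(totals))
```

The alternative is to make the operation itself idempotent (set a value rather than increment it, say), in which case you can skip the bookkeeping entirely.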

If you need to prevent losing messages due to the FBI stealing your servers, which is something you definitely need to account for, you can set up redundant pairs of servers that each receive a copy of the same messages, and rely on deduplication/idempotence to make sure you’re only processing each message once, even if you consume it multiple times.
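A toy version of the redundant-pair idea, with plain lists standing in for the two queue servers. Everything here is made up for illustration. In particular, the sketch assumes the publisher assigns its own ID to each message so that the copies on both servers can be matched up.

```python
import uuid


def publish(servers, body):
    """Write the same message, with one publisher-assigned ID, to every server."""
    message = (uuid.uuid4().hex, body)
    for queue in servers:
        queue.append(message)


def consume(servers):
    """Read from every server, yielding each distinct message body once."""
    seen = set()
    for queue in servers:
        for message_id, body in queue:
            if message_id not in seen:
                seen.add(message_id)
                yield body


primary, replica = [], []
publish([primary, replica], "signup:alice")
publish([primary, replica], "signup:bob")

processed_both = list(consume([primary, replica]))

primary.clear()  # one server of the pair disappears
processed_after_loss = list(consume([replica]))

print(processed_both, processed_after_loss)
```

With both servers up, the duplicates are filtered out; with one gone, the surviving copy still delivers everything.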

In summary: lots of good ideas here. Perhaps some of them could be applied to how people are using Resque?