*: consider merging absolute8511/nsq fork (HA, replication, in-order delivery, log-based) #887

Open
absolute8511 opened this issue Apr 17, 2017 · 15 comments

@absolute8511
Contributor

NSQ is great for most use cases, and we have been using it for a long time. However, there are some important features we urgently need, so we redesigned NSQ to add them. The main features are:

  • Replication and high availability
  • Automatic balancing and migration
  • In-order message delivery
  • Consumption of historical messages
  • Dynamic log level and more detailed logs
  • More stats in nsqadmin
  • Performance improvements

After the redesign, this version has been used at Youzan.com for several months and has processed 150B messages so far.

To give back to the original project, we have open sourced the entire redesigned project, including the client SDKs, and if possible we would like to merge it into master. Since there are many differences between the two projects, it may be very hard work.
The forked project: https://github.com/absolute8511/nsq

Thanks again for the great work on NSQ.

@ploxiln
Member

ploxiln commented Apr 17, 2017

Merging back into this project would have to be done one feature at a time. And as you say, it'll be a lot of work - not just because of merge conflicts, but also because of design review and revision.

It appears that you've integrated etcd into your fork - that's something we've thought about. Can you describe how your fork uses it exactly?

@mreiferson
Member

Wow, thanks, this is great!

In a similar vein to @ploxiln's comments, I would love to start by reading some design and architecture docs to better understand how you've accomplished all of this. After that, we can decide what makes the most sense in terms of merging it into mainline.

@absolute8511
Contributor Author

Yes, we use etcd as the storage for the cluster metadata, such as the topic partition count, the replica configuration, the node placement strategy, and some other options. @ploxiln
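For illustration only, here is a minimal Go sketch of how per-topic cluster metadata could be stored in etcd. The key layout, field names, and use of the etcd v3 client are my assumptions, not the fork's actual schema (the fork may well use the older v2 API):

```go
package main

import (
	"context"
	"encoding/json"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// TopicMeta is a hypothetical shape for per-topic cluster metadata.
type TopicMeta struct {
	PartitionNum int    `json:"partition_num"` // number of partitions
	Replica      int    `json:"replica"`       // replication factor
	Placement    string `json:"placement"`     // node placement strategy
}

// saveTopicMeta writes one metadata record per topic under an assumed key prefix.
func saveTopicMeta(cli *clientv3.Client, topic string, meta TopicMeta) error {
	data, err := json.Marshal(meta)
	if err != nil {
		return err
	}
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()
	_, err = cli.Put(ctx, "/nsq/cluster/topics/"+topic+"/meta", string(data))
	return err
}
```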

There are some documents under doc (including the slides I presented at the GopherChina Conference), and I am preparing a more detailed article describing what we have done in the fork. @mreiferson

@ploxiln
Member

ploxiln commented Apr 18, 2017

looks like the document useful for English speakers is this one: https://github.com/absolute8511/nsq/blob/master/doc/NSQ-redesigned-details.pdf

One of the big changes is to use a shared file-based queue for all channels. That's similar to the stalled effort in #625 - so that is indeed a topic we're interested in. Can you summarize how requeued messages and deferred-published messages are handled? (I can think of one or two ways, but I'm curious about your strategy and how it works out in practice.)

Is the performance improvement in your fork mostly from this change, from go channels to the shared file-based queue?

Another big change - even bigger, I'd say - is the system of assigning "topic leader" and "topic follower" roles to nsqd processes, using etcd to coordinate. I understand that this is so a "leader" can die abruptly, and a "follower" will have all of the successfully-published messages, and can take-over. Please correct me if I'm wrong: does this mean that all processes which publish a message on Topic "A" must publish to the same single nsqd, the "leader"? Currently, separate processes which are producing messages for a single topic can separately publish to N independent nsqd if desired, for the same topic "A". This change would seem to significantly limit the scalability of nsqd for handling a single topic. That may be a trade-off some users are willing to make for this functionality. But again, correct me if I'm wrong :)

@absolute8511
Contributor Author

absolute8511 commented Apr 19, 2017

For requeues, we currently have two strategies: the first puts the message back into the requeue channel, and the second puts it back at the end of the disk file. So for deferred publishes or deferred requeues, the deferred time may be imprecise. This imprecision cannot be avoided when there are lots of deferred messages. The priority queue NSQ uses for deferred messages is not enough when too many messages are deferred, so we are planning to use a hierarchical timing wheel to store them on disk (details are still under discussion).

Besides the file-based queue, there are some other improvements, such as group commit to reduce IO, an ID generator using an atomic int64, a buffer pool to reduce GC pressure, and prefetching on disk reads to reduce syscalls.
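For illustration, a minimal Go sketch of two of these ideas, the atomic-int64 ID generator and a sync.Pool-based buffer pool; names and structure are assumptions, not the fork's actual code:

```go
package main

import (
	"bytes"
	"sync"
	"sync/atomic"
)

// idGenerator hands out monotonically increasing IDs with a single atomic add.
type idGenerator struct {
	last int64
}

func (g *idGenerator) NextID() int64 {
	return atomic.AddInt64(&g.last, 1)
}

// bufPool reuses byte buffers across messages to reduce GC pressure.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// withBuffer borrows a buffer from the pool for the duration of fn.
func withBuffer(fn func(*bytes.Buffer)) {
	b := bufPool.Get().(*bytes.Buffer)
	b.Reset()
	fn(b)
	bufPool.Put(b)
}
```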

About the leader and follower, you're correct. To address scalability we introduced partitions: a topic can have several partitions, and each partition of a topic can have an independent leader to handle reads and writes.
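To make the partition/leader idea concrete, here is a hypothetical Go sketch of how a producer might route a publish to the leader of one partition; the lookup source and hashing scheme are assumptions, not the fork's actual design:

```go
package main

import "hash/fnv"

// partitionLeader is a hypothetical view of the metadata a producer would
// look up (e.g. via the lookup service backed by etcd) for each partition.
type partitionLeader struct {
	Topic      string
	Partition  int
	LeaderAddr string // the nsqd leader handling reads/writes for this partition
}

// pickLeader routes a publish to one partition's leader by hashing a key.
func pickLeader(leaders []partitionLeader, key string) partitionLeader {
	h := fnv.New32a()
	h.Write([]byte(key))
	idx := int(h.Sum32() % uint32(len(leaders)))
	return leaders[idx]
}
```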

@ploxiln
Member

ploxiln commented Apr 19, 2017

ah, partitions, reminds me of Kafka

@shinzui

shinzui commented Apr 20, 2017

This is really exciting. I hope this work could make its way back to the main project.

@absolute8511
Contributor Author

absolute8511 commented Apr 21, 2017

Yeah, I think the first thing that needs to be done is: every message should be written to the disk file, without relying on the channel overflowing to disk. All these features are based on this assumption.

@mreiferson
Member

Yeah, I think the first thing that needs to be done is: every message should be written to the disk file, without relying on the channel overflowing to disk. All these features are based on this assumption.

I agree with this as a pre-requisite. Have you reviewed the work in #625?

@mreiferson mreiferson changed the title Add some great features for the NSQ and maybe merge to master *: consider merging absolute8511/nsq fork (HA, replication, in-order delivery, log-based) Apr 22, 2017
@absolute8511
Contributor Author

absolute8511 commented Apr 25, 2017

After reviewing the changes, I noticed that a WAL is added for each topic. In my redesigned solution, I tried to make the fewest changes possible. Since the topic disk file has all the message data, I treated the disk file as the WAL itself. So what I did was just remove the code that handles channel overflow to disk.

And to keep compatibility with the original NSQ, I added most of the replication-related code outside the topic code, without modifying the old topic. This way we can also turn off replication in the topic unit tests.

The biggest change I made to the topic is changing the message ID from the 16-byte GUID to a 16-byte binary ID (8 bytes for a uint64 internal ID and 8 bytes for an external trace ID).
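A rough Go sketch of what such a 16-byte binary ID could look like; the byte order and method names are assumptions, not the fork's definitions:

```go
package main

import "encoding/binary"

// MessageID packs an 8-byte internal ID and an 8-byte trace ID into 16 bytes.
type MessageID [16]byte

func NewMessageID(internalID, traceID uint64) MessageID {
	var id MessageID
	binary.BigEndian.PutUint64(id[:8], internalID)
	binary.BigEndian.PutUint64(id[8:], traceID)
	return id
}

func (id MessageID) InternalID() uint64 { return binary.BigEndian.Uint64(id[:8]) }
func (id MessageID) TraceID() uint64    { return binary.BigEndian.Uint64(id[8:]) }
```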

@lihan

lihan commented May 8, 2017

@absolute8511 "Dynamic Log level and more detail logs" Possible to submit a separate branch to address this one? It would be way quicker to merge smaller set over the time than a big chunk in one go.

@mreiferson
Member

Dynamic log level and more detailed logs

This is now happening in #892. Please review if you haven't already.

After reviewing the changes, I noticed that a WAL is added for each topic. In my redesigned solution, I tried to make the fewest changes possible. Since the topic disk file has all the message data, I treated the disk file as the WAL itself. So what I did was just remove the code that handles channel overflow to disk.

@absolute8511 I see. How did you implement independent per-channel "cursors" on top of the diskqueue though? That's basically the reason why I wrote and used mreiferson/wal.

The biggest change I made to the topic is changing the message ID from the 16-byte GUID to a 16-byte binary ID (8 bytes for a uint64 internal ID and 8 bytes for an external trace ID).

Agreed, my assumption is that the on-disk structure will need to change to support these kinds of features.

@mreiferson
Member

@absolute8511 taking a step back, how do you propose we begin the process of discussing design and architecture? Assuming we reach consensus on those things, we can then discuss implementation details and roadmap.

@absolute8511
Contributor Author

absolute8511 commented May 19, 2017

@mreiferson The channel cursor holds the fileNum of the disk file and the offset within that file. When a message is read from the file, we fill in the offset and the occupied size in the message metadata. When the message is finished, the channel cursor is updated based on that metadata. Additionally, we introduced a log index file outside of the topic to help sync with replicas.
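To make this concrete, here is a rough Go sketch of such a cursor; all names are illustrative, and the real code also has to handle out-of-order FINs and the log index file mentioned above:

```go
package main

// diskQueueOffset locates a position in the topic's segmented disk files.
type diskQueueOffset struct {
	FileNum int64 // which segment file
	Pos     int64 // byte offset within that file
}

// messageMeta is filled in when a message is read from disk.
type messageMeta struct {
	Offset diskQueueOffset
	Size   int32 // bytes the message occupies on disk
}

// channelCursor tracks how far this channel's consumers have confirmed.
type channelCursor struct {
	confirmed diskQueueOffset
}

// advance moves the cursor past a finished message (simplified: assumes in-order FINs).
func (c *channelCursor) advance(m messageMeta) {
	c.confirmed = m.Offset
	c.confirmed.Pos += int64(m.Size)
}
```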

I have been a little busy recently, and I am still preparing some technical articles on the redesigned NSQ. I think we can begin the merge discussion after I finish these design documents.

PS: the first overview article is here:
Redesign Overview
More details will follow in the next few weeks.

@yunnian

yunnian commented Jan 15, 2018

This is my coworker; his Chinese name is 李文.
