nsqd: stops sending messages after moving time backward #1124

Open
vostrik opened this issue Jan 11, 2019 · 13 comments

@vostrik

vostrik commented Jan 11, 2019

Environment/Pre-Conditions
nsqd version: 1.1.0
Node.js application with nsqjs

Steps to Reproduce:

  1. Run nsqd and Node.js application.
  2. Generate some messages. Everything works correctly.
  3. Move time backward (e.g., 5 minutes or 1 hour).
  4. Generate some messages. The subscriber doesn't receive them.
    5.1. After some time (maybe the offset from step 3?) the messages are published.
    5.2. If I move time forward, the messages are published immediately.

Actual Result:

Messages aren't sent to subscriber after moving time back.

Expected Result:

Messages are sent to subscriber after moving time back.

@ploxiln
Member

ploxiln commented Jan 11, 2019

I generally expect various things to not work right when system time moves backwards :)

Go-1.9 added "transparent monotonic time" in certain cases: https://golang.org/doc/go1.9#monotonic-time
There may be ways we can adjust nsq code to get the monotonic-time based comparisons and timeouts ... if nsqd is the problem here rather than the nodejs client library ...

@vostrik
Author

vostrik commented Jan 11, 2019

I think it is certainly nsqd, because restarting the Node.js application didn't help.

@vostrik
Author

vostrik commented Jan 11, 2019

Can I help with this issue? Maybe you can point me to the modules and methods that should be refactored.

@ploxiln
Member

ploxiln commented Jan 11, 2019

thanks, we'll look into it

@ploxiln
Member

ploxiln commented Jan 11, 2019

Since Go does not use separate functions/options/objects for monotonic time, but includes it hidden in the wall-clock time struct, it's probably a bit tricky to figure out where it's being lost, or how to carry it along where needed.

@mreiferson
Member

I'd be surprised if "normal" channel sends were the problem, so I'd start by looking at other timeout related things. The two that come to mind are:

  1. network connection level timeouts
  2. time.Ticker use cases, e.g. to flush buffered messages to a client

@mreiferson mreiferson added the bug label Jan 11, 2019
@vostrik
Author

vostrik commented Jan 17, 2019

Sorry to bother you, but is there any progress?
Our team has some resources to investigate this behaviour, i.e. you can simply point us to the places where the problem might be.
We want to help.

@ploxiln
Member

ploxiln commented Jan 17, 2019

No progress.

I suspect the issue may be around https://github.com/nsqio/nsq/blob/master/nsqd/guid.go#L59
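For readers who have not opened the link, the following is a simplified, hypothetical sketch of how a timestamp-plus-sequence ID generator can stall when the wall clock moves backward. It is not the nsqd source: the type `idFactory`, the injected `now` clock, the 12-bit sequence width, and the error names are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errTimeBackwards   = errors.New("time has gone backwards")
	errSequenceExpired = errors.New("sequence expired")
)

const (
	sequenceBits = 12 // assumed width, for illustration only
	sequenceMask = 1<<sequenceBits - 1
)

// idFactory builds IDs from a timestamp and a per-timestamp sequence.
type idFactory struct {
	now           func() int64 // injected clock, e.g. wall-clock milliseconds
	lastTimestamp int64
	sequence      int64
}

func (f *idFactory) next() (int64, error) {
	ts := f.now()
	if ts < f.lastTimestamp {
		// The clock is behind the last issued ID: refuse to generate
		// anything until the clock catches up with lastTimestamp.
		return 0, errTimeBackwards
	}
	if ts == f.lastTimestamp {
		f.sequence = (f.sequence + 1) & sequenceMask
		if f.sequence == 0 {
			return 0, errSequenceExpired
		}
	} else {
		f.sequence = 0
	}
	f.lastTimestamp = ts
	return ts<<sequenceBits | f.sequence, nil
}

func main() {
	clock := int64(1000)
	f := &idFactory{now: func() int64 { return clock }}

	_, err := f.next()
	fmt.Println("normal:", err)

	clock = 700 // simulate the wall clock being moved backward
	_, err = f.next()
	fmt.Println("after moving back:", err)

	clock = 1001 // the clock has caught up again
	_, err = f.next()
	fmt.Println("after catching up:", err)
}
```

If nsqd behaves like this sketch, ID generation would resume once the clock passes the last recorded timestamp, which would be consistent with observations 5.1 and 5.2 in the report.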

@mreiferson
Member

> I suspect the issue may be around https://github.com/nsqio/nsq/blob/master/nsqd/guid.go#L59

Ahhh, yes, I completely forgot about the GUID code, good call!

I had been poking around at all the network deadlines and tickers, but I'm pretty confident they're not the issue.

@vostrik
Author

vostrik commented Jan 18, 2019

Can we add a clock sequence (14 bits) to the GUID? This field helps when time moves backwards. More here:
https://blog.stephencleary.com/2010/11/few-words-on-guids.html
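A rough sketch of the clock-sequence idea, as used in version 1 UUIDs: a 14-bit counter is stored next to the timestamp and bumped whenever the clock is seen to move backward, so IDs issued after the jump differ from earlier ones even if the timestamp repeats. The type `seqClock` and its injected `now` function are hypothetical, and the sketch says nothing about where such bits would fit in the nsqd GUID layout.

```go
package main

import "fmt"

const clockSeqMask = 0x3FFF // low 14 bits

// seqClock pairs each timestamp with a clock sequence value.
type seqClock struct {
	now      func() int64 // injected clock, for illustration and testing
	last     int64
	clockSeq uint16 // only the low 14 bits are used
}

// stamp returns the current timestamp and clock sequence. The sequence is
// incremented (wrapping at 14 bits) when the timestamp is lower than the
// previous one, i.e. when the clock has moved backward.
func (c *seqClock) stamp() (int64, uint16) {
	ts := c.now()
	if ts < c.last {
		c.clockSeq = (c.clockSeq + 1) & clockSeqMask
	}
	c.last = ts
	return ts, c.clockSeq
}

func main() {
	clock := int64(1000)
	c := &seqClock{now: func() int64 { return clock }}

	ts, seq := c.stamp()
	fmt.Println(ts, seq)

	clock = 700 // simulate the clock being moved backward
	ts, seq = c.stamp()
	fmt.Println(ts, seq)
}
```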

@mreiferson
Member

Hmmm, I'm not sure we have space for that.

I think we can just use a time duration rather than timestamp, and Go will transparently handle the monotonicity for us.
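One possible reading of this suggestion, as a sketch only: capture the wall-clock time once at startup, then derive later timestamps by adding the elapsed duration, which Go measures with the monotonic clock. The type `monoClock` and its method names are assumptions, not existing nsq code.

```go
package main

import (
	"fmt"
	"time"
)

// monoClock produces timestamps that never decrease within one process.
type monoClock struct {
	start    time.Time // from time.Now(), so it carries a monotonic reading
	baseNano int64     // wall-clock nanoseconds captured at start
}

func newMonoClock() *monoClock {
	start := time.Now()
	return &monoClock{start: start, baseNano: start.UnixNano()}
}

// nowNano returns the starting wall-clock timestamp plus the elapsed
// duration. time.Since uses the monotonic reading stored in start, so the
// result does not go down when the wall clock is moved backward.
func (c *monoClock) nowNano() int64 {
	return c.baseNano + int64(time.Since(c.start))
}

func main() {
	c := newMonoClock()
	a := c.nowNano()
	time.Sleep(5 * time.Millisecond)
	b := c.nowNano()
	fmt.Println("non-decreasing:", b >= a)
}
```

The trade-off in this sketch is that the timestamps drift away from the real wall clock after an adjustment, and the guarantee only holds within a single process lifetime.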

@ploxiln
Member

ploxiln commented Jan 18, 2019

The last time we visited this issue was #658 / #663

We didn't want to drastically change the GUID generation algorithm. Although we'd like to say the only guarantee is that IDs are unique within a particular consumer connection to nsqd, it's possible that some users may implicitly depend on the uniqueness within a channel across multiple nsqd and restarts of nsqd.

An alternate algorithm that would lose some of that across-sources-and-restarts uniqueness, but be compatible with the odd and inadvisable condition of time going backwards: start at a random initial value, and just increment and wrap (similar to TCP sequence numbers, but 64-bit).
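A minimal sketch of that alternate algorithm, under the assumption that a plain 64-bit counter is acceptable as an ID: the starting value comes from `crypto/rand`, and each new ID is the previous one plus one, wrapping on overflow. The names `counterIDs`, `newCounterIDs`, and `newID` are hypothetical.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

// counterIDs hands out IDs from a 64-bit counter. It never consults the
// clock, so wall-clock changes cannot affect it.
type counterIDs struct {
	next uint64
}

// newCounterIDs seeds the counter with a random initial value.
func newCounterIDs() (*counterIDs, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return nil, err
	}
	return &counterIDs{next: binary.BigEndian.Uint64(b[:])}, nil
}

// newID increments the counter atomically and returns the new value.
// Unsigned overflow wraps around to zero.
func (c *counterIDs) newID() uint64 {
	return atomic.AddUint64(&c.next, 1)
}

func main() {
	ids, err := newCounterIDs()
	if err != nil {
		panic(err)
	}
	a := ids.newID()
	b := ids.newID()
	fmt.Println("consecutive:", b-a == 1)
}
```

As the comment above notes, two independently seeded counters give no uniqueness guarantee across nsqd instances or restarts; the sketch only guarantees distinct IDs within one counter until it wraps.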

@mreiferson
Member

Hmmm, I do vaguely remember discussing this and being frustrated about the "backwards compatibility".

@mreiferson mreiferson changed the title from "nsqd: stops send messages after moving time backward" to "nsqd: stops sending messages after moving time backward" Feb 21, 2019