Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default user logs to send to fn stderr at INFO level #1479

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

rdallman
Copy link
Contributor

closes #1474

see linked issue above for some details of the saga. this patch does a number of things in order to accomplish the goal of sending user logs to fn stderr at INFO level, while allowing it to easily be configured to off. as pure_runner is explicitly setting these to off for the moment, that configuration is not affected by this patch, however in the future it's possible we may want to change these to align so that lb/runner configs can get logs out of user containers in this manner. this only affects the full node (not just full agent) configuration, where calls are created in the invoke/httptrigger handlers from OSS fn. open to discussion on default behavior, however I think we have to have logs by default in some capacity for OSS, absolutely want to make it not brittle and easily disabled for production services. attempt to cover surface area here:

  • moves the logger configuration up to a CallOpt instead of getting defaulted to in GetCall, now GetCall defaults to configuring logs to off, which seems much less brittle for other packages that call into agent and may absolutely not want logs (they exist). now our handlers in OSS configure this setting. let the record show, I'm not perfectly content with some of the munging that had to go on here (there's agent env vars to configure this option which is called from the front end, which needs the config in call opts), however after trying a few things this seems the most reasonable but I am very much open to concrete ideas - this was a particular pain point.
  • changed stderr to io.WriteCloser as there's no longer a need to buffer logs since we are not uploading them anywhere, modifies our func logger heavily to remove all the old cruft and adds the ability to configure log writer with a level
  • adds ability to configure FN_USER_LOG_LEVEL (i'll update docs) which defaults to 'info'. the level at which this is defaulted to info is open for discussion. removes unused FN_MAX_LOG_SIZE_BYTES
  • updates logrus to 1.4.1 for logger.Log(level, stuff) method
  • deleted a lot of old TODOs from yours truly
  • deleted StdErr method from RunnerCall interface, as far as I can tell in all our repos here and in the service, this method is uncalled (can remove at will, since any implementers it's unused and still satisfies the interface)
  • fixes race: funcLogger write race #1328 issue by using buffer.ReadBytes('\n')
  • fixed bug where we'd log between containers' logs at debug level when logs are configured off
  • fixes a couple tests that fail for me now on docker 18.09.5 w/ new 404 message

tested and she seems to work as expected

api/agent/func_logger.go Outdated Show resolved Hide resolved
we have WithLogger already, we just need to use it. this moves the config of
the logger itself up to be a call option which makes it less brittle in
GetCall to hit the path where it might get accidentally turned on.

removes the old buffer and io.Reader on stderr so that we could upload logs,
we only need the logrus logger for now and it was a bit of a tangled mess.

TODO need to add config bit to set the default level to 'info' so that we get
logs out of these guys in the agent config (pure runner is left alone to off),
with ability to change to debug or turn off altogether.

NOTE: this fixes #1328 by putting a write guard after close and not using the
line writer for different writers
@@ -148,8 +120,9 @@ func (li *lineWriter) Write(ogb []byte) (int, error) {
// write in this line and advance buffer past it
l := b[:i+1]
ns, err := li.w.Write(l)
n += ns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write() should return "how much of the original buffer was consumed". Looks like existing bug... In other words, in this state, Write() may return n, which can be larger than len(ogb) which is not really what we want to return.. I think we might want to change a few things such as:

  • li.b buffer should be limited. Currently, series of Write() calls without a newline can grow li.b.buffer without limits with each Write() having to parse entire buf for newlines.
  • if buffer is at limit (or going to grow over limit with current call to Write()), then we should flush li.b with appended newline ourselves before processing ogb. If that flush fails, then we can return error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess, in practice, the reality of this is that the inner writer is never going to error (it's a logWriter, which never errors) and writing to/reading from the buffer is never going to error either (we're not setting a capacity on buffers, and many methods of the buffer contract will never error - the runtime will oom before an error gets returned, but this is a panic). it's also the case that this is not to be used for production applications and we can avoid thinking about some of the more fun attack vectors here, but would be happy to note them - I just don't think we need to add any more arbitrary configuration than we already have for something that has a configuration to just turn it off for prod apps (glad to fix correctness issues, of course).

that being said, appreciate the feedback can spend an hour here tidying this up so that it's more obviously correct.

if err != nil {
return ns, err
return n, err
}
li.b.Next(len(l))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also looks odd. if li.w.Write() return error with partial write, then we do need to execute li.b.Next(ns) regardless of error, no?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

no function logs by default race: funcLogger write race
2 participants