Parsing stops when hitting UTF-8 replacement char #698

Open
matejzero opened this issue Dec 15, 2022 · 3 comments
Labels: bug (This is considered a bug and shall get fixed), mtail-Log Tailing (Issues related to log polling and tailing)

Comments

@matejzero

We are using mtail to parse some specific logs in which strings sometimes contain the replacement character (�). When mtail hits a log line containing this character, it appears to stop parsing: the lines after it are not processed.

mtail version: mtail version 3.0.0-rc50

mtail program:

counter response_total by code
counter count_lines
const ONELINER /oneliner/

ONELINER {
  /<result code=\"(?P<code>1000|1300|1301|1500|2002|2200|2201|2202|2302|2303|2306|2308|greeting)\">/ {
    response_total[$code]++
  }
}

/$/ {
  count_lines++
}

Sample logs - non-working:

Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds
Nov 29 04:02:02.242 host.example.tld service INFO  Loop(responseData -> oneliner) <result code="1000"> <contact:name>FOO�BAR LOG</contact:name>
Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds

Sample logs - working:

Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds
Nov 29 04:02:02.242 host.example.tld service INFO  Loop(responseData -> oneliner) <result code="1000"> <contact:name>FOOSBAR LOG</contact:name>
Nov 29 04:02:02.242 host.example.tld service INFO  [18636C07-1:foobar:1.2.3.4] service.ioLoop 123 : Transaction took: 0.500 seconds

Running with -one_shot, I get count_lines{prog="test"} 2 with the replacement char and count_lines{prog="test"} 3 without it.

Is this a known problem, and how can I get more information on why it is failing?

cc: @smrekarm12

@jaqx0r
Contributor

jaqx0r commented Jan 1, 2023

This is definitely a bug, thanks for finding it!

The problem is probably in how the file reader handles UTF-8 runes: it is likely getting stuck because it does not discard the bytes after failing to convert the rune.

Can you find the INFO logs for mtail when this happens? https://google.github.io/mtail/Troubleshooting.html#deployment-problems has the location of the INFO log.
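
To illustrate the hypothesis above: Go's unicode/utf8 package reports both an invalid byte and a correctly encoded U+FFFD as utf8.RuneError, and only the returned width tells them apart (1 for an invalid byte, 3 for a literal replacement character). The sketch below is not mtail's decoder; decodeLine and isEncodingError are hypothetical names. It shows a loop that always advances by the returned width, so neither case can stall the reader.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeLine walks a byte slice rune by rune and returns the decoded runes.
// Hypothetical illustration; this is not mtail's actual code.
func decodeLine(b []byte) []rune {
	var out []rune
	for len(b) > 0 {
		r, width := utf8.DecodeRune(b)
		// For a non-empty slice, width is at least 1, even when r is
		// utf8.RuneError, so the loop always makes progress.
		out = append(out, r)
		b = b[width:]
	}
	return out
}

// isEncodingError reports whether b starts with an invalid UTF-8 sequence,
// as opposed to a correctly encoded U+FFFD.
func isEncodingError(b []byte) bool {
	r, width := utf8.DecodeRune(b)
	return r == utf8.RuneError && width == 1
}

func main() {
	line := []byte("FOO\uFFFDBAR LOG")
	fmt.Println(len(decodeLine(line)))         // 11
	fmt.Println(isEncodingError(line[3:]))     // false: a valid 3-byte U+FFFD
	fmt.Println(isEncodingError([]byte{0xff})) // true: an invalid byte
}
```

A reader that treats every utf8.RuneError as a failure, without checking the width or consuming the bytes, would behave differently on the two sample logs. Whether that is what happens in mtail is an assumption until the INFO logs confirm it.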

@jaqx0r added the bug and mtail-Log Tailing labels on Jan 1, 2023
@matejzero
Author

I ran mtail with -logtostderr, -one_shot, and -v=2; the output is in the attached log files.

Non-working log file: mtail_debug_nonworking.log

Working log file: mtail_debug_working.log

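The only differences I can see in the logs are that the working run has one more "decode.go:52] sendline" log line, and that the non-working log is 2 bytes bigger, probably because of the UTF-8 rune (U+FFFD occupies 3 bytes in UTF-8, versus 1 byte for the S in the working sample).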
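
The 2-byte difference is consistent with the encoding sizes involved. A quick Go check, using only the contact-name fragment from the samples above (sizeDelta is a name made up for this sketch):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sizeDelta returns how many more bytes the first string occupies than the second.
func sizeDelta(with, without string) int {
	return len(with) - len(without)
}

func main() {
	fmt.Println(utf8.RuneLen('\uFFFD'))                        // 3
	fmt.Println(utf8.RuneLen('S'))                             // 1
	fmt.Println(sizeDelta("FOO\uFFFDBAR LOG", "FOOSBAR LOG")) // 2
}
```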

@matejzero
Author

Is there any news on this issue? We would like to know whether to wait for a fix or look for another solution.
