Crash while parsing netfilter using regex #193

Open
fukawi2 opened this issue May 22, 2023 · 1 comment

Comments

@fukawi2

fukawi2 commented May 22, 2023

While trying to write a regex to parse netfilter logs, I managed to crash agrind (again, sorry!).

Command line:

cat netfilter.logs | agrind '* | parse regex "^(?P<timestamp>\S+) (?P<hostname>\S+) kernel: \[[0-9\.]*\] (?P<tag>.*?) IN=(?P<in>\S*) OUT=(?P<out>\S*) (?P<x1>MAC=(?P<mac>\S+) )?SRC=(?P<src>\S+) DST=(?P<dest>\S+) LEN=(?P<len>\S+) TOS=(?P<tos>\S+) PREC=(?P<prec>\S+) TTL=(?P<ttl>[0-9]+) ID=(?P<id>\S+) (?P<x5>(?P<dont_fragment>DF) )?PROTO=(?P<protocol>\S+) (?P<x4>TYPE=(?P<icmp_type>[0-9]+) CODE=(?P<icmp_code>[0-9]+) \[.*?\])?(?P<x2>SPT=(?P<sport>[0-9]+) DPT=(?P<dport>[0-9]+) (?P<x3>WINDOW=(?P<window>\S+) RES=(?P<res>\S+) (?P<flags>.*) URGP=(?P<ugrp>\S+))?)?(?P<x6>LEN=(?P<length>[0-9]+))?$"'

I'm reasonably confident that the regex is correct. It's definitely the most complex one I've ever written, but I debugged it with https://regex101.com/ using the Rust flavor, and all my test lines matched the regex above.

netfilter.logs contents:

2023-05-02T11:12:50.823305+00:00 fw.example.com kernel: [576812.045483] Firewall: DROPPED IN=bond1 OUT=bond1 MAC=9e:ee:03:11:12:05:00:1f:6d:b5:18:00:08:00 SRC=192.0.2.100 DST=192.0.2.200 LEN=40 TOS=0x00 PREC=0x00 TTL=242 ID=46326 PROTO=TCP SPT=123 DPT=456 WINDOW=1024 RES=0x00 SYN URGP=0
2023-05-02T11:00:01.786039+00:00 fw1.example.com kernel: [578791.811376] Firewall IN=ens5 OUT=ens6 MAC=2a:e8:c0:3e:4b:17:ae:f6:8c:55:e9:a3:08:00 SRC=192.0.2.111 DST=192.0.2.222 LEN=226 TOS=0x00 PREC=0xC0 TTL=63 ID=24087 PROTO=ICMP TYPE=3 CODE=3 [SRC=10.10.10.10 DST=10.20.30.40 LEN=198 TOS=0x00 PREC=0x80 TTL=122 ID=3067 PROTO=UDP SPT=53 DPT=36506 LEN=178 ]
2023-05-02T11:00:02.136051+00:00 fwall kernel: [2839042.118073] [FIREWALL][OUTPUT] IN= OUT=ens5 SRC=192.0.2.11 DST=192.0.2.22 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=11993 DF PROTO=TCP SPT=60488 DPT=443 WINDOW=62727 RES=0x00 SYN URGP=0
2023-05-02T11:00:07.372179+00:00 foo.example.net kernel: [2765257.090978] [FIREWALL][OUTPUT] IN= OUT=ens3 SRC=192.0.2.123 DST=192.0.2.321 LEN=339 TOS=0x00 PREC=0xC0 TTL=64 ID=21371 DF PROTO=UDP SPT=68 DPT=67 LEN=319
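
For anyone who wants to re-check the pattern outside regex101, here is a minimal sketch using Python's `re` module as a stand-in. This is an assumption for illustration only: agrind is written in Rust and its regex engine may differ in details. The helper name `absent_groups` is hypothetical. It reports, for each sample line, which named groups did not take part in the match, since every line above leaves at least a few of the optional groups unmatched.

```python
import re

# The pattern from the command line above, copied verbatim and split
# across adjacent raw strings for readability.
PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<hostname>\S+) kernel: \[[0-9\.]*\] (?P<tag>.*?) "
    r"IN=(?P<in>\S*) OUT=(?P<out>\S*) (?P<x1>MAC=(?P<mac>\S+) )?"
    r"SRC=(?P<src>\S+) DST=(?P<dest>\S+) LEN=(?P<len>\S+) TOS=(?P<tos>\S+) "
    r"PREC=(?P<prec>\S+) TTL=(?P<ttl>[0-9]+) ID=(?P<id>\S+) "
    r"(?P<x5>(?P<dont_fragment>DF) )?PROTO=(?P<protocol>\S+) "
    r"(?P<x4>TYPE=(?P<icmp_type>[0-9]+) CODE=(?P<icmp_code>[0-9]+) \[.*?\])?"
    r"(?P<x2>SPT=(?P<sport>[0-9]+) DPT=(?P<dport>[0-9]+) "
    r"(?P<x3>WINDOW=(?P<window>\S+) RES=(?P<res>\S+) (?P<flags>.*) URGP=(?P<ugrp>\S+))?)?"
    r"(?P<x6>LEN=(?P<length>[0-9]+))?$"
)

# The four lines from netfilter.logs above.
SAMPLE_LINES = [
    "2023-05-02T11:12:50.823305+00:00 fw.example.com kernel: [576812.045483] Firewall: DROPPED IN=bond1 OUT=bond1 MAC=9e:ee:03:11:12:05:00:1f:6d:b5:18:00:08:00 SRC=192.0.2.100 DST=192.0.2.200 LEN=40 TOS=0x00 PREC=0x00 TTL=242 ID=46326 PROTO=TCP SPT=123 DPT=456 WINDOW=1024 RES=0x00 SYN URGP=0",
    "2023-05-02T11:00:01.786039+00:00 fw1.example.com kernel: [578791.811376] Firewall IN=ens5 OUT=ens6 MAC=2a:e8:c0:3e:4b:17:ae:f6:8c:55:e9:a3:08:00 SRC=192.0.2.111 DST=192.0.2.222 LEN=226 TOS=0x00 PREC=0xC0 TTL=63 ID=24087 PROTO=ICMP TYPE=3 CODE=3 [SRC=10.10.10.10 DST=10.20.30.40 LEN=198 TOS=0x00 PREC=0x80 TTL=122 ID=3067 PROTO=UDP SPT=53 DPT=36506 LEN=178 ]",
    "2023-05-02T11:00:02.136051+00:00 fwall kernel: [2839042.118073] [FIREWALL][OUTPUT] IN= OUT=ens5 SRC=192.0.2.11 DST=192.0.2.22 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=11993 DF PROTO=TCP SPT=60488 DPT=443 WINDOW=62727 RES=0x00 SYN URGP=0",
    "2023-05-02T11:00:07.372179+00:00 foo.example.net kernel: [2765257.090978] [FIREWALL][OUTPUT] IN= OUT=ens3 SRC=192.0.2.123 DST=192.0.2.321 LEN=339 TOS=0x00 PREC=0xC0 TTL=64 ID=21371 DF PROTO=UDP SPT=68 DPT=67 LEN=319",
]


def absent_groups(line):
    """Return the sorted names of groups that did not participate in the
    match, or None if the line does not match the pattern at all."""
    match = PATTERN.match(line)
    if match is None:
        return None
    return sorted(name for name, value in match.groupdict().items() if value is None)


for number, line in enumerate(SAMPLE_LINES, start=1):
    print(number, absent_groups(line))
```

Under Python's engine all four lines match, and each one leaves a different set of optional groups empty (for example, `mac` on the lines without a MAC= field). That is consistent with the regex itself being valid, and it makes the handling of unmatched optional groups a reasonable place to look for the crash. The sketch does not confirm that.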

FWIW, I'm hoping to contribute this back as an alias. Here's my WIP branch: https://github.com/fukawi2/angle-grinder/tree/phs/add-netfilter

report-9f23892a-d887-4a5e-bedc-cdf9b7c4ea85.txt

@jnhmcknight

I believe this crash is caused by the optional modifier being applied to a whole capture group, as in (?P<x4>TYPE=(?P<icmp_type>[0-9]+) CODE=(?P<icmp_code>[0-9]+) \[.*?\])?. I've just hit a similar crash while parsing some custom web logs, where I didn't want the query string to be included in the URL counts.

access.log:

71.183.39.172 (10.71.144.194) - - [22/May/2024:14:50:13 +0000] "POST /track/?verbose=1&ip=1&_=1716389413340 HTTP/1.1" 200 25 "https://example.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
51.9.96.249 (10.71.103.55) - - [22/May/2024:14:50:14 +0000] "POST /amplitude/typescript HTTP/1.1" 499 0 "https://example.com/home" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
50.205.126.142 (10.71.19.173) - - [22/May/2024:14:50:15 +0000] "POST /amplitude/typescript HTTP/1.1" 200 94 "https://example.com/home" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
82.144.56.18 (10.71.78.143) - - [22/May/2024:14:50:15 +0000] "POST /track/?verbose=1&ip=1&_=1716389415082 HTTP/1.1" 200 25 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"

Command:

cat access.log | agrind  '* |parse regex "(?P<remote>[0-9\-\.]+) (?P<elbip>[0-9\(\)\-,\. ]+) - - \[(?P<timestamp>[A-Za-z0-9:/]+ \+0000)\] \"(?P<method>[A-Z]+) (?P<url>.+)(?P<query>\?.+)? (?P<proto>HTTP/[0-9\.]*)\" (?P<status>[0-9]+) (?P<size>[0-9]+) \"(?P<referer>.+)\" \"(?P<user_agent>.+)\"" nodrop |count by method, url, status'

Run as is, this crashes. But if you change (?P<query>\?.+)? to (?P<query>\?.+) (removing the optional modifier on the group), it runs fine.

I wish I knew more Rust and could submit a patch, but unfortunately I'm not there in my learning yet. Hopefully this helps someone who can.
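
The optional `query` group described above can be examined in isolation. The sketch below again uses Python's `re` as a stand-in (an assumption; agrind's Rust engine may differ in details) and only the request-line part of the pattern. `ALTERNATIVE` and the helper `fields` are hypothetical names introduced for illustration, not something agrind provides. The sketch does not explain the crash itself. It shows two things about the pattern: an optional group that does not participate yields no value, and the greedy `(?P<url>.+)` takes the query string before the optional group gets a chance to.

```python
import re

# Request-line part of the pattern from the command above, unchanged.
ORIGINAL = re.compile(
    r"(?P<method>[A-Z]+) (?P<url>.+)(?P<query>\?.+)? (?P<proto>HTTP/[0-9\.]*)"
)

# Hypothetical alternative: the URL stops at the first "?" or space, so the
# optional query group can actually capture the query string.
ALTERNATIVE = re.compile(
    r"(?P<method>[A-Z]+) (?P<url>[^? ]+)(?P<query>\?\S+)? (?P<proto>HTTP/[0-9\.]*)"
)

# Request lines taken from the access.log sample above.
REQUESTS = [
    "POST /track/?verbose=1&ip=1&_=1716389413340 HTTP/1.1",
    "POST /amplitude/typescript HTTP/1.1",
]


def fields(pattern, text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.groupdict() if match else None


for request in REQUESTS:
    for label, pattern in (("original", ORIGINAL), ("alternative", ALTERNATIVE)):
        print(label, fields(pattern, request))
```

With `ORIGINAL`, `query` is `None` for both requests and `url` still contains `?verbose=1&ip=1&_=1716389413340`, so counting by `url` would not merge the `/track/` requests even without the crash. With `ALTERNATIVE`, `url` is `/track/` for the first request and `query` is `None` only for the second. Dropping the optional modifier, as in the workaround, makes requests without a query string fail to match the pattern at all.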
