Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue] Logging using incorrect level when verbose is set up #847

Open
reivaxxavier opened this issue May 26, 2023 · 7 comments
Open

[Issue] Logging using incorrect level when verbose is set up #847

reivaxxavier opened this issue May 26, 2023 · 7 comments
Assignees
Labels
Status: On Hold Similar to blocked, but is assigned to someone

Comments

@reivaxxavier
Copy link

Describe the bug
Hello there!

I'm encountering a difficulty related to logging in subfinder, which seems to be connected to how gologger handles label and log level. In the piece of code provided below, it seems that whenever an event that includes a label value is logged, this value gets converted to log level during formatting. Is there a specific reasoning behind this design decision?

To illustrate, when I operate subfinder with verbose log level configuration gologger.DefaultLogger.SetMaxLevel(levels.LevelVerbose), I encounter numerous logs with a log level that my external logging system cannot process.

{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-77-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"code.l.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"captive.adservice.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-64-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"google-proxy-66-249-83-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-73-0.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-73-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"colab-sandbox.research.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"tensorboard-staging.cloud.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"mail.google.com","timestamp":"2023-05-25T14:07:57+0000"}

Subfinder version
2.5.8

Complete command you used to reproduce this
I'm not running the CLI but rather using the lib, but the code below should reproduce this bug

package main

import (
	"bytes"
	"fmt"
	"io"
	"log"

	"github.com/projectdiscovery/subfinder/v2/pkg/resolve"
	"github.com/projectdiscovery/subfinder/v2/pkg/runner"
	"github.com/projectdiscovery/gologger"
)

func main() {
        gologger.DefaultLogger.SetMaxLevel(levels.LevelVerbose)
	gologger.DefaultLogger.SetFormatter(&formatter.JSON{})
	runnerInstance, err := runner.NewRunner(&runner.Options{
		Threads:            10,                       // Thread controls the number of threads to use for active enumerations
		Timeout:            30,                       // Timeout is the seconds to wait for sources to respond
		MaxEnumerationTime: 10,                       // MaxEnumerationTime is the maximum amount of time in mins to wait for enumeration
		Resolvers:          resolve.DefaultResolvers, // Use the default list of resolvers by marshaling it to the config
		ResultCallback: func(s *resolve.HostEntry) {  // Callback function to execute for available host
			log.Println(s.Host, s.Source)
		},
	})

	buf := bytes.Buffer{}
	err = runnerInstance.EnumerateSingleDomain("projectdiscovery.io", []io.Writer{&buf})
	if err != nil {
		log.Fatal(err)
	}

	data, err := io.ReadAll(&buf)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%s", data)
}

You will then see multiple logs such as :

{"level":"hackertarget","msg":"nuclei.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"chaos.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"test.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"dev.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"alienvault","msg":"test.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}

The problem comes from those lines of code :

// Log the verbose message about the found subdomain per source
if _, ok := sourceMap[subdomain][result.Source]; !ok {
gologger.Verbose().Label(result.Source).Msg(subdomain)
}

It seems like the Label() method overwrite the level of the log event object. However, here we just want to specify which source found the domain name (while still keeping the event log level at verbose).
I think this could be easily fixed by using something like gologger.Verbose().Str("source", result.Source).Msg(subdomain)

@tarunKoyalwar
Copy link
Member

@reivaxxavier , if i understood this correctly the issues is that gologger logging design is different that expected and is not compatible with your processing system

I encounter numerous logs with a log level that my external logging system cannot process.

the usage of format

gologger.Verbose().Label(result.Source).Msg(subdomain)

is common across all pd projects and i think this issue should be addressed as a feature request in gologger repo to make it more standard or option to change this specific behaviour

i see you have already created a issue for that at projectdiscovery/gologger#38 .
since this is not related to subfinder . i will be closing this issue and we can track the gologger feature request at projectdiscovery/gologger#38

@gnuletik
Copy link

I think that this should be reopened.

the usage of format [...] is common across all pd projects

From what I see, this seems specific to subfinder.

@ehsandeep
Copy link
Member

Reopening to revisit after projectdiscovery/gologger#38 worked on.

@ehsandeep ehsandeep reopened this May 31, 2023
@ehsandeep ehsandeep added the Status: On Hold Similar to blocked, but is assigned to someone label May 31, 2023
@gnuletik
Copy link

Thanks for reopening this issue @ehsandeep!

I don't think that projectdiscovery/gologger#38 needs to be fixed.
Instead, an appropriate Label value should be given in the argument of the Label(...) function, when called by subfinder.

This is what #848 intends to do.

@xavier-gerondeau-stoik
Copy link

I'm on the same page with gnuletik on this one. The trouble with gologger doesn't really have anything to do with the way subfinder uses the Label function.

The thing is, Label function should be given a 'level', not a 'source name'. So, it makes sense to sort this out in subfinder itself, rather than waiting for a fix from gologger. Because, even if gologger comes up with a fix, it's not going to change the fact that the Label usage in subfinder still needs to be fixed.

@tarunKoyalwar
Copy link
Member

@xavier-gerondeau-stoik , @gnuletik the intention behind creating gologger issue is that we can add a option in gologger to strictly behave like a logger and do not accept if any unknown labels are given and if given wrap it (like golang error)

ex when that option is implemented

gologger.DefaultLogger.Label("securitytrails").Msg("api.hackerone.com")

this will return output in below format

[VER] securitytrails: api.hackerone.com

it might be just me but when i am using subfinder binary in verbose mode i like

[SecurityTrails] api.hackerone.com

^ more than

[VER] Found subdomain subdomain_source=virustotal subdomain=mta-sts.forwarding.hackerone.com

since its simple . and i can easily do some bash foo with that format

@gnuletik
Copy link

Thanks for the feedback @tarunKoyalwar!

Even if the following can be easier to read when running a CLI, the following example is not following any logging convention.

[SecurityTrails] api.hackerone.com

The key=value pair is also quite easy to parse and defined as the Common Event Format.

[VER] Found subdomain subdomain_source=securitytrail subdomain=api.hackerone.com

This approach even has it full potential when you want to add other fields (we can imagine fields like confidence, ip etc..).

Also, having a static log message ("Found subdomain") helps when searching among multiple log messages (e.g we can search all "Found subdomain" messages and extract their subdomain values).

But I can understand if you want to avoid using key=value pair to improve readability in CLI mode.

Until gologger adds a Label feature, what do you think about implementing the expected output with the following code?

gologger.DefaultLogger.Msg("securitytrails: api.hackerone.com")

@tarunKoyalwar tarunKoyalwar self-assigned this May 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: On Hold Similar to blocked, but is assigned to someone
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants