Loosen domain name scrubbing in span.domain tag #3573

Open
gggritso opened this issue May 9, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@gggritso
Member

gggritso commented May 9, 2024

Problem Statement

Relay scrubs domain names like my.app.service.com to something like *.service.com. This is okay for a lot of cases, but for some users, more than the last two segments of a domain are meaningful. The scrubbing causes distinct domains to be mashed together in the Requests view.
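
To make the collision concrete, here is a rough Python sketch of the behaviour described above. It is not Relay's actual implementation: `scrub_domain` and `group_by_scrubbed` are hypothetical helpers, the rule is assumed to be "keep the last two dot-separated segments", and multi-part suffixes such as `co.uk` are ignored. The second domain in the example is made up for illustration.

```python
from collections import defaultdict


def scrub_domain(domain: str) -> str:
    """Hypothetical approximation: keep the last two segments, wildcard the rest."""
    segments = domain.split(".")
    if len(segments) <= 2:
        # Nothing to scrub on a bare second-level domain.
        return domain
    return "*." + ".".join(segments[-2:])


def group_by_scrubbed(domains):
    """Map each scrubbed value to the original domains that collapse into it."""
    groups = defaultdict(list)
    for domain in domains:
        groups[scrub_domain(domain)].append(domain)
    return dict(groups)


# Two distinct services end up under the same tag value.
print(group_by_scrubbed(["my.app.service.com", "billing.eu.service.com"]))
```

Both inputs land under `*.service.com`, which is the "mashed together" effect seen in the Requests view.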

Solution Brainstorm

  • allow more segments (e.g., leave the last three segments, or more)
  • no scrubbing at all (surely we can't allow that, there are too many unique values)
  • customizable domain scrubbing (let people configure how domains are scrubbed)
  • allow self-hosted users to turn domain scrubbing off (people who have their own infra might want to turn this off, and bear the weight of high cardinality)
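
Several of the options above could share one knob. The sketch below is only an illustration in Python, not a proposal for Relay's real config: `scrub_domain_configurable` and its `keep_segments` parameter are hypothetical names, with `None` standing in for "scrubbing turned off".

```python
from typing import Optional


def scrub_domain_configurable(domain: str, keep_segments: Optional[int] = 2) -> str:
    """Keep the last `keep_segments` segments; `None` disables scrubbing.

    Hypothetical helper: the parameter name and default are assumptions.
    """
    if keep_segments is None:
        # Models the self-hosted "turn it off" option: accept high cardinality.
        return domain
    if keep_segments < 1:
        raise ValueError("keep_segments must be at least 1")
    segments = domain.split(".")
    if len(segments) <= keep_segments:
        return domain
    return "*." + ".".join(segments[-keep_segments:])


print(scrub_domain_configurable("my.app.service.com"))        # *.service.com
print(scrub_domain_configurable("my.app.service.com", 3))     # *.app.service.com
print(scrub_domain_configurable("my.app.service.com", None))  # my.app.service.com
```

A single integer covers "allow more segments", and the `None` case covers the self-hosted toggle, so the default behaviour on SaaS would not need to change.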
@gggritso gggritso added the enhancement New feature or request label May 9, 2024
@jjbayer
Member

jjbayer commented May 10, 2024

Domain scrubbing is definitely arbitrary right now, because the second-level domain might be just as high-cardinality as the subdomain.

@gggritso what if we approach this in a data-driven way, i.e. collect a sample of unscrubbed domains from the spans dataset and see what kind of scrubbing would yield the best results? One thing we could immediately evaluate is subdomain removal vs. scrubbing of integer / hex IDs. Is this something you'd be willing to drive?
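
As a starting point, the comparison could look roughly like the Python sketch below. Everything in it is an assumption: the sample domains are invented rather than taken from the spans dataset, the helper names are hypothetical, and a label is treated as an ID if it is all digits or a hex string of at least 8 characters.

```python
import re

# Assumption: an "ID" label is all digits, or 8+ hex characters.
ID_LABEL = re.compile(r"\d+|[0-9a-f]{8,}", re.IGNORECASE)


def strip_subdomains(domain: str, keep: int = 2) -> str:
    """Strategy A: wildcard everything except the last `keep` segments."""
    segments = domain.split(".")
    if len(segments) <= keep:
        return domain
    return "*." + ".".join(segments[-keep:])


def scrub_id_labels(domain: str) -> str:
    """Strategy B: replace only labels that look like integer or hex IDs."""
    return ".".join(
        "*" if ID_LABEL.fullmatch(label) else label
        for label in domain.split(".")
    )


def cardinality(domains, scrub) -> int:
    """Number of distinct tag values a scrubbing strategy produces."""
    return len({scrub(d) for d in domains})


# Invented sample; a real evaluation would use unscrubbed span data.
sample = [
    "api.eu.service.com",
    "api.us.service.com",
    "1234.tenant.service.com",
    "5678.tenant.service.com",
    "deadbeef01.cdn.example.net",
    "cafebabe02.cdn.example.net",
]

print(len(set(sample)))                       # 6 raw values
print(cardinality(sample, strip_subdomains))  # 2
print(cardinality(sample, scrub_id_labels))   # 4
```

On this toy sample, ID scrubbing keeps `api.eu` and `api.us` apart while still collapsing the per-tenant and per-hash hosts. Whether that holds on real data is exactly what the sample would need to show.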

@aldy505

aldy505 commented May 11, 2024

I think I'd prefer this one:

allow self-hosted users to turn domain scrubbing off (people who have their own infra might want to turn this off, and bear the weight of high cardinality)

It doesn't disturb the behavior on SaaS, and it comes with a toggle.

@gggritso
Member Author

@jjbayer that approach makes sense! I'd be happy to do that at some point, though not sure when I'll have a chance

3 participants