Feature: Any way to increment/decrement with a delay (after syntax) #762

Open
narcoticfresh opened this issue Sep 1, 2023 · 4 comments
Labels
question This issue looks like a usage problem, not a bug or feature request

Comments

@narcoticfresh

Hi!

First, thanks for mtail and your community support, it's great! ;-)

I'm using this version currently:

root@27ce4b2f72ca:/var/www# mtail -version
mtail version 3.0.0~rc43 git revision 3.0.0~rc43-3+b2 go version go1.15.9 go arch amd64 go os linux

I have a use case that does not seem possible currently - but would be of great use.

What do I want to achieve?

I want to render a gauge metric that expresses the number of active users of an application.

  • A user becomes "active" if he makes a request (= increment the metric). If the same user does not make a request for an hour, he becomes inactive (= decrement the metric).

What I do

The application outputs a line like this when a user makes a request (example):

User is USERNAME,

So far so good. I do this:

gauge active_users by user

/User is\: (?P<user>\w*),/ {
  active_users[tolower($user)]++
  del active_users[tolower($user)] after 1h
}

This works for tracking them, but I do not actually want a metric that has a label per user. I just want a count.

What is missing

From what I can see, there is no way to get a count of active users in this scenario. Crucially, there is no log event that tells me when the hour has passed, so no log line says "now he is inactive, the hour has passed".

It would be great if I could write this:

active_user_count++
active_user_count-- after 1h

That way, if a new $user comes along (one that is not in active_users yet), I could increment the count temporarily for an hour, and mtail would automatically decrement it afterwards.

Another way to solve it

Another feature that would be amazing in this scenario is a function that could count the number of labels in a metric.

That way I could simply write:

hidden gauge active_users by user
gauge active_user_count

/User is\: (?P<user>\w*),/ {
  active_users[tolower($user)]++

  active_user_count = count(active_users)
}

That would be amazing! ;-) We have len() for strings, so some kind of function that lets me do the same for a metric's dimensions would be great.

Thanks!

@jaqx0r
Contributor

jaqx0r commented Sep 2, 2023

Thanks for the kind words!

There's no way to add a delay. mtail is designed to give immediate feedback to the monitoring system that collects its metrics.

I would think about trying to do this with the monitoring collector. You can export the last seen time for each user in a histogram type:

histogram user_last_seen by user

/User is\: (?P<user>\w*),/ {
  user_last_seen[$user] = timestamp()
}

and then with your monitoring system count users who have been seen in the last hour, maybe something like:

count(user_last_seen[1h]) by user

(some sort of pseudo-Prometheus expression here)

I think the changes() function in Prometheus would do what you want: https://prometheus.io/docs/prometheus/latest/querying/functions/#changes

changes(user_last_seen[1h]) > 0

increase() may also work. In both cases, if no new value is seen in the last hour the output would be zero, so filtering for greater-than-zero gets you your 1h active users.

mtail deliberately doesn't have time series storage; that's what the collector systems are for.
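As a rough sketch of the collector-side approach described above: assuming user_last_seen is exported as a gauge holding the Unix time in seconds of each user's latest request (an assumption; the declaration above uses histogram, and the unit of timestamp() is not confirmed in this thread), the number of users active in the last hour could be computed along these lines:

```promql
# Assumption: user_last_seen{user="..."} is a gauge of Unix seconds.
# Keep only the series whose last request is less than 3600 s old,
# then count how many series remain.
count(time() - user_last_seen < 3600)
```

Unlike changes(...) > 0, this form would also count a user who made exactly one request in the window. Note that count() returns an empty result rather than 0 when no series match; appending `or vector(0)` is one way to get an explicit zero.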

Counting the number of labels in a metric is an interesting one -- that doesn't require any historical storage and is totally doable.

I can imagine that working for you if you use del: https://google.github.io/mtail/Language.html#del

gauge active_users by user

/User is\: (?P<user>\w*),/ {
  active_users[$user]++
  del active_users[$user] after 1h
}

but you still need the count feature.

Again, that's something that I'd defer to the monitoring collector though.
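A minimal sketch of what that collector-side count might look like in Prometheus, assuming mtail stops exporting a user's active_users series once `del ... after 1h` expires it:

```promql
# Assumption: expired users no longer appear in the scrape,
# so the number of remaining series is the number of active users.
count(active_users)
```

This still requires Prometheus to scrape and store one series per user; it only moves the counting out of mtail.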

@narcoticfresh
Author

@jaqx0r
thanks for your fast feedback! ;-)

well, yes - i agree that "computation"/"analysis" should be done after mtail, i.e. in prometheus..

but in this case, the sheer number of metric labels (there are a couple of thousand possible usernames) places a huge burden on prometheus, as each label value opens up a new time series.

see https://prometheus.io/docs/practices/naming/

CAUTION: Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

so having to store these (possibly thousands of) labels and then do a count (the only thing we're interested in) in prometheus is quite an overkill..

you mentioned that maybe a count() function would be possible - could you implement that? do you maybe have a bounty program or amazon wishlist so i can see if we can make a contribution? ;-)

@narcoticfresh
Author

@jaqx0r
any follow up?

@jaqx0r
Contributor

jaqx0r commented Apr 21, 2024

No; my followup is that you should do this in a collector.

If you wanted to implement a count method that works on the size of a histogram... maybe I'd accept it, but I think it's a highly specialised method and still don't believe it belongs in mtail.

@jaqx0r jaqx0r added the question This issue looks like a usage problem, not a bug or feature request label Apr 21, 2024