Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A monoid for sliding time window aggregations #626

Open
dasch opened this issue Feb 28, 2017 · 3 comments
Open

A monoid for sliding time window aggregations #626

dasch opened this issue Feb 28, 2017 · 3 comments

Comments

@dasch
Copy link

dasch commented Feb 28, 2017

I've started using Algebird in conjunction with Google Dataflow, which works great! My main data structure is a deeply nested case class that is implicitly a monoid – it contains a bunch of stuff, mainly time-sorted Max values (such that I only keep the most recent version of some value).

My needs are a bit specific, though, and the built-in time window aggregation semantics are causing some complications. I'd like to ask if you have any experience implementing windowed aggregations using monoids, or whether you'd be interested in accepting such an implementation were I to write one.

My current idea is loosely based on TopK – let's just call it SlidingWindow. It keeps k "slots", each containing an instance of another monoid and a timestamp (the timestamp will be normalized according to the period of the time window, e.g. start-of-day). When adding two of these together, each slot that maps to the same timestamp will be added together, but only the k most recent slots will be retained. There would be a method that returned the "sum" of all the slots, representing, say, the total count of some event during the last k days/minutes/whatever.

Does this design make sense? I've tried looking, but I have seen little mention of time as a factor here, although it's critical in many streaming applications.

Sorry for the bother if this is the wrong place to ask.

@SemanticBeeng
Copy link

SemanticBeeng commented Jan 23, 2019

@johnynek mentions something related here "scale.bythebay.io: Oscar Boykin, How to Elm-ify Your ML" https://youtu.be/k1jhDcruiLE?t=2142 but would like to see some code. :-)

@sritchie
Copy link
Collaborator

We've got a pretty nice approximate sliding window implementation here: http://twitter.github.io/algebird/datatypes/approx/exponential_histogram.html

This might help!

@sritchie
Copy link
Collaborator

cc @SemanticBeeng

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants