Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change/burst detection via CountMinSketch #565

Open
sritchie opened this issue Oct 25, 2016 · 0 comments
Open

change/burst detection via CountMinSketch #565

sritchie opened this issue Oct 25, 2016 · 0 comments

Comments

@sritchie
Copy link
Collaborator

sritchie commented Oct 25, 2016

Thanks to @avibryant for this one!

The paper: https://www.cs.utexas.edu/~yzhang/papers/nad-imc03.pdf

Avi's summary:

Somewhat related and maybe interesting: [the algorithm] keeps a Count-Min-Sketch for each time period, and then shows how to combine a series of historical CMS to produce a CMS of the moving average, or the Holt-Winters forecast, etc; then you can use that as the basis for alerting on the current CMS.

(Actually they don't use min() as their estimator so it's not quite a CMS, they do something slightly fancier, but I think the data structure is identical).

And the abstract from the paper:

Traffic anomalies such as failures and attacks are commonplace in today’s network, and identifying them rapidly and accurately is critical for large network operators. The detection typically treats the traffic as a collection of flows that need to be examined for significant changes in traffic pattern (e.g., volume, number of connections). However, as link speeds and the number of flows increase, keeping per-flow state is either too expensive or too slow.

We propose building compact summaries of the traffic data using the notion of sketches. We have designed a variant of the sketch data structure, k-ary sketch, which uses a constant, small amount of memory, and has constant per-record update and reconstruction cost. Itslinearity property enables us to summarize traffic at various levels. We then implement a variety of time series forecast models (ARIMA, Holt-Winters, etc.) on top of such summaries and detect significant changes by looking for flows with large forecast errors. We also present heuristics for automatically configuring the model parameters.

Using a large amount of real Internet traffic data from an operational tier-1 ISP, we demonstrate that our sketch-based change detection method is highly accurate, and can be implemented at low computation and memory costs. Our preliminary results are promising and hint at the possibility of using our method as a building block for network anomaly detection and traffic measurement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant