
How to limit the memory usage of AMF #1454

Open
lbhwyy opened this issue Nov 16, 2023 · 8 comments

Comments

@lbhwyy

lbhwyy commented Nov 16, 2023

Versions

river 0.19.0
Python 3.8
Ubuntu 18.04

Issue

AMF integrated with River is very useful, but the unbounded growth of its memory footprint limits practical application. Could you please tell me how to limit the memory usage of AMF?

@MaxHalford
Member

Hey there. Just curious, how deep are your trees? How much memory are you consuming? Do you know?

@lbhwyy
Author

lbhwyy commented Nov 16, 2023

I don't know how deep my trees are; the depth might be unbounded, because I couldn't find a parameter in the settings to control it. I used around 20,000 data points with 61 dimensions for online training. After that, I saved the model locally using joblib and noticed that the saved model is approximately 500 MB. These are my settings:
from river import forest

model_train = forest.AMFClassifier(
    n_estimators=50,
    use_aggregation=True,
    dirichlet=0.5,
    seed=1,
)
Additionally, I observed that the model size seems to grow approximately linearly. I would like to know whether, for forest.AMFClassifier, it is possible to limit the memory usage of a single tree by restricting its depth, thereby constraining the overall memory usage of the entire forest.
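
For reference, a rough way to reproduce this observation is to track the serialized size of the model as it trains. The sketch below is only an illustration: the random 61-feature stream is a stand-in for the real data, and pickled size is used as a crude proxy for the in-memory footprint.

import pickle
import random

from river import forest

model = forest.AMFClassifier(
    n_estimators=50,
    use_aggregation=True,
    dirichlet=0.5,
    seed=1,
)

rng = random.Random(42)

for i in range(1, 20_001):
    # Stand-in for the real 61-dimensional stream.
    x = {f"f{j}": rng.random() for j in range(61)}
    y = rng.random() > 0.5
    model.learn_one(x, y)
    if i % 5_000 == 0:
        size_mb = len(pickle.dumps(model)) / 1e6
        print(f"{i} samples: serialized model is roughly {size_mb:.1f} MB")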

@MaxHalford
Member

Dear tree-hugger @smastelini, would you have some spare time to look into this? I think it's worthwhile to get a better understanding of how fast/deep Mondrian trees grow :)

@smastelini
Member

Hey everyone, I will do my best. The vanilla Mondrian trees had a budget parameter, if I recall correctly. I am not that familiar with Aggregated Mondrian Trees, but I'll do my homework.

@lbhwyy
Author

lbhwyy commented Nov 16, 2023

Thank you for your quick response! I appreciate your willingness to look into it. If you discover any information about the budget parameter for Aggregated Mondrian Trees during your research, I would be interested to learn more. Looking forward to any updates you can provide. Thanks again!

@ananiask8

Any updates on this? If memory usage is unbounded, using this model online in production could eventually lead to memory exhaustion.

@smastelini
Member

No, not yet. I started reading the paper, but so far I haven't found any kind of "budget" parameter. Unfortunately, my time is currently scarce, so I cannot dig deeply into this topic right now.

@smastelini
Member

A small update. I finished skimming through the paper, and from what I gather, the theoretical robustness guarantees of the algorithm and its adaptive nature are the factors that should provide an automatic cap on memory usage. As far as I can tell, there is no direct control from the user's standpoint.

The idea is that the algorithm would (eventually) adapt and converge while avoiding overfitting. This last aspect could be the main source of excessive memory usage, as far as decision tree structures are concerned.

I want to get a more practical understanding of AMFs by delving into the original code and the River adaptation. This should help me form a more solid opinion from an application viewpoint.
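
In the meantime, since AMFClassifier does not expose a depth or budget parameter, one crude workaround (my own suggestion, not something the library provides) is to monitor the model's footprint yourself and rebuild the forest once it crosses a budget. The threshold, the check interval, and the pickle-size proxy below are arbitrary choices, and rebuilding discards everything learned so far.

import pickle

from river import forest

MEMORY_BUDGET_BYTES = 300 * 1024 * 1024  # arbitrary example budget
CHECK_EVERY = 1_000                      # pickling is costly, so only check periodically

def new_forest():
    # Same hyperparameters reported earlier in this thread.
    return forest.AMFClassifier(
        n_estimators=50,
        use_aggregation=True,
        dirichlet=0.5,
        seed=1,
    )

model = new_forest()
n_seen = 0

def learn_bounded(x, y):
    """Learn one sample and reset the forest if it grows past the budget."""
    global model, n_seen
    model.learn_one(x, y)
    n_seen += 1
    if n_seen % CHECK_EVERY == 0 and len(pickle.dumps(model)) > MEMORY_BUDGET_BYTES:
        # Hard cap: start over from scratch (all learned structure is lost).
        model = new_forest()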
