
How to better tune peak memory usage #260

cyc opened this issue Jan 21, 2022 · 3 comments


@cyc

cyc commented Jan 21, 2022

I have some datasets and transformations that I want to run that unfortunately won't fit on n1-highmem-16 instances (which is what FlexRS requires). The features are fairly standard: scalar features with the tft.quantiles analyzer and string features with the tft.vocabulary analyzer (but there are a lot of each type of feature). Generally the analyze step runs fine up until the final combine, which typically runs on a very small number of machines and causes them to repeatedly OOM.

Of course I could use a larger machine type or even a custom machine type, but these don't work with FlexRS and would be more expensive. I'm curious whether either of the following two options would be viable:

  1. Shard the analyze step by feature: split the set of features into separate groups and run multiple analyze steps sequentially, which should reduce peak memory usage. The challenge would be merging the outputs of the analyze steps together at the end.
  2. Add Beam resource hints specifically to the problematic combine tasks so that they do not get scheduled to run on the same machine.

Is either of these options viable, or is there a solution that I have not considered yet?
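For what it's worth, the grouping step in option 1 is plain Python. Here's a minimal sketch, assuming the features are identified by a flat list of names; the function name and the per-group analyze passes are hypothetical, and the actual AnalyzeDataset runs and output merging are elided:

```python
def shard_features(feature_names, num_shards):
    """Round-robin the feature names into num_shards disjoint groups.

    Each group would drive one separate analyze pass, so only that
    group's analyzers (quantiles/vocabularies) are held in memory at
    once. The per-group outputs (vocab files, quantile boundaries)
    would still need to be merged into a single transform afterwards.
    """
    groups = [[] for _ in range(num_shards)]
    for i, name in enumerate(sorted(feature_names)):
        groups[i % num_shards].append(name)
    return groups

# Stand-in feature names; in practice these would come from the
# pipeline's feature spec.
features = ["f%02d" % i for i in range(10)]
groups = shard_features(features, 3)
```

Sorting before round-robin assignment keeps the sharding deterministic across runs, which matters if the per-group outputs are written to predictable paths.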

@zoyahav
Member

zoyahav commented Jan 24, 2022

Quick initial questions:
a. What version of TFT is your pipeline running with?
b. Same question for Beam.
c. How many analyzers are defined in the pipeline, and of what types (how many quantiles, vocabulary, etc.)?

@cyc
Author

cyc commented Jan 24, 2022

a. TFT 1.5.0
b. Beam 2.35.0
c. 1 tft.quantiles analyzer with dimension 840 (and reduce_instance_dims=False), 15 tft.vocabulary analyzers, and 96 tft.experimental.approximate_vocabulary analyzers

@cyc
Author

cyc commented Jan 27, 2022

Also, I should add that in my experiments testing these analyzers with DirectRunner, it's not the quantiles analyzer that consumes most of the memory but the tft.vocabulary analyzers (I tested this by disabling different analyzers and measuring the amount of memory allocated).
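The disable-and-measure approach above can be sketched with the stdlib tracemalloc module. This is a stand-in, not the actual pipeline: the fake workload below only imitates a vocabulary analyzer accumulating token counts, and in the real experiment the measured function would be the DirectRunner pipeline run with a subset of analyzers enabled:

```python
import tracemalloc

def measure_peak(fn):
    """Return peak bytes allocated by Python while running fn()."""
    tracemalloc.start()
    try:
        fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

# Hypothetical stand-in for one enabled vocabulary analyzer: build an
# in-memory token -> count table, which is roughly what a vocabulary
# combine accumulates.
def fake_vocab_analyzer():
    return {"token%d" % i: i for i in range(50_000)}

peak = measure_peak(fake_vocab_analyzer)
```

Comparing `measure_peak` across runs with different analyzers disabled is what points the finger at the vocabulary analyzers here. Note that tracemalloc only sees Python-level allocations, so memory held inside native TF/Beam code would be undercounted; for the real pipeline, watching process RSS is a useful cross-check.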
