Provide a way to work out the total number of iterations while the loop is running #1550

kalekundert · 2024-02-07T22:30:31Z

- I have marked all applicable categories:
    - documentation request (i.e. "X is missing from the documentation." If instead I want to ask "how to use X?" I understand StackOverflow#tqdm is more appropriate)
    - new feature request
- I have visited the source website, and in particular read the known issues
- I have searched through the issue tracker for duplicates
- I have mentioned version numbers, operating system and environment, where applicable:
```
import tqdm, sys
print(tqdm.__version__, sys.version, sys.platform)
```

Sometimes it takes a long time just to figure out how many iterations are going to happen. For example, consider the following:

```
from pathlib import Path
from tqdm import tqdm

# glob() returns a lazy generator, so tqdm cannot call len() on it.
paths = Path.cwd().glob('**/*')

for path in tqdm(paths):
    # Do some work...
    ...
```

Working out the number of paths that will be matched by the glob requires interacting with the filesystem, and could take a long time in a big directory. But it's really useful to have this number, because without it there's no way to actually render a progress bar.

If I want the above script to have a true progress bar, I can only think of two ways to do it:

1. Read all the paths into a list, then iterate through that.
2. Iterate once through all the paths to come up with a count, then iterate through them again to actually do the work.

Neither of these approaches is ideal. The first could use a prohibitive amount of memory, and both could cause the program to take a long time before even beginning to render the progress bar.
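For concreteness, here is a rough sketch of the two workarounds. The helper names `list_then_iterate` and `count_then_iterate` are made up for illustration and are not part of tqdm or pathlib.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def list_then_iterate(root: Path) -> Tuple[List[Path], int]:
    """Workaround 1: hold every path in memory so the length is known."""
    paths = list(root.glob("**/*"))
    return paths, len(paths)


def count_then_iterate(root: Path) -> Tuple[Iterator[Path], int]:
    """Workaround 2: walk the tree once to count, then glob again for the real pass."""
    total = sum(1 for _ in root.glob("**/*"))
    return root.glob("**/*"), total
```

Either result can then be passed along as `tqdm(paths, total=total)`. In both cases the whole tree is walked before the bar is first drawn, which is the delay described above.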

I'd like to propose a new API that provides a better alternative. The idea is to let the user provide a function that can be called to get the total number of expected iterations, then to run that function in a background thread. Before the function finishes, the "progress bar" would just display the same thing it currently does when the total number of iterations isn't known, i.e. the current iteration number, the elapsed time, etc. After the function finishes, a true progress bar would be displayed. I think this gives the best of both worlds: the progress bar would start immediately with the information it has, and once better information is available, it would provide the user an estimate for how long they'll need to wait.

As for the actual API, I can imagine two options. I think I slightly prefer the second, but I'd be happy with either:

1. Allow `tqdm(total=...)` to be a function, in which case it will be handled as described above.
2. Add a new `tqdm(total_bg=...)` argument that only accepts functions.
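To make the idea concrete, here is a minimal sketch of the behavior written as a wrapper around today's tqdm, not as a change to tqdm itself. `BackgroundTotal` and `iter_with_background_total` are hypothetical names, and the sketch assumes that tqdm picks up a changed `total` attribute the next time it redraws, which is how current releases appear to behave.

```python
import threading
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


class BackgroundTotal:
    """Run `count_fn` in a daemon thread and expose its result once ready."""

    def __init__(self, count_fn: Callable[[], int]) -> None:
        self.value: Optional[int] = None  # stays None until count_fn returns
        self._thread = threading.Thread(
            target=self._run, args=(count_fn,), daemon=True
        )
        self._thread.start()

    def _run(self, count_fn: Callable[[], int]) -> None:
        self.value = count_fn()

    def join(self, timeout: Optional[float] = None) -> None:
        """Block until the count is available (or the timeout expires)."""
        self._thread.join(timeout)


def iter_with_background_total(
    iterable: Iterable[T], count_fn: Callable[[], int]
) -> Iterator[T]:
    """Hypothetical helper: yield items under a bar whose total is filled in later."""
    from tqdm import tqdm  # imported here so BackgroundTotal stays stdlib-only

    total = BackgroundTotal(count_fn)
    with tqdm(iterable) as bar:
        for item in bar:
            # Until the count arrives, tqdm shows its usual "no total" display.
            if bar.total is None and total.value is not None:
                bar.total = total.value  # assumption: used on the next redraw
            yield item
```

With the glob example, this would be called as `iter_with_background_total(Path.cwd().glob('**/*'), lambda: sum(1 for _ in Path.cwd().glob('**/*')))`. The tree is still walked twice, but the counting pass no longer delays the first iteration.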

If there's interest in adding a feature like this, let me know. No guarantees, but I might be able to make a PR.
