what's the decent way to update desc and postfix #1570

Closed
yantaozhao opened this issue Apr 20, 2024 · 6 comments

yantaozhao commented Apr 20, 2024

Given the code below, taken from the joblib website:

Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))

What is a good way to update desc and postfix if I want to show the item currently being processed in the desc or postfix field?
Is there a way to do it without manually creating a separate pbar object?

Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in tqdm(range(10)))

CopperEagle commented Apr 23, 2024

You may want to have a look at the bar_format argument of tqdm. It lets you completely customize the appearance of the progress bar. The variables that change are inserted between { and }. Example:

from math import sqrt
from tqdm import tqdm
from joblib import Parallel, delayed

bar = "{desc}: Element number {n_fmt}... | {bar} | [{elapsed}<{remaining}, {rate_fmt}{postfix}]"

a = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in tqdm(range(10), bar_format=bar, desc="Process A"))
# Process A: Element number 10... | ██████████████████████████████ | [00:00<00:00, 101.86s/it]

The variables are, per the documentation: l_bar, bar, r_bar, n, n_fmt, total, total_fmt, percentage, elapsed, elapsed_s, ncols, nrows, desc, unit, rate, rate_fmt, rate_noinv, rate_noinv_fmt, rate_inv, rate_inv_fmt, postfix, unit_divisor, remaining, remaining_s, eta.

Per documentation, the default bar is '{l_bar}{bar}{r_bar}' with

l_bar='{desc}: {percentage:3.0f}%|'
r_bar='| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]'
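
To see how the pieces nest, here is a minimal sketch that fills those templates with plain str.format. This is only an illustration: tqdm does its own rendering, and the field values below are made up rather than taken from a real run.

```python
# Illustration only: plain str.format stands in for tqdm's own rendering.
# All values are invented; a real bar computes them while iterating.
fields = {
    "desc": "Process A",
    "percentage": 40.0,
    "n_fmt": "4",
    "total_fmt": "10",
    "elapsed": "00:04",
    "remaining": "00:06",
    "rate_fmt": "1.00it/s",
    "postfix": "",
    "bar": "####      ",
}

# The two default sub-templates quoted above.
l_bar = "{desc}: {percentage:3.0f}%|".format(**fields)
r_bar = "| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}{postfix}]".format(**fields)

# The default bar_format is the concatenation of the three parts.
line = "{l_bar}{bar}{r_bar}".format(l_bar=l_bar, r_bar=r_bar, bar=fields["bar"])
print(line)
# Process A:  40%|####      | 4/10 [00:04<00:06, 1.00it/s]
```

A custom bar_format simply rearranges or drops these placeholders, which is what the "Element number {n_fmt}" example above does.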

Hope this helps.

yantaozhao commented Apr 23, 2024

Thanks @CopperEagle, but what I want is a desc and postfix that update dynamically, in real time.

For example, for the data list ['a', 'b', 'c', 'd', 'e'], the bar would show:
hello d: | ██████████████████████████████ | [00:00<00:00, 101.86s/it] current value d

The strings hello d and current value d are generated on the fly, where d is the value currently being processed.

CopperEagle commented Apr 24, 2024

Oh, I see @yantaozhao

In this case, we can use tqdm.set_description and tqdm.set_postfix_str. Here is an example:

from time import sleep
from tqdm import tqdm
from joblib import Parallel, delayed

# Process a single argument
def do_stuff(arg):
    return len(arg)

# Update the bar for this item, then hand the real work to joblib
def process(arg, pbar):
    sleep(1)
    pbar.set_description(f"Processing {arg}")
    pbar.set_postfix_str(f"The almighty {arg} is here...")
    return delayed(do_stuff)(arg)

# ... elsewhere
pbar = tqdm(['a', 'b', 'c', 'd', 'e'])
data = Parallel(n_jobs=2)(process(i, pbar) for i in pbar)

# Processing e: 100%|█████████████████████| 5/5 [00:19<00:00,  3.84s/it, The almighty e is here...]

The notable difference is that the tqdm iterator can no longer be anonymous: it has to be bound to a name so that it can be passed to the process function and updated. The traditional for loop (without joblib) would be

pbar = tqdm([ 'a', 'b', 'c', 'd', 'e' ])
for i in pbar:
    process(i, pbar)
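
One caveat worth keeping in mind: the description is set when the generator hands an item over, not when a worker finishes it. As far as I understand, joblib pulls items from the generator in the calling process and may queue several tasks ahead of the workers (see its pre_dispatch setting), so the label can run ahead of the actual work. Below is a minimal sketch of that effect with neither joblib nor tqdm involved; RecordingBar and labelled are hypothetical names made up for this illustration.

```python
class RecordingBar:
    """Hypothetical stand-in exposing the one tqdm method needed here."""

    def __init__(self, iterable):
        self.iterable = iterable
        self.desc = ""

    def __iter__(self):
        return iter(self.iterable)

    def set_description(self, desc):
        self.desc = desc


def labelled(pbar):
    """Yield each item after writing it into the bar's description."""
    for arg in pbar:
        pbar.set_description(f"Processing {arg}")
        yield arg


pbar = RecordingBar(["a", "b", "c", "d", "e"])
tasks = labelled(pbar)

# Imitate a consumer that queues four tasks before any work has finished.
queued = [next(tasks) for _ in range(4)]

print(queued)     # ['a', 'b', 'c', 'd']
print(pbar.desc)  # Processing d
```

So the text on the bar is best read as "last item dispatched" rather than "item being computed right now".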

Cheers

@yantaozhao
Good idea @CopperEagle.
Is there a further way to avoid manually creating a new pbar object?

What I want is something like the following (not runnable code):

data = ['a', 'b', 'c', 'd', 'e']
Parallel(n_jobs=2)(delayed(foo)(x) for x in tqdm(data, desc=lambda i: f'hello {data[i]}', postfix=lambda i: f'current value {data[i]}'))

where i is assumed to be the element index.

@CopperEagle

Sure @yantaozhao, we can make it a wrapper function. Here's how:

def updater(pbar):
    it = iter(pbar)  # named "it" to avoid shadowing the built-in iter
    class AnonymousPbar:
        def __iter__(self):
            return self
        def __next__(self):
            arg = next(it)  # StopIteration from the bar ends this iterator too
            pbar.set_description(f"Processing {arg}")
            pbar.set_postfix_str(f"The almighty {arg} is here...")
            return arg
    return AnonymousPbar()

## Then you can use this anywhere:
for i in updater(tqdm(range(10))):
    sleep(1)  # stand-in for the real per-item work
# Processing 9: 100%|███████████████████| 10/10 [00:10<00:00,  1.00s/it, The almighty 9 is here...]

However, this may look a bit convoluted when used with a for loop.
Also, this implementation hard-codes the description string. To keep it as lean as a plain tqdm call, we can move the creation of the tqdm object into the wrapper function.

The following can be used in place of tqdm when iterating:

from math import sqrt
from time import sleep
from tqdm import tqdm
from joblib import Parallel, delayed

def progress(iterable, desc=None, postfix=None, **kwargs):
    # desc and postfix are handled here; everything else goes to tqdm
    pbar = tqdm(iterable, **kwargs)
    class AnonymousPbar:
        def __init__(self, proc):
            self.iter = iter(proc)
            self.pbar = proc
        def __iter__(self):
            return self
        def __next__(self):
            arg = next(self.iter)
            if desc is None:
                desc_str = ""
            elif isinstance(desc, str): # compliance with tqdm
                desc_str = desc # want simple string? you get it.
            else:
                desc_str = desc(arg) # callable: build the text from the item
            if postfix is None:
                postfix_str = ""
            elif isinstance(postfix, str): # compliance with tqdm
                postfix_str = postfix # want simple string? you get it.
            else:
                postfix_str = postfix(arg) # callable: build the text from the item
            self.pbar.set_description(desc_str)
            self.pbar.set_postfix_str(postfix_str)
            return arg
    return AnonymousPbar(pbar)

## Then, anywhere in the code:
for i in progress(range(10)):
    sleep(1)
# 100%|█████████████████████████████| 10/10 [00:10<00:00,  1.00s/it]


## progress accepts any keyword argument that tqdm does
for i in progress(range(10), desc=lambda arg: f"Process {arg}", ncols=80):
    sleep(1)
# Process 9: 100%|█████████████████████████████| 10/10 [00:10<00:00,  1.00s/it]


## progress can be used with joblib as expected
data = [6,3,8,2]
Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in progress(data, ncols=80, postfix=lambda arg:f"Process {arg}"))
# 100%|█████████████████████████████| 4/4 [00:00<00:00, 1126.59it/s, Process 2]
# [6.0, 3.0, 8.0, 2.0]

The length of the bars (█) was edited for readability in both code snippets.

Any keyword argument accepted by tqdm is supported. The added feature is that desc and postfix can each be a function (for example a lambda) that takes the element currently being processed as its argument.

@yantaozhao

smart wrapper function @CopperEagle
