Feature request: support for multiprocessing.Pool "initializer". #231

Open
gwerbin opened this issue Apr 14, 2023 · 5 comments

Comments

@gwerbin

gwerbin commented Apr 14, 2023

The multiprocessing.Pool interface provides the ability to pass a custom "initializer" function to each worker: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.

This is useful for things like suppressing specific warnings, setting up logging, and other "scaffolding" that is occasionally useful (or even required) in applications.

It seems like Pandarallel does not use its own initializer function, so it would be nice if users could provide their own.

Looking over the code, this seems like a relatively unintrusive backward-compatible change that most users won't notice at all, but would benefit the small number of users who do want or need this feature.

Hypothetical usage:

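import warnings

from pandarallel import pandarallel


def _suppress_shapely_warning(ignore: bool = True) -> None:
    # Runs once in each worker process before any work is dispatched.
    warnings.filterwarnings(
        'ignore' if ignore else 'default',
        message='invalid value encountered in intersects',
        category=RuntimeWarning,
        module=r'shapely\.predicates',
        lineno=758,
        append=True,
    )


# `initializer` and `initargs` are the proposed new parameters,
# mirroring the multiprocessing.Pool arguments of the same names.
pandarallel.initialize(
    nb_workers=10,
    progress_bar=False,
    initializer=_suppress_shapely_warning,
    initargs=(True,),
)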
@till-m
Collaborator

till-m commented Apr 14, 2023

Hi @gwerbin,

as you may have noticed, both @nalepae and I are currently super busy. I'm trying to keep this package running as best I can, but I don't really have the time to expand it. However, I very much see how this could be useful, and if you're willing to draft a PR, I would gladly review and merge it. From scanning the code, I also suspect it wouldn't be a big change.

@nalepae
Owner

nalepae commented Apr 14, 2023

Yep, if you can provide a PR (with tests and docs) it could be super nice!

@gwerbin
Author

gwerbin commented Apr 18, 2023

Thanks @till-m and @nalepae! Happy to make a PR. I'll be busy for the next week followed by a short vacation, so I'll set a reminder for myself to look at it in early May.

gwerbin added a commit to gwerbin/pandarallel that referenced this issue Apr 18, 2023
Initial draft implementation of nalepae#231
@gwerbin
Author

gwerbin commented Apr 18, 2023

I realized this was a very small change, so I went ahead and created #232

@nalepae
Owner

nalepae commented Jan 23, 2024

Pandaral·lel is looking for a maintainer!
If you are interested, please open a GitHub issue.
