Implement __init__.py lazy loading for codegen'd objects #4058

Draft: wants to merge 3 commits into base: main

abey79
Contributor

@abey79 abey79 commented Oct 29, 2023

What

This PR implements lazy loading in __init__.py for codegen'd objects: a codegen'd object is only actually loaded when it is explicitly imported. This should make the (best-case) loading time of the rerun Python package constant instead of O(n), where n is the number of API objects.
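
For readers unfamiliar with the technique, here is a minimal sketch of lazy attribute loading using a module-level `__getattr__` (PEP 562, Python 3.7+). This is an assumption about the general approach, not the code in this PR: the table `_LAZY_ATTRS`, the helper `make_lazy_module`, and the module name `lazy_pkg` are hypothetical, and stdlib modules stand in for the codegen'd submodules.

```python
import importlib
import types

# Hypothetical table: public name -> (defining module, attribute name).
# In a codegen'd package the generator would emit this table; stdlib
# modules are used here only as stand-ins for generated submodules.
_LAZY_ATTRS = {
    "Fraction": ("fractions", "Fraction"),
    "Decimal": ("decimal", "Decimal"),
}


def make_lazy_module(name, lazy_attrs):
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def lazy_getattr(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_attrs[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache on the module so later lookups bypass this hook entirely.
        setattr(module, attr, value)
        return value

    # Equivalent to writing `def __getattr__(name): ...` at the top level
    # of a real package's __init__.py.
    module.__getattr__ = lazy_getattr
    return module


lazy_pkg = make_lazy_module("lazy_pkg", _LAZY_ATTRS)
```

With this layout, creating the module costs the same regardless of how many entries the table has; the import cost of each entry is paid on its first access, for example `lazy_pkg.Fraction(1, 2)`.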

⚠️ WIP: benchmarks are behaving weirdly, see comments ⚠️

TODO:

  • figure out benchmark behaviour
  • fix documentation generation (this PR breaks from rerun import archetypes; dir(archetypes))
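
On the `dir(archetypes)` item: if the lazy loading relies on a module-level `__getattr__` (an assumption, the description does not say), names that have not been accessed yet are absent from the module's `__dict__`, so `dir()` will not list them. PEP 562 also allows a module-level `__dir__` hook, which is one possible fix. The sketch below uses hypothetical names (`LAZY_NAMES`, `make_module`, `archetypes_demo`) and placeholder classes.

```python
import types

# Hypothetical names standing in for codegen'd archetypes.
LAZY_NAMES = ["ArchetypeA", "ArchetypeB"]


def make_module(with_dir_hook):
    """Build a demo module with lazy attributes, optionally with __dir__."""
    module = types.ModuleType("archetypes_demo")

    def lazy_getattr(attr):
        if attr in LAZY_NAMES:
            value = type(attr, (), {})  # placeholder for the real class
            setattr(module, attr, value)  # cache after first access
            return value
        raise AttributeError(
            f"module 'archetypes_demo' has no attribute {attr!r}"
        )

    module.__getattr__ = lazy_getattr

    if with_dir_hook:
        # Advertise lazy names alongside whatever is already loaded.
        module.__dir__ = lambda: sorted(set(vars(module)) | set(LAZY_NAMES))
    return module


without_hook = make_module(with_dir_hook=False)
with_hook = make_module(with_dir_hook=True)

missing = [n for n in LAZY_NAMES if n not in dir(without_hook)]
listed = [n for n in LAZY_NAMES if n in dir(with_hook)]
```

Here `missing` contains both lazy names (they are invisible to `dir()` without the hook), while `listed` contains both once `__dir__` is defined. Whether documentation tooling then picks them up depends on how that tooling enumerates members.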

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested demo.rerun.io (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

@abey79
Contributor Author

abey79 commented Oct 29, 2023

Although lazy loading appears to work, the benchmarks behave strangely:

❯ pipx install --suffix "_2ec5ce7" --python /opt/local/bin/python3.11 https://build.rerun.io/commit/2ec5ce7/wheels/rerun_sdk-0.10.0a9+dev-cp38-abi3-macosx_11_0_arm64.whl

❯ hyperfine --warmup 10 "PYTHONPATH=rerun_py/rerun_sdk python -c 'import rerun'" "/Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun'"
Benchmark 1: PYTHONPATH=rerun_py/rerun_sdk python -c 'import rerun'
  Time (mean ± σ):     451.6 ms ±  66.2 ms    [User: 730.6 ms, System: 2015.5 ms]
  Range (min … max):   337.5 ms … 570.3 ms    10 runs
 
Benchmark 2: /Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun'
  Time (mean ± σ):     425.0 ms ±  76.5 ms    [User: 662.7 ms, System: 2014.2 ms]
  Range (min … max):   267.2 ms … 503.5 ms    10 runs
 
Summary
  /Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun' ran
    1.06 ± 0.25 times faster than PYTHONPATH=rerun_py/rerun_sdk python -c 'import rerun'

This is confirmed by -X importtime:

❯ PYTHONPATH=rerun_py/rerun_sdk python -X importtime -c "import rerun" 2> /tmp/new.log
❯ /Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -X importtime -c 'import rerun' 2> /tmp/old.log
❯ tuna /tmp/old.log
[image: tuna visualization of /tmp/old.log]
❯ tuna /tmp/new.log
[image: tuna visualization of /tmp/new.log]

For some reason, the loading time of pyarrow has increased by more than what lazy loading gained 🤔
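
To put a number on a per-module difference like this without reading it off a chart, the two `-X importtime` logs can be compared directly. The sketch below parses the `import time: self | cumulative | package` lines that CPython writes to stderr; the helper names and the sample log text are made up for illustration and are not measurements from this PR.

```python
def parse_importtime(log_text):
    """Map module name -> (self_us, cumulative_us) from -X importtime output."""
    times = {}
    for line in log_text.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_us = int(fields[0])
            cumulative_us = int(fields[1])
        except ValueError:
            continue  # header line: "self [us] | cumulative | imported package"
        times[fields[2].strip()] = (self_us, cumulative_us)
    return times


def cumulative_delta_us(old_log, new_log, module):
    """Cumulative import time of `module` in new_log minus old_log (in us)."""
    old = parse_importtime(old_log)
    new = parse_importtime(new_log)
    return new[module][1] - old[module][1]


# Made-up sample logs, for illustration only.
OLD_LOG = """import time: self [us] | cumulative | imported package
import time:       100 |        100 |   pyarrow.lib
import time:        50 |        150 | pyarrow
"""
NEW_LOG = """import time: self [us] | cumulative | imported package
import time:       300 |        300 |   pyarrow.lib
import time:       100 |        400 | pyarrow
"""

delta = cumulative_delta_us(OLD_LOG, NEW_LOG, "pyarrow")
```

For the logs above, the inputs would be the contents of /tmp/old.log and /tmp/new.log. Note that a single `-X importtime` run is one sample, so several runs per build would be needed before trusting a delta.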

@abey79
Contributor Author

abey79 commented Oct 30, 2023

With Python 3.8, the benchmark result is inverted 🤷🏻‍♂️

❯ hyperfine --warmup 10 "/Users/hhip/.local/pipx/venvs/rerun-sdk-local/bin/python -c 'import rerun'" "/Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun'"
Benchmark 1: /Users/hhip/.local/pipx/venvs/rerun-sdk-local/bin/python -c 'import rerun'
  Time (mean ± σ):     455.7 ms ±  39.4 ms    [User: 718.9 ms, System: 1999.5 ms]
  Range (min … max):   397.4 ms … 503.0 ms    10 runs

Benchmark 2: /Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun'
  Time (mean ± σ):     504.3 ms ±  30.9 ms    [User: 758.3 ms, System: 2015.1 ms]
  Range (min … max):   463.7 ms … 546.3 ms    10 runs

Summary
  /Users/hhip/.local/pipx/venvs/rerun-sdk-local/bin/python -c 'import rerun' ran
    1.11 ± 0.12 times faster than /Users/hhip/.local/pipx/venvs/rerun-sdk-2ec5ce7/bin/python -c 'import rerun'
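
A note on reading these summaries: the "± …" on the speed ratio is consistent with first-order error propagation on the ratio of the two means. The sketch below (the function name `speed_ratio` is hypothetical) recomputes the ratio and its uncertainty from the mean and σ values reported in the two hyperfine runs in this thread.

```python
import math


def speed_ratio(mean_slow, std_slow, mean_fast, std_fast):
    """Return (ratio, uncertainty) for mean_slow / mean_fast.

    Uses first-order error propagation, treating the two measurements
    as independent: relative errors add in quadrature.
    """
    ratio = mean_slow / mean_fast
    relative_error = math.sqrt(
        (std_slow / mean_slow) ** 2 + (std_fast / mean_fast) ** 2
    )
    return ratio, ratio * relative_error


# Python 3.11 run: PYTHONPATH build (451.6 ± 66.2 ms) vs 2ec5ce7 wheel (425.0 ± 76.5 ms).
ratio_311, err_311 = speed_ratio(451.6, 66.2, 425.0, 76.5)

# Python 3.8 run: 2ec5ce7 wheel (504.3 ± 30.9 ms) vs local build (455.7 ± 39.4 ms).
ratio_38, err_38 = speed_ratio(504.3, 30.9, 455.7, 39.4)

# A ratio of 1.0 (no difference) lies within one uncertainty of both results.
within_noise = (ratio_311 - err_311 < 1.0) and (ratio_38 - err_38 < 1.0)
```

Rounded to two decimals this gives 1.06 ± 0.25 and 1.11 ± 0.12, matching the summaries above. Since 1.0 falls inside both intervals, neither run by itself shows a clear difference between the two builds at 10 runs each.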

@teh-cmc
Member

teh-cmc commented Oct 30, 2023

$ python --version
Python 3.11.5

## latest

$ rerun --version
rerun_py 0.10.0-alpha.8 [rustc 1.72.1 (d5c2e9c34 2023-09-13), LLVM 16.0.5] x86_64-unknown-linux-gnu release-0.10.0-alpha.8 da96dc0, built 2023-10-26T11:50:51Z

$ hyperfine --warmup 10 "python -c 'import rerun'"
Benchmark 1: python -c 'import rerun'
  Time (mean ± σ):     181.7 ms ±   1.6 ms    [User: 501.8 ms, System: 2117.4 ms]
  Range (min … max):   179.2 ms … 185.0 ms    16 runs

## Antoine's

$ rerun --version
rerun_py 0.10.0-alpha.9+dev [rustc 1.72.1 (d5c2e9c34 2023-09-13), LLVM 16.0.5] x86_64-unknown-linux-gnu main 2ec5ce7, built 2023-10-27T20:12:51Z

$ hyperfine --warmup 10 "python -c 'import rerun'"
Benchmark 1: python -c 'import rerun'
  Time (mean ± σ):     181.1 ms ±   1.7 ms    [User: 485.1 ms, System: 2131.4 ms]
  Range (min … max):   178.1 ms … 184.2 ms    16 runs

Development

Successfully merging this pull request may close these issues.

Implement lazy loading in rerun/__init__.py
2 participants