
Add memory cache limit #10258

Open · wants to merge 12 commits into main
Conversation

@PacificDou (Contributor) commented Apr 23, 2024

Description:
This PR implements the proposal discussed in #9871.

Basically, it gives users the flexibility to control the memory cache via an additional parameter, mem_cache_limit (default 0.7):

  1. mem_cache_limit < 0.0: no cache is used in memory (same as cache=False);
  2. 0.0 < mem_cache_limit < 1.0: use that fraction of the currently available memory;
  3. mem_cache_limit > 1.0: use the value directly as a byte count.

For example (a short usage sketch follows below):

  1. mem_cache_limit=0.7: 70% of the currently available memory;
  2. mem_cache_limit=1024*1024*1024: 1024x1024x1024 bytes = 1 GiB.
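
A minimal usage sketch, assuming mem_cache_limit is passed as a regular train argument alongside cache (the call below is only an illustration, not code taken from this PR):

from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Cache images in RAM, but use at most 50% of the currently available memory.
model.train(data="coco8.yaml", epochs=1, imgsz=640, cache=True, mem_cache_limit=0.5)

# Alternatively, cap the RAM cache at an absolute size of 2 GiB.
model.train(data="coco8.yaml", epochs=1, imgsz=640, cache=True, mem_cache_limit=2 * 1024**3)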

Please note that there will always be a memory buffer (https://github.com/ultralytics/ultralytics/blob/5a82b511074617b9daa25000e5672c5d56188374/ultralytics/data/base.py#L87) inside the dataset, even if cache=disk or cache=False, to cache a few batches.
If the memory required for this buffer is larger than mem_cache_limit, those images are still cached, so the actual memory usage can be higher than mem_cache_limit. This should not be a problem, because the buffer is usually quite small.

In addition, if multi-process workers are used, mem_cache_limit only controls the memory usage inside a single worker process.
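
To make the three cases above concrete, here is a minimal sketch of how such a limit could be resolved into a byte budget; it simply restates the rules above and is not the PR's actual implementation (psutil is assumed to be available):

import psutil

def resolve_mem_cache_limit(mem_cache_limit: float) -> int:
    """Illustrative only: translate mem_cache_limit into a byte budget."""
    if mem_cache_limit < 0.0:
        return 0  # no RAM cache at all (same effect as cache=False)
    if mem_cache_limit < 1.0:
        # a fraction of the memory currently available on this machine
        return int(psutil.virtual_memory().available * mem_cache_limit)
    return int(mem_cache_limit)  # absolute limit, in bytes

print(resolve_mem_cache_limit(0.7))        # roughly 70% of available RAM, in bytes
print(resolve_mem_cache_limit(1024 ** 3))  # exactly 1 GiB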

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Introducing memory cache limits for improved data handling in training 🚀.

📊 Key Changes

  • Added mem_cache_limit configuration option in default.yaml to control memory cache usage.
  • Implemented memory cache limits within the data loading process to better manage RAM usage during training, preventing potential bottlenecks.
  • Updated dataset classes to support partial caching based on the new memory cache limit, enabling more efficient use of available system memory.

🎯 Purpose & Impact

  • Efficient Memory Usage: The introduction of a memory cache limit aims to optimize RAM usage during the data loading phase of training models. This is especially crucial for training on large datasets or on systems with limited memory resources.
  • Enhanced Performance: By allowing for partial dataset caching within the specified memory limits, training processes can achieve a balance between speed and system resource utilization. This ensures smoother training sessions without overwhelming the system's memory.
  • Flexibility and Control: Users now have the flexibility to configure the memory cache usage according to their system's capabilities, making the training process more adaptable and reducing the risk of system crashes or slowdowns due to excessive memory consumption.

Overall, this update enhances the Ultralytics framework's usability and efficiency, especially in environments where memory is at a premium.

@PacificDou (Contributor Author)

I have read the CLA Document and I sign the CLA

codecov bot commented Apr 23, 2024

Codecov Report

Attention: Patch coverage is 85.71429%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 70.49%. Comparing base (40e5ac3) to head (5e4da0c).

Files Patch % Lines
ultralytics/data/base.py 85.71% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10258      +/-   ##
==========================================
- Coverage   70.66%   70.49%   -0.18%     
==========================================
  Files         122      122              
  Lines       15591    15600       +9     
==========================================
- Hits        11018    10997      -21     
- Misses       4573     4603      +30     
Flag Coverage Δ
Benchmarks 35.37% <21.42%> (-0.26%) ⬇️
GPU 37.35% <21.42%> (-0.01%) ⬇️
Tests 66.75% <85.71%> (+<0.01%) ⬆️


@PacificDou (Contributor Author)

Codecov Report

Attention: Patch coverage is 86.66667%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 75.72%. Comparing base (5a82b51) to head (00c9e64).

❗ Current head 00c9e64 differs from pull request most recent head abf95eb. Consider uploading reports for the commit abf95eb to get more accurate results

Files Patch % Lines
ultralytics/data/base.py 86.66% 2 Missing ⚠️

Added an additional test case for a low RAM limit setting.
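
For reference, a hypothetical shape such a test could take (names and values below are illustrative, not the actual test added in this PR):

from ultralytics import YOLO

def test_low_mem_cache_limit():
    """Train briefly with a tiny RAM-cache budget and verify training still completes."""
    model = YOLO("yolov8n.yaml")
    model.train(data="coco8.yaml", epochs=1, imgsz=32, cache=True, mem_cache_limit=1024)  # ~1 KB budget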

@PacificDou (Contributor Author)

Got the following failure from GitHub Actions:

=========================== short test summary info ============================
FAILED tests/test_python.py::test_export_coreml - AttributeError: 'torch._C.Node' object has no attribute 'getModuleHierarchy'
======= 1 failed, 71 passed, 9 skipped, 11 warnings in 554.44s (0:09:14) =======
Error: Process completed with exit code 1.

However, the same test passes on my local machine:

========================================================== test session starts ===========================================================
platform linux -- Python 3.10.12, pytest-8.1.1, pluggy-1.4.0
rootdir: /home/shuyang/ultralytics
configfile: pyproject.toml
collected 1 item

tests/test_python.py . [100%]

========================================================== slowest 30 durations ==========================================================
22.99s call tests/test_python.py::test_export_coreml

(2 durations < 0.005s hidden. Use -vv to show these durations.)
=========================================================== 1 passed in 23.01s ===========================================================

@glenn-jocher (Member)

Hey there! It seems like you encountered an issue when running GitHub Actions for exporting to CoreML, but it worked fine on your local setup. 🤔 This discrepancy might be due to differences in the environment setup, library versions, or even the PyTorch and torchvision versions between your local machine and the GitHub Actions runner.

A potential step to align both environments more closely could involve ensuring that the same versions of PyTorch, torchvision, and coremltools are used in both cases. You could verify and match these versions in your GitHub Actions workflow file, possibly by setting up a more explicit environment for your tests. Here's a quick example of how you could specify versions in your workflow:

- name: Set up Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.10'
- name: Install dependencies
  run: |
    pip install torch==<your_version> torchvision==<your_version> coremltools==<your_version>
    pip install -r requirements.txt

Replace <your_version> with the specific versions you're using locally. This might help in ensuring that the testing environments are as identical as possible, potentially resolving the attribute error you encountered. If the issue persists, it may be worth checking the latest compatibility notes or updates for coremltools or the related exporting function within the repository. Hope this helps!

@PacificDou (Contributor Author) commented Apr 23, 2024

Hi @glenn-jocher, I tried the main branch with the original code (no changes) and got the same error for the same test case on GitHub Actions. So I guess there might be some version conflict in the test scripts.

It seems that your recent commit fixed it!!

@Burhan-Q added the enhancement (New feature or request) label on Apr 23, 2024
@ultralytics deleted a comment from glenn-jocher on Apr 23, 2024
@Burhan-Q (Member)

@PacificDou thanks for opening a PR! Please also make sure to update the train arguments table in the documentation. Additionally, we appreciate any unit tests you can write to test the changes you're contributing.

@PacificDou (Contributor Author)

Hi @Burhan-Q , I updated the documentation and also added a test case.
