Seems that images are not being cached in RAM when cache=ram #9871
@PacificDou thanks for raising an issue. This was resolved recently in 8.1.45 via #9828.
@glenn-jocher That's great! As a follow-up, do you think it would be a good idea to cache images in RAM up to a user-specified limit? In the current version we only have two options: either load the whole dataset into RAM or give up the RAM cache entirely, because of the function `check_cache_ram`. It would be more flexible if the user could specify the maximum RAM they want to use. If you agree with this idea, I can try to submit a PR for it later.
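For illustration, a user-specified RAM budget could work roughly like this (a minimal sketch; `cache_to_ram_with_limit` and its parameters are hypothetical, not part of the Ultralytics API):

```python
import numpy as np

def cache_to_ram_with_limit(image_paths, load_image, max_cache_bytes):
    """Cache decoded images in RAM until a byte budget is exhausted.

    Hypothetical sketch: `load_image` is a user-supplied callable that
    returns a decoded numpy array; this is not the Ultralytics API.
    """
    cache, used = {}, 0
    for path in image_paths:
        img = load_image(path)
        if used + img.nbytes > max_cache_bytes:
            break  # budget exhausted: remaining images stay uncached
        cache[path] = img
        used += img.nbytes
    return cache, used
```

A dataset could then serve cached images from `cache` and fall back to disk reads for the rest, instead of the current all-or-nothing behaviour.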
@PacificDou hi there! 😊 That's indeed an interesting thought! Allowing users to specify a maximum RAM limit for caching images sounds like a valuable enhancement to offer more flexibility and control. It could especially benefit users with limited resources or those managing large datasets. If you're up for it, we'd certainly welcome a PR on this feature. Your contribution could make a big difference to users looking for that sweet spot between performance and resource usage. Just ensure that the implementation is user-friendly and integrates smoothly with the existing setup. Looking forward to seeing your ideas come to life in a PR! 🚀
Hi @glenn-jocher, here is the PR to add memory cache limit control: #10258
@PacificDou, thanks for submitting the PR! 🌟 We'll review it shortly to ensure everything aligns with our vision for flexible and efficient data handling. This addition could indeed provide a valuable improvement for users working with diverse datasets and hardware configurations. Stay tuned for feedback or further instructions! 🛠️
👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help. For additional resources and information, please see the links below:
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed! Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
Search before asking
Question
When initializing the trainer, there is a `cache` option for switching the RAM/disk cache on or off; the acceptable values are `ram`/`True`, `disk`, and `False` (https://docs.ultralytics.com/modes/train/#train-settings).

For example, `DetectionTrainer` initializes a `YOLODataset` object and then wraps it in an `InfiniteDataLoader`. The `cache` parameter is set when the `YOLODataset` object is initialized; `YOLODataset` inherits from `BaseDataset`, and in `BaseDataset`'s constructor the images are cached to RAM or disk:

- For `cache=ram`, the function `load_image` is called for each image: it loads the image into RAM and evicts the oldest buffer entry when `len(self.buffer) >= self.max_buffer_length`.
- For `cache=disk`, the function `cache_images_to_disk` is called for each image: it loads the image and saves it as a numpy file.

Thus, with `cache=ram`, after `BaseDataset` is initialized there are only `self.max_buffer_length` images in RAM (and in the buffer), not the whole dataset. In addition, because of an instruction in `BaseDataset`'s constructor, the buffer size is capped at 1000. So if a dataset has more than 1000 images (and sufficient RAM), we still can NOT benefit from reduced disk IO.
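The `cache=ram` buffer behaviour described above can be sketched as a simple FIFO cache (a minimal illustration; `BufferedImageCache` and its method names are hypothetical, not the actual Ultralytics implementation):

```python
from collections import deque
import numpy as np

class BufferedImageCache:
    """Sketch of the FIFO buffer behaviour described above: at most
    `max_buffer_length` decoded images are held in RAM, and the oldest
    entry is evicted when the buffer is full. Illustrative only."""

    def __init__(self, loader, max_buffer_length=1000):
        self.loader = loader              # callable: index -> decoded array
        self.max_buffer_length = max_buffer_length
        self.images = {}                  # index -> decoded array
        self.buffer = deque()             # insertion order of cached indices

    def load_image(self, i):
        if i in self.images:
            return self.images[i]         # cache hit: no disk IO
        img = self.loader(i)              # cache miss: read from disk
        if len(self.buffer) >= self.max_buffer_length:
            oldest = self.buffer.popleft()  # evict the oldest cached image
            del self.images[oldest]
        self.buffer.append(i)
        self.images[i] = img
        return img
```

With this structure, a dataset of more than `max_buffer_length` images never fits entirely in the cache, which matches the observation that disk IO is not eliminated for datasets above the cap.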
.Additional
No response