Support loading annotations for large CVAT tasks with many jobs #4392

ehofesmann · 2024-05-10T21:52:48Z

What changes are proposed in this pull request?

Optimized loading annotations from the CVAT backend. Annotations are now loaded from individual jobs instead of entire tasks, which allows importing annotations from much larger tasks. There is one task in the internal CVAT deployment with 10k samples, split into 200 jobs of 50 samples each. Previously, loading this task made a single request to the CVAT server for all of the task's annotations at once, which crashed the CVAT server. Now, annotations from each job are loaded sequentially, which resolves this problem.
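
The per-job loading strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `merge_job_annotations` and the `fetch_job` callable are hypothetical names, and the `tags`/`shapes`/`tracks` keys are an assumption about the shape of each job's annotation payload.

```python
from typing import Callable, Dict, Iterable, List

# Assumed top-level keys of a job's annotation payload (illustrative only)
ANNOTATION_KEYS = ("tags", "shapes", "tracks")


def merge_job_annotations(
    job_ids: Iterable[int],
    fetch_job: Callable[[int], Dict[str, List[dict]]],
) -> Dict[str, List[dict]]:
    """Fetch annotations one job at a time and merge them.

    `fetch_job` is a hypothetical callable that issues one request per job,
    so no single response has to contain the whole task.
    """
    merged: Dict[str, List[dict]] = {key: [] for key in ANNOTATION_KEYS}
    for job_id in job_ids:
        job_annos = fetch_job(job_id)  # one small request per job
        for key in ANNOTATION_KEYS:
            merged[key].extend(job_annos.get(key, []))
    return merged
```

For the task mentioned above, this pattern would issue 200 small requests (one per job of 50 samples) rather than one request covering all 10k samples.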

How is this patch tested? If it is not, please explain why.

Unit tests pass:

export FIFTYONE_CVAT_URL=...
export FIFTYONE_CVAT_USERNAME=...
export FIFTYONE_CVAT_PASSWORD=...

pytest /path/to/fiftyone/tests/intensive/cvat_tests.py

Also, task 159 on the internal CVAT test deployment, which contains bdd100k-validation, now imports properly. Having the bdd100k validation images available locally on disk is recommended, as it makes this easier:

import fiftyone as fo
import fiftyone.utils.cvat as fouc
import os

cvat_url = "..."
cvat_username = "..."
cvat_password = "..."

bdd_path = "/path/to/bdd100k-validation/"
filepaths = os.listdir(bdd_path)
data_map = {fp: os.path.join(bdd_path, fp) for fp in filepaths}

dataset = fo.Dataset()
# WARNING: Only run this on this branch; it will crash the CVAT deployment if run on `develop`
fouc.import_annotations(
    dataset,
    task_ids=[159],
    data_path=data_map,
    url=cvat_url,
    username=cvat_username,
    password=cvat_password,
)

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release
notes for FiftyOne users.

Optimized loading annotations from the CVAT backend. Annotations are now loaded from individual jobs instead of entire tasks, which allows importing annotations from much larger tasks.

What areas of FiftyOne does this PR affect?

App: FiftyOne application changes
Build: Build and test infrastructure changes
Core: Core fiftyone Python library changes
Documentation: FiftyOne documentation changes
Other: Annotation integrations

Summary by CodeRabbit

New Features
- Added a method to generate URLs for job annotations in the CVAT tool.
Refactor
- Improved the efficiency of the annotation download process in the CVAT tool.
Tests
- Enhanced test parameters for improved detection labeling accuracy.

coderabbitai · 2024-05-10T21:53:59Z

Walkthrough

The recent updates enhance the CVAT class by introducing a method to generate URLs for job annotations and refining the annotation download process. Additionally, the test suite for detection labeling has been updated by adjusting the segment_size parameter, ensuring more precise unit testing.
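
As a rough illustration of how `segment_size` relates to the number of jobs, assuming it caps the number of samples per job (an assumption here, not something stated in this thread), the job count follows from a ceiling division. `num_jobs` is a hypothetical helper, not part of FiftyOne or CVAT.

```python
import math


def num_jobs(num_samples: int, segment_size: int) -> int:
    """Number of jobs needed when each job holds at most `segment_size` samples."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return math.ceil(num_samples / segment_size)


# The large task from the PR description: 10k samples in jobs of 50
print(num_jobs(10_000, 50))  # 200
```

Under the same assumption, `segment_size=1` puts every sample in its own job, which fits the commit that updates the test to create multiple jobs.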

Changes

| File Path | Change Summary |
| --- | --- |
| `fiftyone/utils/cvat.py` | Added `job_annotation_url`, modified `download_annotations`, added `_get_job_ids` in `CVAT` class. |
| `tests/intensive/...` | Updated `test_detection_labelling` method by adding a `segment_size` parameter. |
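
For orientation, a URL helper of the kind listed above might look like the sketch below. The real `job_annotation_url` is a method on the `CVAT` class whose body is not shown in this thread; the standalone function, its `base_url` parameter, and the `/api/jobs/<id>/annotations` path layout are all assumptions made for illustration.

```python
def job_annotation_url(base_url: str, job_id: int) -> str:
    """Build a job-annotations URL (assumed path layout, for illustration)."""
    return "%s/api/jobs/%d/annotations" % (base_url.rstrip("/"), job_id)
```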

coderabbitai

Actionable comments posted: 0

Out of diff range and nitpick comments (19)

tests/intensive/cvat_tests.py (7)
Line range hint 13-13: Remove unused import collections.defaultdict.
- from collections import defaultdict
Line range hint 14-14: Remove unused import numpy.
- import numpy as np
Line range hint 380-381: Local variables api and task_id are assigned but never used in the test_multiple_fields method.

Consider removing or using these variables if they were intended for future use.

Line range hint 694-694: Local variable person_labels is assigned but never used in the test_example_restricting_label_edits method.

Consider removing or using this variable if it was intended for future use.

Line range hint 816-816: Local variable status is assigned but never used in the test_deleted_tasks method.

Consider removing or using this variable if it was intended for future use.

Line range hint 892-892: Ambiguous variable name l in the test_project_exists method.

Consider renaming l to a more descriptive name to improve code readability.

Line range hint 1327-1327: Local variable results is assigned but never used in the test_frame_start_stop_step method.

Consider removing or using this variable if it was intended for future use.
fiftyone/utils/cvat.py (12)
Line range hint 3439-3439: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 3474-3474: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 3988-3988: Remove the unused local variable response.
- response = self.get(self.task_data_meta_url(task_id)).json()
Line range hint 4151-4151: Remove the unused local variable response.
- response = self.get(self.task_data_meta_url(task_id)).json()
Line range hint 6075-6075: Remove the unused local variable classes.
- classes = label_info["classes"]
Line range hint 6720-6720: Remove the unused local variable formatted_track.
- formatted_track = self._finalize_track(track, frame_count, only_keyframes)
Line range hint 6856-6856: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 6933-6933: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 7409-7409: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 7418-7418: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 7423-7423: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
Line range hint 7829-7829: Use a specific exception type instead of a bare except.
- except:
+ except Exception as e:
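
The reason linters flag a bare `except` is that it also catches `BaseException` subclasses such as `KeyboardInterrupt` and `SystemExit`, whereas `except Exception` lets them propagate. A small self-contained demonstration follows; the helper names are made up for this example and do not come from `cvat.py`.

```python
def swallow_with_bare_except(func):
    """Returns True if a bare except swallowed whatever `func` raised."""
    try:
        func()
    except:  # noqa: E722 - intentionally bare, catches BaseException too
        return True
    return False


def swallow_with_exception(func):
    """Returns True only if `func` raised an Exception subclass."""
    try:
        func()
    except Exception:
        return True
    return False


def interrupt():
    raise KeyboardInterrupt


# A bare except hides the interrupt; `except Exception` does not
print(swallow_with_bare_except(interrupt))  # True
try:
    swallow_with_exception(interrupt)
except KeyboardInterrupt:
    print("KeyboardInterrupt propagated")
```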

Review Details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 032220e and 670bbc5.

Files selected for processing (2)

fiftyone/utils/cvat.py (4 hunks)
tests/intensive/cvat_tests.py (1 hunks)

Additional Context Used

Ruff (20)

fiftyone/utils/cvat.py (12)

3439-3439: Do not use bare except

3474-3474: Do not use bare except

3988-3988: Local variable response is assigned to but never used

4151-4151: Local variable response is assigned to but never used

6075-6075: Local variable classes is assigned to but never used

6720-6720: Local variable formatted_track is assigned to but never used

6856-6856: Do not use bare except

6933-6933: Do not use bare except

7409-7409: Do not use bare except

7418-7418: Do not use bare except

7423-7423: Do not use bare except

7829-7829: Do not use bare except

tests/intensive/cvat_tests.py (8)

13-13: collections.defaultdict imported but unused

14-14: numpy imported but unused

380-380: Local variable api is assigned to but never used

381-381: Local variable task_id is assigned to but never used

694-694: Local variable person_labels is assigned to but never used

816-816: Local variable status is assigned to but never used

892-892: Ambiguous variable name: l

1327-1327: Local variable results is assigned to but never used

Additional comments not posted (1)

tests/intensive/cvat_tests.py (1)
309-309: Ensure the new segment_size parameter is appropriately tested.

Verification successful

The verification process has confirmed that the segment_size parameter is indeed tested within the test_detection_labelling method in the cvat_tests.py file. The output from the script shows that segment_size is set to 1 in two instances, and there is an assertion checking that the segment_size in the task_json matches the expected value of 1. This indicates that the parameter is being appropriately tested as per the requirements mentioned in the review comment.

Final Response:

The segment_size parameter is appropriately tested in the test_detection_labelling method as evidenced by the assertion checking its value. No further action is required regarding this parameter's testing.

Location: tests/intensive/cvat_tests.py

Assertion: self.assertEqual(task_json["segment_size"], 1)
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the new `segment_size` parameter is tested in the `test_detection_labelling` method.

# Test: Search for the usage of `segment_size` in test assertions. Expect: At least one assertion involving `segment_size`.
rg --type py 'segment_size' 'tests/intensive/cvat_tests.py'
Length of output: 176

coderabbitai

Actionable comments posted: 0

Out of diff range and nitpick comments (12)

fiftyone/utils/cvat.py (12)
Line range hint 3440-3440: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 3475-3475: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 3989-3989: Remove the unused local variable response.
- response = self.get(self.task_data_meta_url(task_id)).json()
Line range hint 4152-4152: Remove the unused local variable response.
- response = self.put(self.task_annotation_url(task_id), json=anno_json).json()
Line range hint 6078-6078: Remove the unused local variable classes.
- classes = label_info["classes"]
Line range hint 6723-6723: Remove the unused local variable formatted_track.
- formatted_track = self._finalize_track(track, frame_count, only_keyframes)
Line range hint 6859-6859: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 6936-6936: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 7412-7412: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 7421-7421: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 7426-7426: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:
Line range hint 7832-7832: Replace the bare except with a more specific exception type.
- except:
+ except Exception as e:

Review Details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 670bbc5 and d19ef6e.

Files selected for processing (1)

fiftyone/utils/cvat.py (7 hunks)

Additional Context Used

Ruff (12)

fiftyone/utils/cvat.py (12)

3440-3440: Do not use bare except

3475-3475: Do not use bare except

3989-3989: Local variable response is assigned to but never used

4152-4152: Local variable response is assigned to but never used

6078-6078: Local variable classes is assigned to but never used

6723-6723: Local variable formatted_track is assigned to but never used

6859-6859: Do not use bare except

6936-6936: Do not use bare except

7412-7412: Do not use bare except

7421-7421: Do not use bare except

7426-7426: Do not use bare except

7832-7832: Do not use bare except

brimoor

Implementation LGTM 💪

@ehofesmann if you retarget this at release/v0.24.0 we can include in the release this week 🤓

The base branch was changed.

ehofesmann · 2024-05-22T21:01:11Z

> Implementation LGTM 💪
>
> @ehofesmann if you retarget this at release/v0.24.0 we can include in the release this week 🤓

Thanks @brimoor ! Just getting back to this now, I assume I missed the window on this. I do still need to get it into teams too. It's OK if it doesn't make it into v0.24.0.

@benjaminpkane I see you changed the base back to develop, is it good to merge into there? If so, can I get a rereview?

benjaminpkane

LGTM 👍

ehofesmann added 2 commits May 10, 2024 17:19

load individual cvat job annotations instead of entire task

76f9259

update test to create multiple jobs

670bbc5

ehofesmann added the annotation Issues related to FiftyOne's annotation API label May 10, 2024

ehofesmann requested a review from a team May 10, 2024 21:52

ehofesmann self-assigned this May 10, 2024

lint

d19ef6e

coderabbitai bot reviewed May 10, 2024

brimoor previously approved these changes May 20, 2024

benjaminpkane changed the base branch from develop to release/v0.24.0 May 20, 2024 15:13

benjaminpkane changed the base branch from release/v0.24.0 to develop May 20, 2024 15:13

ehofesmann requested a review from brimoor May 22, 2024 21:01

benjaminpkane approved these changes May 22, 2024

ehofesmann merged commit af57c3b into develop May 22, 2024
9 of 10 checks passed

ehofesmann deleted the feature/cvat-large-tasks branch May 22, 2024 22:47
