Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass custom label schema attribute keys to annotation backends #5502

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

ehofesmann
Copy link
Member

@ehofesmann ehofesmann commented Feb 19, 2025

What changes are proposed in this pull request?

Previously, the keys of attribute properties in an annotation label schema were filtered to only include a preconfigured set of values. For example, if you were to pass in this label schema:

{
  "gt": {
    "type": "detections", 
    "classes": [
      {
        "classes": ["c1"], 
        "attributes": {"a1": {"type": "select", "values": ["v1", "v2"], "otherattrf": "yes"}
  }
}

It would only pass along the "type" and "values" properties and remove the "otherattrf". This means that if a new annotation backend has custom attribute properties, there is no way to define those in the label schema.

How is this patch tested? If it is not, please explain why.

The cvat, labelbox, and labelstudio tests were rerun and passed.

Printing the label schema from inside of an annotation backend now shows the custom attributes.

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

Annotation label schema attributes now support custom attributes for annotation backends.

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

  • Refactor
    • Improved the handling of attribute data to ensure original definitions remain unchanged, enhancing overall stability.

@ehofesmann ehofesmann added the annotation Issues related to FiftyOne's annotation API label Feb 19, 2025
@ehofesmann ehofesmann requested a review from a team February 19, 2025 22:02
@ehofesmann ehofesmann self-assigned this Feb 19, 2025
Copy link
Contributor

coderabbitai bot commented Feb 19, 2025

Walkthrough

The changes modify the attribute handling in the _format_attributes function. Instead of altering the original attribute dictionary directly, the function now creates a copy and performs all modifications on this copy. The extraction of values, error checking, and key assignments remain intact, but the modification sequence is updated to ensure that the original input is preserved.

Changes

File Summary
fiftyone/.../annotations.py Updated _format_attributes to copy the input attribute dictionary before modification. Adjusted the sequence for key assignments while preserving original data.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Formatter
    Caller->>Formatter: Call _format_attributes(attr)
    Formatter->>Formatter: Copy attr to _attr
    Formatter->>Formatter: Extract keys (attr_type, values, default, mutable, read_only)
    Formatter->>Formatter: Validate attribute type and check for errors
    Formatter->>Caller: Return modified _attr
Loading

Suggested reviewers

  • brimoor

Poem

I'm a rabbit with a curious flair,
Hoping through code with a skip and a stare,
I see the changes, light and true,
Preserving the data in a clever new view,
With each fresh hop, the code renews 🐇✨
Hip-hip-hooray, let the changes ensue!


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa980aa and ab730f2.

📒 Files selected for processing (1)
  • fiftyone/utils/annotations.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fiftyone/utils/annotations.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: test / test-app
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build / build
  • GitHub Check: lint / eslint
  • GitHub Check: build

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

default = attr.get("default", None)
mutable = attr.get("mutable", None)
read_only = attr.get("read_only", None)
_attr = attr.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use deepcopy here instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation doesn't modify nested values so it doesn't matter. However, agree that deepcopy would be more proper.

@@ -978,7 +979,7 @@ def _format_attributes(backend, attributes):
)
)

_attr = {"type": attr_type}
_attr["type"] = attr_type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a 100% sure how this works, we already are getting attr_type from _attr in line 959 and attrr_type is not modified or updated in between?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @ehofesmann implements my suggestion above, then this line would be required.

As currently implemented, I agree it is a no-op.

read_only = attr.get("read_only", None)
_attr = attr.copy()
attr_type = _attr.get("type", None)
values = _attr.get("values", None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ehofesmann FYI I believe this slightly changes how _attr is constructed.

For example:

  • Previously if attr["mutable"] == None, then _attr would not contain a "mutable" key, since it is only added if it is non-None
  • Now as implemented, _attr would contain a "mutable" key with value None

I'm not sure if that change in behavior breaks anything, but you could maintain the current behavior if you used pop() instead of get():

attr_type = _attr.pop("type", None)
values = _attr.pop("values", None)
default = _attr.pop("default", None)
mutable = _attr.pop("mutable", None)
read_only = _attr.pop("read_only", None)

@@ -978,7 +979,7 @@ def _format_attributes(backend, attributes):
)
)

_attr = {"type": attr_type}
_attr["type"] = attr_type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @ehofesmann implements my suggestion above, then this line would be required.

As currently implemented, I agree it is a no-op.

default = attr.get("default", None)
mutable = attr.get("mutable", None)
read_only = attr.get("read_only", None)
_attr = attr.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation doesn't modify nested values so it doesn't matter. However, agree that deepcopy would be more proper.

@ehofesmann
Copy link
Member Author

Great calls @minhtuev and @brimoor , thanks!!

I like the idea of popping the attributes, I've made that change and used deepcopy instead of copy. I reran the cvat, label studio, and labelbox tests and they succeeded

Copy link
Contributor

@minhtuev minhtuev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks Eric!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
annotation Issues related to FiftyOne's annotation API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants