Skip to content

Conversation

@edan-bainglass
Copy link
Member

@edan-bainglass edan-bainglass commented Oct 12, 2025

Based on #87

This PR discards the Pydantic models introduced in aiida-restapi in favor of those introduced in AiiDA in aiidateam/aiida-core#6255 and updated in aiidateam/aiida-core#6990.

The PR also addresses a few endpoints, but this can be separated out to another PR.

TODO

  • Use entry points (unique) instead of class name for Node model references
  • Migrate remaining entry points (e.g., links) - possibly in a separate PR?

@edan-bainglass edan-bainglass changed the title Use-aiida-orm-models Leverage AiiDA ORM models in REST API Oct 12, 2025
@edan-bainglass edan-bainglass marked this pull request as ready for review November 19, 2025 09:28
@edan-bainglass edan-bainglass marked this pull request as draft November 19, 2025 14:13
@edan-bainglass edan-bainglass marked this pull request as ready for review November 20, 2025 07:41
Copy link
Member

@eimrek eimrek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @edan-bainglass. I think these changes are great, and i like the overall structure. Some extra high-level comments:

  • Some of the top level files (e.g. aiida_db_mapping) are only used in GraphQL now. Maybe for clarity they could be moved inside that subfolder.

  • As discussed earlier, the /nodes endpoint should not include /attributes and /extras/ by default as those can be large. But this also applies to other endpoints - e.g. for /groups i currently get a lot of output due to extras, but i would just want the list.

  • (also mentioned already before, but important: the uuid should be used instead of pk when retrieving singular entities).

  • Bit of a conceptual question: in the POST models, the attributes are not contained in a nested dictionary any more, so e.g.

    {
      "node_type": "data.core.int.Int",
      "value": 8
    } 
    

    instead of

    {
      "node_type": "data.core.int.Int",
      "attributes": {"value": 8}
    } 
    

    This means that for nodes that have a lot of attributes, it's more difficult to distinguish from other types of inputs (e.g. extras that are still in a dictionary). Additionally, this doesn't reflect the corresponding GET models. thoughts?

node = self.entity_cls.collection.get(pk=model.pk)

total_size = 0
for path in repo_metadata['o']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the 'o' signify?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No Idea. This is from aiida-core. Object? See aiida.repository.common.File.serialize:

    def serialize(self) -> dict[str, typing.Any]:
        """Serialize the metadata into a JSON-serializable format.

        .. note:: the serialization format is optimized to reduce the size in bytes.

        :return: dictionary with the content metadata.
        """
        if self.file_type == FileType.DIRECTORY:
            if self.objects:
                return {'o': {key: obj.serialize() for key, obj in self.objects.items()}}
            return {}
        return {'k': self.key}

raise HTTPException(status_code=404, detail=f'Could not find any Computer with id {comp_id}')


@router.post(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

500 if i try to create a computer with a label that already exists. Would be great to check for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to update all try/except blocks, which currently (mostly) only handle Exception. Need to at least add a few obvious exceptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this should be changed in aiida-core (the pydantic PR). The computer model should validate this. Then we get it via FastAPI automatically. I'll update the pydantic PR.


total_size = 0
for path in repo_metadata['o']:
size = node.base.repository.get_object_size(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be slow for many files? (e.g. for the s3 backend or so on). Either way good to keep in a more specialized endpoint.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ends up as part of the repository metadata. which, like attributes and extras, I'm thinking of removing by default from the node GET, placing them back via dedicated endpoints, e.g., /metadata, etc.

]


@router.get(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just be achieved by querying the plain /nodes with a type filter, right?

Copy link
Member Author

@edan-bainglass edan-bainglass Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. That's what this end point does. The /nodes/types/... endpoints are for convenience. It provides pre-constructed links for much of what you'd want. I added them when I switched to keying the models/types dictionaries with node types, as node_type isn't great to type.

Maybe you can say more?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see. I just wonder if it's better to just have a single /nodes endpoint, just to keep the API easier to grasp and the code leaner.

raise HTTPException(status_code=404, detail=f'Could no find any node with id {nodes_id}')
raise HTTPException(status_code=404, detail=f'Could not find any node with id {node_id}')

if download_format is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice here to return the download formats for the node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not so sure. This endpoint is to be hit AFTER the user knows the download formats, for which we have another endpoint. See the comment there regarding adding a per-type endpoint also.

:return: A dictionary with available download formats as keys and their descriptions as values.
"""
return resources.get_all_download_formats()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the node types still have the pipe symbol at the end. I guess we can to drop this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure 🙂 I left this endpoint as is. Good to know how they are used in practice. Also, I can add an endpoint for download formats per node type, and add a pre-constructed one to the response of /nodes/types. Again, for convenience, as node_type is no fun 🙂


@router.post('/nodes/singlefile', response_model=models.Node)
# TODO what about folderdata?
@router.post(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this endpoint should instead be unified the the /nodes POST somehow. Many nodes can have files inside them apart from FolderData and SingleFileData, and creating them might not make sense without uploading the files at the same time (e.g. BandsData, ...).

Copy link
Member Author

@edan-bainglass edan-bainglass Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss in person

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at the /nodes/file-upload endpoint's parameters. Unlike /nodes, it does not take a model as input, but rather a parameters Form + an UploadFile. This is not my work. See #25 (comment). I believe this makes this endpoint incompatible w.r.t the /nodes endpoint

@edan-bainglass
Copy link
Member Author

edan-bainglass commented Dec 4, 2025

Thanks for the review @eimrek. Most of your comments I already agree with (per our in-person discussion). I provide further comments in the review, and also here below 🙂

Thanks @edan-bainglass. I think these changes are great, and i like the overall structure. Some extra high-level comments:

  • Some of the top level files (e.g. aiida_db_mapping) are only used in GraphQL now. Maybe for clarity they could be moved inside that subfolder.

Will do 👍

  • As discussed earlier, the /nodes endpoint should not include /attributes and /extras/ by default as those can be large. But this also applies to other endpoints - e.g. for /groups i currently get a lot of output due to extras, but i would just want the list.

Will do, also for repo metadata (anything that can be large) 👍

  • (also mentioned already before, but important: the uuid should be used instead of pk when retrieving singular entities).

This was using PKs prior to this PR, so will keep it as is here. Will open another PR to move everything to UUIDs 👍

  • Bit of a conceptual question: in the POST models, the attributes are not contained in a nested dictionary any more, so e.g.
    {
      "node_type": "data.core.int.Int",
      "value": 8
    } 
    
    instead of
    {
      "node_type": "data.core.int.Int",
      "attributes": {"value": 8}
    } 
    
    This means that for nodes that have a lot of attributes, it's more difficult to distinguish from other types of inputs (e.g. extras that are still in a dictionary). Additionally, this doesn't reflect the corresponding GET models. thoughts?

GET returns NodeModel, whereas POST works with NodeCreateModel, or rather specific ones, e.g., IntCreateModel, therefor "doesn't reflect". The idea of moving attributes outside of the attributes dict is to enable validation. If they are provided in attributes they hit Node.Model.attributes, which is a dict, so the attributes aren't actually validated. But IntCreateModel has value in its model, so providing it flat WILL validate it against IntCreateModel. As for extras, well that's REALLY just a dict and will be validated as such against the inherited Node.Model.extras field.

@eimrek
Copy link
Member

eimrek commented Dec 4, 2025

GET returns NodeModel, whereas POST works with NodeCreateModel, or rather specific ones, e.g., IntCreateModel, therefor "doesn't reflect". The idea of moving attributes outside of the attributes dict is to enable validation. If they are provided in attributes they hit Node.Model.attributes, which is a dict, so the attributes aren't actually validated. But IntCreateModel has value in its model, so providing it flat WILL validate it against IntCreateModel. As for extras, well that's REALLY just a dict and will be validated as such against the inherited Node.Model.extras field.

Couldn't one in principle validate the attributes dict separately for each node type? if it's possible, it might be worth considering, e.g. for potential ideas in the future to just easily GET and POST nodes (e.g. to transfer via the web from one db to another). But of course if it makes the code very messy, this probably also is fine.

@edan-bainglass
Copy link
Member Author

edan-bainglass commented Dec 4, 2025

Couldn't one in principle validate the attributes dict separately for each node type? if it's possible, it might be worth considering, e.g. for potential ideas in the future to just easily GET and POST nodes (e.g. to transfer via the web from one db to another). But of course if it makes the code very messy, this probably also is fine.

Sure, but not via the built-in pydantic model system. It would be an ad-hoc system, which is potentially messier, as you say. But good to keep this in mind.

Update

Referencing here @eimrek's proposal detailed in aiidateam/aiida-core#7131

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants