Python: Add API graph support for parameter annotations #18112

Merged
yoff merged 1 commit into main from tausbn/add-api-graph-support-for-parameter-annotations
Dec 5, 2024

Contributor

@tausbn tausbn commented Nov 26, 2024

Adds API graph support for observing that in

def foo(x : Bar): ...

the variable x is likely to be an instance of the type Bar inside this function.
In particular, we add getInstanceFromAnnotation as a predicate on API graph nodes that tracks this step (corresponding to a new edge type labeled with "annotation" in the API graph), and extend the existing getAnInstance predicate to also include instances arising from type annotations.
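
As an illustration only (this is plain Python using the standard `ast` module, not the CodeQL implementation), the step from an annotated parameter to its declared type can be sketched as follows. The helper name `annotated_parameters` is hypothetical, and the sketch ignores `*args` and `**kwargs`.

```python
import ast


def annotated_parameters(source: str) -> list[tuple[str, str, str]]:
    """Return (function, parameter, annotation) triples found in `source`.

    Hypothetical helper: it only mirrors the idea that a parameter
    annotated with a type is likely an instance of that type.
    """
    triples = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Positional-only, regular, and keyword-only parameters.
            for param in args.posonlyargs + args.args + args.kwonlyargs:
                if param.annotation is not None:
                    triples.append(
                        (node.name, param.arg, ast.unparse(param.annotation))
                    )
    return triples


print(annotated_parameters("def foo(x : Bar): ..."))
# → [('foo', 'x', 'Bar')]
```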

A more complete solution would also add support for annotated assignments (x : Foo = ... or just x : Foo) as well as track types through type aliases (type Foo = Bar). This turns out to be non-trivial, however, as these type constructs don't have any CFG nodes (and so no data-flow nodes by default either). So as not to let perfect be the enemy of good, this commit only targets the parameter annotation case (which is also likely to be the most common use case anyway).
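
For readers less familiar with the two annotated-assignment forms mentioned above, here is a small sketch using Python's `ast` module (again, not CodeQL) that tells them apart. The helper name `annotated_assignments` and the call `make()` in the sample input are made up for illustration.

```python
import ast


def annotated_assignments(source: str) -> list[tuple[str, str, bool]]:
    """Return (target, annotation, has_value) for each annotated assignment.

    Hypothetical helper: `has_value` is False for a bare ascription
    such as `x : Foo`, which declares a type without assigning anything.
    """
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.AnnAssign):
            results.append(
                (
                    ast.unparse(node.target),
                    ast.unparse(node.annotation),
                    node.value is not None,
                )
            )
    return results


sample = "x : Foo = make()\ny : Foo\n"
print(annotated_assignments(sample))
# → [('x', 'Foo', True), ('y', 'Foo', False)]
```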

The tests for API graphs have been extended accordingly, including tests for the kinds of type ascriptions that we don't currently model in API graphs (marked with MISSING: in the inline tests).

Pull Request checklist

All query authors

Internal query authors only

  • Autofixes generated based on these changes are valid, only needed if this PR makes significant changes to .ql, .qll, or .qhelp files. See the documentation (internal access required).
  • Changes are validated at scale (internal access required).
  • Adding a new query? Consider also adding the query to autofix.

Contributor Author

tausbn commented Nov 27, 2024

Performance comparison looks completely uneventful. Opening this up for review.

@tausbn tausbn marked this pull request as ready for review November 27, 2024 13:19
@tausbn tausbn requested a review from a team as a code owner November 27, 2024 13:19
local_x #$ MISSING: use=moduleImport("types").getMember("AssignmentAnnotation").getAnnotatedInstance()

global_x : AssignmentAnnotation #$ use=moduleImport("types").getMember("AssignmentAnnotation")
global_x #$ MISSING: use=moduleImport("types").getMember("AssignmentAnnotation").getAnnotatedInstance()
Contributor

Why is this missing? Is it because there is no assignment on the line above, so that global_x is not in getTarget (which is presumably empty)?

Contributor Author

Hmm... This is a very salient question. If I quick-eval the annotatedInstance predicate, I get four results:

  • ControlFlowNode for ImportMember, ControlFlowNode for global_x
  • ControlFlowNode for ImportMember, ControlFlowNode for parameter_y
  • ControlFlowNode for Alias, ControlFlowNode for parameter_z
  • ControlFlowNode for Alias, ControlFlowNode for global_z

So, we are picking up the instancing from the import statement (from ... import AssignmentAnnotation) to global_x : AssignmentAnnotation, but we're not picking up that the annotation is a use of that same identifier as in the import statement. What's curious, then, is that global_x isn't seen as an instance of AssignmentAnnotation. For global_z it makes sense, since we don't understand the simple type aliasing that's taking place on line 13.

Looking at getTarget, it does exist for the type ascription of global_x, but it seems that we do not track the flow between the two occurrences of global_x. I'm trying to figure out why now.

Contributor Author

A thought occurred to me after writing that message. Could it be that we're observing that global_x gets overwritten here (because it's the target of an assignment), but then when we go to see what value was assigned we don't find it (because it's just a type ascription)? That would explain the weird behaviour.

Contributor

Yes, that could be it. I wonder if global_x in global_x : AssignmentAnnotation should actually be considered a use rather than a def.
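
For context on the def-versus-use question, the following sketch shows how Python itself treats a bare annotation. It uses only the standard `ast` module and says nothing about how the CodeQL extractor or its SSA construction models the statement; both helper names are hypothetical. Syntactically the target is in a store context, yet at runtime the statement binds nothing, which is consistent with the hypothesis that treating it as an overwriting definition loses track of the value.

```python
import ast


def bare_annotation_target_context(source: str) -> str:
    """Return the AST context name of the target of a bare annotation."""
    stmt = ast.parse(source).body[0]
    # Guard: the input must be an annotation without an assigned value.
    if not (isinstance(stmt, ast.AnnAssign) and stmt.value is None):
        raise ValueError("expected a bare annotation such as 'x : T'")
    return type(stmt.target.ctx).__name__


def value_survives_bare_annotation() -> bool:
    """Check that a bare annotation does not rebind an existing local."""
    marker = object()
    x = marker
    x: int  # bare annotation: no value is assigned here
    return x is marker


print(bare_annotation_target_context("global_x : AssignmentAnnotation"))
# → Store
print(value_survives_bare_annotation())
# → True
```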

@yoff yoff self-requested a review December 5, 2024 14:02
Contributor

@yoff yoff left a comment

As discussed offline, let us merge this now; it is a clear improvement :-)

@yoff yoff merged commit 81c8a70 into main Dec 5, 2024
15 checks passed
@yoff yoff deleted the tausbn/add-api-graph-support-for-parameter-annotations branch December 5, 2024 14:05