Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Parametrized query support for SELECTs and EXPLAINs #1725

Closed
wants to merge 14 commits into from

Conversation

larsschwegmann
Copy link
Contributor

This PR depends on the changes from this PR on tortoise/pypika-tortoise, which does not have a release yet.
So this cannot be merged until then.
If you have feedback on the idea i would be happy to hear it!

Description

This PR adds parametrization to SELECT and EXPLAIN queries for filter expressions by utilizing the ListParameter introduced in the tortoise-pypika PR mentioned above.
It allows us to get this functionality with minimal changes to the filter expressions, because all ValueWrappers in the query get support for Parametrization automatically when a parameter kwarg is supplied to pypika .get_sql() method on a QueryBuilder.
That way, we don't have to keep track of all parameters during query building.
I left the INSERT and UPDATE query building of the executor untouched, so that works like before. It could be refactored to use the ListParameter type as well.

Motivation and Context

Parametrization (as in prepared statements) provides better protection against SQL injection attacks. It is defined as an important goal for tortoises 1.0 roadmap -> Issue

How Has This Been Tested?

Unit tests are passing.

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added the changelog accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@abondar
Copy link
Member

abondar commented Oct 1, 2024

Hi!

Thanks for PR
I released pypika tortoise with version 0.2.0, you can use it now

@abondar
Copy link
Member

abondar commented Oct 1, 2024

Regarding refactoring insert and update statements - seems like good idea, if you could tackle it - would be great

@larsschwegmann
Copy link
Contributor Author

@abondar Thank you! I updated the pyproject.toml with the new pypika-tortoise version and ran poetry lock

@larsschwegmann
Copy link
Contributor Author

Regarding refactoring insert and update statements - seems like good idea, if you could tackle it - would be great

Not sure how busy my day is going to be today, but I'll try to get it in later today or tomorrow 👍
My project could really benefit from these changes :)

@larsschwegmann
Copy link
Contributor Author

seems like there are some errors from ruff. I'll get those fixed as well

@abondar
Copy link
Member

abondar commented Oct 1, 2024

If it's not too convoluted to describe - can you please share how does it help you? :)

Is it just about security concerns, as you have public APIs? Or it gives you some kind of new possibilities?

@larsschwegmann
Copy link
Contributor Author

If it's not too convoluted to describe - can you please share how does it help you? :)

Is it just about security concerns, as you have public APIs? Or it gives you some kind of new possibilities?

Yeah, it is mostly about security. We have an api that allows rich filtering of data and we do this with tortoise's Q expressions. It is an internal only tool, but I would have more peace of mind with prepared statements for all user input :)

@larsschwegmann
Copy link
Contributor Author

While fixing the code to make the ci tests pass, I noticed that there was still something wrong with the pypika ValueWrapper implementation, which I have addressed in this PR.
@abondar Could you please have a look at that? If that is merged, the tests should passt :)

I have also looked into refactoring the insert and update queries, but I came to the conclusion that its best to leave them as is, because using the parametrized ValueWrappers there as well would make caching the queries impossible. The approach we have there right now with caclulating the queries upfront and only adjusting the values supplied when executing it is already optimal i think.

@abondar
Copy link
Member

abondar commented Oct 2, 2024

@larsschwegmann I released 0.2.1 version of pypika

@larsschwegmann
Copy link
Contributor Author

There seems to be an issue with the extraction of Parameter values when nested inside pypika Functions, such as Upper(), which is used in the icontains filter, for example. I have fixed this in yet another PR for pypika-tortoise. Please review 😬
The "parameters" kwarg was not passed down to nested Terms during function sql generation.
tortoise/pypika-tortoise#13

@abondar
Copy link
Member

abondar commented Oct 2, 2024

@larsschwegmann any chances you can make your PR work locally with locally built version of pypika? So that we publish another version when we are sure it work properly for all testcases?

I think it would be even okay to vendor this version of lib in intermediate commit in this PR, if you are having issues with running all testscases locally

@larsschwegmann
Copy link
Contributor Author

@larsschwegmann any chances you can make your PR work locally with locally built version of pypika? So that we publish another version when we are sure it work properly for all testcases?

I think it would be even okay to vendor this version of lib in intermediate commit in this PR, if you are having issues with running all testscases locally

I'll try that 👍

@henadzit
Copy link
Contributor

henadzit commented Nov 11, 2024

Hey @larsschwegmann, thank you for working on this!

I'm getting up to speed with tortoise and parametrized queries is something that I'm interested in! I looked into your code and also into how pypika handles parametrized queries in general, below are my findings:

  • Currently the parameter values (I mean value in ... where field_name = 'value' ...) are wrapped with ValueWrapper, MySQLValueWrapper or others Pypika terms intended for literals. These terms might be doing character escaping, for examle, value.replace("\\", "\\\\") in the case of MySQL. However, I'm not sure that tortoise actually has to do escaping of parameter values. I checked the postgres and mysql database clients that tortoise use and they do escaping of parameter values by default! So, doing this in tortoise seems redundant.
  • Actually the fact that both tortoise and the database clients do escaping seems to be the reason why the test_char_fuzz fails in this PR.
  • I checked Django's codebase and they seem to escape only like and ilike queries which seems to be an exception
  • QuerySet has the sql method that returns SQL to be executed. With the code changes in this PR it returns different SQL with values in place.
  • I'm wondering if it would be more correct to make tortoise use ParameterValueWrapper or something when resolving the query in tortoise.

Please let me know what you think!

@henadzit
Copy link
Contributor

Here is my attempt to parametrize SELECT queries that incorporates some ideas from this PR #1777

@henadzit
Copy link
Contributor

Closing this in favor of #1777. @larsschwegmann, thank you for you work!

@henadzit henadzit closed this Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants