Conversation

@Ash-Crow
Collaborator

@Ash-Crow Ash-Crow commented Nov 19, 2025

Purpose

The database stores each user's full name (obtained through OIDC), but the search only used the email field.

This change allows searching for a user by their first and/or last name (fix #929).

Because user names are more likely than emails to include diacritics, the search unaccents both the query and the database value (fix #1091). Emails are unaccented as well, so that internationalized domain names match whether or not the search term includes the accent.

Proposal

  • Add the ability to search by full name
  • Unaccent both the query and the database values when searching
  • Add an unaccented GIN index on the email and full_name fields of the User model
[Screenshot from 2025-11-19 14-44-22] [Screenshot from 2025-11-19 11-30-59]
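
The matching behaviour described in this proposal can be illustrated outside the database. The sketch below is a pure-Python approximation, not the Django query from this PR: `strip_accents`, `User`, and `search_users` are hypothetical names, substring matching and case-folding are assumptions, and NFKD stripping only approximates PostgreSQL's `unaccent` dictionary (which also rewrites characters such as `ß` or `œ`).

```python
import unicodedata
from dataclasses import dataclass


def strip_accents(text: str) -> str:
    """Drop combining marks after NFKD decomposition (rough stand-in for unaccent)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


@dataclass
class User:
    # Hypothetical stand-in for the real User model.
    full_name: str
    email: str


def search_users(users, query):
    """Return users whose full name or email contains the query, ignoring accents and case."""
    needle = strip_accents(query).casefold()
    return [
        user
        for user in users
        if needle in strip_accents(user.full_name).casefold()
        or needle in strip_accents(user.email).casefold()
    ]


# Made-up sample data for illustration only.
USERS = [
    User("Hélène Dupré", "helene.dupre@example.org"),
    User("Jose Garcia", "jose@société.example"),
]

by_plain_query = search_users(USERS, "helene")     # unaccented query, accented name
by_accented_query = search_users(USERS, "José")    # accented query, unaccented name
by_domain = search_users(USERS, "societe")         # unaccented query, accented domain
```

Both sides are normalized before comparison, so it does not matter whether the accent appears in the query, in the stored value, or in both.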

External contributions

Thank you for your contribution! 🎉

Please ensure the following items are checked before submitting your pull request:

  • I have read and followed the contributing guidelines
  • I have read and agreed to the Code of Conduct
  • I have signed off my commits with git commit --signoff (DCO compliance)
  • I have signed my commits with my SSH or GPG key (git commit -S)
  • My commit messages follow the required format: <gitmoji>(type) title description
  • I have added a changelog entry under ## [Unreleased] section (if noticeable change)
  • I have added corresponding tests for new features or bug fixes (if applicable)

@Ash-Crow Ash-Crow requested a review from lunika November 19, 2025 13:55
@Ash-Crow Ash-Crow force-pushed the user-search-accents branch 5 times, most recently from 33d70ed to 7e5e69b Compare November 19, 2025 15:13
@lunika
Member

lunika commented Nov 19, 2025

I think we can also introduce new indexes on the User model? One on the full_name column and another on the email column? WDYT?

@Ash-Crow
Collaborator Author

I think we can also introduce new indexes on the User model? One on the full_name column and another on the email column? WDYT?

Good idea!

@Ash-Crow
Collaborator Author

I tried to create a GinIndex combined with Unaccent through an OpClass:

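from django.contrib.postgres.indexes import GinIndex, OpClass
from django.db.models import F, Func

# Attempted Meta.indexes entries for the User model.
indexes = [
    GinIndex(
        OpClass(Func(F("email"), function="unaccent"), name="gin_trgm_ops"),
        name="user_email_unaccent_trgm_idx",
    ),
    GinIndex(
        OpClass(Func(F("full_name"), function="unaccent"), name="gin_trgm_ops"),
        name="user_name_unaccent_trgm_idx",
    ),
]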

but apparently it is impossible, because Unaccent is only STABLE, not IMMUTABLE (per this StackOverflow thread).

There are a few workarounds, but I'm not sure which is the best approach.
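
One commonly cited workaround is to wrap `unaccent` in a SQL function declared IMMUTABLE and index on that wrapper. The sketch below only builds the SQL as Python strings; it is an illustration under assumptions, not the migration from this PR. The wrapper name `f_unaccent` and the table name `users_user` are hypothetical, the `unaccent` extension is assumed to live in the `public` schema, and both `unaccent` and `pg_trgm` are assumed to be installed. The index names are the ones from the attempt above.

```python
# unaccent() is only STABLE, so PostgreSQL refuses it in an index expression.
# A thin SQL wrapper that pins the dictionary can be declared IMMUTABLE instead.
# "f_unaccent" is a hypothetical name for that wrapper.
CREATE_WRAPPER_SQL = """
CREATE OR REPLACE FUNCTION f_unaccent(text)
RETURNS text AS
$func$
SELECT public.unaccent('public.unaccent', $1)
$func$ LANGUAGE sql IMMUTABLE;
"""

DROP_WRAPPER_SQL = "DROP FUNCTION IF EXISTS f_unaccent(text);"


def create_trgm_index_sql(table: str, column: str, index_name: str) -> str:
    """Build a CREATE INDEX statement for a trigram GIN index on f_unaccent(column)."""
    return (
        f'CREATE INDEX IF NOT EXISTS "{index_name}" '
        f'ON "{table}" USING gin (f_unaccent("{column}") gin_trgm_ops);'
    )


def drop_index_sql(index_name: str) -> str:
    """Build the matching DROP INDEX statement."""
    return f'DROP INDEX IF EXISTS "{index_name}";'


USER_TABLE = "users_user"  # hypothetical table name for the User model

FORWARD_SQL = [
    CREATE_WRAPPER_SQL,
    create_trgm_index_sql(USER_TABLE, "email", "user_email_unaccent_trgm_idx"),
    create_trgm_index_sql(USER_TABLE, "full_name", "user_name_unaccent_trgm_idx"),
]

# Drop the indexes before the function they depend on.
REVERSE_SQL = [
    drop_index_sql("user_name_unaccent_trgm_idx"),
    drop_index_sql("user_email_unaccent_trgm_idx"),
    DROP_WRAPPER_SQL,
]
```

In a Django project, lists like these could be passed to `migrations.RunSQL(sql=FORWARD_SQL, reverse_sql=REVERSE_SQL)` in a hand-written migration. Note that an expression index is only considered when the query uses the same expression, so the search would need to call the wrapper rather than `unaccent` directly.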

@github-actions

🚀 Preview will be available at https://1637-docs.ppr-docs.beta.numerique.gouv.fr/

You can use the existing account with these credentials:

  • username: docs
  • password: docs

You can also create a new account if you want to.

Once this Pull Request is merged, the preview will be destroyed.

@lunika
Member

lunika commented Nov 20, 2025

Can you squash your commits? They are all related, so you can keep only one.

I added the preview label; once you have force-pushed, this PR will be available to test.

@Ash-Crow Ash-Crow force-pushed the user-search-accents branch from a98425a to 2826db8 Compare November 20, 2025 16:38
@Ash-Crow
Collaborator Author

I added the requested index.

For what it's worth, I checked the queries with EXPLAIN ANALYZE:

Before the migration

 Planning Time: 0.137 ms
 Execution Time: 0.848 ms
(8 rows)

After the migration (same query)

 Planning Time: 0.266 ms
 Execution Time: 0.440 ms
(8 rows)

After the migration (with redundant calls to Unaccent removed)

 Planning Time: 0.162 ms
 Execution Time: 0.335 ms
(8 rows)
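
To put the execution times above side by side, here is a small arithmetic sketch. It only uses the three figures reported in this comment; the labels and the `reduction_pct` helper are hypothetical, and since these are single sub-millisecond runs the percentages should be read as indicative rather than as a benchmark.

```python
# Execution times in milliseconds, copied from the EXPLAIN ANALYZE output above.
EXECUTION_MS = {
    "before_migration": 0.848,
    "after_migration_same_query": 0.440,
    "after_migration_deduplicated_unaccent": 0.335,
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage reduction of value relative to baseline."""
    return (baseline - value) / baseline * 100


baseline = EXECUTION_MS["before_migration"]
for label, value in EXECUTION_MS.items():
    if label == "before_migration":
        continue
    print(f"{label}: {reduction_pct(baseline, value):.1f}% less execution time")
```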

@Ash-Crow Ash-Crow force-pushed the user-search-accents branch 3 times, most recently from 44beedc to d2befee Compare November 20, 2025 17:13
The database stores each user's full name (obtained through OIDC), but
the search only used the email field.
This change allows searching for a user by their first and/or last name
(fix #929).
Because user names are more likely than emails to include diacritics,
the search unaccents both the query and the database value
(fix #1091).
Emails are unaccented as well, so that internationalized domain names
match whether or not the search term includes the accent.
An unaccented GIN index is added on the users' full_name and email
fields, using a manual migration because a wrapper around unaccent is
necessary to make it IMMUTABLE (cf.
https://stackoverflow.com/questions/9063402/ )
@Ash-Crow Ash-Crow force-pushed the user-search-accents branch from d2befee to db94bf5 Compare November 20, 2025 17:16
Development

Successfully merging this pull request may close these issues.

  • User search is sensitive to special characters
  • Use first name and last name for user search

3 participants