gh-119609: Add PyUnicode_Export() and PyUnicode_Import() functions #119610

Open
wants to merge 13 commits into
base: main

vstinner
Member

@vstinner vstinner commented May 27, 2024

@vstinner
Member Author

Docs build failed because of #119607

@vstinner
Member Author

> Docs build failed because of #119607

PR rebased on top of this fix.

@zooba
Member

zooba commented May 28, 2024

Petr indicated on the issue that he's got Thoughts and will write them up this week.

@vstinner marked this pull request as draft May 29, 2024 12:21
@vstinner changed the title from "gh-119609: Add PyUnicode_AsNativeFormat() function" to "gh-119609: Add PyUnicode_Export() and PyUnicode_Import() functions" Jun 13, 2024
@vstinner marked this pull request as ready for review June 21, 2024 08:59
@vstinner
Member Author

@encukou: Please review the updated PR.

@encukou: I prefer to make sure that the exported string ends with a NUL character, rather than making sure that it's not the case. It's convenient and cheap.

I rebased the PR on the main branch, fixed the merge conflicts, updated the docs for the new function names, and included some of Petr's suggestions. I marked the PR as ready for review (it's no longer a draft).

Member

@encukou encukou left a comment

Looks good to me!
(I don't know about the other C API WG members.)

Doc/c-api/unicode.rst (outdated, resolved)
Doc/c-api/unicode.rst (resolved)
Doc/c-api/unicode.rst (resolved)
Doc/c-api/unicode.rst (outdated, resolved)
@vstinner
Member Author

@davidism: Does the proposed API solve your MarkupSafe use case?

@vstinner
Member Author

I created the "Add PyUnicode_Export() and PyUnicode_Import() to the limited C API" issue in the C API WG Decisions project.

@davidism

davidism commented Jun 24, 2024

I really appreciate you thinking about this after talking at PyCon! From what I can tell (I'm not very familiar with C), the MarkupSafe code would mostly remain the same, but use the two new abi3 functions instead of the existing functions? Perhaps you could show a brief example or a high level explanation of what I would change? https://github.com/pallets/markupsafe/blob/d12057361ad75c4569e2e61712c234acc69d5d0b/src/markupsafe/_speedups.c

@vstinner
Member Author

@davidism:

> I really appreciate you thinking about this after talking at PyCon!

You're welcome.

> From what I can tell (I'm not very familiar with C), the MarkupSafe code would mostly remain the same, but use the two new abi3 functions instead of the existing functions? Perhaps you could show a brief example or a high level explanation of what I would change?

I created a PR to show how these functions can be used: pallets/markupsafe#451

The stable ABI version is less efficient, since it requires allocating a UCS1/UCS2/UCS4 buffer first, writing into that buffer, and only then creating a Python str object from it. That's because PyUnicode_New() is excluded from the stable ABI.
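
To illustrate the "buffer first, str afterwards" pattern described above, here is a minimal sketch in plain C of the UCS1 case only. It is not the code from pallets/markupsafe#451: the names `entity_len` and `escape_ucs1` are hypothetical, and the replaced characters and entities are assumed to match what MarkupSafe's escape produces. The input is assumed to be a UCS1 buffer such as one obtained from the proposed PyUnicode_Export(); the result would then be handed to the proposed PyUnicode_Import(). Those two calls are left out because their exact signatures are not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if c must be escaped, store its HTML entity in
 * *entity and return the entity length; otherwise return 0. */
static size_t entity_len(uint8_t c, const char **entity)
{
    switch (c) {
    case '&':  *entity = "&amp;"; return 5;
    case '<':  *entity = "&lt;";  return 4;
    case '>':  *entity = "&gt;";  return 4;
    case '"':  *entity = "&#34;"; return 5;
    case '\'': *entity = "&#39;"; return 5;
    default:   *entity = NULL;    return 0;
    }
}

/* Escape a UCS1 (one byte per character) buffer into a newly allocated
 * UCS1 buffer. Returns NULL on allocation failure. On success, *out_len
 * is the number of characters written, excluding the trailing NUL.
 * The caller releases the result with free(). */
uint8_t *escape_ucs1(const uint8_t *in, size_t in_len, size_t *out_len)
{
    const char *entity;
    size_t n = 0;

    /* Pass 1: compute the escaped length so a single allocation suffices. */
    for (size_t i = 0; i < in_len; i++) {
        size_t e = entity_len(in[i], &entity);
        n += e ? e : 1;
    }

    /* Step 1 of the pattern: allocate the intermediate buffer. */
    uint8_t *out = (uint8_t *)malloc(n + 1);
    if (out == NULL) {
        return NULL;
    }

    /* Step 2: write the escaped characters into the buffer. */
    uint8_t *p = out;
    for (size_t i = 0; i < in_len; i++) {
        size_t e = entity_len(in[i], &entity);
        if (e) {
            memcpy(p, entity, e);
            p += e;
        }
        else {
            *p++ = in[i];
        }
    }
    *p = 0;  /* trailing NUL, for convenience only */

    /* Step 3 (not shown): create the Python str from out/n, then free out. */
    *out_len = n;
    return out;
}
```

With direct access to PyUnicode_New(), the second pass could write straight into the new str object's storage, so the intermediate allocation and the final copy would both disappear. That extra copy is the overhead the comment above refers to.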


5 participants