@jakelishman
Member

The previous regex used \w to match the initial character of an identifier and [\w\d] for continuation characters. In fact, \w matches every character for which str.isalnum returns True, plus the underscore. That includes all Unicode characters in the "digit" class, so \w is a superset of \d and digits were accepted as initial characters. \d itself matches only the Unicode category [Nd] (decimal digits), which is narrower than the full [N?] set of number categories (Nd, Nl, No), so variants of [^\W\d_] or the like do not give a usable letters-only class either; an exact match would need the third-party regex package or something similar.

Instead, this change makes the Qiskit exporter accept only a subset of valid OQ3 identifiers on export, rather than attempting complete matching. Additionally, digits are no longer accidentally included in the allowed set of initial characters for IDs (but they are still allowed in continuations).
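
The character-class behaviour described above can be checked directly with Python's standard re module. The snippet below is only an illustrative sketch: the name ASCII_IDENTIFIER and its pattern are assumptions chosen to show one possible "subset" match, and are not taken from the exporter's source.

```python
import re
import unicodedata

SUPERSCRIPT_TWO = "\u00b2"  # "²", Unicode category No (other number)

# \w accepts ordinary decimal digits, so on its own it is not a safe
# "initial character" class for identifiers.
digit_is_word = re.fullmatch(r"\w", "3") is not None

# \w also accepts non-decimal numeric characters such as "²" ...
sup_is_word = re.fullmatch(r"\w", SUPERSCRIPT_TWO) is not None
# ... while \d covers only category Nd, so it does not match "²".
sup_is_decimal = re.fullmatch(r"\d", SUPERSCRIPT_TWO) is not None
# As a result, "word character minus \d and underscore" still admits "²".
sup_is_letterlike = re.fullmatch(r"[^\W\d_]", SUPERSCRIPT_TWO) is not None

# Hypothetical ASCII-only subset of identifiers (an assumption for this
# sketch): no digit in the first position, digits allowed afterwards.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

print(unicodedata.category(SUPERSCRIPT_TWO))
print(digit_is_word, sup_is_word, sup_is_decimal, sup_is_letterlike)
print([bool(ASCII_IDENTIFIER.fullmatch(s)) for s in ("qr3", "3qr", "j\u00b2")])
```

Restricting the match to an explicit ASCII range avoids depending on how \w and \d partition the Unicode number categories, at the cost of rejecting some identifiers that OQ3 would accept.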

Summary

Details and comments

Fix #15303
Fix #15304

@jakelishman added this to the 2.2.4 milestone Nov 6, 2025
@jakelishman requested a review from a team as a code owner November 6, 2025 17:56
@jakelishman added the stable backport potential, Changelog: Bugfix, and mod: qasm3 labels Nov 6, 2025
@qiskit-bot
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@coveralls

coveralls commented Nov 6, 2025

Pull Request Test Coverage Report for Build 19320339788

Details

  • 4 of 4 (100.0%) changed or added relevant lines in 1 file are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.02%) to 88.198%

Files with Coverage Reduction | New Missed Lines | %
crates/transpiler/src/passes/gate_direction.rs | 1 | 96.86%
crates/qasm2/src/lex.rs | 2 | 92.54%

Totals
Change from base Build 19314831255: 0.02%
Covered Lines: 94236
Relevant Lines: 106846

💛 - Coveralls

@debasmita2102

Would it be worth adding a test case for consecutive invalid characters? For example, if a register is named "ab--cdef" (two hyphens), both hyphens should be escaped, giving "ab__cdef" (two underscores), right? The current tests cover single invalid characters, as in "3qr" and "j²", but I couldn't find one verifying that BAD_IDENTIFIER_CHARACTERS.sub("_", name) replaces every match when several invalid characters appear in a row.

To be clear, the conversion itself already behaves correctly; I am only asking about test coverage. Is this edge case worth covering, or do you think the existing tests are sufficient?
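
The behaviour in question can be sketched as follows. This is a minimal illustration, not the exporter's code: the pattern assigned to BAD_IDENTIFIER_CHARACTERS, the helper escape_identifier, and the choice to prefix a leading digit with an underscore are all assumptions made for the example.

```python
import re

# Assumed pattern for this sketch: anything outside ASCII letters,
# ASCII digits and the underscore counts as a bad character.
BAD_IDENTIFIER_CHARACTERS = re.compile(r"[^A-Za-z0-9_]")


def escape_identifier(name: str) -> str:
    """Hypothetical helper: return an ASCII-only identifier for `name`."""
    # Pattern.sub replaces every non-overlapping match, so each bad
    # character in a consecutive run gets its own underscore.
    escaped = BAD_IDENTIFIER_CHARACTERS.sub("_", name)
    # A digit may appear in continuations but not in the first position;
    # prefixing an underscore is one possible way to handle that.
    if re.match(r"[0-9]", escaped):
        escaped = "_" + escaped
    return escaped


for name in ("ab--cdef", "3qr", "j\u00b2"):
    print(name, "->", escape_identifier(name))
```

A test for consecutive invalid characters would then assert that "ab--cdef" maps to "ab__cdef", which confirms that the substitution is applied per character and that runs are not collapsed.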

@jakelishman
Member Author

Thanks Debasmita, that's a good idea. Done in 9745eab.



Development

Successfully merging this pull request may close these issues.

  • OpenQASM 3 exporter does not escape digits at the start of identifiers
  • OpenQASM 3 export does not escape unusual Unicode digits
