Fix TDT beam search timestamp alignment #14912
Open
+130
−38
What does this PR do ?
Fixes two TDT beam search timestamp issues:
- Crash when decoding with `timestamps=True` (NoneType iteration error)
- Timestamps ~160ms late compared to greedy decoding

Collection: ASR
Changelog
- Store `token_durations` in `BatchedBeamHyps` for TDT models during beam search
- Add a `_timestamp_semantics` flag on `Hypothesis` objects
- Update `_compute_offsets_tdt` to handle both START and END timestamp semantics
- Correct the `"char"` field documentation from `List[str]` to `List[int]` (pre-existing bug: `y_sequence` contains integer token IDs, not strings)

Problem
Issue 1: Crash with beam search + timestamps
When using TDT beam search with `timestamps=True`, the code crashes with a NoneType iteration error.

Root cause: Beam search (`BatchedBeamHyps`) doesn't populate the `token_duration` field that `_compute_offsets_tdt` requires.

Issue 2: ~160ms timestamp offset
After fixing the crash (by computing durations from timestamp diffs), beam search timestamps are still ~160ms late compared to greedy. This occurs because:
- Beam search records `timestamp = timesteps + duration` (END times)
- Greedy decoding records `timestamp = timesteps` (START times)
- `_compute_offsets_tdt` assumed all timestamps were START times

Solution
Three-part approach:
1. Store durations in `BatchedBeamHyps` (already receiving them during beam search, now stored)
2. Add a `_timestamp_semantics` attribute on `Hypothesis` objects:
   - Beam search: `"end"` (timestamps are END times)
   - Greedy: `"start"` (timestamps are START times)
3. Compute offsets in `_compute_offsets_tdt` based on semantics:
   - `"end"`: `start_offset = timestamp - duration`, `end_offset = timestamp`
   - `"start"`: `start_offset = timestamp`, `end_offset = timestamp + duration`
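The two offset rules above can be illustrated with a minimal, standalone sketch. It does not use NeMo: `TokenTiming`, `compute_offsets`, and the sample values are hypothetical names and numbers made up for illustration, and the 80 ms frame stride is an assumption (the PR does not state it) chosen so that a two-frame duration reproduces the ~160ms figure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TokenTiming:
    """Hypothetical stand-in for one decoded token; not a NeMo class."""
    timestamp: int  # frame index recorded by the decoder
    duration: int   # TDT duration predicted for the token, in frames


def compute_offsets(tokens: List[TokenTiming], semantics: str) -> List[Tuple[int, int]]:
    """Return one (start_offset, end_offset) frame pair per token."""
    offsets = []
    for tok in tokens:
        if semantics == "end":
            # The recorded timestamp marks where the token ends.
            start, end = tok.timestamp - tok.duration, tok.timestamp
        elif semantics == "start":
            # The recorded timestamp marks where the token begins.
            start, end = tok.timestamp, tok.timestamp + tok.duration
        else:
            raise ValueError(f"unknown timestamp semantics: {semantics!r}")
        offsets.append((start, end))
    return offsets


# The same two tokens as each decoder would record them (illustrative values).
greedy = [TokenTiming(timestamp=3, duration=2), TokenTiming(timestamp=5, duration=4)]
beam = [TokenTiming(timestamp=5, duration=2), TokenTiming(timestamp=9, duration=4)]

# With the right semantics, both decoders yield identical offsets.
assert compute_offsets(greedy, "start") == compute_offsets(beam, "end")

# Reading END timestamps as START times shifts each token late by its duration.
FRAME_MS = 80  # assumed frame stride, not taken from the PR
wrong = compute_offsets(beam, "start")
right = compute_offsets(beam, "end")
shift_ms = [(w[0] - r[0]) * FRAME_MS for w, r in zip(wrong, right)]
print(shift_ms)  # [160, 320]
```

In this sketch the misread shift equals each token's own duration, which is why carrying the semantics on the hypothesis is enough to remove the offset without changing how either decoder records timestamps.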
Usage
No API changes. The fix is transparent to users.
Impact
GitHub Actions CI
Ready for CI. Please add "Run CICD" label.
Before your PR is "Ready for review"
Pre checks:
PR Type:
Who can review?
Per contributor guidelines, requesting review from the ASR team: @andrusenkoau