@coezbek coezbek commented Oct 13, 2025

What does this PR do?

This PR fixes a bug where audio tensors or NumPy arrays can't be passed to transcribe() when timestamps=True.

Changelog

  • This PR converts the batch format of audio tensors to match the format the data loader expects.
  • It also disables chunking when audio tensors are passed, because tensors carry no cutting information and must be cut by the caller.
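
Since chunking is disabled for tensor input, the caller has to cut long audio before calling transcribe(). Below is a minimal sketch of that caller-side step, assuming 1-D mono audio. The helper name `split_into_chunks` and its parameters are hypothetical and not part of NeMo; it relies only on `len()` and slicing, so it should work the same way on Python lists, NumPy arrays, and 1-D torch tensors. Each chunk is returned with its start offset in seconds, so timestamps produced per chunk can be shifted back onto the original timeline.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split a 1-D audio sequence into consecutive, non-overlapping chunks.

    Hypothetical helper (not a NeMo API). Returns a list of
    (start_offset_seconds, chunk) pairs; the last chunk may be shorter.
    """
    if sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("sample_rate and chunk_seconds must be positive")

    chunk_len = int(round(sample_rate * chunk_seconds))
    if chunk_len == 0:
        raise ValueError("chunk_seconds is too small for this sample_rate")

    chunks = []
    for start in range(0, len(samples), chunk_len):
        # Slicing past the end is safe: the final chunk is simply shorter.
        chunks.append((start / sample_rate, samples[start:start + chunk_len]))
    return chunks


# Example: 2.5 s of silence at 16 kHz, cut into 1 s chunks.
audio = [0.0] * 40000
chunks = split_into_chunks(audio, sample_rate=16000, chunk_seconds=1.0)
offsets = [offset for offset, _ in chunks]
lengths = [len(chunk) for _, chunk in chunks]
```

Cutting at fixed sample boundaries can split a word in half; a real caller would more likely cut at silences or use overlapping windows, which this sketch does not attempt.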

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • [NA] Did you write any new necessary tests?
  • [NA] Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

@titu1994, @redoctopus, @jbalam-nv, or @okuchaiev

Additional Information

@github-actions github-actions bot added the ASR label Oct 13, 2025
@nithinraok
Collaborator

Hi @coezbek, thanks for the PR.
Could you also add a test to this file that checks timestamps using NumPy or tensor input in addition to the audio file path?
See the test_aed_forced_aligned_timestamps function.
Add a similar function named test_aed_forced_aligned_timestamps_with_tensors.

@nithinraok
Collaborator

Please follow the steps at https://github.com/NVIDIA-NeMo/NeMo/pull/14918/checks?check_run_id=52593146372 to sign off your commits, and send the PR to the main branch instead of 2.5.0.

@coezbek
Author

coezbek commented Oct 13, 2025

@nithinraok I just wanted to point out one way to fix it. I am not sure I can put in the work for a comprehensive fix.
