Word log-likelihood scores #17

Open
wants to merge 1 commit into
base: master


DomhnallBoyle

Hi, I created this pull request to add word log-likelihood scores to the output.

I have used this in other projects, and hopefully it will be useful to someone else too.

Thanks

@sourcery-ai sourcery-ai bot left a comment

PR Type: Enhancement

PR Summary: The pull request introduces an enhancement to the existing functionality by adding log-likelihood scores to the output of word and phone alignments. This change allows for a more detailed analysis of the alignment results by including the log-likelihood scores alongside the start and end times of phones within words.

Decision: Comment

📝 Type: 'Enhancement' - not supported yet.
  • Sourcery currently only approves 'Typo fix' PRs.
✅ Issue addressed: this change correctly addresses the issue or implements the desired feature.
No details provided.
✅ Small diff: the diff is small enough to approve with confidence.
No details provided.

General suggestions:

  • Consider adding error handling or validation for the new log-likelihood score extraction to prevent potential runtime errors due to unexpected data formats. This could improve the robustness of the feature and ensure consistent behavior across different datasets.
  • Given the addition of log-likelihood scores, it might be beneficial to update any related documentation or examples to demonstrate how to use and interpret these new scores. This could help users take full advantage of the new feature.

@@ -203,14 +202,15 @@ def read_aligned_mlf(mlffile, SR, wave_start):

# Append this phone to the latest word (sub-)list
ph = lines[j].split()[2]
log_likelihood = float(lines[j].split()[3])
suggestion (llm): Reading log_likelihood directly from lines[j].split()[3] assumes every line has at least four whitespace-separated fields. If a line has fewer, the index access raises an IndexError; if the fourth field is not numeric, float() raises a ValueError. Consider checking the field count and the value before accessing the index.
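
A minimal sketch of the kind of check suggested here, not code from this PR. It assumes, as the diff does, that field 2 of a line is the phone label and field 3 is the score. The helper name `parse_phone_fields` and the choice to return `None` for a missing or non-numeric score are hypothetical; the project may prefer to raise instead.

```python
from typing import Optional, Tuple


def parse_phone_fields(line: str) -> Tuple[str, Optional[float]]:
    """Return (phone, log_likelihood) from one whitespace-separated line.

    Hypothetical helper. Assumes field 2 is the phone label and field 3
    is the log-likelihood score, matching the indices used in the diff.
    """
    fields = line.split()
    if len(fields) < 3:
        # Without a phone label the line cannot be used at all.
        raise ValueError(
            f"expected at least 3 fields, got {len(fields)}: {line!r}"
        )
    phone = fields[2]
    try:
        log_likelihood = float(fields[3])
    except (IndexError, ValueError):
        # Score column missing or not a number: report it as unknown
        # instead of failing the whole alignment.
        log_likelihood = None
    return phone, log_likelihood
```

Inside the loop this would replace the two separate `split()` calls with `ph, log_likelihood = parse_phone_fields(lines[j])`, so callers that only need timings still work on files that carry no score column.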
