I have encountered inaccuracies in the labels provided in the dataset at https://huggingface.co/datasets/parler-tts/mls-eng-10k-tags_tagged_10k_generated.
The code:
from datasets import load_dataset
test_set = load_dataset("parler-tts/mls-eng-10k-tags_tagged_10k_generated", split="test")
test_set[0]
The output:
{'original_path': 'http://www.archive.org/download/lesmis3_0911_0911/lesmiserables_vol3_22_hugo_64kb.mp3', 'begin_time': 119.15, 'end_time': 132.26, 'audio_duration': 13.109999999999983, 'speaker_id': '7171', 'book_id': '3158', 'utterance_pitch_mean': 172.13397216796875, 'utterance_pitch_std': 71.41407012939453, 'snr': 47.84040069580078, 'c50': 57.13105392456055, 'speaking_rate': 'slightly slowly', 'phonemes': 'ʌnd hi nu ðʌ ʌndʒʌst ʃeɪm ʌnd ðʌ pɔɪnjʌnt blʌʃʌz ʌv ædmɜ˞ʌbʌl ʌnd tɛɹʌbʌl tɹaɪʌl fɹʌm wɪtʃ ðʌ fibʌl ɪmɜ˞dʒ beɪs fɹʌm wɪtʃ ðʌ stɹɔŋ ɪmɜ˞dʒ sʌblaɪm', 'gender': 'male', 'pitch': 'very high pitch', 'noise': 'moderate ambient sound', 'reverberation': 'very confined sounding', 'speech_monotony': 'slightly expressive', 'text_description': ' A man speaks with a slightly expressive tone in a confined space, his voice echoing slightly but overall sounding quite clear, with moderate ambient sound in the background. His pitch is very high, but his delivery is only slightly slower than normal.', 'original_text': 'and he knew the unjust shame and the poignant blushes of wretchedness admirable and terrible trial from which the feeble emerge base from which the strong emerge sublime', 'text': 'And he knew the unjust shame and the poignant blushes of wretchedness. Admirable and terrible trial from which the feeble emerge base, from which the strong emerge sublime.'}
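As a quick sanity check on the row above, the `audio_duration` field would be expected to equal `end_time - begin_time`. The sketch below hard-codes the three values from the printed output; `duration_matches` and its tolerance are hypothetical, chosen for illustration, and are not part of the dataset tooling.

```python
import math

def duration_matches(row, tol=1e-6):
    """Return True if audio_duration agrees with end_time - begin_time (in seconds)."""
    span = row["end_time"] - row["begin_time"]
    return math.isclose(span, row["audio_duration"], abs_tol=tol)

# Values copied from the test_set[0] output above.
row = {"begin_time": 119.15, "end_time": 132.26, "audio_duration": 13.109999999999983}
print(duration_matches(row))
```

This only confirms that the three timing fields are consistent with each other; it says nothing about whether they point at the right place in the source recording.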
Analysis:
The labels 'begin_time': 119.15 and 'end_time': 132.26 are in seconds, which corresponds to 1.9858 and 2.2043 minutes.
However, after listening to the audio at
http://www.archive.org/download/doublelifeofalfredburton_1801_librivox/doublelifealfredburton_14_oppenheim_64kb.mp3
I found that the beginning and end times of
'text': 'Mr Cowper looked at his visitor in amazement, my young friend. He said: are you going to tell me that you have seen one of these beans? Not only that, but i have eaten one. Burton said, in fact, i have eaten two.'
in the audio are 1.59 and 2.11 minutes, which do not align with the labels in the dataset.
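One way to make this comparison less error-prone is to look up the row by its `original_path` and print the labelled times both as decimal minutes and as minutes:seconds, since a media player usually shows the latter. This is a minimal sketch: `find_rows`, `to_decimal_minutes`, and `to_min_sec` are hypothetical helpers, and the code assumes that `begin_time` and `end_time` are offsets in seconds into the file at `original_path`.

```python
def to_decimal_minutes(seconds):
    """Seconds as a decimal number of minutes, e.g. 90 s -> 1.5."""
    return seconds / 60.0

def to_min_sec(seconds):
    """Seconds as an m:ss.cc string, e.g. 90 s -> '1:30.00'."""
    total_cs = round(seconds * 100)          # work in whole centiseconds
    minutes, cs = divmod(total_cs, 6000)
    return f"{minutes}:{cs / 100:05.2f}"

def find_rows(rows, path_fragment):
    """Return every row whose original_path contains path_fragment."""
    return [r for r in rows if path_fragment in r["original_path"]]

# Stand-in for the dataset: one row with fields copied from the output above.
# With the real data, pass test_set instead of this list (assumption: iterating
# the loaded split yields dict-like rows with these keys).
rows = [{
    "original_path": "http://www.archive.org/download/lesmis3_0911_0911/"
                     "lesmiserables_vol3_22_hugo_64kb.mp3",
    "begin_time": 119.15,
    "end_time": 132.26,
}]

for r in find_rows(rows, "lesmiserables_vol3_22"):
    for key in ("begin_time", "end_time"):
        s = r[key]
        print(key, s, "s =", round(to_decimal_minutes(s), 4), "min =", to_min_sec(s))
```

The same loop with `find_rows(test_set, "doublelifealfredburton_14")` would list the labelled times for the recording discussed here. If the 1.59 and 2.11 figures were read from a player's minutes:seconds display rather than computed as decimal minutes, they would need converting before being compared with the labels; which of the two applies is not stated in this report.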