shared_list does not have data_set in forward block with TIMIT tutorial #157
Comments
# --------FORWARD--------#
for forward_data in forward_data_lst:

    # Compute the number of chunks
    N_ck_forward = compute_n_chunks(out_folder, forward_data, ep, N_ep_str_format, 'forward')
    N_ck_str_format = '0' + str(max(math.ceil(np.log10(N_ck_forward)), 1)) + 'd'

    processes = list()
    info_files = list()

    for ck in range(N_ck_forward):

        if not is_production:
            print('Testing %s chunk = %i / %i' % (forward_data, ck + 1, N_ck_forward))
        else:
            print('Forwarding %s chunk = %i / %i' % (forward_data, ck + 1, N_ck_forward))

        # output file
        info_file = out_folder + '/exp_files/forward_' + forward_data + '_ep' + format(ep, N_ep_str_format) + '_ck' + format(ck, N_ck_str_format) + '.info'
        config_chunk_file = out_folder + '/exp_files/forward_' + forward_data + '_ep' + format(ep, N_ep_str_format) + '_ck' + format(ck, N_ck_str_format) + '.cfg'

        # Do forward if the chunk was not already processed
        if not os.path.exists(info_file):

            # Doing forward
            # getting the next chunk
            next_config_file = cfg_file_list[op_counter]

            # run chunk processing
            if _run_forwarding_in_subprocesses(config):
                shared_list = list()
                print("shared list", shared_list)
                output_folder = config['exp']['out_folder']
                save_gpumem = strtobool(config['exp']['save_gpumem'])
                use_cuda = strtobool(config['exp']['use_cuda'])
                p = read_next_chunk_into_shared_list_with_subprocess(read_lab_fea, shared_list, config_chunk_file, is_production, output_folder, wait_for_process=True)
                data_name, data_end_index_fea, data_end_index_lab, fea_dict, lab_dict, arch_dict, data_set_dict = extract_data_from_shared_list(shared_list)
                print("shared list", shared_list)
                print("output folder", output_folder)
                print("data_set_dict", type(data_set_dict))
                print("data_set_dict", data_set_dict)
                data_set_inp, data_set_ref = convert_numpy_to_torch(data_set_dict, save_gpumem, use_cuda)
When is shared_list overwritten?
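For anyone debugging this, a small helper like the one below (my own sketch, not part of run_exp.py; the name check_chunk_was_loaded is made up) turns the silent None into an explicit error right after extract_data_from_shared_list, so it is clear whether the subprocess ever produced a data_set:

```python
def check_chunk_was_loaded(shared_list, data_set_dict, config_chunk_file, output_folder):
    # Debugging helper (my own, not in run_exp.py): turn the silent failure
    # into an explicit error pointing at the chunk config and the log file.
    if data_set_dict is None or len(shared_list) == 0:
        raise RuntimeError(
            'The data-loading subprocess left shared_list with %i entries and '
            'data_set_dict=None for chunk config %s. Check the feature/label '
            'paths in that config and the log.log file in %s.'
            % (len(shared_list), config_chunk_file, output_folder)
        )

# Call it right after extract_data_from_shared_list(...):
# check_chunk_was_loaded(shared_list, data_set_dict, config_chunk_file, output_folder)
```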
Hi! Isn't it simply a problem with the path of the test dataset in the config file?
Yes, it looks like that!
I will check again.
I'm still in trouble. Error message: data_name, data_end_index_fea, data_end_index_lab, lab_dict, and data_set_dict are None. The chunk cfg (with the lab_folder settings) is exp/TIMIT_MLP_basic/exp_files/forward_TIMIT_test_ep23_ck0.cfg
Did you find a solution to this? I am having the exact same issue. I double-checked all paths in my cfg file and the same error is occurring. Note: I am using PyTorch-Kaldi on WSL without CUDA (there is still no CUDA support on WSL); I am not sure if this makes a difference.
It looks like an error in reading features and labels with Kaldi.
To debug, you can try to "manually" read the features in this way:
1- Select one ark file listed in /mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/feats.scp (e.g., quick_test/fbank/raw_fbank_dev.1.ark).
2- Run copy-feats ark:your_ark_file.ark ark,t:- . If everything works, you should see a lot of numbers in standard output. If it doesn't work, take a look at the error.
3- If it works, you can add the other options and write: copy-feats ark:your_ark.ark ark:- | apply-cmvn --utt2spk=ark:/mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/utt2spk ark:/mnt/mscteach_home/s1870525/dissertation/PruninNeuralNetworksSpeech/s5/data/test_dev93/data/cmvn_test_dev93.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark,t:- If it doesn't work, take a look at the error message.
You can also take a look at the log.log file you find in the output folder.
Please let me know if you are able to solve the data loading issue...
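If it helps, here is a rough Python version of the same check (my own sketch, not part of pytorch-kaldi). The ark/utt2spk/cmvn paths below are placeholders for your own files, and it assumes the Kaldi binaries are on your PATH:

```python
import subprocess

# Placeholder paths -- replace with your own ark, utt2spk and cmvn files.
ark = 'quick_test/fbank/raw_fbank_dev.1.ark'
utt2spk = 'data/dev/utt2spk'
cmvn = 'data/dev/cmvn_dev.ark'

# Same pipeline as in the steps above: copy-feats | apply-cmvn | add-deltas.
pipeline = (
    'copy-feats ark:%s ark:- | '
    'apply-cmvn --utt2spk=ark:%s ark:%s ark:- ark:- | '
    'add-deltas --delta-order=2 ark:- ark,t:-' % (ark, utt2spk, cmvn)
)

# Run it through a shell, discard the (huge) feature dump, keep stderr.
result = subprocess.run(pipeline, shell=True,
                        stdout=subprocess.DEVNULL, stderr=subprocess.PIPE)
if result.returncode != 0:
    print('Feature reading failed:')
    print(result.stderr.decode())
else:
    print('Features read correctly.')
```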
Thank you for the quick reply. I apologize if these are basic questions; I am new to using Kaldi and this toolkit. So I ran copy-feats ark:/home/spencer/kaldi/egs/timit/s5/mfcc/raw_mfcc_dev.1.ark ark,t:- and it ran just like you said it should, with a lot of numbers output to the terminal. After that I ran copy-feats ark:/home/spencer/kaldi/egs/timit/s5/mfcc/raw_mfcc_dev.1.ark ark:- | apply-cmvn --utt2spk=ark:/home/spencer/kaldi/egs/timit/data/dev/utt2spk ark:/home/spencer/kaldi/egs/timit/s5/data/cmvn_dev.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark,t:- and got the attached error. One thing I noticed is that there is no cmvn_dev.ark in my data folder (no .ark files at all in that folder). Is that meant to be the output, or should there be a .ark file there? It seems like the error is centered around that file.
[image: TIMITError]
<https://user-images.githubusercontent.com/49201733/66129779-8fc35180-e5be-11e9-8b3c-d0ea6a826948.PNG>

Does /home/spencer/kaldi/egs/timit/s5/data/cmvn_dev.ark exist?
Mirco
No, like I said, there are no .ark files in that folder (or its subfolders). I thought this might be an output folder, but it looks like the issue is in the creation of those files.
This cmvn file is created by Kaldi during the feature extraction phase and it performs mean and variance normalization. You should probably have the cmvn file somewhere else, like in data/dev/cmvn* or mfcc/cmvn*
Mirco
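If you want to locate them quickly, a throwaway snippet like this prints every cmvn file Kaldi produced (the egs path is the one used earlier in this thread; adjust it to your own checkout):

```python
import glob

# List every CMVN statistics file under the TIMIT egs directory.
for path in sorted(glob.glob('/home/spencer/kaldi/egs/timit/s5/**/cmvn*', recursive=True)):
    print(path)
```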
In case anyone else has this issue: I resolved it by bypassing the if statement on line 328 of run_exp.py. There was some issue in how the shared_list object was being created that I could not figure out, but the else branch runs the run_nn function in a similar fashion to the training and validation steps. So I commented out line 328 and created another variable set to False to bypass that if statement:
test=False  # if _run_forwarding_in_subprocesses(config)
if test:
This is weird, are you sure that you don't have a path problem only?
Yes, I checked all the paths in the config file and they were all correct. Bypassing that if statement, though, gave a result that looked very similar to the one in the tutorial.
[image: TIMITResult]
<https://user-images.githubusercontent.com/49201733/67582253-6ad28200-f717-11e9-9d6e-40d0d73a7744.PNG>

Interesting, we haven't experienced this issue on our side.
There is still an error in the log.log file apparently (I had not checked that file when I got the correct result). Something to do with decode_dnn.sh. It looks like the forward_TIMIT_test_ep*_ck*_out_dnn1_to_decode.ark files are not being created for some reason, though for whatever reason this does not seem to affect the outcome.
[image: TIMITError3]
<https://user-images.githubusercontent.com/49201733/67587312-965a6a00-f721-11e9-8b54-54dcbcebeef6.PNG>

Maybe this file has not been created because there is a problem with the test data. Could you check the test data more carefully?
Mirco
I am also having an error at the testing phase. When I printed shared_list, [...]
I used the same validation data [...]
@kumarh22 I got the same problem as you; have you solved it?
@mravanelli I also got the error in the test phase: Testing TIMIT_test chunk = 1 / 1. I "manually" read the features to debug as you said above. Step 2 works, and step 3 does not raise an error either (step 3 runs for a very long time but without error; the same happens with the eval file), and the log.log is just [...]
Is the problem happening if you use the validation or training set as the test set?
Yes. I use the validation set as the test set, but it still happens.
I find that when I use the GPU version, the problem does not appear anymore.
Had the same issue today. Here are some findings.
Why does it only happen when running on CPU? Because when the CPU is used, the forward pass runs in a subprocess, and the method that runs the forward pass in a subprocess uses another version of the read_lab_fea method here, read_lab_fea_refac01, while the same-process forward pass uses the original read_lab_fea method.
So why does it crash when using the read_lab_fea_refac01 method? First of all, because it switches to production mode when reading fea_dict, lab_dict, and arch_dict. By removing this line I fixed the initial issue. But there is another problem.
How to fix: you can update this method to return False. I tried to use read_lab instead of read_lab_fea_refac01 here, but it crashes anyway when trying to unpack the [...]
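For reference, this is roughly what the "return False" workaround looks like (a sketch only; _run_forwarding_in_subprocesses lives in the pytorch-kaldi code, I am only guessing at its original body, so adapt this to your own checkout):

```python
def _run_forwarding_in_subprocesses(config):
    # Workaround from this thread: always forward in the main process so the
    # original read_lab_fea path is used instead of read_lab_fea_refac01.
    # (As described above, the original function returns True on CPU-only
    # runs, i.e. roughly "not use_cuda".)
    return False
```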