Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 5658 and the array at index 1 has size 5640 #234

Open
aziryasin opened this issue Jun 24, 2020 · 2 comments

Comments

@aziryasin
Copy link

I'm getting this error. I checked the for path errors like this . But getting this error. Please help.

My CFG:
`[cfg_proto]
cfg_proto = proto/global.proto
cfg_proto_chunk = proto/global_chunk.proto

[exp]
cmd =
run_nn_script = run_nn
out_folder = exp/Sinhala_num
seed = 1234
use_cuda = False
multi_gpu = False
save_gpumem = False
n_epochs_tr = 24

[dataset1]
data_name = Sinhala_num_tr
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_train.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5

lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph

n_chunks = 5

[dataset2]
data_name = Sinhala_num_dev
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5

lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph

n_chunks = 1

[dataset3]
data_name = Sinhala_num_test
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5

lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph

n_chunks = 1

[data_use]
train_with = Sinhala_num_tr
valid_with = Sinhala_num_dev
forward_with = Sinhala_num_test

[batches]
batch_size_train = 128
max_seq_length_train = 1000
increase_seq_length_train = False
start_seq_len_train = 100
multply_factor_seq_len_train = 2
batch_size_valid = 128
max_seq_length_valid = 1000

[architecture1]
arch_name = MLP_layers1
arch_proto = proto/MLP.proto
arch_library = neural_networks
arch_class = MLP
arch_pretrain_file = none
arch_freeze = False
arch_seq_model = False
dnn_lay = 1024,1024,1024,1024,N_out_lab_cd
dnn_drop = 0.15,0.15,0.15,0.15,0.0
dnn_use_laynorm_inp = False
dnn_use_batchnorm_inp = False
dnn_use_batchnorm = True,True,True,True,False
dnn_use_laynorm = False,False,False,False,False
dnn_act = relu,relu,relu,relu,softmax
arch_lr = 0.08
arch_halving_factor = 0.5
arch_improvement_threshold = 0.001
arch_opt = sgd
opt_momentum = 0.0
opt_weight_decay = 0.0
opt_dampening = 0.0
opt_nesterov = False

[model]
model_proto = proto/model.proto
model = out_dnn1=compute(MLP_layers1,mfcc)
loss_final=cost_nll(out_dnn1,lab_cd)
err_final=cost_err(out_dnn1,lab_cd)

[forward]
forward_out = out_dnn1
normalize_posteriors = True
normalize_with_counts_from = lab_cd
save_out_file = False
require_decoding = True

[decoding]
decoding_script_folder = kaldi_decoding_scripts/
decoding_script = decode_dnn.sh
decoding_proto = proto/decoding.proto
min_active = 200
max_active = 7000
max_mem = 50000000
beam = 13.0
latbeam = 8.0
acwt = 0.2
max_arcs = -1
skip_scoring = false
scoring_script = local/score.sh
scoring_opts = "--min-lmwt 1 --max-lmwt 10"
norm_vars = False
`

image

@Godaddy-xie
Copy link

You can modify codes in data_io.py
50: if k in fea -----> if k in fea and and v.shape[0] == fea[k].shape[0-
53: k: v for k, v in fea.items() if k in lab ---> k: v for k, v in fea.items() if k in lab and v.shape[0] = lab[k].shape[0]
to ensure that the feature and the truth value have the same size (frames).

@anketvit
Copy link

Having same issue, not resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants