ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 5658 and the array at index 1 has size 5640
#234 · Open · aziryasin opened this issue on Jun 24, 2020 · 2 comments
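For context, the ValueError quoted above is raised by NumPy when two per-utterance arrays with different frame counts are stacked column-wise; a minimal sketch that reproduces the same message (the shapes are illustrative, and the actual stacking call in pytorch-kaldi may differ):

```python
import numpy as np

fea = np.zeros((5658, 39))  # feature frames for one utterance (e.g. MFCC + deltas)
lab = np.zeros((5640, 1))   # alignment frames for the same utterance

# column_stack concatenates along axis 1, so axis 0 (the frame count) must
# match exactly; the 18-frame difference raises the ValueError above.
np.column_stack((fea, lab))
```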
You can modify the code in data_io.py:
line 50: `if k in fea` -> `if k in fea and v.shape[0] == fea[k].shape[0]`
line 53: `k: v for k, v in fea.items() if k in lab` -> `k: v for k, v in fea.items() if k in lab and v.shape[0] == lab[k].shape[0]`
to ensure that the features and the truth values (labels) have the same number of frames. A sketch of what these guards do is shown below.
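As a minimal sketch of what those two guards do (the dictionary contents and shapes below are illustrative, not the actual data_io.py code): an utterance is kept only when its feature matrix and its label vector have the same number of frames (the same first dimension), so the later per-utterance stacking cannot hit the mismatch above.

```python
import numpy as np

# Hypothetical per-utterance dictionaries: utterance id -> array with one row per frame.
fea = {"utt1": np.zeros((5658, 39)), "utt2": np.zeros((300, 39))}
lab = {"utt1": np.zeros(5640),       "utt2": np.zeros(300)}

# Guard suggested for line 50: keep a label only if a feature with the same
# frame count exists for that utterance.
lab = {k: v for k, v in lab.items() if k in fea and v.shape[0] == fea[k].shape[0]}

# Guard suggested for line 53: keep a feature only if its label has the same frame count.
fea = {k: v for k, v in fea.items() if k in lab and v.shape[0] == lab[k].shape[0]}

print(sorted(fea))  # ['utt2'] -- the mismatched utterance 'utt1' is dropped
```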
I'm getting this error. I checked for path errors, but I still get the same error. Please help.
My CFG:
```ini
[cfg_proto]
cfg_proto = proto/global.proto
cfg_proto_chunk = proto/global_chunk.proto
[exp]
cmd =
run_nn_script = run_nn
out_folder = exp/Sinhala_num
seed = 1234
use_cuda = False
multi_gpu = False
save_gpumem = False
n_epochs_tr = 24
[dataset1]
data_name = Sinhala_num_tr
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_train.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/train/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph
n_chunks = 5
[dataset2]
data_name = Sinhala_num_dev
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph
n_chunks = 1
[dataset3]
data_name = Sinhala_num_test
fea = fea_name=mfcc
fea_lst=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/feats.scp
fea_opts=apply-cmvn --utt2spk=ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/utt2spk ark:/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/mfcc/cmvn_test.ark ark:- ark:- | add-deltas --delta-order=2 ark:- ark:- |
cw_left=5
cw_right=5
lab = lab_name=lab_cd
lab_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1
lab_opts=ali-to-pdf
lab_count_file=auto
lab_data_folder=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/data/test/
lab_graph=/home/azir/kaldi/egs/SinhalaSpeechRecognizer/HMM_Based_4/exp/tri1/graph
n_chunks = 1
[data_use]
train_with = Sinhala_num_tr
valid_with = Sinhala_num_dev
forward_with = Sinhala_num_test
[batches]
batch_size_train = 128
max_seq_length_train = 1000
increase_seq_length_train = False
start_seq_len_train = 100
multply_factor_seq_len_train = 2
batch_size_valid = 128
max_seq_length_valid = 1000
[architecture1]
arch_name = MLP_layers1
arch_proto = proto/MLP.proto
arch_library = neural_networks
arch_class = MLP
arch_pretrain_file = none
arch_freeze = False
arch_seq_model = False
dnn_lay = 1024,1024,1024,1024,N_out_lab_cd
dnn_drop = 0.15,0.15,0.15,0.15,0.0
dnn_use_laynorm_inp = False
dnn_use_batchnorm_inp = False
dnn_use_batchnorm = True,True,True,True,False
dnn_use_laynorm = False,False,False,False,False
dnn_act = relu,relu,relu,relu,softmax
arch_lr = 0.08
arch_halving_factor = 0.5
arch_improvement_threshold = 0.001
arch_opt = sgd
opt_momentum = 0.0
opt_weight_decay = 0.0
opt_dampening = 0.0
opt_nesterov = False
[model]
model_proto = proto/model.proto
model = out_dnn1=compute(MLP_layers1,mfcc)
loss_final=cost_nll(out_dnn1,lab_cd)
err_final=cost_err(out_dnn1,lab_cd)
[forward]
forward_out = out_dnn1
normalize_posteriors = True
normalize_with_counts_from = lab_cd
save_out_file = False
require_decoding = True
[decoding]
decoding_script_folder = kaldi_decoding_scripts/
decoding_script = decode_dnn.sh
decoding_proto = proto/decoding.proto
min_active = 200
max_active = 7000
max_mem = 50000000
beam = 13.0
latbeam = 8.0
acwt = 0.2
max_arcs = -1
skip_scoring = false
scoring_script = local/score.sh
scoring_opts = "--min-lmwt 1 --max-lmwt 10"
norm_vars = False
```