experiment.py: --save-confidences does not support multilingual drafting configurations #947

@mmartin9684-sil

If the --save-confidences argument is passed to the experiment script with a multilingual training configuration, the script fails. For instance, this corpus_pair configuration:

  - corpus_books: GEN;EXO;NT
    mapping: many_to_many
    src:
    - tpi-TPB12
    - en-NIrV
    trg: ena-AplDupl_2025_11_29
    type: train,test

will fail during the test stage because confidence file generation does not handle the multilingual predictions files produced for the test set (i.e., test.src.trg.trg-predictions.txt.checkpoint).

A sample stack trace is provided below.

100% 250/250 [01:13<00:00,  3.41ex/s]
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/experiment.py", line 295, in <module>
    main()
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/experiment.py", line 291, in main
    exp.run()
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/experiment.py", line 58, in run
    self.test()
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/experiment.py", line 82, in test
    test(
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/test.py", line 820, in test
    results[step] = test_checkpoint(
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/test.py", line 595, in test_checkpoint
    model.translate_test_files(
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/nmt/hugging_face_config.py", line 1218, in translate_test_files
    generate_confidence_files(
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/common/translator.py", line 402, in generate_confidence_files
    confidence_file = ConfidenceFile.from_draft_file_path(trg_draft_file_path)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/common/translator.py", line 202, in from_draft_file_path
    file_type = cls._get_confidence_file_type(trg_draft_file_path)
  File "/root/.clearml/venvs-builds/3.10/task_repository/silnlp.git/silnlp/common/translator.py", line 188, in _get_confidence_file_type
    raise ValueError(
ValueError: No confidence file type corresponds to trg_draft_file_path /root/M/MT/experiments/PNG/Apal/2026-02-22/NLLB.1.3B.tpi-TPB12+en-NIrV.ena-AplDupl/test.en.ena.trg-predictions.txt.5000. Expected a trg_draft_file_path starting with 'test.trg-predictions' or ending with .usfm/.sfm/.txt.
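
The error message states the rule being applied: the file name must start with 'test.trg-predictions' or end with .usfm/.sfm/.txt. The multilingual name test.en.ena.trg-predictions.txt.5000 satisfies neither. The sketch below is not the silnlp implementation; `current_check`, `is_test_predictions_file`, and the regular expression are hypothetical names reconstructed from the error message and the file name in the trace, and the single-pair name test.trg-predictions.txt.5000 is an assumed example. It only illustrates one way the check could be relaxed to accept the optional src/trg segments.

```python
import re
from pathlib import Path

# Hypothetical pattern (an assumption, not taken from silnlp). It accepts both
#   test.trg-predictions.txt[.<checkpoint>]              single-pair
#   test.<src>.<trg>.trg-predictions.txt[.<checkpoint>]  multilingual
TEST_PREDICTIONS_RE = re.compile(
    r"^test(?:\.(?P<src>[^.]+)\.(?P<trg>[^.]+))?"
    r"\.trg-predictions\.txt(?:\.(?P<checkpoint>[^.]+))?$"
)


def current_check(path: Path) -> bool:
    """Rule as described by the error message (a reconstruction, not the real code)."""
    return path.name.startswith("test.trg-predictions") or path.suffix in {
        ".usfm",
        ".sfm",
        ".txt",
    }


def is_test_predictions_file(path: Path) -> bool:
    """Relaxed check that also accepts the multilingual test.<src>.<trg>. prefix."""
    return TEST_PREDICTIONS_RE.match(path.name) is not None


multilingual = Path("test.en.ena.trg-predictions.txt.5000")
single_pair = Path("test.trg-predictions.txt.5000")  # assumed single-pair name

print(current_check(multilingual), is_test_predictions_file(multilingual))
print(current_check(single_pair), is_test_predictions_file(single_pair))
```

Under these assumptions, the reconstructed rule rejects the multilingual name because its suffix is .5000 and its prefix is test.en.ena, while the relaxed pattern accepts both names. Whether the confidence file type should be derived this way is a design decision for the maintainers.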

Labels

enhancement (New feature or request)

pipeline 5: test (Issue relating to testing a model quality with Bleu or other metrics.)

Projects

Status

✅ Done
