A Paper List for Speech Translation

This is a paper list for speech translation.

Keywords: Speech Translation, Spoken Language Processing, Natural Language Processing

Paper List

Dataset

Construction and Utilization of Bilingual Speech Corpus for Simultaneous Machine Interpretation Research, InterSpeech-2005, [paper]
Approach to Corpus-based Interpreting Studies: Developing EPIC (European Parliament Interpreting Corpus), MuTra-2005, [paper]
Automatic Translation from Parallel Speech: Simultaneous Interpretation as MT Training Data, ASRU-2009, [paper]
The KIT Lecture Corpus for Speech Translation, LREC-2012, [paper]
Improved Speech-to-Text Translation with the Fisher and Callhome Spanish–English Speech Translation Corpus, IWSLT-2013, [paper]
Collection of a Simultaneous Translation Corpus for Comparative Analysis, LREC-2014, [paper]
Microsoft Speech Language Translation (MSLT) Corpus: The IWSLT 2016 release for English, French and German, IWSLT-2016, [paper]
The Microsoft Speech Language Translation (MSLT) Corpus for Chinese and Japanese: Conversational Test data for Machine Translation and Speech Recognition, Machine Translation-2017, [paper]
Amharic-English Speech Translation in Tourism Domain, SCNLP-2017, [paper]
A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiment, LREC-2018, [paper]
Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation, LREC-2018, [paper]
A Small Griko-Italian Speech Translation Corpus, SLTU-2019, [paper]
MuST-C: a Multilingual Speech Translation Corpus, NAACL-2019, [paper]
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible, Arxiv-2019, [paper]
How2: A Large-scale Dataset for Multimodal Language Understanding, NIPS-2018, [paper]
LibriVoxDeEn: A Corpus for German-to-English Speech Translation and Speech Recognition, LREC-2020, [paper]
Clotho: An Audio Captioning Dataset, Arxiv-2019, [paper]
Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates, Arxiv-2019, [paper]
CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus, Arxiv-2020, [paper]
MuST-Cinema: a Speech-to-Subtitles corpus, Arxiv-2020, [paper]

Pipeline ST

Phonetically-Oriented Word Error Alignment for Speech Recognition Error Analysis in Speech Translation, ASRU-2015, [paper]
Learning a Translation Model from Word Lattices, InterSpeech-2016, [paper]
Learning a Lexicon and Translation Model from Phoneme Lattices, EMNLP-2016, [paper]
Neural Lattice-to-Sequence Models for Uncertain Inputs, EMNLP-2017, [paper]
Using Spoken Word Posterior Features in Neural Machine Translation, IWSLT-2018, [paper]
Towards robust neural machine translation, ACL-2018, [paper]
Assessing the Tolerance of Neural Machine Translation Systems Against Speech Recognition Errors, InterSpeech-2019, [paper]
Lattice Transformer for Speech Translation, ACL-2019, [paper]
Self-Attentional Models for Lattice Inputs, ACL-2019, [paper]
Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training, IWSLT-2019, [paper]
Neural machine translation with acoustic embedding, ASRU-2019
Machine Translation in Pronunciation Space, Arxiv-2020, [paper]
Diversity by Phonetics and its Application in Neural Machine Translation, AAAI-2020, [paper]
Robust Neural Machine Translation for Clean and Noisy Speech Transcripts, IWSLT-2019, [paper]

End-to-end ST

Towards Speech Translation of Non Written Languages, IEEE-2006, [paper]
Towards speech-to-text translation without speech recognition, EACL-2017, [paper]
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation, NIPS-2016, [paper]
An Attentional Model for Speech Translation Without Transcription, NAACL-2016, [paper]
An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages, EMNLP-2016, [paper]
A Case Study on Using Speech-to-translation Alignments for Language Documentation, ComputEL-2017, [paper]
Spoken Term Discovery for Language Documentation Using Translations, SCNLP-2017, [paper]
Sequence-to-sequence Models Can Directly Translate Foreign Speech, InterSpeech-2017, [paper]
Structured-based Curriculum Learning for End-to-end English-Japanese Speech Translation, InterSpeech-2017, [paper]
End-to-End Speech Translation with the Transformer, IberSPEECH-2018, [paper]
Towards Fluent Translations from Disfluent Speech, SLT-2018, [paper]
Low-resource Speech-to-text Translation, InterSpeech-2018, [paper]
End-to-End Automatic Speech Translation of Audiobooks, ICASSP-2018, [paper]
Tied Multitask Learning for Neural Speech Translation, NAACL-2018, [paper]
Towards Unsupervised Speech to Text Translation, ICASSP-2019, [paper]
Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation, ICASSP-2019, [paper]
Towards End-to-end Speech-to-text Translation with Two-pass Decoding, ICASSP-2019, [paper]
Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation, TACL-2019, [paper]
Direct speech-to-speech translation with a sequence-to-sequence model, InterSpeech-2019, [paper]
End-to-End Speech Translation with Knowledge Distillation, InterSpeech-2019, [paper]
Fluent Translations from Disfluent Speech in End-to-End Speech Translation, NAACL-2019, [paper]
Pre-Training On High-Resource Speech Recognition Improves Low-Resource Speech-To-Text Translation, NAACL-2019, [paper]
Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation, ACL-2019, [paper]
Leveraging Out-of-Task Data for End-to-End Automatic Speech Translation, Arxiv-2019, [paper]
Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation, AAAI-2020, [paper]
Adapting Transformer to End-to-end Spoken Language Translation, InterSpeech-2019, [paper]
Unsupervised phonetic and word level discovery for speech to speech translation for unwritten languages, InterSpeech-2019, [paper]
Simuls2s: End-to-end Simultaneous Speech To Speech Translation, ICLR-2019 (under review), [paper]
Speech-To-Speech Translation Between Untranscribed Unknown Languages, ASRU-2019, [paper]
A comparative study on end-to-end speech to text translation, ASRU-2019, [paper]
Instance-Based Model Adaptation For Direct Speech Translation, ICASSP-2020 Submitted, [paper]
Analyzing ASR Pretraining For Low-Resource Speech-To-Text Translation, ICASSP-2020 Submitted, [paper]
ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task, IWSLT-2019, [paper]
Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade, IWSLT-2019, [paper]
Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning, Arxiv-2019, [paper]
On Using SpecAugment for End-to-End Speech Translation, IWSLT-2019, [paper]
Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding, AAAI-2020, [paper]
From Speech-To-Speech Translation To Automatic Dubbing, Arxiv-2020, [paper]
SkinAugment: Auto-Encoding Speaker Conversions For Automatic Speech Translation, ICASSP-2020, [paper]

Multilingual ST

Multilingual End-To-End Speech Translation, ASRU-2019, [paper]
One-To-Many Multilingual End-To-End Speech Translation, ASRU-2019, [paper]

Multimodal ST

Transformer-based Cascaded Multimodal Speech Translation, Arxiv-2019, [paper]

Streaming MT

Simultaneous translation of lectures and speeches, Machine Translation-2007, [paper]
Real-time incremental speech-to-speech translation of dialogs, NAACL-2012, [paper]
Incremental segmentation and decoding strategies for simultaneous translation, IJCNLP-2013, [paper]
Don't Until the Final Verb Wait: Reinforcement learning for simultaneous machine translation, EMNLP-2014, [paper]
Segmentation strategies for streaming speech translation, NAACL-2013, [paper]
Optimizing segmentation strategies for simultaneous speech translation, ACL-2014, [paper]
Syntax-based simultaneous translation through prediction of unseen syntactic constituents, ACL-IJCNLP-2015, [paper]
Simultaneous machine translation using deep reinforcement learning, ICML-2016, [paper]
Interpretese vs. translationese: The uniqueness of human strategies in simultaneous interpretation, NAACL-2016, [paper]
Can neural machine translation do simultaneous translation?, Arxiv-2016, [paper]
Learning to translate in real-time with neural machine translation, EACL-2017, [paper]
Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation, NAACL-2018, [paper]
Prediction Improves Simultaneous Neural Machine Translation, EMNLP-2018, [paper]
STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework, ACL-2019, [paper]
Simultaneous Translation with Flexible Policy via Restricted Imitation Learning, ACL-2019, [paper]
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation, ACL-2019, [paper]
Thinking Slow about Latency Evaluation for Simultaneous Machine Translation, Arxiv-2019, [paper]
DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting, Arxiv-2019, [paper]
Monotonic Multihead Attention, ICLR-2020 (under review), [paper]
How To Do Simultaneous Translation Better With Consecutive Neural Machine Translation, Arxiv-2019, [paper]
Simultaneous Neural Machine Translation using Connectionist Temporal Classification, Arxiv-2019, [paper]
Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation, Arxiv-2019, [paper]
Learning Coupled Policies for Simultaneous Machine Translation, Arxiv-2020, [paper]

Related Works

Spoken Language Understanding

Understanding Semantics from Speech Through Pre-training, Arxiv-2019, [paper]
Learning ASR-Robust Contextualized Embeddings for Spoken Language Understanding, Arxiv-2019, [paper]
A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding, Arxiv-2019, [paper]
Incremental processing of noisy user utterances in the spoken language understanding task, W-NUT-2019, [paper]
Adapting pretrained transformer to lattices for spoken language understanding, ASRU-2019
Efficient semi-supervised learning for natural language understanding by optimizing diversity, ASRU-2019, [paper]
Joint learning of word and label embeddings for sequence labelling in spoken language understanding, ASRU-2019, [paper]
Transfer learning for context-aware spoken language understanding, ASRU-2019
Speech Sentiment Analysis Via Pre-Trained Features From End-To-End ASR Models, Arxiv-2019, [paper]
Recent Advances in End-to-End Spoken Language Understanding, [paper]
Modeling Inter-Speaker Relationship In XLNet For Contextual Spoken Language Understanding, ICASSP-2020, [paper]

Text Normalization

A Hybrid Text Normalization System Using Multi-Head Self-Attention For Mandarin, ICASSP-2020, [paper]
A Unified Sequence-To-Sequence Front-End Model For Mandarin Text-To-Speech Synthesis, ICASSP-2020, [paper]

Workshop

IWSLT 2018, [link]
IWSLT 2019, [link]
IWSLT 2020, [link]
AutoSimTrans 2020, [link]

Copyright

By volunteers from the Institute of Automation, Chinese Academy of Sciences.
Feel free to open an issue or submit a pull request!

ucaslyc/speech_translation-papers
