Core Technologies for Statistical Machine Translation

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term Funded from 2017 to 2020
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 327471424
 
Year of creation 2020

Summary of project results

This work investigates various core technologies for machine translation. In a field where neural machine translation predominates, we explore alternative network architectures for the task rather than making minor changes to state-of-the-art approaches. Our novel neural-network-based direct hidden Markov model, which has a more interpretable architecture, achieves performance comparable to the strong LSTM-RNN-based attention model. Our two-dimensional sequence-to-sequence model, a network that uses a two-dimensional LSTM unit to read the source and target sentences jointly, even surpasses the state-of-the-art attention baseline. Both architectures are slower in training and decoding, which may limit their use in industrial applications, but we believe they offer insights into the system that are lost when a neural machine translation system is treated as a black box. On top of the transformer-based architecture, we improve word alignment quality for dictionary-guided translation and analyze different types of positional encoding. We also propose alternative approaches to unsupervised machine translation and transfer learning that bring significant improvements over the state-of-the-art baselines. Moreover, we develop strategies for weight initialization and normalization and apply kernel functions in the softmax layer to improve neural language models. In addition to the research results, we continuously participate in global evaluation campaigns to test our open-source tools such as Sisyphus, uniblock and Extended Edit Distance. Our systems consistently rank among the best across various types of evaluation tasks. Our punctuation prediction models perform well and play an important role in the cascaded speech-to-text translation system.
In addition, the end-to-end speech translation model is now available in RETURNN, our speech recognition and machine translation toolkit, with which we participate in the IWSLT 2020 speech translation task.

Project-related publications (selection)

  • (2017). Biasing attention-based recurrent neural networks using external alignment information. In Proceedings of the Second Conference on Machine Translation, WMT 2017, Copenhagen, Denmark, September 7-8, 2017, pages 108–117. Association for Computational Linguistics.
    Alkhouli, T. and Ney, H.
    (See online at https://dx.doi.org/10.18653/v1/W17-4711)
  • (2017). Hybrid neural network alignment and lexicon model in direct HMM for statistical machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, pages 125–131. Association for Computational Linguistics.
    Wang, W., Alkhouli, T., Zhu, D., and Ney, H.
    (See online at https://dx.doi.org/10.18653/v1/P17-2020)
  • (2018). On the alignment problem in multi-head attention-based neural machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, WMT 2018, Brussels, Belgium, October 31 - November 1, 2018, pages 177–185. Association for Computational Linguistics.
    Alkhouli, T., Bretschner, G., and Ney, H.
    (See online at https://dx.doi.org/10.18653/v1/W18-6318)
  • (2018). Towards two-dimensional sequence to sequence model in neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pages 3009–3015.
    Bahar, P., Brix, C., and Ney, H.
    (See online at https://dx.doi.org/10.18653/v1/D18-1335)
  • (2019). A comparative study on end-to-end speech to text translation. In IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019, Singapore, December 14-18, 2019, pages 792–799. IEEE.
    Bahar, P., Bieschke, T., and Ney, H.
    (See online at https://doi.org/10.1109/ASRU46091.2019.9003774)
  • (2019). On using SpecAugment for end-to-end speech translation. In International Workshop on Spoken Language Translation, Hong Kong, China.
    Bahar, P., Zeyer, A., Schlüter, R., and Ney, H.
 
 
