Sentence-State LSTMs For Sequence-to-Sequence Learning

Abstract

The Transformer is currently the dominant approach to sequence-to-sequence problems. In contrast, RNNs have become less popular due to their lack of parallelization and relatively lower performance. In this paper, we propose to use a parallelizable variant of bi-directional LSTMs (BiLSTMs), namely the sentence-state LSTM (S-LSTM), as an encoder for sequence-to-sequence tasks. The complexity of the S-LSTM is only $\mathcal{O}(n)$, compared with the $\mathcal{O}(n^2)$ of the Transformer. On four neural machine translation benchmarks, we empirically find that the S-LSTM achieves significantly better performance than BiLSTMs and convolutional neural networks (CNNs). Compared with the Transformer, our model gives competitive performance while being 1.6 times faster during inference.
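To give intuition for the $\mathcal{O}(n)$ claim, the sketch below is a minimal, simplified view of an S-LSTM-style encoder step, not the paper's exact formulation: all class and parameter names (`SLSTMSketch`, `steps`, `word_update`, etc.) are hypothetical, and the full set of S-LSTM gates is reduced to a single update gate per state. The key point it illustrates is that every word state is refreshed in parallel from a fixed local window plus a global sentence state, so each recurrent step costs $\mathcal{O}(n)$ in sequence length rather than the $\mathcal{O}(n^2)$ of full self-attention.

```python
# Simplified S-LSTM-style encoder step (hedged sketch, assuming PyTorch).
import torch
import torch.nn as nn

class SLSTMSketch(nn.Module):
    def __init__(self, d_model: int, steps: int = 4):
        super().__init__()
        self.steps = steps
        # word-state update reads [h_{i-1}, h_i, h_{i+1}, x_i, g]
        self.word_update = nn.Linear(5 * d_model, d_model)
        self.word_gate = nn.Linear(5 * d_model, d_model)
        # sentence-state update reads [g, mean(h)]
        self.sent_update = nn.Linear(2 * d_model, d_model)
        self.sent_gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model) token embeddings
        h = x
        g = h.mean(dim=1)                      # initial global sentence state
        for _ in range(self.steps):            # fixed number of recurrent steps
            left = torch.roll(h, 1, dims=1)    # h_{i-1} (boundary handling simplified)
            right = torch.roll(h, -1, dims=1)  # h_{i+1}
            g_exp = g.unsqueeze(1).expand_as(h)
            ctx = torch.cat([left, h, right, x, g_exp], dim=-1)
            cand = torch.tanh(self.word_update(ctx))
            gate = torch.sigmoid(self.word_gate(ctx))
            h = gate * cand + (1 - gate) * h   # all n positions updated in parallel: O(n)
            pooled = torch.cat([g, h.mean(dim=1)], dim=-1)
            g_cand = torch.tanh(self.sent_update(pooled))
            g_gate = torch.sigmoid(self.sent_gate(pooled))
            g = g_gate * g_cand + (1 - g_gate) * g
        return h                               # contextualized states for the decoder

# usage: enc = SLSTMSketch(256); out = enc(torch.randn(2, 10, 256))
```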

Publication
Natural Language Processing and Chinese Computing - 10th CCF International Conference (NLPCC, CCF-C), Qingdao, China, October 13-17, 2021, Proceedings, Part I
Xuefeng Bai
Ph.D. candidate

My research interests include semantics, dialogues and generation.