Sentence-State LSTMs For Sequence-to-Sequence Learning

Abstract

The Transformer is currently the dominant approach to sequence-to-sequence problems. In contrast, RNNs have become less popular due to their lack of parallelization and relatively lower performance. In this paper, we propose to use a parallelizable variant of bi-directional LSTMs (BiLSTMs), namely the sentence-state LSTM (S-LSTM), as an encoder for sequence-to-sequence tasks. The complexity of the S-LSTM is only $\mathcal{O}(n)$, compared with the $\mathcal{O}(n^2)$ of the Transformer. On four neural machine translation benchmarks, we empirically find that the S-LSTM achieves significantly better performance than BiLSTMs and convolutional neural networks (CNNs). Compared with the Transformer, our model gives competitive performance while being 1.6 times faster during inference.
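To give intuition for the $\mathcal{O}(n)$ claim, the sketch below is a minimal, simplified view of an S-LSTM-style encoder step, not the paper's exact formulation: all class and parameter names (`SLSTMSketch`, `steps`, `word_update`, etc.) are hypothetical, and the full set of S-LSTM gates is reduced to a single update gate per state. The key point it illustrates is that every word state is refreshed in parallel from a fixed local window plus a global sentence state, so each recurrent step costs $\mathcal{O}(n)$ in sequence length rather than the $\mathcal{O}(n^2)$ of full self-attention.

```python
# Simplified S-LSTM-style encoder step (hedged sketch, assuming PyTorch).
import torch
import torch.nn as nn

class SLSTMSketch(nn.Module):
    def __init__(self, d_model: int, steps: int = 4):
        super().__init__()
        self.steps = steps
        # word-state update reads [h_{i-1}, h_i, h_{i+1}, x_i, g]
        self.word_update = nn.Linear(5 * d_model, d_model)
        self.word_gate = nn.Linear(5 * d_model, d_model)
        # sentence-state update reads [g, mean(h)]
        self.sent_update = nn.Linear(2 * d_model, d_model)
        self.sent_gate = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model) token embeddings
        h = x
        g = h.mean(dim=1)                      # initial global sentence state
        for _ in range(self.steps):            # fixed number of recurrent steps
            left = torch.roll(h, 1, dims=1)    # h_{i-1} (boundary handling simplified)
            right = torch.roll(h, -1, dims=1)  # h_{i+1}
            g_exp = g.unsqueeze(1).expand_as(h)
            ctx = torch.cat([left, h, right, x, g_exp], dim=-1)
            cand = torch.tanh(self.word_update(ctx))
            gate = torch.sigmoid(self.word_gate(ctx))
            h = gate * cand + (1 - gate) * h   # all n positions updated in parallel: O(n)
            pooled = torch.cat([g, h.mean(dim=1)], dim=-1)
            g_cand = torch.tanh(self.sent_update(pooled))
            g_gate = torch.sigmoid(self.sent_gate(pooled))
            g = g_gate * g_cand + (1 - g_gate) * g
        return h                               # contextualized states for the decoder

# usage: enc = SLSTMSketch(256); out = enc(torch.randn(2, 10, 256))
```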

Publication
Natural Language Processing and Chinese Computing - 10th CCF International Conference (NLPCC, CCF-C), Qingdao, China, October 13-17, 2021, Proceedings, Part I
Xuefeng Bai
Ph.D. candidate

My research interests include semantics, dialogues and generation.