Non-autoregressive generation is a sequence generation paradigm that removes the dependency between target tokens. It can substantially reduce text generation latency by replacing token-by-token sequential decoding with parallel decoding. However, due to the well-known multi-modality problem, non-autoregressive (NAR) models significantly under-perform autoregressive (AR) models on various language generation tasks. Among NAR models, BANG is the first large-scale model pre-trained on an unlabeled English raw-text corpus. It treats different generation paradigms as pre-training tasks, covering autoregressive (AR), non-autoregressive (NAR), and semi-non-autoregressive (semi-NAR) information flows with a multi-stream strategy. It achie...
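To make the AR/NAR contrast above concrete, here is a minimal sketch of the two decoding loops. It assumes a hypothetical model(src, tgt) returning per-position token logits of shape (batch, tgt_len, vocab); the interface and all names are illustrative assumptions, not BANG's actual API.

import torch

def ar_decode(model, src, bos_id, eos_id, max_len=64):
    # Autoregressive: one forward pass per generated token,
    # each step conditioned on all previously emitted tokens.
    tgt = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    for _ in range(max_len):
        logits = model(src, tgt)                       # (B, t, V)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        tgt = torch.cat([tgt, next_tok], dim=1)
        if (next_tok == eos_id).all():
            break
    return tgt

def nar_decode(model, src, mask_id, tgt_len=64):
    # Non-autoregressive: all target positions are predicted in a
    # single forward pass from a fully masked target, with no
    # dependency between emitted tokens.
    tgt = torch.full((src.size(0), tgt_len), mask_id, dtype=torch.long)
    logits = model(src, tgt)                           # (B, tgt_len, V)
    return logits.argmax(-1)                           # all positions at once

The single forward pass is the source of the latency gain; the multi-modality problem arises because each position's argmax is taken independently, so tokens from different, individually plausible outputs can be mixed within one hypothesis.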
Hybrid tabular-textual question answering (QA) requires reasoning from heterogeneous information, an...
In the speech research community, a very challenging topic of great interest is the sequ...
Benefiting from the sequence-level knowledge distillation, the Non-Autoregressive Transformer (NAT) ...
Non-autoregressive generation (NAG) has recently attracted great attention due to its fast inference...
Recent Transformer-based Large Language Models have made great strides in natural langua...
The computational benefits of iterative non-autoregressive transformers decrease as the number of de...
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to ...
The advances in deep learning have led to great achievements in many Natural Language Processing (NL...
Non-autoregressive approaches aim to improve the inference speed of translation models, particularly...
Deep generative models of text have shown great success on a wide range of conditional and unconditi...
In recent years, a number of methods for improving the decoding speed of neural machine translation ...
Current autoregressive generative language models in the deep learning literature have achieved impr...
Efficient machine translation models are commercially important as they can increase inference s...
Transformer-based autoregressive (AR) methods have achieved appealing performance for varied sequenc...
Non-autoregressive text-to-speech (TTS) has recently received a lot of attention due to its reliabi...