Distilling knowledge from a well-trained cumbersome network into a small one has recently become a new research topic, as lightweight neural networks with high performance are in particularly high demand in resource-restricted systems. This paper addresses the problem of distilling word embeddings for NLP tasks. We propose an encoding approach to distill task-specific knowledge from a set of high-dimensional embeddings, so that we can reduce model complexity by a large margin while retaining high accuracy, achieving a good compromise between efficiency and performance. Experiments show that distilling knowledge from cumbersome embeddings outperforms directly training neural networks with small embeddings.
Feature representation has been one of the most important factors for the success of machine learnin...
We describe a novel approach to generate high-quality lexical word embeddings from an Enhanced Neura...
Word embeddings are the interface between the world of discrete units of text processing and the con...
What is a word embedding? Suppose you have a dictionary of words. The i th word in the dictionary is...
Word embeddings have advanced the state of the art in NLP across numerous tasks. Understanding the c...
Prediction without justification has limited utility. Much of the success of neural models can be at...
Embedding matrices are key components in neural natural language processing (NLP) models that are re...
Research on word representation has always been an important area of interest in the antiquity of Na...
Pre-trained word embeddings encode general word semantics and lexical regularities of natural langua...
Introduction Word embeddings, which are distributed word representations learned by neural language ...
Combining structured information with language models is a standing problem in NLP. Building on prev...
Word embedding is a feature learning technique which aims at mapping words from a vocabulary into ve...
We propose a new neural model for word embeddings, which uses Unitary Matrices as the primary device...
Neural language models learn word representations that capture rich linguistic and conceptual inform...
Many natural language processing applications rely on word representations (also called word embeddi...