In this work we address the task of multilingual text similarity using vector representations of words. We experimented with several text collections containing sentence pairs in Spanish and English, adapting two techniques based on word embeddings that have proven effective for monolingual text similarity: vector aggregation and alignment. Aggregation builds a vector representation of a text from the vectors of the words that compose it, and the alignment algorithm uses the word embeddings to decide how to pair up the words of the two texts being compared. Two different strategies were used in the process: using machine translation systems in order to ...
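The two techniques named above can be illustrated with a minimal sketch. This is not the authors' implementation. The toy vectors in `EMB`, the function names, the greedy one-to-one pairing rule, and the normalisation of the alignment score are all assumptions made for illustration, and the sketch further assumes that the Spanish and English words already share one embedding space.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Assumption: Spanish and English words live in one shared vector space.
EMB = {
    "cat": [0.9, 0.1, 0.0],
    "gato": [0.85, 0.15, 0.0],
    "sleeps": [0.1, 0.9, 0.1],
    "duerme": [0.12, 0.88, 0.1],
    "car": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def aggregate(tokens, emb=EMB):
    """Aggregation: average the vectors of the known words in a text."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def aggregation_similarity(a, b, emb=EMB):
    """Similarity of two texts as the cosine of their averaged vectors."""
    va, vb = aggregate(a, emb), aggregate(b, emb)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def greedy_align(a, b, emb=EMB):
    """Alignment (hypothetical greedy variant): repeatedly take the most
    similar pair of words that are both still unpaired."""
    candidates = sorted(
        (
            (cosine(emb[x], emb[y]), i, j)
            for i, x in enumerate(a) if x in emb
            for j, y in enumerate(b) if y in emb
        ),
        reverse=True,
    )
    used_a, used_b, pairs = set(), set(), []
    for score, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((a[i], b[j], score))
    return pairs


def alignment_similarity(a, b, emb=EMB):
    """Sum of aligned-pair similarities, normalised by total text length."""
    if not a or not b:
        return 0.0
    pairs = greedy_align(a, b, emb)
    return 2 * sum(score for _, _, score in pairs) / (len(a) + len(b))


english = ["cat", "sleeps"]
spanish = ["gato", "duerme"]
print(aggregation_similarity(english, spanish))
print([(x, y) for x, y, _ in greedy_align(english, spanish)])
print(alignment_similarity(english, spanish))
```

Averaging discards word order and treats every word equally, whereas the alignment score depends on which individual words can be paired, so the two measures can disagree on longer sentences.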
Recent advances in generating monolingual word embeddings based on word co-occurrence for universal ...
This work focuses on the task of finding latent vector representations of the words in a corpus. In ...
In this work we show how a vector representation of words based on word embeddings ...
Word embeddings are word representations in the form of vectors that make it possible to maintain certain seman...
Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages...
Cross-lingual embeddings are vector space representations where word translations tend to be co-loca...
One of the notable developments in current natural language processing is the practical efficacy of ...
Word embeddings have become a standard resource in the toolset of any Natural Language Processing p...
∗ Both authors contributed equally. Cross-language learning allows one to use training data from one ...
Cross-lingual word embeddings are becoming increasingly important in multilingual NLP. Recently, i...
This work describes a new approach for automatically generating a thesaurus of similarit...
Representation of words from the vocabulary of a language as real vectors in a high-dimensional s...