This thesis is a proof-of-concept for embedding Swedish documents using continuous vectors. These vectors can be used as input to any subsequent task and serve as an alternative to discrete bag-of-words vectors. The differences go beyond fewer dimensions, as the continuous vectors also hold contextual information. This means that documents with no shared vocabulary can be directly identified as contextually similar, which is impossible with bag-of-words vectors. The continuous vectors are the result of neural language models and algorithms that pool the model output into document-level representations. This thesis reviews the latest research on such models, starting from the Word2Vec algorithms. A wide variety of neural ...
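As a minimal sketch of the pooling step described above, the snippet below trains word vectors with gensim's Word2Vec and averages them into fixed-size document vectors. The toy Swedish corpus, the gensim hyperparameters, and the choice of mean pooling are illustrative assumptions, not the thesis's exact setup.

```python
# Minimal sketch: train word vectors and mean-pool them into document vectors.
# Corpus, hyperparameters, and pooling choice are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec

documents = [
    "domstolen avslog överklagandet".split(),
    "rätten biföll ansökan".split(),
]

# Train a small Word2Vec model (gensim >= 4.0 uses `vector_size`).
model = Word2Vec(sentences=documents, vector_size=50, window=3,
                 min_count=1, epochs=50)

def embed_document(tokens, wv):
    """Average the word vectors of all in-vocabulary tokens."""
    vecs = [wv[t] for t in tokens if t in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

doc_vecs = np.stack([embed_document(d, model.wv) for d in documents])

# Cosine similarity lets documents with no shared vocabulary still be
# compared through the contextual information in their word vectors.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(doc_vecs[0], doc_vecs[1]))
```

Mean pooling is only one possible pooling algorithm; weighted schemes such as TF-IDF-weighted averaging follow the same pattern of reducing per-word vectors to a single document-level representation.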
Recent advances in neural language models have contributed new methods for learning distributed vect...
We propose the Neural Vector Space Model (NVSM), a method that learns representations of documents i...
This thesis takes up text categorization. The first part describes several selected algorithm...
Word vectors, embeddings of words into a low-dimensional space, have been shown to be useful for a l...
When classifying texts using a linear classifier, the texts are commonly represented as feature vect...
This work highlights some important factors for consideration when developing word vector representa...
Unsupervised learning of text representations aims at converting natural language into vector represen...
The research topic studied in this dissertation is word representation learning, which aims to learn...
In this work I detail the compilation of a unique corpus of Norwegian court decisions. I utilize thi...
We propose two novel model architectures for computing continuous vector representations of words fr...
In today’s digital world, more and more emails and messages must be sent, processed and hand...
To process textual data using statistical methods like Machine Learning (ML), the data often...
This thesis explores and compares various methods for producing vector representation of unstructure...
Word vector representation is widely used in natural language processing tasks. Most word vectors ar...