Abstract—The paper presents an extended version of the SuperMatrix system, a general tool supporting automatic acquisition of lexical semantic relations from corpora. The extensions focus mainly on parallel processing of massive amounts of data. The construction of the system is discussed. Three distributed parts of the system are presented: distributed construction of co-incidence matrices from corpora, computation of the similarity matrix, and parallel solving of synonymy tests. An evaluation of the proposed approach to parallel processing is shown. Parallelization of similarity matrix computation demonstrates almost linear speedup. The smallest improvements were achieved for construction of matrices, as this process is mostly bound by readi...
We implement the algorithm of (Rychly and Kilgarriff, 2007) for computing distributional similarity...
"Semantic Atlas" is a mathematical and statistical model to visualise word senses ...
The amount of information available through the Internet has shown significant growth in t...
Research into corpus-based semantics has focused on the development of ad hoc models that treat sing...
In distributional semantics, the unsupervised learning approach has been widely used for a large num...
We investigate the creation of corpora from web-harvested data following a scalable approach that ha...
Multi-core machines open new doors to achieving parallelism on a single machine. This new architectu...
Parallel Coordinates as a complementary tool for exploring word similarity matrices. Visualising mult...
This paper proposes a method of finding correspondences of arbitrary length word sequences in aligne...
We present our semantic textual similarity approach to filtering a noisy web-crawled parallel corpus...
This paper demonstrates how token-level word space models (a distributional semantic technique that ...
This thesis presents the patterns and methods uncovered in the development of a new scalable corpus ...
This paper investigates the role of parallel corpora as a linguistic resource. The appli...
This paper presents a very simple and effective approach to using parallel corpora for automatic ...