We report the first steps of a novel investigation into how a grammar induction algorithm can be modified and used to identify salient information structures in a corpus. These information structures are intended to serve as representations of semantic content for text mining. We modify the learning regime of the ADIOS algorithm (Solan et al., 2005) so that text is presented as increasingly large snippets around key terms, and instances of selected structures are replaced with common identifiers in the input for subsequent iterations. The technique is applied to 1.4 million blog posts about climate change, which mention diverse topics and reflect multiple points of view. Observation of the resulting information structu...
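The modified learning regime described above can be illustrated with a minimal Python sketch. It is not the ADIOS algorithm and not the authors' implementation: it only shows the two input-preparation ideas named in the abstract, namely growing snippets around a key term and replacing instances of a selected structure with a common identifier. The function names, the token-window definition of a "snippet", and the `select_structure` callback (a stand-in for the induction step) are all assumptions made for illustration.

```python
from typing import Callable, Optional


def snippets(tokens: list[str], key: str, window: int) -> list[list[str]]:
    """Return the token windows of +/- `window` tokens around each `key`."""
    return [
        tokens[max(0, i - window): i + window + 1]
        for i, tok in enumerate(tokens)
        if tok == key
    ]


def substitute(tokens: list[str], pattern: list[str], identifier: str) -> list[str]:
    """Replace each contiguous occurrence of `pattern` with `identifier`."""
    if not pattern:
        return list(tokens)
    out, i, n = [], 0, len(pattern)
    while i < len(tokens):
        if tokens[i:i + n] == pattern:
            out.append(identifier)
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical type for the induction step: it receives the current batch of
# snippets and returns (pattern, identifier) or None if nothing is selected.
Selector = Callable[[list[list[str]]], Optional[tuple[list[str], str]]]


def rounds(tokens: list[str], key: str, windows: list[int],
           select_structure: Selector) -> list[tuple[int, list[list[str]]]]:
    """Present increasingly large snippets, substituting after each round."""
    history = []
    for w in windows:
        batch = snippets(tokens, key, w)
        history.append((w, batch))
        found = select_structure(batch)  # assumed stand-in for induction
        if found is not None:
            pattern, identifier = found
            tokens = substitute(tokens, pattern, identifier)
    return history
```

In this sketch the substituted identifier (for example `_S1`) simply becomes another token in the input, so later, larger snippets see it in place of the original word sequence.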
The process whereby inferences are made from textual data is broadly referred to as text mining. In ...
A massive amount of online information is natural language text: newspapers, blog articles, forum po...
This work extends a semi-automatic grammar induction approach previously proposed in [1]. We investi...
Text Mining tackles the task of searching for useful knowledge (patterns) in a natural...
We report ongoing work that aims to develop a data-driven approach to text analysis for computa...
Given the huge quantity of the current available textual information, text min...
We explore the interplay between grammar induction and topic modeling approaches to unsupervised tex...
Text Mining is about the task of searching for useful knowledge in a natural langu...
In this position paper we introduce our ideas for utilizing a text corpus for supplementing or replaci...
Text mining concerns looking for patterns in unstructured text. The related task of...
The possibilities for data mining from large text collections are virtually untapped. Text expresses...
Many data mining techniques have been proposed for mining useful patterns in text documents. However...
One goal of computational linguistics is to discover a method for assigning a rich structural annota...
The field of information extraction (IE) is concerned with applying natural language processing (NLP...