Social media language contains a huge amount and wide variety of nonstandard tokens, created both intentionally and unintentionally by users. It is crucial to normalize these noisy nonstandard tokens before applying other NLP techniques. A major challenge facing this task is system coverage, i.e., for any user-created nonstandard term, the system should be able to restore the correct word within its top n output candidates. In this paper, we propose a cognitively-driven normalization system that integrates different human perspectives in normalizing the nonstandard tokens, including enhanced letter transformation, visual priming, and string/phonetic similarity. The system was evaluated on both word- and message-le...
This work explores normalization for parser adaptation. Traditionally, normalization is used as sepa...
This is an accepted manuscript of an article published by IEEE in 2018 3rd International Conference ...
The automatic analysis (parsing) of natural language is an important ingredient for many natural lan...
Master. Natural Language Processing (NLP) on data from social network services (SNSs) became more diffic...
Text normalization is an indispensable stage for natural language processing of social media data wi...
© 2014 Dr. Bo Han. Social media has been an attractive target for many natural language processing (NL...
is one of the most important data sources in social data analysis. However, the text contained on Tw...
As social media constitute a valuable source for data analysis for a wide range of applications, the...
Tweets often contain a large proportion of abbreviations, alternative spellings, novel words and oth...
In this work, we adapt the traditional framework for spelling correction to the more novel task of n...
The ever-growing usage of social media platforms generates vast amounts of textual data daily, which ...
The informal nature of social media text renders it very difficult to be automatically processed by...
One of the main characteristics of social media data is the use of non-standard language. Since NLP ...
Existing natural language processing systems have often been designed with standard texts in mind. H...
The language used in social media is often characterized by the abundance of informal and non-standa...