Most previous work on text normalization for informal text makes the strong assumption that the system already knows which tokens are non-standard words (NSWs) and thus need normalization. However, this assumption is not realistic. In this paper, we propose a method for NSW detection. In addition to dictionary-based information, e.g., whether a word is out-of-vocabulary (OOV), we leverage novel information derived from the normalization results for OOV words to help make decisions. Second, this paper investigates two methods that use NSW detection results for named entity recognition (NER) in social media data. One adopts a pipeline strategy, and the other uses joint decoding. We also create a new data set with newly added normal...
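The idea of combining a dictionary check with evidence from normalization can be illustrated with a minimal sketch. This is not the paper's system: the toy vocabulary, the toy normalization lexicon, and the names `nsw_features` and `detect_nsw` are all assumptions made for illustration, and a simple rule stands in for whatever classifier the authors actually train.

```python
from difflib import SequenceMatcher

# Hypothetical toy resources; a real system would use a large dictionary
# and a trained normalization model instead.
VOCAB = {"see", "you", "tomorrow", "at", "the", "game", "tonight"}
NORMALIZATION_LEXICON = {"u": "you", "tmrw": "tomorrow", "2nite": "tonight"}


def nsw_features(token, vocab=VOCAB, lexicon=NORMALIZATION_LEXICON):
    """Return dictionary-based and normalization-based features for one token."""
    word = token.lower()
    is_oov = word not in vocab
    # Only OOV tokens are sent to the (toy) normalizer.
    candidate = lexicon.get(word) if is_oov else None
    similarity = (
        SequenceMatcher(None, word, candidate).ratio() if candidate else 0.0
    )
    return {
        "is_oov": is_oov,
        "has_candidate": candidate is not None,
        "similarity": similarity,
    }


def detect_nsw(tokens):
    """Flag a token as an NSW if it is OOV and the normalizer proposes a candidate."""
    flagged = []
    for token in tokens:
        features = nsw_features(token)
        if features["is_oov"] and features["has_candidate"]:
            flagged.append(token)
    return flagged


tokens = "see u tmrw at the Obama game".split()
nsw_tokens = detect_nsw(tokens)
```

In this sketch, "Obama" is OOV but has no normalization candidate, so it is not flagged. That distinction between OOV tokens that need normalization and OOV tokens that are likely names is the kind of signal a downstream NER step could use.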
The use of short text has become widespread in social media like Twitter and Facebook. Typically, us...
Named Entity Recognition (NER) is a traditional Natural Language Processing (N...
Social media data such as Twitter messages ("tweets") pose a particular challenge to NLP systems bec...
Named entity recognition (NER) systems trained on newswire perform very badly when tested on Twitter...
In recent years, social media outlets such as Twitter and Facebook have drawn attention from compani...
Social media texts are significant information sources for several application areas including trend...
Social media texts are significant information sources for several application areas including tren...
The data on Social Network Services (SNSs) has recently become an interesting source for researchers...
Named entity recognition (NER) is one of the well-studied sub-branches of natural language processing ...
Named Entity Recognition (NER) is an important subtask of information extraction that seeks to locat...
Various recent studies show that the performance of named entity recognition (NER) systems developed...
Named Entity Recognition (NER) is an important subtask of information extraction that seeks to locate...
Applying natural language processing for mining and intelligent information access to tweets (a form...
Named Entity Recognition (NER) is a well-studied domain in Natural Language Processing. Traditional ...