Segmentation and labelling are among the most common operations in language processing [7]. Chunking is a popular representative of a segmentation process, which aims to group tagged tokens into meaningful structures. This paper compares two chunking approaches: one based on regular-expression rules developed by a human, and a machine-based approach built on an n-gram statistical tagger. Experimental results show that the performance of the machine-based chunker is very similar to that of the regular-expression chunker. Another interesting finding is that it was considerably harder to define regular expressions that capture noun phrases than verb phrases. This difficulty was evidently caused by the ...
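The regular-expression approach compared above can be sketched in a few lines: encode the POS-tag sequence as a string and let a regex pick out chunk spans. The rule below (`DT? JJ* NN+` for a noun phrase) is purely illustrative and is not the paper's actual grammar.

```python
import re

def np_chunks(tagged):
    """Return (start, end) token spans matching an illustrative NP rule: DT? JJ* NN+."""
    # One letter per tag keeps regex match offsets aligned with token indices.
    letter = {"DT": "D", "JJ": "J", "NN": "N"}
    s = "".join(letter.get(tag, "O") for _, tag in tagged)
    return [(m.start(), m.end()) for m in re.finditer(r"D?J*N+", s)]

tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ")]
print(np_chunks(tagged))  # -> [(0, 3)], i.e. "the quick fox"
```

Hand-writing such patterns for verb phrases tends to be easier than for noun phrases, which is the asymmetry the abstract reports.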
In this paper we present an integrated system for tagging and chunking texts from a certain language...
Automatic part of speech tagging is an area of natural language processing where statistical techni...
This thesis investigates different statistical methods for the automatic extraction of lexical chunk...
doi:10.4156/jcit.vol5.issue10.2 This paper presents a rule-based chunking approach. Rule-based meth...
This paper presents and evaluates a novel and flexible chunking method using Constraint Grammar (CG)...
Chunking means splitting the sentences into tokens and then grouping them in a meaningful way. When ...
International audienceIn this paper, we try three distinct approaches to chunk transcribed oral data...
We describe a stochastic approach to partial parsing, i.e., the recognition of syntactic structure...
Research based on treebanks is ongoing for many natural language applications. However, the work inv...
Statistical n-gram taggers like that of [Church 1988] or [Foster 1991] assign a part-of-speech label...
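A toy sketch of the statistical tagging idea mentioned above: pick, for each word, the tag it received most often in training data, falling back to the overall most frequent tag for unknown words. Real n-gram taggers such as those cited also condition on the preceding tags; this shows only the unigram backbone, with made-up training data for illustration.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Learn each word's most frequent tag and a global fallback tag."""
    word_tags = defaultdict(Counter)
    all_tags = Counter()
    for sent in tagged_sentences:
        for word, tag in sent:
            word_tags[word][tag] += 1
            all_tags[tag] += 1
    default = all_tags.most_common(1)[0][0]
    return {w: c.most_common(1)[0][0] for w, c in word_tags.items()}, default

def tag(words, model, default):
    """Assign each word its learned tag, or the fallback if unseen."""
    return [(w, model.get(w, default)) for w in words]

train_data = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
              [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")]]
model, default = train(train_data)
print(tag(["the", "dog", "sleeps"], model, default))
# -> [('the', 'DT'), ('dog', 'NN'), ('sleeps', 'VBZ')]
```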
Tokenization and segmentation are steps performed in the earlier stages of most text analysis. It is...
International audienceWe show in this paper that a strong correlation exists between the performance...