In this thesis, the autonomization of extracting information from PDFs of Swedish film scripts through various machine learning techniques and named entity recognition (NER) is explored. Furthermore, it is explored whether the labeled data needed for the NER tasks can be reduced to some degree with the goal of saving time. The autonomization process is split into two subsystems, one for extracting larger chunks of text and one for extracting relevant information through named entities from some of the larger text chunks using NER. The methods explored for accelerating the labeling time for NER are active learning and self learning. For active learning, three methods are explored: Logprob and Word Entropy as uncertainty-based active learning methods, and ac...
Machine learning is one of many buzzwords in today's tech world. Huge company resources are allocate...
Named entity recognition (NER) constitutes an important step in the processing of unstructured text ...
Getting correctly labelled data is an important preliminary stage for many supervised machine learnin...
Manually handling and classifying large amounts of text documents takes a lot of time and requires a lot of pe...
The recent advancements in Natural Language Processing have cleared the path for many new applicatio...
Named entity recognition (NER) is the process of sequence labeling unstructured data to solve high a...
Document image processing and handwritten text recognition have been applied to a variety of materia...
The existence of Natural Language Processing (NLP) provides numerous benefits, including the understa...
Manually selecting one or more sentences from a film review to use as a quote can be a time...
Automatic movie analysis is the task of applying machine learning methods to the field of screenpl...
Machine learning is a field within Computer Science that is still growing. Finding innovative ways t...
Labeled data, which is a collection of data samples that have been tagged with one or more labels, p...
Named entity recognition (NER) is a task that concerns detecting and categorising certain informatio...
Data Ductus is a Swedish IT consulting company whose customer base ranges from small startups to l...