This paper presents an automatic approach to creating taxonomies of technical terms based on the Cooperative Patent Classification (CPC). The resulting taxonomy contains about 170k nodes in 9 separate technological branches and is freely available. We also show that a Text-to-Text Transfer Transformer (T5) model can be fine-tuned to generate hypernyms and hyponyms with relatively high precision, confirming the manually assessed quality of the resource. The T5 model opens the taxonomy to any new technological terms for which a hypernym can be generated, thus making the resource updateable with new terms, an essential feature for the constantly evolving field of technological terminology.
Comment: ToTh 2022 - Terminology & Ontology: Theories a...
Gurulingappa H, Müller B, Klinger R, et al. Patent Retrieval in Chemistry based on semantically tagg...
In the patent domain, Boolean retrieval is particularly common. But despite the importance of Boolean...
The international patent corpus is a gigantic source that today contains about 80 million documents. Eve...
This paper presents an automatic approach to creating taxonomies of technical ...
Due to the large amount of available patent data, it is no longer feasible for industry actors to ma...
For mining intellectual property texts (patents), a broad-coverage lexicon that covers general Engli...
Although the World Wide Web has of late become an important source to consult for the meaning of wor...
In the patent domain, significant efforts are invested to assist researchers in formulating better qu...
The understanding of technological innovation's patterns is crucial for both t...
In this paper, we extend some usual techniques of classification resulting from a large-scale data-m...
Retrieving research papers and patents is important for any researcher assessing the scope of a fiel...
There are many general-purpose benchmark datasets for Semantic Textual Similarity but none of them a...
NLP methods for automatic information access to rich technological knowledge sources like patents ar...
Automatic annotation of key phrases for their semantic categories can help improve effec...
Patents are one of the most reliable sources of technology intelligence, and the true value of paten...