There are many general-purpose benchmark datasets for Semantic Textual Similarity, but none of them focuses on technical concepts found in patents and scientific publications. This work aims to fill this gap by presenting a new human-rated contextual phrase-to-phrase matching dataset. The entire dataset contains close to $50,000$ rated phrase pairs, each with a CPC (Cooperative Patent Classification) class as a context. This paper describes the dataset and some baseline models.
Comment: Presented at the SIGIR PatentSemTech 2022 Workshop. The dataset can be accessed at https://www.kaggle.com/datasets/google/google-patent-phrase-similarity-datase
Abstract. Relevance Feedback methods generally suffer from topic drift caused by words ambiguity and...
This research presents a new benchmark dataset for evaluating Short Text Semantic Similarity (STSS) ...
This paper presents an automatic approach to creating taxonomies of technical terms based on the Coo...
This research project aims to develop a Transformer-based multi-label classifier for the classificat...
Abstract: Automatic annotation of key phrases for their semantic categories can help improving effec...
Pairwise semantic similarity measures for US utility patents. Includes measures for citing/cited pat...
We propose using text matching to measure the technological similarity between patents. Technology e...
The intellectual property economy, and, more narrowly, the patent economy, form an incredibly wide-r...
Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering an...
This report summarizes the research, methodologies and experimental implementations on context-based...
Abstract: Patent search is a complex task and involves a great level of expertise. Through this resear...
For mining intellectual property texts (patents), a broad-coverage lexicon that covers general Engli...
Patent Landscaping, one of the central tasks of intellectual property management, includes selecting...
With the advent of the knowledge economy, firms often compete for intellectual property rights. Bein...