Annotations in condition monitoring systems contain information regarding asset history and fault characteristics in the form of unstructured text that could, if unlocked, be used for intelligent fault diagnosis. However, processing these annotations with pre-trained natural language models such as BERT is problematic due to out-of-vocabulary (OOV) technical terms, resulting in inaccurate language embeddings. Here we investigate the effect of OOV technical terms on BERT and SentenceBERT embeddings by substituting technical terms with natural language descriptions. The embeddings were computed for each annotation in a pre-processed corpus, with and without substitution. The K-Means clustering score was calculated on sentence embeddings, and ...
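The substitution step described in the abstract can be illustrated with a minimal sketch. It assumes a small hand-written glossary that maps technical terms to natural language descriptions. The glossary entries, the example annotation, and the function name `substitute_terms` are hypothetical and are not taken from the paper's corpus or code.

```python
import re
from typing import Mapping

# Hypothetical glossary: terms and descriptions are illustrative only,
# not taken from the paper's corpus.
GLOSSARY = {
    "BPFO": "ball pass frequency of the outer ring",
    "BPFI": "ball pass frequency of the inner ring",
    "DE": "drive end",
}


def substitute_terms(annotation: str, glossary: Mapping[str, str]) -> str:
    """Replace whole-word technical terms with natural language descriptions."""
    if not glossary:
        return annotation
    # Try longer terms first so that overlapping terms resolve to the longest match.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda m: glossary[m.group(1)], annotation)


original = "BPFO detected on DE bearing"
substituted = substitute_terms(original, GLOSSARY)
print(substituted)
```

Both the original and the substituted annotation would then be passed to the embedding model, so that clustering scores can be compared with and without substitution. Matching is case-sensitive and limited to whole words, which keeps short abbreviations such as "DE" from being replaced inside ordinary words.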
Large pre-trained language models such as BERT have been the driving force behind recent improvement...
In this paper, we present BERT-POS, a simple method for encoding syntax into B...
Recent studies show that it is possible to detect technical debt automatically from source code comm...
We propose a novel approach, technical language labelling, to facilitate supervised intelligent faul...
Condition Monitoring (CM) is widely used in industry to meet sustainability, safety, and equipment e...
Trouble reporting is a substantial component in any technical product's maintenance workflow. In thi...
The often observed unavailability of large amounts of training data typically required by deep learn...
In recent years, various industries have been on the quest to derive new knowledge and information f...
Post-marketing reports of suspected adverse drug reactions are important for establishing the safety...
More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incid...
The Groningen Meaning Bank (GMB) project develops a corpus with rich syntactic and semantic annotati...
In today's manufacturing, each technical assistance operation is digitally tracked. This results in...
Recently, deep learning methods have achieved great success in understandi...
Requirements are usually “hand-written” and suffer from several problems like...