With appropriate pre-training on unstructured text, larger and more accurate neural network models can be trained. Unfortunately, unstructured pre-training data may contain undesired societal biases, which a model may mimic and amplify. This thesis focuses both on improving unsupervised pre-training and on developing diagnostics that probe the resulting pre-trained models for undesired behaviour. Pre-training and diagnostics are carried out on two tasks: coreference resolution and knowledge base completion. For each task, a novel task-specific method for unsupervised pre-training is introduced. The resulting models are then analysed for undesired behaviour by evaluating them on relevant diagnostic datasets, focusing on gender bias.
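To make the diagnostic step concrete, the sketch below shows one common way such an evaluation can be set up for coreference: probe a resolver with WinoBias-style sentence pairs that differ only in the pronoun's gender and measure how often its prediction changes. The templates, the `resolver` interface, and the toy baseline are illustrative assumptions, not the thesis's actual method or data.

```python
# A minimal, hypothetical sketch of a gender-bias probe for coreference.
# `resolver` is any callable mapping (sentence, pronoun) -> chosen antecedent;
# everything here is illustrative, not the thesis's actual procedure.
from typing import Callable

# Each template fixes the correct antecedent and varies only the pronoun.
TEMPLATES = [
    ("The developer argued with the designer because {pron} did not like the design.",
     "developer"),
    ("The nurse examined the farmer because {pron} was worried.",
     "nurse"),
]
PRONOUNS = ["he", "she"]

def gender_sensitivity(resolver: Callable[[str, str], str]) -> float:
    """Fraction of templates whose predicted antecedent flips when only
    the pronoun's gender changes (0.0 means gender-invariant predictions)."""
    flips = 0
    for template, _gold in TEMPLATES:
        answers = {resolver(template.format(pron=p), p) for p in PRONOUNS}
        flips += len(answers) > 1  # prediction changed with pronoun gender
    return flips / len(TEMPLATES)

def first_mention_resolver(sentence: str, pronoun: str) -> str:
    # Toy baseline: always link the pronoun to the first occupation word,
    # which in these templates is the second token of the sentence.
    return sentence.split()[1]

if __name__ == "__main__":
    print(f"gender-sensitivity rate: {gender_sensitivity(first_mention_resolver):.2f}")
```

A real evaluation would swap the toy baseline for the pre-trained model under study and use an established benchmark such as WinoBias rather than hand-written templates.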
Pre-trained models learn informative representations on large-scale training data through a self-sup...
The human brain has the ability to carry out new tasks with limited experience. It utilizes prior le...
Knowledge Distillation (KD) is a prominent neural model compression technique which heavily relies o...
Thesis (Ph.D.)--University of Washington, 2020. Modern machine learning algorithms have been able to a...
Machine learning models are built using training data, which is collected from human experience and ...
Deep neural networks that dominate NLP rely on an immense number of parameters and require large tex...
Recent research suggests that predictions made by machine-learning models can amplify biases present...
Generally, the present disclosure is directed to training machine learning models, e.g., deep learni...
Machine Learning is a branch of artificial intelligence focused on building applications that learn ...
In recent years, the Machine Learning and Deep Learning communities have devoted considerable effort to studying ...
Pretrained Language Models (PLMs), though popular, have been diagnosed to encode bias against protec...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
In this project, we explore the emerging field of prompt engineering and apply it to t...
Language models (LMs) are pretrained on diverse data sources, including news, discussion forums, boo...
Large neural network-based language models play an increasingly important role in contemporary AI. A...