Forming a reliable judgement of a machine learning (ML) model's appropriateness for an application ecosystem is critical for its responsible use, and requires considering a broad range of factors including harms, benefits, and responsibilities. In practice, however, evaluations of ML models frequently focus on only a narrow range of decontextualized predictive behaviours. We examine the evaluation gaps between the idealized breadth of evaluation concerns and the observed narrow focus of actual evaluations. Through an empirical study of papers from recent high-profile conferences in the Computer Vision and Natural Language Processing communities, we demonstrate a general focus on a handful of evaluation methods. By considering the metrics an...
Developing functional machine learning (ML)-based models to address unmet clinical needs requires un...
How and when can we depend on machine learning systems to make decisions for human-being? This is pr...
Reliable and robust evaluation methods are a necessary first step towards developing machine learnin...
The evaluation of classifiers or learning algorithms is not a topic that has, generally, been given ...
International audienceThis chapter describes how to validate a machine learning model. We start by d...
This paper gives an overview of some ways in which our understanding of performance evaluation measu...
Developing state-of-the-art approaches for specific tasks is a major driving force in our research c...
How can one meaningfully make a measurement, if the meter does not conform to any standard and its s...
A rapidly expanding universe of technology-focused startups is trying to change and improve the way ...
In this paper, we argue that the way we have been training and evaluating ML models has largely forg...
Thesis (Ph.D.)--University of Washington, 2018Despite many successes, complex machine learning syste...
Prompted by its performance on a variety of benchmark tasks, machine learning (ML) is now being appl...
Purpose: Despite the potential of machine learning models, the lack of generalizability has hindered...
Machine Learning has become more and more prominent in our daily lives as the Information Age and Fo...
Machine Learning (ML) models now inform a wide range of human decisions, but using ``black box'' mod...
Developing functional machine learning (ML)-based models to address unmet clinical needs requires un...
How and when can we depend on machine learning systems to make decisions for human-being? This is pr...
Reliable and robust evaluation methods are a necessary first step towards developing machine learnin...
The evaluation of classifiers or learning algorithms is not a topic that has, generally, been given ...
International audienceThis chapter describes how to validate a machine learning model. We start by d...
This paper gives an overview of some ways in which our understanding of performance evaluation measu...
Developing state-of-the-art approaches for specific tasks is a major driving force in our research c...
How can one meaningfully make a measurement, if the meter does not conform to any standard and its s...
A rapidly expanding universe of technology-focused startups is trying to change and improve the way ...
In this paper, we argue that the way we have been training and evaluating ML models has largely forg...
Thesis (Ph.D.)--University of Washington, 2018Despite many successes, complex machine learning syste...
Prompted by its performance on a variety of benchmark tasks, machine learning (ML) is now being appl...
Purpose: Despite the potential of machine learning models, the lack of generalizability has hindered...
Machine Learning has become more and more prominent in our daily lives as the Information Age and Fo...
Machine Learning (ML) models now inform a wide range of human decisions, but using ``black box'' mod...
Developing functional machine learning (ML)-based models to address unmet clinical needs requires un...
How and when can we depend on machine learning systems to make decisions for human-being? This is pr...
Reliable and robust evaluation methods are a necessary first step towards developing machine learnin...