Artificial intelligence develops techniques and systems whose performance must be evaluated on a regular basis in order to certify and foster progress in the discipline. We will describe and critically assess the different ways AI systems are evaluated. We first focus on the traditional task-oriented evaluation approach. We see that black-box (behavioural) evaluation is becoming more and more common, as AI systems are becoming more complex and unpredictable. We identify three kinds of evaluation: human discrimination, problem benchmarks and peer confrontation. We describe the limitations of the many evaluation settings and competitions in these three categories and propose several ideas for a more systematic and robust evaluation. We then f...
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Enginee...
This is the author’s version of a work that was accepted for publication in Artificial Intelligence....
The evaluation of an AGI system can take many forms. There is a long tradition in Artificial Intelli...
The final publication is available at Springer via http://dx.doi.org/10.1007/s10462-016-9505-7. The...
We report on a series of new platforms and events dealing with AI evaluation that may change the way...
Today, available methods that assess AI systems are focused on using empirical techniques to measure...
Artificial General Intelligence seeks to create an artificial system capable of solving many differe...
Artificial intelligence (AI) is having a deep impact on the way humans work, communicate an...
Among the difficulties in evaluating AI-type medical diagnosis systems are: the intermediate conclus...
The starting point for our discussion is some shortcoming of the Turing test of Artificial Intellige...
Computer games are becoming more popular for both entertainment and educational applications. The g...
Artificial Intelligence (AI) is a field of computer science that primarily focuses on automating tas...
This article presents the notions of two concepts: human and artificial intelligence. The...