In this post I give an overview of the theory, tools, frameworks and best practices I have found so far for testing (and debugging) machine learning applications. I start with what sets the testing of machine learning applications apart.
LINK
Implementation of reliable methodologies allowing Reduction, Refinement, and Replacement (3Rs) of animal testing is a process that takes several decades and is still not complete. Reliable methods are essential for regulatory hazard assessment of chemicals, where differences in test protocol can influence the test outcomes and thus affect the confidence in the predictive value of the organisms used as an alternative for mammals. Although test guidelines are common for mammalian studies, they are scarce for non-vertebrate organisms that would allow for the 3Rs of animal testing. Here, we present a set of 30 reporting criteria as the basis for such a guideline for Developmental and Reproductive Toxicology (DART) testing in the nematode Caenorhabditis elegans. Small organisms like C. elegans are increasingly used in new approach methodologies for hazard assessment; thus, reliable and robust test protocols are urgently needed. A literature assessment of the fulfilment of the reporting criteria demonstrates that although studies describe methodological details, essential information such as compound purity, lot/batch number, or type of container is often not reported. The formulated set of reporting criteria for C. elegans testing can be used by (i) researchers to describe essential experimental details, (ii) data scientists who aggregate information to assess data quality and include data in aggregated databases, and (iii) regulators to assess study data for inclusion in regulatory hazard assessment of chemicals.
DOCUMENT
KEY MESSAGE:
• Statistical significance testing alone is not an adequate way to evaluate whether there is indeed a clinically relevant effect
• Effect sizes should be added to significance testing
• Effect sizes facilitate the decision whether a clinically relevant effect has been found, help determine the sample size for future studies, and facilitate comparison between scientific studies
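The difference between statistical significance and effect size can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than something taken from the text above: the data are synthetic, the effect size measure is Cohen's d with a pooled standard deviation (the key message does not name a specific measure), and the p-value comes from a large-sample normal approximation instead of an exact t-test.

```python
import math
import statistics
from statistics import NormalDist


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def approx_two_sided_p(group_a, group_b):
    """Two-sided p-value from a large-sample normal approximation (z-test)."""
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    z = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: two large groups whose means differ by only 0.1 units.
control = [-1.0, 0.0, 1.0] * 2000
treatment = [x + 0.1 for x in control]

p = approx_two_sided_p(treatment, control)
d = cohens_d(treatment, control)

print(f"approximate p-value: {p:.2e}")
print(f"Cohen's d: {d:.3f}")
```

With 6000 observations per group, the tiny shift yields a very small p-value, yet Cohen's d stays below 0.2, the value conventionally labelled a "small" effect. Reporting both numbers makes it visible that a statistically significant result need not be a clinically relevant one.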
DOCUMENT
In the last decade, the automotive industry has seen significant advancements in technology (Advanced Driver Assistance Systems (ADAS) and autonomous vehicles) that present the opportunity to improve traffic safety, efficiency, and comfort. However, drivers' lack of knowledge (such as risks, benefits, capabilities, limitations, and components) and confusion (i.e., multiple systems that have similar but not identical functions with different names) concerning the vehicle technology still prevail, limiting the safety potential. The usual sources (such as the owner's manual, instructions from a sales representative, online forums, and post-purchase training) do not provide adequate and sustainable knowledge to drivers concerning ADAS. Additionally, existing driving training and examinations focus mainly on unassisted driving and have been practically unchanged for 30 years. Where and how, then, should drivers obtain the necessary skills and knowledge for using ADAS safely and effectively? The proposed KIEM project AMIGO aims to create a training framework for learner drivers by combining classroom, online/virtual, and on-the-road training modules for imparting adequate knowledge and skills (such as risk assessment, handling in safety-critical and take-over transitions, and self-evaluation). AMIGO will also develop an assessment procedure to evaluate the impact of ADAS training on drivers' skills and knowledge by defining key performance indicators (KPIs) using in-vehicle data, eye-tracking data, and subjective measures. For practical reasons, AMIGO will focus on either lane-keeping assistance (LKA) or adaptive cruise control (ACC) for framework development and testing, depending on system availability.
The insights obtained from this project will serve as a foundation for a subsequent research project, which will expand the AMIGO framework to other ADAS (e.g., systems mandatory in new cars from 2020 onwards) and to specific driver target groups, such as elderly and novice drivers.