In this post I give an overview of the theory, tools, frameworks and best practices I have found so far for testing (and debugging) machine learning applications. I will start by describing what is specific to testing machine learning applications.
LINK
Background: In the Netherlands, general practitioners (GPs) play a key role in provider-initiated HIV testing, but opportunities for timely diagnosis are regularly missed. We implemented an educational intervention to improve HIV testing by GPs from 2015 to 2020, and observed a 7% increase in testing in an evaluation using laboratory data. The objective of the current study was to gain a deeper understanding of whether and how practices and perceptions of GPs’ HIV/sexually transmitted infection (STI) testing behaviour changed following the intervention.
Methods: We performed a mixed-methods study using questionnaires and semi-structured interviews to assess self-reported changes in HIV/STI testing by participating GPs. Questionnaires were completed by participants at the end of the final educational sessions from 2017 through 2020, and participating GPs were interviewed from January through March 2020. Questionnaire data were analysed descriptively, and open question responses were categorised thematically. Interview data were analysed following thematic analysis methods.
Results: In total, 101/103 participants completed questionnaires. Of the 65 participants who were included in analyses of the self-reported effect of the programme, 47 (72%) reported that it had changed their HIV/STI testing, including improved STI consultations, adherence to the STI consultation guideline, more proactive HIV testing, and more extragenital STI testing. Patients’ risk factors, patients’ requests and costs were the most important factors in selecting which STI tests to order. Eight participants were interviewed and 15 themes on improved testing were identified, including improved HIV risk-assessment, more proactive testing for HIV/STI, more focus on HIV indicator conditions and extragenital STI testing, and tools to address HIV during consultations. However, several persistent barriers to optimal HIV/STI testing by GPs were identified, including HIV-related stigma and low perceived risk.
Conclusions: Most GPs reported improved HIV/STI knowledge, attitude and testing, but there was a discrepancy between reported changes in HIV testing and observed increases using laboratory data. Our findings highlight challenges in implementation of effective interventions, and in their evaluation. Lessons learned from this intervention may inform follow-up initiatives to keep GPs actively engaged in HIV testing and care, on our way to zero new HIV infections.
DOCUMENT
Implementation of reliable methodologies allowing Reduction, Refinement, and Replacement (3Rs) of animal testing is a process that has taken several decades and is still not complete. Reliable methods are essential for regulatory hazard assessment of chemicals, where differences in test protocol can influence the test outcomes and thus affect the confidence in the predictive value of the organisms used as an alternative for mammals. Although test guidelines are common for mammalian studies, they are scarce for non-vertebrate organisms that would allow for the 3Rs of animal testing. Here, we present a set of 30 reporting criteria as the basis for such a guideline for Developmental and Reproductive Toxicology (DART) testing in the nematode Caenorhabditis elegans. Small organisms like C. elegans are increasingly used in new approach methodologies for hazard assessment; thus, reliable and robust test protocols are urgently needed. A literature assessment of the fulfilment of the reporting criteria demonstrates that although studies describe methodological details, essential information such as compound purity, lot/batch number, or type of container is often not reported. The formulated set of reporting criteria for C. elegans testing can be used by (i) researchers to describe essential experimental details, (ii) data scientists who aggregate information to assess data quality and include data in aggregated databases, and (iii) regulators to assess study data for inclusion in regulatory hazard assessment of chemicals.
DOCUMENT
In the last decade, the automotive industry has seen significant advancements in technology (Advanced Driver Assistance Systems (ADAS) and autonomous vehicles) that present the opportunity to improve traffic safety, efficiency, and comfort. However, drivers’ lack of knowledge (such as risks, benefits, capabilities, limitations, and components) and confusion (i.e., multiple systems that have similar but not identical functions with different names) concerning the vehicle technology still prevail, thus limiting its safety potential. The usual sources (such as the owner’s manual, instructions from a sales representative, online forums, and post-purchase training) do not provide adequate and sustainable knowledge to drivers concerning ADAS. Additionally, existing driving training and examinations focus mainly on unassisted driving and have been practically unchanged for 30 years. Therefore, where and how should drivers obtain the necessary skills and knowledge for safely and effectively using ADAS? The proposed KIEM project AMIGO aims to create a training framework for learner drivers by combining classroom, online/virtual, and on-the-road training modules for imparting adequate knowledge and skills (such as risk assessment, handling in safety-critical and take-over transitions, and self-evaluation). AMIGO will also develop an assessment procedure to evaluate the impact of ADAS training on drivers’ skills and knowledge by defining key performance indicators (KPIs) using in-vehicle data, eye-tracking data, and subjective measures. For practical reasons, AMIGO will focus on either lane-keeping assistance (LKA) or adaptive cruise control (ACC) for framework development and testing, depending on the system availability.
The insights obtained from this project will serve as a foundation for a subsequent research project, which will expand the AMIGO framework to other ADAS systems (e.g., mandatory ADAS systems in new cars from 2020 onwards) and specific driver target groups, such as elderly and novice drivers.