Terms like ‘big data’, ‘data science’, and ‘data visualisation’ have become buzzwords in recent years and are increasingly intertwined with journalism. Data visualisation may further blur the lines between science communication and graphic design. Our study is situated in these overlaps to compare the design of data visualisations in science news stories across four online news media platforms in South Africa and the United States. Our study contributes to an understanding of how well-considered data visualisations are tools for effective storytelling, and offers practical recommendations for using data visualisation in science communication efforts.
LINK
In this study, a data feedback program to improve teachers’ science and technology (S&T) teaching skills was designed and tested. The aim was to understand whether and how the four design principles underlying this program stimulated the intended teacher support. We examined how teachers in different phases of their career applied and experienced the key aspects of these design principles. Eight in-service teachers and eight pre-service teachers attended the data feedback program and kept a logbook throughout. Group interviews were held afterwards. Findings show that applying the key aspects of the four design principles did support and stimulate in- and pre-service teachers in carrying out data feedback for improving their S&T teaching. However, some key aspects were not applied and/or experienced as intended by all attending teachers. The findings provide possible implications for the development and implementation of professional development programs to support in- and pre-service teachers’ S&T teaching using data feedback.
DOCUMENT
Routine immunization (RI) of children is the most effective and timely public health intervention for decreasing child mortality rates around the globe. Pakistan, a low- and middle-income country (LMIC), has one of the highest child mortality rates in the world, mainly due to vaccine-preventable diseases (VPDs). To improve RI coverage, potential RI defaulters need to be identified at an early stage, so that appropriate interventions can be targeted at children who are at risk of missing their scheduled vaccinations. In this paper, a machine learning (ML) based predictive model is proposed to predict which children will default on upcoming immunization visits and to examine the effect of the underlying contributing factors. The predictive model uses data from the Paigham-e-Sehat study, comprising immunization records of 3,113 children. The model was designed to obtain optimal results across accuracy, specificity, and sensitivity, so that its outcomes remain practically relevant to the problem addressed. The model was further optimized by selecting significant features and removing data bias. Nine machine learning algorithms were applied to predict which children would default on the next immunization visit. The random forest model achieved the best accuracy, 81.9%, with 83.6% sensitivity and 80.3% specificity. The main determinants of vaccination coverage were found to be vaccine coverage at birth, parental education, and socio-economic conditions of the defaulting group. This information can help policy makers take proactive and effective measures to develop evidence-based, targeted, and timely interventions for defaulting children.
MULTIFILE
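The immunization-defaulter abstract above reports accuracy, sensitivity, and specificity together. As a minimal sketch of how these three measures relate to a binary confusion matrix, assuming "defaulter" is the positive class, the snippet below computes them from raw counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: defaulters correctly predicted as defaulters
    fn: defaulters wrongly predicted as non-defaulters
    tn: non-defaulters correctly predicted as non-defaulters
    fp: non-defaulters wrongly predicted as defaulters
    """
    total = tp + fn + tn + fp
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one example")
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # share of true defaulters that are detected
        "specificity": tn / (tn + fp),    # share of non-defaulters correctly cleared
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)
print(metrics)
```

Reporting all three guards against a model that looks accurate only because one class dominates the data, which may be why the abstract stresses balancing them.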
Paralympic wheelchair athletes depend solely on the power of their upper body for their on-court wheeled mobility as well as for performing sport-specific actions in ball sports, like a basketball shot or a tennis serve. The objective of WheelPower is to improve the power output of athletes in their sport-specific wheelchair so that they perform better in competition. To achieve this objective, the current project systematically combines the three Dutch measurement innovations (WMPM, Esseda wheelchair ergometer, PitchPerfect system) to monitor a large population of athletes from different wheelchair sports, with the aim of optimal power production by wheelchair athletes during competition. The data will be directly implemented in feedback tools accessible to athletes, trainers and coaches, giving them the unique opportunity to adapt their training and wheelchair settings for optimal performance. Hence, the current consortium provides critical mass and focus by uniting scientists and all major Paralympic wheelchair sports to monitor the power output of many wheelchair athletes under field and lab conditions, supported by the best data science approach to this challenge.
DOCUMENT
Background: Adverse outcome pathway (AOP) networks are versatile tools in toxicology and risk assessment that capture and visualize mechanisms driving toxicity originating from various data sources. They share a common structure consisting of a set of molecular initiating events and key events, connected by key event relationships, leading to the actual adverse outcome. AOP networks are to be considered living documents that should be frequently updated by feeding in new data. Such iterative optimization exercises are typically done manually, which is not only time-consuming but also bears the risk of overlooking critical data. The present study introduces a novel approach for optimizing a previously published AOP network on chemical-induced cholestasis, using artificial intelligence to facilitate automated data collection, followed by quantitative confidence assessment of molecular initiating events, key events, and key event relationships. Methods: Artificial intelligence-assisted data collection was performed by means of the free web platform Sysrev. Confidence levels of the tailored Bradford-Hill criteria were quantified for the purpose of weight-of-evidence assessment of the optimized AOP network. Scores were calculated for biological plausibility, empirical evidence, and essentiality, and were integrated into a total key event relationship confidence value. The optimized AOP network was visualized using Cytoscape, with the node size representing the incidence of the key event and the edge size indicating the total confidence in the key event relationship. Results: The optimization identified 38 unique key events and 135 unique key event relationships. Transporter changes was the key event with the highest incidence, and formed the most confident key event relationship with the adverse outcome, cholestasis. Other important key events present in the AOP network include nuclear receptor changes, intracellular bile acid accumulation, bile acid synthesis changes, oxidative stress, inflammation, and apoptosis. Conclusions: This process led to the creation of an extensively informative AOP network focused on chemical-induced cholestasis. This optimized AOP network may serve as a mechanistic compass for the development of a battery of in vitro assays to reliably predict chemical-induced cholestatic injury.
DOCUMENT
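The AOP abstract above describes integrating scores for biological plausibility, empirical evidence, and essentiality into one total confidence value per key event relationship. The abstract does not state how the scores are combined, so the sketch below assumes a plain unweighted sum purely for illustration. The class name, field names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KerScores:
    """Hypothetical container for the three criterion scores of one key event relationship."""
    biological_plausibility: float
    empirical_evidence: float
    essentiality: float

    def total_confidence(self) -> float:
        # ASSUMPTION: unweighted sum; the study's actual integration rule
        # is not given in the abstract and may differ.
        return (
            self.biological_plausibility
            + self.empirical_evidence
            + self.essentiality
        )


# Illustrative scores only, not values from the study.
example = KerScores(biological_plausibility=2.0, empirical_evidence=1.0, essentiality=3.0)
total = example.total_confidence()
print(total)
```

A value like this could then be mapped to edge width in a network viewer, in line with the abstract's description of edge size indicating total confidence.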
Brochure from the inauguration of Klaas Dijkstra, professor of Computer Vision and Data Science
DOCUMENT
During the past two decades the implementation and adoption of information technology has rapidly increased. As a consequence, the way businesses operate has changed dramatically. For example, the amount of data has grown exponentially. Companies are looking for ways to use this data to add value to their business. This has implications for the manner in which (financial) governance needs to be organized. The main purpose of this study is to obtain insight into the changing role of controllers in order to add value to the business by means of data analytics. To answer the research question, a literature study was first performed to establish a theoretical foundation concerning data analytics and its potential use. Second, nineteen interviews were conducted with controllers, data scientists and academics in the financial domain. Third, a focus group with experts was organized in which additional data were gathered. Based on the literature study and the participants’ responses, it is clear that the challenge of the data explosion consists of converting data into information, knowledge and meaningful insights to support decision-making processes. Performing data analyses enables the controller to support rational decision making to complement the intuitive decision making by (senior) management. In this way, the controller has the opportunity to take the lead in information provision within an organization. However, controllers need more advanced data science and statistical competences to be able to provide management with effective analysis. Specifically, we found that an important statistical skill is the visualization and communication of statistical analysis. Controllers need this skill in order to grow in their role as business partner.
DOCUMENT
Background: COVID-19 was first identified in December 2019 in the city of Wuhan, China. The virus quickly spread and was declared a pandemic on March 11, 2020. After infection, symptoms such as fever, a (dry) cough, nasal congestion, and fatigue can develop. In some cases, the virus causes severe complications such as pneumonia and dyspnea and could result in death. The virus also spread rapidly in the Netherlands, a small and densely populated country with an aging population. Health care in the Netherlands is of a high standard, but there were nevertheless problems with hospital capacity, such as the number of available beds and staff. There were also regions and municipalities that were hit harder than others. In the Netherlands, there are important data sources available for daily COVID-19 numbers and information about municipalities. Objective: We aimed to predict the cumulative number of confirmed COVID-19 infections per 10,000 inhabitants per municipality in the Netherlands, using a data set with the properties of 355 municipalities in the Netherlands and advanced modeling techniques. Methods: We collected relevant static data per municipality from data sources that were available in the Dutch public domain and merged these data with the dynamic daily number of infections from January 1, 2020, to May 9, 2021, resulting in a data set with 355 municipalities in the Netherlands and variables grouped into 20 topics. The modeling techniques random forest and multiple fractional polynomials were used to construct a prediction model for predicting the cumulative number of confirmed COVID-19 infections per 10,000 inhabitants per municipality in the Netherlands. Results: The final prediction model had an R² of 0.63. Important properties for predicting the cumulative number of confirmed COVID-19 infections per 10,000 inhabitants in a municipality in the Netherlands were exposure to particulate matter with diameters <10 μm (PM10) in the air, the percentage of Labour party voters, and the number of children in a household. Conclusions: Data about municipality properties in relation to the cumulative number of confirmed infections can give insight into which properties of a municipality are most important for predicting the cumulative number of confirmed COVID-19 infections per 10,000 inhabitants. This insight can provide policy makers with tools to cope with COVID-19 and may also be of value in the event of a future pandemic, so that municipalities are better prepared.
LINK
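The COVID-19 abstract above summarizes model fit with a coefficient of determination of 0.63. As a minimal sketch of what that number measures, the snippet below computes the standard R² (one minus the ratio of residual to total sum of squares) for observed and predicted values. The function name and the toy data are hypothetical, and the study may have computed its value on a different basis (for example, on held-out data).

```python
def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: the variation the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: the variation around the observed mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Toy values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
score = r_squared(observed, predicted)
print(score)
```

Read this way, a value of 0.63 would mean the model accounts for roughly 63% of the variance in infections per 10,000 inhabitants across municipalities.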
In this project we take a look at the laws and regulations surrounding data collection using sensors in assistive technology and the literature on concerns of people about this technology. We also look into the Smart Teddy device and how it operates. An analysis required by the General Data Protection Regulation (GDPR) [5] will reveal the risks in terms of privacy and security in this project and how to mitigate them. https://nl.linkedin.com/in/haniers
MULTIFILE
In the rapidly evolving field of machine learning, selecting the most appropriate model for a given dataset is crucial. Understanding the characteristics of a dataset can significantly influence the outcomes of predictive modeling efforts, making the study of dataset properties an essential component of data science. This study investigates the possibilities of using simulated human data for personalized applications, specifically for testing clustering approaches. In particular, the study focuses on the relationship between dataset characteristics and the selection of the optimal classification model for clusters of datasets. The results of this study provide critical insights for researchers and practitioners in machine learning, emphasizing the importance of dataset characteristics and variability in building and selecting robust models for diverse data conditions. The use of human simulation data provides valuable insights but requires further refinement to capture the full variability of real-world conditions.
DOCUMENT