Poster presented at the 14th Congress of the European Society for Research in Mathematics Education, Free University of Bozen-Bolzano, Italy.
Many students persistently misinterpret histograms. This calls for closer inspection of students’ strategies when interpreting histograms and case-value plots (which look similar but are different). Using students’ gaze data, we ask: How and how well do upper secondary pre-university school students estimate and compare arithmetic means of histograms and case-value plots? We designed four item types: two requiring mean estimation and two requiring mean comparison. Analysis of gaze data of 50 students (15–19 years old) solving these items was triangulated with data from cued recall. We found five strategies. The two strategies hypothesized to be most common for estimating means were confirmed: a strategy associated with horizontal gazes and a strategy associated with vertical gazes. A third, new, count-and-compute strategy was found. Two more strategies emerged for comparing means that take specific features of the distribution into account. In about half of the histogram tasks, students used correct strategies. Surprisingly, when comparing two case-value plots, some students used distribution features that are only relevant for histograms, such as symmetry. As several incorrect strategies related to how and where the data and the distribution of these data are depicted in histograms, future interventions should aim at supporting students in understanding these concepts in histograms. A methodological advantage of eye-tracking data collection is that it reveals more details about students’ problem-solving processes than thinking-aloud protocols. We speculate that spatial gaze data can be re-used to substantiate ideas about the sensorimotor origin of learning mathematics.
Graphs are ubiquitous. Many graphs, including histograms, bar charts, and stacked dotplots, have proven tricky to interpret. Students’ gaze data can indicate students’ interpretation strategies on these graphs. We therefore explore the question: In what way can machine learning quantify differences in students’ gaze data when interpreting two near-identical histograms with graph tasks in between? Our work provides evidence that using machine learning in conjunction with gaze data can provide insight into how students analyze and interpret graphs. This approach also sheds light on the ways in which students may better understand a graph after first being presented with other graph types, including dotplots. We conclude with a model that can accurately differentiate between the first and second time a student solved near-identical histogram tasks.
Citizens regularly search the Web to make informed decisions on daily life questions, like online purchases, but how they reason with the results is unknown. This reasoning involves engaging with data in ways that require statistical literacy, which is crucial for navigating contemporary data. However, many adults struggle to critically evaluate and interpret such data and make data-informed decisions. Existing literature provides limited insight into how citizens engage with web-sourced information. We investigated: How do adults reason statistically with web-search results to answer daily life questions? In this case study, we observed and interviewed three vocationally educated adults searching for products or mortgages. Unlike data producers, consumers handle pre-existing, often ambiguous data with unclear populations and no single dataset. Participants encountered unstructured (web links) and structured data (prices). We analysed their reasoning and the process of preparing data, which is part of data-ing. Key data-ing actions included judging relevance and trustworthiness of the data and using proxy variables when relevant data were missing (e.g., price for product quality). Participants’ statistical reasoning was mainly informal. For example, they reasoned about association but did not calculate a measure of it, nor assess underlying distributions. This study theoretically contributes to understanding data-ing and why contemporary data may necessitate updating the investigative cycle. As current education focuses mainly on producers’ tasks, we advocate including consumers’ tasks by using authentic contexts (e.g., music, environment, deferred payment) to promote data exploration, informal statistical reasoning, and critical web-search skills—including selecting and filtering information, identifying bias, and evaluating sources.
Terms like ‘big data’, ‘data science’, and ‘data visualisation’ have become buzzwords in recent years and are increasingly intertwined with journalism. Data visualisation may further blur the lines between science communication and graphic design. Our study is situated in these overlaps to compare the design of data visualisations in science news stories across four online news media platforms in South Africa and the United States. Our study contributes to an understanding of how well-considered data visualisations are tools for effective storytelling, and offers practical recommendations for using data visualisation in science communication efforts.
Big data analytics has received much attention in the last decade and is viewed as one of the next most important strategic resources for organizations. Yet, the role of employees' data literacy seems to be neglected in the current literature. The aim of this study is twofold: (1) it develops data literacy as an organizational competency by identifying its dimensions and measurement, and (2) it examines the relationship between data literacy and governmental performance (internal and external). Using data from a survey of 120 Dutch governmental agencies, the proposed model was tested using PLS-SEM. The results empirically support the suggested theoretical framework and corresponding measurement instrument. The results partially support the relationship between data literacy and performance: data literacy has a significant effect on internal performance. However, counter-intuitively, no significant effect is found for external performance.
We present a novel architecture for an AI system that allows a priori knowledge to be combined with deep learning. In traditional neural networks, all available data is pooled at the input layer. Our alternative neural network is constructed so that partial representations (invariants) are learned in the intermediate layers, which can then be combined with a priori knowledge or with other predictive analyses of the same data. Because learning becomes more efficient, smaller training datasets suffice. In addition, because this architecture allows inclusion of a priori knowledge and interpretable predictive models, the interpretability of the entire system increases while the data can still be used in a black box neural network. Our system makes use of networks of neurons rather than single neurons to enable the representation of approximations (invariants) of the output.
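The idea of learning invariants in intermediate layers and combining them with a priori knowledge at a later stage can be illustrated with a minimal, untrained forward pass. This is only a sketch under stated assumptions: the layer sizes, the `forward` function, and the choice to concatenate the a priori features with the invariants are hypothetical and are not taken from the architecture described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_INPUTS, N_HIDDEN, N_INVARIANTS, N_PRIOR = 8, 16, 3, 2

# A small sub-network (a "network of neurons") that maps raw data to invariants.
w1, b1 = rng.normal(size=(N_INPUTS, N_HIDDEN)), np.zeros(N_HIDDEN)
w2, b2 = rng.normal(size=(N_HIDDEN, N_INVARIANTS)), np.zeros(N_INVARIANTS)

# Output layer that sees the invariants together with a priori features.
w_out = rng.normal(size=(N_INVARIANTS + N_PRIOR, 1))
b_out = np.zeros(1)


def dense(x, w, b):
    """Fully connected layer with tanh activation."""
    return np.tanh(x @ w + b)


def forward(raw, prior):
    """Return (prediction, invariants) for a batch.

    raw:   array of shape (batch, N_INPUTS), the raw data
    prior: array of shape (batch, N_PRIOR), a priori knowledge or the
           output of another predictive model (assumed to be numeric)
    """
    invariants = dense(dense(raw, w1, b1), w2, b2)
    # Assumption: a priori knowledge enters after the invariants are
    # computed, not at the input layer.
    combined = np.concatenate([invariants, prior], axis=1)
    return combined @ w_out + b_out, invariants


prediction, invariants = forward(rng.normal(size=(4, N_INPUTS)),
                                 rng.normal(size=(4, N_PRIOR)))
print(prediction.shape, invariants.shape)
```

Because the intermediate `invariants` are returned alongside the prediction, they can be inspected or passed to an interpretable model, which is one way to read the interpretability claim above.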
Exploratory analyses are an important first step in psychological research, particularly in problem-based research where variables from multiple theoretical perspectives are often included that have not been studied together before. Notably, exploratory analyses aim to give first insights into how items and variables included in a study relate to each other. Typically, exploratory analyses involve computing bivariate correlations between items and variables and presenting them in a table. While this is suitable for relatively small datasets, such tables can easily become overwhelming when datasets contain a broad set of variables from multiple theories. We propose the Gaussian graphical model as a novel exploratory analysis tool and present a systematic roadmap to apply this model to explore relationships between items and variables in environmental psychology research. We demonstrate the use and value of the Gaussian graphical model to study relationships between a broad set of items and variables that are expected to explain the effectiveness of community energy initiatives in promoting sustainable energy behaviors.
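To show how a Gaussian graphical model differs from a table of bivariate correlations, the sketch below computes partial correlations from the inverse covariance (precision) matrix, using the standard relation that the partial correlation between variables i and j equals minus the (i, j) precision entry divided by the square root of the product of the two diagonal entries. The simulated variables and the function name `partial_correlations` are hypothetical; applied work usually adds regularization (for example the graphical lasso), which this unregularized sketch omits.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation matrix for data of shape (n_samples, n_variables)."""
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)          # unregularized inverse covariance
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated chain x -> y -> z: x and z are related only through y.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + rng.normal(size=n)
data = np.column_stack([x, y, z])

marginal = np.corrcoef(data, rowvar=False)
pcor = partial_correlations(data)

print("bivariate correlation x-z:", round(marginal[0, 2], 2))
print("partial correlation x-z given y:", round(pcor[0, 2], 2))
```

In this simulated chain the bivariate correlation between x and z is substantial, while their partial correlation is close to zero, so the corresponding edge would be absent from the graph. That is the kind of simplification that makes the model useful when many variables are explored at once.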
Background: Adverse outcome pathway (AOP) networks are versatile tools in toxicology and risk assessment that capture and visualize mechanisms driving toxicity originating from various data sources. They share a common structure consisting of a set of molecular initiating events and key events, connected by key event relationships, leading to the actual adverse outcome. AOP networks are to be considered living documents that should be frequently updated by feeding in new data. Such iterative optimization exercises are typically done manually, which not only is a time-consuming effort, but also bears the risk of overlooking critical data. The present study introduces a novel approach for AOP network optimization of a previously published AOP network on chemical-induced cholestasis using artificial intelligence to facilitate automated data collection followed by subsequent quantitative confidence assessment of molecular initiating events, key events, and key event relationships. Methods: Artificial intelligence-assisted data collection was performed by means of the free web platform Sysrev. Confidence levels of the tailored Bradford-Hill criteria were quantified for the purpose of weight-of-evidence assessment of the optimized AOP network. Scores were calculated for biological plausibility, empirical evidence, and essentiality, and were integrated into a total key event relationship confidence value. The optimized AOP network was visualized using Cytoscape with the node size representing the incidence of the key event and the edge size indicating the total confidence in the key event relationship. Results: This resulted in the identification of 38 and 135 unique key events and key event relationships, respectively. Transporter changes was the key event with the highest incidence, and formed the most confident key event relationship with the adverse outcome, cholestasis. Other important key events present in the AOP network include: nuclear receptor changes, intracellular bile acid accumulation, bile acid synthesis changes, oxidative stress, inflammation and apoptosis. Conclusions: This process led to the creation of an extensively informative AOP network focused on chemical-induced cholestasis. This optimized AOP network may serve as a mechanistic compass for the development of a battery of in vitro assays to reliably predict chemical-induced cholestatic injury.
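The scoring and visualization step described in the Methods can be sketched roughly as follows. Everything numeric here is a hypothetical placeholder: the abstract does not state how the three scores are integrated, so the unweighted sum in `total_confidence` is an assumption, the score values are invented for illustration only, and incidence is approximated by the number of relationships touching a key event rather than by the study's own counts.

```python
from collections import Counter

# Hypothetical placeholder scores (not results from the study).
key_event_relationships = [
    {"source": "transporter changes", "target": "cholestasis",
     "plausibility": 3, "evidence": 3, "essentiality": 2},
    {"source": "oxidative stress", "target": "cholestasis",
     "plausibility": 2, "evidence": 1, "essentiality": 1},
    {"source": "transporter changes",
     "target": "intracellular bile acid accumulation",
     "plausibility": 3, "evidence": 2, "essentiality": 2},
]


def total_confidence(ker):
    """Assumed integration rule: unweighted sum of the three scores."""
    return ker["plausibility"] + ker["evidence"] + ker["essentiality"]


# Edge attribute: total confidence, which would drive edge width in a plot.
edge_width = {
    (ker["source"], ker["target"]): total_confidence(ker)
    for ker in key_event_relationships
}

# Node attribute: incidence, approximated here by relationship count
# (an assumption), which would drive node size in a plot.
node_size = Counter()
for ker in key_event_relationships:
    node_size[ker["source"]] += 1
    node_size[ker["target"]] += 1

for edge, width in edge_width.items():
    print(edge, width)
```

The resulting `edge_width` and `node_size` mappings correspond to the two visual encodings mentioned above and could be exported as edge and node attribute tables for a network visualization tool.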
Inaugural lecture as Lector Precision Livestock Farming at HAS University of Applied Sciences on October 14, 2016. PLF, Precision Livestock Farming, uses technologies to continuously monitor animal behaviour, animal health, production and environmental impact.