Terms like ‘big data’, ‘data science’, and ‘data visualisation’ have become buzzwords in recent years and are increasingly intertwined with journalism. Data visualisation may further blur the lines between science communication and graphic design. Our study is situated in these overlaps to compare the design of data visualisations in science news stories across four online news media platforms in South Africa and the United States. Our study contributes to an understanding of how well-considered data visualisations are tools for effective storytelling, and offers practical recommendations for using data visualisation in science communication efforts.
LINK
Within the context of the Iliad project, the authors present early design mock-ups and the resulting technical challenges for a 2D/3D/4D geo-data visualisation application focused on microparticle flows. The Iliad – Digital Twins of the Ocean project (EU Horizon 2020) aims to develop a ‘system of systems’ for creating cutting-edge digital twins of specific sea and ocean areas for diverse purposes related to their sustainable use and protection. One of the Iliad pilots addresses water quality monitoring by creating an application offering dynamic 2D and 3D visualisations of specifically identified microparticles, initially observed by buoys/sensors deployed at specific locations, whose subsequent flows are then modelled by separate software. The main upcoming technical challenges concern the data-driven approach, in which the application’s input data is obtained entirely through external API-based services offering (near) real-time observed data from buoys/sensors and simulated data emanating from particle transport models.
DOCUMENT
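The data-driven approach described above, observed buoy/sensor data and modelled particle flows arriving through separate external services, can be sketched as a merge of two time-stamped feeds. This is only an illustrative sketch: the field names, sample values, and stub functions below are assumptions standing in for the real Iliad API calls, not part of the project itself.

```python
import json
from datetime import datetime

# Stubs standing in for the external API-based services: in the pilot,
# observed data would come from (near) real-time buoy/sensor services and
# simulated data from particle transport models. Field names are assumptions.
def fetch_observations():
    payload = ('[{"t": "2024-05-01T00:00:00+00:00", "buoy": "B1", "count": 12},'
               ' {"t": "2024-05-01T01:00:00+00:00", "buoy": "B1", "count": 15}]')
    return json.loads(payload)

def fetch_model_output():
    # Simulated microparticle concentrations from a transport model (stub).
    return [{"t": "2024-05-01T00:30:00+00:00", "cell": "C7", "conc": 0.8}]

def merge_feeds(observed, simulated):
    """Merge both feeds into one time-ordered stream for the visualisation layer."""
    records = [dict(r, source="buoy") for r in observed]
    records += [dict(r, source="model") for r in simulated]
    records.sort(key=lambda r: datetime.fromisoformat(r["t"]))
    return records

stream = merge_feeds(fetch_observations(), fetch_model_output())
print([r["source"] for r in stream])  # ['buoy', 'model', 'buoy']
```

In a real deployment the stubs would be replaced by polling or streaming clients, but the merge-and-sort step is the part a visualisation layer depends on.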
Completeness of data is vital for decision making and forecasting in Building Management Systems (BMS), as missing data can result in biased decision making down the line. This study creates a guideline for imputing the gaps in BMS datasets by comparing four methods: the K Nearest Neighbour algorithm (KNN), Recurrent Neural Network (RNN), Hot Deck (HD), and Last Observation Carried Forward (LOCF). The guideline contains the best method per gap size and scale of measurement. The four selected methods come from various backgrounds and are tested on a real BMS and meteorological dataset. The focus of this paper is not to impute every cell as accurately as possible but to impute trends back into the missing data. Performance is characterised by a set of criteria that allow the user to choose the imputation method best suited to their needs. The criteria are Variance Error (VE) and Root Mean Squared Error (RMSE). VE has been given more weight, as its ability to evaluate the imputed trend is better than that of RMSE. From preliminary results, it was concluded that the best K-values for KNN are 5 for the smallest gap and 100 for the larger gaps. Using a genetic algorithm, the best RNN architecture for the purpose of this paper was determined to be Gated Recurrent Units (GRU). The comparison was performed using a different training dataset than the imputation dataset. The results show no consistent link between differences in kurtosis or skewness and imputation performance. The experiment concluded that RNN is best for interval data and HD is best for both nominal and ratio data. No single method was best for all gap sizes, as this depended on the data to be imputed.
MULTIFILE
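As a minimal illustration of the comparison above, the sketch below implements LOCF, the simplest of the four methods, and scores an imputation with RMSE and a variance-based error. The exact VE formula is not given in the abstract, so the relative variance difference used here is an assumption, as are the toy temperature values.

```python
import math

def locf(series):
    """Last Observation Carried Forward: fill each gap (None) with the
    most recently observed value."""
    filled, last = [], None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def rmse(truth, imputed):
    """Root Mean Squared Error between the true and imputed series."""
    diffs = [(t - x) ** 2 for t, x in zip(truth, imputed)]
    return math.sqrt(sum(diffs) / len(diffs))

def variance_error(truth, imputed):
    """Assumed VE variant: relative difference between the variances,
    rewarding methods that preserve the spread/trend of the signal."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    return abs(var(truth) - var(imputed)) / var(truth)

truth   = [20.0, 21.0, 22.0, 23.0, 22.0, 21.0]  # toy temperature series
gapped  = [20.0, None, None, 23.0, None, 21.0]  # the gaps to impute
imputed = locf(gapped)
print(imputed)                                   # [20.0, 20.0, 20.0, 23.0, 23.0, 21.0]
print(round(rmse(truth, imputed), 2))            # 1.0
```

A trend-aware criterion like VE penalises LOCF's flat fills here even though its per-cell RMSE is modest, which is exactly the distinction the guideline is built on.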
The report from Inholland University is dedicated to the impacts of data-driven practices on non-journalistic media production and creative industries. It explores trends, showcases advancements, and highlights opportunities and threats in this dynamic landscape. Examining various stakeholders' perspectives provides actionable insights for navigating challenges and leveraging opportunities. Through curated showcases and analyses, the report underscores the transformative potential of data-driven work while addressing concerns such as copyright issues and AI's role in replacing human artists. The findings culminate in a comprehensive overview that guides informed decision-making in the creative industry.
MULTIFILE
Citizens regularly search the Web to make informed decisions on daily life questions, like online purchases, but how they reason with the results is unknown. This reasoning involves engaging with data in ways that require statistical literacy, which is crucial for navigating contemporary data. However, many adults struggle to critically evaluate and interpret such data and make data-informed decisions. Existing literature provides limited insight into how citizens engage with web-sourced information. We investigated: How do adults reason statistically with web-search results to answer daily life questions? In this case study, we observed and interviewed three vocationally educated adults searching for products or mortgages. Unlike data producers, consumers handle pre-existing, often ambiguous data with unclear populations and no single dataset. Participants encountered unstructured (web links) and structured data (prices). We analysed their reasoning and the process of preparing data, which is part of data-ing. Key data-ing actions included judging relevance and trustworthiness of the data and using proxy variables when relevant data were missing (e.g., price for product quality). Participants’ statistical reasoning was mainly informal. For example, they reasoned about association but did not calculate a measure of it, nor assess underlying distributions. This study theoretically contributes to understanding data-ing and why contemporary data may necessitate updating the investigative cycle. As current education focuses mainly on producers’ tasks, we advocate including consumers’ tasks by using authentic contexts (e.g., music, environment, deferred payment) to promote data exploration, informal statistical reasoning, and critical web-search skills—including selecting and filtering information, identifying bias, and evaluating sources.
LINK
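The abstract notes that participants reasoned about association informally, without calculating a measure of it. For contrast, the sketch below computes a formal measure, Pearson's correlation coefficient, on a made-up set of product prices and review scores (the kind of proxy variable participants used for quality). All values are hypothetical.

```python
def pearson_r(xs, ys):
    """Pearson's correlation coefficient: a formal measure of association."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

prices = [199, 249, 299, 349, 399]   # hypothetical product prices
scores = [3.1, 3.4, 3.9, 4.0, 4.6]   # review scores used as a quality proxy
r = pearson_r(prices, scores)
print(round(r, 2))                   # 0.98
```

Informal reasoning might stop at "dearer products seem better reviewed"; the coefficient quantifies how strong that association is, which is the step the study's participants did not take.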
Municipalities increasingly seek to include citizens in decision-making processes regarding local issues, such as urban planning. This paper presents a case study on using Virtual Reality (VR) in a process of civic participation in the redesign of a public park. The municipality included citizens in intensive co-design activities to create three designs for the park and engaged the neighbourhood community in co-decision, in the form of a ballot. Through the civic participatory process, we studied the effectiveness of using VR technology to engage the community in participating in the co-decision process. The three designs were presented using highly realistic 360° visualisations and the effects on engagement were compared between various devices: VR headsets, smartphones, tablets, and computers. Viewing the designs in 2D paper plans was also included in the comparison. The study included over 1300 respondents who participated in the ballot. A statistical analysis of the collected data shows that participants viewing the 360° rendered images with VR technology expressed significantly higher engagement in the co-decision process than those using their computer at home or viewing 2D paper plans. The paper describes the complete participatory design process and the impact of the e-governance approach on the target group as well as on the actors organising the e-governance process. We discuss how the use of new technology and the active presence of a voting-support team inspired citizens to participate in the co-creation process, and how the investment in this procedure helped the local authorities generate support for the plans and strengthen their relationship with the community. The use of realistic visualisations that can be easily assessed by citizens through user-friendly technology enabled a large and diverse audience to participate. This resulted in greater visibility of municipal efforts to enhance the living environment of citizens and is therefore an important step towards increased civic engagement in municipal policy-making and implementation.
DOCUMENT
During the COVID-19 pandemic, the bidirectional relationship between policy and data reliability was a challenge for researchers at local municipal health services. Policy decisions on population-specific test locations and the selective registration of negative test results led to population differences in data quality. This hampered the calculation of the reliable population-specific infection rates needed to develop proper data-driven public health policy. https://doi.org/10.1007/s12508-023-00377-y
MULTIFILE
Due to the need to present information in a fast and attractive way, organisations are eager to use information visualisations. This study explores the collision between the different experts involved in the production of these visualisations, using the model of trading zones supplemented with the learning mechanisms found in the boundary-crossing literature. Results show that there is no single good solution to effective interdisciplinary cooperation in the field of information visualisation. Rather, all four types of cooperation that we distinguish – enforced, dominated, fractionated, and attuned – might work well, as long as they are adapted to the situation and the participants accept the constraints of the specific cooperation type they are engaged in. In any case, the involved experts and initiators have to understand and incorporate approaches that enhance the co-creative, iterative nature of the production process. In surveying the different forms of collaboration, we detect two major forms of trading zones: one encompassing the collaboration between an external client and a designer (the external trading zone) and one within an organisation between a content producer and a designer (the internal trading zone). Both mechanisms, identifying each other’s expertise and coordinating the different tasks in the production process, seem beneficial for the production process.
LINK