Data-driven condition-based maintenance (CBM) and predictive maintenance (PdM) strategies have emerged over recent years and aim at minimizing aviation maintenance costs and environmental impact through the diagnosis and prognosis of aircraft systems. As the use of data and relevant algorithms is essential to AI-based gas turbine diagnostics, there are different technical, operational, and regulatory challenges that need to be tackled before the aeronautical industry can exploit the full potential of these strategies. In this work, the generalised additive model (GAM), a machine learning (ML) method, is used to predict the evolution of an aero engine’s exhaust gas temperature (EGT). Three different continuous synthetic data sets developed by NASA are employed, known as New Commercial Modular Aero-Propulsion System Simulation (N-CMAPSS), with increasing complexity in engine deterioration. The results show that the GAM can predict the evolution of the EGT with high accuracy when using several input features that resemble the types of physical sensors installed in aero gas turbines currently in operation. As the GAM offers good interpretability, this case study is used to discuss the different data attributes a data set needs to have in order to build trust and move towards certifiable models in the future.
Citizens regularly search the Web to make informed decisions on daily life questions, like online purchases, but how they reason with the results is unknown. This reasoning involves engaging with data in ways that require statistical literacy, which is crucial for navigating contemporary data. However, many adults struggle to critically evaluate and interpret such data and make data-informed decisions. Existing literature provides limited insight into how citizens engage with web-sourced information. We investigated: How do adults reason statistically with web-search results to answer daily life questions? In this case study, we observed and interviewed three vocationally educated adults searching for products or mortgages. Unlike data producers, consumers handle pre-existing, often ambiguous data with unclear populations and no single dataset. Participants encountered unstructured (web links) and structured data (prices). We analysed their reasoning and the process of preparing data, which is part of data-ing. Key data-ing actions included judging relevance and trustworthiness of the data and using proxy variables when relevant data were missing (e.g., price for product quality). Participants’ statistical reasoning was mainly informal. For example, they reasoned about association but did not calculate a measure of it, nor assess underlying distributions. This study theoretically contributes to understanding data-ing and why contemporary data may necessitate updating the investigative cycle. As current education focuses mainly on producers’ tasks, we advocate including consumers’ tasks by using authentic contexts (e.g., music, environment, deferred payment) to promote data exploration, informal statistical reasoning, and critical web-search skills—including selecting and filtering information, identifying bias, and evaluating sources.
Trustworthy data-driven prognostics in gas turbine engines are crucial for safety, cost-efficiency, and sustainability. Accurate predictions depend on data quality, model accuracy, uncertainty estimation, and practical implementation. This work discusses data quality attributes to build trust using anonymized real-world engine data, focusing on traceability, completeness, and representativeness. A significant challenge is handling missing data, which introduces bias and affects training and predictions. The study compares the accuracy of predictions using Exhaust Gas Temperature (EGT) margin, a key health indicator, by keeping missing values, using KNN-imputation, and employing a Generalized Additive Model (GAM). Preliminary results indicate that while KNN-imputation can be useful for identifying general trends, it may not be as effective for specific predictions compared to GAM, which considers the context of missing data. The choice of method depends on the study’s objective: broad trend forecasting or specific event prediction, each requiring different approaches to manage missing data.
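To make the KNN-imputation option concrete, the following is a minimal sketch in plain Python, not the study's implementation. The function name `knn_impute`, the distance definition (root mean squared difference over the columns both rows have observed), and the example values are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of `rows` where each None is replaced by the mean of that
    column over the k nearest rows that have the column observed.

    Hypothetical helper for illustration only. Distance between two rows is the
    root mean squared difference over the columns observed in both rows.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue  # nothing to impute for this cell
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a neighbour must have the target column observed
                shared = [
                    (a, b)
                    for a, b in zip(row, other)
                    if a is not None and b is not None
                ]
                if not shared:
                    continue  # no common observed columns, distance undefined
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            # Stable sort on distance only, so ties keep their original row order.
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Made-up example: columns are (flight cycles, EGT margin); one margin is missing.
example = [
    [100.0, 40.0],
    [110.0, 38.0],
    [120.0, None],
    [130.0, 34.0],
    [500.0, 10.0],
]
imputed = knn_impute(example, k=2)
print(imputed[2])
```

Because the imputed value is an average of neighbouring records, it follows the general trend but smooths over whatever caused the gap, which is in line with the abstract's observation that imputation suits broad trend forecasting better than specific event prediction.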