Completeness of data is vital for decision making and forecasting in Building Management Systems (BMS), as missing data can bias decisions down the line. This study creates a guideline for imputing gaps in BMS datasets by comparing four methods: the K Nearest Neighbour algorithm (KNN), Recurrent Neural Network (RNN), Hot Deck (HD) and Last Observation Carried Forward (LOCF). The guideline gives the best method per gap size and scale of measurement. The four selected methods come from various backgrounds and are tested on a real BMS and meteorological dataset. The focus of this paper is not to impute every cell as accurately as possible but to impute trends back into the missing data. Performance is characterised by a set of criteria that allow users to choose the imputation method best suited to their needs. The criteria are Variance Error (VE) and Root Mean Squared Error (RMSE). VE is given more weight because it evaluates the imputed trend better than RMSE does. Preliminary results showed that the best K-values for KNN are 5 for the smallest gap and 100 for the larger gaps. Using a genetic algorithm, the best RNN architecture for the purpose of this paper was determined to be Gated Recurrent Units (GRU). The comparison was performed using a training dataset different from the imputation dataset. The results show no consistent link between the difference in kurtosis or skewness and imputation performance. The experiment showed that RNN is best for interval data and HD is best for both nominal and ratio data. No single method was best for all gap sizes, as the outcome depended on the data to be imputed.
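As a rough illustration of why the abstract above weights VE over RMSE when the goal is to restore trends, the sketch below applies LOCF to a toy series and scores the imputed gap with both criteria. The abstract does not define VE; the code assumes it is the absolute difference between the variance of the true values and the variance of the imputed values inside the gap. The series values and all function names are made up for illustration and are not taken from the study.

```python
import math
from statistics import pvariance


def locf(series):
    """Last Observation Carried Forward: replace None with the latest observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None if the series starts with a gap
    return filled


def rmse(truth, estimate):
    """Root Mean Squared Error between two equally long sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))


def variance_error(truth, estimate):
    """ASSUMED definition of VE: absolute difference of population variances."""
    return abs(pvariance(truth) - pvariance(estimate))


# Toy temperature-like series (hypothetical values) with a gap of two readings.
truth = [20.0, 21.0, 22.0, 23.0, 22.0, 21.0]
observed = [20.0, 21.0, None, None, 22.0, 21.0]

imputed = locf(observed)
gap = [i for i, v in enumerate(observed) if v is None]
gap_truth = [truth[i] for i in gap]
gap_imputed = [imputed[i] for i in gap]

print(imputed)                                 # [20.0, 21.0, 21.0, 21.0, 22.0, 21.0]
print(rmse(gap_truth, gap_imputed))            # about 1.58
print(variance_error(gap_truth, gap_imputed))  # 0.25
```

LOCF fills the gap with a flat line, so the imputed gap has zero variance. Under the assumed definition, VE registers that lost variation directly, whereas RMSE only reports the average pointwise distance.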
MULTIFILE
This paper investigates the use of information technology, specifically machine learning algorithms, for water quality assessment in Timor-Leste. Access to clean water is essential to the safety of humans and other living beings. The Water Quality Index (WQI) is the standard tool for assessing water quality and can be calculated from physicochemical and microbiological parameters. However, in developing countries, the continuing need to bring water and energy to the most disadvantaged makes it necessary to find new solutions. In such cases, missing-value imputation and machine learning models are useful for classifying water samples as suitable or unsuitable with considerable accuracy. Several imputation methods were tested, and four machine learning algorithms were explored: logistic regression, support vector machine, random forest, and Gaussian naïve Bayes. We obtained a dataset with 368 observations from 26 groundwater sampling points in Dili city, Timor-Leste. The experimental results show that 64% of the water samples are suitable for human consumption. We also found that k-NN imputation combined with the random forest method performed best, achieving 96% accuracy with three-fold cross-validation. The analysis revealed that some parameters significantly affected the classification results.
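The k-NN imputation step mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it measures Euclidean distance over the features observed in the incomplete row, takes the k closest complete rows, and fills each missing entry with their mean. The function name `knn_impute` and the two-column toy rows are hypothetical and do not come from the Dili dataset.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the mean of the k nearest complete rows.

    Distance is Euclidean over the features that are observed in the
    incomplete row. Rows without missing values are copied unchanged.
    """
    complete = [r for r in rows if None not in r]
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]
        neighbours = sorted(
            complete,
            key=lambda c: math.dist(
                [row[i] for i in observed], [c[i] for i in observed]
            ),
        )[:k]
        filled = [
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        result.append(filled)
    return result


# Hypothetical two-parameter samples; the last one misses its second value.
samples = [
    [7.0, 1.0],
    [7.2, 1.2],
    [8.5, 3.0],
    [7.1, None],
]

filled = knn_impute(samples, k=2)
print(filled[3])  # second value is the mean of the two nearest rows (about 1.1)
```

In practice the features would usually be scaled before distances are computed, since parameters measured in different units would otherwise dominate the neighbour search.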
DOCUMENT
Trustworthy data-driven prognostics in gas turbine engines are crucial for safety, cost-efficiency, and sustainability. Accurate predictions depend on data quality, model accuracy, uncertainty estimation, and practical implementation. This work discusses data quality attributes to build trust using anonymized real-world engine data, focusing on traceability, completeness, and representativeness. A significant challenge is handling missing data, which introduces bias and affects training and predictions. The study compares the accuracy of predictions using Exhaust Gas Temperature (EGT) margin, a key health indicator, by keeping missing values, using KNN-imputation, and employing a Generalized Additive Model (GAM). Preliminary results indicate that while KNN-imputation can be useful for identifying general trends, it may not be as effective for specific predictions compared to GAM, which considers the context of missing data. The choice of method depends on the study’s objective: broad trend forecasting or specific event prediction, each requiring different approaches to manage missing data.
DOCUMENT
Research on follow-up outcomes of systemic interventions for family members with an intellectual disability is scarce. In this study, short-term and long-term follow-up outcomes of multisystemic therapy for adolescents with antisocial or delinquent behaviour and an intellectual disability (MST-ID) are reported. In addition, the role of parental intellectual disability was examined. Outcomes of 55 families who had received MST-ID were assessed at the end of treatment and at 6-month, 12-month and 18-month follow-up. Parental intellectual disability was used as a predictor of treatment outcomes. Missing data were handled using multiple imputation. Rule-breaking behaviour of adolescents declined during treatment and stabilized until 18 months post-treatment. The presence or absence of parental intellectual disability did not predict treatment outcomes. This study was the first to report long-term outcomes of MST-ID. The intervention achieved similar results in families with and without parents with an intellectual disability.
DOCUMENT
Background: Early identification of older cardiac patients at high risk of readmission or mortality facilitates targeted deployment of preventive interventions. In the Netherlands, the frailty tool of the Dutch Safety Management System (DSMS-tool) consists of (the risk of) delirium, falling, functional impairment, and malnutrition and is currently used in all older hospitalised patients. However, its predictive performance in older cardiac patients is unknown. Aim: To estimate the performance of the DSMS-tool alone and combined with other predictors in predicting hospital readmission or mortality within 6 months in acutely hospitalised older cardiac patients. Methods: An individual patient data meta-analysis was performed on 529 acutely hospitalised cardiac patients ≥70 years from four prospective cohorts. Missing values for predictor and outcome variables were multiply imputed. We explored discrimination and calibration of: (1) the DSMS-tool alone; (2) the four components of the DSMS-tool plus easily obtainable clinical predictors; (3) the four components of the DSMS-tool plus predictors that are more difficult to obtain. Predictors in models 2 and 3 were selected by backward selection with a threshold of p = 0.157. We used shrunk c-statistics, calibration plots, regression slopes and Hosmer-Lemeshow p-values (PHL) to describe predictive performance in terms of discrimination and calibration. Results: The population mean age was 82 years, 52% were males and 51% were admitted for heart failure. The DSMS-tool was positive in 45% for delirium, 41% for falling, 37% for functional impairments and 29% for malnutrition. The incidence of hospital readmission or mortality gradually increased from 37% to 60% with increasing DSMS scores. Overall, the DSMS-tool showed limited discrimination (c-statistic 0.61, 95% CI 0.56-0.66). The final model included the DSMS-tool, diagnosis at admission and Charlson Comorbidity Index and had a c-statistic of 0.69 (95% CI 0.63-0.73; PHL was 0.658).
Discussion: The DSMS-tool alone has limited capacity to accurately estimate the risk of readmission or mortality in hospitalised older cardiac patients. Adding disease-specific risk factor information to the DSMS-tool resulted in a moderately performing model. To optimise the early identification of older hospitalised cardiac patients at high risk, the combination of geriatric and disease-specific predictors should be further explored.
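For readers unfamiliar with the c-statistic reported above, the sketch below computes it for a binary outcome in the usual way. It is the proportion of event/non-event pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The risks and outcomes are invented for illustration and are unrelated to the study data. The shrinkage applied in the study is not shown.

```python
def c_statistic(risks, outcomes):
    """Concordance (c) statistic for predicted risks and binary outcomes (1 = event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0   # concordant pair
            elif e == n:
                score += 0.5   # tied pair counts half
    return score / (len(events) * len(non_events))


# Hypothetical predicted risks and observed outcomes for five patients.
risks = [0.2, 0.4, 0.4, 0.7, 0.9]
outcomes = [0, 0, 1, 0, 1]

print(c_statistic(risks, outcomes))  # 0.75
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why values of 0.61 and 0.69 are read as limited and moderate.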
DOCUMENT
Background: The diffusion of telehealth into hospital care is still low, partially because of a lack of telehealth competence among nurses. In an earlier study, we reported on the knowledge, skills, and attitudes (KSAs) nurses require for the use of telehealth. The current study describes hospital nurses' confidence in possessing these telehealth KSAs. Method: In a cross-sectional study, we invited 3,543 nurses from three hospitals in the Netherlands to rate their self-confidence in 31 telehealth KSAs on a 5-point Likert scale, using an online questionnaire. Results: A total of 1,017 nurses responded to the survey. Nine KSAs were scored with a median value of 4.0, 19 KSAs with a median value of 3.0, and three KSAs with a median value of 2.0. Conclusion: Given that hospital nurses have self-confidence in only nine of the 31 essential telehealth KSAs, continuing education in additional KSAs is recommended to support nurses in gaining confidence in using telehealth.
DOCUMENT
Intra-ocular straylight can cause decreased visual functioning, and it may cause diminished vision-related quality of life (VRQOL). This cross-sectional population-based study investigates the association between straylight and VRQOL in middle-aged and elderly individuals. Multivariable linear regression analyses were used to assess the association between VRQOL and straylight, modeled both continuously and dichotomized at the recommended fitness-to-drive cutoff (straylight ≥ 1.4 log(s)). The study showed that participants with normal straylight values (straylight < 1.4 log(s)) rated their VRQOL slightly better than those with high straylight values (straylight ≥ 1.4 log(s)). Furthermore, multivariable regression analysis revealed a borderline statistically significant association (p = .06) between intra-ocular straylight and self-reported VRQOL in middle-aged and elderly individuals. The association between straylight and self-reported VRQOL was not influenced by the status of the intra-ocular lens (natural vs. artificial intra-ocular lens after cataract extraction) or the number of (instrumental) activities of daily living that were reported as difficult for the elderly individuals.
DOCUMENT
Background: Medically unexplained physical symptoms (MUPS) are a leading cause of reduced work functioning. It is not known which factors are associated with reduced work functioning in people with moderate MUPS. Insight into these factors can contribute to preventing reduced work functioning, associated work-related costs and MUPS becoming chronic. Therefore, the aim of this study was to identify which demographic and health-related factors are associated with reduced work functioning, operationalized as impaired work performance and absenteeism, in people with moderate MUPS. Methods: Data of 104 participants from an ongoing study on people with moderate MUPS were used in this cross-sectional study. Ten independent variables were measured at baseline to determine their association with reduced work functioning: severity of psychosocial symptoms (four domains, measured with the Four-Dimensional Symptom Questionnaire), physical health (RAND 36-Item Health Survey), moderate or vigorous physical activity (Activ8 activity monitor), age, sex, education level and duration of complaints. Two separate multivariable linear regression analyses with backward stepwise selection were performed, one for impaired work performance and one for absenteeism. Results: Absenteeism rate rose by 2.5% and 0.6% for every additional point on the Four-Dimensional Symptom Questionnaire for domain 'depression' (B = 0.025, SE = 0.009, p = .006) and domain 'somatization' (B = 0.006, SE = 0.003, p = .086), respectively. An R² value of 0.118 was found. Impaired work performance rate rose by 0.2% and 0.5% for every additional point on the Four-Dimensional Symptom Questionnaire for domain 'distress' (B = 0.002, SE = 0.001, p = .084) and domain 'somatization' (B = 0.005, SE = 0.001, p < .001), respectively. An R² value of 0.252 was found.
Conclusions: Severity of distress, probability of a depressive disorder and probability of somatization are positively associated with higher rates of reduced work functioning in people with moderate MUPS. Severity of psychosocial symptoms seems to play a significant role in preventing long-term absenteeism and highly impaired work performance. However, because of the low percentage of explained variance, additional research is necessary to gain insight into other factors that might better explain the variance in reduced work functioning.
DOCUMENT
Objective To develop and internally validate a prognostic model to predict chronic pain after a new episode of acute or subacute non-specific idiopathic, non-traumatic neck pain in patients presenting to physiotherapy primary care, emphasising modifiable biomedical, psychological and social factors. Design A prospective cohort study with a 6-month follow-up between January 2020 and March 2023. Setting 30 physiotherapy primary care practices. Participants Patients with a new presentation of non-specific idiopathic, non-traumatic neck pain, with a duration lasting no longer than 12 weeks from onset. Baseline measures Candidate prognostic variables collected from participants included age and sex, neck pain symptoms, work-related factors, general factors, psychological and behavioural factors and the remaining factors: therapeutic relation and healthcare provider attitude. Outcome measures Pain intensity at 6 weeks, 3 months and 6 months on a Numeric Pain Rating Scale (NPRS) after inclusion. An NPRS score of ≥3 at each time point was used to define chronic neck pain. Results 62 (10%) of the 603 participants developed chronic neck pain. The prognostic factors in the final model were sex, pain intensity, reported pain in different body regions, headache since and before the neck pain, posture during work, employment status, illness beliefs about pain identity and recovery, treatment beliefs, distress and self-efficacy. The model demonstrated an optimism-corrected area under the curve of 0.83 and a corrected R2 of 0.24. Calibration was deemed acceptable to good, as indicated by the calibration curve. The Hosmer–Lemeshow test yielded a p-value of 0.7167, indicating a good model fit. Conclusion This model has the potential to obtain a valid prognosis for developing chronic pain after a new episode of acute and subacute non-specific idiopathic, non-traumatic neck pain. It includes mostly potentially modifiable factors for physiotherapy practice. 
External validation of this model is recommended.
LINK