Complete data is vital for decision making and forecasting in Building Management Systems (BMS), as missing data can bias downstream decisions. This study creates a guideline for imputing gaps in BMS datasets by comparing four methods: the K-Nearest Neighbour algorithm (KNN), Recurrent Neural Networks (RNN), Hot Deck (HD) and Last Observation Carried Forward (LOCF). The guideline gives the best method per gap size and scale of measurement. The four methods come from different backgrounds and are tested on a real BMS and meteorological dataset. The focus of this paper is not to impute every cell as accurately as possible but to restore trends in the missing data. Performance is characterised by two criteria, so that users can choose the imputation method best suited to their needs: Variance Error (VE) and Root Mean Squared Error (RMSE). VE is given more weight because it evaluates the imputed trend better than RMSE does. Preliminary results indicated that the best K-values for KNN are 5 for the smallest gap and 100 for the larger gaps. A genetic algorithm determined that the best RNN architecture for the purpose of this paper was Gated Recurrent Units (GRU). The comparison was performed using a training dataset different from the imputation dataset. The results show no consistent link between the difference in kurtosis or skewness and imputation performance. The experiment showed that RNN is best for interval data and HD is best for both nominal and ratio data. No single method was best for all gap sizes, as the best choice depended on the data to be imputed.
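As a rough illustration of the simplest method and the two criteria named above, the following sketch applies LOCF to a short series and scores the result. The abstract does not define VE, so the `variance_error` function below is an assumption (absolute difference between the population variances of the true and imputed series), and all function names and sample values are hypothetical. For simplicity the scores are computed over the whole series rather than only over the gap.

```python
import math
from statistics import pvariance


def locf(series):
    """Fill None gaps with the last observed value.

    Gaps before the first observation stay None, since there is
    nothing to carry forward yet.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def rmse(truth, imputed):
    """Root Mean Squared Error between two equal-length series."""
    squared = [(t - i) ** 2 for t, i in zip(truth, imputed)]
    return math.sqrt(sum(squared) / len(squared))


def variance_error(truth, imputed):
    """ASSUMED definition of VE: absolute difference in population variance.

    The source does not give the formula; this is only one plausible reading.
    """
    return abs(pvariance(truth) - pvariance(imputed))


# Hypothetical temperature readings with a two-step gap.
truth = [20.0, 21.0, 22.0, 23.0, 24.0]
gappy = [20.0, 21.0, None, None, 24.0]

filled = locf(gappy)
print(filled)
print(rmse(truth, filled))
print(variance_error(truth, filled))
```

Under this assumed definition, a flat LOCF fill lowers the variance of the series, which is the kind of lost trend that a variance-based criterion penalises.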
DOCUMENT
Machine learning models have proven to be reliable methods in classification tasks. However, little research has been done on classifying dwelling characteristics based on smart meter and weather data. Gaining insights into dwelling characteristics can help to create or improve policies for building new dwellings to the NZEB standard. This paper compares different machine learning algorithms and the methods used to implement the models correctly. These methods include data pre-processing, model validation and evaluation. Smart meter data provided by Groene Mient was used to train several machine learning algorithms. The models generated by the algorithms were compared on their performance. The results showed that the Recurrent Neural Network (RNN) performed best, with 96% accuracy. Cross validation was used to validate the models, where 80% of the data was used for training and 20% for testing. Evaluation metrics were used to produce classification reports, which indicate which of the models works best for this specific problem. The models were programmed in Python.
DOCUMENT
The labour market is continuously evolving, leading to an ever-changing demand for competencies and jobs. Besides profession-specific skills and knowledge, this calls for resilience and adaptability in professionals. Students are therefore expected to develop self-regulated learning (SRL). SRL concerns taking control of one's own learning process: students decide for themselves how to reach learning outcomes, evaluate those outcomes, and adjust the learning process themselves. For study programmes, the question is how they can guide and promote SRL. This requires insight into learning behaviour and the patterns in it, and awareness of how these insights can be used to support SRL and guide the learning process. This study investigated whether the data that students leave behind in the electronic learning environment (ELE) can give an indication of a student's learning process and SRL. To extract the complex patterns from the data, the ELE data were analysed using AI techniques. This made it possible to divide students' learning processes into different categories. The categories give a first indication of a student's SRL. Further research is needed, including to examine what this means for supporting students in their learning process.
DOCUMENT
Machine learning models have proven to be reliable methods in classification tasks. However, little research has been conducted on the classification of dwelling characteristics based on smart meter and weather data. Gaining insights into dwelling characteristics, which comprise the type of heating system used, the number of inhabitants, and the number of solar panels installed, can be helpful in creating or improving policies for building new dwellings to the nearly zero-energy standard. This paper compares different supervised machine learning algorithms, namely Logistic Regression, Support Vector Machine, K-Nearest Neighbor, and Long Short-Term Memory (LSTM), and the methods used to implement these algorithms correctly. These methods include data pre-processing, model validation, and evaluation. Smart meter data, which was used to train several machine learning algorithms, was provided by Groene Mient. The models generated by the algorithms were compared on their performance. The results showed that the LSTM performed best, with 96% accuracy. Cross validation was used to validate the models, where 80% of the data was used for training and 20% for testing. Evaluation metrics were used to produce classification reports, which indicate that the LSTM outperforms the compared models on the evaluation metrics for this specific problem.
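The 80%/20% partition described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names, the fixed seed, and the toy labels are all assumptions. Note that a single 80/20 partition is a hold-out split; k-fold cross validation would repeat the split so that every sample is used for testing once.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split off a test portion.

    Returns (train, test). The seed is fixed only to make the
    illustration reproducible.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical sample identifiers standing in for smart meter records.
samples = list(range(100))
train, test = train_test_split(samples)
print(len(train), len(test))

# Hypothetical heating-system labels and predictions.
y_true = ["gas", "heat_pump", "gas", "heat_pump"]
y_pred = ["gas", "heat_pump", "heat_pump", "heat_pump"]
print(accuracy(y_true, y_pred))
```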
DOCUMENT
This study presents an automated method for detecting and measuring the apex head thickness of tomato plants, a critical phenotypic trait associated with plant health, fruit development, and yield forecasting. Because the apex is sensitive to physical contact, non-invasive monitoring is essential. This paper addresses the demand for automated, contactless systems among Dutch growers. Our approach integrates deep learning models (YOLO and Faster R-CNN) with RGB-D camera imaging to enable accurate, scalable, and non-invasive measurement in greenhouse environments. A dataset of 600 RGB-D images captured in a controlled greenhouse was fully preprocessed, annotated, and augmented for training. Experimental results show that YOLOv8n achieved the best performance, with a precision of 91.2 %, a recall of 86.7 %, and an Intersection over Union (IoU) score of 89.4 %. The other models, YOLOv9t, YOLOv10n, YOLOv11n, and Faster R-CNN, achieved lower precision scores of 83.6 %, 74.6 %, 75.4 %, and 78 %, respectively. Their IoU scores were also lower, indicating less reliable detection. This research establishes a robust, real-time method for precision agriculture through automated apex head thickness measurement.
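For readers unfamiliar with the metrics reported above, the sketch below shows how IoU, precision, and recall are conventionally computed for bounding-box detection. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the sample numbers are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero for disjoint boxes.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predicted and annotated boxes, in pixels.
predicted = (0, 0, 2, 2)
annotated = (1, 1, 3, 3)
print(iou(predicted, annotated))
print(precision_recall(tp=8, fp=2, fn=2))
```

A detection is typically counted as a true positive when its IoU with an annotated box exceeds a chosen threshold; the threshold used in the study is not stated in the abstract.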
DOCUMENT
The security of online assessments is a major concern due to widespread cheating. One common form of cheating is impersonation, where students invite unauthorized persons to take assessments on their behalf. Several techniques exist to handle impersonation. Some researchers recommend the use of an integrity policy, but communicating the policy effectively to students is a challenge. Others propose authentication methods such as passwords and fingerprints; these offer initial authentication but are vulnerable thereafter. Face recognition offers post-login authentication but requires additional hardware. Keystroke Dynamics (KD) has been used to provide post-login authentication without any additional hardware, but its use is limited to subjective assessment. In this work, we address impersonation in assessments with Multiple Choice Questions (MCQ). Our approach combines two key strategies: reinforcement of the integrity policy for prevention, and keystroke-based random authentication for detection of impersonation. To the best of our knowledge, this is the first attempt to use keystroke dynamics for post-login authentication in the context of MCQ. We improve an online quiz tool to collect the data suited to our needs and use feature engineering to address the challenge of high-dimensional keystroke datasets. Using machine learning classifiers, we identify the best-performing model for authenticating the students. The results indicate that the highest accuracy (83%) is achieved by the Isolation Forest classifier. Furthermore, to validate the results, the approach is applied to the Carnegie Mellon University (CMU) benchmark dataset, achieving an improved accuracy of 94%. We also tried mouse dynamics for authentication, but its subpar performance led us to exclude it from our approach.
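To give a sense of what keystroke-based features can look like, the sketch below derives two timing features that are commonly used in keystroke dynamics: dwell time (how long a key is held) and flight time (the pause between releasing one key and pressing the next). The abstract does not state which features the authors engineered, so the event format, the function name, and the sample timings are assumptions for illustration only.

```python
def keystroke_features(events):
    """Compute dwell and flight times from ordered key events.

    events: list of (key, press_time_ms, release_time_ms), ordered by
    press time. Returns (dwell_times, flight_times) in milliseconds.
    """
    # Dwell: release minus press for each key.
    dwell = [release - press for _, press, release in events]

    # Flight: next key's press minus the current key's release.
    # This can be negative when keys overlap during fast typing.
    flight = [
        events[i + 1][1] - events[i][2]
        for i in range(len(events) - 1)
    ]
    return dwell, flight


# Hypothetical timings for three consecutive key presses.
events = [("a", 0, 100), ("b", 250, 330), ("c", 500, 620)]
dwell, flight = keystroke_features(events)
print(dwell)
print(flight)
```

Feature vectors of this kind could then be passed to a classifier or anomaly detector; with n keys there are n dwell values and n - 1 flight values, which is one reason keystroke datasets grow high-dimensional.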
DOCUMENT