Both Software Engineering and Machine Learning have become recognized disciplines. In this article I analyse the combination of the two: the engineering of machine learning applications. I believe the systematic way of working for machine learning applications differs at certain points from traditional (rule-based) software engineering. The question I set out to investigate is: "How does software engineering change when we develop machine learning applications?" This question is not easy to answer and turns out to be rather new, with few publications on the topic. This article collects what I have found so far.
In this post I give an overview of the theory, tools, frameworks and best practices I have found so far around the testing (and debugging) of machine learning applications. I start by outlining what is specific about testing machine learning applications.
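To make this concrete, below is a minimal sketch, written with pytest and scikit-learn, of two checks that are typical for machine learning code but unusual in rule-based software: asserting a statistical quality threshold on held-out data instead of an exact output, and a metamorphic (invariance) check. The dataset, model, and threshold are illustrative assumptions of mine, not something prescribed in the linked post.

```python
# A minimal sketch of two kinds of checks that are specific to machine
# learning applications: a minimum-quality threshold on a held-out set and
# an invariance (metamorphic) check. Dataset, model, and threshold are
# illustrative assumptions.
import numpy as np
import pytest
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


@pytest.fixture(scope="module")
def model_and_data():
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return model, X_test, y_test


def test_accuracy_above_threshold(model_and_data):
    # Unlike a conventional unit test, we assert a statistical property
    # (accuracy on held-out data) rather than an exact output.
    model, X_test, y_test = model_and_data
    assert model.score(X_test, y_test) >= 0.90  # threshold is an assumption


def test_prediction_invariance_to_small_noise(model_and_data):
    # Metamorphic check: predictions should not change under a tiny,
    # semantically irrelevant perturbation of the input.
    model, X_test, _ = model_and_data
    rng = np.random.default_rng(0)
    perturbed = X_test + rng.normal(scale=1e-6, size=X_test.shape)
    assert np.array_equal(model.predict(X_test), model.predict(perturbed))
```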
Machine learning models have proven to be reliable methods for classification tasks. However, little research has so far been conducted on classifying dwelling characteristics from smart meter and weather data. Insight into dwelling characteristics, which comprise the type of heating system used, the number of inhabitants, and the number of solar panels installed, can help create or improve policies for building new dwellings to a nearly zero-energy standard. This paper compares different supervised machine learning algorithms, namely Logistic Regression, Support Vector Machine, K-Nearest Neighbors, and Long Short-Term Memory (LSTM), as well as the methods used to implement these algorithms correctly. These methods include data pre-processing, model validation, and evaluation. The smart meter data used to train the machine learning algorithms was provided by Groene Mient. The models generated by the algorithms were compared on their performance. The results show that the LSTM performed best, with 96% accuracy. Cross-validation was used to validate the models, with 80% of the data used for training and 20% for testing. Evaluation metrics were used to produce classification reports, which indicate that the LSTM outperforms the compared models on the evaluation metrics for this specific problem.
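To illustrate the kind of pipeline the abstract describes, here is a minimal sketch (not the paper's actual code): pre-process the features, split the data 80%/20%, train Logistic Regression, a Support Vector Machine, and K-Nearest Neighbors, and compare them with classification reports. Synthetic data stands in for the Groene Mient smart meter data, which is not public, and the LSTM is left out because it requires a deep-learning framework and sequential input; every name and parameter below is an assumption for illustration.

```python
# Sketch of the comparison pipeline described in the abstract, using
# scikit-learn and synthetic placeholder data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder for smart meter + weather features and a dwelling
# characteristic label (e.g. heating system type).
X, y = make_classification(n_samples=2000, n_features=20, n_classes=3,
                           n_informative=8, random_state=0)

# 80% of the data for training, 20% for testing, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(),
                                         KNeighborsClassifier()),
}

# Train each model and print a classification report for comparison.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"=== {name} ===")
    print(classification_report(y_test, model.predict(X_test)))
```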