The current set of research methods on ictresearchmethods.nl contains only one research method that refers to machine learning: the “Data analytics” method in the “Lab” strategy. This does not reflect the way of working in ML projects, where Data Analytics is not a method to answer one question but the main goal of the project. For ML projects, the Data Analytics method should be divided into several smaller steps, each becoming a method of its own. In other words, we should treat the Data Analytics (or, more appropriately, ML engineering) process in the same way the software engineering process is treated in the framework. In the remainder of this post I will briefly discuss each of the existing research methods and how they apply to ML projects. The methods are organized by strategy. In the discussion I will give pointers to relevant tools or literature for ML projects.
LINK
Over the past two years I have conducted an extensive literature and tool review to answer the question: “What should software engineers learn about building production-ready machine learning systems?” During my research I noted that, because the discipline of building production-ready machine learning systems is so new, it is not easy to get the terminology straight. People write about it from different perspectives and backgrounds and have not yet found each other to join forces. At the same time, the field is moving fast and is far from mature. My focus on material that is ready to be used with our bachelor-level students (applied software engineers, profession-oriented education) helped me to consolidate everything I have found into a body of knowledge for building production-ready machine learning (ML) systems. In this post I will first define the discipline and introduce the terminology for AI engineering and MLOps.
LINK
Both Software Engineering and Machine Learning have become recognized disciplines. In this article I analyse the combination of the two: engineering of machine learning applications. I believe the systematic way of working for machine learning applications differs at certain points from traditional (rule-based) software engineering. The question I set out to investigate is “How does software engineering change when we develop machine learning applications?” This question is not easy to answer and turns out to be rather new, with few publications. This article collects what I have found so far.
LINK
This chapter discusses how to build production-ready machine learning systems. There are several challenges involved in accomplishing this, each with its specific solutions regarding practices and tool support. The chapter presents those solutions and introduces MLOps (machine learning operations, also called machine learning engineering) as an overarching and integrated approach in which data engineers, data scientists, software engineers, and operations engineers integrate their activities to implement validated machine learning applications managed from initial idea to daily operation in a production environment. This approach combines agile software engineering processes with the machine learning-specific workflow. Following the principles of MLOps is paramount in building high-quality production-ready machine learning systems. The current state of MLOps is discussed in terms of best practices and tool support. The chapter ends by describing future developments that are bound to improve and extend the tool support for implementing an MLOps approach.
LINK
Recently, the job market for Artificial Intelligence (AI) engineers has exploded. Since the role of AI engineer is relatively new, limited research has been done on the requirements as set by the industry. Moreover, the definition of an AI engineer is less established than for a data scientist or a software engineer. In this study we explore, based on job ads, the requirements from the job market for the position of AI engineer in The Netherlands. We retrieved job ad data between April 2018 and April 2021 from a large job ad database, Jobfeed from TextKernel. The job ads were selected with a process similar to the selection of primary studies in a literature review. We characterize the 367 resulting job ads based on metadata such as publication date, industry/sector, educational background and job titles. To answer our research questions we have further coded 125 job ads manually. The job tasks of AI engineers are concentrated in five categories: business understanding, data engineering, modeling, software development and operations engineering. Companies ask for AI engineers with different profiles: 1) data science engineer with focus on modeling, 2) AI software engineer with focus on software development, 3) generalist AI engineer with focus on both models and software. Furthermore, we present the tools and technologies mentioned in the selected job ads, and the soft skills. Our research helps to understand the expectations companies have for professionals building AI-enabled systems. Understanding these expectations is crucial both for prospective AI engineers and educational institutions in charge of training those prospective engineers. Our research also helps to better define the profession of AI engineering. We do this by proposing an extended AI engineering life-cycle that includes a business understanding phase.
LINK
In my previous post on AI engineering I defined the concepts involved in this new discipline and explained that, given the current state of the practice, AI engineers could also be called machine learning (ML) engineers. In this post I would like to 1) define our view on the profession of applied AI engineer and 2) present the toolbox of an AI engineer with tools, methods and techniques to address the challenges AI engineers typically face. I end this post with a short overview of related work and future directions. Attached to it is an extensive list of references and additional reading material.
LINK
Over the past three years we have built a practice-oriented, bachelor-level educational programme for software engineers to specialize as AI engineers. The experience with this programme and the practical assignments our students execute in industry have given us valuable insights into the profession of AI engineer. In this paper we discuss our programme and the lessons learned for industry and research.
MULTIFILE
In this post I give an overview of the theory, tools, frameworks and best practices I have found until now around the testing (and debugging) of machine learning applications. I will start by giving an overview of the specificities of testing machine learning applications.
LINK
The prevention and diagnosis of frailty syndrome (FS) in cardiac patients requires innovative systems to support medical personnel, patient adherence, and self-care behavior. To do so, modern medicine uses a supervised machine learning (ML) approach to study the psychosocial domains of frailty in cardiac patients with heart failure (HF). This study aimed to determine the absolute and relative diagnostic importance of the individual components of the Tilburg Frailty Indicator (TFI) questionnaire in patients with HF. An exploratory analysis was performed using machine learning algorithms and the permutation method to determine the absolute importance of frailty components in HF. Based on the TFI data, which contain physical and psychosocial components, machine learning models were built using three algorithms: a decision tree, a random decision forest, and an AdaBoost classifier. The absolute weights were used to make pairwise comparisons between the variables and obtain relative diagnostic importance. The analysis of HF patients’ responses showed that the psychological variable TFI20, diagnosing low mood, was more diagnostically important than the variables from the physical domain: lack of strength in the hands and physical fatigue. The psychological variable TFI21, linked with agitation and irritability, was diagnostically more important than all three physical variables considered: walking difficulties, lack of hand strength, and physical fatigue. In the case of the two remaining variables from the psychological domain (TFI19, TFI22), and for all variables from the social domain, the results do not allow for the rejection of the null hypothesis. From a long-term perspective, the ML-based frailty approach can support healthcare professionals, including psychologists and social workers, in drawing their attention to the nonphysical origins of HF.
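The permutation method mentioned in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic rather than TFI responses, the three feature columns are hypothetical, and a fixed hand-written rule (`toy_model`) stands in for the decision tree, random forest, and AdaBoost models. The idea shown is only the general one: shuffle one variable at a time and measure how much the model's accuracy drops.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, leaving all other columns untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained classifier.

    Uses columns 0 and 1 only; column 2 is deliberately ignored.
    """
    return 1 if 2 * row[0] + row[1] >= 2 else 0


# Synthetic questionnaire-like data (assumption: not real TFI data).
data_rng = random.Random(42)
X = [[data_rng.randint(0, 1), data_rng.randint(0, 2), data_rng.randint(0, 1)]
     for _ in range(200)]
y = [toy_model(row) for row in X]  # labels the toy model fits perfectly

importances = permutation_importance(toy_model, X, y)
for name, value in zip(["item_a", "item_b", "item_c"], importances):
    print(f"{name}: {value:.3f}")
```

In this sketch the ignored column gets an importance of exactly zero, because shuffling it cannot change any prediction. The absolute importances obtained this way are the kind of values that could then be compared pairwise to rank variables against each other.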
DOCUMENT
A modified genetic algorithm (MGA) optimization procedure, alongside time series machine learning (ML) classifiers, is proposed to minimize handovers in a digital twin-based visible light communication (VLC) system. Frequent handovers have a direct impact on the overall performance of the VLC system due to the inherent connection downtime of a handover process. The handover scheme proposed in this article considers the receiver trajectory information to minimize handovers, maintaining the system performance below the forward error correction limit. Simulation results indicate that the proposed scheme outperforms a power-based handover scheme, achieving a handover reduction of 42.47%. Therefore, combining the MGA with the ML models is an effective means of minimizing handovers, as well as improving overall VLC system performance.
DOCUMENT