The built environment requires energy-flexible buildings to reduce peak loads and to maximize the use of (decentralized) renewable energy sources. The challenge is to arrive at smart control strategies that respond to the increasing variations in both energy demand and energy supply. This enables grid integration in existing energy networks with limited capacity and maximizes the use of decentralized sustainable generation. Buildings can play a key role in optimizing grid capacity by applying demand-side management control. To adjust the grid energy demand profile of a building without compromising user requirements, the building must acquire some energy flexibility capacity. The main ambition of Brains for Buildings Work Package 2 is to develop smart control strategies that use the operational flexibility of non-residential buildings to minimize energy costs, reduce emissions and avoid spikes in power network load, without compromising comfort levels. To realize this ambition, the following key components will be developed within B4B WP2: (A) open-source HVAC and electric services models, (B) energy demand prediction models and (C) flexibility management control models. This report describes the first two key components, (A) and (B), and presents different prediction models covering various building components. The models are of three types: white-box, grey-box and black-box models. Each model is presented in a separate chapter. Each chapter starts with the goal of the prediction model, followed by a description of the model and the results obtained when applied to a case study. The models developed are: two white-box approaches, (1) white-box models based on Modelica libraries for energy prediction of a building and its components and (2) a hybrid predictive digital twin based on white-box building models to predict the dynamic energy response of the building and its components; (3) a grey-box model that uses CO₂ monitoring data to derive either ventilation flow rate or occupancy; (4) models to predict the heating demand of a building; (5) a feedforward neural network model to predict building energy use and its uncertainty; and (6) a model to predict PV solar production. The first model aims to predict the energy use and energy production pattern of different building configurations with open-source software, OpenModelica, and the open-source IBPSA libraries. The white-box simulation results are used to produce design and control advice for increasing building energy flexibility. The use of the libraries for building a model was first tested on a simple residential unit and is now being tested on a non-residential building, the Haagse Hogeschool building. The lessons learned show that it is possible to model a building using a combination of libraries; however, developing the model is very time-consuming. The test also highlighted the need to define standard scenarios for testing energy flexibility and the need for a practical visualization if the simulation results are to be used to advise on potentially increasing energy flexibility. The goal of the hybrid model, which combines a white-box model of the building and its systems with a data-driven model of user behaviour, is to predict the energy demand and energy supply of a building.
The model's application focuses on the use case of the TNO building at Stieltjesweg in Delft during a summer period, with a specific emphasis on cooling demand. Preliminary analysis shows that the monitored building behaviour is in line with the simulation results. Development is currently in progress to improve the model predictions by including solar shading from surrounding buildings and models of automatic shading devices, and by calibrating the model, including the energy use of the chiller. The goal of the third model is to derive the recent and current ventilation flow rate over time from monitoring data on CO₂ concentration and occupancy, and, conversely, to derive recent and current occupancy over time from monitoring data on CO₂ concentration and ventilation flow rate. The grey-box model used is based on the GEKKO Python tool. The model was tested with data from six office rooms at Windesheim University of Applied Sciences. The model had low precision when deriving the ventilation flow rate, especially at low CO₂ concentrations, and good precision when deriving occupancy from CO₂ concentration and ventilation flow rate. Further research is needed to determine whether these findings apply in different situations, such as meeting spaces and classrooms. The goal of the fourth chapter is to compare a simplified white-box model and a black-box model for predicting the heating energy use of a building. The aim is to integrate these prediction models into the energy management systems of SME buildings. The two models were tested with data from a residential unit, since data from an SME building were not available at the time of the analysis. The prediction models developed have low accuracy and in their current form cannot be integrated into an energy management system; in general, the black-box model achieved higher accuracy than the white-box model. The goal of the fifth model is to predict the energy use in a building using a black-box model and to quantify the uncertainty in the prediction. The black-box model is based on a feedforward neural network. The model was tested with data from two buildings: an educational building and a commercial building. The strength of the model lies in the ensemble prediction and in treating the uncertainty intrinsically present in the data as an absolute deviation. Using a rolling window technique, the model can predict energy use and its uncertainty while incorporating possible changes in building use. Testing on two different cases demonstrates the applicability of the model for different types of buildings. The goal of the sixth and last model is to predict the energy production of PV panels in a building using a black-box model. The choice to model the PV panels is based on an analysis of the main contributors to peak energy demand and peak energy delivery in the case of the DWA office building. On a fault-free test set, the model meets the requirements for a calibrated model according to the FEMP and ASHRAE criteria for the error metrics; according to the IPMVP criteria, the model should be improved further. The performance metrics are in the same range as values found in the literature. For accurate peak prediction, a year of training data is recommended in the given approach without lagged variables.
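To make the grey-box approach of the third model more concrete, the sketch below shows how a single-zone CO₂ mass balance can be fitted with the GEKKO Python tool to estimate an unknown ventilation flow rate from measured CO₂ and occupancy. This is a minimal illustration only: the room volume, CO₂ generation rate per person and the synthetic measurement series are assumptions, not values from the report.

```python
# Minimal sketch (assumed values, not from the report): estimate a ventilation
# flow rate Q from measured CO2 and occupancy with a single-zone mass balance,
#   V * dC/dt = G * n * 1e6 + Q * (C_out - C)   [C in ppm, Q and G in m3/h]
import numpy as np
from gekko import GEKKO

t = np.linspace(0.0, 8.0, 97)                           # 8 hours, 5-minute steps
co2_meas = 420 + 380 * (1 - np.exp(-t / 2.0))           # placeholder CO2 series [ppm]
occupancy = np.where((t > 0.5) & (t < 7.0), 2.0, 0.0)   # placeholder occupancy [persons]

V = 75.0       # room volume [m3] (assumed)
G = 0.018      # CO2 generation per person [m3/h] (assumed)
C_out = 420.0  # outdoor CO2 concentration [ppm] (assumed)

m = GEKKO(remote=False)
m.time = t
n = m.Param(value=occupancy)        # measured occupancy profile
C = m.CV(value=co2_meas)            # measured CO2 to be matched
C.FSTATUS = 1
Q = m.FV(value=50.0, lb=0.0)        # unknown ventilation flow [m3/h], estimated
Q.STATUS = 1

# Single-zone CO2 balance: occupant generation plus ventilation exchange
m.Equation(V * C.dt() == G * n * 1e6 + Q * (C_out - C))

m.options.IMODE = 5      # dynamic estimation
m.options.EV_TYPE = 2    # least-squares objective
m.solve(disp=False)
print(f"Estimated ventilation flow rate: {Q.value[0]:.1f} m3/h")
```

Deriving occupancy instead of ventilation would swap the roles: the flow rate becomes a measured parameter and occupancy the estimated variable.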
This report presents the results and lessons learned from implementing white-box, grey-box and black-box models to predict the energy use and energy production of buildings, or variables directly related to them. Each of the models has its advantages and disadvantages. Further research along these lines is needed to develop the potential of this approach.
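For reference, the FEMP, ASHRAE and IPMVP calibration checks mentioned above are commonly expressed through two error metrics, the normalized mean bias error (NMBE) and the coefficient of variation of the root mean squared error (CV(RMSE)). The sketch below computes both under one common convention; the acceptance thresholds themselves are not reproduced here and should be taken from the respective guideline documents.

```python
# Minimal sketch of the two error metrics used in FEMP / ASHRAE / IPMVP
# calibration checks; the hourly PV values below are placeholders.
import numpy as np

def nmbe(measured, predicted):
    """Normalized mean bias error in percent (simple n-based convention)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.sum(measured - predicted) / (len(measured) * measured.mean())

def cv_rmse(measured, predicted):
    """Coefficient of variation of the RMSE in percent."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((measured - predicted) ** 2))
    return 100.0 * rmse / measured.mean()

y_true = np.array([0.0, 1.2, 3.4, 5.1, 4.8, 2.0, 0.3])  # placeholder hourly PV [kWh]
y_pred = np.array([0.1, 1.0, 3.6, 4.9, 4.5, 2.2, 0.2])
print(f"NMBE = {nmbe(y_true, y_pred):.1f} %, CV(RMSE) = {cv_rmse(y_true, y_pred):.1f} %")
```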
DOCUMENT
The growing availability of data offers plenty of opportunities for data-driven innovation of business models for SMEs such as interactive media companies. However, SMEs lack the knowledge and processes to translate data into attractive propositions and design viable data-driven business models. In this paper we develop and evaluate a practical method for designing data-driven business models (DDBM) in the context of interactive media companies. The development follows a design science research approach. The main result is a step-by-step approach for designing a DDBM, supported by pattern cards and game boards. The steps consider required data sources and data activities, actors and the value network, the revenue model, and implementation aspects. Preliminary evaluation shows that the method works as a discussion tool to uncover assumptions and make the assessments needed to create a substantiated data-driven business model.
MULTIFILE
With the proliferation of misinformation on the web, automatic misinformation detection methods are becoming an increasingly important subject of study. Large language models have produced the best results among content-based methods, which rely on the text of the article rather than on metadata or network features. However, fine-tuning such a model requires significant training data, which has led to the automatic creation of large-scale misinformation detection datasets. In these datasets, articles are not labelled directly; rather, each news site is labelled for reliability by an established fact-checking organisation, and every article is subsequently assigned the corresponding label based on the reliability score of the news source in question. A recent paper explored the biases present in one such dataset, NELA-GT-2018, and showed that the models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. We confirm part of their findings. Apart from studying the characteristics and potential biases of the datasets, we also find it important to examine how the model architecture influences the results. We therefore explore which text features, or combinations of features, are learned by models based on contextual word embeddings as opposed to basic bag-of-words models. To elucidate this, we perform extensive error analysis, aided by the SHAP post-hoc explanation technique, on a debiased portion of the dataset. We validate the explanation technique on our inherently interpretable baseline model.
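As an illustration of the kind of interpretable bag-of-words baseline and SHAP explanation referred to above, the minimal sketch below fits a TF-IDF logistic regression and computes per-token SHAP contributions. The two placeholder articles and source-level labels are assumptions for the example and are not part of the NELA-GT-2018 pipeline described in the paper.

```python
# Minimal sketch: interpretable bag-of-words baseline with SHAP contributions.
# The corpus and labels are placeholders, not data from NELA-GT-2018.
import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["placeholder article from a reliable source",
         "placeholder article from an unreliable source"]
labels = [0, 1]  # 0 = reliable source, 1 = unreliable source

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()   # dense for this tiny example
clf = LogisticRegression().fit(X, labels)

# Per-token SHAP contributions for the linear baseline on the first article
explainer = shap.LinearExplainer(clf, X)
shap_values = explainer.shap_values(X)
print(dict(zip(vectorizer.get_feature_names_out(), shap_values[0])))
```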
DOCUMENT
Despite the numerous business benefits of data science, the number of data science models in production is limited. Data science model deployment presents many challenges, and many organisations have little model deployment knowledge. This research studied five model deployments in a Dutch government organisation. The study revealed that, as a result of model deployment, a data science subprocess is added to the target business process, the model itself can be adapted, model maintenance is incorporated into the model development process, and a feedback loop is established between the target business process and the model development process. These model deployment effects and the related deployment challenges differ between strategic and operational target business processes. Based on these findings, guidelines are formulated that can form a basis for future principles on how to successfully deploy data science models. Organisations can use these guidelines as suggestions to solve their own model deployment challenges.
DOCUMENT
Introduction: The growing availability of data offers plenty of opportunities for data-driven innovation of business models. This certainly applies to interactive media companies. Interactive media companies are engaged in the development, provisioning, and exploitation of interactive media services and applications. Through the service interactions, they may collect large amounts of data which can be used to enhance applications or even define new propositions and business models. According to Lippell (2016), media companies can publish content in more sophisticated ways. They can build a deeper and more engaging customer relationship based on a deeper understanding of their users. Indeed, research from Weill & Woerner (2015) suggests that companies involved in the digital ecosystem that understand their customers better than their average competitor have significantly higher profit margins than their industry averages. Moreover, the same research suggests that businesses need to think more broadly about their position in the ecosystem. Open innovation and collaboration are essential for new growth, for example combining data within and across industries (Parmar et al., 2014). However, according to Mathis and Köbler (2016), these opportunities remain largely untapped, as especially SMEs lack the knowledge and processes to translate data into attractive propositions and design viable data-driven business models (DDBM). In this paper, we investigate how interactive media companies can structurally gain more insight and value from data and how they can develop DDBM. We define a DDBM as a business model relying on data as a key resource (Hartmann et al., 2016).
DOCUMENT
Research and advisory companies, consultants and system integrators all predict that a lot of money will be earned with decision management (business rules, algorithms and analytics). But how can you actually make money with decision management? In other words, which business models are available? In this article, we present seven business models for decision management.
LINK
Recent years have seen a massive growth in ethical and legal frameworks to govern data science practices. Yet one of the core questions associated with such frameworks is the extent to which they are implemented in practice. A particularly interesting case in this context concerns public officials, for whom higher standards typically apply. We therefore try to understand how ethical and legal frameworks influence the everyday practices around data and algorithms of public sector data professionals. This paper looks at two cases: public sector data professionals (1) at municipalities in the Netherlands and (2) at the Netherlands Police. We compare these two cases based on an analytical research framework that we develop in this article to help understand everyday professional practices. We conclude that there is a wide gap between legal and ethical governance rules and everyday practices.
MULTIFILE
Background: Advanced statistical modeling techniques may help predict health outcomes. However, these modeling techniques do not always outperform traditional techniques such as regression. In this study, external validation was carried out for five modeling strategies for the prediction of disability in community-dwelling older people in the Netherlands. Methods: We analyzed data from five studies of community-dwelling older people in the Netherlands. For the prediction of the total disability score, measured with the Groningen Activity Restriction Scale (GARS), we used fourteen predictors measured with the Tilburg Frailty Indicator (TFI). Both the TFI and the GARS are self-report questionnaires. Five statistical modeling techniques were evaluated: general linear model (GLM), support vector machine (SVM), neural net (NN), recursive partitioning (RP), and random forest (RF). Each model was developed on one of the five data sets and then applied to each of the four remaining data sets. We assessed the performance of the models with calibration characteristics, the correlation coefficient, and the root mean squared error. Results: The GLM, SVM, RP, and RF models showed satisfactory performance characteristics when validated on the validation data sets. All models showed poor performance characteristics for the deviating data set, both for development and validation, due to its deviating baseline characteristics compared with the other data sets. Conclusion: The performance of four models (GLM, SVM, RP, RF) on the development data sets was satisfactory. This was also the case for the validation data sets, except when these models were developed on the deviating data set. The NN models showed a much worse performance on the validation data sets than on the development data sets.
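A minimal sketch of the external-validation design described above (develop a model on one cohort, then evaluate it on each of the remaining cohorts) is shown below, using scikit-learn stand-ins for two of the five techniques. The dataset dictionary, TFI predictor columns and GARS target column are hypothetical placeholders, not the study data.

```python
# Minimal sketch of external validation across cohorts: fit on one data set,
# evaluate on the others. `datasets`, `predictors` and `target` are placeholders.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

def external_validation(datasets, predictors, target):
    """datasets: dict mapping study name -> DataFrame with predictor and target columns."""
    models = {
        "GLM": LinearRegression(),
        "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    }
    rows = []
    for dev_name, dev_df in datasets.items():
        for model_name, model in models.items():
            model.fit(dev_df[predictors], dev_df[target])       # develop on one cohort
            for val_name, val_df in datasets.items():
                if val_name == dev_name:
                    continue                                     # external validation only
                pred = model.predict(val_df[predictors])
                err = val_df[target].to_numpy() - pred
                rows.append({
                    "developed_on": dev_name,
                    "validated_on": val_name,
                    "model": model_name,
                    "r": float(np.corrcoef(val_df[target], pred)[0, 1]),
                    "rmse": float(np.sqrt(np.mean(err ** 2))),
                })
    return pd.DataFrame(rows)
```

Called with five data frames keyed by study name, this returns one row per development/validation pair with the correlation coefficient and RMSE, mirroring the performance assessment described in the abstract.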
DOCUMENT
The Aeres Hogeschool Dronten project aims to generate new insights, better farm management and more efficient supply chains, focused on economic and ecological sustainability, by sharing and analyzing growers' data within a group of thirteen growers. To this end, a data infrastructure is being set up that supports growers in collecting, sharing and analyzing data and gives them access to more complex analysis techniques. The project intends to train a group of growers to use the infrastructure and tools and to share and analyze data jointly in order to improve cultivation. By the end of the project, concrete improvements are expected in inputs and yield in potato cultivation. The project focused on investigating how data from agricultural entrepreneurs in Flevoland can be used and shared to achieve economic and ecological improvements. The agricultural sector collects more and more data on the variables that influence crop growth and storage, which can be used to make farming practice more sustainable. However, the use of data is still in its infancy, and decisions are often taken on the basis of advice from external commercial parties. Sharing data is also still a sensitive matter. The project aims to lower these barriers by having growers exchange more data with each other and with partners in the chain. The data infrastructure is being set up for a group of 15-20 growers who are willing to steer cultivation and/or storage on the basis of available object-specific and up-to-date data. The data can be shared, so that the farms can be improved, and the infrastructure gives the growers access to more complex analysis techniques. The project is divided into three groups based on location in the province: a group of growers around a pilot farm in Dronten, a group around a pilot farm in Swifterbant, and a group in the NOP. At the start of the project, the three pilot farms carried out an inventory based on a questionnaire drawn up by Aeres to establish the minimum data available for participation in the project. Most of the requested data were already available, except at the pilot farm in the NOP; the missing data can be obtained from local weather stations or generated within the project by the project partners. In the agricultural sector, data on the factors that contribute to failures in precision agriculture are often missing, because the sector tends to think in terms of what does work rather than what does not. One way to counter this is to be aware of the missing data and to seek them out proactively, for example through research into the environmental impact of agriculture. This project has provided better insight into the effectiveness of inputs as well as into their impact on the environment.
The following improvements have been realized:
• Better insight into the timing of cultivation operations, so that the soil is spared.
• Better insight into the effects of crop rotations, making it possible to choose rotations with less impact while still achieving good financial results.
• Comparison allows inputs such as fertilizer and crop protection products to be used more effectively, so that, in addition to lower use, less run-off and leaching will occur.
• More effective use of inputs means that less land, energy and chemicals will be needed per kg of potatoes produced.
Keywords: farm digitalization, data, POP3, databoeren, precision agriculture. RVO case number: 17717000042
DOCUMENT
This final installment in our e-learning series offers a comprehensive look at the current impact and future potential of data science across industries. Using real-world examples such as medical image analysis and operational efficiencies at Rotterdam The Hague Airport, we showcase data science's transformative capabilities. The video also introduces the promise of Large Language Models (LLMs) such as ChatGPT and the simplification brought by Automated Machine Learning (AutoML). Emphasizing the blend of technology and human insight, we explore the evolving landscape of AI and data science for businesses.
VIDEO