The main goal of this study was to investigate whether a computational analysis of text data from the National Student Survey (NSS) can add value to the existing manual analysis. The results showed that computational analysis of the texts from the open questions of the NSS yields information that enriches the results of the standard quantitative analysis of the NSS.
DOCUMENT
The goal of this study was therefore to test the idea that computationally analysing the open answers of the Fontys National Student Surveys (NSS) with a selection of standard text mining methods (Manning & Schütze 1999) will increase the value of these answers for educational quality assurance. It was expected that human effort and analysis time would decrease significantly. The text data (in Dutch) of several years of Fontys National Student Surveys (2013-2018) was provided to Fontys students of the minor Applied Data Science. The results of the analysis were to include topic and sentiment modelling across multiple years of survey data. Comparing multiple years was necessary to capture and visualize any trends that a human investigator may have missed while analysing the data by hand. During data cleaning all stop words and punctuation were removed, all text was converted to lower case, and names and inappropriate language (such as swear words) were deleted. About 80% of the 24,000 records were manually labelled with sentiment; the remainder was used to validate the algorithms. The machine learning analysis then followed these steps: training, testing, analysis of outcomes, and visualisation for better text comprehension. Students aimed to improve classification accuracy by applying multiple sentiment analysis algorithms and topic modelling methods. The models were chosen arbitrarily, with a preference for low model complexity. For reproducibility of our study, open-source tooling was used. One of these tools was based on Latent Dirichlet allocation (LDA). LDA is a generative statistical model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar (Blei, Ng & Jordan, 2003). For topic modelling the Gensim toolkit (Řehůřek, 2011) was used. Gensim is an open-source vector space modelling and topic modelling toolkit implemented in Python.
In addition, we recognized the absence of pretrained models for the Dutch language. To complete our prototype, a simple user interface was created in Python. This final step integrated our automated text analysis with visualisations of sentiments and topics. Remarkably, all extracted topics are related to themes defined by the NSS. This indicates that in general students’ answers are related to topics of interest for educational institutions. The list of words extracted for each topic is also relevant to that topic. Although most of the results require further interpretation by human experts, they indicate that computational analysis of the texts from the open questions of the NSS yields information which enriches the results of the standard quantitative analysis of the NSS.
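The data cleaning and the 80/20 split described above can be illustrated with a minimal Python sketch. It is not the students' actual code: the function names `clean` and `split_records`, the tiny stop-word set, and the random seed are all assumptions made for illustration, and the removal of names and inappropriate language is left out.

```python
import random
import string

# Hypothetical, deliberately tiny Dutch stop-word list (an assumption for
# illustration); a real list would be much longer.
STOP_WORDS = {"de", "het", "een", "en", "van", "is", "dat", "ik", "te", "in"}


def clean(text, stop_words=STOP_WORDS):
    """Lower-case the text, replace punctuation with spaces, drop stop words.

    Returns the remaining tokens as a list.
    """
    lowered = text.lower()
    # Map every ASCII punctuation character to a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    no_punct = lowered.translate(table)
    return [tok for tok in no_punct.split() if tok not in stop_words]


def split_records(records, labelled_fraction=0.8, seed=0):
    """Shuffle the records and split them into two parts.

    The first part (about 80% by default) stands for the manually labelled
    records, the second part for the remainder held out for validation.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * labelled_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `clean` on each open answer gives token lists that a topic or sentiment model could take as input, while `split_records` separates the records to be labelled from those kept for validation.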
DOCUMENT
The majority of houses in the Groningen gas field region, the largest in Europe, consist of unreinforced masonry. Because of their particular characteristics (cavity walls of different material, large openings, limited bearing walls in one direction, etc.) these houses are exceptionally vulnerable to shallow induced earthquakes, which have occurred frequently in the region during the last decade. The damage incurred by Groningen buildings due to induced earthquakes raises the question of whether small and sometimes invisible plastic deformations prior to a major earthquake affect the overall final response. This question is of high importance because its answer is associated with legal liability and with the consequences of the damage-claim procedures employed in the region. This paper presents, for the first time, evidence of cumulative damage from available experimental and numerical data reported in the literature. Furthermore, the available modelling tools are scrutinized in terms of their pros and cons in modelling cumulative damage in masonry. Results of full-scale shake-table tests, cyclic wall tests, complex 3D nonlinear time-history analyses, single degree of freedom (SDOF) analyses and finally wall element analyses under periodic dynamic loading have been used to better explain the phenomenon. It was concluded that user intervention is needed for most of the SDOF modelling tools if cumulative damage is to be modelled. Furthermore, the results of the cumulative damage in SDOF models are sensitive to the degradation parameters, which require calibration against experimental data. The overall results of numerical models, such as SDOF residual displacement or floor lateral displacements, may be misleading in understanding the damage accumulation. On the other hand, detailed discrete-element modelling is found to be computationally expensive but more consistent in terms of providing insights into real damage accumulation.
DOCUMENT