Among other things, learning to write entails learning how to use complex sentences effectively in discourse. Some research has therefore focused on relating measures of syntactic complexity to text quality. Not only does the existing research on this topic appear inconclusive, but most of it has also been conducted in English L1 contexts. This is potentially problematic, since the relevant syntactic indices may not be the same across languages. The current study is the first to explore which syntactic features predict text quality in Dutch secondary school students’ argumentative writing. To this end, the quality of 125 argumentative essays written by students was rated and the syntactic features of the texts were analyzed. A multilevel regression analysis was then used to investigate which features contribute to text quality. The resulting model (explaining 14.5% of the variance in text quality) shows that the relative number of finite clauses and the ratio of relative clauses to finite clauses positively predict text quality. Discrepancies between our findings and those of previous studies indicate that the relations between syntactic features and text quality may vary with factors such as language and genre. Additional (cross-linguistic) research is needed to gain a more complete understanding of the relationships between syntactic constructions and text quality and of the potential moderating role of language and genre.
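As a hedged illustration of the kind of analysis described here, such a model could be specified in R with the lme4 package; the variable names, the grouping factor, and the data file are assumptions for illustration, not the study's actual materials.

```r
# Minimal sketch of a multilevel regression of text quality on syntactic
# features, assuming one row per essay and essays nested within classes.
# All variable and file names are hypothetical.
library(lme4)

essays <- read.csv("essays.csv")

# quality:          holistic text quality rating per essay
# finite_per_100w:  finite clauses per 100 words (relative number of finite clauses)
# rel_per_finite:   relative clauses divided by finite clauses
# class:            grouping factor for the random intercept
model <- lmer(quality ~ finite_per_100w + rel_per_finite + (1 | class),
              data = essays)

summary(model)  # fixed-effect estimates for the two syntactic predictors
```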
We present a number of methodological recommendations concerning the online evaluation of avatars for text-to-sign translation, focusing on the structure, format and length of the questionnaire, as well as methods for eliciting and faithfully transcribing responses.
Research into automatic text simplification aims to promote access to information for all members of society. To facilitate generalizability, simplification research often abstracts away from specific use cases and targets a prototypical reader and an underspecified content creator. In this paper, we consider a real-world use case (simplification technology for use in Dutch municipalities) and identify the needs of the content creators and the target audiences in this scenario. The stakeholders envision a system that (a) assists the human writer without taking over the task; (b) provides diverse outputs, tailored for specific target audiences; and (c) explains the suggestions that it outputs. These requirements call for technology that is characterized by modularity, explainability, and variability. We argue that these are important research directions that require further exploration.
This method paper presents a template solution for text mining of scientific literature using the R tm package. The literature to be analyzed can be collected manually or automatically using the code provided with this paper. Once the literature is collected, text mining is conducted in three steps:
• loading and cleaning of the text from the articles,
• processing, statistical analysis, and clustering, and
• presentation of the results using generalized and tailor-made visualizations.
These text mining steps can be applied to a single group of documents, to multiple groups, or to a time series of document groups. References are provided to three published peer-reviewed articles that use the presented text mining methodology. The main advantages of our method are: (1) its suitability for both research and educational purposes, (2) its compliance with the Findable, Accessible, Interoperable and Reusable (FAIR) principles, and (3) the availability of the code and example data on GitHub under the open-source Apache V2 license.
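A minimal sketch of the three steps with the tm package might look as follows; the input directory, sparsity threshold, and number of clusters are assumptions for illustration, not the paper's published code.

```r
# Minimal sketch of the three text mining steps using the R tm package.
# The file paths and parameter values are hypothetical.
library(tm)

# Step 1: load and clean the text from the articles
corpus <- VCorpus(DirSource("articles/", pattern = "\\.txt$"))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Step 2: process, compute term statistics, and cluster the documents
dtm  <- DocumentTermMatrix(corpus)
dtm  <- removeSparseTerms(dtm, 0.90)                  # drop very sparse terms
freq <- sort(colSums(as.matrix(dtm)), decreasing = TRUE)
clusters <- kmeans(as.matrix(dtm), centers = 3)

# Step 3: present the results, here as a simple term frequency plot
barplot(head(freq, 10), las = 2, main = "Most frequent terms")
```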
How do we bridge the gap between the accountant and the data specialist? Part 1 of a three-part series on data analysis. This first part highlights six types of data analysis.
With the proliferation of misinformation on the web, automatic misinformation detection methods are becoming an increasingly important subject of study. Large language models have produced the best results among content-based methods, which rely on the text of the article rather than on metadata or network features. However, fine-tuning such a model requires significant training data, which has led to the automatic creation of large-scale misinformation detection datasets. In these datasets, articles are not labelled directly. Rather, each news site is labelled for reliability by an established fact-checking organisation, and every article is subsequently assigned the corresponding label based on the reliability score of the news source in question. A recent paper has explored the biases present in one such dataset, NELA-GT-2018, and shown that the models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. We confirm part of their findings. Apart from studying the characteristics and potential biases of such datasets, we also consider it important to examine how the model architecture influences the results. We therefore explore which text features or combinations of features are learned by models based on contextual word embeddings, as opposed to basic bag-of-words models. To elucidate this, we perform extensive error analysis, aided by the SHAP post-hoc explanation technique, on a debiased portion of the dataset. We validate the explanation technique on our inherently interpretable baseline model.
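To illustrate what an inherently interpretable bag-of-words baseline looks like (this sketch is our illustration under assumed data and column names, not the paper's actual pipeline), a regularized logistic regression over TF-IDF features exposes per-term coefficients, and for a linear model the SHAP value of a term reduces to its coefficient times the term's deviation from its mean:

```r
# Minimal sketch of an interpretable bag-of-words baseline for source
# reliability classification. The data file and column names are hypothetical.
library(tm)
library(glmnet)

articles <- read.csv("articles.csv")      # columns: text, reliable (0/1)

corpus <- VCorpus(VectorSource(articles$text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
dtm <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))
X <- as.matrix(dtm)

# L1-regularised logistic regression: the nonzero per-term coefficients
# are directly readable as evidence for or against reliability.
fit  <- cv.glmnet(X, articles$reliable, family = "binomial")
beta <- as.numeric(coef(fit, s = "lambda.min"))[-1]   # drop the intercept

# For a linear model (assuming independent features), the SHAP value of
# term j in document i is beta_j * (x_ij - mean(x_j)).
shap <- sweep(sweep(X, 2, colMeans(X)), 2, beta, `*`)
```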
How do we bridge the gap between the accountant and the data specialist? Part 2 of a three-part series on data analysis. An earlier article explained the outer ring of the 'VTA model'; this follow-up article discusses the two inner rings.
The main goal of this study was to investigate whether a computational analysis of text data from the National Student Survey (NSS) can add value to the existing, manual analysis. The results showed that the computational analysis of the texts from the open questions of the NSS contains information that enriches the results of the standard quantitative analysis of the NSS.
In this paper we describe our work in progress on the development of a set of criteria to predict text difficulty in Sign Language of the Netherlands (NGT). These texts are used in a four-year bachelor program, which is being brought in line with the Common European Framework of Reference for Languages (Council of Europe, 2001). Production and interaction proficiency are assessed through the NGT Functional Assessment instrument, adapted from the Sign Language Proficiency Interview (Caccamise & Samar, 2009). With this test we were able to determine that after one year of NGT study students produce NGT at CEFR level A2, after two years they sign at level B1, and after four years they are proficient in NGT at CEFR level B2. This allowed us to match NGT texts to the CEFR levels of students at specific stages of their studies. These texts were then analysed for sign familiarity, morpheme-sign rate, use of space, and use of non-manual signals. All of these elements appear to be relevant for a good alignment between the difficulty of signed NGT texts and the targeted CEFR level, although only the morpheme-sign rate appears to be a decisive indicator.
Although governments are investing heavily in big data analytics, reports show mixed results in terms of performance. Whilst big data analytics capability has provided a valuable lens in business and seems useful for the public sector, there is little knowledge of its relationship with governmental performance. This study aims to explain how big data analytics capability leads to governmental performance. Using a survey research methodology, an integrated conceptual model is proposed that highlights a comprehensive set of big data analytics resources influencing governmental performance. The conceptual model was developed based on prior literature. Using a PLS-SEM approach, the results strongly support the posited hypotheses. Big data analytics capability has a strong impact on governmental efficiency, effectiveness, and fairness. The findings of this paper confirm the imperative role of big data analytics capability in governmental performance in the public sector, which earlier studies found in the private sector. This study also validates measures of governmental performance.
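As a hedged sketch of how such a structural model could be estimated with PLS-SEM in R (the plspm package, the construct names, and the item columns are our assumptions, not the study's actual instrument):

```r
# Minimal PLS-SEM sketch: big data analytics capability (BDAC) predicting
# efficiency, effectiveness, and fairness. Data and item columns are hypothetical.
library(plspm)

survey_data <- read.csv("survey.csv")   # hypothetical survey responses

# Inner (structural) model as a lower-triangular path matrix:
# BDAC -> Efficiency, BDAC -> Effectiveness, BDAC -> Fairness
BDAC <- c(0, 0, 0, 0)
EFFI <- c(1, 0, 0, 0)
EFFE <- c(1, 0, 0, 0)
FAIR <- c(1, 0, 0, 0)
inner <- rbind(BDAC, EFFI, EFFE, FAIR)
colnames(inner) <- rownames(inner)

# Outer (measurement) model: survey item columns per construct, reflective mode
blocks <- list(1:5, 6:8, 9:11, 12:14)
modes  <- rep("A", 4)

fit <- plspm(survey_data, inner, blocks, modes = modes)
summary(fit)   # path coefficients and R-squared for the endogenous constructs
```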