Service of SURF
© 2025 SURF
With the proliferation of misinformation on the web, automatic misinformation detection methods are becoming an increasingly important subject of study. Large language models have produced the best results among content-based methods, which rely on the text of the article rather than the metadata or network features. However, finetuning such a model requires significant training data, which has led to the automatic creation of large-scale misinformation detection datasets. In these datasets, articles are not labelled directly. Rather, each news site is labelled for reliability by an established fact-checking organisation and every article is subsequently assigned the corresponding label based on the reliability score of the news source in question. A recent paper has explored the biases present in one such dataset, NELA-GT-2018, and shown that the models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. We confirm a part of their findings. Apart from studying the characteristics and potential biases of the datasets, we also find it important to examine in what way the model architecture influences the results. We therefore explore which text features or combinations of features are learned by models based on contextual word embeddings as opposed to basic bag-of-words models. To elucidate this, we perform extensive error analysis aided by the SHAP post-hoc explanation technique on a debiased portion of the dataset. We validate the explanation technique on our inherently interpretable baseline model.
From the article: The ethics guidelines put forward by the AI High Level Expert Group (AI-HLEG) present a list of seven key requirements that Human-centered, trustworthy AI systems should meet. These guidelines are useful for the evaluation of AI systems, but can be complemented by applied methods and tools for the development of trustworthy AI systems in practice. In this position paper we propose a framework for translating the AI-HLEG ethics guidelines into the specific context within which an AI system operates. This approach aligns well with a set of Agile principles commonly employed in software engineering. http://ceur-ws.org/Vol-2659/
Presentatie van eerste versie manuscript 'How voters ' multiple identities affect their response to politicians ' moral violations ' door Annemarie Walter en David Redlawsk, bij het Department Politicologie en Internationale Betrekkingen van de Universiteit van Delaware op 17 maart 2021.
MULTIFILE