With the proliferation of misinformation on the web, automatic methods for detecting misinformation are becoming an increasingly important subject of study. If automatic misinformation detection is applied in a real-world setting, it is necessary to validate the methods being used. Large language models (LLMs) have produced the best results among text-based methods. However, fine-tuning such a model requires a significant amount of training data, which has led to the automatic creation of large-scale misinformation detection datasets. In this paper, we explore the biases present in one such dataset for misinformation detection in English, NELA-GT-2019. We find that models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. Furthermore, we use SHAP to interpret the outputs of a fine-tuned LLM and validate the explanation method using our inherently interpretable baseline. We critically analyze the suitability of SHAP for text applications by comparing the outputs of SHAP to the most important features from our logistic regression models.
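One simple way to compare SHAP outputs with the most important logistic regression features is to measure how much the two top-k feature sets overlap. The sketch below is only an illustration under stated assumptions: the abstract does not say how the comparison was carried out, the helper names `top_k` and `jaccard` are hypothetical, and the token scores are made-up numbers rather than results from the paper.

```python
def top_k(scores, k):
    """Return the k tokens with the largest absolute score."""
    ranked = sorted(scores, key=lambda token: abs(scores[token]), reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 1.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up illustrative values, NOT taken from the paper.
# lr_weights: logistic regression coefficients per token.
# shap_scores: mean SHAP attribution per token for the fine-tuned model.
lr_weights = {"breaking": 1.9, "reuters": -2.3, "shocking": 1.4, "said": -0.2}
shap_scores = {"breaking": 0.31, "reuters": -0.40, "said": 0.05, "shocking": 0.02}

# Share of tokens that both methods rank among their three most important.
overlap = jaccard(top_k(lr_weights, 3), top_k(shap_scores, 3))
print(f"top-3 overlap: {overlap:.2f}")
```

A low overlap would not by itself show that either method is wrong, since the two models may rely on different features; it only flags where a closer manual inspection is worthwhile.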
DOCUMENT
With the proliferation of misinformation on the web, automatic misinformation detection methods are becoming an increasingly important subject of study. Large language models have produced the best results among content-based methods, which rely on the text of the article rather than the metadata or network features. However, finetuning such a model requires significant training data, which has led to the automatic creation of large-scale misinformation detection datasets. In these datasets, articles are not labelled directly. Rather, each news site is labelled for reliability by an established fact-checking organisation and every article is subsequently assigned the corresponding label based on the reliability score of the news source in question. A recent paper has explored the biases present in one such dataset, NELA-GT-2018, and shown that the models are at least partly learning the stylistic and other features of different news sources rather than the features of unreliable news. We confirm a part of their findings. Apart from studying the characteristics and potential biases of the datasets, we also find it important to examine in what way the model architecture influences the results. We therefore explore which text features or combinations of features are learned by models based on contextual word embeddings as opposed to basic bag-of-words models. To elucidate this, we perform extensive error analysis aided by the SHAP post-hoc explanation technique on a debiased portion of the dataset. We validate the explanation technique on our inherently interpretable baseline model.
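The source-level labelling described above can be sketched as follows. This is a minimal illustration, not the dataset's actual pipeline: the `Article` type, the `label_articles` helper, the source names, and the 0/1 label encoding are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Article:
    source: str
    text: str


# Hypothetical source-level labels (0 = reliable, 1 = unreliable),
# standing in for the ratings of a fact-checking organisation.
SOURCE_LABELS = {"example-daily.test": 0, "example-rumours.test": 1}


def label_articles(articles, source_labels):
    """Assign each article the label of its source; skip unrated sources."""
    labelled = []
    for article in articles:
        label = source_labels.get(article.source)
        if label is None:
            continue  # the source has no reliability rating
        labelled.append((article, label))
    return labelled


articles = [
    Article("example-daily.test", "Council approves new budget."),
    Article("example-rumours.test", "Miracle cure hidden by doctors."),
    Article("example-unrated.test", "Local team wins the final."),
]
labelled = label_articles(articles, SOURCE_LABELS)
for article, label in labelled:
    print(article.source, label)
```

Because every article from a source inherits the same label, a classifier can score well by recognising the source's style alone, which is the bias the abstract refers to.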
DOCUMENT
More people voted in 2024 than in any other year in human history, often relying on the internet for political information. This combination created critical challenges for democracy. To address these concerns, we designed an exhibition that used interactive experiences to help visitors understand the impact of digitization on democracy. This late-breaking work addresses two research questions: 1) What do participants exposed to playful interventions think about these topics? and 2) How do people estimate their skills and knowledge about countering misinformation? We collected data in 5 countries through showcases held within weeks of relevant 2024 elections. During visits, participants completed a survey detailing their experiences and emotional responses. Participants expressed high levels of self-confidence in detecting misinformation and spotting AI-generated content. This paper contributes to addressing digital literacy needs by fostering engaging interactions with AI and politically relevant issues surrounding campaigning and misinformation.
MULTIFILE
Increasingly, Instagram is discussed as a site for misinformation, inauthentic activities, and polarization, particularly in recent studies about elections, the COVID-19 pandemic and vaccines. In this study, we have found a different platform. By looking at the content that receives the most interactions over two time periods (in 2020) related to three U.S. presidential candidates and the issues of COVID-19, healthcare, 5G and gun control, we characterize Instagram as a site of earnest (as opposed to ambivalent) political campaigning and moral support, with a relative absence of polarizing content (particularly from influencers) and little to no misinformation and artificial amplification practices. Most importantly, while misinformation and polarization might be spreading on the platform, they do not receive much user interaction.
MULTIFILE
Students grow up in a world that is permanently online. They have access to a large amount of information and are constantly interacting online. Education can train students to become media-literate citizens.
LINK
The objective of this study is to shed light on the added value of the services of five disciplines in M&A advisory in the SME domain: accountants, bankers, business brokers, fiscalists and lawyers. Theory is inconclusive on the added value of advisory services, and research on the subject is hardly available. RBV predicts direct benefits from using advisory services in M&A, leading to fewer obstacles during and directly after M&A, or lagged effects in the form of more renewal of the firm. The theory of structural holes, agency theory and management entrenchment theory, on the other hand, predict a neutral or negative effect of advisory services in M&A. The dataset includes 899 mergers and acquisitions (1) completed before 2003; (2) with an acquirer having bought 100% of target shares or assets; (3) of German, Belgian or Dutch origin; (4) of non-listed firms; (5) where acquirer and target firm are not members of the same family. Using (M)ANOVAs and controlling for the effects of more than one advisor being involved, the outcomes consistently show that M&A advisory services do not reduce obstacles such as financing, misinformation, and culture and staff problems during or immediately after M&A. Looking at lagged effects of advisory services in the two years after M&A, more strategic renewal through innovation occurs if bankers, fiscalists and lawyers are involved. Involvement of accountants and business brokers, on the other hand, decreases renewal.
DOCUMENT
Social media platforms such as Facebook, YouTube, and Twitter have millions of users logging in every day, using these platforms for communication, entertainment, and news consumption. These platforms adopt rules that determine how users communicate and thereby limit and shape public discourse. Platforms need to deal with large amounts of data generated every day. For example, as of October 2021, 4.55 billion social media users were active, and each internet user used an average of 6.7 platforms per month. As a result, platforms were compelled to develop governance models and content moderation systems to deal with harmful and undesirable content, including disinformation. In this study:
• ‘Content governance’ is defined as a set of processes, procedures, and systems that determine how a given platform plans, publishes, moderates, and curates content.
• ‘Content moderation’ is the organised practice of a social media platform of pre-screening, removing, or labelling undesirable content to reduce the damage that inappropriate content can cause.
MULTIFILE
Information security is a topical issue. How often do we read in the newspaper that yet another company or institution has been hit? And it is not so bad when the damage is limited to a single organisation, because an entire region can just as easily be brought to a standstill, as happened to all of Rome during a large-scale power outage on 28 September last. That can happen here too. And the reports in the media are only the tip of the iceberg; those are the real calamities, in which people were killed or injured. Smaller incidents occur much more often. More often than you think. At any moment your computer can crash, a hacker can break into your website, the network can become congested, or a new computer virus can corrupt your files. We tend to downplay the likelihood of all these threats. How often do you hear remarks such as "there is nothing worth taking here anyway" or "the chance of a fire is so small"? We even have a proverb for it, one about a calf and a well. Apparently it is quite normal to do nothing until it is too late; until we ourselves have been hit. And when measures are taken, they are mainly aimed at protecting hearth and home. But what about the information systems and the very expensive information they contain? The information on which the survival of the organisation depends? A great deal still needs to be improved there.
DOCUMENT
This study combines a quantitative observation study of student journalists (n=47) with reconstruction interviews with experienced editors and reporters in newsrooms (n=12) to understand how Dutch journalists search for, select, and verify sources online. Through the recording of screen activity, we show that search strategies are heavily influenced by how the search engine sorts and ranks potential sources. The eventual selection of sources remains relatively traditional, focused on legacy media and their websites. Moreover, online news production clearly challenges the verification process. Results suggest that journalists use no explicit methods of verification but only so-called hybrid ones, such as background checks of websites and social media accounts, and cross-checking of sources.
LINK