A common strategy to assign keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for a word to be selected as keyword is its relevance for the text. The tf.idf score of a term is a widely used relevance measure. While easy to compute and giving quite satisfactory results, this measure does not take (semantic) relations between words into account. In this paper we study some alternative relevance measures that do use relations between words. They are computed by defining co-occurrence distributions for words and comparing these distributions with the document and the corpus distribution. We then evaluate keyword extraction algorithms defined by selecting different relevance measures. For two corpora of abstracts with manually assigned keywords, we compare manually extracted keywords with different automatically extracted ones. The results show that using word co-occurrence information can improve precision and recall over tf.idf.
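As a point of reference for the tf.idf baseline named in this abstract, the following is a minimal sketch of tf.idf-based keyword ranking. It assumes one common variant (relative term frequency multiplied by the natural log of inverse document frequency); the abstract does not state which weighting the authors used, and the function names `tf_idf` and `top_keywords` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed variant: (count / document length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(doc_scores, k=3):
    """Pick the k highest-scoring terms of one document as keyword candidates."""
    ranked = sorted(doc_scores.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:k]]
```

A term that occurs in every document receives a score of zero under this variant, which is why such a measure favours words that are specific to one text. It uses no information about relations between words, which is the limitation the abstract addresses.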
DOCUMENT
Using either freshly pulped or preserved seaweed biomass for the extraction of protein can have a great effect on the amount of protein that can be extracted. In this study, the effect of four preservation techniques (frozen, freeze-dried, and air-dried at 40 and 70 °C) on the protein extractability, measured as Kjeldahl nitrogen, of four seaweed species, Chondrus crispus (Rhodophyceae), Ascophyllum nodosum, Saccharina latissima (both Phaeophyceae) and Ulva lactuca (Chlorophyceae), was tested and compared with extracting freshly pulped biomass. The effect of preservation is species dependent: in all four seaweed species, a different treatment resulted in the highest protein extractability. The pellet (i.e., the non-dissolved biomass after extraction) was also analyzed as in most cases the largest part of the initial protein ended up in the pellet and not in the supernatant. Of the four species tested, freeze-dried A. nodosum yielded the highest overall protein extractability of 59.6% with a significantly increased protein content compared with the sample before extraction. For C. crispus, extracting biomass air-dried at 40 °C gave the best results with a protein extractability of 50.4%. Preservation had little effect on the protein extraction for S. latissima; only air-drying at 70 °C decreased the yield significantly. Over 70% of the initial protein ended up in the pellet for all U. lactuca extractions while increasing the protein content significantly. Extracting freshly pulped U. lactuca resulted in a 78% increase in protein content in the pellet while still containing 84.5% of the total initial protein. These results show the importance of the right choice when selecting a preservation method and seaweed species for protein extraction. Besides the extracted protein fraction, the remaining pellet also has the potential as a source with an increased protein content.
DOCUMENT
To study the ways in which compounds can induce adverse effects, toxicologists have been constructing Adverse Outcome Pathways (AOPs). An AOP can be considered as a pragmatic tool to capture and visualize mechanisms underlying different types of toxicity inflicted by any kind of stressor, and describes the interactions between key entities that lead to the adverse outcome on multiple biological levels of organization. The construction or optimization of an AOP is a labor intensive process, which currently depends on the manual search, collection, reviewing and synthesis of available scientific literature. This process could however be largely facilitated using Natural Language Processing (NLP) to extract information contained in scientific literature in a systematic, objective, and rapid manner that would lead to greater accuracy and reproducibility. This would support researchers to invest their expertise in the substantive assessment of the AOPs by replacing the time spent on evidence gathering by a critical review of the data extracted by NLP. As case examples, we selected two frequent adversities observed in the liver: namely, cholestasis and steatosis denoting accumulation of bile and lipid, respectively. We used deep learning language models to recognize entities of interest in text and establish causal relationships between them. We demonstrate how an NLP pipeline combining Named Entity Recognition and a simple rules-based relationship extraction model helps screen compounds related to liver adversities in the literature, but also extract mechanistic information for how such adversities develop, from the molecular to the organismal level. Finally, we provide some perspectives opened by the recent progress in Large Language Models and how these could be used in the future. 
We propose that this work brings two main contributions: 1) a proof-of-concept that NLP can support the extraction of information from text for modern toxicology and 2) a template open-source model for recognition of toxicological entities and extraction of their relationships. All resources are openly accessible via GitHub (https://github.com/ontox-project/en-tox).
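To make the idea of combining entity recognition with a simple rules-based relationship extraction step concrete, here is a toy sketch. It is not the pipeline from the en-tox repository: a small hand-written lexicon stands in for the deep learning entity recognizer, and the lexicon entries, trigger verbs, and the function name `extract_relations` are all illustrative assumptions.

```python
import re

# Hypothetical mini-lexicons; the work described above uses trained
# language models for entity recognition, not word lists.
COMPOUNDS = {"cyclosporine", "amiodarone"}
PHENOTYPES = {"cholestasis", "steatosis"}
TRIGGERS = {"induces", "causes", "inhibits"}


def extract_relations(sentence):
    """Return (compound, trigger, phenotype) triples found in one sentence.

    Rule: a trigger verb links the nearest compound to its left with the
    nearest phenotype to its right.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    relations = []
    for i, token in enumerate(tokens):
        if token not in TRIGGERS:
            continue
        subjects = [t for t in tokens[:i] if t in COMPOUNDS]
        objects = [t for t in tokens[i + 1:] if t in PHENOTYPES]
        if subjects and objects:
            relations.append((subjects[-1], token, objects[0]))
    return relations
```

Calling `extract_relations("Cyclosporine induces cholestasis in hepatocytes.")` yields one triple linking the compound to the adversity. A sentence without a trigger verb yields an empty list, which is the kind of screening behaviour a rules-based step provides on top of the recognized entities.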
DOCUMENT
The huge number of images shared on the Web makes effective cataloguing a demanding and crucial issue: storage and retrieval procedures must be efficient and tailored to end-user needs. In this paper, we investigate the applicability of Automatic Image Annotation (AIA) for image tagging, with a focus on the needs of database expansion for a news broadcasting company. First, we determine the feasibility of using AIA in such a context, with the aim of minimizing extensive retraining whenever a new tag needs to be incorporated in the tag set. Then, an image annotation tool integrating a Convolutional Neural Network model (AlexNet) for feature extraction and a K-Nearest-Neighbours classifier for tag assignment to images is introduced and tested. The obtained performance is very promising and indicates that the proposed approach is valuable for tackling the problem of image tagging in the framework of a broadcasting company, although it is not yet optimal for integration in the business process.
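The tag-assignment step described above can be sketched as a nearest-neighbour vote over feature vectors. This sketch assumes the features have already been extracted (for example by a CNN such as AlexNet) and uses short hand-made vectors instead; Euclidean distance, majority voting, and the function name `knn_tag` are assumptions, not details taken from the abstract.

```python
import math
from collections import Counter


def knn_tag(query, gallery, k=3):
    """Assign a tag to `query` by majority vote among its k nearest neighbours.

    gallery: list of (feature_vector, tag) pairs for already-tagged images.
    """
    nearest = sorted(gallery, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(tag for _, tag in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D "features"; real CNN features would have thousands of dimensions.
gallery = [
    ((0.0, 0.0), "sport"),
    ((0.0, 1.0), "sport"),
    ((5.0, 5.0), "politics"),
    ((5.0, 6.0), "politics"),
]
```

In this setup, incorporating a new tag only requires appending tagged examples to `gallery`; nothing is retrained, which matches the motivation of avoiding extensive retraining when the tag set grows.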
DOCUMENT
This study explores how households interact with smart systems for energy usage, providing insights into the field's trends, themes and evolution through a bibliometric analysis of 547 relevant publications from 2015 to 2025. Our findings show: (1) Research activity has grown over the past decade, with leading journals recognizing several productive authors. Collaboration and interdisciplinary work are expected to expand; (2) Key research hotspots, identified through keyword co-occurrence, fall into two stages (exploration and development) and highlight the interplay between technological, economic, environmental, and behavioral factors within the field; (3) Future research should place greater emphasis on understanding how emerging technologies interact with humans, with a deeper understanding of users. Beyond the individual perspective, social dimensions also demand investigation. Finally, research should also aim to support policy development. To conclude, this study contributes to a broader perspective of this topic and highlights directions for future research development.
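Keyword co-occurrence, the technique used above to identify research hotspots, amounts to counting how often two keywords are attached to the same publication. The following is a minimal sketch of that counting step only; the sample records and the function name `cooccurrence` are hypothetical, and the study's actual tooling is not named in the abstract.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count, for each unordered keyword pair, the records listing both."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Sort so each pair has one canonical order; set() drops repeats.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical author-keyword lists for three publications.
records = [
    ["smart home", "energy", "behavior"],
    ["energy", "smart home"],
    ["energy", "policy"],
]
counts = cooccurrence(records)
```

Pairs with high counts are candidates for hotspots; bibliometric tools typically build a network from these counts and cluster it, which is beyond this sketch.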
MULTIFILE
Introduction: Worldwide, there is an increase in the extent and severity of mental illness. Exacerbation of somatic complaints in this group of people can result in recurring ambulance and emergency department care. The care of patients with a mental dysregulation (ie, experiencing a mental health problem and disproportionate feelings like fear, anger, sadness or confusion, possibly with associated behaviours) can be complex and challenging in the emergency care context, possibly evoking a wide variety of feelings, ranging from worry or pity to annoyance and frustration in emergency care staff members. This in turn may lead to stigma towards patients with a mental dysregulation seeking emergency care. Interventions have been developed impacting attitude and behaviour and minimising stigma held by healthcare professionals. However, these interventions are not explicitly aimed at the emergency care context nor do these represent perspectives of healthcare professionals working within this context. Therefore, the aim of the proposed review is to gain insight into interventions targeting healthcare professionals, which minimise stigma including beliefs, attitudes and behaviour towards patients with a mental dysregulation within the emergency care context. Methods and analysis: The protocol for a systematic integrative review is presented, using the Preferred Reporting Items for Systematic Review and Meta-Analysis Protocols recommendations. A systematic search was performed on 13 July 2023. Study selection and data extraction will be performed by two independent reviewers. In each step, an expert with lived experience will comment on process and results. Software applications RefWorks-ProQuest, Rayyan and ATLAS.ti will be used to enhance the quality of the review and transparency of process and results. Ethics and dissemination: No ethical approval or safety considerations are required for this review.
The proposed review will be submitted to a relevant international journal. Results will be presented at relevant medical scientific conferences.
LINK
Objective The primary aim was to provide an overview of the diagnostic accuracy of clinical tests for common elbow (sport) injuries; the secondary aim was to provide reproducible instructions for performing these tests. Design A systematic literature review according to the PRISMA statement. Data sources A comprehensive literature search was performed in MEDLINE via PubMed and EMBASE. Eligibility criteria We included studies reporting diagnostic accuracy and a description of how the elbow tests are performed, targeting the following conditions: distal biceps rupture, triceps rupture, posteromedial impingement, medial collateral ligament (MCL) insufficiency, posterolateral rotatory instability (PLRI), lateral epicondylitis and medial epicondylitis. After identifying the articles, the methodological quality was assessed using the QUADAS-2 checklist. Results Our primary literature search yielded 1144 hits. After assessment 10 articles were included: six for distal biceps rupture, one for MCL insufficiency, two for PLRI and one for lateral epicondylitis. No articles were selected for triceps rupture, posteromedial impingement and medial epicondylitis. Quality assessment showed high or unclear risk of bias in nine studies. We described 24 test procedures of which 14 tests contained data on diagnostic accuracy. Conclusions Numerous clinical tests for the elbow were described in literature, seldom accompanied by data on diagnostic accuracy. None of the described tests can provide adequate certainty to rule in or rule out a disease based on sufficient diagnostic accuracy.
LINK
With artificial intelligence (AI) systems entering our working and leisure environments with increasing adaptation and learning capabilities, new opportunities arise for developing hybrid (human-AI) intelligence (HI) systems, comprising new ways of collaboration. However, there is not yet a structured way of specifying design solutions of collaboration for hybrid intelligence (HI) systems and there is a lack of best practices shared across application domains. We address this gap by investigating the generalization of specific design solutions into design patterns that can be shared and applied in different contexts. We present a human-centered bottom-up approach for the specification of design solutions and their abstraction into team design patterns. We apply the proposed approach for 4 concrete HI use cases and show the successful extraction of team design patterns that are generalizable, providing re-usable design components across various domains. This work advances previous research on team design patterns and designing applications of HI systems.
MULTIFILE
Intra-ocular straylight can cause decreased visual functioning, and it may cause diminished vision-related quality of life (VRQOL). This cross-sectional population-based study investigates the association between straylight and VRQOL in middle-aged and elderly individuals. Multivariable linear regression analyses were used to assess the association between VRQOL and straylight, modeled both continuously and dichotomized at the recommended fitness-to-drive cutoff (straylight ≥ 1.4 log(s)). The study showed that participants with normal straylight values, straylight ≤ 1.4 log(s), rated their VRQOL slightly better than those with high straylight values (straylight ≥ 1.4 log(s)). Furthermore, multivariable regression analysis revealed a borderline statistically significant association (p = .06) between intra-ocular straylight and self-reported VRQOL in middle-aged and elderly individuals. The association between straylight and self-reported VRQOL was not influenced by the status of the intra-ocular lens (natural vs. artificial intra-ocular lens after cataract extraction) or the number of (instrumental) activities of daily living that were reported as difficult for the elderly individuals.
DOCUMENT