A common strategy for assigning keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for selecting a word as a keyword is its relevance to the text. The tf.idf score of a term is a widely used relevance measure. While it is easy to compute and gives quite satisfactory results, this measure does not take (semantic) relations between words into account. In this paper we study some alternative relevance measures that do use relations between words. They are computed by defining co-occurrence distributions for words and comparing these distributions with the document and the corpus distribution. We then evaluate keyword extraction algorithms defined by selecting different relevance measures. For two corpora of abstracts with manually assigned keywords, we compare the manually assigned keywords with different automatically extracted ones. The results show that using word co-occurrence information can improve precision and recall over tf.idf.
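As an illustration of the tf.idf baseline mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes documents are already tokenized into lists of words and uses the plain variant tf × log(N/df); the function names `tf_idf` and `top_keywords` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumption: score = raw term frequency * log(N / document frequency).
    Other weighting variants exist; this is only one common choice.
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return scores


def top_keywords(doc_scores, k=3):
    """Pick the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    corpus = [["cat", "sat", "cat"], ["dog", "sat"]]
    for doc_scores in tf_idf(corpus):
        print(top_keywords(doc_scores, k=1))
```

A term that occurs in every document gets a score of zero here, which is why tf.idf favours words that are frequent in one document but rare in the corpus. The measure looks at each term in isolation, which is the limitation the abstract addresses.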
DOCUMENT
Preprint submitted to Information Processing & Management. Tags are a convenient way to label resources on the web. An interesting question is whether one can determine the semantic meaning of tags in the absence of some predefined formal structure like a thesaurus. Many authors have used the usage data for tags to find their emergent semantics. Here, we argue that the semantics of tags can be captured by comparing the contexts in which tags appear. We operationalize this idea by defining what we call paradigmatic similarity: computing co-occurrence distributions of tags with tags in the same context, and comparing tags using information-theoretic similarity measures of these distributions, mostly the Jensen-Shannon divergence. In experiments with three different tagged data collections we study its behavior and compare it to other distance measures. For some tasks, like terminology mapping or clustering, the paradigmatic similarity seems to give better results than similarity measures based on the co-occurrence of the documents or other resources that the tags are associated with. We argue that paradigmatic similarity is superior to other distance measures if agreement on topics (as opposed to style, register, language, etc.) is the most important criterion and the main differences between the tagged elements in the data set correspond to different topics.
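The idea of comparing tags through their co-occurrence distributions can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact definition: each tagged item is assumed to be a set of tags, a tag's distribution is taken from raw co-occurrence counts with other tags on the same item, and the Jensen-Shannon divergence is computed in base 2. The function names are hypothetical.

```python
import math
from collections import Counter


def cooccurrence_distribution(tag, tagged_items):
    """Distribution over the tags that share an item with `tag`.

    Assumption: every item is a set of tags and every co-occurrence
    counts once; the paper may weight contexts differently.
    """
    counts = Counter()
    for tags in tagged_items:
        if tag in tags:
            counts.update(t for t in tags if t != tag)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) of two sparse distributions."""
    jsd = 0.0
    for t in set(p) | set(q):
        pt, qt = p.get(t, 0.0), q.get(t, 0.0)
        m = 0.5 * (pt + qt)  # mixture distribution
        if pt > 0:
            jsd += 0.5 * pt * math.log2(pt / m)
        if qt > 0:
            jsd += 0.5 * qt * math.log2(qt / m)
    return jsd


if __name__ == "__main__":
    items = [{"nyc", "travel"}, {"newyork", "travel"}, {"python", "code"}]
    p = cooccurrence_distribution("nyc", items)
    q = cooccurrence_distribution("newyork", items)
    r = cooccurrence_distribution("python", items)
    print(jensen_shannon(p, q), jensen_shannon(p, r))
```

In this toy data, "nyc" and "newyork" never occur on the same item, so a measure based on direct co-occurrence would treat them as unrelated. Their context distributions are identical, however, so the divergence between them is zero.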
DOCUMENT
This study explores how households interact with smart systems for energy usage, providing insights into the field's trends, themes and evolution through a bibliometric analysis of 547 relevant publications from 2015 to 2025. Our findings show that: (1) Research activity has grown over the past decade, with leading journals recognizing several productive authors. Collaboration and interdisciplinary work are expected to expand; (2) Key research hotspots, identified through keyword co-occurrence, fall into two stages (exploration and development) and highlight the interplay between technological, economic, environmental, and behavioral factors within the field; (3) Future research should place greater emphasis on understanding how emerging technologies interact with humans, with a deeper understanding of users. Beyond the individual perspective, social dimensions also demand investigation. Finally, research should also aim to support policy development. To conclude, this study contributes to a broader perspective on this topic and highlights directions for future research development.
MULTIFILE
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well suited to discovering new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. © 2010 Frijters et al.
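The ABC-principle described in this abstract can be sketched in a few lines. This is a toy illustration only, not CoPub Discovery's method: it assumes the literature has already been reduced to a list of co-occurring concept pairs, and it leaves out the statistical scoring a real tool would apply. The function name `abc_candidates` and the concept names are hypothetical.

```python
from collections import defaultdict


def abc_candidates(cooccurring_pairs, a):
    """Find concepts C linked to `a` only through shared intermediates B.

    Returns {C: set of B-intermediates}. Concepts that co-occur directly
    with `a` are excluded, since they are already known relationships.
    """
    neighbours = defaultdict(set)
    for x, y in cooccurring_pairs:
        neighbours[x].add(y)
        neighbours[y].add(x)

    direct = neighbours.get(a, set())
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbours[b]:
            if c != a and c not in direct:
                candidates[c].add(b)
    return dict(candidates)


if __name__ == "__main__":
    # Hypothetical co-occurrence pairs, e.g. concepts sharing an abstract.
    pairs = [
        ("drug_A", "process_B1"),
        ("drug_A", "process_B2"),
        ("process_B1", "disease_C"),
        ("process_B2", "disease_C"),
        ("drug_A", "gene_D"),
    ]
    print(abc_candidates(pairs, "drug_A"))
```

Here "disease_C" is proposed as a hidden relationship for "drug_A" because the two never co-occur directly but share two intermediates. Ranking such candidates, for instance by the number or strength of shared intermediates, is where a real system would differ from this sketch.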
DOCUMENT
Aim: To investigate: (a) language difficulties in children with developmental coordination disorder (DCD), and (b) motor difficulties in children with developmental language disorder (DLD). Method: In this systematic review, PubMed, CINAHL, PsycINFO, and Embase were searched to identify peer-reviewed studies. Two researchers independently identified, screened and evaluated the methodological quality of the included studies following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA). For objective (a), we combined the terms: “developmental coordination disorder” AND “language skills” AND “children”. For objective (b), we combined the terms: “developmental language disorder” AND “motor skills” AND “children”. Results: Ten studies on language skills in children with DCD and 34 studies on motor skills in children with DLD are included, most with relatively good methodological quality. The results for language comprehension and production in children with DCD are contradictory, but there is evidence that children with DCD have communication and phonological problems. Evidence for general motor problems in children with DLD is consistent. Studies report problems in balance, locomotor, and fine motor skills in children with DLD. Evidence for aiming and catching skills is inconsistent. Interpretation: The findings of this systematic review highlight the co-occurrence of language impairments in children with DCD and motor impairments in children with DLD. Healthcare professionals involved in the assessment and diagnosis of children with DCD or DLD should be attentive to this co-occurrence. In doing so, children with DCD and DLD can receive optimal interventions to minimize problems in their daily life.
DOCUMENT
Wastewater treatment plants (WWTPs) are the main source of pharmaceuticals and artificial sweeteners in surface water. The extent to which WWTPs remove these organic micropollutants appears to vary from location to location and/or over time. Exploratory research at seven WWTPs in Groningen and Drenthe showed that the removal of the sweetener acesulfame varied widely. New DNA techniques may help explain the differences in biological removal capacity for acesulfame and pharmaceuticals. Next Generation Sequencing (NGS) reveals differences between bacterial populations that may explain differences in the removal of pharmaceuticals and sweeteners.
DOCUMENT
BACKGROUND: Multimorbidity, the co-occurrence of two or more chronic medical conditions within a single individual, is increasingly becoming part of daily care in general medical practice. Literature-based discovery may help to investigate the patterns of multimorbidity and to integrate medical knowledge for improving healthcare delivery for individuals with co-occurring chronic conditions. OBJECTIVE: To explore the usefulness of literature-based discovery in primary care research through the key case of finding associations between psychiatric and somatic diseases relevant to general practice in a large biomedical literature database (Medline). METHODS: By using literature-based discovery for matching disease profiles as vectors in a high-dimensional associative concept space, co-occurrences of a broad spectrum of chronic medical conditions were matched for their potential in biomedicine. An experimental setting was chosen in parallel with expert evaluations and expert meetings to assess performance and to generate targets for integrating literature-based discovery in multidisciplinary medical research of psychiatric and somatic disease associations. RESULTS: Through stepwise reductions a reference set of 21,945 disease combinations was generated, from which a set of 166 combinations between psychiatric and somatic diseases was selected and assessed by text mining and expert evaluation. CONCLUSIONS: Literature-based discovery tools generate specific patterns of associations between psychiatric and somatic diseases: one subset was appraised as promising for further research; the other subset surprised the experts, leading to intricate discussions and further eliciting of frameworks of biomedical knowledge. These frameworks enable us to specify targets for further developing and integrating literature-based discovery in multidisciplinary research of general practice, psychology and psychiatry, and epidemiology.
DOCUMENT
Objective. The objective of this article is to analyze the scientific production on creative tourism indexed in the Scopus database and to identify gaps, trends and future lines of research. Method. The bibliometric method was used to map the state of the art and identify trends, gaps and future lines of research. A search was made in the Scopus database for scientific articles that included the term "creative tourism" in the title, abstract or keywords. Bibexcel software was used to calculate productivity indicators and the h-index. The VOSviewer software allowed the analysis of bibliometric networks of citation, co-citation and co-occurrence of keywords. Results. A total of 120 articles corresponding to the period 2002-2020 were found. The scientific production on creative tourism is growing and presents a high rate of topicality. Greg Richards was the most prolific author with the highest h-index, which confirms him as a reference in the subject. The most productive journals are Current Issues in Tourism and Annals of Tourism Research. Creative tourism has been studied along three fundamental thematic lines: tourism and creativity, creative experience and creative space. Conclusions. The implications of the results of the study for academics, researchers and tourism managers were presented. Studies on the profile of the creative tourist, the role of new technologies, co-creation of experiences, as well as the inclusion of variables such as repetition intention, image, motivation and the role of the community in creative tourism were proposed as research opportunities.
DOCUMENT
This study furthers game-based learning for circular business model innovation (CBMI), the complex, dynamic process of designing business models according to the circular economy principles. The study explores how game-play in an educational setting affects learning progress on the level of business model elements and from the perspective of six learning categories. We experimented with two student groups using our game education package Re-Organise. All students first studied a reader and a game role description and then filled out a circular business model canvas and a learning reflection. The first group, i.e., the game group, updated the canvas and the reflection in an interactive tutorial after gameplay. The control group submitted their updated canvas and reflection directly after the interactive tutorial without playing the game. The results were analyzed using text-mining and qualitative methods such as word co-occurrence and sentiment polarity. The game group created richer business models (using more waste processing technologies) and reflections with stronger sentiments toward the learning experience. Our detailed study results (i.e., per business model element and learning category) enhance understanding of game-based learning for circular business model innovation while providing directions for improving serious games and accompanying educational packages.
MULTIFILE
Chapter 2 covers peer and professional online support for parents in raising their children. In total, the book contains 31 chapters on social networking, written by dozens of researchers worldwide.
MULTIFILE