First paragraph of the column: The staggering amount of information and contact possibilities on the internet confronts us with the choice of who we want to be and which values we want to live by. Whereas Internet 1.0 could still be seen mainly as a large database with Google as its market hit, social interaction plays a major role in the semantic web. In the semantic web, all data, including for example all your messages, profile details, files and texts as well as those of others, can be distributed and combined even more easily, but also analysed and presented in a tailored way. Every unique question or search query thus immediately receives a unique answer.
LINK
First paragraph of the column: The major shift, via chain reversal, towards customer self-care and bottom-up self-assembled teaming is rapidly taking place. The customer takes the initiative and Toffler's prosumership becomes visible. The number of business examples is growing fast, whether it concerns configuring your own car, ordering parts, 3D printing, self-rostering, civil journalism, customers reviewing restaurants, tracking & tracing of mail, or medical care. Take Qlinx as an open architecture in combination with, for example, Twitter: it shows well what this could mean for the dynamics of the labour market. Wolfram-alpha shows the potential of the semantic web. In, for example, Share2Start - power of the open mind we see the power of crowdfunding and the beginning of 'financials 2.0'. These sites clearly show the direction in which things are heading.
LINK
Preprint submitted to Information Processing & Management. Tags are a convenient way to label resources on the web. An interesting question is whether one can determine the semantic meaning of tags in the absence of some predefined formal structure like a thesaurus. Many authors have used the usage data for tags to find their emergent semantics. Here, we argue that the semantics of tags can be captured by comparing the contexts in which tags appear. We give an approach to operationalizing this idea by defining what we call paradigmatic similarity: computing co-occurrence distributions of tags with tags in the same context, and comparing tags using information-theoretic similarity measures of these distributions, mostly the Jensen-Shannon divergence. In experiments with three different tagged data collections we study its behavior and compare it to other distance measures. For some tasks, like terminology mapping or clustering, the paradigmatic similarity seems to give better results than similarity measures based on the co-occurrence of the documents or other resources that the tags are associated with. We argue that paradigmatic similarity is superior to other distance measures if agreement on topics (as opposed to style, register or language etc.) is the most important criterion, and the main differences between the tagged elements in the data set correspond to different topics.
DOCUMENT
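The idea of comparing tags through their co-occurrence distributions, as described in the abstract above, can be illustrated with a minimal sketch. It is a simplified first-order reading of that description and not the authors' exact definition: the function names, the representation of tagged items as collections of tags, and the choice of base-2 logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter


def cooccurrence_distribution(tag, tagged_items):
    """Relative frequencies of the tags that appear together with `tag`.

    `tagged_items` is an iterable of tag collections, one per resource
    (a hypothetical input format chosen for this sketch).
    """
    counts = Counter()
    for tags in tagged_items:
        if tag in tags:
            counts.update(t for t in tags if t != tag)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence of two distributions given as dicts.

    With base-2 logarithms the value lies between 0 (identical
    distributions) and 1 (distributions with disjoint support).
    """
    keys = set(p) | set(q)
    mean = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in keys}

    def kl(a):
        # Kullback-Leibler divergence of `a` from the mean distribution.
        return sum(pa * math.log2(pa / mean[t]) for t, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)
```

In this sketch, two tags that never occur on the same resource can still come out as similar, provided they are surrounded by the same other tags. That is the property the abstract attributes to comparing contexts rather than tagged documents.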
We review over 10 years of research at Elsevier and various Dutch academic institutions on establishing a new format for the scientific research article. Our work rests on two main theoretical principles: the concept of modular documents, consisting of content elements that can exist and be published independently and are linked by meaningful relations, and the use of semantic data standards allowing access to heterogeneous data. We discuss the application of these concepts in five different projects: a modular format for physics articles, an XML encyclopedia in pharmacology, a semantic data integration project, a modular format for computer science proceedings papers, and our current work on research articles in cell biology.
DOCUMENT
The semantic web, social media and the amount of user-generated content continue to grow at a staggering rate. Social media contributed significantly to the information flow during the Arab Spring, and the Occupy Wall Street movement continues to maintain a global online presence using social media technology. But is the social media information explosion really a unique event in media history? How did storytelling evolve into social media? In order to place social media in its historical context and anticipate digital native expectations, we explore the origins of narrative and storytelling from the perspective of a documentary producer. How did past media technologies prepare the way for social media? How do digital natives perceive the world via social media and what do they expect from today's documentary producer? What are the viewing habits of digital natives? What do previous 'information explosions' have in common with social and digital media? These are essential questions for media and documentary producers navigating their way through the vast maze of social media technology and the semantic web, in addition to traditional media.
DOCUMENT
A common strategy to assign keywords to documents is to select the most appropriate words from the document text. One of the most important criteria for a word to be selected as keyword is its relevance for the text. The tf.idf score of a term is a widely used relevance measure. While easy to compute and giving quite satisfactory results, this measure does not take (semantic) relations between words into account. In this paper we study some alternative relevance measures that do use relations between words. They are computed by defining co-occurrence distributions for words and comparing these distributions with the document and the corpus distribution. We then evaluate keyword extraction algorithms defined by selecting different relevance measures. For two corpora of abstracts with manually assigned keywords, we compare manually extracted keywords with different automatically extracted ones. The results show that using word co-occurrence information can improve precision and recall over tf.idf.
DOCUMENT
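As a point of reference for the abstract above, the tf.idf baseline it mentions can be sketched as follows. This covers only the baseline, not the co-occurrence-based measures studied in the paper. The function name, the pre-tokenised input format, and the specific variant (raw term count multiplied by the natural logarithm of the inverse document frequency) are assumptions, since the abstract does not state which tf.idf formulation was used.

```python
import math
from collections import Counter


def tfidf_keywords(doc_tokens, corpus_tokens, top_n=3):
    """Return the `top_n` terms of one document ranked by tf.idf.

    `doc_tokens` is a list of tokens for the document of interest and
    `corpus_tokens` is a list of token lists, one per corpus document
    (hypothetical input format for this sketch).
    """
    n_docs = len(corpus_tokens)

    # Document frequency: in how many corpus documents each term occurs.
    doc_freq = Counter()
    for tokens in corpus_tokens:
        doc_freq.update(set(tokens))

    # Term frequency within the document being described.
    term_freq = Counter(doc_tokens)

    # Terms that never occur in the corpus are skipped to avoid dividing by zero.
    scores = {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in term_freq.items()
        if doc_freq[term] > 0
    }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

A term that occurs in every document receives a score of zero here, which shows how tf.idf suppresses common words while taking no account of relations between words.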
This publication gives an account of the Public Annotation of Cultural Heritage research project (PACE) conducted at the Crossmedialab. The project was carried out between 1 January 2008 and 31 December 2009, and was funded by the Ministry of Education, Culture, and Science. Three members of the Dutch Association of Science Centres (Vereniging Science Centra) actively participated in the execution of the project: the Utrecht University Museum, the National Museum of Natural History (Naturalis), and Museon. In addition, two more knowledge institutes participated: Novay and the Utrecht University of Applied Sciences. BMC Consultancy and Management also took part in the project. This broad consortium has enabled us to base the project on both knowledge and experience from a practical and scientific perspective. The purpose of the PACE project was to examine the ways in which social tagging could be deployed as a tool to enrich collections, improve their accessibility and to increase visitor group involvement. The museums’ guiding question for the project was: ‘When is it useful to deploy social tagging as a tool for the benefit of museums and what kind of effect can be expected from such deployment?’ For the Crossmedialab the PACE project presented a unique opportunity to conduct concrete research into the highly interesting phenomenon of social tagging with parties and experts in the field.
DOCUMENT
Adding metadata to moving images: a gap in the market for the smart cataloguer? Why not? In the future we will be even more visually oriented than we are now, and the computer cannot 'see' images. Or can it?
DOCUMENT
The huge number of images shared on the Web makes effective cataloguing methods, tailored to end-user needs and supporting efficient storage and retrieval, a demanding and crucial issue. In this paper, we investigate the applicability of Automatic Image Annotation (AIA) for image tagging with a focus on the needs of database expansion for a news broadcasting company. First, we determine the feasibility of using AIA in such a context with the aim of minimizing extensive retraining whenever a new tag needs to be incorporated into the tag set. Then, an image annotation tool integrating a Convolutional Neural Network model (AlexNet) for feature extraction and a K-Nearest-Neighbours classifier for tag assignment to images is introduced and tested. The performance obtained is very promising, indicating that the proposed approach is valuable for tackling the problem of image tagging in the context of a broadcasting company, although it is not yet optimal for integration into the business process.
DOCUMENT
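The tag-assignment step described in the abstract above can be sketched as a plain K-Nearest-Neighbours vote over feature vectors. The sketch assumes the feature vectors have already been produced by some feature extractor, which stands in for the AlexNet step and is not shown. The function name, the Euclidean distance, the majority vote, and the toy two-dimensional vectors in the usage note are illustrative assumptions and not details taken from the paper.

```python
import math
from collections import Counter


def knn_tag(query_features, labelled_features, k=3):
    """Assign a tag to `query_features` by majority vote of its k nearest neighbours.

    `labelled_features` is a list of (feature_vector, tag) pairs, where each
    feature vector is a sequence of floats (hypothetical format for this sketch).
    """
    # Sort the labelled examples by Euclidean distance to the query.
    neighbours = sorted(
        labelled_features,
        key=lambda item: math.dist(query_features, item[0]),
    )[:k]

    # Count the tags among the nearest neighbours and return the most frequent.
    votes = Counter(tag for _, tag in neighbours)
    return votes.most_common(1)[0][0]
```

Because the classifier only stores labelled examples, a new tag can be introduced by appending examples for it to `labelled_features`, with no separate training step for the classifier. This matches the abstract's stated aim of limiting retraining when the tag set grows, under the assumption that the feature extractor itself stays fixed.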