Key to reinforcement learning in multi-agent systems is the ability to exploit the fact that agents directly influence only a small subset of the other agents. Such loose couplings are often modelled using a graphical model: a coordination graph. Finding an (approximately) optimal joint action for a given coordination graph is therefore a central subroutine in cooperative multi-agent reinforcement learning (MARL). Much research in MARL focuses on how to gradually update the parameters of the coordination graph, whilst leaving the solving of the coordination graph to a known, typically exact and generic, subroutine. However, exact methods, e.g., Variable Elimination, do not scale well, and generic methods do not exploit the MARL setting of gradually updating a coordination graph and recomputing the joint action to select. In this paper, we examine what happens if we use a heuristic method, i.e., local search, to select joint actions in MARL, and whether we can use the outcome of this local search from a previous time-step to speed up and improve local search. We show empirically that by using local search we can scale up to many agents and complex coordination graphs, and that by reusing joint actions from the previous time-step to initialise local search, we can both improve the quality of the joint actions found and the speed with which these joint actions are found.
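To make the idea concrete, below is a minimal sketch of greedy local search over a coordination graph with pairwise payoff functions, warm-started from the previous time-step's joint action. All names (`local_search`, `payoff`, `neighbors`, `init_joint`) are illustrative assumptions, and the one-agent-at-a-time improvement rule shown here is just one possible local-search variant, not necessarily the paper's exact algorithm.

```python
import random

def local_search(agents, actions, payoff, neighbors, init_joint=None, max_iters=100):
    """Greedy local search for a joint action on a coordination graph.

    agents    : list of agent ids
    actions   : dict agent -> list of available actions
    payoff    : dict (i, j) -> function(a_i, a_j) giving the pairwise payoff of edge (i, j)
    neighbors : dict agent -> list of neighbouring agents in the graph
    init_joint: optional joint action (e.g. last time-step's solution) used to
                warm-start the search; falls back to a random joint action.
    """
    joint = dict(init_joint) if init_joint else {
        i: random.choice(actions[i]) for i in agents
    }

    def local_value(i, a_i):
        # Sum of pairwise payoffs between agent i and its neighbours,
        # with the neighbours' current actions held fixed.
        total = 0.0
        for j in neighbors[i]:
            if (i, j) in payoff:
                total += payoff[(i, j)](a_i, joint[j])
            else:
                total += payoff[(j, i)](joint[j], a_i)
        return total

    for _ in range(max_iters):
        improved = False
        for i in agents:
            best = max(actions[i], key=lambda a: local_value(i, a))
            if local_value(i, best) > local_value(i, joint[i]) + 1e-12:
                joint[i] = best
                improved = True
        if not improved:
            break  # local optimum reached
    return joint
```

Passing the previous time-step's solution as `init_joint` is the warm start discussed in the abstract: when the coordination graph changes only slightly between updates, the old joint action is usually close to a good new one, so the search converges in fewer sweeps.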
The number of applications in which industrial robots share their working environment with people is increasing. Robots appropriate for such applications are equipped with safety systems according to ISO/TS 15066:2016 and are often referred to as collaborative robots (cobots). Due to the nature of human-robot collaboration, the working environment of cobots is subjected to unforeseeable modifications caused by people. Vision systems are often used to increase the adaptability of cobots, but they usually require knowledge of the objects to be manipulated. The application of machine learning techniques can increase flexibility by enabling the control system of a cobot to continuously learn and adapt to unexpected changes in the working environment. In this paper we address this issue by investigating the use of Reinforcement Learning (RL) to control a cobot to perform pick-and-place tasks. We present the implementation of a control system that can adapt to changes in position and enables a cobot to grasp objects which were not part of the training. Our proposed system uses deep Q-learning to process color and depth images and generates an ε-greedy policy to define robot actions. The Q-values are estimated using Convolutional Neural Networks (CNNs) based on pre-trained models for feature extraction. To reduce training time, we implement a simulation environment to first train the RL agent, and then apply the resulting system to a real cobot. System performance is compared when using the pre-trained CNN models ResNext, DenseNet, MobileNet, and MNASNet. Simulation and experimental results validate the proposed approach and show that our system reaches a grasping success rate of 89.9% when manipulating a previously unseen object with the pre-trained CNN model MobileNet.
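The abstract combines three ingredients: a Q-network built on a pre-trained feature extractor, color and depth inputs, and ε-greedy action selection. The sketch below shows one way this could fit together in PyTorch, assuming a frozen MobileNetV2 backbone for the color image, a small trainable convolutional branch for the depth image, and a discrete set of grasp actions. The class and function names, the branch structure, and the layer sizes are assumptions for illustration, not the authors' implementation.

```python
import random
import torch
import torch.nn as nn
from torchvision import models

class GraspQNetwork(nn.Module):
    """Q-value estimator: pre-trained MobileNetV2 features for the color image,
    a small conv branch for the depth image, and a fully connected head that
    outputs one Q-value per discrete grasp action."""

    def __init__(self, n_actions):
        super().__init__()
        backbone = models.mobilenet_v2(weights="DEFAULT")
        self.rgb_features = backbone.features            # frozen feature extractor
        for p in self.rgb_features.parameters():
            p.requires_grad = False
        self.depth_features = nn.Sequential(             # small trainable branch
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(
            nn.Linear(1280 + 32, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, rgb, depth):
        f_rgb = self.rgb_features(rgb).mean(dim=(2, 3))  # global average pool -> (B, 1280)
        f_depth = self.depth_features(depth).flatten(1)  # -> (B, 32)
        return self.head(torch.cat([f_rgb, f_depth], dim=1))

def epsilon_greedy(q_net, rgb, depth, n_actions, epsilon=0.1):
    """Pick a random action with probability epsilon, otherwise the argmax-Q action."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(rgb, depth).argmax(dim=1).item())
```

Freezing the pre-trained backbone and training only the head keeps the number of learned parameters small, which is consistent with the abstract's goal of reducing training time before transferring to the real cobot.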
Industrial robot manipulators are widely used for repetitive applications that require high precision, like pick-and-place. In many cases, the movements of industrial robot manipulators are hard-coded or manually defined, and need to be adjusted if the objects being manipulated change position. To increase flexibility, an industrial robot should be able to adjust its configuration in order to grasp objects in variable/unknown positions. This can be achieved by off-the-shelf vision-based solutions, but most require prior knowledge about each object to be manipulated. To address this issue, this work presents a ROS-based deep reinforcement learning solution to robotic grasping for a Collaborative Robot (Cobot) using a depth camera. The solution uses deep Q-learning to process the color and depth images and generate a greedy policy used to define the robot action. The Q-values are estimated using Convolutional Neural Networks (CNNs) based on pre-trained models for feature extraction. Experiments were carried out in a simulated environment to compare the performance of four different pre-trained CNN models (ResNext, MobileNet, MNASNet and DenseNet). Results show that the best performance in our application was reached by MobileNet, with an average of 84% accuracy after training in the simulated environment.
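Since this abstract's main experiment is a comparison of four pre-trained feature extractors behind the same Q-learning head, the sketch below shows one way such a comparison could be wired up with torchvision, using a swappable frozen backbone and a shared trainable head. The specific model variants (e.g. resnext50 rather than a larger ResNext), the action count, and the RGB-only input are assumptions; the authors' network also consumes depth images, which is not shown here.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pre-trained torchvision backbones corresponding to the four compared models,
# with the feature dimension each exposes once its ImageNet classifier is removed.
BACKBONES = {
    "resnext":   (models.resnext50_32x4d, 2048),
    "mobilenet": (models.mobilenet_v2,    1280),
    "mnasnet":   (models.mnasnet1_0,      1280),
    "densenet":  (models.densenet121,     1024),
}

class GraspQNet(nn.Module):
    """Q-network with a swappable, frozen pre-trained feature extractor and a
    small trainable head producing one Q-value per discrete grasp action."""

    def __init__(self, backbone_name, n_actions):
        super().__init__()
        ctor, feat_dim = BACKBONES[backbone_name]
        backbone = ctor(weights="DEFAULT")
        # Drop the final classification layer, keep the convolutional features.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        for p in self.features.parameters():
            p.requires_grad = False                  # freeze pre-trained weights
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(feat_dim, n_actions)

    def forward(self, x):
        f = self.pool(self.features(x)).flatten(1)
        return self.head(f)

# Example: build one Q-network per backbone and query it on a dummy image.
for name in BACKBONES:
    net = GraspQNet(name, n_actions=16)            # 16 discrete actions is illustrative
    q_values = net(torch.randn(1, 3, 224, 224))    # dummy RGB input
    print(name, q_values.shape)                    # -> torch.Size([1, 16]) for each backbone
```

Keeping the head identical across backbones isolates the effect of the feature extractor, which is what allows the kind of like-for-like accuracy comparison the abstract reports.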
The problem addressed by this project proposal is the high workload of care professionals in dementia care. As the number of older adults with dementia rises, the demand for care grows while the shortage of care professionals increases. Deploying smart technological innovations such as an Intelligent Care Environment (Intelligente Zorgomgeving) can strongly reduce this workload. An Intelligent Care Environment uses sensor technology and Artificial Intelligence (AI) to deliver personalised care by identifying care needs and responding to them. In doing so, the Intelligent Care Environment works together with the care professional. This solution direction is developed further in this project together with four care organisations and three innovative SMEs. Using the case "Support with eating and drinking", Just-in-time adaptive interventions (JITAI) are developed so that the care professional is supported in carrying out specific care tasks. An example of such an intervention is providing personalised sensory stimuli (sounds, lights, and projections) at the right moment to encourage older adults to eat. Interventions of this kind reduce the pressure on the care professional and increase the quality of care. Not only the integration of the AI modules matters, but also how the AI is presented to the care professional. This project therefore pays extra attention to the interaction between the care professional and the Intelligent Care Environment, which increases ease of use and allows both client and care professional to experience a greater degree of autonomy. By further developing the prototype of the Intelligent Care Environment in care institutions in collaboration with various care professionals, and by focusing on the development of both the AI and the interaction with the system, the wishes and needs of care professionals can be integrated into the Intelligent Care Environment. This is done in three iterations, in which the three successive living labs become increasingly complex and realistic.