To study the ways in which compounds can induce adverse effects, toxicologists have been constructing Adverse Outcome Pathways (AOPs). An AOP can be considered a pragmatic tool to capture and visualize the mechanisms underlying different types of toxicity inflicted by any kind of stressor; it describes the interactions between key entities that lead to the adverse outcome across multiple levels of biological organization. The construction or optimization of an AOP is a labor-intensive process, which currently depends on the manual search, collection, review, and synthesis of available scientific literature. This process could, however, be greatly facilitated by using Natural Language Processing (NLP) to extract information from scientific literature in a systematic, objective, and rapid manner, leading to greater accuracy and reproducibility. Researchers could then invest their expertise in the substantive assessment of AOPs, replacing the time spent on evidence gathering with a critical review of the data extracted by NLP. As case examples, we selected two frequent adversities observed in the liver, namely cholestasis and steatosis, which denote the accumulation of bile and lipid, respectively. We used deep learning language models to recognize entities of interest in text and establish causal relationships between them. We demonstrate how an NLP pipeline combining Named Entity Recognition and a simple rules-based relationship extraction model helps not only to screen compounds related to liver adversities in the literature, but also to extract mechanistic information on how such adversities develop, from the molecular to the organismal level. Finally, we provide some perspectives opened up by recent progress in Large Language Models and discuss how these could be used in the future.
We propose that this work makes two main contributions: 1) a proof-of-concept that NLP can support the extraction of information from text for modern toxicology and 2) a template open-source model for the recognition of toxicological entities and the extraction of their relationships. All resources are openly accessible via GitHub (https://github.com/ontox-project/en-tox).
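To make the two-stage idea above concrete (recognize entities, then link them with simple rules), here is a minimal sketch in plain Python. It is not the authors' model and does not use the en-tox resources: the deep learning entity recognizer is swapped for a toy dictionary lookup, and the lexicon, the trigger list, the entity labels, and the function names `find_entities` and `extract_relations` are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon standing in for a trained entity recognizer.
LEXICON = {
    "COMPOUND": ["cyclosporin a", "valproic acid"],
    "TARGET": ["bsep"],
    "PHENOTYPE": ["cholestasis", "steatosis"],
}

# Hypothetical causal trigger phrases used by the relation rule.
TRIGGERS = ("induces", "causes", "leads to", "inhibits")


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int
    end: int


def find_entities(sentence):
    """Return lexicon matches in the sentence, ordered by position."""
    lowered = sentence.lower()
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, lowered):
                found.append(
                    Entity(sentence[m.start():m.end()], label, m.start(), m.end())
                )
    return sorted(found, key=lambda e: e.start)


def extract_relations(sentence):
    """Link adjacent entities when a trigger phrase occurs between them."""
    entities = find_entities(sentence)
    relations = []
    for left, right in zip(entities, entities[1:]):
        between = sentence[left.end:right.start].lower()
        for trigger in TRIGGERS:
            if re.search(r"\b" + re.escape(trigger) + r"\b", between):
                relations.append((left.text, trigger, right.text))
                break
    return relations


if __name__ == "__main__":
    # Illustrative sentence written for this sketch, not taken from the corpus.
    example = "Cyclosporin A inhibits BSEP, which leads to cholestasis."
    for relation in extract_relations(example):
        print(relation)
```

Chaining the extracted triples gives a compound-to-target-to-phenotype path, which is the kind of mechanistic link an AOP records. A real pipeline would need a trained recognizer and more careful rules, for example to handle negation and sentence structure, which this sketch ignores.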
From the article: "To enable selection of novel chemicals for new processes, there is a recognized need for alternative toxicity screening assays to assess potential risks to man and the environment. For human health hazard assessment these screening assays need to be translational to humans, have high throughput capability, and from an animal welfare perspective be harmonized with the principles of the 3Rs (Reduction, Refinement, Replacement). In the area of toxicology a number of cell culture systems are available but while these have some predictive value, they are not ideally suited for the prediction of developmental and reproductive toxicology (DART). This is because they often lack biotransformation capacity, multicellular or multi-organ complexity, for example, the hypothalamus pituitary gonad (HPG) axis and the complete life cycle of whole organisms. To try to overcome some of these limitations in this study, we have used Caenorhabditis elegans (nematode) and Danio rerio embryos (zebrafish) as alternative assays for DART hazard assessment of some candidate chemicals being considered for a new commercial application. Nematodes exposed to Piperazine and one of the analogs tested showed a slight delay in development compared to untreated animals but only at high concentrations and with Piperazine as the most sensitive compound. Total brood size of the nematodes was also reduced primarily by Piperazine and one of the analogs. In zebrafish Piperazine and analogs showed developmental delays. Malformations and mortality in individual fish were also scored. Significant malformations were most sensitively identified with Piperazine, significant mortality was only observed in Piperazine and only at the highest dose. Thus, Piperazine seemed the most toxic compound for both nematodes and zebrafish.
The results of the nematode and zebrafish studies were in alignment with data obtained from conventional mammalian toxicity studies indicating that these have potential as developmental toxicity screening systems. The results of these studies also provided reassurance that none of the Piperazines tested are likely to have any significant developmental and/or reproductive toxicity issues to humans when used in their commercial applications."
Summary: Xpaths is a collection of algorithms that predict compound-induced molecular mechanisms of action by integrating phenotypic endpoints from different species, and that propose follow-up tests in model organisms to validate these pathway predictions. The Xpaths algorithms are applied to predict developmental and reproductive toxicity (DART) and are implemented in an in silico platform called DARTpaths.