An original template solution for FAIR scientific text mining

Frazen Tolentino Zondervan; Niels A. Zondervan

doi:10.1016/j.mex.2023.102145

Description

This method paper presents a template solution for text mining of scientific literature using the R tm package. The literature to be analyzed can be collected manually or automatically using the code provided with this paper. Once the literature is collected, text mining proceeds in three steps:

• loading and cleaning of text from articles,
• processing, statistical analysis, and clustering, and
• presentation of results using generalized and tailor-made visualizations.
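
The three steps above can be sketched in a few lines of R. This is a minimal illustration and not the authors' template: the sample texts, the choice of cleaning transformations, and the Ward clustering method are assumptions made for the example. Only standard functions from the tm package and base R are used.

```r
library(tm)

# Step 1: load and clean text.
# The three short "articles" below are hypothetical placeholders.
texts <- c(
  "Text mining of scientific literature supports reproducible research.",
  "Reproducible research benefits from open data and open code.",
  "Clustering groups documents that share similar vocabulary."
)
corpus <- VCorpus(VectorSource(texts))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Step 2: processing, simple statistics, and clustering.
dtm <- DocumentTermMatrix(corpus)
dtm_matrix <- as.matrix(dtm)
term_freq <- sort(colSums(dtm_matrix), decreasing = TRUE)
# Hierarchical clustering of documents (method is an assumed choice).
clusters <- hclust(dist(dtm_matrix), method = "ward.D2")

# Step 3: present results with generic visualizations.
barplot(head(term_freq, 10), las = 2, main = "Most frequent terms")
plot(clusters, main = "Document clustering")
```

In practice the corpus would be read from files rather than an in-memory vector, and the cleaning steps would be tuned to the literature at hand.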

The text mining steps can be applied to a single group, multiple groups, or time-series groups of documents.
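
As a rough sketch of the time-series case, the same cleaning and counting can be repeated per group of documents. The data frame, the period labels, the helper `term_count`, and the term of interest are all hypothetical. The grouping approach shown is one possible reading of "time-series groups", not the authors' implementation.

```r
library(tm)

# Hypothetical documents, each tagged with a publication period.
docs <- data.frame(
  period = c("2010-2014", "2010-2014", "2015-2019"),
  text = c(
    "Open data improves research.",
    "Research data should be shared.",
    "Open code and open data support reuse."
  ),
  stringsAsFactors = FALSE
)

# Build one cleaned document-term matrix per period.
dtm_by_period <- lapply(split(docs$text, docs$period), function(txt) {
  corpus <- VCorpus(VectorSource(txt))
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  DocumentTermMatrix(corpus)
})

# Hypothetical helper: total occurrences of one term within a period.
term_count <- function(dtm, term) {
  m <- as.matrix(dtm)
  if (term %in% colnames(m)) sum(m[, term]) else 0
}

# Occurrences of the term "data" in each period, in period order.
data_per_period <- sapply(dtm_by_period, term_count, term = "data")
print(data_per_period)
```

Plotting `data_per_period` against the period labels would then show how the use of a term changes over time.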

References are provided to three published, peer-reviewed articles that use the presented text mining methodology. The main advantages of our method are: (1) its suitability for both research and educational purposes, (2) its compliance with the Findable, Accessible, Interoperable, and Reusable (FAIR) principles, and (3) the availability of code and example data on GitHub under the open-source Apache License 2.0.