Research paper: “Can the quality of articles published in academic journals be assessed with machine learning?”


The article linked below was recently published by Quantitative Science Studies.


Can the quality of articles published in academic journals be assessed with machine learning?


Mike Thelwall
University of Wolverhampton


Quantitative Science Studies, 1–23
DOI: 10.1162/qss_a_00185


Formal assessments of the quality of research produced by departments and universities are now carried out by many countries to monitor achievements and allocate performance-related funding. These assessments are extremely time-consuming if done by post-publication peer review and are simplistic if based on citations or journal impact factors. This article examines whether machine learning could help reduce the peer review burden by using citations and metadata to learn how to score articles from a peer-reviewed sample. An experiment is used to support the discussion, attempting to predict journal citation thirds, as a proxy for article quality scores, for all Scopus narrow fields from 2014 to 2020. The results show that these proxy quality thirds can be predicted with better-than-baseline accuracy in all 326 narrow fields, with Gradient Boosting Classifier, Random Forest Classifier, or Multinomial Naïve Bayes being the most accurate in almost all cases. Nevertheless, the results partly rely on journal writing styles and topics, which are undesirable for some practical applications and lead to substantial shifts in average scores between countries and between institutions within the same country. There may still be scope for predicting article scores in the cases where the predictions have the highest probability.
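To make the abstract's terms concrete, here is a minimal Python sketch of three ideas it mentions: grouping journals into citation thirds, comparing against a baseline, and keeping only the most confident predictions. It is an illustration under stated assumptions, not the paper's method. The function names and toy data are hypothetical. The baseline is assumed to be "always guess the most common class". The abstract does not say how thirds are cut or how the baseline is defined.

```python
from collections import Counter

def citation_thirds(journal_rates):
    """Map each journal to 0 (bottom), 1 (middle) or 2 (top) third by citation rate.
    Assumption: thirds are equal-sized groups of journals ranked by rate."""
    ranked = sorted(journal_rates, key=journal_rates.get)
    n = len(ranked)
    return {journal: min(2, (i * 3) // n) for i, journal in enumerate(ranked)}

def majority_baseline(labels):
    """Accuracy of always predicting the most common class (assumed baseline)."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

def accuracy_above_threshold(predictions, threshold):
    """predictions: list of (predicted_label, probability, true_label) tuples.
    Returns (accuracy, coverage) for predictions with probability >= threshold."""
    kept = [(pred, true) for pred, prob, true in predictions if prob >= threshold]
    if not kept:
        return None, 0.0
    correct = sum(1 for pred, true in kept if pred == true)
    return correct / len(kept), len(kept) / len(predictions)

# Toy data (hypothetical): six journals with made-up citation rates.
rates = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 5.0, "F": 6.0}
thirds = citation_thirds(rates)

# Toy classifier output (hypothetical): (predicted third, probability, true third).
preds = [(2, 0.9, 2), (2, 0.8, 2), (0, 0.6, 1), (1, 0.5, 1), (0, 0.4, 2)]
all_acc, all_cov = accuracy_above_threshold(preds, 0.0)
top_acc, top_cov = accuracy_above_threshold(preds, 0.7)
```

In this toy data, keeping only predictions with probability of at least 0.7 raises accuracy but covers fewer articles. That trade-off is what the abstract's final sentence points to.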

Access the full-text article

