Unmasking and Combating Publishing Malpractices 1: Paper Mills

November 1, 2024 / November 4, 2024

— by Fanny Liu

Academics face pressure to publish research consistently and frequently in order to sustain or advance their careers, which incentivizes quantity over quality. This “publish-or-perish” mentality can harm scholarly research, for example by encouraging unethical publishing behaviour that compromises scientific integrity. Misconduct in science is “fabrication, falsification, or plagiarism in proposing, performing, or reviewing research, or in reporting research results” (National Academy of Sciences et al., 2009):

Fabrication is “making up data or results.”
Falsification is “manipulating research materials, equipment, or processes, or changing or omitting data or results such that the research is not accurately represented in the research record.”
Plagiarism is “the appropriation of another person’s ideas, processes, results, or words without giving appropriate credit.”

This post discusses a long-standing and accelerating problem: paper mills.

Introduction

Paper mills are “commercial companies that organize on-demand writing of fraudulent academic manuscripts and offer co-authorship of these papers for sale” (Abalkina, 2023, p. 689). They earn money by selling co-authorship, and prices may depend on the impact factor of the journal and the buyer's position in the list of authors (COPE & STM, 2022, p. 6).

Issue

Manuscripts and publications generated by paper mills are considered a serious threat to scientific publishing, as quality and integrity are critical in establishing trust (Byrne & Christopher, 2020).

The scale of the problem may be shocking. One analysis suggests that over the past two decades, more than 400,000 published research articles show strong textual similarities to known studies produced by paper mills, and that 1.5–2% of all scientific papers published in 2022 closely resemble paper-mill works (Van Noorden, 2023). Another study finds that, from 2019 to mid-2022, at least 451 papers were potentially linked to a paper mill in Russia; they were co-authored by scholars from at least 39 countries and were submitted to both predatory and reputable journals (Abalkina, 2023).

The rapid development of generative AI tools, such as ChatGPT, has also raised concerns that they give the paper mill industry low-cost and simple capabilities, including but not limited to (Kendall & Teixeira da Silva, 2024):

Generating papers quickly, usually within minutes or hours.
Generating an unlimited number of papers.
Generating papers in batch mode.
Plagiarising papers but disguising them so that they pass a plagiarism checker.
Translating (and plagiarising) papers written in one language to another.
Rewriting the same (generated) paper to generate multiple versions.
Generating images and data.

Identification

Journals have been making efforts to identify and deter paper mill submissions, such as requesting raw data for scrutiny, posting manuscripts to preprint servers to deter simultaneous submission to multiple journals, and accelerating post-publication review processes (Byrne & Christopher, 2020).

Common indicators of paper mill contributions are summarised as follows (COPE & STM, 2022, pp. 8-9; Else, 2021; Van Noorden, 2023):

Topic: Frequently in the field of cellular and molecular biology (but this changes all the time).
Experiments: Usually many Western blot experiments, cytometry assays, histology/cell staining.
Experimental data: Western blots often “too clean”, especially in the background; cytometry assays “too clean”; molecular weight markers for Western blot experiments missing.
Layout: Very similar layout (graphs, statistical error bars, fonts in figures, etc.) among the papers.
Affiliations: Author affiliations often not showing a specific university; departments mentioned sometimes not matching the topic of the paper.
Authors: Usually with no publishing record within the specific journal or elsewhere, using non-institutional email addresses, using new ORCIDs for each submission.
Experimental design: Flaws in experimental design found upon closer evaluation.
Missing ethical approval for animal experiments.
Substantial changes to the author list during revision or proof corrections.
Images re-used from a different paper published elsewhere.
Citations of other paper-mill studies.
Duplicate submissions across journals.
Using “tortured phrases”: expressions produced when a paraphrasing tool (a “spinner”) is applied to a well-established scientific expression with a specific and fixed meaning in order to disguise plagiarism, e.g., “counterfeit consciousness” in place of “artificial intelligence”.

A paper mill submission usually shows a combination of the characteristics mentioned above.
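As a rough illustration of how some of these checks could be combined, here is a minimal Python sketch. It is not a tool described in the cited reports. The phrase table contains only the one example given in this post, and the function names and checklist keys are hypothetical; a real screen would need a curated phrase list and human judgement on every flag.

```python
import re

# Tortured phrase -> established term it likely replaces.
# Only this entry comes from the post; a real screen needs a curated list.
TORTURED_PHRASES = {"counterfeit consciousness": "artificial intelligence"}


def find_tortured_phrases(text, phrases=TORTURED_PHRASES):
    """Return (phrase, expected_term, count) for each listed phrase found in text."""
    hits = []
    for phrase, expected in phrases.items():
        # Match the phrase's words in order, ignoring case and extra whitespace.
        pattern = r"\b" + r"\s+".join(map(re.escape, phrase.split())) + r"\b"
        count = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if count:
            hits.append((phrase, expected, count))
    return hits


def flagged_indicators(text, manual_flags):
    """Combine an editor's manual checklist with the automated phrase check.

    manual_flags maps a (hypothetical) indicator name to True or False.
    """
    flags = [name for name, present in manual_flags.items() if present]
    if find_tortured_phrases(text):
        flags.append("tortured_phrases")
    return flags


sample = ("We apply counterfeit  consciousness to cell images. "
          "Counterfeit consciousness is widely used.")
checklist = {
    "non_institutional_email": True,
    "missing_ethics_approval": False,
    "author_list_changed_in_revision": True,
}

print(find_tortured_phrases(sample))
print(flagged_indicators(sample, checklist))
```

The sketch only lists which indicators are present. How many flags, or which combination, should trigger a closer look is left to the editor, since the sources describe the indicators but do not give a scoring rule.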

Conclusion

The scholarly community (authors, peer reviewers, editors, publishers, research institutions, funders) should act together now to tackle the problem before its effects become too widespread to remedy, for example by:

Detecting the papers submitted by paper mills during the editorial or review process.
Developing and using technology to detect the papers where possible.
Investigating and retracting the papers passing the editorial process.
Changing incentives so that researchers are deterred from using services that offer quick but fake publications.
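To give a sense of what detection technology can look like at its simplest, the sketch below compares two texts by the overlap of their word sequences ("shingles"), using Jaccard similarity. This is a generic text-similarity technique chosen for illustration, not the method used by publishers or by the analyses cited above, and the example sentences are invented.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word sequences) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # both texts are shorter than k words
    return len(sa & sb) / len(sa | sb)


# Invented example sentences, for illustration only.
known = "the expression of the protein was measured by western blot analysis"
submitted = "the expression of the protein was measured by western blot assays"
unrelated = "we surveyed librarians about open access publishing policies"

print(jaccard_similarity(known, submitted))   # high overlap
print(jaccard_similarity(known, unrelated))   # no overlap
```

A high score only signals that two texts deserve a closer look. Legitimate papers also share standard wording, especially in methods sections, so a score like this cannot establish misconduct on its own.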

References

Abalkina, A. (2023). Publication and collaboration anomalies in academic papers originating from a paper mill: Evidence from a Russia-based paper mill. Learned Publishing, 36(4), 689-702. https://doi.org/10.1002/leap.1574

Byrne, J. A., & Christopher, J. (2020). Digital magic, or the dark arts of the 21st century—how can journals and peer reviewers detect manuscripts and publications from paper mills? FEBS Letters, 594(4), 583-589. https://doi.org/10.1002/1873-3468.13747

COPE, & STM. (2022). Paper Mills — Research report from COPE & STM.

Else, H. (2021). ‘Tortured phrases’ give away fabricated research papers. Nature, 596(7872), 328-329. https://doi.org/10.1038/d41586-021-02134-0

Kendall, G., & Teixeira da Silva, J. A. (2024). Risks of abuse of large language models, like ChatGPT, in scientific publishing: Authorship, predatory publishing, and paper mills. Learned Publishing, 37(1), 55-62. https://doi.org/10.1002/leap.1578

National Academy of Sciences, National Academy of Engineering, Institute of Medicine, & Committee on Science, Engineering, and Public Policy. (2009). Research Conduct. In On Being a Scientist: A Guide to Responsible Conduct in Research (3rd ed.). National Academies Press. https://doi.org/10.17226/12192

Van Noorden, R. (2023). How big is science’s fake-paper problem? Nature, 623(7987), 466-467. https://doi.org/10.1038/d41586-023-03464-x