How Do Papers Make into Machine Learning Frameworks: A Preliminary Study on Tensorflow
Pepe F.; Antoniol G.; Di Penta M.
2025-01-01
Abstract
An academic contribution to computer science becomes impactful when incorporated into a real software project. For machine learning (ML), open-source frameworks enable researchers to share their research output with other researchers and practitioners. However, such contributions, like any other changes, need to be properly reviewed. This paper reports preliminary findings of an investigation conducted on TensorFlow aimed at analyzing how contributions originating from scientific articles are reviewed, and how such a review process compares with the code review of conventional software systems. We have quantitatively and qualitatively analyzed 16 cases in which ideas/solutions from articles made it into TensorFlow after a pull request review, investigating (i) the nature of pull request review comments, (ii) the role of the reviewer, and (iii) the artifacts being reviewed or shared during the review process. The results show how, in line with previous investigations on the development process of ML systems, the code review process involves the interaction of data scientists and academics with software developers. Also, it interleaves phases assessing the scientific merits and compatibility of the article's solution with conventional code review focused on code readability and maintainability issues.


