It’s hard not to notice that training in computer science has been streamlined primarily to fill industry engineering jobs. In this respect, a computer science degree from a top university is a lucrative choice for a young person to make. The enterprise tools that come out of this pipeline are impressive: advances in machine learning, GPUs, and mobile technologies that are changing the world. However, amid this overwhelming positivity, it’s easy to overlook the empty space, or more specifically, to forget to ask where those same graduates aren’t going.
As a research software engineer, I came out of traditional academia and experienced firsthand the unintended result of this practice. Practicing reproducible science, which at the simplest level means conducting an analysis to answer a scientific question and then being able to do it again, was really challenging. It was ironic to me that I was sitting at the center of technology and innovation in Silicon Valley, yet the actual practice of research involving programming and software was at least a decade behind. I also came to realize that standard, elegant practices from software engineering, namely version control, testing, and open source development, were all that was needed to solve many of these issues. I didn’t see job security or even a clearly paved career path in front of me, but I decided that the need was dire enough, and inspiring enough to me, that I wanted to pursue it. I abandoned a traditional academic route, and turned down those same industry roles, because I wanted to bring back the layer of software engineers that is missing from academia.
I work as a software engineer for the Stanford Research Computing Center with this goal as my prime focus. Over the last two years I have come to realize the power of open source development. It is more than opening a pull request on GitHub; it means thinking about both the design of systems and the incentives of the users that flow through them. What I’ve learned is that open source is powerful and inspiring, but in an ironic twist, the culture is underutilized where it is most badly needed: in the development of strong, quality open source tools for academia.
There are two primary issues at hand. The first is that the training of young scientists does not provide the basic software engineering skills needed to practice sound work. The second is that the flow of people and incentives that guides the development of tools does not cater to the academic researcher. Thus, it seems logical to develop a framework of teaching that might drive development of these missing tools and train new scientists in the process. My intention as a research software engineer and educator is to do impactful work to address these issues, and I realize that to have the impact I desire, research software engineers must go beyond the expected role and bring this vision fully into an academic department. Open source software engineering must be brought into the traditional university environment, so that students are familiar with its practices from day one. An institution should have a team of staff, researchers, and students that procures funding and takes ownership of essential projects related to reproducible scientific software and infrastructure. Departments traditionally treated as services, like research computing, should be involved in the process of research and allowed to write grants to secure funding for it. There must be a wider selection of hats to wear beyond solely PI, researcher, or teacher. This is the rationale for the Open Source Lab - a strategy and culture in computer science education that teaches by way of actual participation in open source development, and does collaborative work on real world problems.
This is a logical avenue to pursue. We are lacking tools and practices for reproducible science. These practices, and the tooling that results from them, are championed by open source. Thus, we can kill two birds with one stone: bringing open source software engineering into the university can inspire a team of students to learn while building valuable resources for the larger community.