How is LLNL doing in attracting external contributors?

We celebrate spack, flux, and some of our other large, successful projects. But what about the rest? The plot below shows the number of internal, external, and total contributors for each project (you will need to scroll and mouse over the bars to see details). We are primarily interested in external contributors. What do you see? 🤔️ If you would rather look at the raw contributor counts for each repository, see the interactive table.


Repository Summary

total repositories 807
repositories with more internal contributors 63.0 %
repositories with more external contributors 33.0 %
repositories with equal numbers of each type 3.0 %
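The percentages above come down to a simple comparison per repository. Here is a minimal sketch of how they might be computed, assuming each repository has already been reduced to a pair of (internal, external) unique contributor counts. The repository names and numbers below are made up for illustration and are not taken from the LLNL data.

```python
# Hypothetical per-repository counts: (internal, external) unique contributors.
# These names and numbers are illustrative only, not real LLNL data.
counts = {
    "org/repo-a": (12, 3),
    "org/repo-b": (2, 9),
    "org/repo-c": (4, 4),
    "org/repo-d": (7, 1),
}


def summarize(counts):
    """Return the share of repositories dominated by each contributor type."""
    total = len(counts)
    if total == 0:
        raise ValueError("no repositories to summarize")

    more_internal = sum(1 for i, e in counts.values() if i > e)
    more_external = sum(1 for i, e in counts.values() if e > i)
    equal = total - more_internal - more_external

    def pct(n):
        # Percentage of all repositories, rounded to one decimal place.
        return round(100.0 * n / total, 1)

    return {
        "total": total,
        "more_internal_pct": pct(more_internal),
        "more_external_pct": pct(more_external),
        "equal_pct": pct(equal),
    }


summary = summarize(counts)
for key, value in summary.items():
    print(key, value)
```

Because each percentage is rounded separately, the three shares do not always add up to exactly 100.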

Contributor Breakdown

It's a long tail! And if you answered "I can barely see the right side of the graph because the projects on the left are so much larger!" you are spot on. Note that spack/spack.io and LLNL/spotfe are also misleading here, because both are derived from templates and likely inherit the templates' contributors. We are also trying to show internal, external, and total contributors on the same plot, and the huge range (some repos have over 700 contributors, while the vast majority have under 10) makes it hard to read. Let's remove the largest projects (those with more than 100 total unique contributors) from the picture.
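The filtering step is just a threshold on the total. A minimal sketch, assuming each repository is a small record of internal and external counts and that the total is simply their sum (that is an assumption about how the totals are derived, and the records below are invented):

```python
# Hypothetical repository records; names and counts are illustrative only.
repos = [
    {"name": "org/big", "internal": 150, "external": 600},
    {"name": "org/mid", "internal": 60, "external": 40},
    {"name": "org/small", "internal": 5, "external": 2},
    {"name": "org/tiny", "internal": 1, "external": 0},
]


def filter_by_total(repos, max_total=100):
    """Keep repositories whose total contributor count is at most max_total.

    Assumes total = internal + external, i.e. nobody is counted in both groups.
    """
    return [r for r in repos if r["internal"] + r["external"] <= max_total]


filtered = filter_by_total(repos)
filtered_names = [r["name"] for r in filtered]
print(filtered_names)
```

Here a repository with exactly 100 contributors is kept, since only totals above 100 are removed. Switching `<=` to `<` would drop that boundary case too.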

Filtered Contributors < total N = 100

This is slightly better, but now we can see that the vast majority have fewer than 10 unique contributors. Now let's separate internal, external, and total more cleanly within this filtered set.

External vs Total

Filtered Contributors: External

Filtered Contributors: Total

Filtered Contributors: Internal

Interesting! So the projects most heavily worked on within the lab do not completely overlap with what we might consider the most popular projects from the above (e.g., spack, mfem, zfs). Does this suggest there is room to grow the external contributor base of these projects? Possibly. Finally, let's look at buckets of contributor counts.

Counting Project Contributors
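For readers who want to reproduce this kind of bucketing, here is a minimal sketch. It assumes fixed-width buckets of 10 (0-9, 10-19, and so on), which may not match the bins used in the charts, and the list of totals is invented for illustration.

```python
from collections import Counter


def bucket_label(count, width=10):
    """Map a contributor count to a label such as '0-9' or '10-19'."""
    low = (count // width) * width
    return f"{low}-{low + width - 1}"


def bucket_counts(values, width=10):
    """Count how many projects fall into each fixed-width bucket."""
    return Counter(bucket_label(v, width) for v in values)


# Hypothetical total contributor counts, one per project (not real data).
totals = [3, 7, 12, 18, 25, 4, 9]
buckets = bucket_counts(totals)

# Print buckets in ascending order of their lower bound.
for label in sorted(buckets, key=lambda s: int(s.split("-")[0])):
    print(label, buckets[label])
```

The same function can be applied separately to the external and internal counts to get one histogram per contributor type.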


Total Contributors

External Contributors

Internal Contributors

We see from the above that most projects are small, with under 20 total contributors and typically 10 or fewer external contributors. We can probably say that an "average" project has about 20 total contributors, with an almost (but not quite) even split between external and internal, with internal being the larger of the two. This metric might be off if a contributor uses a personal GitHub account that is not associated with LLNL (or if the data from the LLNL open source repository is off). So is this problematic? Only if some of these projects are truly useful to the open source and academic communities but are not widely known. A closer look at the individual projects, which is outside the scope of this small exploration, is needed. The charts above are generated via this repository and show the subset of packages defined at LLNL open source.