Open Source Collaboration Practices in Commercial Projects

Much of today's software engineering research relies on publicly available data from GitHub. But do findings from such studies translate to closed-source development projects?

After all, open source development has established a few practices and social conventions that might not be found in companies. In addition, GitHub lends itself to certain modes of collaboration more than to others.

To explore this question, I conducted an exploratory study together with Eirini Kalliamvakou, Daniela Damian, Kelly Blincoe, and Daniel M. German. To get first-hand accounts of how people actually worked (or perceived themselves to be working), we surveyed and interviewed software developers who use GitHub to work on closed-source projects.

We found that many commercial projects adopted practices that are more typical of OSS projects, e.g. reduced and asynchronous communication, independent work, and self-organization.

Disclaimer

Note: our study was exploratory and has no statistical power. We did not have access to the repositories or other tools our study’s participants worked with.

We have no reason to believe that people lied to us intentionally. However, there is a chance their memories were colored by their own biases and perceptions of themselves and their teams.

Nevertheless, our study contributes a data point to the discussion. Can research done on OSS projects on GitHub be translated to commercial teams that also use GitHub? How do practices flow between open source and commercial software development?

Independent Work

As a first insight, many study participants told us that they use a workflow with branching and pull requests. This helps keep their work independent (see Fig. 1).

Fig. 1: The workflow used according to 79% of interviews. Another 17% used a more complex variation of this.

This is similar to what many OSS projects do: branches and the resulting pull requests isolate individual development. Code reviews are performed on pull requests before merging.
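To make the workflow concrete, here is a minimal Python sketch of the branch, review, and merge cycle described above. It is not taken from our study: the `PullRequest` class, the names, and the rule that at least one reviewer other than the author must approve before a merge are all assumptions made for illustration. Real teams and GitHub's branch protection settings can be configured differently.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical, simplified model of a pull request for one branch."""
    branch: str
    author: str
    approvals: set = field(default_factory=set)
    merged: bool = False

    def approve(self, reviewer: str) -> None:
        # Assumption: authors cannot approve their own work.
        if reviewer == self.author:
            raise ValueError("authors cannot review their own pull request")
        self.approvals.add(reviewer)

    def merge(self, main_history: list) -> None:
        # Assumption: at least one approval is required before merging.
        if not self.approvals:
            raise RuntimeError("pull request needs a review before merging")
        main_history.append(self.branch)
        self.merged = True


# Work happens in isolation on a branch, then flows into main via review.
main_history = []
pr = PullRequest(branch="feature/login", author="alice")
pr.approve("bob")
pr.merge(main_history)
```

The point of the sketch is the ordering: work stays isolated on its branch until someone else has looked at it, and only then does it reach the shared main line.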

Awareness

The second insight from our study concerns awareness. Activity is very transparent on GitHub: there are discussions in pull requests and issues, and one can easily see who committed what and when to which repository.

Our study showed that commercial projects rely on GitHub’s transparency to support their communication and coordination needs. This is especially helpful for resolving questions and problems that come up during code reviews and merges.

Different interviewees told us about different tools to achieve this awareness: we heard about developers who relied on the issue list, the commit list, notification emails, or a chat client that integrates with GitHub.

Again, this use is similar to what open source projects do. Communication is very code-centric, and pull requests act as a central coordination mechanism.

Self-Organization

The third effect that GitHub has on commercial teams is that it enables self-organization.

Many interviewees reported that they rely on self-organization, as opposed to top-down task assignment, when choosing which tasks to work on and when resolving conflicts. The decision about what to work on is often based on a developer’s specific expertise and availability.

One interviewee told us:

“Sometimes there is a task that is very, very specific to an issue and one or two developers have worked on it in the past and they will take it up.”

Self-organization is an important part of achieving independent work. And while it seems to work well today, we believe that code analytics tools that help developers figure out where their expertise is needed the most could be an important next step in evolving this practice.
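As a rough idea of what such a tool might compute, the following Python sketch ranks developers by how often they have previously committed to the files a task touches. This is an assumption-laden illustration, not something we built or evaluated: the function `rank_by_experience`, the `(author, path)` commit format, and the sample data are all hypothetical, and counting commits is only one crude proxy for expertise.

```python
from collections import Counter


def rank_by_experience(commits, touched_files):
    """Rank authors by the number of past commits to the given files.

    `commits` is an iterable of (author, path) pairs, a hypothetical
    simplification of a real commit log.
    """
    touched = set(touched_files)
    counts = Counter(author for author, path in commits if path in touched)
    # most_common() sorts by count, highest first.
    return [author for author, _ in counts.most_common()]


# Hypothetical commit history.
commits = [
    ("alice", "billing/invoice.py"),
    ("bob", "billing/invoice.py"),
    ("alice", "billing/tax.py"),
    ("carol", "ui/login.py"),
]

candidates = rank_by_experience(
    commits, ["billing/invoice.py", "billing/tax.py"]
)
print(candidates)
```

A real tool would likely also weigh recency and current workload, since our interviewees mentioned availability alongside expertise.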

Challenges with Non-Technical Team Members

But it’s not all roses. One of the more frequently reported challenges with this GitHub-based, open-source-inspired development workflow concerns the inclusion of non-technical team members. GitHub’s interface might make things incredibly easy for developers, but it might not be as accessible for others.

For example, interviewees reported that less technical people from their teams would use more generic task tracking tools such as Asana.

A related challenge is the intersection between developers’ activities and project management. Participants felt that it created additional coordination and communication requirements, e.g. to help managers monitor progress. The problem was perceived as less severe when at least one member of the management team was also a programmer, but at least in our data set this was reported as the exception rather than the rule.

However, most of our study participants were aware of and had reflected on such challenges, which helped them and their teams adopt coping strategies such as external task trackers.

Conclusions

Many companies from our study that use GitHub internally seem to have adopted ideas from open source: a workflow based on branches and pull requests with code reviews, lightweight communication and coordination, awareness through transparency, and a certain level of self-organization.

Fig. 2 shows a table of practices and what GitHub features enable them. The right-most column shows the effect the practice has on collaboration.

oss-style-fig2
Fig. 2: Practices reported in commercial projects, the corresponding enablers from GitHub, and the supported collaboration elements.

Our paper, accepted at ICSE 2015, discusses how GitHub’s transparency and workflow can promote open collaboration, allowing organizations to increase code reuse and promote knowledge sharing across their teams.

Download a preprint of our paper here.

What Do You Think?

The most fascinating idea here is that practices developed in the open source world are increasingly spreading into industry. Open source developers advocate for using certain tools in their day jobs, which leads to companies using the same tools that OSS projects use. Those tools are built to work best with certain practices, so the practices themselves tend to be adopted in companies as well.

What are your thoughts on this? How do development processes differ between open source projects and the work done in companies? What are the similarities? How do they influence each other? I’d love to hear about your own experiences in the comments below.