1. Learn about GitHub

Git is a version control system, and GitHub is a service that hosts Git repositories. This means that you can store versions of your code on your computer (we can refer to this as “local”) and then also store a version on GitHub (called a “remote”). We do this so that if your computer explodes, your code is backed up. We also do it for reproducibility: because a record of changes is saved over time, you can easily revert to a previous version, or make earlier versions of your software available for use. GitHub is also really great because it connects with several services that are going to help us by way of triggers called webhooks.

The basic actions that you will run locally are to “commit,” which is a statement that “I am happy with these changes and want to save them to my local repository,” and then to “push,” which says “take my local changes and push them to the remote repository on GitHub.” This set of actions is part of the GitHub Flow, which is something that you should look over if you have never heard of it.
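The commit and push steps described above can be sketched as follows. This is a minimal illustration, not a required workflow: a local bare repository stands in for GitHub so the script runs offline, and the file name, commit message, and user identity are made up for the demo.

```shell
#!/bin/sh
# Sketch: commit locally, then push to a remote.
# Assumption: a local bare repository stands in for the remote on GitHub.
set -e
workdir=$(mktemp -d)
git init --quiet --bare "$workdir/remote.git"   # stand-in for the remote
git init --quiet "$workdir/local"               # the local repository
cd "$workdir/local"
git symbolic-ref HEAD refs/heads/master         # name the branch master, as in this guide
git config user.name "Waffles"                  # hypothetical identity for the demo
git config user.email "waffles@example.com"
git remote add origin "$workdir/remote.git"

echo "hello" > notes.txt                        # hypothetical file
git add notes.txt                               # stage the change
git commit --quiet -m "Add notes"               # commit: save to the local repository
git push --quiet origin master                  # push: send local commits to the remote

# After the push, both repositories point at the same commit.
local_commit=$(git rev-parse master)
remote_commit=$(git --git-dir="$workdir/remote.git" rev-parse master)
echo "local:  $local_commit"
echo "remote: $remote_commit"
```

With a real fork, origin would be the GitHub URL of your repository rather than a local path.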

2. Fork this repository

GitHub is also great because you can take someone else’s code base (for example, this repository) and do an action called a “fork.” A fork takes a repository and makes a copy of it under your own account. You would want to do this if you are contributing to the software and want to work on it, or if you want to use the repository as a template. For our purposes, we will do a combination of the two. By forking this repository to your account you can develop your own recipe and associated software, but keep a pointer to this repository (we call this the upstream) in case the underlying template ever changes (and you want to integrate these changes). The fork button is in the upper right of the repository page. Once forked, you will want to clone your fork of the repo to your computer. Let’s say my GitHub username is waffles, and I am using ssh:

git clone git@github.com:waffles/example.scif
cd example.scif

3. Set up your config

The Git config file, located at .git/config inside your clone, is the best way to keep track of many different forks of a repository. If you didn’t care about maintaining a link to the upstream (this repository) you could skip this step. But since we do, we want to edit the configuration file and add the upstream. I usually open it up right after cloning my fork to add the repository that I forked from as a remote, so I can easily get updates from it. Let’s say my .git/config first looks like this, after I clone my own fork:

      [core]
              repositoryformatversion = 0
              filemode = true
              bare = false
              logallrefupdates = true
      [remote "origin"]
              url = git@github.com:waffles/example.scif
              fetch = +refs/heads/*:refs/remotes/origin/*
      [branch "master"]
              remote = origin
              merge = refs/heads/master

I would want to add the upstream repository, which is where I forked from.

      [core]
              repositoryformatversion = 0
              filemode = true
              bare = false
              logallrefupdates = true
      [remote "origin"]
              url = git@github.com:waffles/example.scif
              fetch = +refs/heads/*:refs/remotes/origin/*
      [remote "upstream"]
              url = https://github.com/vsoch/example.scif
              fetch = +refs/heads/*:refs/remotes/upstream/*
      [branch "master"]
              remote = origin
              merge = refs/heads/master
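As an alternative to editing the file by hand, the same upstream entry can be created with git remote add, which writes the url and a fetch line that tracks upstream branches under refs/remotes/upstream/. The sketch below runs in a throwaway repository so it needs no network access. The remote URLs are the ones used in this example, and the temporary directory is only for illustration.

```shell
#!/bin/sh
# Sketch: add origin and upstream remotes from the command line.
# Assumption: a scratch repository is used so nothing is cloned or fetched.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# In a real clone of your fork, origin already exists and only the
# upstream line is needed.
git remote add origin git@github.com:waffles/example.scif
git remote add upstream https://github.com/vsoch/example.scif

# List the configured remotes
git remote -v

# Inspect what was written to .git/config for the upstream remote
upstream_url=$(git config --get remote.upstream.url)
upstream_fetch=$(git config --get remote.upstream.fetch)
echo "url:   $upstream_url"
echo "fetch: $upstream_fetch"
```

Opening .git/config afterwards should show a [remote "upstream"] section alongside the origin one.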

In the GitHub Flow, the master branch is the frozen, current version of the software. If you are making changes or adding a feature, you would check out a new branch. To update your master branch with changes from the upstream, you can do:

git checkout master
git pull upstream master
git push origin master
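The other half of this workflow, checking out a new branch for a change, might look like the sketch below. It uses a scratch repository so it can run anywhere. The branch name add/my-feature, the file recipe.txt, and the user identity are hypothetical.

```shell
#!/bin/sh
# Sketch: make a change on a feature branch, leaving master untouched.
# Assumption: a scratch repository with one initial commit on master.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git symbolic-ref HEAD refs/heads/master        # name the branch master, as in this guide
git config user.name "Waffles"                 # hypothetical identity for the demo
git config user.email "waffles@example.com"
echo "base" > recipe.txt                       # hypothetical file
git add recipe.txt
git commit --quiet -m "Initial commit"

# Create and switch to a new branch for the feature
git checkout --quiet -b add/my-feature
echo "new step" >> recipe.txt
git commit --quiet -am "Add a new step"

# master still has one commit; the feature branch has two
current_branch=$(git rev-parse --abbrev-ref HEAD)
master_commits=$(git rev-list --count master)
feature_commits=$(git rev-list --count HEAD)
echo "on branch: $current_branch"
echo "master commits: $master_commits, feature commits: $feature_commits"
```

In a real fork you would then push the branch with git push origin add/my-feature and open a pull request from it.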

More instructions for GitHub are provided in the Development docs.