This brief overview explains how git and GitHub work so that users with little or no prior experience with version control systems can contribute to DObsiSS with relative ease. In addition to this page, the following resources may be helpful:
Git is a version control system that allows people to modify or contribute to open source or open access projects in a non-intrusive and organized manner. GitHub is one particular service for git users that provides free hosting for smaller projects and offers a streamlined collaborative environment for such work. Git is analogous to a common language that GitHub and other independent servers or centralized services are built upon and communicate through.
GitHub builds on git with a forking mechanism that allows users to duplicate an existing repository into their own account, where the copy can be modified as they see fit. A fork is considered downstream of the repository that it is based upon, which is in turn upstream of the fork. Each fork is a repository in its own right, and may thus be forked and expanded upon by other users. In sum, forking allows a user to create their own copy of the original repository (hosted on GitHub) while maintaining a direct connection between the two.
In order to modify the contents of a repository, one must download, or clone, the contents of their fork from GitHub (the remote server) to their local machine. Cloning involves not only copying all of the repository's contents, but also configuring the local directory in which they are held to track any changes that a user makes. Any modifications made to files in this tracked directory are referred to as diffs, and are defined by either the addition or the removal of information.
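To make the idea of a diff concrete, the following minimal sketch compares two versions of a small file and keeps only the added and removed lines. It uses Python's standard `difflib` module as a stand-in for git's own diff machinery, and the file name `records.csv` and its contents are hypothetical examples, not actual DObsiSS data.

```python
import difflib

# Two versions of a hypothetical data file, one list entry per line.
before = ["id,source\n", "1,Example A\n", "2,Example B\n"]
after = ["id,source\n", "1,Example A\n", "3,Example C\n"]


def changed_lines(old, new):
    """Return only the removed (-) and added (+) lines of a unified diff."""
    diff = difflib.unified_diff(old, new, fromfile="records.csv", tofile="records.csv")
    return [
        line.rstrip("\n")
        for line in diff
        # Keep change lines, but skip the "--- " / "+++ " file header lines.
        if line.startswith(("+", "-")) and not line.startswith(("+++ ", "--- "))
    ]


print(changed_lines(before, after))
# → ['-2,Example B', '+3,Example C']
```

A line that was edited shows up as one removal plus one addition, which is why a diff can be described entirely in terms of information added or removed.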
When a set of changes is made, the contributor must commit them along with a brief note. A commit is essentially a group of diffs amounting to one substantial modification. In a collaborative environment, commits amount to individual contributions. Once one or more commits are made, a user may want to push them to their repository on the remote GitHub server so that others may view them. Optionally, one can set up and push to multiple branches of their repository, which are independent, parallel versions of its contents. Maintaining several branches allows one to work on different aspects of a project independently of one another. For example, one branch could be used simply for regularly adding records to the dataset, while another could hold changes amounting to a larger overhaul of the contents of the repository.
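The commit, branch, and push steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the DObsiSS workflow itself: a local bare repository stands in for the fork on GitHub so that nothing touches the network, and the file name, branch name `overhaul`, commit notes, and contributor identity are all hypothetical. The `git` helper simply runs the ordinary git command line, so each call corresponds to a command that could be typed in a terminal.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List


def git(cwd: Path, *args: str) -> str:
    """Run one git command in `cwd` and return its trimmed standard output."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def commit_branch_push(workspace: Path) -> List[str]:
    """Clone a stand-in remote, commit on two branches, and push one branch."""
    remote = workspace / "fork.git"  # hypothetical stand-in for a fork on GitHub
    local = workspace / "clone"      # the contributor's local working copy

    git(workspace, "init", "--quiet", "--bare", str(remote))
    git(workspace, "clone", "--quiet", str(remote), str(local))

    # Identity used for this throwaway repository only (hypothetical values).
    git(local, "config", "user.name", "Example Contributor")
    git(local, "config", "user.email", "contributor@example.org")

    # First commit: a group of changes recorded with a brief note.
    (local / "records.csv").write_text("id,source\n1,Example A\n")
    git(local, "add", "records.csv")
    git(local, "commit", "--quiet", "-m", "Add first record")

    # A separate branch for larger changes, independent of the default branch.
    git(local, "checkout", "--quiet", "-b", "overhaul")
    (local / "records.csv").write_text("id,source,region\n1,Example A,Example region\n")
    git(local, "commit", "--quiet", "-am", "Add region column")

    # Push the branch to the remote so that others can see it.
    git(local, "push", "--quiet", "origin", "overhaul")

    # Read the commit notes back from the remote side to confirm the push.
    return git(remote, "log", "--format=%s", "overhaul").splitlines()


pushed = None  # stays None when git is not installed
if shutil.which("git"):
    with tempfile.TemporaryDirectory() as tmp:
        pushed = commit_branch_push(Path(tmp))
    print(pushed)
```

With a real fork, the clone step would use the fork's GitHub address in place of the local path, and the remaining commands would be the same.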
After a user has pushed changes to their repository, they can submit a pull request, which compares the contents of their repository to the upstream repository that it was forked from. All differences are highlighted, and anyone can comment on specific changes or on the modifications as a whole. A dedicated URL is generated for each pull request, which can be shared with colleagues who have not contributed data but still wish to participate in the discussion. After reviewing the modifications and considering any issues raised, DObsiSS moderators can merge the pull request into the main repository if they are content with the changes. If the modifications are inadequate, the pull request may be closed, or left open indefinitely for further discussion.
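The comparison that a pull request presents can be approximated locally before one is opened. GitHub performs this comparison on its servers between a fork and its upstream repository; the sketch below is only a local analogue that compares two branches of a single throwaway repository. The branch names `main` and `new-records`, the file names, and the commit notes are assumptions chosen for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Tuple


def git(cwd: Path, *args: str) -> str:
    """Run one git command in `cwd` and return its trimmed standard output."""
    done = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return done.stdout.strip()


def preview_pull_request(repo: Path) -> Tuple[List[str], List[str]]:
    """List the commits and files by which a contribution branch differs from its base."""
    git(repo, "init", "--quiet")
    git(repo, "config", "user.name", "Example Contributor")       # hypothetical
    git(repo, "config", "user.email", "contributor@example.org")  # hypothetical

    # Base branch, standing in for the upstream repository's contents.
    (repo / "records.csv").write_text("id,source\n1,Example A\n")
    git(repo, "add", "records.csv")
    git(repo, "commit", "--quiet", "-m", "Add first record")
    git(repo, "branch", "-M", "main")  # name the base branch explicitly

    # Contribution branch, standing in for the changes pushed to a fork.
    git(repo, "checkout", "--quiet", "-b", "new-records")
    with (repo / "records.csv").open("a") as handle:
        handle.write("2,Example B\n")
    git(repo, "commit", "--quiet", "-am", "Add second record")
    (repo / "notes.txt").write_text("Example note\n")
    git(repo, "add", "notes.txt")
    git(repo, "commit", "--quiet", "-m", "Add notes file")

    # Commits on the contribution branch that the base branch lacks.
    commits = git(repo, "log", "--format=%s", "main..new-records").splitlines()
    # Files changed on the contribution branch since it diverged from the base.
    files = git(repo, "diff", "--name-only", "main...new-records").splitlines()
    return commits, files


preview = None  # stays None when git is not installed
if shutil.which("git"):
    with tempfile.TemporaryDirectory() as tmp:
        preview = preview_pull_request(Path(tmp))
    print(preview)
```

The two-dot range in `git log` selects commits reachable from the contribution branch but not from the base, while the three-dot form in `git diff` compares against the point where the two branches diverged, which is close to what reviewers see in a pull request.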