An introduction to version control with git

Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Academic Computing:
 http://ac.seas.harvard.edu/
Support:

This document may be viewed as a slideshow.

Contents

Who are you?

Lars Kellogg-Stedman <lars@seas.harvard.edu>
Senior Technologist, SEAS Academic Computing

Please ask me lots of questions.

Interactive presentations are much more fun.

Why are we here?

What is version control?

Why use version control?

It makes your life easier.

Why use version control?

It makes it easier to work with others.

Types of version control

There are two version control models in common use:

Centralized version control

Developers check out working copies.

images/central-1.png

Centralized version control

Someone commits bad code to repository.

images/central-2.png

Centralized version control

Changes are visible to everyone.

images/central-3.png

Distributed version control

Most recent version control systems use a distributed model.

Distributed version control

Developers check out working copies.

images/central-1.png

Distributed version control

Someone commits bad code to local repository.

images/dvcs-2.png

Distributed version control

Fixes locally and pushes to remote repository.

images/dvcs-3.png

Distributed version control

Everyone is happy.

images/dvcs-4.png

Distributed version control

There is no spoon.

images/nospoon.png

In the world of distributed version control, the idea of a central repository is a social construct rather than a technical one. While some projects may find it convenient to identify a central repository, git (and other DVC systems) do not enforce a hub and spoke configuration.

For some of my own projects I have something of an "inverted tree": my working copies push to two remote repositories. One is a "personal" repository, which I use to coordinate my work between my office, my laptop, and so forth. The other is a "public" repository, where I push my code when I want others to see it.

Centralized vs. Distributed

It may sound like I am suggesting that distributed version control is generally better than centralized version control.

However, some of the developers of Subversion have suggested that a distributed model makes it less likely that people will share code with others (while in a centralized system they are largely forced to if they want to take advantage of the version control system).

git: Installing git

git: Initial configuration

You will need to set your name and email address:

git config --global user.name "Your Name"
git config --global user.email "you@harvard.edu"

Configuration values can be set globally (using --global) and/or per-repository.
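
To see the two scopes interact, here is a small sketch. It points HOME at a scratch directory so it does not touch your real settings; the project@example.org address and the scratch directories are made up for the demo. A per-repository value wins over the global one, and anything not set in the repository falls back to the global value.

```shell
# Scratch HOME so the sketch leaves your real ~/.gitconfig alone (demo assumption).
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name "Your Name"
git config --global user.email "you@harvard.edu"

# A hypothetical repository that overrides only the email address.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "project@example.org"

global_email="$(git config --global --get user.email)"
effective_email="$(git config --get user.email)"
effective_name="$(git config --get user.name)"

echo "global email:   $global_email"
echo "effective here: $effective_name <$effective_email>"
```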

git: Initial configuration

This sets your preferred editor (for commit messages, etc.):

git config --global core.editor vim

This turns on some nice color output:

git config --global color.ui auto

git: Getting help

git: Creating a repository

Use git init to create a new git repository (here, in a new directory named myrepo):

$ git init myrepo

With older versions of git:

$ mkdir myrepo; cd myrepo; git init

[documentation]

git init creates a git repository (named .git) in your project directory. You will add files to this repository using git add. This gives you a "repository" (the .git directory) and a "working copy" (everything else).

If you are going to start tracking an existing project with git, you will often start like this:

$ git init
Initialized empty Git repository in .../.git/
$ git add .
$ git commit -m 'initial import'
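
The same import can be tried end to end in a scratch directory. This is only a sketch: the two files are made up, and the identity is set per-repository so that it also runs on a machine with no global configuration.

```shell
# Hypothetical existing project with two files (names are made up).
project="$(mktemp -d)"
cd "$project"
echo 'int main(void) { return 0; }' > main.c
echo 'all: main' > Makefile

git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

git add .
git commit -q -m 'initial import'

# What is tracked now, and whether anything is still uncommitted.
tracked="$(git ls-files)"
pending="$(git status --porcelain)"
echo "$tracked"
```

An empty result from git status --porcelain means the working copy matches the last commit.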

git: Adding files

git add schedules files to be committed to the repository:

git add PATH [PATH ...]

[documentation]

If you modify a file you (generally) need to git add that file in order to make the changes part of the next commit. If the file is already part of the repository you can name it explicitly on the git commit command line:

git commit path/to/myfile.c

You can provide a directory to commit any modified files contained in that directory tree:

git commit path/to/

git: Committing changes

Use git commit to commit files to your local repository:

git commit [-a] [-m message] [PATH ...]

[documentation]

git commit by itself will commit any changes scheduled using git add:

$ git commit -m "added a file"
[master 65abe48] added a file
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 file1

If you would like to commit all locally modified files, use the -a option:

$ git commit -m 'lots of changes' -a
[master 2553f3a] lots of changes
 3 files changed, 3 insertions(+), 3 deletions(-)

You may also commit a subset of modified files by specifying paths on the command line:

git commit path/to/modified/file
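
A short sketch of the difference between naming a path and using -a, assuming a scratch repository with two made-up files:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo one > file1
echo two > file2
git add file1 file2
git commit -q -m 'added two files'

# Modify both tracked files.
echo more >> file1
echo more >> file2

# Naming a path commits only that file; file2 stays modified.
git commit -q -m 'changed file1' file1
after_path="$(git status --porcelain)"

# -a commits every remaining modified (tracked) file.
git commit -q -m 'changed everything else' -a
after_all="$(git status --porcelain)"
commits="$(git rev-list --count HEAD)"

echo "after path commit: '$after_path'"
echo "after -a commit:   '$after_all' ($commits commits)"
```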

git: What's changed: status

Use git status to see a list of modified files:

git status

[documentation]

The output of git status will look something like this:

$ git status
# On branch master
#
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#  modified:   Makefile
#
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#  modified:   workshop.rst
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#  examples/
no changes added to commit (use "git add" and/or "git commit -a")
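
If you want to reproduce all three sections yourself, the following sketch sets up one staged file, one modified-but-unstaged file, and one untracked directory in a scratch repository (file names borrowed from the output above, contents made up). It prints the compact form, git status --short, where the first column describes the index and the second the working copy.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo a > Makefile
echo b > workshop.rst
git add .
git commit -q -m 'initial import'

echo change >> Makefile
git add Makefile               # staged: will be in the next commit
echo change >> workshop.rst    # modified, but not yet added
mkdir examples
echo x > examples/demo.txt     # never added: untracked

short="$(git status --short)"
echo "$short"
```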

Files listed under "Changes to be committed" will be committed to the repository next time you run git commit.

Files listed under "Changed but not updated" (newer versions of git label this section "Changes not staged for commit") are files that you have modified but have not yet scheduled for commit with git add.

"Untracked files" are files that have not previously been added to the repository.

git: Correcting mistakes

You can use git reset to undo a git add or git commit operation (with some limitations):

git reset [--hard] REFSPEC [PATH]

With --hard, git reset will revert the contents of files.

[documentation]

Using git reset to "undo" an add operation:

$ git add newfile
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#  new file:   newfile
#
$ git reset HEAD
$ git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#  newfile
nothing added to commit but untracked files present (use "git add" to track)

Using git reset to revert to a previous version of your repository:

$ git reset 09fe2ee89cc19202bf4c12b61ae19c10a723bf67
Unstaged changes after reset:
M  Makefile
M  README.rst
M  git.rst
M  workshop.rst
$ git status
# On branch master
# Your branch is behind 'origin/master' by 12 commits, and can be fast-forwarded.
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#  modified:   Makefile
#  modified:   README.rst
#  modified:   git.rst
#  modified:   workshop.rst
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#  backmatter.rst
#  publish.sh
no changes added to commit (use "git add" and/or "git commit -a")

The repository has been reset to a previous version, but the working copy still contains the most recent version of the files. Using --hard we can reset the working copy, too:

$ git reset --hard 09fe2ee89cc19202bf4c12b61ae19c10a723bf67
HEAD is now at 09fe2ee renamed version-control.rst -> workshop.rst
$ git status
# On branch master
# Your branch is behind 'origin/master' by 12 commits, and can be fast-forwarded.
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#  backmatter.rst
#  publish.sh
nothing added to commit but untracked files present (use "git add" to track)

Files added since the target commit now show up as untracked.

NB: You should never use git reset to revert a change that has been pushed to a remote repository. Look at the documentation for git revert for an alternative.
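
Here is a minimal sketch of the two behaviours side by side, using a scratch repository and a made-up notes.txt. Nothing is pushed anywhere, so the warning above does not apply.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo 'version 1' > notes.txt
git add notes.txt
git commit -q -m 'first version'
first="$(git rev-parse HEAD)"

echo 'version 2, with more text' > notes.txt
git commit -q -m 'second version' -a

# Without --hard: the branch moves back, the file keeps its newer contents.
git reset -q "$first"
kept="$(cat notes.txt)"
status_after_reset="$(git status --porcelain)"

# With --hard: the working copy is rewritten to match the target commit.
git reset -q --hard "$first"
restored="$(cat notes.txt)"

echo "after reset:        $kept"
echo "after reset --hard: $restored"
```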

git: Correcting mistakes

You can use git revert to undo changes made in a previous commit:

git revert REFSPEC

This will create a new commit that reverses the changes in the selected commit. git revert is safe to use when sharing repositories with other people.

Before running git revert:

$ git log
commit 518457aab20420b97eeaae5f0235dedf5be19443
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 21:58:24 2012 -0500

    made changes

commit e351a670bab6d6b4913f3eec83f37b9c84b408cb
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 21:58:10 2012 -0500

    initial commit

Revert the most recent changes:

$ git revert HEAD

This will result in a new commit:

$ git log -1
commit bc2cb9548032644869269afc60ec56af7a0cad66
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 21:58:42 2012 -0500

    Revert "made changes"

    This reverts commit 518457aab20420b97eeaae5f0235dedf5be19443.
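
To convince yourself that nothing is thrown away, the sketch below replays the sequence in a scratch repository (file contents are made up). The --no-edit option accepts the default message instead of opening your editor.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo 'original' > file1
git add file1
git commit -q -m 'initial commit'

echo 'changed' > file1
git commit -q -m 'made changes' -a

git revert --no-edit HEAD > /dev/null

contents="$(cat file1)"
subject="$(git log -1 --format=%s)"
commits="$(git rev-list --count HEAD)"

echo "file1 now contains: $contents"
echo "latest commit:      $subject"
echo "commits in history: $commits"
```

The file is back to its original contents, but the history now holds three commits rather than one.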

git: Renaming files

Use git mv to rename files in the repository:

git mv SRC DST

[documentation]

git mv automatically adds your changes to the index. You need to commit the change to make it stick:

$ git mv file1 file4
$ git commit -m "renamed a file"
[master b26051a] renamed a file
 1 files changed, 0 insertions(+), 0 deletions(-)
 rename file1 => file4 (100%)

git: Removing files

Use git rm to remove files from the repository:

git rm PATH [...]

[documentation]

Like git mv, git rm automatically adds your changes to the index:

$ git rm file4
rm 'file4'
$ git commit -m "removed file4"
[master b1e4267] removed file4
 1 files changed, 0 insertions(+), 1 deletions(-)
 delete mode 100644 file4

git: What's changed: diffs

Use git diff to see pending changes in your working copy:

git diff

[documentation]

The output of git diff is standard diff output, e.g.:

$ git diff
diff --git a/version-control.rst b/version-control.rst
index e518192..b1c519a 100644
--- a/version-control.rst
+++ b/version-control.rst
@@ -243,6 +243,34 @@ commit`` to commit them to the (local) repository::
 Using git: What's changed?
 ==========================

+Use ``git status`` to see a list of modified files::
+
+  git status
+
+.. container:: handout
+
+   The output will look something like this::
+

You can also use git diff to see the changes between arbitrary revisions of your project:

  • Changes in working copy vs. previous commit:

    git diff <commit>
    
  • Changes between two previous commits:

    git diff <commit1> <commit2>
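
Both forms can be tried in a scratch repository. In this sketch the file and its contents are made up, HEAD names the most recent commit, and HEAD~1 is git's shorthand for the commit before it.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo 'line 1' > file1
git add file1
git commit -q -m 'first'

echo 'line 2' >> file1
git commit -q -m 'second' -a

echo 'line 3' >> file1    # uncommitted edit in the working copy

# Working copy vs. the most recent commit:
pending="$(git diff HEAD)"

# Between two earlier commits (the uncommitted edit does not appear):
between="$(git diff HEAD~1 HEAD)"

echo "$between"
```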
    

git: Viewing history

The git log command shows you the history of your repository:

git log [PATH]

[documentation]

git log with no arguments shows you the commit messages for each revision in your repository:

$ git log
commit 7c8c3e71893d7481fdd9c13ec8f53cb9c61fac50
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Thu Mar 18 12:46:46 2010 -0400

    changed GNU to Microsoft

commit 257f2f3ff44c2165c1182d3673a825fcadf121aa
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Thu Mar 18 12:46:46 2010 -0400

    made a change

commit 99c4fb8f37e48284d79c7396aaf755b514d6a249
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Thu Mar 18 12:46:45 2010 -0400

    made some changes

commit 20cc63576f7c88541f5b9471e20f4d1c5f8afcb9
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Thu Mar 18 12:46:45 2010 -0400

    initial import

git: Tagging and branching

git: Tags

Create a tag:

git tag [-a] TAGNAME

git: Tags

List tags:

git tag

Information about a specific tag:

git show TAGNAME

Listing tags:

$ git tag
v1.0
v1.1
v2.0

Showing the commit associated with a lightweight tag:

$ git show v1.1
commit c14a08cf97c2492ca315292f5cfce41ae45581df
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 11:55:31 2012 -0500

    added file2

diff --git a/file2 b/file2
new file mode 100644
index 0000000..d709a6d
--- /dev/null
+++ b/file2
@@ -0,0 +1 @@
+Wed Feb 22 11:55:27 EST 2012

Showing an annotated tag:

$ git show v2.0
tag v2.0
Tagger: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 11:56:31 2012 -0500

This is release v2.0, which has lots of exciting new features.

commit c14a08cf97c2492ca315292f5cfce41ae45581df
Author: Lars Kellogg-Stedman <lars@seas.harvard.edu>
Date:   Wed Feb 22 11:55:31 2012 -0500

    added file2

diff --git a/file2 b/file2
new file mode 100644
index 0000000..d709a6d
--- /dev/null
+++ b/file2
@@ -0,0 +1 @@
+Wed Feb 22 11:55:27 EST 2012
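
The two kinds of tag shown above differ in how git stores them: a lightweight tag is just a name for a commit, while an annotated tag is a separate object carrying a tagger, a date and a message. A rough sketch in a scratch repository (tag names borrowed from the listing above, the message passed with -m to avoid opening an editor):

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo hello > file2
git add file2
git commit -q -m 'added file2'

git tag v1.1                               # lightweight tag
git tag -a v2.0 -m 'This is release v2.0'  # annotated tag

tags="$(git tag)"
light_type="$(git cat-file -t v1.1)"   # object type the tag name resolves to
annot_type="$(git cat-file -t v2.0)"

echo "$tags"
echo "v1.1 is a $light_type, v2.0 is a $annot_type"
```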

git: Branches

List branches:

git branch [-a]

Create a branch rooted at START and switch to it:

git checkout -b BRANCHNAME [START]

[documentation]

Some people have put a lot of thought into branching workflows. For example, read about gitflow.

git: Branches

Switch to an existing branch:

git checkout BRANCHNAME

For example, you want to enhance your code with some awesome experimental code. You create a new seas-workshop-dev branch and switch to it:

$ git checkout -b seas-workshop-dev

You make some changes, and when things are working you commit your branch:

$ git commit -m 'made some awesome changes' -a

And then merge it into the master branch:

$ git checkout master
$ git merge seas-workshop-dev
Updating 1288ed3..33e4a4c
Fast-forward
 version-control.rst |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

git: Merging

Use git merge to combine two (or more) development histories:

git merge BRANCH

Incorporates changes in BRANCH into the current branch.

You will find yourself merging changes if you take advantage of branches to pursue parallel lines of development. For example, let's say you create a branch to work on a specific feature:

$ git checkout -b feature/xyz

Meanwhile, work continues on the master branch. When you have completed your work on the xyz feature, you need to merge your changes back into the master branch:

$ git checkout master
$ git merge feature/xyz
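
When both branches have moved on, the merge cannot simply fast-forward, and git records a merge commit with two parents. The sketch below shows this in a scratch repository; the file names are made up, and the starting branch name is looked up rather than assumed, since it may be master or main depending on your setup.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo base > README
git add README
git commit -q -m 'initial commit'
main_branch="$(git symbolic-ref --short HEAD)"

git checkout -q -b feature/xyz
echo feature > xyz.txt
git add xyz.txt
git commit -q -m 'work on xyz'

# Meanwhile, work continues on the original branch.
git checkout -q "$main_branch"
echo other > other.txt
git add other.txt
git commit -q -m 'unrelated work'

# Diverged histories: the merge creates a commit with two parents.
git merge -q --no-edit feature/xyz
parents="$(git cat-file -p HEAD | grep -c '^parent ')"

echo "merge commit has $parents parents"
```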

git: Cloning a remote repository

Use the git clone command to check out a working copy of a remote repository:

git clone REPOSITORY [DIRECTORY]

[documentation]

git clone will clone the remote repository to a new directory in your current directory named after the repository, unless you explicitly provide a name with the DIRECTORY argument:

$ git clone git://github.com/seasac-ops/git-workshop.git
Initialized empty Git repository in /home/lars/tmp/git-workshop/.git/
remote: Counting objects: 331, done.
remote: Compressing objects: 100% (92/92), done.
remote: Total 331 (delta 228), reused 324 (delta 221)
Receiving objects: 100% (331/331), 556.03 KiB, done.
Resolving deltas: 100% (228/228), done.
$ ls
git-workshop

This is analogous to Subversion's checkout operation.

You can only clone the top-level repository; unlike Subversion, git does not allow you to clone individual subtrees.

git: Updating your working copy

Use git pull to update your local repository from the remote repository and merge changes into your working copy:

git pull [REMOTE [REFSPEC]]

Here REMOTE is a label specifying a remote repository and REFSPEC typically refers to a remote branch.

[documentation]

git pull by itself will pull changes from the remote repository defined by the branch.master.remote config option (which will typically be the repository from which you originally cloned your working copy). If there are multiple remote repositories associated with your working copy, you can specify a repository (and branch) on the command line, e.g., to pull changes from the branch master at a remote named origin:

$ git pull
remote: Counting objects: 7, done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 4 (delta 3), reused 4 (delta 3)
Unpacking objects: 100% (4/4), done.
From git://github.com/seasac-ops/git-workshop
   9e60e6e..6437a50  master     -> origin/master
Updating 9e60e6e..6437a50
Fast forward
 git.rst    |   46 ++++++++++++++++++++++++++++++++++++++++++----
 publish.sh |    3 +--
 2 files changed, 43 insertions(+), 6 deletions(-)

Behind the scenes, git pull is doing a git fetch followed by a git merge.
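
The two steps can be run by hand to see what each one does. This sketch stands in for a real remote with a bare repository on local disk; the alice and bob clones, the file contents, and the directory layout are all made up for the demo.

```shell
# Identity for every repository in this sketch (normally set with git config).
export GIT_AUTHOR_NAME="Your Name" GIT_AUTHOR_EMAIL="you@harvard.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
cd "$work"

# Stand-in for the remote: a bare repository on local disk.
git init -q seed
(cd seed && echo 'v1' > README && git add README && git commit -q -m 'initial commit')
git clone -q --bare seed shared.git
git clone -q shared.git alice
git clone -q shared.git bob

# Alice publishes a change.
(cd alice && echo 'v2 from alice' >> README &&
  git commit -q -m 'update README' -a && git push -q origin HEAD)

cd bob
branch="$(git symbolic-ref --short HEAD)"   # master, or main on some setups

# First half of "git pull": download, but leave the working copy alone.
git fetch -q origin
before_merge="$(cat README)"

# Second half: merge the fetched branch into the current one.
git merge -q "origin/$branch"
after_merge="$(cat README)"

echo "after fetch: $before_merge"
echo "after merge: $after_merge"
```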

git: Pushing changes

Use git push to send your committed changes to a remote repository:

git push [REPOSITORY [REFSPEC]]

[documentation]

git push will by default push your changes to a matching branch in your remote repository:

$ git push
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 1014 bytes, done.
Total 4 (delta 3), reused 0 (delta 0)
To git@github.com:seasac-ops/git-workshop.git
   9e60e6e..6437a50  master -> master

If there are multiple remote repositories associated with your working copy, you can specify a repository (and branch) on the command line, e.g., to push your changes to branch master at a remote named origin:

$ git push origin master

If you are pushing to a new (empty) repository, you must specify the target branch name, because git won't find a matching branch.

Git doesn't like you pushing into a remote repository that is associated with a working tree (because this could cause unexpected changes for the person who checked out that working tree). You will generally want to create "bare" repositories for remote access (using git init --bare).

If you attempt to push to a repository that is newer than your working copy you will see an error similar to the following:

$ git push
To dottiness.seas.harvard.edu:repos/myproject
 ! [rejected]        master -> master (non-fast forward)
error: failed to push some refs to 'dottiness.seas.harvard.edu:repos/myproject'

To fix this, run git pull and deal with any conflicts.
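
The whole rejected-push cycle can be rehearsed locally. In this sketch a bare repository on disk plays the remote, and alice and bob are two made-up clones that each add a different file, so the pull merges cleanly. The --no-rebase option asks for a merge explicitly, since newer versions of git want to be told how to reconcile diverged branches.

```shell
# Identity for every repository in this sketch (normally set with git config).
export GIT_AUTHOR_NAME="Your Name" GIT_AUTHOR_EMAIL="you@harvard.edu"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
cd "$work"

git init -q seed
(cd seed && echo 'v1' > README && git add README && git commit -q -m 'initial commit')
git clone -q --bare seed shared.git   # bare: no working tree, safe to push into
git clone -q shared.git alice
git clone -q shared.git bob

# Alice pushes first.
(cd alice && echo a > alice.txt && git add alice.txt &&
  git commit -q -m 'alice work' && git push -q origin HEAD)

cd bob
echo b > bob.txt
git add bob.txt
git commit -q -m 'bob work'

# The remote now has a commit Bob lacks, so his push is refused.
if git push -q origin HEAD 2> /dev/null; then first_push=accepted; else first_push=rejected; fi

# Pull in Alice's work, then push again.
git pull -q --no-rebase --no-edit
if git push -q origin HEAD 2> /dev/null; then second_push=accepted; else second_push=rejected; fi

echo "first push: $first_push, second push: $second_push"
```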

git: Conflicts

A conflict occurs when you try to merge commits that make overlapping changes to a file.

If you attempt to pull in changes that conflict with your working tree, you will see an error similar to the following:

$ git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /Users/lars/projects/version-control-workshop/work/repo2
   4245cb6..84f1112  master     -> origin/master
Auto-merging README
CONFLICT (content): Merge conflict in README
Automatic merge failed; fix conflicts and then commit the result.

To resolve the conflict manually:

  • Edit the conflicting files as necessary.

To discard your changes (and accept the remote repository version):

  • run git checkout --theirs README

To override the repository with your changes:

  • run git checkout --ours README

When you complete the above task:

  • add the files with git add
  • commit the changes with git commit.
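
Here is a sketch of the whole cycle. To stay self-contained it merges a local branch (a made-up branch named other) instead of pulling from a remote, so "theirs" means the branch being merged in; the steps to resolve the conflict are the same.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"
git config user.email "you@harvard.edu"

echo 'original line' > README
git add README
git commit -q -m 'initial commit'
main_branch="$(git symbolic-ref --short HEAD)"   # master, or main on some setups

git checkout -q -b other
echo 'their version of the line' > README
git commit -q -m 'their change' -a

git checkout -q "$main_branch"
echo 'our version' > README
git commit -q -m 'our change' -a

# Both sides changed the same line, so the merge stops with a conflict.
if git merge -q --no-edit other > /dev/null 2>&1; then merged=clean; else merged=conflict; fi

# Accept the incoming version, then add and commit to finish the merge.
git checkout --theirs README
git add README
git commit -q -m 'merged other, keeping their README'

final="$(cat README)"
echo "merge was a $merged; README now reads: $final"
```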

git: the index

Git is not really just like Subversion (or most other version control solutions).

git: the index

Refer back to this illustration if you get confused:

images/git-transport.png
(This image used with permission.)

git: Plays well with others

Git can integrate with other version control systems.

git: Generating ssh keys

The following command will create an ssh key for use with the SEAS code repository:

ssh-keygen -f ~/.ssh/seas_code_rsa \
  -t rsa -b 2048 -N '' -C 'code.seas.harvard.edu'

This creates a 2048-bit (-b 2048) RSA (-t rsa) key with no passphrase (-N '') and places it in ~/.ssh/seas_code_rsa (-f ~/.ssh/seas_code_rsa). The corresponding public key will be ~/.ssh/seas_code_rsa.pub.

You will need to provision your account on the SEAS code repository with the public key.

In order to have git use this key by default, add the following stanza to ~/.ssh/config:

Host code.seas.harvard.edu
  IdentityFile ~/.ssh/seas_code_rsa

This tells ssh to use your newly created private key whenever it talks to code.seas.harvard.edu.

Additional resources

About this presentation

You can view this presentation online.

This presentation was written using reStructuredText, a lightweight markup language, and transformed into an HTML slide presentation using rst2s5.

The complete source for this presentation is available online from http://github.com/seasac-ops/git-workshop. You can download the source as a zip archive from here.
