Git – Source Code Management Tool

Learning objectives
Introduction
- Graphical User Interfaces
Cloning a repository
Branches and tags
Adding files
Ignoring files
Moving and deleting files
Committing changes
Inspecting changes
Merging changes
Pushing changes
Pulling changes
Merge conflicts
Undoing changes
References

Learning Objectives

Understand what git is and what a version control system does.
Be able to use git to clone an already existing repository.
Be able to add files to a repository.
Be able to specify which files to ignore in a repository.
Be able to edit files and commit them to a repository.
Be able to find bugs and revert changes to a repository.
Be able to add branches and tag versions in a repository.
Be able to see how files changed throughout the lifetime of a repository.
Be able to merge changes and fix merge conflicts.
Be able to push changes to the central repository.

Introduction

You can watch a few quick videos made by the Git team here:

What is Version Control? https://git-scm.com/video/what-is-version-control
What is Git? https://git-scm.com/video/what-is-git
Get Going w/ Git: https://git-scm.com/video/get-going
Quick Wins w/ Git: https://git-scm.com/video/quick-wins

A version control system (VCS) allows multiple programmers to make changes to a central code base, which is stored in a repository. Every change is logged and given a unique commit identifier, so that it can be viewed later or reverted if it is no longer needed or turns out to cause bugs in the code.

There are several different programs and protocols for version control, but the one we will explore here is called git. Git is known as a source code management (SCM) tool, and it is a distributed version control system (DVCS). The distributed part means that every programmer has a complete local copy of the repository; in the workflow used here, a central server is where everyone pushes their changes.

Git is a great way to checkpoint your code. If you accidentally delete a C++ file or make radical changes that you wish to undo, recovering the old version is nearly impossible without some form of checkpointing. Git allows you to make incremental changes, commit them, and move on to other changes. Should you want to undo a change, you can use git to revert it. Since the changes were added to the repository incrementally, you will also be able to see how the code progressed through time.

To get Git, please see the following link: https://git-scm.com/downloads.

Before using Git, you should set your identity. Recall that the typical use of git is for a large team to work on a central code base. Therefore, it is important to know who wrote the code, who made the changes, and who to blame for bugs :).

To set your identity, you can use the git command directly:

git config --global user.name "YourFirstName YourLastName"
git config --global user.email "your@email.address"

There are many other settings, but you will generally want at least a few more: the editor used for writing commit messages, colorized output, and the default push behavior. For the color settings, auto colors output only when it goes to a terminal; always also inserts color codes into output that is piped or redirected to a file, which is rarely what you want.

git config --global core.editor vim
git config --global color.log auto
git config --global color.grep auto
git config --global color.diff auto
git config --global color.ui auto
git config --global push.default simple
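
To confirm what Git will actually use, you can read settings back. The following is a minimal sketch of one way to do that. It assumes a throwaway repository and repository-level settings (no --global) so that it does not touch your real configuration; the directory name configdemo and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q configdemo
cd configdemo

# Without --global, a setting applies to this repository only.
git config user.name "Example User"
git config user.email "user@example.com"
git config color.ui auto

# Read individual values back.
name=$(git config --get user.name)
color=$(git config --get color.ui)
echo "user.name = $name"
echo "color.ui  = $color"

# List everything stored in this repository's own configuration file.
git config --local --list
```

A repository-level value takes precedence over a --global one, which is handy when one project needs a different email address.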

If you are not comfortable with the terminal, there are several graphical options available.

Windows – https://git-scm.com/download/gui/windows
Mac – https://git-scm.com/download/gui/mac
Linux – https://git-scm.com/download/gui/linux

Cloning a Repository

A central repository maintains a go-to place for all programmers to get the most recent code base. If the central repository already exists, we need to clone it to get our own local copy of the repository. You can try to see if it works for you by cloning one of my repositories.

git clone https://github.com/sgmarz/ttrust.git
cd ttrust

Notice that the location of the repository is given as a uniform resource identifier (URI). In this case, you can see that it uses the web-based Hypertext Transfer Protocol Secure (HTTPS). When cloned, the repository will have its own folder. The default name is the name of the repository without the .git extension. However, you can specify any name you want by adding a name after the URI.

git clone https://github.com/sgmarz/ttrust.git myfirstclonedrepo
cd myfirstclonedrepo

Now you have the ttrust.git repository in a directory called myfirstclonedrepo.

Also, a cloned repo automatically records the URI it was cloned from as a remote named origin, which serves as our central repository. So, cloning a repository is the easiest way to set up Git right out of the gate.

We can use the hydra and tesla machines to be the home for our central repository too. When you log into a hydra or tesla machine, you can create your own central repository by executing init:

~> mkdir myfirstrepo.git 
~> cd myfirstrepo.git
~> git init --bare --shared=all
Initialized empty shared Git repository in /home/smarz1/myfirstrepo.git/
~>

Now you have a new, but bare, repository, meaning it stores the history but has no working copy of the files. You can now clone it onto your local machine by using the SSH (secure shell) protocol.

~> git clone ssh://netid@hydraX.eecs.utk.edu/home/netid/myfirstrepo.git
Cloning into 'myfirstrepo'...
warning: You appear to have cloned an empty repository.
done.
~> cd myfirstrepo

You will get a warning that the repository is empty, but we know it is because we just created it.
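
If you do not have access to a remote machine, the same steps can be tried entirely on one computer. The sketch below assumes a scratch directory and clones the bare repository by file path instead of an ssh:// URI; the names central.git and myclone are made up for illustration.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"

# Create the bare "central" repository.
git init -q --bare central.git

# Clone it by file path. The empty-repository warning goes to stderr,
# which is discarded here to keep the output short.
git clone -q "$work/central.git" myclone 2>/dev/null
cd myclone

# The clone remembers where it came from under the remote name "origin".
origin_url=$(git config --get remote.origin.url)
echo "origin -> $origin_url"
git remote -v
```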

Branches and Tags

Now that you have a freshly cloned repository, you will need to start working. A branch is a deviation from the master branch. Usually, when you’re working on your own code, you will work on a branch of the central code until you’re ready. When the code is tested and ready to be added to the master branch, you will need to do a merge, where the existing code is merged with your changes. We will cover merging below.

You can take a look at all of the different branches by typing the following.

~> git branch --list
* master
~>

The asterisk (*) tells you which branch you’re currently on. So, in this case, the only branch we have is the master branch, and it is the current branch we’re on.

We can create branches by checking out a new branch.

~> git checkout -b mynewbranch
Switched to a new branch 'mynewbranch'
~> git branch --list
master
* mynewbranch
~>

We use git checkout to switch branches. When I specify -b, git creates the branch and switches to it in one step; it reports an error if a branch with that name already exists. Without -b, you will get an error if you try to switch to a branch that does not exist.

~> git checkout somebranch
error: pathspec 'somebranch' did not match any file(s) known to git.
~>

So, we get an error because the only two branches we have so far are master and mynewbranch.

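You can remove a branch from your local machine with git branch -d, as shown below. Git will not delete the branch you are currently on, so switch to another branch first. The next sections continue to use mynewbranch, so recreate it afterward if you are following along.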

~> git branch -d mynewbranch
Deleted branch mynewbranch (was 2ce3aeb).
~>
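
This section's title also mentions tags. A tag is a fixed name for one particular commit, typically used to mark a released version. The following is a minimal sketch, assuming a scratch repository; the tag name v1.0, its message, and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q tagdemo
cd tagdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."

# Create an annotated tag on the current commit.
git tag -a v1.0 -m "First tagged version"

# List all tags.
git tag --list

# A tag resolves to the commit it marks.
tagged=$(git rev-parse "v1.0^{commit}")
head=$(git rev-parse HEAD)
echo "v1.0 -> $tagged"
```

Unlike a branch, a tag does not move when new commits are added, so v1.0 keeps pointing at the same commit.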

Adding Files

Now that we have our new branch, we can start adding files to this branch. If files are already added to it, we can modify the files in the branch. The nice thing about having a branch is that we can mess things up as much as we want without affecting the master branch.

Generally, we work on branches, fix all the errors, and then merge the changes into the master branch. We will cover merging a little bit later under merging changes.

We can add one file at a time using git add, or we can stage every new, modified, and deleted file in the whole working tree using git add -A.

~> echo hello > somefile.txt
~> ls
somefile.txt
~> git add somefile.txt
~> git status
# On branch mynewbranch
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   somefile.txt
#
~>

We added a file called somefile.txt. However, git add only stages the file: nothing is recorded in your branch until you use git commit, which we will see in action under committing changes.
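
To compare adding a single file with adding everything, here is a small sketch. It assumes a scratch repository; the src directory and main.cpp are hypothetical files created only for the demonstration.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q adddemo
cd adddemo

echo hello > somefile.txt
mkdir src
echo "int main() { return 0; }" > src/main.cpp

# Stage a single file; src/ is still reported as untracked.
git add somefile.txt
git status --short

# Stage everything else, including files in subdirectories.
git add -A

# Show the names of all files that are now staged.
staged=$(git diff --cached --name-only)
echo "$staged"
```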

Ignoring Files

Sometimes there are certain files that we do not want to track. Typically, these are generated files, such as object files or executables. We can ignore certain files by creating a file named .gitignore.

Inside of this file, you can specify a wildcard or a specific file that you want to ignore.

~> touch ignoreme.txt
~> ls
ignoreme.txt    somefile.txt
~> echo ignoreme.txt >> .gitignore
~> ls -a
.gitignore     ignoreme.txt      somefile.txt
~> git status
# On branch mynewbranch
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   somefile.txt
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       .gitignore
~> git add .gitignore

As you can see, once a file is listed in .gitignore, git leaves it alone: when we type git status, ignoreme.txt does not even show up as untracked. Note that .gitignore only affects files that are not yet tracked; a file that has already been committed keeps being tracked even if it matches a pattern. The file .gitignore is itself under version control, as it can change among different branches.
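
The example above ignores one specific file. A sketch of the wildcard case follows, assuming a scratch repository; the patterns *.o and build/ and the file names are examples chosen to resemble a C++ project.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q ignoredemo
cd ignoredemo

# Ignore every object file and everything under build/.
printf '%s\n' '*.o' 'build/' > .gitignore

mkdir build
touch main.o main.cpp build/app

# Only .gitignore and main.cpp are reported as untracked.
untracked=$(git status --short)
echo "$untracked"

# Ask Git which pattern causes each path to be ignored.
git check-ignore -v main.o build/app
```

git check-ignore is useful when a file unexpectedly fails to show up in git status and you want to know which pattern is responsible.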

Moving and Deleting Files

Sometimes we want to move or delete files. Git will try to figure out where your files have gone or if they have been renamed, but we can tell it explicitly by using git mv (git move) or git rm (git remove).

~> echo file > file.txt
~> git add file.txt
~> git commit -m "Add file.txt" file.txt
~> git mv file.txt anothername.txt
~> git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#   (use "git push" to publish your local commits)
#
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       renamed:    file.txt -> anothername.txt
#

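If we want to undo part of what we did, we can use git reset HEAD file.txt. This unstages only the removal of file.txt; anothername.txt stays staged as a new file, as the status output shows. To undo the rename entirely, it is simpler to rename the file back with git mv anothername.txt file.txt.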

~> git reset HEAD file.txt
Unstaged changes after reset:
D       file.txt
~> git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#   (use "git push" to publish your local commits)
#
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   anothername.txt
#
# Changes not staged for commit:
#   (use "git add/rm <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       deleted:    file.txt
#
~>
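
The following sketch shows a rename that is fully undone by renaming the file back, followed by a deletion with git rm. It assumes a scratch repository, and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q mvdemo
cd mvdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo file > file.txt
git add file.txt
git commit -q -m "Add file.txt"

# Rename the file, then change our mind and rename it back.
git mv file.txt anothername.txt
git mv anothername.txt file.txt
after_undo=$(git status --short)   # empty: nothing is staged any more

# Remove the file from both the working directory and the staging area.
git rm -q file.txt
after_rm=$(git status --short)
echo "$after_rm"
```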

Committing Changes

When you make changes to files under version control, the changes will not be “saved” to the branch until you commit them. This is you telling git that you’re ready to log the changes to the current branch. When you commit the changes, a new commit identifier (commit id) will be generated, and you will be asked to write a message about the commit. The message usually details what you did and why you did it.

We can commit one file at a time by specifying the file, or we can commit all changes to files that git already tracks by using the switch -a. New files still have to be added with git add first.

~> git status
# On branch mynewbranch
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   somefile.txt
#       new file:   .gitignore
#
~> git commit .gitignore -m "I added gitignore to ignore certain files."
[mynewbranch 660f43f] I added gitignore to ignore certain files.
 1 file changed, 1 insertion(+)
 create mode 100644 .gitignore
~> git status
# On branch mynewbranch
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   somefile.txt
#
~>

The -m above allows you to specify a message. If you leave this off, git will open the editor you configured, such as vim or nano, so you can type the message. This message will then be logged. You can see that when I committed, we were given the abbreviated commit identifier 660f43f. It is not random: it is the beginning of a much longer hash, computed from the contents of the commit, that you can see in full with git log.

~> git log
commit 660f43f83033f9e0c49630a7ac4d3f696b40c3a3
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:04:27 2021 -0400

    I added gitignore to ignore certain files.
~>

You can see the full commit identifier as well as the author, date, and message.

Now that the changes are committed, the branch is updated. If you modify these files again, you will have to commit the new changes. Recall that you can use git status to see which files have changed.
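
To see the difference between committing a named file and committing with -a, consider this sketch. It assumes a scratch repository; the files a.txt, b.txt, and c.txt and the identity values are made up for the demonstration.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q commitdemo
cd commitdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt
echo one > b.txt
git add a.txt b.txt
git commit -q -m "Add a.txt and b.txt"

# Change both tracked files and create a new, untracked one.
echo two > a.txt
echo two > b.txt
echo new > c.txt

# Commit only a.txt by naming it.
git commit -q -m "Update a.txt" a.txt

# Commit all remaining changes to tracked files; c.txt is left alone.
git commit -q -a -m "Update b.txt"

# One line per commit, newest first.
git log --oneline

remaining=$(git status --short)
echo "$remaining"
```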

Inspecting Changes

When you make changes to a branch, it might be important to see what changed. We can look at a high-level view, just showing what files changed, by using git status. However, we can get much more detail by looking at what exactly changed. We can use git show to display the most recent commit (or any commit we name) together with the changes it introduced. We can use git diff to show changes in the working directory that have not yet been staged, and git diff HEAD to show everything that has not yet been committed. However, we can also specify two commit identifiers to show the difference between them.

We can see what changed between commit identifiers by specifying them.

~> git commit somefile.txt -m "Add somefile."
[mynewbranch bc75c5e] Add somefile.
 1 file changed, 1 insertion(+)
 create mode 100644 somefile.txt
~> git log
commit bc75c5e999d551c1d8060a695232407202abbf46
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:09:17 2021 -0400

    Add somefile.

commit 660f43f83033f9e0c49630a7ac4d3f696b40c3a3
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:04:27 2021 -0400

    I added gitignore to ignore certain files.
~>

If you’re following along, your commit identifiers will be different.

You can now see that I have two commits. So, it would be helpful to see what each commit did and what the difference is between the current branch and the previous commits. We can do this by specifying the partial commit id to see what changed.

~> git diff 660f43
diff --git a/somefile.txt b/somefile.txt
new file mode 100644
index 0000000..ce01362
--- /dev/null
+++ b/somefile.txt
@@ -0,0 +1 @@
+hello
~>

I specified the earlier commit, id 660f43. It then shows us that the only difference is somefile.txt. The +hello means that the word “hello” was added to somefile.txt. If we change somefile.txt, we will see what happens:

~> echo goodbye > somefile.txt
~> git commit somefile.txt -m "Change somefile.txt to goodbye."
[mynewbranch 528d867] Change somefile.txt to goodbye.
 1 file changed, 1 insertion(+), 1 deletion(-)
~> git log
commit 528d867e35e746d207846c891425a7dd2e1b25a5
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:12:28 2021 -0400

    Change somefile.txt to goodbye.

commit bc75c5e999d551c1d8060a695232407202abbf46
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:09:17 2021 -0400

    Add somefile.

commit 660f43f83033f9e0c49630a7ac4d3f696b40c3a3
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:04:27 2021 -0400

    I added gitignore to ignore certain files.
~> git diff bc75c5e
diff --git a/somefile.txt b/somefile.txt
index ce01362..dd7e1c6 100644
--- a/somefile.txt
+++ b/somefile.txt
@@ -1 +1 @@
-hello
+goodbye
~>

Now you can see the difference is somefile.txt, but we removed hello, denoted by the - sign, and added goodbye, denoted by the + sign.

We can also use git show to show us the last commit’s change:

~> git show
commit 528d867e35e746d207846c891425a7dd2e1b25a5
Author: Stephen Marz <stephen.marz@utk.edu>
Date:   Thu May 20 15:12:28 2021 -0400

    Change somefile.txt to goodbye.

diff --git a/somefile.txt b/somefile.txt
index ce01362..dd7e1c6 100644
--- a/somefile.txt
+++ b/somefile.txt
@@ -1 +1 @@
-hello
+goodbye
~>

So, git show is essentially git log and git diff combined for the latest commit: it shows the log entry followed by what changed.
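
The different forms of git diff are easy to mix up, so here is a short sketch that exercises each one. It assumes a scratch repository; the file contents mirror the example above, and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q diffdemo
cd diffdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."
first=$(git rev-parse HEAD)

echo goodbye > somefile.txt
git commit -q -m "Change somefile.txt to goodbye." somefile.txt
second=$(git rev-parse HEAD)

# Compare two commits directly: older first, newer second.
git diff "$first" "$second"

# Edit again without committing, then stage the edit.
echo farewell > somefile.txt
unstaged=$(git diff --name-only)           # working directory vs. staging area
git add somefile.txt
staged=$(git diff --staged --name-only)    # staging area vs. last commit

# Summarize the latest commit instead of printing the full patch.
git show --stat --oneline HEAD
```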

Merging Changes

When we make changes to the branch, we will eventually have to bring all of those changes into the master branch. This means that we will have to merge the changes from our branch into the master branch. This can be done using git merge. First, we check out the branch we want to merge into. Git refuses to switch branches if doing so would overwrite uncommitted changes, so it is best to commit or stash all changes first.

~> git status
# On branch mynewbranch
nothing to commit, working directory clean
~> git checkout master
Switched to branch 'master'
~> git merge mynewbranch
Updating 78c5db4..528d867
Fast-forward
 .gitignore   | 1 +
 somefile.txt | 1 +
 2 files changed, 2 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 somefile.txt
~> git branch -D mynewbranch
Deleted branch mynewbranch (was 528d867).
~>

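In the example above, you can see that we first change branches to master, then we merge the branch. Then, I deleted the branch using git branch -D. The capital -D deletes a branch even if it has not been merged; since this branch was just merged, the safer lowercase -d would have worked as well.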
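
Stashing was mentioned above as an alternative to committing before you switch branches. A minimal sketch of that workflow follows, assuming a scratch repository. The main branch name is read from Git rather than hard-coded, since it may be master or main depending on your setup; the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q stashdemo
cd stashdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b mynewbranch
echo "work in progress" >> somefile.txt

# Shelve the uncommitted edit so the working directory is clean.
git stash

# Now switching branches cannot disturb the edit.
git checkout -q "$main_branch"
git checkout -q mynewbranch

# Bring the shelved edit back.
git stash pop
```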

Pushing Changes

Recall that our setup has a central server that stores the shared master branch. When we make changes and want to update the central server, we push those changes. Generally, after we commit and merge, we have to push the changes.

We can see if our local copy is out of sync with the central server using git status.

~> git status
# On branch master
# Your branch is ahead of 'origin/master' by 3 commits.
#   (use "git push" to publish your local commits)
#
nothing to commit, working directory clean
~>

It tells us that our local copy is ahead of the central server. We expect this since we made the changes locally, and the central server has an old copy.

To fix this, we then need to push our commits to the central server.

~> git push
Counting objects: 10, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 796 bytes | 0 bytes/s, done.
Total 9 (delta 2), reused 0 (delta 0)
To /home/smarz1/myfirstrepo.git
   78c5db4..528d867  master -> master
~>

So, this will use the network to send the changes to the central server. You will see the objects being written (sent) and then it will show you that our local copy and the central copy are now synchronized. We can double check again by using git status.

~> git status
# On branch master
nothing to commit, working directory clean
~>
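
The push cycle can be rehearsed without a server. The sketch below assumes a bare repository in a scratch directory standing in for the central server; the names central.git and myclone and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q --bare central.git
git clone -q central.git myclone 2>/dev/null
cd myclone
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."
branch=$(git symbolic-ref --short HEAD)

# First push of a branch: -u also records the matching branch on
# origin as its upstream, so later pushes need no arguments.
git push -q -u origin "$branch"

echo goodbye > somefile.txt
git commit -q -a -m "Change somefile.txt to goodbye."

# Compact status; reports the branch as ahead of origin by 1.
git status -sb

git push -q
local_head=$(git rev-parse HEAD)
remote_head=$(git rev-parse "origin/$branch")
```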

Pulling Changes

Pull is the opposite of push. Instead of updating the central server, we want to update our local copy with that of the central server. Unfortunately, git status will not notify us that the central server has been updated, because it only compares against what we last downloaded. Instead, we have to run git pull from time to time to keep our local copy synchronized with the central copy.

~> git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /home/smarz1/myfirstrepo
   528d867..7cdec47  master     -> origin/master
Updating 528d867..7cdec47
Fast-forward
 somefile.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
~> git pull
Already up-to-date.
~>

If there are any changes, the changes will be made to your local copy. If you’ve made changes to a file that was also changed by someone else, then git pull will also run a merge. If you’ve worked on the exact same lines as someone else, you will have merge conflicts that you will need to fix, commit, and push again to the central repository.
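
If you only want to check whether the central repository has moved on, git fetch downloads the new commits without changing your files. The sketch below is a rough illustration, assuming two clones of a local bare repository; the directory names alice and bob and the identity values are hypothetical.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q --bare central.git

# First programmer creates the initial commit and pushes it.
git clone -q central.git alice 2>/dev/null
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"

# Second programmer clones; then the first one pushes another commit.
git clone -q "$work/central.git" "$work/bob"
echo goodbye > somefile.txt
git commit -q -a -m "Change somefile.txt to goodbye."
git push -q

cd "$work/bob"
# Download the new commits without touching local files.
git fetch -q
git status -sb          # now reports the branch as behind by 1
behind=$(git rev-list --count "HEAD..origin/$branch")

# Bring the local branch up to date.
git pull -q --ff-only
```

The --ff-only option makes git pull stop instead of merging when the local branch has commits of its own, which is a cautious default while learning.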

Merge Conflicts

A version control system is used for multiple people to work on the same base of code. However, it isn’t magic. It can’t figure out what to do if two or more people change the exact same lines of code. If everyone worked on different lines of code or different files, then git can use a merging strategy to merge automatically without issue. However, if two people changed the same lines, the result is a merge conflict. These require you to manually intervene and fix the conflict.

~> git push
To /home/smarz1/myfirstrepo.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to '/home/smarz1/myfirstrepo.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first merge the remote changes (e.g.,
hint: 'git pull') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

We get the message above because the central repository changed. So, we need to first pull and merge the central (remote) changes before we can push our changes.

~> git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /home/smarz1/myfirstrepo.git
   7cdec47..5a967b5  master     -> origin/master
Auto-merging somefile.txt
CONFLICT (content): Merge conflict in somefile.txt
Automatic merge failed; fix conflicts and then commit the result.

I purposely added a merge conflict here, but you can see that it tells you that there is a conflict, and it tells you which file the conflict is in: somefile.txt.

Now, we edit somefile.txt and we will see some weird-looking symbols, which are the conflict markers that git inserted.

<<<<<<< HEAD
abracadabra
=======
noway
>>>>>>> 5a967b54f9a1318680dea31f9bfddaaf79c3a4cf

This is showing that the central repository has the value noway, but our local copy (HEAD) has abracadabra. So, what we need to do is decide which of these is the real up-to-date value, keep it, and delete the other value along with the three marker lines. Let’s pick abracadabra.

~> git commit -a -m "Fix merge conflict"
[master fd462b9] Fix merge conflict
~> git push
Counting objects: 8, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 455 bytes | 0 bytes/s, done.
Total 4 (delta 1), reused 0 (delta 0)
To /home/smarz1/myfirstrepo.git
   5a967b5..fd462b9  master -> master
~>

So, we fixed the merge conflict, committed the merged version, and pushed it to the central repository.
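
A conflict can also be reproduced and resolved on a single machine, which is a safe way to practice. The sketch below assumes a scratch repository and uses two local branches instead of a central server; the branch name other and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q conflictdemo
cd conflictdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo start > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line on two branches.
git checkout -q -b other
echo noway > somefile.txt
git commit -q -a -m "Set noway"
git checkout -q "$main_branch"
echo abracadabra > somefile.txt
git commit -q -a -m "Set abracadabra"

# The merge stops with a conflict, so it exits with a non-zero status.
git merge other || echo "merge reported a conflict"

# List the files that still contain conflicts.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by writing the version we want, then stage and commit.
echo abracadabra > somefile.txt
git add somefile.txt
git commit -q -m "Fix merge conflict"
```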

Undoing Changes

Many times, we will make a change that either is no longer needed or wanted, or that introduced a bug, and we want to undo it. In Git terminology, this is called reverting: git revert adds a new commit that reverses the changes made by an earlier commit, so the history itself stays intact. This is why making many smaller commits is a better approach than one all-in-one commit. Reverting a large commit means everything in it is undone!

A git revert requires a commit ID or a name to revert, such as HEAD~3. There is a list of different ways to name a commit in the gitrevisions documentation (https://git-scm.com/docs/gitrevisions).
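
As a minimal sketch of a revert, assuming a scratch repository with two commits, the commands below undo the most recent commit. The file contents mirror the earlier examples and the identity values are placeholders.

```shell
set -eu
work=$(mktemp -d)          # scratch directory, safe to delete afterwards
cd "$work"
git init -q revertdemo
cd revertdemo
git config user.name "Example User"
git config user.email "user@example.com"

echo hello > somefile.txt
git add somefile.txt
git commit -q -m "Add somefile."

echo goodbye > somefile.txt
git commit -q -a -m "Change somefile.txt to goodbye."

# Undo the latest commit by adding a new commit that reverses it.
# --no-edit accepts the default message instead of opening the editor.
git revert --no-edit HEAD

# The history now has three commits; the file is back to its old content.
git log --oneline
content=$(cat somefile.txt)
echo "somefile.txt contains: $content"
```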

References

Visual Cheat Sheet: https://ndpsoftware.com/git-cheatsheet.html
Textual Cheat Sheet: https://training.github.com
Reference Manual: https://git-scm.com/docs
Dealing with merge conflicts: https://www.atlassian.com/git/tutorials/using-branches/merge-conflicts
