The Pieces of Git
Git is made up of three different pieces:
- Repository
- Working Tree
- Index
Let’s review each one in a bit more detail. If you want to step back and learn some more Git fundamentals before moving forward, I recommend watching the Basics of Git course or the Intermediate Git course. Both are available right here from Mijingo.
Repository
Let’s start with the Repository.
The repository is a collection of commits (changes to the files in the repository) and a history or archive of what the project looked like at any given point in time.
A repository is organized into branches. These are forks in the history of the repository where a new set of changes was made. Typically, a branch will be merged back into a main branch (usually called master) when its purpose is met.
Branches are created for all sorts of reasons. The most common reason for creating a branch is to isolate work so you don’t interfere with or break the code on the master branch.
Each repository has a HEAD. This is the current starting point of the repository. If you switch branches in the repository, the HEAD changes. If you make a change in the repository and commit, the HEAD changes again.
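To watch HEAD move, here is a minimal sketch that runs in a throwaway repository. The file name notes.txt, the branch name feature, and the Example identity are made up for illustration, and the default branch may be called something other than master on your machine.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Hypothetical identity passed inline so the sketch does not rely on global config.
commit() { git -c user.name=Example -c user.email=example@example.com commit --quiet "$@"; }

echo "first" > notes.txt
git add notes.txt
commit -m "First commit"

start_branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is named
first_commit=$(git rev-parse HEAD)              # the commit HEAD resolves to

# Switching branches points HEAD at a different branch.
git checkout --quiet -b feature
feature_branch=$(git symbolic-ref --short HEAD)

# Committing moves HEAD (through the current branch) to the new commit.
echo "second" >> notes.txt
commit -am "Second commit"
second_commit=$(git rev-parse HEAD)

echo "HEAD started on: $start_branch at $first_commit"
echo "HEAD is now on:  $feature_branch at $second_commit"
```

The two lines printed at the end should name different branches and different commit IDs, which is the "HEAD changes" behavior described above.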
In summary:
- A repository has a HEAD that is the current starting point of the repository. More on HEAD later when we look at the lower level Git commands.
- A repository is organized into branches, with the main branch typically called master, which allow you to have different versions of the repository going on simultaneously (which would later be merged together).
- Finally, the repository is a history or archive of the project’s Working Tree.
Working Tree
So, what’s a Working Tree?
This is a directory on your file system that is associated with a repository.
You can think of this as the file system manifestation of the repository.
It holds the files you edit, and it’s where you add new files and remove unneeded ones. When you do your work on the project, like adding new code or assets, you do that in the Working Tree.
Git detects any changes to the Working Tree by comparing it against the Index, and those changes show up as modified files.
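As a rough illustration, the sketch below edits a tracked file and creates a new one in a throwaway Working Tree, then asks Git what changed. The file names and the Example identity are assumptions made for the demo.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet

echo "hello" > readme.txt
git add readme.txt
git -c user.name=Example -c user.email=example@example.com commit --quiet -m "Add readme"

# Both changes below happen only in the Working Tree.
echo "more text" >> readme.txt   # edit a tracked file
echo "draft" > ideas.txt         # create a file Git has never seen

# --porcelain prints one stable, script-friendly line per changed path.
status=$(git status --porcelain)
printf '%s\n' "$status"
```

The edited file is reported with an M marker and the new file with ??, which is Git's way of saying "modified" and "untracked".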
Index
Okay, what’s the Index?
The Index is a middle area that sits between your Git repository and the data files on your file system (the things you edit and change).
You might have heard the Index also called:
- staging area
- staging
- stage
- cache
- Working Tree cache
I like the name “staging area” because it describes exactly what happens.
Changes are recorded to the Index before they are committed to the repository as commit objects. You stage your changes, storing them somewhere temporarily until you are ready to make a commit to the repository.
But I should clarify: the Git Index isn’t a place where actual data is stored — like the changed files or their contents. It only tracks the objects (files) that have changed so you can later bundle them up as a commit.
And it does that by keeping a list of all of the project files and tracking new, removed, or changed files against it.
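Here is a small sketch of staging in action, using a throwaway repository. The file names a.txt and b.txt and the Example identity are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet

echo "one" > a.txt
echo "two" > b.txt

# Stage only a.txt; b.txt stays in the Working Tree, unknown to the Index.
git add a.txt

staged=$(git diff --cached --name-only)                # paths staged for the next commit
untracked=$(git ls-files --others --exclude-standard)  # paths the Index does not list

echo "staged:    $staged"
echo "untracked: $untracked"

# Committing records what was staged, so nothing is left waiting afterwards.
git -c user.name=Example -c user.email=example@example.com commit --quiet -m "Add a.txt"
staged_after=$(git diff --cached --name-only)
echo "staged after commit: ${staged_after:-nothing}"
```

Only the file passed to git add ends up staged, and the staged list is empty again once the commit is made.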
The Git Index is a binary file located at:
.git/index
To see the index you can run:
git ls-files
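If you want a little more detail than the plain file list, the --stage option prints what each Index entry records. A minimal sketch, assuming a throwaway repository with a single made-up file:

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet

echo "hello" > readme.txt
git add readme.txt

# Each Index entry shows: file mode, object ID, stage number, then the path.
entry=$(git ls-files --stage)
echo "$entry"

# The object ID in the entry is the ID Git computes for the file's content.
blob_id=$(git hash-object readme.txt)
echo "$blob_id"
```

The entry holds a path and an object ID rather than the file's contents, which fits the idea that the Index is a tracking list and not a data store.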
You might think that you use git-status to see the Index. That’s sort of true. What git-status does is determine the differences between the Working Tree and the Index and display them to you.
A moment ago we used git ls-files to see the Index. We can use that same command to produce output similar to that of the git-status command:
git ls-files --modified --deleted --others --exclude-standard
Now instead of running a standard ls-files command we filter the output using a series of options.
- First, we want to show the modified files using --modified,
- then we also want to show any deleted files or directories using --deleted,
- and to show new files (those that are still untracked by the repository) we use --others,
- and, finally, we use --exclude-standard to honor the repository’s standard ignore rules (such as .gitignore) for excluded files and directories.
All of this together sort of recreates what we get when we run git-status. Of course, the output isn’t styled the same way, but the content is the same.
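To check that claim, the sketch below makes one modification, one deletion, and one new file in a throwaway repository, then compares the paths reported by both commands. It assumes nothing has been staged since the last commit, because staged changes show up in git-status but not in this ls-files filter. The file names and the Example identity are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet

echo "keep" > keep.txt
echo "edit me" > edit.txt
echo "remove me" > remove.txt
git add .
git -c user.name=Example -c user.email=example@example.com commit --quiet -m "Initial files"

echo "changed" >> edit.txt   # modified
rm remove.txt                # deleted
echo "new" > new.txt         # untracked

# A deleted file can be listed under both --modified and --deleted,
# so sort -u removes the duplicate line.
from_ls_files=$(git ls-files --modified --deleted --others --exclude-standard | sort -u)

# Porcelain lines look like "XY path"; dropping the first three characters leaves the path.
from_status=$(git status --porcelain | cut -c4- | sort -u)

echo "ls-files: $from_ls_files"
echo "status:   $from_status"
```

Both variables should hold the same three paths, while the untouched keep.txt appears in neither.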
The next step after knowing the status of our index (what’s staged and ready to be committed) is to create the commit. That’s when we start getting into commit objects.
Let’s jump in and talk about how Git stores data, including commit and tree objects.