Git init
Initializing your git repository
Introduction
In this post, I will explain what happens when you initialize a Git repository, and go through the files Git creates inside it and what each one is used for.
Initializing git repository
To initialize a Git repository, create a directory, change into it, and run the git init command.
% git init
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git
As you can see above, initializing the repository only takes changing into the directory and typing git init. Git uses "master" as the name of the initial branch unless configured otherwise. You can change that, but we don't need to get into this topic right now.
Once the repository is initialized, you can inspect the files Git created by running ls .git/ inside the repository.
% ls .git/
HEAD config description hooks info objects refs
One thing to note: as you start working in your repository, this directory structure changes as Git adds more entries, such as logs, index and COMMIT_EDITMSG. Here is my repository structure after committing a few changes:
% ls .git/
COMMIT_EDITMSG config hooks info objects
HEAD description index logs refs
We will go through each of these entries and explain what it is for.
HEAD
Looking at the files above, you can see HEAD. HEAD is a pointer to what you currently have checked out: normally your current branch, and through it the latest commit on that branch. When you switch between branches, you are moving HEAD to the other branch, see the image below:
If you want to check where HEAD is pointing, print the file as shown in the following command. In this case, HEAD points to the master branch and not to a specific commit ID:
% cat .git/HEAD
ref: refs/heads/master
ref is a reference that tells you where HEAD points. You can compare it with the output of git log to see which commit HEAD is on:
% git log --oneline
317c930 (HEAD -> master) second commit
e7bde04 new file added
As you can see above, HEAD is on commit 317c930. You can now compare that with the hash stored in the ref file that HEAD names:
% cat .git/refs/heads/master
317c930d9aa7cc66c77958d8945dcd201ee59fb4
To summarize, HEAD is the pointer to your current branch. You can move HEAD between commits depending on your needs, but be extra careful when doing so, so that you don't lose changes to your files.
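The two cat commands above can be combined into one small helper. This is a minimal sketch: the function name resolve_head is made up for illustration, and it only follows loose ref files under .git/, so it reports an error for a branch with no commits yet or for refs that Git has packed into .git/packed-refs.

```shell
#!/bin/sh
# resolve_head [git_dir]: print the commit hash HEAD currently points to.
# Hypothetical helper for illustration; it reads the files directly instead
# of asking Git.
resolve_head() {
    git_dir=${1:-.git}
    head=$(cat "$git_dir/HEAD") || return 1
    case $head in
        "ref: "*)
            # Symbolic HEAD: strip the "ref: " prefix to get the ref path.
            ref=${head#ref: }
            if [ -f "$git_dir/$ref" ]; then
                cat "$git_dir/$ref"
            else
                echo >&2 "no loose ref file for $ref (unborn branch or packed ref)"
                return 1
            fi
            ;;
        *)
            # Detached HEAD: the file holds a commit hash directly.
            printf '%s\n' "$head"
            ;;
    esac
}
```

Running resolve_head from the top of a repository should print the same hash as git rev-parse HEAD, which is the command to prefer in real scripts.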
CONFIG
config is the repository's Git configuration file. This is where repository-level settings are stored: preferences, user info, and the behavior of the repository:
% cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
ignorecase = true
precomposeunicode = true
Git stores configuration options in three separate files, which lets you scope options to an individual repository (local), a user (global), or the entire system (system). Let's have a look at each of them:
- Local (--local): .git/config inside the repository. This is repository-specific configuration. By default, git config writes to the local level if no scope option is passed. Local configuration applies only to the repository in which git config is invoked.
- Global (--global): ~/.gitconfig. This is user-specific configuration, where options set with the --global flag are stored. User-specific here means the configuration applies to an operating system user. Global configuration values are stored in the user's home directory.
- System (--system): $(prefix)/etc/gitconfig. These are system-wide settings applied across an entire machine, covering all users of the operating system and all repositories. The system-level configuration lives in a gitconfig file under the system root path, as shown above.
When Git looks up a configuration value, it starts at the local level, then falls back to global, and then to system. In other words, local overrides global, and global overrides system.
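One way to see this precedence is to ask Git for the same key at each level and then for the effective value. The sketch below assumes git is on your PATH and that you run it inside a repository; the function name show_config_levels is made up for illustration.

```shell
#!/bin/sh
# show_config_levels KEY: print the value of KEY at each configuration level,
# followed by the value Git actually uses.
# Hypothetical helper; run it from inside a repository.
show_config_levels() {
    key=$1
    for scope in local global system; do
        # git config exits non-zero when the key is not set at that level.
        value=$(git config "--$scope" --get "$key" 2>/dev/null) || value="(not set)"
        printf '%-7s %s\n' "$scope:" "$value"
    done
    value=$(git config --get "$key" 2>/dev/null) || value="(not set)"
    printf '%-7s %s\n' "result:" "$value"
}
```

For example, show_config_levels user.name prints one line per level and a final result line. To see which file every effective setting comes from, git config --list --show-origin gives a fuller picture.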
DESCRIPTION
The description file is used mainly by the GitWeb program, which displays it as the description of the repository on the GitWeb page. Developers typically write a short text there that explains what the repository is about.
The description file can also be read by hook scripts or other scripts belonging to the repository. For example, you can have a hook script "post-receive-email" that sends an e-mail to all members when commits are pushed to the repository, and uses the contents of the description file as the name of the repository in the e-mail subject.
HOOKS
Git hooks are scripts that run automatically every time a particular event occurs in a Git repository. They let you customize Git's internal behavior and trigger actions at key points in the development life cycle. A hook can be any executable, and the samples Git ships are shell scripts. For example, the commit-msg sample hook checks the commit message and stops the commit if the message contains duplicate Signed-off-by lines:
% cat .git/hooks/commit-msg.sample
#!/bin/sh
#
# An example hook script to check the commit log message.
# Called by "git commit" with one argument, the name of the file
# that has the commit message. The hook should exit with non-zero
# status after issuing an appropriate message if it wants to stop the
# commit. The hook is allowed to edit the commit message file.
#
# To enable this hook, rename this file to "commit-msg".

# Uncomment the below to add a Signed-off-by line to the message.
# Doing this in a hook is a bad idea in general, but the prepare-commit-msg
# hook is more suited to it.
#
# SOB=$(git var GIT_AUTHOR_IDENT | sed -n 's/^\(.*>\).*$/Signed-off-by: \1/p')
# grep -qs "^$SOB" "$1" || echo "$SOB" >> "$1"

# This example catches duplicate Signed-off-by lines.

test "" = "$(grep '^Signed-off-by: ' "$1" |
     sort | uniq -c | sed -e '/^[ ]*1[ ]/d')" || {
    echo >&2 Duplicate Signed-off-by lines.
    exit 1
}
Git hooks are local to a repository and are not copied over to a newly cloned repository. Because they are local, anyone with access to the repository directory can modify them, which means you need some other way to make sure the hooks stay up to date across your team.
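As a sketch of what a custom hook can look like, the script below rejects a commit whose subject line is empty or too long. The 72-character limit and the function name check_commit_msg are assumptions chosen for illustration, not Git rules.

```shell
#!/bin/sh
# Hypothetical commit-msg hook: reject an empty or overly long subject line.
# Git calls a commit-msg hook with one argument, the path of the file that
# holds the commit message.
check_commit_msg() {
    msg_file=$1
    # The subject is the first line of the message file.
    subject=$(sed -n '1p' "$msg_file")
    if [ -z "$subject" ]; then
        echo >&2 "Commit message subject is empty."
        return 1
    fi
    # 72 is an assumed team convention, not something Git enforces.
    if [ "${#subject}" -gt 72 ]; then
        echo >&2 "Commit message subject is longer than 72 characters."
        return 1
    fi
    return 0
}

# Run the check only when a message file was passed, so the file can also be
# sourced without arguments. A non-zero exit status stops the commit.
[ "$#" -eq 0 ] || check_commit_msg "$1" || exit 1
```

To try it, save the script as .git/hooks/commit-msg and make it executable with chmod +x .git/hooks/commit-msg.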
Info/exclude
If you need to make sure that Git does not track specific files, such as virtual environment files or files with credentials, you can add them to .gitignore.
There are 3 ways of excluding files in Git:
- .gitignore: applies to every clone of this repository (it is versioned)
- .git/info/exclude: applies only to your local copy of this repository (local, not shared with other developers)
- ~/.gitignore, or whichever file the core.excludesFile setting points to: applies to all the repositories on your local computer and is not shared with others.
The advantage of .gitignore is that you can have multiple .gitignore files, one inside each directory or subdirectory for directory-specific ignore rules, unlike .git/info/exclude. Another advantage is that .gitignore is applied across all clones, which means that in large teams all developers ignore the same kinds of files, while with .git/info/exclude every developer would have to set this up locally.
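For rules that should stay on your machine only, you can append patterns to .git/info/exclude. A minimal sketch, assuming you run it from the top of the repository; the function name exclude_locally and the .venv/ pattern are illustrative.

```shell
#!/bin/sh
# exclude_locally PATTERN: add PATTERN to .git/info/exclude unless it is
# already listed there. Hypothetical helper; run it from the repository root.
exclude_locally() {
    pattern=$1
    exclude_file=.git/info/exclude
    mkdir -p .git/info
    # -x matches whole lines, -F treats the pattern as a literal string.
    grep -qxF -e "$pattern" "$exclude_file" 2>/dev/null ||
        printf '%s\n' "$pattern" >> "$exclude_file"
}
```

After running exclude_locally '.venv/', the command git check-ignore -v .venv/ should report which file and line caused the path to be ignored.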
LOGS
The logs directory holds the reflog, a record of every update to HEAD and to your branches. Each entry shows the commit ID, who made the change and when, and the message. See an example below:
% cat .git/logs/HEAD
e7bde048443ef6e641851f8edebd74b60dc86b29 Renato Gentil <email@email.com> 1616434985 +0000 commit (initial): new file added
317c930d9aa7cc66c77958d8945dcd201ee59fb4 Renato Gentil <email@email.com> 1616538559 +0000 commit: second commit
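On disk, a reflog line normally starts with two hashes, the old and the new value of the ref, and separates the message from the rest with a tab. Under that assumption, the sketch below condenses each entry to a short hash and its message; the function name summarize_reflog is made up for illustration.

```shell
#!/bin/sh
# summarize_reflog [file]: print "<short new hash> <message>" per reflog entry.
# Assumes the on-disk layout:
#   <old-hash> <new-hash> <name> <email> <timestamp> <tz><TAB><message>
summarize_reflog() {
    awk -F '\t' '{
        # Everything before the tab is space-separated; field 2 is the new hash.
        split($1, head, " ")
        print substr(head[2], 1, 7), $2
    }' "${1:-.git/logs/HEAD}"
}
```

In day-to-day use, git reflog gives a similar summary without reading the files yourself.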
INDEX
The Git index is where you place the files you want to commit to the repository. The index is also known as the staging area. Before you commit a file to the repository, you first need to place it in the index.
As you can see in the figure below, git add "filename" places the file into the index, and when you then commit, the file is recorded in the local repository.
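To check what is currently sitting in the index waiting to be committed, you can ask Git for the staged differences. A small sketch, assuming git is on your PATH and you are inside a repository; the function name staged_files is illustrative.

```shell
#!/bin/sh
# staged_files: list the paths whose staged content differs from the last
# commit, which is what the next "git commit" would record.
staged_files() {
    git diff --cached --name-only
}

# Example session (test.txt is an illustrative file name):
#   echo hello > test.txt
#   staged_files          # prints nothing, the file is not in the index yet
#   git add test.txt
#   staged_files          # prints test.txt
```

For a lower-level view of the index itself, git ls-files --stage lists every entry together with its mode and blob hash.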
OBJECT
A Git repository is a collection of objects, each identified by its own hash. When you add a file, Git generates a hash from its contents, and this hash uniquely identifies that version of the file. Git stores file contents as blob objects and directory listings as tree objects. If you want to see the objects at the top level of your master branch, you can run git ls-tree master . as shown below:
% git ls-tree master .
100644 blob e5695bc8dc652b66c02f15b5bc8396c5c884a045 file.txt
As you can see above, this is a blob object. A blob object stores the content of a committed file. You can also confirm the object type by using git cat-file -t followed by the hash:
% git cat-file -t e5695bc8dc652b66c02f15b5bc8396c5c884a045
blob
A blob object is just a sequence of bytes, and it holds the same data as the file. The main difference is where each one lives: the blob object is stored in the Git object database, while the file is stored on the file system. This means that if you create a file "test.txt" in your repository and add it to Git, you end up with two copies of the same content, one on the file system and one as a blob object in the Git object database. You can see the content of the blob by running git cat-file -p followed by the blob's hash.
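The blob hash itself can be reproduced by hand: Git hashes a short header, "blob", the size in bytes and a NUL byte, followed by the file content. The sketch below assumes a repository using the default SHA-1 object format and that either sha1sum or shasum is installed; the function name blob_hash is made up for illustration.

```shell
#!/bin/sh
# blob_hash FILE: compute the ID Git would give FILE's content as a blob,
# i.e. SHA-1 over "blob <size>\0<content>". Assumes the SHA-1 object format.
blob_hash() {
    file=$1
    # Strip the padding some wc implementations add around the count.
    size=$(wc -c < "$file" | tr -d ' ')
    {
        printf 'blob %s\000' "$size"
        cat "$file"
    } | {
        if command -v sha1sum >/dev/null 2>&1; then
            sha1sum
        else
            shasum -a 1
        fi
    } | cut -d ' ' -f 1
}
```

The result should match what git hash-object prints for the same file, and for a file already added to the repository, the hash listed by git ls-tree.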
Final Words
This post showed which files are created when a directory is initialized as a Git repository. It gave a basic explanation of each entry in the .git directory structure, so you can understand them and use them when working on your Git repositories.
I hope you have enjoyed this post. Feel free to comment and send me a message if you liked it.