Getting Started – Git
In this article I’ll be covering some of the things you may need to know about Git. I’ll try to go into the concepts behind Git as well as how to start using it successfully. This article is written so that it flows into my next article, which will cover Git-Flow for branching strategies. I hope you’ll enjoy both of these and that you’ll find them helpful in your own projects.
What I’ll be going through is as follows:
- Git – What is it?
- Git Setup
- Managing Git
- Creating a Repository
- Cloning a Repository
- Adding files to a Repository
- Committing files to a Repository
- Pushing changes to a Remote Repository
- Pulling changes from a Remote Repository
- Fetching changes from a Remote Repository
- Git Diff
- Checking Status
- Resetting your Staging Area
- Remove Files from the Repository
- Renaming Files on the Repository
- Working with Remotes
- Managing Branches
- Merging Branches
- Deleting Remote Branches
- Connecting to a Git Repository
- Git Ignore
- Per Repository – Shared
- Per Repository – Unshared
- Useful Git Links
Just remember that this article will be changed and things will be added and refined as I learn new things or as I find new things that are relevant for everyone to know about Git.
Git – What is it?
To quote the GREAT Wikipedia: “Git is a distributed revision control system with an emphasis on speed”. Now if that doesn’t answer your question, then maybe the following will 🙂 For me to understand Git I had to compare it with the more legacy types of Version Control Systems like Team Foundation Server and Subversion. To make this simple I’ve come up with the following simplified image, which shows a Version Control System in the form that we may know and understand it:
In this case we have a centralized Version Control Server to which all developers connect when they want to start working on a specific project. They would get a copy of the project source on their local machines and would open it in their favorite IDE and start making the changes they needed to make.
Now, when you look at the same type of scenario on a Git structure you would have the following simplified image:
In the above image I recreated the Repository system to duplicate the same type of structure as what we have on the Legacy Version Control System. You may have noticed that I switched the “Workstation” image for the “Data Base” image. If you compare this image with the first one you’ll see that I used this same “Data Base” image for showing Projects i.e. Project1, Project2, Project3 etc. This is where one of the biggest differences comes in when using Git. Every time you get a copy (clone) of a Git Repository, you essentially end up with a complete copy of the Repository on your local machine i.e. all branches, all commits that have ever happened on it and all notes that are linked to it. So, to show what Git is and why it’s so powerful I’m now going to change the Git image to look slightly different:
In this image you’ll see that Developer 3 now connects to a Repository on Developer 4’s machine and Developer 4 now connects to a Repository on Developer 2’s machine. From this you can start to see a more distributed environment, because developers don’t have to work against a central repository to be able to work on the same project at the same time. It also means that if the Central Repository goes down, you would be able to continue working in a synchronized manner without worrying too much about merging back into the Central Repository.
I hope your mind has started going down some dark alleyways when looking at the above, because with a system like Git it is fairly easy to do Distributed Development, i.e. have Development Teams across the globe, some of whom may have good internet connectivity and others not. This brings me to my next point, where I’ll take the image above and change it even more by adding some tokens that show the transport mechanisms that can be used to merge these distributed Repositories, as well as an even more Distributed Environment where email could be used for merging these Repositories because email might be the only way to communicate. This is of course a hypothetical image, but I did it to make a point with regards to Git. The point is simple: Git is very flexible and it allows you to set up your Repository structures exactly how you want them. Let’s have a look at the image:
In this image I moved Developer 3 to his own locale where internet connectivity is not always available, so I’m using email to merge Project 3 and Project 4 back up to the various Repositories. I’ve also added an additional Developer and a new “Local Git Distributed” environment which hosts a “central” Repository for Developers 3 and 5, which could be set up to run a daily Jenkins job that emails the merge files to the main Git Repositories wherever their respective remotes might be.
To complete this section of “Git – What is it?” I’m going to leave you with this final thought:
“Git is a distributed revision control system with an emphasis on speed”
Git Setup
To install Git on an Ubuntu box, run the following command from a Terminal Window:
sudo apt-get install git
This will download and install Git for you and you should be able to start using it straight away…
Managing Git
In this section I’ll be covering some of the commands that you should know when you want to start using Git. This is written as a quick reference guide, but I’m sure there are other sites like the ones I have under Useful Git Links that will show you some of the other commands that you may want to use. I’m only going to cover the ones that I use in most cases and when I come across some new ones I’ll be adding them to this section.
Creating a Repository
If you already have a Project folder and you simply want to create a Repository in that folder you can run the following command from a Terminal Window inside the folder:
git init
Once run it should say that an empty Git Repository was initialized. If you then list the folder’s contents including hidden entries (for example with ls -a), you should see that Git has created a “.git” folder.
All the configuration and settings for your new Repository will be under this new folder.
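To try this without touching a real project, here is a minimal sketch. The scratch folder created with mktemp is an assumption made for illustration and simply stands in for your Project folder.

```shell
set -e
# Hypothetical scratch folder standing in for an existing Project folder.
project="$(mktemp -d)"
cd "$project"

# Create the Repository inside the folder.
git init

# List everything, hidden entries included, to confirm the ".git" folder is there.
ls -a
```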
Cloning a Repository
What you should know about the “clone” command is that it will do the following on your machine:
- Create a sub-folder with the Project’s name.
- Initialize an empty Git Repository.
- Download a Complete copy of the Repository you’re busy cloning which includes a complete commit history and all the branches in that Repository.
- Set the local Git Repository’s remote to that of the Repository you’re cloning.
To do this you can simply run the following command:
git clone git://github.com/Yakiloo/HelloWorld-GitHub.git
In this case it will clone my online Repository that is hosted under my organization called “Yakiloo”, and the repository that will be cloned is named “HelloWorld-GitHub”. This simply clones a Read-Only copy of the online Repository, but you can change this by putting your own Repository’s URL in its place. The structure of the clone command is:
git clone [url]
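The effects listed above can be checked offline with a rough sketch like the one below. It assumes a small throwaway Repository on disk in place of a hosted one; the folder names "source" and "copy" and the README file are made up for illustration.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"

# Build a tiny source Repository to clone from (stand-in for a hosted one).
git init -q "$work/source"
cd "$work/source"
echo "hello" > README
git add README
git commit -q -m 'first commit'

# Clone it: Git creates the "copy" sub-folder, copies the history and sets the remote.
cd "$work"
git clone -q "$work/source" copy
cd copy
git remote -v        # the remote is registered under the alias "origin"
git log --oneline    # the commit history came along with the clone
```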
Adding files to a Repository
Let’s say that we have a file called “Gemfile” in our Project’s folder and we would like to add this file to our Repository. In this case you can simply run the following command:
git add Gemfile
If we had more than one file, we could add them all by running the following space-delimited command:
git add Gemfile Gemfile.lock
If you make a change to any of your files you should also add the files by running the command above. I feel that I should say that this “add” command doesn’t really add your files to the Repository, it simply adds the files to your Staging Area. To actually add these files to your local Repository you would have to run the “commit” command covered in the next section.
Committing files to a Repository
So, now that you’ve added some files to your Staging Area and you would like to commit these files to your Repository, you should run the commit command as follows:
git commit -m 'some commit message'
In this statement you’ll see the “-m” flag which simply means that the next bit of the command is the “Commit Message” that you would like to associate with your changes. Once the commit command has been run your changes will be committed to your local Repository. You may at this stage want to push these changes to your “remote” Repository.
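Putting the add and commit steps together, a minimal sketch could look like this. It assumes a scratch Repository and placeholder file contents; the author identity exported at the top is made up so the commit works on a machine with no Git identity configured.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder content for the two files named in the text.
echo 'source "https://rubygems.org"' > Gemfile
echo 'GEM' > Gemfile.lock

git add Gemfile Gemfile.lock                   # files are now in the Staging Area
git status -s                                  # both are listed as staged additions
git commit -q -m 'Add Gemfile and lock file'   # now they are in the local Repository
git log --oneline
```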
Pushing changes to a Remote Repository
To push changes to a Remote Repository you can run the following command:
git push origin master
If you break this down then the format would be:
git push [alias] [branch]
So, in Git you can create aliases for remote Repositories to make it easier on yourself. I’ll cover this in a different section, but for now you should know that if you run the push command and some other developer has already made some changes to the remote Repository, then you will have to do a “pull” first, merge the changes that you’ve made into that Developer’s changes and only then will you be able to do a push. I’ll cover this in the next section.
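To experiment with push without a hosted service, the sketch below uses a bare Repository on the local disk as the remote. That setup, the folder names and the README file are assumptions for illustration only.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"

# A bare Repository on disk plays the role of the remote.
git init -q --bare "$work/remote.git"

git init -q "$work/local"
cd "$work/local"
# Name the first branch "master" regardless of the machine's default branch setting.
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"

echo "hello" > README
git add README
git commit -q -m 'first commit'

git push -q origin master    # [alias] = origin, [branch] = master
git ls-remote origin         # the remote now lists refs/heads/master
```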
Pulling changes from a Remote Repository
First thing that I would like to say before going into this section is that you shouldn’t do a pull… 🙂 Why? Well, when Git does a pull it actually does two things as follows:
- It fetches the changes that were made on the remote.
- It merges the changes into your local source.
So, why is this bad? Let’s say that you made changes to your “Gemfile” locally and some other developer has also made changes to this file. If the changes conflict, Git will add conflict markers (lines starting with “<<<<<<<”, “=======” and “>>>>>>>”) to the file, which means that you would have to go into each conflicted file to remove these markers and to ensure that the merge was done correctly. So, instead of doing a pull I prefer doing a fetch and then a merge myself. These commands give me more control over how the merge happens, but for this section let’s see how you do a pull from a remote Repository. To do this you simply run the following command:
git pull origin master
To break this down again you should follow the standard of:
git pull [alias] [branch]
Once you’ve run this command you should now have the latest version of the source in the remote Repository on your local Repository with all the commits that were done by the other developers in your company/team etc.
Fetching changes from a Remote Repository
To fetch new branches and changes from a remote Repository into your local Repository you can run the following command:
git fetch origin master
I’m sure you’re starting to see how all the commands in Git are structured. So, once again the above command can be broken down as follows:
git fetch [alias] [branch]
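As a rough sketch of the fetch-then-merge approach, the script below sets up two working copies against a local bare remote. The developer folders "alice" and "bob" and the notes.txt file are hypothetical names used only for this example.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
git init -q --bare "$work/remote.git"
git -C "$work/remote.git" symbolic-ref HEAD refs/heads/master

# First developer publishes an initial commit.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"
echo "one" > notes.txt
git add notes.txt
git commit -q -m 'first commit'
git push -q origin master

# Second developer clones, then the first one pushes more work.
git clone -q "$work/remote.git" "$work/bob"
echo "two" >> notes.txt
git commit -q -am 'second commit'
git push -q origin master

# Fetch only downloads: the local master branch is untouched until we merge.
cd "$work/bob"
git fetch -q origin master
git log --oneline master..origin/master   # review what came in
git merge -q origin/master                # merge when ready
```

Inspecting master..origin/master between the two steps is what gives the extra control over the merge.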
Git Diff
Git has a built-in tool that can be used to see what you’ve changed on your local Repository. This tool can be used in different contexts as follows:
- git diff
This will show all changes you’ve made to source files locally that haven’t been added to your Staging Area.
- git diff --cached
This will show you all changes that you’ve made on your local Repository that have been added to your Staging Area.
- git diff HEAD
This will show you all changes that have been made whether they’ve been added to your Staging Area or not.
- git diff --stat
The “--stat” flag can be run with any of the commands above to get a summary of the changes that were made in the relevant context.
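The difference between these contexts is easiest to see with one staged and one unstaged change in the same file. This is a small sketch under that assumption; the file name notes.txt and its contents are made up.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
printf 'line one\n' > notes.txt
git add notes.txt
git commit -q -m 'first commit'

printf 'line two\n' >> notes.txt
git add notes.txt                    # "line two" is staged
printf 'line three\n' >> notes.txt   # "line three" is not staged

git diff               # only the unstaged addition of "line three"
git diff --cached      # only the staged addition of "line two"
git diff HEAD          # both additions
git diff --stat HEAD   # one-line summary per changed file
```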
Checking Status
You can check the status of your local Repository and the files it contains quite easily by running the following command:
git status
This command will tell you which files were added, deleted and changed. It also has a flag that I find useful because it gives you a shortened version of the output. You can use this flag as follows:
git status -s
This is one of those commands that will make you very happy 🙂
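For a feel of the short format, the sketch below creates one modified, one staged and one untracked file in a scratch Repository. The file names are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "a" > tracked.txt
git add tracked.txt
git commit -q -m 'first commit'

echo "b" >> tracked.txt    # modified but not staged
echo "c" > staged.txt
git add staged.txt         # new file, staged
echo "d" > untracked.txt   # Git does not know about this one yet

# Two status columns per file: first is the Staging Area, second is the working folder.
# "A " = staged addition, " M" = unstaged modification, "??" = untracked.
git status -s
```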
Resetting your Staging Area
So, let’s say that you added some changes to your Staging Area and you would like to undo this, then you can use the reset command as follows:
git reset HEAD
If you would like to remove only the changes to certain files from your Staging Area then you can run the command as follows:
git reset HEAD -- fileName1 fileName2
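The following sketch shows both forms in a scratch Repository, assuming two tracked files that have each been changed and staged. Note that reset in this form only unstages; the edits stay in your working folder.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "one" > fileName1
echo "one" > fileName2
git add fileName1 fileName2
git commit -q -m 'Add two files'

# Change both files and stage both changes.
echo "two" >> fileName1
echo "two" >> fileName2
git add fileName1 fileName2

# Unstage only fileName1; fileName2 stays staged.
git reset -q HEAD -- fileName1
staged_after_partial="$(git diff --cached --name-only)"
git status -s

# Unstage everything that is left.
git reset -q HEAD
git status -s
```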
Remove Files from the Repository
To remove files from your Repository you can use the “rm” command as follows:
git rm fileName1
This command will effectively delete the file from your Repository, your Staging Area and your local hard drive. So, in essence it deletes it completely. If you don’t want to delete it from your local hard drive, but still want to remove it from the Staging Area as well as the Repository you can run the following command:
git rm --cached fileName1
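A side-by-side sketch of the two variants, assuming a scratch Repository with two committed files:

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "one" > fileName1
echo "two" > fileName2
git add fileName1 fileName2
git commit -q -m 'Add two files'

git rm -q fileName1            # removed from the Staging Area and from disk
git rm -q --cached fileName2   # removed from the Staging Area, kept on disk
git status -s                  # both deletions are staged; fileName2 is now untracked
git commit -q -m 'Stop tracking both files'
```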
Renaming Files on the Repository
This works more or less the same way as it does in Linux. To rename a file, you move it. So, in the context of Git you run this command if you would like to rename the file:
git mv fileName1 fileName2
This does the following:
- git rm --cached fileName1
- Move file on disk to new file name
- git add fileName2
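Here is a quick sketch of a rename in a scratch Repository, with fileName2 assumed as the new name:

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "content" > fileName1
git add fileName1
git commit -q -m 'Add fileName1'

# Rename on disk and in the Staging Area in one step.
git mv fileName1 fileName2
git status -s    # the rename is staged and ready to commit
git commit -q -m 'Rename fileName1 to fileName2'
```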
Working with Remotes
To understand Git Remotes you have to accept that Git is a Distributed Version Control System. This means that essentially there is no centralized server, there’s only your “remote” or “remotes” in some cases. So, when you’re doing development and you want to pull or push, you should know which remote you need to pull from or push to. If you run the following command, Git will come back with a list of the current remotes that have been set up on your local Git Repository:
git remote
This is also the entry point into managing these remotes. So you can add remotes and remove remotes. So, let’s get into the actual commands you’ll need to know when you work with remotes. To add a Remote you can run the following command:
git remote add [alias] [url]
In this command “alias” is the name that you want to use to reference this remote and “url” is the actual remote’s Git URL, for example “git@github.com:Yakiloo/HelloWorld-GitHub.git”. Git essentially allows you to add more than one remote and allows you to merge these multiple remote Repositories into your local one and push the changes you made up to the remote Repositories. Taking this example the complete command would look like:
git remote add GitHub git@github.com:Yakiloo/HelloWorld-GitHub.git
Once this remote has been added you can run the command:
git remote -v
This will bring up a list of remotes, also showing which URL will be used when you push and which URL will be used when you fetch or pull. Now that we were able to add a remote, at some stage we may decide to remove a specific remote. To do this you would run a command that looks more or less like:
git remote rm [alias]
If we take the example above then the complete command would look something like:
git remote rm GitHub
If you run the command below again, you’ll see that the remote with the alias “GitHub” was removed from your list of remotes:
git remote -v
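The whole add, list and remove cycle can be tried without any network access, since adding a remote only records its URL. In this sketch the URL is a made-up placeholder, not a real Repository.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Hypothetical URL used purely for illustration; nothing is contacted here.
git remote add GitHub https://example.com/team/project.git
listed="$(git remote)"
git remote -v      # shows the fetch and push URL for "GitHub"

git remote rm GitHub
git remote -v      # prints nothing: the remote is gone
```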
What you need to get used to when you do any development against Git is creating, merging and removing branches. This leads us into the next section where we’ll be working with branches.
Managing Branches
In this section we’ll cover some of the basics that you need to know about Git Branches. The branch command allows you to do different things including listing, creating and deleting branches. Before we start with the different sub-commands you need to understand that if you want to work with Git you’ll need to get used to the way Git branching works. It will save you a lot of time and frustration.
To list all the current branches on your local Repository you can run the following command:
git branch
If you would like to see all local and remote branches you should run the command above with the “-a” flag as follows:
git branch -a
To create a new Branch you can run the following command:
git branch [MyBranchName]
To switch between branches you should use the following command:
git checkout [MyBranchName]
For a shortcut to creating and switching to a branch you can also use:
git checkout -b [MyBranchName]
At some stage you’ll want to delete some of your branches. In this case you can simply use the “-d” flag along with your branch name as follows:
git branch -d [MyBranchName]
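The branch commands above can be strung together as in this short sketch. The branch names feature-x and feature-y are invented, and the Repository needs at least one commit before branches can be created.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
echo "hello" > README
git add README
git commit -q -m 'first commit'

git branch feature-x            # create a branch
git checkout -q feature-x       # switch to it
git checkout -q -b feature-y    # create and switch in one step
git branch                      # the current branch is marked with "*"

# "-d" refuses to delete the branch you are on, so switch away first.
git checkout -q feature-x
git branch -d feature-y
```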
In the next section I’ll be covering some of the things you’ll need to know about merging branches. I’ll also go into the way that I work and how I use branches to make my life a bit easier and give me more control around pulling new code and merging my code into the new code.
Merging Branches
In the previous section I said that you’ll need to get used to the way Git branching works. Of course this is if you decide to use Git the way that I do. To take you through a high level of the way I do things and make it a bit more visual you can have a look at the following steps:
- git clone [url]
- git branch [MyBranchName]
- git checkout [MyBranchName]
- Change some code.
- Run my tests.
- Add my code changes to the Staging Area of my branch.
- Commit my code changes.
- git checkout master
- git pull origin master
- git merge [MyBranchName]
- Fix merge issues
- git push origin master
You may think that this is overkill in your case, but working in a team environment where my team members are constantly pushing new code and changes to our Central Repository on GitHub, I’ve learned that if I don’t do things this way I’ll have to pull straight into the code I’m busy changing. The pull then does some automatic merging for me and adds conflict markers to the files that both I and one of my team members have changed. This would then force me to go and fix these files manually, and only after doing that would I be able to do a push to the Central Repository. In the steps that I outlined above I do a pull from the Central Repository first and then selectively merge the changes I have in my branch into the master branch. This makes the process a bit easier and to help me even more I’m also using Git-Flow, which I’ll be covering in a subsequent article.
To move into this section I’m simply going to explain how to use the merge command on a very high level and then leave you to play with it yourself. To merge two different branches in Git you should switch into the branch that you want the changes merged into. In my steps above you can see this being done in step 8 (git checkout master). Once in that branch you should run the following command:
git merge [MyBranchName]
At this stage Git will try to merge the changes for you and will list the files that could not be merged automatically due to conflicts. To help you with fixing these merge conflicts you can use the git diff command, which will highlight the issues in the files. Once the changes are made you should add them to your staging area again, commit them and then push your changes to your master. For more information on how this works I would suggest you go to the reference site listed under my Useful Git Links section. It helped me a lot… 🙂
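To see what a conflict looks like and how it gets resolved, here is a minimal sketch that forces one on purpose. It assumes a scratch Repository where the same line of a placeholder Gemfile is changed on both branches; the gem names are arbitrary.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
echo 'gem "rails"' > Gemfile
git add Gemfile
git commit -q -m 'Add Gemfile'

# Change the same line on a branch and on master so the merge must conflict.
git checkout -q -b MyBranchName
echo 'gem "sinatra"' > Gemfile
git commit -q -am 'Use sinatra'
git checkout -q master
echo 'gem "rack"' > Gemfile
git commit -q -am 'Use rack'

git merge MyBranchName || true            # Git reports the conflict and stops
git status -s                             # the conflicted file is flagged
markers="$(grep -c '^<<<<<<<' Gemfile)"   # count the conflict blocks Git wrote
cat Gemfile                               # both versions, wrapped in conflict markers

# Resolve by writing the content you want, then stage and commit.
printf 'gem "rack"\ngem "sinatra"\n' > Gemfile
git add Gemfile
git commit -q -m 'Merge MyBranchName into master'
```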
Deleting Remote Branches
If you have created a branch on a Remote Repository, for instance on GitHub, then you may at some stage decide to delete the remote branch. How to do this is not quite as obvious as you would expect, but it’s quite simple. What you need to do is tell Git to push nothing to the remote branch in the following manner:
git push [alias] :[remotebranchname]
The colon “:” is important to remember when you do this.
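The sketch below tries this against a bare Repository on the local disk, which is an assumption standing in for a hosted remote. The branch name "experiment" is made up.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
git init -q --bare "$work/remote.git"

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/remote.git"
echo "hello" > README
git add README
git commit -q -m 'first commit'
git push -q origin master

# Publish a second branch, then delete it on the remote by pushing "nothing" to it.
git branch experiment
git push -q origin experiment
git ls-remote --heads origin     # lists both experiment and master
git push -q origin :experiment
git ls-remote --heads origin     # only master is left
```

The local "experiment" branch is not touched by this; remove it separately with git branch -d if you no longer need it.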
Connecting to a Git Repository
In some cases you will probably want to connect to a Git Repository on another machine on your local network. I know that there are quite a few articles on the net that will take you through the steps of setting up a Git Server, but I find that this is more effort than is necessary seeing that Git has this built in. Because Git can use SSH as a transport, you can simply clone the Repository, provided you have SSH access to that machine, by running the following command:
git clone ssh://[User]@[MachineName]/[RepositoryDirectory]
Seeing that you can do this you can also now do the rest of the commands mentioned in the section above like add this Repository as a remote and all the other things that you want to do.
Git Ignore
In some cases you may decide to ignore certain files or, put a different way, you may choose not to track certain files in Git. There are three ways to go about setting this up: you can do it “Per Repository – Shared”, where the rules are tracked with your source and shared amongst developers; you can set up a global ignore rule, which will then apply to all your Repositories; or you can set it up in a “Per Repository – Unshared” manner, so that it’s not shared amongst developers and only applies to one Repository.
Per Repository – Shared
You can add a “.gitignore” file to any folder under your Repository. So if you had a simple folder structure as follows:
Then you would be able to put this “.gitignore” file at any level. If we take the above structure and you want to ignore the “systemimage.ico” file as well as all files with the extension “.dll”, then you could add the following two lines to your ignore file:
systemimage.ico
*.dll
This would then ignore those files as specified. You could also add comments to your file by using the “#” sign at the start of the line as follows:
# Ignore miscellaneous files
systemimage.ico
# Ignore all DLL files
*.dll
Once you’ve created your ignore file you could add it to your Repository so the rules you specified are shared amongst all the developers working on your Repository. This ensures that no-one commits files and code that should not be tracked by Git.
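One way to check that the rules do what you expect is git check-ignore, which reports the rule that matches a given path. This is a small sketch in a scratch Repository; helper.dll and main.rb are hypothetical file names added for the example.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Write the ignore rules described above.
printf '%s\n' \
  '# Ignore miscellaneous files' \
  'systemimage.ico' \
  '# Ignore all DLL files' \
  '*.dll' > .gitignore

touch systemimage.ico helper.dll main.rb

git status -s    # only .gitignore and main.rb show up as untracked
git check-ignore -v systemimage.ico helper.dll   # prints the rule matching each path
```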
You could also set up a Global ignore file. This will ensure that none of the file types that you specify will be tracked by Git. A good case for this is system files or package files. To do this you should create a file called “.gitignore_global” in any location that suits you. This is usually done in your home folder. Once you’ve created the file you’ll have to configure Git to point at your created file and location as follows:
git config --global core.excludesfile [path]/.gitignore_global
Once the file has been created you could add your files to ignore in the same manner as what you had in the previous section under “Per Repository – Shared”.
Per Repository – Unshared
This option should be used if you’re using some kind of editor that is not used by your team members or you’re using some tool that generates files that will only ever be created on your machine. To do this you could add your files in the same manner as shown in the section “Per Repository – Shared” to the file found in the path “.git/info/exclude”. Please take into consideration that this file won’t be tracked by Git and will also be deleted if you delete your local Repository.
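As a rough sketch of the unshared option, the script below adds a rule to the exclude file of a scratch Repository. The “*.tmp” pattern and the file names are assumptions chosen for illustration.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q

# The rule lives inside ".git", so it is never committed or shared.
mkdir -p .git/info
echo '*.tmp' >> .git/info/exclude

touch notes.txt scratch.tmp
git status -s    # only notes.txt is reported as untracked
```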
Useful Git Links
This should cover most of what you need to do to work with Git.
Git Reference Site
Working with remotes
I hope that after you’ve gone through this article you’ll have some more fun when doing development on a Git Repository. Personally I find Git to be an extremely flexible system and I must say that after using it and getting used to the way it works, I’ll have some trouble switching back to the Central Repository types of Version Control Systems. I can see the use of something like Git in the world and with the added services that some hosting providers give us for free or for a very low fee, Git will be around for a while.