How Git Works

Frank Matranga edited this page Oct 7, 2019 · 5 revisions

Taken from Frank's GitHub Campus Expert application

Explaining Git

When I was a much younger and much less experienced coder (little 13 year old Frank!), I once tried making some sort of basic game in Java. Now I thought this wouldn't be too hard a task, for I was a pretty cocky kid, but I soon found I had at least two problems:

  1. I didn't know Java
  2. I didn't have any ideas for a game

You can imagine the resulting mess was... well, let's just say I never want to go back and see if I have it saved anywhere! However, I don't regret working on it, because the issues I ran into are the reason I understood the concept of git right away when I came across it a year later. In fact, the issue that caused me to stop working on the project could've been totally avoided with the most basic function of git.

You see, at the time I was just working inside some IDE: writing code, saving as I went, running, writing, saving, etc. I had a little folder on my desktop where the project resided and that was that. Now, since I was trying to learn Java as I went, I often had to go back to my code and make pretty large changes. At one point, I was trying to rework a large section of the game code when I realized I had kind of no idea what I was doing, and I decided to give up on the rework. My only option was to undo the changes I was in the middle of. And Ctrl-Z only went back so far... I was unable to get my code back to the state it was in before I tried changing it all. And so just like that, I had totally broken my code and I gave up... So what does git have to do with this?

Git is "a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency." Put simply, git keeps track of the changes you make to your code so that you can jump back to any past version of your code with ease! It also lets you collaborate with people on the same project by letting them work on different areas of the code on their own computers, and then letting git merge together the changes made into one version! These are only some of the capabilities of this amazing and complex program.

The basic flow of using Git is simple and easy to remember. You first create a repository (commonly called a repo for short), which looks like just your project folder but behind the scenes holds all the extra info about your project, its past versions, and more! Whenever you add, change, or remove code or assets (any other files really) in your repo, you make a "commit" describing what you changed. In this way, you end up with a chain of commits as your project grows, recording every change from the beginning to its current state. At any point in time you can go back to a specific commit and have your project returned to the exact state it was in at the time of that commit. So, for my silly Java game attempt, if I had been committing as I went, I could've just gone back to the last commit once I realized that my major refactor just wasn't going to work. It's important to know that git only does what you tell it to, so you must go through the steps of creating a repo, adding the files so that git is aware of them, making commits, and pushing and pulling, though modern tools like IDEs can automate much of this.
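To make the "go back to the last commit" idea concrete, here is a minimal sketch of how my Java game disaster could have gone with git. The folder, the file name Game.java, its contents, and the throwaway identity are all made up for illustration, and this is only one of several ways to discard uncommitted edits.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)   # hypothetical project folder
cd "$repo"
git init -q         # create the repo (this adds the hidden .git folder)

echo "class Game { /* working version */ }" > Game.java
git add Game.java   # make git aware of the file
git commit -q -m "Add working game skeleton"

# A big refactor goes wrong...
echo "class Game { /* half-finished refactor that no longer compiles */ }" > Game.java

# ...so throw away the uncommitted edits and return to the last commit.
git checkout -- Game.java

restored=$(cat Game.java)
echo "$restored"    # prints the working version again
```

No Ctrl-Z required: the file is back to exactly what was in the last commit.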

Another important benefit of using git is that you can store a copy of your repository remotely somewhere, for example GitHub, so that the remote copy functions as the single source of truth for your project. You can have the repo on your own computer, work on it, commit as much as you want to it, and then update the remote repo by "pushing" your commits to it. And now, you can be on another computer and you can simply "pull" the commits from that remote copy and continue working on your project! Or if you are collaborating, you don't need to worry about sending your copy of the code back and forth to each other. You and your teammates simply refer to the remote repo as the one single source of the project where everything eventually is pushed to and pulled from.
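Here is a rough sketch of that push and pull round trip. To keep it runnable without a network connection, a local bare repository stands in for GitHub, and the "laptop" and "desktop" folders stand in for two computers. Those names, the file notes.txt, and the throwaway identity are assumptions for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"            # stand-in for the remote repo on GitHub

# "Computer" one: clone, commit, push.
git clone -q "$work/remote.git" "$work/laptop"
cd "$work/laptop"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
branch=$(git symbolic-ref --short HEAD)          # master or main, depending on your git setup
git push -q origin "$branch"

# "Computer" two: clone the same remote.
git clone -q "$work/remote.git" "$work/desktop"

# More work on computer one, pushed to the remote...
echo "second line" >> notes.txt
git commit -q -am "Add a second line"
git push -q origin "$branch"

# ...and pulled down on computer two.
cd "$work/desktop"
git pull -q origin "$branch"
cat notes.txt                                    # now contains both lines
```

With a real GitHub repo the only difference is the URL you clone from.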

Now imagine you have a friend or two working on different aspects of your project. Each of you would have your own local copy of the repo on your computer, would commit whenever you make changes, and would eventually push to the same remote repo so you can access each other's changes. Now, it might get pretty confusing if you are all committing to the same place back and forth, as your work in progress gets mixed in with other collaborators' work in progress. To address this, git lets you "branch" off the main line of development and commit to this "branch", which is separate from the main line and other branches so you don't mess anything up. This can also be done if you are working on a project alone and want to branch off for a large feature that will take many commits, keeping them separate from the main line of commits until you are done. For example, if I was adding a new feature to a website that added comments to pages, I would create a new branch called feature-comments or similar and make my commits there until I was happy with the feature. You can create a branch from any existing branch or any existing commit at any point! This makes it really easy to develop on different branches at once as they do not interfere with each other.
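As a small sketch of the feature-comments example, the script below commits on a new branch and then switches back to show that the main line is untouched. The file names and contents are invented for illustration, and the main branch name is read from git rather than assumed.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "<h1>Home</h1>" > index.html
git add index.html
git commit -q -m "Add home page"
main_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your git setup

# Branch off and do the feature work there.
git checkout -q -b feature-comments
echo "<form>comment box</form>" > comments.html
git add comments.html
git commit -q -m "Add comment form"

# Back on the main line, the feature's file does not exist yet.
git checkout -q "$main_branch"
if [ -e comments.html ]; then on_main="present"; else on_main="absent"; fi
echo "comments.html on $main_branch: $on_main"
```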

Now, once you are happy with a branch, for example because the feature it was dedicated to is complete, you will want to get all of those commits back into the main line of development. If this didn't happen, you'd end up with differing versions of your repo, each with its own commit history! Git allows us to "merge" branches into each other, which does exactly what it sounds like. So once someone has finished their work on a branch, they can merge it back into the branch they originally branched from, and git combines the changes to all of the affected files even if more work was done on the target branch in the meantime. That's pretty cool! As long as nobody modified the same lines of the same files you were working on in your branch, merging is smooth and automatic.
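A merge like that might look like the sketch below, where both the main line and the feature branch gain a commit before being merged. Because the two sides touch different files, git can combine them on its own. The file names and the throwaway identity are assumptions for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "<h1>Home</h1>" > index.html
git add index.html
git commit -q -m "Add home page"
main_branch=$(git symbolic-ref --short HEAD)

# Work on the feature branch.
git checkout -q -b feature-comments
echo "<form>comment box</form>" > comments.html
git add comments.html
git commit -q -m "Add comment form"

# Meanwhile, more work lands on the main line, in a different file.
git checkout -q "$main_branch"
echo "<p>About us</p>" > about.html
git add about.html
git commit -q -m "Add about page"

# Merge the finished feature back in; --no-edit accepts the default merge message.
git merge -q --no-edit feature-comments

merge_count=$(git rev-list --merges --count HEAD)
ls                      # about.html, comments.html, and index.html are all here now
```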

Occasionally, you will get merge conflicts and will have to resolve them yourself. This happens when you or your team has changed the same part of the same file differently in the two branches you are trying to merge. In this scenario, git pauses the merge and marks where the conflicts are inside the affected files, and you must manually edit those files into the state you want, then add and commit them to finish the merge. You should always do your best to ensure nobody else is working on the same files in the same areas as you when you have made a branch and are working on files in that branch!
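To see what a conflict looks like, the sketch below deliberately changes the same line of a hypothetical config.txt on two branches, lets the merge stop, and then resolves it by hand. The branch name feature-rename and the file contents are made up for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "title: My Game" > config.txt
git add config.txt
git commit -q -m "Add config"
main_branch=$(git symbolic-ref --short HEAD)

# The same line is changed differently on two branches.
git checkout -q -b feature-rename
echo "title: Space Game" > config.txt
git commit -q -am "Rename to Space Game"

git checkout -q "$main_branch"
echo "title: Puzzle Game" > config.txt
git commit -q -am "Rename to Puzzle Game"

# The merge stops with a non-zero exit status instead of guessing which side wins.
if git merge --no-edit feature-rename >/dev/null 2>&1; then
  merge_result="clean"
else
  merge_result="conflict"
fi
echo "merge result: $merge_result"

cat config.txt                                # both versions, wrapped in <<<<<<< ======= >>>>>>> markers
markers=$(grep -c '^<<<<<<<' config.txt)      # number of conflicted regions in the file

# Resolve by writing the version you actually want, then add and commit.
echo "title: Space Puzzle Game" > config.txt
git add config.txt
git commit -q -m "Merge feature-rename, resolving title conflict"
```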

Now to get into a bit more technical detail, you should know that in git's view, files are always in one of 3 possible states.

  1. modified
  2. staged
  3. committed

You can view your repo as a database of "snapshots" of all the files in your project and their different versions. Even files that were deleted are kept in there, and files that are added later on are kept there.

Modified

Modified means that the file has been changed since the last commit but has not been staged yet, so a snapshot of the new version will not be included in the next commit unless you "add" it. Once you add a modified file, it becomes staged.

Staged

Staged files have been modified and marked so that their current version with their changes will be included in the next commit snapshot.

Committed

Committed files have been modified, staged, and committed so that they are stored in the repo's database! Committed files have their history logged through commits so that the user can backtrack or fast forward to different versions of the file.
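You can watch a file move through these three states with git status. The sketch below uses the --short flag, which prints a two-letter code per file: the first column is the staged state and the second is the unstaged state. The file readme.txt and its contents are made up for the example.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "version one" > readme.txt
git add readme.txt
git commit -q -m "Add readme"

# 1. Modified: changed on disk, not staged yet.
echo "version two, with edits" > readme.txt
modified=$(git status --short)
echo "modified:  [$modified]"     # [ M readme.txt]

# 2. Staged: marked for inclusion in the next commit.
git add readme.txt
staged=$(git status --short)
echo "staged:    [$staged]"       # [M  readme.txt]

# 3. Committed: stored in the repo, so there is nothing left to report.
git commit -q -m "Update readme"
committed=$(git status --short)
echo "committed: [$committed]"    # []
```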


Now that we know the 3 states, this is the process we go through when working on a git repo:

  1. Add/modify/delete files
  2. Add the files from above that you want included in the next commit
  3. Commit the files
  4. Repeat!

Now how do you actually tell git to do these things? The simplest option is through the command-line, as git is primarily a command-line application. You must be in the folder of the repo and then you can run $ git <command> [options] (without the $).

For example, cloning a repository can be done with $ git clone https://github.com/Apexal/late.git which will create a new folder named after the project (in this case late/). Then inside of that folder you can run the rest of the git commands!

Here are some important ones:

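  • $ git status Tells you what branch you are on and whether anything has been edited but not yet added, or added but not yet committed.
  • $ git add [filename] Stages a file, as explained before, so that its current version is included in the next commit. For a brand new file, this is also what makes git start tracking it.
    • $ git add --all Stages all changes (new, modified, and deleted files) at once instead of one by one.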
  • $ git commit -m "Short message about changes here" Commits the changed and added files with a short message.
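  • $ git push origin master Pushes your new commits to another repo, usually your central GitHub repo. (Newer repos often name the default branch main instead of master, in which case you would use that name here.)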
  • $ git pull origin master Pulls in all the changes from a remote repo, again usually GitHub, to get any changes you or others have made to the code
  • $ git checkout feature-comments Switches to the specified branch locally. This changes the files in your folder to match what the branch holds. Now any changes and commits made will be on that branch. You can switch back to the default master branch the same way.
  • $ git checkout -b feature-login Creates and switches to a new branch (due to the -b option) and works the same as above.

Conclusion

In summary, these are the steps you take when working on a simple git project on GitHub/GitLab/BitBucket/etc.:

  1. Create or clone repo to your computer
  2. Edit/create/delete files in the repo
  3. add the files that you've modified
  4. commit the changes with a message
  5. push the changes to GitHub
  6. Repeat 2-5!

If you have any questions about this or git in general, please reach out to me at [email protected] or Discord @Frank‽#0001