Experiences in the community

The Git way

This will be a 2-3 part series on using git to fork a repo, make changes and initiate merge requests to the main repository.

Although I have shared about git and version control quite a few times already, before we begin with the practicalities of using git in a shared/forked environment, we need to understand why it came into being in the first place.

Long ago (50+ years back), the vendor who supplied the hardware was also the one who supplied the software, so it was easy to figure out who was at fault when something broke. Then, when the Unix and later the MS-Windows revolution came to computing, third-party vendors were on the rise. Unlike before, these vendors had to work in mixed and varied environments where one-size-fits-all solutions didn't work. To support their own internal software development, within what the copper networking of the time allowed, they started by keeping copies of the software, one per version. The way software development worked then, only one developer could work on a project or a file at any given point of time, and the file or project was 'locked' for exclusive use by that developer. People also have their own working hours and styles, so waiting on locks hampered software development. And as versions piled up, they quickly took a lot of space, since each version was its own full copy. To address these shortcomings, one of the first famous version control systems came out: CVS ('Concurrent Versions System'). While CVS is hardly used anymore, some of its more popular features were adopted by the two version control systems which followed, each bettering the last.

Note :- While CVS is known as the granddad of all version control, arguably it became popular only after its source code was publicly released in June 1986. People like Brian Berliner then worked on it, and the code was eventually licensed under the GPL in 1990.

Some of the unique features which CVS brought to the table were :-

a. Software conflict resolution :- When two or more people make changes to a file or a project, there may arise an issue called a 'merge conflict'. A merge conflict is nothing but two people having made changes to the same file where the changes are incompatible with each other for one reason or another. The software, via an algorithm (a resolver), tries to make sense of these changes and let both in; where it cannot, it reports the conflict. This was something the developer had to do manually before, and while there were deficiencies in the resolver and in how merge conflicts were shown, it was much better than what came before. Git made a huge jump in this, which we will talk about later.

b. Tracking third-party source releases and merging those changes :- Software development is not a static activity; as the word 'development' suggests, it means constant change. Hence it's more than likely that people outside your immediate development team will make changes and improvements to your software. This was especially true of Unix, the GNU/Linux distributions, the various BSDs and Sun Solaris (a notable Unix derivative). The growth of this Free and Open Source Software also drove changes in CVS as a product. Albeit incomplete, this support still represented a sort of mini-revolution.

c. An independent database which tracks all the changes made to the collection of files and repos :- While at times it could be bigger than the files themselves, it ensured that a separate database recorded the state of the repository, which made the repository state more robust.

d. Log support :- While still in its infancy, CVS introduced logging support, with which you could understand who committed what and for what reason through a short statement in the form of a commit message.
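To make point (a) a little more concrete, here is a minimal Python sketch of the idea behind a three-way merge. It is not the algorithm CVS or Git actually uses: it assumes all three versions of the file have the same number of lines, whereas real tools first line up the changed regions with a diff algorithm. The function name `merge3` and the sample file contents are made up for illustration.

```python
def merge3(base, ours, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption (not how real tools work): all three
    versions have the same number of lines.
    Returns (merged_lines, indexes_of_conflicting_lines).
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length versions")

    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)      # both sides agree (or nobody changed it)
        elif o == b:
            merged.append(t)      # only "theirs" changed this line
        elif t == b:
            merged.append(o)      # only "ours" changed this line
        else:
            conflicts.append(i)   # both changed it differently: a conflict
            merged.append(f"<<<<<<< ours\n{o}\n=======\n{t}\n>>>>>>> theirs")
    return merged, conflicts


base = ["title = Draft", "author = nobody", "status = open"]
ours = ["title = Final", "author = nobody", "status = open"]
theirs = ["title = Draft", "author = alice", "status = open"]

# Different lines were edited, so both changes are let in.
merged, conflicts = merge3(base, ours, theirs)

# Both sides edited line 0 differently, so the resolver gives up on it.
clashing = ["title = Review", "author = nobody", "status = open"]
_, clash = merge3(base, ours, clashing)
```

In the first call `conflicts` is empty and `merged` contains both edits; in the second call `clash` is `[0]`, which is the point where a human has to step in.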

CVS was good, but it was slow, its database was not very efficient, and its internal code structure was so hacky that it couldn't be refactored. New releases became more and more distant, and CVS couldn't rise to the continuing challenges of the software development process.

Hence came the rise of Subversion. The name itself is interesting: to subvert is to change something from within. Subversion tried to live up to its name and was successful in making open software development much easier. But challenges became apparent as software grew larger and more varied development teams tried to use it.

One of the biggest issues which came to light was branching in Subversion. Branching is when you have a crazy idea (or two, or three) and you want to try it out without disturbing the main development line. Just as a tree has one root and can have many branches, the same is true here. So you could have the sane, regular development happening on the main line (HEAD, or the trunk in Subversion's terms) and all the other 'feature branches' happening alongside it. If a feature branch turns into something good, you can merge it into the main line. While this looks easy in theory, in practice it is much more complex, and merging in Subversion was costly, both in computing cycles and in the size of the diffs generated. Branching support also came much later in Subversion and was kind of bolted on top.
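The branching idea can be sketched in a few lines of Python. This is a toy model only, loosely following the way Git treats a branch as a name pointing at a commit; it is not how Subversion stores branches. The names `ToyRepo`, `Commit` and the branch names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Commit:
    message: str
    parent: Optional[int]  # id of the previous commit, None for the first one


@dataclass
class ToyRepo:
    commits: list = field(default_factory=list)
    branches: dict = field(default_factory=dict)  # branch name -> commit id

    def commit(self, branch, message):
        """Add a commit on top of `branch` and move the branch forward."""
        parent = self.branches.get(branch)
        self.commits.append(Commit(message, parent))
        self.branches[branch] = len(self.commits) - 1
        return self.branches[branch]

    def branch(self, new_name, from_branch):
        """Creating a branch is just a new name for an existing commit."""
        self.branches[new_name] = self.branches[from_branch]

    def history(self, branch):
        """Walk parent links from the branch tip back to the first commit."""
        out, cid = [], self.branches.get(branch)
        while cid is not None:
            out.append(self.commits[cid].message)
            cid = self.commits[cid].parent
        return out[::-1]  # oldest first


repo = ToyRepo()
repo.commit("main", "initial import")
repo.commit("main", "fix typo")
repo.branch("crazy-idea", "main")            # try something out on the side
repo.commit("crazy-idea", "try the crazy idea")
repo.commit("main", "regular work")          # the main line carries on undisturbed

main_history = repo.history("main")
idea_history = repo.history("crazy-idea")
```

Here `main_history` never sees the experimental commit, while `idea_history` shares the first two commits with the main line and then goes its own way. Merging the branch back is the hard part, and it is deliberately left out of this sketch.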

One of the great things which was appreciated and became a hallmark of FOSS was 'dog-fooding'. Dog-fooding (slang) in the software development process is nothing but using your own software to do the work it advertises to others. Subversion was one in a long line of software products which did dog-fooding to understand, optimize and share belief with others, and it quickly became famous as it was easier to use than CVS (arguably), was faster (in some aspects) and developers loved it. There were loads of sites which embraced it. You had sites such as Freshmeat, Sourceforge.net and many others who used and advertised Subversion as the end of all issues. Subversion still had a few issues: it was slow and complex for large software projects and, as shared before, didn't play nice with branches and their merges.

Enter Linus Torvalds and git. Mr. Torvalds, then a student and hobbyist programmer in his early twenties, one fine day shared on a Usenet newsgroup that he wanted to do a toy kernel similar to the Minix kernel. Those words and actions, over days and weeks, transformed that toy kernel into what is now known as the Linux kernel, licensed under GPLv2. Mr. Torvalds had worked with CVS and Subversion and found them lacking. Being unhappy with the state of things, he and his merry band of kernel hackers used a commercial solution by the name of BitKeeper. Now the owner of BitKeeper had put a strange condition that nobody should attempt to reverse-engineer the code. When the differences over this became irreconcilable, Linus and his merry band parted ways with BitKeeper.

Mr. Torvalds then developed Git. The idea of git is and was to have all the good things of both CVS and Subversion and none of the disadvantages. The code was refactored quite a bit and written to be easily extended. It was also designed to be faster and cheaper than the others as far as branching and merging branches were concerned. These were all part of the design. It took him some time to review other products before launching into Git development, which is now having its day in the sun.

I will be publishing some of the simplest commands and scenarios with respect to Git in the next couple of days. Adios till next time.
