Preparing a GitHub repository for open source
Lots of developers work on pet projects besides their job. Things are pretty straightforward when you work alone. You code some functionality, then commit the changes and push them to a repository host like GitHub, Bitbucket or GitLab. Simple as that. But at some point, your code might turn into a full product. Folks start using it, new contributors come and your repository becomes their workspace as well. As you can probably guess, if you want to keep control over the incoming code, you need to be well prepared; otherwise, you’ll probably fail and end up with an unstable piece of code. So, how should you prepare your repository? Let’s find out!
Note that in this article I use GitHub as the example, but all the rules below apply to other providers as well.
As I mentioned at the beginning, working alone is fairly simple because we don’t need to care about one thing: managing our code. We can safely push to master without any fear of conflicts, missing code, and other issues. But this approach is not going to work when you collaborate with other folks. First, since development is distributed and many people work on the same files, sooner or later conflicts will appear. Besides that, when people use your code in their projects, they expect it to be (more or less) stable at any time. So, how can you guarantee stability when any change can be pushed directly to the master branch? Well, you simply can’t. That’s why GitFlow became so popular. The idea is very simple: start using more than one branch. Of course, there are some other rules:
- Your two permanent branches are master (for production code) and develop (for all new features). This basically means that most of the time develop should be N commits ahead of master.
- When you need to add some functionality, create a feature branch from develop and work on it until you’re done. Then merge the new functionality into develop.
- The previous rule also applies to all kinds of bugs that are not critical and don’t need to be fixed quickly.
- When all the functionality is finished and ready for release (so it is tested and stable), you can merge the develop branch into master.
- If you need to do a hotfix, create a branch directly from master and work on it until you’re done. Then merge it into both master and develop (thanks to that you won’t lose the hotfix changes the next time you merge develop into master).
That’s basically it! If you want to read a more detailed explanation, go to this article. I swear you won’t regret it!
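The rules above can be summed up in a few lines of Python. This is only a sketch for illustration: the `feature/`, `bugfix/` and `hotfix/` prefixes and the `plan_branch` helper are my own assumptions, not part of GitFlow tooling. Given a branch name, it tells you which branch to start from and where the work has to be merged once it is done.

```python
# Hypothetical naming convention: "<kind>/<short-name>", e.g. "feature/login".
# Each kind maps to (branch to create from, branches to merge into).
GITFLOW_RULES = {
    "feature": ("develop", ("develop",)),
    "bugfix": ("develop", ("develop",)),          # non-critical bugs follow the feature rule
    "hotfix": ("master", ("master", "develop")),  # hotfixes go back to both branches
}


def plan_branch(name):
    """Return where a working branch starts and where it must be merged."""
    kind, _, short_name = name.partition("/")
    if kind not in GITFLOW_RULES or not short_name:
        raise ValueError(f"unexpected branch name: {name!r}")
    base, targets = GITFLOW_RULES[kind]
    return {"branch": name, "create_from": base, "merge_into": list(targets)}


if __name__ == "__main__":
    print(plan_branch("feature/login"))
    print(plan_branch("hotfix/crash-on-start"))
```

The hotfix entry is the interesting one: it is the only kind of branch that starts from master and has two merge targets.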
For those of you who are not used to this kind of workflow, it might look very complicated and unnecessary. But in fact, this approach gives you a lot of benefits. First, it makes working together on the same feature very easy. Since only part of the team works on a given piece of functionality, there are far fewer conflicts and much less mess on the feature branch. Then there is stability. As I described above, the only production code that people actually use is on the master branch. So, if you want to refactor code, add a proof of concept, or do whatever else you want, simply create a feature branch and work on it. If the idea wasn’t good, you can delete the branch without affecting develop and, more importantly, master. This approach is also very good when it comes to creating release notes, simply because you can easily spot all the features that were part of a given release. Doing the same with only a master branch? Well, good luck.
Use Pull Requests
All right, we know the workflow but we’re missing something. How should we merge the changes from feature branches into develop and further into master? Some say that’s fairly simple: use the git merge command in bash! Well, of course, this will work, but keep in mind that we’re discussing preparing our repo for open source, so probably collaboration with other folks. We need to remember that anybody can join us. This basically means that the range of experience and skill might be pretty big, from beginners to uber-master developers. That is the beauty of open source, but at the same time it can get you into trouble.
Imagine that an inexperienced dev created a feature branch to work on a great new thing for your product. Unfortunately, their implementation causes the N+1 problem, which is very common when working with databases. Merging this into develop and then into master would probably seriously affect performance and poof, you’re screwed. Now, to be clear, I’m not saying that all the trouble comes from less experienced programmers, because this happens to all of us. I mean, I make a lot of mistakes which I simply can’t spot. Some are caused by my haste, some by my lack of knowledge. That’s perfectly fine as long as it does not affect the user.
So, the solution is very simple: use pull requests instead of just merging. If you’ve never done this before on GitHub, don’t worry, since it is very simple 🙂 After the code is finished and the feature branch is pushed to the repository, go to GitHub and click the “New pull request” button right below the commits counter:
Then select the branches for the PR. The left side is the target, and the right side should be the feature branch:
Notice that GitHub detected that the new branch cannot be merged until I resolve all the conflicts. It’s also quite important to write a proper title and description for the PR, so the reviewer gets some context before verifying it. Oh yeah, this one is also quite important: pull requests are great moments for doing a code review, simply because they aggregate all the changes made on the branch. So, if you spot something wrong, that’s good for you! One less issue to struggle with! Just one more remark related to this paragraph. We should not be afraid of doing code reviews, no matter how experienced we are. Remember, we are all human and we make mistakes. So if you detect some bad code, don’t be afraid to point it out, but at the same time don’t blame others for pointing out your mistakes, even if they are less experienced.
Protect your branches
So far, so good! We know how to work and how to merge new changes into the existing code. But we need some kind of protection for our production/release code. Why? Simply because someone might not know that you use GitFlow and pull requests. What would happen if they pushed their changes directly to master? Yup, you’re screwed. Fortunately, GitHub offers a few different protection mechanisms for your code. Go to the project settings and open the Branches tab. Then select the branch you want to protect and choose the options that suit you best. I’d suggest checking at least the ones below:
This gives you quite a lot: no one can push directly to the specified branch, and contributors are forced to create a PR and wait for approval from someone else. Nice!
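If you have many repositories, clicking through the settings gets tedious, and the same protection can be set through GitHub's REST API. The sketch below only builds the request and does not send it. The field names follow my understanding of the "update branch protection" endpoint, so double-check them against the current GitHub docs before relying on this. The owner, repository, token and status check name are made-up placeholders.

```python
import json
import urllib.request


def build_protection_request(owner, repo, branch, token, status_checks=()):
    """Build (but do not send) a request that protects a branch."""
    payload = {
        # Require the listed status checks (e.g. a CI pipeline) to pass before merging.
        "required_status_checks": (
            {"strict": True, "contexts": list(status_checks)} if status_checks else None
        ),
        # Apply the rules to repository administrators too.
        "enforce_admins": True,
        # Force changes through a pull request with at least one approving review.
        "required_pull_request_reviews": {"required_approving_review_count": 1},
        # No additional per-user or per-team push restrictions.
        "restrictions": None,
    }
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent over the network here.
    request = build_protection_request("some-owner", "some-repo", "master", "<TOKEN>")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, with a token that has admin rights on the repository.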
Create build pipeline
It seems like PRs, code review and branch protection resolve all the trouble. But there’s still one problem. All of the above works great for reviewing incoming changes, but none of us has a compiler in our eyes, so we can’t determine whether the code actually builds. Moreover, even if we are sure that it does, there is a chance that the new changes affected the old code and changed its behavior. How can we verify that? Of course, the most obvious way is to check out the feature branch and verify it manually (for example by running all sorts of tests). But imagine that you have 5 or even more PRs waiting for approval. Checking each one manually would probably consume a lot of your time. The solution is to automate the entire process. Luckily, a few weeks ago I described my personal favorite tool, called Buddy, so before you go further, read that post 😉
All right, assuming that you know how to create a pipeline, we can move on. So, how does it apply to our PRs? Well, here’s the trick. Create a pipeline which is triggered whenever a new branch is pushed to your GitHub repo. To do that, use a wildcard like in the example below:
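To get a feeling for how a wildcard selects branches, here is a rough illustration in Python. It uses the standard `fnmatch` module, which is an assumption made for the demo and not Buddy's own matcher, so the exact matching rules in Buddy may differ. The branch names are made up.

```python
from fnmatch import fnmatchcase


def triggered_branches(pattern, branches):
    """Return the branches whose names match the wildcard pattern."""
    return [name for name in branches if fnmatchcase(name, pattern)]


# Made-up branch names for the demo.
branches = ["master", "develop", "feature/login", "hotfix/crash"]

if __name__ == "__main__":
    print(triggered_branches("*", branches))          # every branch matches
    print(triggered_branches("feature/*", branches))  # only feature branches match
```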
Of course, you can create a stricter wildcard which targets only specific branches (like branches whose names start with feature/). Having done this, we can go back to GitHub and once again visit the “Protected branches” tab:
If your pipeline was triggered at least once, you should see its status in the status checks table. If you select it and save the changes, your PRs will become even more awesome: not only will they require a code review approval, but they will also require Buddy’s pipeline to succeed. If the pipeline fails, this basically means that the new code contains mistakes which need to be fixed first.
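Put together, the rule for a protected branch boils down to "enough approvals and every required check is green". A toy model of that rule might look like this. The `can_merge` function, the `"success"` state and the check name are hypothetical and only mirror the idea, not GitHub's actual implementation.

```python
def can_merge(approvals, check_results, required_checks, required_approvals=1):
    """Return True when a PR has enough approvals and every required check passed."""
    enough_reviews = approvals >= required_approvals
    # A required check that has not reported yet counts as not passed.
    checks_green = all(check_results.get(name) == "success" for name in required_checks)
    return enough_reviews and checks_green


if __name__ == "__main__":
    required = ["buddy-pipeline"]  # hypothetical status check name
    print(can_merge(1, {"buddy-pipeline": "success"}, required))
    print(can_merge(1, {"buddy-pipeline": "failure"}, required))
    print(can_merge(0, {"buddy-pipeline": "success"}, required))
```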
One of the most important things in all kinds of projects is the README, basically because it gives the context of our product. By reading it, we can determine whether it’s a C# library or a standalone web application written in ASP.NET Core. More importantly, it presents what the code actually does. So, spend a little bit of time and prepare clear examples which cover your functionality. If you are not a Markdown master, don’t worry! There are lots of free online tools, like Dillinger, which will help you a lot. And if your README grows fast, you can leave the most important information there and move the rest to a wiki.
Oh, and one more thing here! The README is also great for showing people the build status of your product. In Buddy, this is child’s play. Simply go to the selected pipeline (in this case, probably the one triggered by the master branch) and click on the “Badge” tab on the right. Then copy the text described as “Markdown” and paste it into your README. Result?
The last one is optional but very handy. Not everyone knows that, besides the README, GitHub also offers three different Markdown files:
This allows you to create default templates. Let’s try it out for PRs. I created the following Markdown file with the proper name:
# Info
This Pull Request is related to issue no. [<ISSUE-NUMBER>] ([<ISSUE-LINK>])
# Changes
- ...
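If you prefer to script this step, a small helper can drop the template into a repository. This is a minimal sketch that assumes the file is called PULL_REQUEST_TEMPLATE.md and lives in the .github directory, which is one of the locations GitHub recognizes. The `write_pr_template` helper is hypothetical.

```python
from pathlib import Path

# The template text from the article, kept as-is (placeholders included).
PR_TEMPLATE = """\
# Info
This Pull Request is related to issue no. [<ISSUE-NUMBER>] ([<ISSUE-LINK>])
# Changes
- ...
"""


def write_pr_template(repo_root):
    """Write the template to .github/PULL_REQUEST_TEMPLATE.md and return its path."""
    target = Path(repo_root) / ".github" / "PULL_REQUEST_TEMPLATE.md"
    target.parent.mkdir(parents=True, exist_ok=True)  # create .github if missing
    target.write_text(PR_TEMPLATE, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Assumes the script is run from the root of the repository checkout.
    print(write_pr_template("."))
```

After committing and pushing the file to the default branch, new pull requests should open with this text pre-filled.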
Let’s see what happens after I select the branches for a PR:
Thanks to that, each PR description will have a similar structure, so they will be much easier to read.
Well, I hope these few tips will help some of you in the future. I wish you success and happy coding!