Tuesday, September 22, 2015

My First OpenStack Infra Patch

In August, I started a new job as an upstream contributor on the OpenStack Infrastructure team! The project is so huge that on-boarding typically takes at least 6 months. My goal for this on-boarding time is to document as much as possible while I still have my "new contributor" perspective. I have been focusing on documentation for various Infra projects, including Nodepool and Disk Image Builder.

I had my first OpenStack Infrastructure patch merged on September 2nd, 2015! I'd like to share my experiences writing, submitting, and debugging my first patch.

Step 1: Figure Out What to Patch


There's a talk I give about how to get started with contributing to the Perl community. Some of the advice I've given there holds true for pretty much any project, but especially any project that is big and overwhelming. Sure, you can submit code as your first patch. For most of us, however, there's a lot of context missing when we interact with a new code base (and Infra has MANY contexts and code bases). I recommend what I like to call the "Wax-On Wax-Off Paint-the-Fence" approach.

Take on the most ridiculously boring and tedious tasks while you learn how things work.

Good examples are things like adding or improving documentation, adding, fixing, or otherwise improving unit tests, or small code tasks like variable or function renames, formatting changes, whitespace cleanup, or logging or other text output format changes. Don't do anything too invasive until you understand how all the pieces fit together. You can't debug a failure if you don't know where to look, and this is especially true for big projects like Infra!

The team I joined has a policy of assigning an onboarding buddy to new hires. I was lucky to get to work with Greg Haynes. He's done a great job of suggesting tasks like the ones mentioned above that exist in the code bases he's most active in.

My first patch involved reformatting text in a README file for one of the elements in Disk Image Builder. A proposal was made to move this from free text that was all over the place and inconsistent to a formatted table structure with clear items that should be listed for each element. This is a very tedious project and not particularly interesting to someone who is very familiar with the code base. For someone like me, however, it's perfect! So that's my current project, reformatting and improving the documentation for the environment variable overrides in Disk Image Builder's elements.

If you don't have a buddy to make recommendations, pop into the #openstack-infra channel on IRC and ask for suggestions! Alternatively, try the #openstack-doc and #openstack-qa channels.

Step 2: Set Up Gerrit and git-review


OpenStack has done an excellent job of documenting the steps for new developers, and I recommend following those instructions after reading some of my notes here. There are more in-depth instructions provided by the Documentation team for first-time contributors that may be a better fit if you don't have a lot of experience contributing to projects!

OpenStack uses a code review tool called Gerrit hosted at review.openstack.org and uses Ubuntu Single Sign On to manage login credentials. When I joined the Infrastructure team, I already had an Ubuntu Single Sign On account via Launchpad from contributions I'd made to Ubuntu a few years ago.

Additionally, to contribute to any OpenStack project (including Infra), you'll need to create a community account and sign the agreement. Make sure the email you provide for your OpenStack email address matches your Ubuntu Single Sign On email address! I ran into this issue myself and I've seen it enough that I'll bet it's one of the top new contributor issues! If you have any issues with your OpenStack community account, the first thing to check is that those email addresses match.

Finally, install git-review on your system. As a side note, for development projects, I generally try to minimize system-wide installs that could affect my entire system. This is especially important when dealing with several projects that may have conflicting version requirements. That being said, git-review is fine to install system-wide. Just be aware in the future, when instructions tell you to install something system-wide, that it may conflict with other projects you're using. See the next section for suggestions on not littering dependencies all over your local file system.

Step 3: Get the Code


Infra code, like all OpenStack code, lives in a git repository hosted on git.openstack.org. To find the repository for the project you want to patch, look up the project by name at git.openstack.org using the Search box at the upper right corner. Copy the “git://” URI to clone the git repo locally.

For your local dev setup, I recommend creating a folder for all of your OpenStack-related projects. I have a folder called "dev" in my home directory that contains subfolders for various projects. Under that, I created a folder called openstack where I clone my git repos. For OpenStack Infra, you will want to create a separate openstack-infra directory because openstack-infra can have different dependencies than other OpenStack projects. It should use its own virtualenv to manage these dependencies.

In general, you should be using a local Python rather than the system Python. You know you're using the system Python if you have to type "sudo" any time you want to do anything beyond running a program (installing a package with pip, for example). Virtualenv makes using a local Python really easy. The virtualenv documentation explains how to get started.
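
If you're ever unsure which Python you're on, here is a minimal sketch of a check you can run. The function name in_virtualenv is my own, and it assumes either an older virtualenv release (which sets sys.real_prefix) or the venv module and newer virtualenv releases (which make sys.prefix differ from sys.base_prefix).

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases point sys.prefix at the environment
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print("virtualenv active")
    else:
        print("system or user-level Python")
```

Run it with the same python command you use for the project. If it reports the system Python, activate the project's virtualenv before installing anything with pip.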

There's another great tool for managing environment variables that's written in bash, called smartcd. It lets you set up custom environment variables on a per directory basis. A lot of OpenStack projects depend on environment variables so having something to customize this without littering your environment is really useful.

I also recommend that you set up a virtual machine running DevStack. If you're editing docs or unit tests, it's not too big of a deal to run things locally, but when you're doing larger code patches, you want to mimic running tools against a real OpenStack deployment as much as possible. For the documentation patches I've done, running locally and using a virtualenv has been sufficient.

Step 4: Run the Tests


Once you've cloned the git repository, follow the instructions in the README at the parent level to do any necessary local setup. It will likely involve installing a bunch of python dependencies. See the section above for my recommendation involving using virtualenv for this.

If the project you're working on does not have a README, or the README doesn't mention anything about how to set up the repo, this is definitely an opportunity to add a patch. Every project should have a README and it should contain not only setup information but also bug submission information. These are good first patches as you get to know the infrastructure.

Some projects are complex, require a lot of setup and possess a huge range of dependencies. If your change doesn’t actually require running the project, don’t get overwhelmed by getting the project up and running in order to submit your first patch. You can easily run a subset of the unit tests locally to verify your change without having to locally test the entire project.

After configuring and installing, before touching anything else, I highly recommend running the tests to make sure everything works on your system. It's incredibly hard to troubleshoot when you're not sure if your patch broke something or if it was a setup issue that's causing test failures. To run the tests you'll need to set up tox if you haven't already. It's really straightforward (pip install tox). See the OpenStack Python Developer Docs if you're not sure how.

The tox.ini file at the parent level of the repo defines the different sections that tox recognizes. So if all you really care about is docs, there should be a section called "docs", and the tox.ini will make it clear which commands are run as part of it. If you are only touching a specific subset of things (e.g., docs), then don't worry about troubleshooting if the entire test suite fails on your local machine.
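
To see at a glance which sections a project defines, you can read tox.ini with Python's configparser. This is only a sketch: the sample file contents below are made up for illustration (a real project's tox.ini will differ), and the helper name tox_environments is my own.

```python
import configparser

# Made-up example contents; a real project's tox.ini will look different.
SAMPLE_TOX_INI = """
[tox]
envlist = py27,pep8

[testenv]
commands = python setup.py test

[testenv:pep8]
commands = flake8

[testenv:docs]
commands = python setup.py build_sphinx
"""


def tox_environments(tox_ini_text):
    """Map each [testenv:NAME] section to its raw 'commands' value."""
    # Interpolation is turned off so '%' characters in commands are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(tox_ini_text)
    environments = {}
    for section in parser.sections():
        if section.startswith("testenv:"):
            name = section.split(":", 1)[1]
            commands = parser.get(section, "commands", fallback="")
            environments[name] = commands.strip()
    return environments


if __name__ == "__main__":
    for name, commands in sorted(tox_environments(SAMPLE_TOX_INI).items()):
        print(name + ": " + commands)
```

Once you know the section name, "tox -e docs" runs just that one environment instead of the whole list.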

Unit tests can fail for a variety of reasons, including versions of dependent libraries and OS-level conflicts. This is why it’s important to a) use a tool like virtualenv to separate Python language dependencies and b) to use a virtual machine or other container to avoid system level conflicts that are hard to debug.

Step 5: Hack Hack Hack


Before you start hacking, you should cut a local branch to commit to. However, if you suspect an upstream conflict is going to be merged while you’re working, you can hack on master without committing, and then cut the branch once you're ready to commit. The advantage of just hacking on master (without committing) is you can easily keep the branch current without dealing with merge conflicts. If you accidentally commit to master, Gerrit won't let you merge when you check in your commit. You’ll need to cut a local branch to submit your code upstream anyway, so it's generally just good practice to work on a local branch.

Step 6: Run the Tests


After you're done hack hack hacking, run the tests again to make sure you didn't break anything. Again, only run a subset if that's all you care about. This will just catch anything specific to what you're changing. Jenkins will run the full suite of unit and integration tests once you submit your patch for review.

Step 7: Compose a Commit Message and Rebase


Composing a well-thought-out commit message is important, even for small changes. The typical OpenStack commit style is to do a short description followed by a longer paragraph that provides more context for the change. If the change is really minor, you can get away with a short description. The OpenStack wiki's page on Git commit messages goes into more detail.
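
Here is a rough sketch of a checker for that shape of message. The limits it uses (a summary line of at most 50 characters with no trailing period, a blank second line, and body lines of at most 72 characters) are the commonly cited conventions, and I'm treating them as assumptions here. The function name check_commit_message and the sample messages are mine.

```python
def check_commit_message(message, summary_limit=50, body_limit=72):
    """Return a list of style problems found in a commit message."""
    problems = []
    lines = message.rstrip("\n").split("\n")
    summary = lines[0]
    if not summary.strip():
        problems.append("summary line is empty")
    if len(summary) > summary_limit:
        problems.append("summary line is longer than %d characters" % summary_limit)
    if summary.endswith("."):
        problems.append("summary line ends with a period")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    for number, line in enumerate(lines[2:], start=3):
        # Lines containing a URL are skipped because they can't be wrapped.
        if len(line) > body_limit and "://" not in line:
            problems.append("line %d is longer than %d characters" % (number, body_limit))
    return problems


if __name__ == "__main__":
    example = (
        "Reformat README for element docs\n"
        "\n"
        "Move the free-text description of environment variables\n"
        "into a table so each element lists the same fields.\n"
    )
    print(check_commit_message(example) or "looks fine")
```

An empty list means the message fits those conventions. Anything else is a list of plain-English complaints you can fix with "git commit --amend".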

Gerrit will create one review item per commit. If you have multiple commits, you need to rebase them down to one commit. My workflow is to write my real commit message for the first commit and then do short, less descriptive commits for anything subsequent. Once I'm done developing, I run git rebase -i HEAD~n, where n is the number of commits (inclusive) to rebase. When you do an interactive rebase, you can tell git what you want to do with each of the commits.
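
During an interactive rebase, git opens a "todo" list with one line per commit, oldest first, each starting with "pick". To collapse everything into the first commit, you leave the first line as "pick" and change the rest to "squash". The sketch below shows that edit as a small text transformation. The function name squash_todo and the commit hashes and messages in the sample are invented for illustration.

```python
def squash_todo(todo_text):
    """Keep the first 'pick' and turn every later 'pick' into 'squash'."""
    seen_first_pick = False
    rewritten = []
    for line in todo_text.splitlines():
        if line.startswith("pick "):
            if seen_first_pick:
                # Meld this commit into the one listed above it.
                line = "squash " + line[len("pick "):]
            seen_first_pick = True
        rewritten.append(line)
    return "\n".join(rewritten) + "\n"


if __name__ == "__main__":
    # Invented hashes and messages, shaped like a three-commit rebase todo list.
    sample = (
        "pick 1a2b3c4 Reformat README into a table\n"
        "pick 5d6e7f8 fix typo\n"
        "pick 9a8b7c6 more fixes\n"
    )
    print(squash_todo(sample), end="")
```

With "squash", git then asks you to edit the combined commit message, which is where the real message from the first commit comes in handy. Using "fixup" instead discards the later messages without asking.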

It might be tempting to just constantly rebase your work into one commit to save yourself the trouble. While doing your initial work, I don't recommend this because you might make a mistake that's hard to back out of without your full commit history.

To keep a remote copy of your branch without having it go through testing, you basically push your change to Gerrit but then mark it as a Work in Progress (WIP) through a workflow comment. Run git review, then open the patch's page in Gerrit. Click Review on your patch and mark it as "-1 WIP". This will let everyone (including Jenkins) know that this patch isn't ready for testing just yet.

Note that if you want to keep a remote copy of your work in Gerrit, you'll need to rebase it down to one commit. I haven't figured out an alternative solution that keeps the repo in sync with the master branch but lets you have a remote copy of your revision history without creating a bunch of extra patches in Gerrit. When I figure something out I'll be sure to talk about it in a future article.

Step 8: Submit Your Patch


Before running git-review, make sure the email address in your git config matches the one you have registered in Gerrit. Just type "git review" to submit your patch. If you want to submit the patch as a work in progress, see the above section about adding a workflow comment.
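
A quick way to double-check the email address is to ask git what it will stamp on your commits and compare that with what you registered. This is a small sketch, not an official tool: the helper names are mine, the address in the example is a placeholder, and it only compares the two strings exactly (after trimming whitespace) because I don't want to guess at how Gerrit normalizes addresses.

```python
import subprocess


def emails_match(git_email, gerrit_email):
    """Compare two addresses exactly, ignoring only surrounding whitespace."""
    return git_email.strip() == gerrit_email.strip()


def configured_git_email():
    """Return the address from 'git config user.email' (empty if unset)."""
    result = subprocess.run(
        ["git", "config", "user.email"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Placeholder address; replace it with the one shown in your Gerrit settings.
    registered = "dev@example.com"
    if emails_match(configured_git_email(), registered):
        print("git and Gerrit agree")
    else:
        print("mismatch: fix it with 'git config user.email <address>'")
```

If the two don't match, fix your git config first and amend the commit so the author address is updated before you run "git review".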

Once you’ve submitted your patch for review, the build system will run a huge suite of tests on the remote branch in Gerrit. Pay attention to your emails for the test results from Jenkins. If you need to troubleshoot a failure, do so by looking at the Jenkins console logs. Ask for help in the #openstack-infra IRC channel if you're not sure what something means, and try to provide as much information as possible, including a link to the change in question and the specific error messages you think are the problem. For longer pastes, use paste.openstack.org instead of cluttering up the IRC channel. Check if the failed tests are "non-voting" before digging into things. If the tests are "non-voting", their failure doesn't affect whether your patch is accepted, because they may have known issues.

If it looks like a fluke, you can re-run the tests by adding a comment with the text "recheck". All automated tests must pass for code to be merged, and typically reviewers won’t even begin reviewing a new patch until the checks have passed.

Step 9: Get Your Patch Reviewed


Once the tests have passed, you'll need to get your change accepted by 2 core members in order to get it merged. If your patch has been reviewed and says "Needs Workflow", it means another core needs to approve it before it can be merged.

To find out who the core members are that can +2 or approve your patch, visit the project's page in review.openstack.org. Click on "Access" at the top of the screen and then click on any of the project-core links to see a list of people. You can add names as Reviewers to your change from your change's URL, or you can ask in #openstack-infra if those individuals could review your change. Adding someone as a Reviewer isn't always enough on its own, because not everyone monitors their email.

If you get comments and people want you to make changes, generally discussion should happen in Gerrit so there’s some history tied to the change. However, if you’re still uncertain or need more back and forth than Gerrit can offer, feel free to discuss the change publicly in the #openstack-infra IRC channel, or privately with the reviewer on IRC if you’re not yet comfortable speaking in the public channel.

Many people in the community are very passionate about wanting the best for OpenStack, and they have strong opinions as a result. If you don't agree with specific feedback because you feel you had good reasons for why you did a thing the way you did, feel free to politely share those reasons. Chances are you'll have a nice discussion and you'll both learn something ;) If for some reason you're not satisfied with the result of the discussion, feel free to seek a second opinion from another core member. It's ok if you get overruled, so don't be frustrated. Just try your best to understand why they want it done that way and move on to the next thing. When you're working with groups, sometimes you have to pick your battles. Once you get more established in the community and understand more of the context, you might better understand their reasons or you can be the core person making those decisions. It's all just part of being in a community with other humans :)

Once your patch has been approved, it will be automatically merged and you'll get an email notification (if you've opted for those)!

Step 10: Do the First Patch Dance!




You can watch your progress as an OpenStack contributor at Stackalytics!


Useful Links