Automatically checking the Python style and title format of Git commits

Published on 15 August 2020 in Administration / Version 4.8 - 4 minutes read

Hi Query Engineers,

The investment in Continuous Integration (CI) and automation continues in order to help scale the resources and quality of the Hue project, and this past year saw a lot of improvements.

Here is the latest on how to automatically check that everybody follows the coding convention of the Python API, as well as the format of the Git commit titles.

Python linting

Hue is leveraging Pylint which is an open source program for checking coding standards. It comes with a series of rules that can be selected and customized in a .pylintrc file.

To discover early whether the changes in a commit adhere to the convention, most code editors can understand the Pylint configuration with the help of plugins. This is handy for seeing errors directly in your editor, which most of the time will even offer to fix them for you.

[Image: Pylint indentation visual check]

One step at a time

As Hue is a mature project of 10+ years with a code base that is about 50% Python, fixing all the code styling issues right away would be too much of a burden. This is why an incremental strategy was adopted:

  • Only lint the files that were updated in your new local commits (it will lint the full content of the file, not just the diff)
  • Start only with a minimal styling convention (only basic rules, more rules to add later as the basics are finished)
    • C0326 (bad-whitespace)
    • W0311 (bad-indentation)
    • C0301 (line-too-long)
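
The "only lint the updated files" idea can be sketched in Python as follows. This is an illustration and not the content of check_for_python_lint.sh: the helper names and the `origin/master` base ref are assumptions, while the Pylint flags mirror the ones visible in the command output further down. Because Pylint receives file paths, the full content of each changed file is checked, not just the diff.

```python
import subprocess

# Rule set enabled in the first iteration (from the list above).
ENABLED_CHECKS = ["C0301", "C0326", "W0311"]


def filter_python_files(paths):
    """Keep only the Python files from a list of changed paths."""
    return [p for p in paths if p.endswith(".py")]


def changed_python_files(base="origin/master"):
    """List Python files touched by local commits not yet on `base`.

    Assumption: `base` is the remote branch the local commits sit on top of.
    """
    out = subprocess.run(
        # --diff-filter=d skips deleted files, which cannot be linted.
        ["git", "diff", "--name-only", "--diff-filter=d", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return filter_python_files(out.splitlines())


def build_pylint_args(files, rcfile="desktop/.pylintrc"):
    """Build a pylint command line that only enables the selected checks."""
    return [
        "pylint",
        f"--rcfile={rcfile}",
        "--disable=all",
        "--enable=" + ",".join(ENABLED_CHECKS),
        "-f", "parseable",
        *files,
    ]
```

Adding a rule later then only means appending its code to `ENABLED_CHECKS`.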

The command to run check_for_python_lint.sh:

tools/ci/check_for_python_lint.sh

Will then locally output the lines with failing checks:

[10/Aug/2020 16:22:17 -0700] runpylint    INFO     Running pylint with args: /home/romain/projects/hue/build/env/bin/pylint --rcfile=/home/romain/projects/hue/desktop/.pylintrc --disable=all --enable=C0301,C0326,W0311 --load-plugins=pylint_django -f parseable apps/beeswax/src/beeswax/api.py desktop/core/src/desktop/management/commands/runpylint.py
************* Module beeswax.api
apps/beeswax/src/beeswax/api.py:144: [C0326(bad-whitespace), ] Exactly one space required after :
      {1:1}
        ^
apps/beeswax/src/beeswax/api.py:236: [C0326(bad-whitespace), ] Exactly one space required before assignment
    response['status']= 0
                      ^
apps/beeswax/src/beeswax/api.py:239: [C0326(bad-whitespace), ] Exactly one space required before assignment
    response['status']= 0
                      ^
apps/beeswax/src/beeswax/api.py:436: [C0326(bad-whitespace), ] Exactly one space required before assignment
    response['message']= str(e)
                      ^
apps/beeswax/src/beeswax/api.py:678: [C0301(line-too-long), ] Line too long (156/150)
************* Module desktop.management.commands.runpylint
desktop/core/src/desktop/management/commands/runpylint.py:66: [C0301(line-too-long), ] Line too long (255/150)
desktop/core/src/desktop/management/commands/runpylint.py:70: [C0326(bad-whitespace), ] Exactly one space required around assignment
    a={1:   3}
    ^
desktop/core/src/desktop/management/commands/runpylint.py:70: [C0326(bad-whitespace), ] Exactly one space required after :
    a={1:   3}
        ^
desktop/core/src/desktop/management/commands/runpylint.py:72: [W0311(bad-indentation), ] Bad indentation. Found 8 spaces, expected 6

------------------------------------------------------------------
Your code has been rated at 9.86/10 (previous run: 9.88/10, -0.02)
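
If you want to summarize such a report, for example to see which rule fails most often, the parseable lines can be processed with a few lines of Python. This is a minimal sketch that assumes the line layout shown in the output above; the names `ISSUE_PATTERN`, `parse_issues` and `count_by_symbol` are hypothetical and not part of Hue or Pylint.

```python
import re
from collections import Counter

# Matches lines such as:
#   apps/beeswax/src/beeswax/api.py:144: [C0326(bad-whitespace), ] Exactly one ...
ISSUE_PATTERN = re.compile(
    r"^(?P<path>[^:\s][^:]*):(?P<line>\d+): "
    r"\[(?P<code>[A-Z]\d{4})\((?P<symbol>[a-z-]+)\), [^\]]*\] "
    r"(?P<message>.*)$"
)


def parse_issues(output):
    """Extract one dict per reported issue, skipping headers and code excerpts."""
    issues = []
    for line in output.splitlines():
        match = ISSUE_PATTERN.match(line)
        if match:
            issue = match.groupdict()
            issue["line"] = int(issue["line"])
            issues.append(issue)
    return issues


def count_by_symbol(issues):
    """Count how many times each rule (e.g. bad-whitespace) was reported."""
    return Counter(issue["symbol"] for issue in issues)
```

Module headers, source excerpts and the caret markers do not match the pattern and are ignored.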

The styling configuration is saved in the .pylintrc.

Then, to apply it automatically to all new changes, hook it into the run python lints section of the CircleCI config.yml, which integrates it into the Hue CI:

- run:
    name: run python lints
    command: |
      cd ~/repo

      ./tools/ci/check_for_python_lint.sh /usr/share/hue

[Image: CI Pylint check succeeding with no change]

Git Commit Format check

Similarly to the coding convention, having everybody follow the same format for commit titles saves time in the long run.

To keep things simple, only two formats were picked:

  • The traditional Hue one with a Jira number
  • A GitHub pull request with the standard id at the end

Both need to have a category within brackets to describe the main area of the change, e.g. [docs], [hive], [docker], [ui]…

Example of valid messages:

HUE-9374 [impala] Use 26000 as default for thrift-over-http
[livy] Add numExecutors options (#1238)

And some invalid ones (it is easy to have many combinations):

[impala] Use 26000 as default for thrift-over-http
Use 26000 as default for thrift-over-http (#1238)
HUE-9374 Use 26000 as default for thrift-over-http
Add numExecutors options
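
To make the two formats concrete, here is a minimal Python sketch of such a check. It is not the logic of the actual Hue hook: the regular expression and the `is_valid_title` name are assumptions derived only from the valid and invalid examples above.

```python
import re

# Hypothetical pattern covering the two accepted title formats:
#   HUE-<jira number> [<category>] <summary>
#   [<category>] <summary> (#<pull request id>)
TITLE_PATTERN = re.compile(
    r"^(?:HUE-\d+ \[[^\]]+\] .+|\[[^\]]+\] .+ \(#\d+\))$"
)


def is_valid_title(title):
    """Return True if the commit title follows one of the two formats."""
    return TITLE_PATTERN.match(title) is not None
```

With this pattern the two valid messages above are accepted, and the four invalid ones are rejected because they miss the Jira number, the pull request id or the category.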

The check logic is part of the commit-msg Git hook.

To check the Git commit message format automatically on your machine, just copy the hooks:

cp tools/githooks/* .git/hooks
chmod +x .git/hooks/*

And here is the script running the checks only on your new commits not yet pushed to the master branch:

./tools/ci/check_for_commit_message.sh

And then, to automate it 100%, also add it to the CI:

- run:
    name: run commit title format check
    command: |
      cd ~/repo

      ./tools/ci/check_for_commit_message.sh

[Image: CI Git commit title format check failing]

And that's it, more development time saved for later!

What is your favorite CI process? Any feedback? Feel free to comment here or on @gethue!

Romain from the Hue Team

