Measuring Cyclomatic Complexity Of Python Code
Complex code is hard to manage, hard to isolate and hard to unit test. For these reasons it is more difficult and costly to modify. In other words you should try to avoid complex code.
Many software metrics exist to measure the complexity of code. One such metric is cyclomatic complexity (CC), which counts the number of linearly independent paths through a program. The metric was developed by Thomas McCabe back in the 1970s.
CC really measures the amount of branching in a suite of code. Suites with more than seven branches are considered suboptimal and should be looked at for refactoring. The number seven was chosen because it is believed to be the average number of things a human being can concurrently hold in their head. CC is well covered on the internet so if you want to know more Google it.
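To make the counting concrete, here is a small sketch. The function `classify` is a made-up example, and the counting convention (start at 1, add 1 for each `if` and each loop) is a common one that I am assuming here; tools differ on the details. The helper `cyclomatic` is also a hypothetical name. It applies McCabe's graph formula M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components in the control-flow graph.

```python
def classify(n):
    """Hypothetical example: 1 (base path) + 3 decision points = 4."""
    if n < 0:               # +1
        return "negative"
    for d in (2, 3):        # +1
        if n % d == 0:      # +1
            return "divisible"
    return "other"


def cyclomatic(edges, nodes, components=1):
    """McCabe's formula M = E - N + 2P for a control-flow graph."""
    return edges - nodes + 2 * components


# A plain if/else has 4 nodes (condition, then, else, exit) and
# 4 edges (condition->then, condition->else, then->exit, else->exit).
if_else_complexity = cyclomatic(edges=4, nodes=4)
```

Counting decision points and applying the graph formula are two routes to the same number; the first is usually easier to do by eye.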
I implemented the CC algorithm using a very simple AST visitor. A CC number is calculated for each Module, Class, Method and Function in a file. The program currently calculates results for nested classes and nested functions but does not print them.
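For readers who have not written an AST visitor before, here is a rough sketch of the idea. This is not the code from cc.py; it uses the standard library `ast` module, handles only top-level functions, and the set of node types counted as branches is my assumption. The names `ComplexityVisitor` and `complexity` are made up for illustration.

```python
import ast


class ComplexityVisitor(ast.NodeVisitor):
    """Counts branch points in one suite, starting at 1 for the straight path."""

    # Node types treated as branches; this particular set is an assumption.
    BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

    def __init__(self):
        self.score = 1

    def generic_visit(self, node):
        if isinstance(node, self.BRANCHES):
            self.score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two extra paths.
            self.score += len(node.values) - 1
        # Keep walking into the child nodes.
        super().generic_visit(node)


def complexity(source):
    """Return {function name: score} for each top-level function in source."""
    tree = ast.parse(source)
    results = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            visitor = ComplexityVisitor()
            visitor.visit(node)
            results[node.name] = visitor.score
    return results
```

Note that in this sketch any branches inside a nested function are added to the score of the enclosing function. A fuller version would give each Module, Class, Method and Function its own counter.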
Metric Frenzy
Use metrics as a guide to show where there may be a need for refactoring. Don’t take them too seriously. Just because the complexity number is slightly above optimal doesn’t mean the code sucks. Metrics are not the definitive answer on code quality. So take them with a grain of salt.
Getting The Code
The program and unit tests are available in my Subversion repository. Just download the files into any directory on your system. You will need at least pygenie.py and cc.py.
I am probably going to create a new home on Google Code for this stuff. It will be announced in a follow-up post.
Running The Program
The program expects one or more Python filenames or fully qualified module names to be passed in on the command line. For example:
./pygenie.py complexity mycode.py
- or -
./pygenie.py complexity mycode.py dir0/dir1/mod.py
- or -
./pygenie.py complexity dir0.dir1.mod
Running the program will print the results to standard output. This is a proof of concept and not a polished application so don’t expect real fancy output.
Interpreting The Results
The output is a table of three columns: suite type, suite name and the complexity number. The suite type can be one of the following values: X for a module, F for a function, C for a class and M for a method. The suite name is the fully qualified name of a suite. The complexity number is just a simple integer representing the suite’s complexity. The rows are sorted by the complexity number in descending order.
Only things that have a high complexity number are shown by default. If you want to see all of the complexity values you can use the --verbose option. For example:
./pygenie.py complexity --verbose dir0.dir1.mod
Here is an example of running the cc.py code through itself:
dstanek% ./pygenie.py complexity example.py
Module: example
Complexity Chart:
type name                complexity
M    AClass.runtests     28
F    fall_down           10
F    run_away            9
X    cc                  8
M    BClass.dosomething  8
F    duck_and_cover      8
Code that is not shown because its complexity number is seven or less is not proven to be good. The design may be faulty, the variable names obscure, or any number of other things may be wrong with it.
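The sorting and filtering described in this section can be sketched in a few lines. This is an illustration only and not the code from pygenie.py; `format_report` and `THRESHOLD` are hypothetical names, and the cutoff of seven is the one discussed earlier in this post.

```python
THRESHOLD = 7  # suites at or below this score are hidden unless verbose


def format_report(rows, verbose=False, threshold=THRESHOLD):
    """Format (type, name, complexity) tuples as a simple text table.

    Rows are sorted by complexity in descending order. Unless verbose
    is set, only rows scoring above the threshold are included.
    """
    shown = [row for row in rows if verbose or row[2] > threshold]
    shown.sort(key=lambda row: row[2], reverse=True)
    lines = ["type name complexity"]
    lines += [f"{kind} {name} {score}" for kind, name, score in shown]
    return "\n".join(lines)
```

For example, `format_report([("F", "helper", 3), ("M", "AClass.runtests", 28)])` would list only `AClass.runtests`, while passing `verbose=True` would list both.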
Closing Thoughts
It is good practice to try to keep the complexity of code to a minimum. Code with a low complexity number is less risky to change and easier to test. This should not be the only way to judge your code, just a supplement.
Comments
5 Responses to “Measuring Cyclomatic Complexity Of Python Code”
This is a really interesting post and would make a great clepy presentation.
Very cool. As an aside, you may want to get familiar with the new _ast module, http://docs.python.org/dev/library/_ast since it replaces the deprecated compiler module in Python 2.6 and 3.0.
Excellent work. Thanks for sharing it with the rest of us. Has it helped your code yet or is it too early to use for production work?
@Josh
I initially wrote the code a few weeks before PyCon. Since then I have been using it to keep my code simple. I feel that it’s worked, but that’s really best left to the poor programmer who has to deal with my code.
There’s some recent work that suggests that the sweet spot for cyclomatic complexity is closer to 24 than it is to 7. See http://www.sdtimes.com/content/article.aspx?ArticleID=31820 (which, if you track down the study, is based on open source Java projects, so mileage may vary vs. Python).