GPAs explained

How grades are calculated, what they mean and why you shouldn't obsess about them

As you may have noticed, your codebeat report comes with a number that ranges from 0 (worst) to 4 (best), roughly mapping to the American GPA scale. Higher scores are understandably more desirable, but while the GPA is a useful guide, we don't want that single number to become the only thing you care about in your project. In this article we explain how grades are calculated and how we at codebeat approach them.

Calculating GPAs

For each namespace detected in your project we calculate a separate GPA by applying a number of penalties for different violations and subtracting them from a perfect score of 4.0. Some of these violations are detected at the function level and some can only be reported at the namespace level. Penalties for function-level violations are normalised by the total number of functions in a namespace, so they will generally have a lower impact on its GPA. For example, if only a single function out of 10 has any violations, the GPA will hardly be affected.
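To make the normalisation concrete, here is a minimal Python sketch. It is not the actual codebeat implementation: the function name `namespace_gpa`, the penalty of 1.0 for the offending function and the clamping at 0 are all assumptions made for illustration.

```python
PERFECT_SCORE = 4.0

def namespace_gpa(function_penalties):
    """GPA of one namespace, using function-level penalties only.

    function_penalties holds one number per function (0.0 = no violations).
    The penalty values are hypothetical, not codebeat's real ones.
    """
    if not function_penalties:
        return PERFECT_SCORE  # assumption: no functions, nothing to penalise
    # Normalise by the total number of functions in the namespace.
    normalised = sum(function_penalties) / len(function_penalties)
    # Assumption: clamp at 0 so the result stays on the 0-4 scale.
    return max(0.0, PERFECT_SCORE - normalised)

# One offending function out of ten barely moves the score.
print(namespace_gpa([1.0] + [0.0] * 9))  # 3.9
```

The same penalty in a namespace with a single function would cost a full point, which is why function-level violations weigh less in larger namespaces.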

Namespace-level penalties are applied after normalisation, which makes them, by and large, more impactful. These include high total complexity, a large number of functions in a namespace and, most importantly, code duplication. For duplicated code we arbitrarily but deterministically decide which copy is the original and which one is a copy-paste job, and we only apply a penalty to the latter's namespace.
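The difference between the two kinds of penalties can be sketched in Python as follows. All names and penalty values here are hypothetical. In particular, treating the alphabetically first namespace as the original is just one possible arbitrary-but-deterministic rule, not necessarily the one codebeat uses.

```python
PERFECT_SCORE = 4.0

def namespace_gpa(function_penalties, namespace_penalties):
    """Function-level penalties are averaged; namespace-level ones are not."""
    count = len(function_penalties)
    normalised = sum(function_penalties) / count if count else 0.0
    # Namespace-level penalties are added after normalisation, at full weight.
    total = normalised + sum(namespace_penalties)
    return max(0.0, PERFECT_SCORE - total)

def penalised_copies(namespaces_sharing_code):
    """Return the namespaces that receive the duplication penalty.

    Assumption: the alphabetically first namespace counts as the original.
    Any fixed ordering would be equally arbitrary and equally deterministic.
    """
    original, *copies = sorted(namespaces_sharing_code)
    return copies

functions = [0.5] + [0.0] * 19            # 20 functions, one with a violation
print(namespace_gpa(functions, []))       # function-level penalty, diluted
print(namespace_gpa(functions, [0.5]))    # same-sized penalty, undiluted
print(penalised_copies(["Billing", "Accounts"]))  # ['Billing']
```

With 20 functions, a function-level penalty of 0.5 costs the namespace only 0.025 points, while a namespace-level penalty of the same size costs the full 0.5.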

An important thing to note is that most violations can have different weight levels. A function that exceeds the recommended maximum cyclomatic complexity by 1 will receive far more lenient treatment than one that exceeds it by 120. This, however, does not apply to strictly boolean style checks.
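A graded penalty of this kind might look like the Python sketch below. The threshold of 10, the bands and the penalty values are invented for the example and do not reflect codebeat's real settings.

```python
def complexity_penalty(complexity, threshold=10):
    """Graded penalty: the further past the threshold, the heavier it gets.

    The threshold and the bands are hypothetical.
    """
    excess = complexity - threshold
    if excess <= 0:
        return 0.0
    if excess <= 5:
        return 0.25
    if excess <= 20:
        return 0.5
    return 1.0

def style_penalty(violated):
    """Boolean check: either violated or not, with a single fixed weight."""
    return 0.25 if violated else 0.0

print(complexity_penalty(11))   # 0.25
print(complexity_penalty(130))  # 1.0
```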

After we've calculated a separate GPA for each namespace, we calculate a weighted arithmetic mean in which the weight of each namespace is its function count. Hence, a namespace with 5 functions will be 5 times more impactful in terms of GPA than one with just one function, and infinitely more impactful than an empty namespace (see below). We believe this incentivises developers to refactor larger namespaces first, with a good effect on the overall health of the project.
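The weighted mean can be sketched in a few lines of Python. The function name `project_gpa` and the sample values are hypothetical, and returning 4.0 when there are no functions at all is an assumption, chosen to match the empty-namespace behaviour this article describes.

```python
def project_gpa(namespaces):
    """Weighted arithmetic mean of namespace GPAs, weighted by function count.

    namespaces is a list of (gpa, function_count) pairs.
    """
    total_functions = sum(count for _, count in namespaces)
    if total_functions == 0:
        return 4.0  # assumption: nothing to weigh, so nothing is penalised
    weighted_sum = sum(gpa * count for gpa, count in namespaces)
    return weighted_sum / total_functions

# A namespace with 5 functions pulls the mean 5 times harder than one with 1.
print(round(project_gpa([(2.0, 5), (4.0, 1)]), 2))            # 2.33
# An empty namespace has weight 0 and changes nothing, however bad its GPA.
print(round(project_gpa([(2.0, 5), (4.0, 1), (0.0, 0)]), 2))  # 2.33
```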

GPA algorithm artefacts

Every scoring algorithm creates its own classes of artefacts, and what we describe above is far from perfect. If you've read the above explanation carefully, you can see that namespaces that contain no functions (e.g. empty classes), no matter how bad they may be, are not taken into account when calculating the final GPA, since their GPA multiplier (number of functions) will be 0. Taken to an extreme, a project with 1000 identical classes, each with no functions, will yield a GPA of 4. Congrats, you've just gamed the system.

Similarly, you might have noticed that code duplication, no matter how small, will always affect a namespace pretty heavily, because its penalty is applied after normalisation and is not diluted by the function count. Three or four small copy-paste jobs in a large namespace with dozens of functions will heavily impact its GPA and, by extension, the total GPA of the project. Such problems will often occupy top spots in the Quick Wins section of your project report as things you should tackle first. Thankfully, code duplication is one of the easiest (and most rewarding!) code smells to address, so you can often improve your total GPA substantially by factoring out common code into one or two functions.

Changing GPA

Sometimes your GPA will change without any changes to the code. These changes may vary in magnitude from small to substantial. They happen because we change our algorithms, adding new checks and supporting more languages. To illustrate, let's take a look at a Ruby on Rails project. We currently support Ruby but not JavaScript, so the substantial amount of JavaScript in the project remains invisible to our analysers. However, JavaScript support is in our pipeline, and once we start analysing JavaScript code we will start showing issues in .js source files and take these into account when calculating the overall GPA. Your overall project score may then change substantially without you changing a single line of code!

The second class of changes to our algorithms involves new checks being added. We currently analyse source code for complexity and duplication, but we have many other checks in the pipeline, such as ones for design antipatterns (e.g. feature envy). Once we teach our analysers how to detect such violations, you will start seeing them reported in your projects, and penalties for them will be subtracted from your total GPA.

Last but not least, we may occasionally tweak our algorithms to ensure that we assign fair and reasonable grades to different types and magnitudes of violations. We will try to limit these changes to an absolute minimum, and when they're introduced we'll do our best to minimise their impact and explain their rationale.

We are aware that sudden changes in GPA might be annoying and confusing, but we believe this is a fair price to pay for being able to help you constantly improve your code and your skills as a developer. We think it would be a shame if we avoided improving our service, and limited the value we can offer you, in order to maintain the stability of an abstract, artificial number.

In defense of GPA

While everything we've said above might make GPA sound worthless, this is certainly not the case. All we're trying to convey is that it should not be the only thing you look at. To use a metaphor, GPA is a mark on the compass indicating the direction you should head next. If you've ever played RPGs (who hasn't!), you know that quests can change or you can be given side quests, but these generally won't impact the main quest. In the game of codebeat the main quest is the improvement of your code and of you as a developer, and GPA is just one of the tools to help you get there.