Classes, namespaces, files, packages - oh my!

The art and science of breaking your project into smaller pieces

To provide the most granular reporting possible, we break a single project into namespaces, which then serve as the basis for reporting. Breaking a project into smaller pieces is more art than science and varies from language to language. For some languages there is more or less one obvious way to do it (Java, for example); for others we had to make more arbitrary decisions. Below we explain exactly what we do in each case and why we decided to do it this way.

Ruby

In Ruby everything is an object, so the language is fairly easy to break into namespaces. In a nutshell, we treat classes and modules as valid namespaces in Ruby, and we report each class and module separately if it contains its own methods. Take this example:

module Bacon
  def taste?
    'delicious'
  end

  module Party
    class Tasting
      def initialize
        # ...
      end
    end
  end # module Party
end # module Bacon

What we're going to do here is report two entities - Bacon, which has its own method taste?, and Bacon::Party::Tasting, which has its own method initialize. Note that the module Party is not reported at all, since in this example it merely serves as a vessel for other namespaces.

This is all well and good, but Ruby is a very flexible language. You can have classes inside classes (we report both separately), you can reopen classes, and you can do really crazy stuff like this:

module Bacon
  class << ::String
    def baconize
      # ...
    end
  end # class << ::String (singleton class of String)
end # module Bacon

You can also dynamically reopen classes and modules using instance_eval, class_eval and their ilk. As a matter of fact we probably spent more time covering all corner cases of how classes and modules are defined and redefined in Ruby than we spent writing algorithms for analysing complexity.
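As a rough illustration of what this kind of dynamic reopening can look like, here is a minimal sketch. Bacon::Strip and its methods are hypothetical names invented for this example, and the snippet only shows the language mechanics, not how any particular form is analysed.

```ruby
# Hypothetical example: Bacon::Strip and its methods are invented for illustration.
module Bacon
  class Strip
    def taste
      'delicious'
    end
  end
end

# class_eval reopens the class without using the `class` keyword;
# `def` inside the block adds an instance method.
Bacon::Strip.class_eval do
  def crispy?
    true
  end
end

# instance_eval on a class object runs the block with that object as self;
# `def` inside the block adds a singleton (class-level) method.
Bacon::Strip.instance_eval do
  def cured
    new
  end
end

strip = Bacon::Strip.cured
puts strip.crispy?
```

All three definitions end up contributing methods to the same Bacon::Strip class, even though only the first one uses the class keyword.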

For reopened classes (including ones from the standard library) we try to report their proper namespace along with their methods. Note that no matter how hard we try, there will always be cases of dynamically defined classes that we have no chance of properly detecting and reporting without actually executing your source code - something we don't do and never will.
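To make the limitation concrete, here is a hedged sketch of a definition whose class names never appear literally in the source text. The flavour list and the resulting SmokedBacon and MapleBacon classes are hypothetical; in practice such a list might come from configuration or user input.

```ruby
# Hypothetical example: the flavour list and resulting class names are invented.
flavours = %w[Smoked Maple]

flavours.each do |flavour|
  # Class.new builds an anonymous class; no `class Foo` statement exists for it.
  klass = Class.new do
    define_method(:label) { flavour.downcase }
  end

  # The constant name is assembled from a string at run time.
  Object.const_set("#{flavour}Bacon", klass)
end

puts SmokedBacon.new.label
puts MapleBacon.new.label
```

Reading the file alone, one sees neither a class named SmokedBacon nor a method body attached to it by name; both only come into existence when the loop runs.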

Please note that if a namespace is spread across multiple files, we take all of them into account for reporting purposes, and you can see that in our reports.
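A small sketch of what a namespace spread over several files amounts to in Ruby. The Breakfast module and the two file names in the comments are hypothetical, and the two files are shown in one snippet so that it can be run as is.

```ruby
# --- lib/breakfast/taste.rb (hypothetical file) ---
module Breakfast
  def taste
    'delicious'
  end
end

# --- lib/breakfast/smell.rb (hypothetical file) ---
module Breakfast
  def smell
    'smoky'
  end
end

# Both definitions contribute methods to the same module.
p Breakfast.instance_methods(false).sort
```

From Ruby's point of view there is a single Breakfast module with two methods, which is why it makes sense to treat it as one namespace regardless of how many files define it.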

Go

Go is much less obvious since it does not have classes or modules. It does have packages, interfaces and structs though. We don't analyse interfaces at all - the few violations we could potentially report for them (like too many arguments or too many return values) can be reported on the interfaces' implementations, so we believe there is no need to double-report them. That leaves us with packages and structs. We report them separately by looking at functions' receivers. Take this example:

package gophergravy

func newParser() Parser {
  // ...
}

type parserImpl struct {
  // ...
}

func (p *parserImpl) AddFile(path string) error {
  // ...
}

The package gophergravy contains functions with two distinct kinds of receiver - none (for the newParser function) and *parserImpl (for the AddFile function). As a result we report two separate namespaces: gophergravy and gophergravy.parserImpl. We don't differentiate between struct receivers and pointer receivers though, so functions with parserImpl and *parserImpl as their receivers will be reported under one namespace.

Depending on the receiver, we choose whether or not to report certain types of violations on the namespace. For struct receivers we report three code size violations: high total complexity, high total number of functions and high number of lines of code. We don't look at these for packages. Our reasoning is that structs are a way of binding data together, similar to classes in other languages, and so are subject to similar design guidelines. By suggesting that the size of individual structs be limited, we're trying to avoid things like God Objects and to encourage modularity and loose coupling.

As with Ruby, if a namespace is spread across multiple files, we take all of them into account for reporting purposes.

Swift

Swift is a rather complex language in terms of the sheer number of data types it offers: classes, structs, enums, extensions and protocols. As with Go's interfaces, we discard protocols, as on their own they contain very few things that can be meaningfully analysed. All other types are fair game and are reported as separate entities, provided they define methods (initializers and deinitializers count as methods here). Nested entities are reported separately - take this example from the Swift book:

struct BlackjackCard {
    
    // nested Suit enumeration
    enum Suit: Character {
        case Spades = "♠", Hearts = "♡", Diamonds = "♢", Clubs = "♣"
    }
    
    // nested Rank enumeration
    enum Rank: Int {
        case Two = 2, Three, Four, Five, Six, Seven, Eight, Nine, Ten
        case Jack, Queen, King, Ace
        struct Values {
            let first: Int, second: Int?
        }
        var values: Values {
            switch self {
            case .Ace:
                return Values(first: 1, second: 11)
            case .Jack, .Queen, .King:
                return Values(first: 10, second: nil)
            default:
                return Values(first: self.rawValue, second: nil)
            }
        }
    }
    
    // BlackjackCard properties and methods
    let rank: Rank, suit: Suit
    var description: String {
        var output = "suit is \(suit.rawValue),"
        output += " value is \(rank.values.first)"
        if let second = rank.values.second {
            output += " or \(second)"
        }
        return output
    }
}

It may seem counterintuitive at first, but for the above snippet of code we will report two namespaces: BlackjackCard and BlackjackCard.Rank. While neither has functions per se, each contains a computed property (description and values, respectively) for which we can provide meaningful analysis. BlackjackCard.Suit (an enum) and BlackjackCard.Rank.Values (a struct) are currently not reported because they contain no functions.

Java

Java's case is pretty simple: it has classes and interfaces. As with other languages, we don't report on interfaces but on their implementations. Each class, including private nested classes declared within their enclosing class, is reported as a distinct codebeat namespace. Also, we don't report enums as namespaces.

package com.bacon.vegetables;

class Bacon {  
  public void eat() {
    System.out.println("Fun");
  }

  private class BaconTaste {
    public void enjoy() {
      System.out.println("Tasty fun");
    }
  }
}

For the above snippet of code, codebeat will report two namespaces: com.bacon.vegetables.Bacon and com.bacon.vegetables.Bacon.BaconTaste.

Kotlin

As with Java, we do not report on interfaces or enums. Everything else - classes, objects and companion objects - is reported. We also report at package level because - unlike in Java - top-level functions are acceptable in Kotlin.

JavaScript

In JavaScript we report each file as a separate namespace unless it contains a better namespace candidate, such as a class or a function Foo style constructor. Additionally, we create a namespace whenever a prototype is extended.

class Bacon {
  party() {
    console.log("Make it rain bacon!");
  }
};

function Salad() {
  console.log("Health");
}

Salad.prototype.eat = () => {
  console.log("Om nom nom");
}

For the above snippet (assuming it lives in a file named bacon.js) we will report two namespaces: bacon.js.Bacon with the method party and bacon.js.Salad with the method eat.

TypeScript

In TypeScript, on top of what we detect for JavaScript, we also detect namespaces and modules.

Elixir

In Elixir, modules are the only kind of namespace. Both public (def) and private (defp) functions are reported. In some special cases we also report on macros (defmacro) - if a macro contains no functions, we treat it as a function. Otherwise, we report directly on the functions contained in the macro. We realise this is suboptimal, but among all the arbitrary decisions we considered, this one looked the least bad.