
You probably shouldn't test the leaves directly, except when...

Focus on unit tests, but relax the definition of what a “unit” is.

It shouldn’t be language constructs (such as classes or methods) that dictate how much a unit test is allowed to cover, but rather the complexity of the code that you’re testing. You can often achieve the exact same test coverage and the exact same mutation resistance by testing your program at a slightly higher level, and you will get a lot to show for it.

Besides proving that your program works right now, tests serve to aid future refactoring and to document your software. It is exactly those things that low-level tests against individual functions, methods and classes don’t provide. Every little bit of code that you move around will require you to change the tests, which is not only frustrating, but also creates a very real risk of breaking those tests in the process. A lot of low-level tests also make it harder to figure out how much your higher-level tests still have to cover, and can tempt you to cut back on them, but it is exactly the higher-level tests that will provide you with the most valuable documentation of user-facing behavior.

graph TD
    A(( ))
    B(( ))
    C(( ))
    D(( ))
    E(( ))
    F(( ))
    G(( ))

    style A fill:#ffffff,stroke:#000,stroke-width:2px

    A --- B
    A --- C
    B --- D
    B --- E
    C --- F
    C --- G

If you can reasonably test the entire tree by testing the root, then do it. All future refactoring will be trivial, and your test names will read like actual business requirements.
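
To make that concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `order_total` plays the root, `_subtotal` and `_discount` play the leaves, and the 10% member discount is an invented business rule. Prices are whole numbers (think cents) to keep the arithmetic exact.

```python
# Hypothetical example: none of these names or rules come from a real codebase.

def _subtotal(items):
    """Leaf: sum of price * quantity over (price, quantity) pairs."""
    return sum(price * qty for price, qty in items)


def _discount(subtotal, is_member):
    """Leaf: members get 10% off once the subtotal reaches 100."""
    if is_member and subtotal >= 100:
        return subtotal // 10
    return 0


def order_total(items, is_member=False):
    """Root: the only function the tests below ever call."""
    subtotal = _subtotal(items)
    return subtotal - _discount(subtotal, is_member)


# The tests go through the root only, so their names describe
# requirements rather than helper functions.
def test_members_get_ten_percent_off_large_orders():
    assert order_total([(60, 2)], is_member=True) == 108


def test_members_pay_full_price_on_small_orders():
    assert order_total([(30, 2)], is_member=True) == 60


def test_non_members_pay_full_price():
    assert order_total([(60, 2)]) == 120
```

Both leaves are fully exercised, yet neither is mentioned in a test. Inlining `_discount`, renaming `_subtotal`, or splitting either one further would not require touching the tests at all.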

There is one thing that will force your hand though, and for that we need to remember that a program isn’t actually a tree, but rather a directed graph.

graph TD
    A(( ))
    H(( ))
    B(( ))
    C(( ))
    D(( ))
    E(( ))
    F(( ))
    G(( ))

    style A fill:#ffffff,stroke:#000,stroke-width:2px
    style H fill:#ffffff,stroke:#000,stroke-width:2px
    style G fill:#00ff00,stroke:#000,stroke-width:2px

    A --> B
    A --> C
    B --> D
    B --> E
    C --> F
    C --> G
    H --> G

If you only test your program through the white nodes, you will have two alternatives for testing the green node: either you test it through both, or you test it through just one of them. The former leads to duplicated tests, and the latter means that the test coverage for the green node might inadvertently disappear if one of the white nodes gets refactored away. This is where it makes sense to unit test the green node directly, even if it just describes an abstract concept.
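
Here is a rough sketch of that situation, again with invented names: `normalize_email` stands in for the green node, and `register_user` and `is_registered` stand in for two entry points that both depend on it. The normalization rules are an assumption made up for the example.

```python
# Hypothetical example: names and rules are illustrative only.

def normalize_email(raw):
    """Shared (green) node: trim whitespace and lowercase the address."""
    return raw.strip().lower()


def register_user(users, raw_email):
    """First entry point: add a user unless the address is already known."""
    email = normalize_email(raw_email)
    if email in users:
        return False
    users.add(email)
    return True


def is_registered(users, raw_email):
    """Second entry point: look a user up by address."""
    return normalize_email(raw_email) in users


# The shared node gets its own direct test, so its coverage does not
# depend on either entry point continuing to exist.
def test_normalize_email_ignores_case_and_surrounding_whitespace():
    assert normalize_email("  Ada@Example.ORG ") == "ada@example.org"


# The entry points are tested for their own behavior only; they do not
# repeat the normalization cases.
def test_registering_the_same_address_twice_is_rejected():
    users = set()
    assert register_user(users, "ada@example.org") is True
    assert register_user(users, "ada@example.org") is False


def test_lookup_finds_a_registered_user():
    users = {"ada@example.org"}
    assert is_registered(users, "ada@example.org")
```

If `is_registered` is later deleted, the normalization rules are still pinned down by the direct test, and neither entry point had to duplicate those cases.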

sven [at] memcmp.org