std::stream of consciousness

15/10/2022

Libraries in C++ — Header-only, static, dynamic, and lazily loaded

Just how many times have you cried because of a linker error?

I have. And so many times!

In hindsight, it is a pretty simple topic, but if you're so used to just write code, and let the build system take care of compiling and linking, then you'll probably hit linker errors infrequently enough that you'll be tolerant of the frustration of those moments, as long as some trial and error with build configuration files will get you out of trouble.

But sometimes you can't just hold the frustration anymore, and you command yourself to dedicate some time to go to the bottom of this topic; or, if not to the bottom, at least as low as it takes for you to be happy with it, and no more frustrated by the linking failures.

So here's how I came to write this post: I started by creating a directory with an evocative name, moved into it, and started to play with C++ files in an attempt to reproduce the frustrating workflow to better understand it. When I started to erroneously delete the wrong file, to forget what edit I made to file such and such, and to feel more frustrated because didn't want to spend time creating a Makefile to deal with this crap on my behalf (after all, the Makefile would be the guy shielding me away from exactly the topic I was trying to understand, so probably using one would have killed the purpose of the exercise), I decided that writing a post to help myself track the progress of my understanding was maybe a good idea.

Eventually, this post turned out to be a personally reworded collage of all the info I've found on the web that helped me build a firmer understanding of the compilation process, as well as of the various types of libraries that exist. These two aspects are tightly bounded together, in the sense that libraries (yours or third party's) are exactly the expression of the fact that a program can be written in separate and independently developed, tested, and sometimes built pieces.

(Little but important note: I'm on Linux, both at work and in my personal time, so I will not even try giving Windows examples; but most of this post should be useful anyway.)

The steps from source code to executable

First of all, it is important to have an understanding of what "compiling" a source codes into an executable really means. I've quoted the word "compiling" because proper compilation is just one of the steps that a compiler executes to go from source code to executable program. This short but very good answer on StackOverflow outlines the phases that GCC goes through in the process.

For instance… For instance, if you have a single mysource.cpp file, when you execute

g++ mysource.cpp -o mysource

GCC in reality does the following (I'm barely rewording from the linked answer).

It preprocesses the source code, by stripping comments and expanding macros and #includes; you can obtain this intermediate product yourself by passing the -E to g++:
```
g++ -E mysource.cpp -o mysource.ii
```
It compiles that preprocessed source code to assembly, which you can do yourself by means of the -S option:
```
g++ -S mysource.ii -o mysource.s
```
It compiles the assembly into an object binary file, which you can do by means of the -c option:
```
g++ -c mysource.s -o mysource.o
```
Finally, it links the object into the executable; you can do this by passing no options:
```
g++ mysource.o -o mysource
```
(you can list other object files too, if needed).

For the sake of accuracy, the flags -E, -S, -c and none tell g++ at which step to stop the process, whereas where to begin from is dictated simply the file you throw at it (not because of the extension but because of what's inside it). To give a couple of examples,

g++ -S mysource.cpp -o mysource.s

performs steps 1 and 2, whereas

g++ mysource.ii -o mysource

executes steps 2, 3, and 4.

`#include` is copy-and-paste

As you might have guessed from its description, step 1 is what takes care of, among other things, #include directives; and you might also wonder how simple this preprocessing might be, considering that it happens so early in the process.

And it is indeed a very simple thing:

#includeing a file is equivalent to copying-and-pasting it into the includer.

(If you know it already, you can skip the current section.)

And I'm intentionally saying file instead of header: whatever you #include is just copy-and-pasted. There's truly nothing other than copy-and-paste when you #include something.

Indeed, so little is special about #include, that if you write a header without having that simple idea in mind, you can easily run in errors which become obvious, once you are aware that there's just a copy-and-paste going on.

Here's three flawed headers, each defining a different class, one of which is included by the other two, which are in turn included in the main program:

// Foo.hpp
#include "Bar.hpp" // because every
struct Foo {       // Foo needs a Bar,
  Bar b;           // doesn't it?
};

// Bar.hpp
struct Bar {
  // whatever
};

// Baz.hpp
#include "Bar.hpp" // Even Baz needs a Bar!
struct Baz {
  Bar b;
};

// main.cpp
#include "Baz.hpp"
#include "Foo.hpp"
int main() {
  // not even trying to make
  // any use of the headers
}

If you click on the last snippet, you'll see that the program can't be compiled because it results in a redefinition of Bar, the reason being that both Foo.hpp and Baz.hpp are #includeing, i.e. copying-and-pasting, Bar.hpp.

In other words, it's just like we tried to compile this source file alone:

// main.cpp
struct Bar {
  // whatever
};
struct Baz {
  Bar b;
};
struct Bar { // oops, I did it again
  // whatever
};
struct Foo {
  Bar b;
};
int main() {
  // not even trying to make
  // any use of the headers
}

The source above, in fact, is the known as a translation unit, which is the thing that the compiler compiles (or attempts to, in this case) and which is roughly "the source file, conventionally named with extension .cpp/.cxx/.c/whatever, that you feed to the compiler, in which all #includes have been copied-and-pasted (and all other #-directives have been "resolved", e.g. macro substituted, and so on)"

In light of the steps we've seen earlier, another, more direct way of verifying that #include means copy-and-paste, is to examine the output of the following command

g++ -E main.cpp -o main.ii

which is successful and produces the following (it's not a C++ file, but a file of type initng which contains a mix of C++ syntax and something else, as you can see; more info here):

# 0 "main.cpp"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "main.cpp"
# 1 "Bar.hpp" 1
struct Bar {

};
# 2 "main.cpp" 2
# 1 "Foo.hpp" 1
# 1 "Bar.hpp" 1
struct Bar {

};
# 2 "Foo.hpp" 2
struct Foo {
  Bar b;
};
# 3 "main.cpp" 2
int main() {
}

where you can clearly see the duplicate definition of Bar.

But then why doesn't this problem occur when we include standard headers?

The answer is simple: the standard headers use a device know as #include guards to prevent multiple inclusions of the same header in the same translation unit, and you should do so in every single one of your headers.

To see how an include guard looks like, open any system header, such as iostream, that on my system is located at /usr/include/c++/12.2.0/iostream at the time of writing, and you'll see it has this structure:

// comments only
#ifndef _GLIBCXX_IOSTREAM
#define _GLIBCXX_IOSTREAM 1

// the content of the header

#endif /* _GLIBCXX_IOSTREAM */
// EOF

And this is what gets copied-and-pasted in the source includer file, resulting in the translation unit that the compiler tries to compile.

When the compiler reaches the above snippet of code while compiling a translation unit, it will verify that the macro _GLIBCXX_IOSTREAM is not defined (because it's #ifndef, not #ifdef) before actually pasting // the content of the header; it will also, at that point, #define the macro that was not defined yet (the value 1 is irrelevant), so that, should this snippet of code happen to occur again in this same translation unit, #ifndef _GLIBCXX_IOSTREAM will not fire and the // the content of the header will not be copied-and-pasted again.

Another simpler way of enforcing an #include guard, even though not standard, but practically permitted by all major compilers, is to put #pragma once at the top of your header, and nothing else. What's the difference? Practically speaking, none. Anyway, you can find more info here and at the previous link.

Header-only libraries

Knowing that #included headers are simply copied-and-pasted, you can easily imagine you can provide all the functionality you want to a client of yours by providing them with one or more header files (and by "header" I mean a proper one, with the #include guard) defining all functions and classes they need. That's not necessarily a good idea, but sometimes it's the only option, for instance when you are defining templates (unless you already know all possible usages of your template).

Using such a library is as easy as adding the #include you need into your source code,

// HeaderOnly.hpp
#pragma once
#include <iostream>
void foo(int i) {
    std::cout << "Header-only library speaking, "
              << i
              << std::endl;
}

// main.cpp
#include "HeaderOnly.hpp"
int main() {
  foo(101);
}

and it becomes part of your program as soon as during step 1, the preprocessing stage, as demonstrated by the following few commands

$ g++ -E main.cpp -o main.ii # preprocess main.cpp
$ rm HeaderOnly.hpp          # delete header to show it's not needed anymore
$ g++ main.ii -o main        # compile & link
$ ./main                     # execute
Header-only library speaking, 101

Boost.Hana is a beautiful example of a header-only library, designed to facilitate C++ meta-programming and suited for computations on both types and values.

Static libraries

One of the drawbacks of header-only libraries is that they #include all the headers their implementation needs; which translates to what is known as code bloat: your translation unit ends up being huge because of all the direct and indirect #includes. Indeed, in the example above, you can well see that the code inside <iostream> gets pasted into main.cpp, even though main.cpp doesn't need to know anything about <iostream>; all it needs to know is how to call foo, i.e. it needs to know its signature.

Whether you should fear this problem and how much is another matter that this post does not address at all, and it is influenced, among other things, by how smart the compiler is.

Anyway, what is the solution? There's basically no solution if you provide templates that will be instantiated with types you don't know of, and that's why templates are often accused of causing code bloat (see also §9.1.1 from C++ Templates - The Complete Guide - 2nd edition for more details), but for the simple case of non-template entities, such as the foo function above, the solution is simple enough. Indeed, all main.cpp needs to know about foo in order to be able to call it, is that it takes void and returns void, i.e. that its type is void(void), or more simply void(). And all we need to do to make it aware of this fact, is writing in it a declaration of foo, i.e. void foo();. As you might have guessed, we don't do that manually, but by including an header which contains that declaration.

So here's our solution in code:

MyLib.hpp declares foo,
```
// MyLib.hpp
#pragma once
void foo();
```

MyLib.cpp defines it,

// MyLib.cpp
#include "MyLib.hpp"
void foo() {
    std::cout << "Hello" << std::endl;
}

and main.cpp #includes the header and uses foo.

// main.cpp
#include "MyLib.hpp"
int main() {
  foo();
}

If you're wondering why MyLib.cpp #includes its own header MyLib.hpp the answer is that there is such a myriad of scenarios where doing so is necessary (e.g. MyLib.hpp defines an inline variable to be used in client code as well as in the implementation of some other exposed function), that not doing it in a rare scenario like the one above (more likely to happen in toy examples than in real code) is just not worth the risk of forgetting it when it's needed. You can read more about it in this question I asked on StackOverflow and in the associated answers.

The above example can be built like this

g++ -c main.cpp -o main.o   # compile main.cpp
g++ -c MyLib.cpp -o MyLib.o # compile MyLib.cpp
g++ main.o MyLib.o -o main  # link

where the first two commands can be executed in any order, because the translation units can be compiled independently, thanks to the fact that the only thing that main.cpp needs to know about MyLib.cpp, i.e. the declaration of foo, has indeed been "pasted" into main.cpp as well, via the inclusion of MyLib.hpp.

You can even use the program nm on both *.o to list the symbols (not the executable code) they contain, in order to observe how they relate:

$ nm main.o
0000000000000000 T main
                 U _Z3foov

$ nm MyLib.o
                 U __cxa_atexit
                 U __dso_handle
                 U _GLOBAL_OFFSET_TABLE_
0000000000000084 t _GLOBAL__sub_I__Z3foov
0000000000000000 T _Z3foov
0000000000000032 t _Z41__static_initialization_and_destruction_0ii
                 U _ZNSolsEPFRSoS_E
                 U _ZNSt8ios_base4InitC1Ev
                 U _ZNSt8ios_base4InitD1Ev
                 U _ZSt4cout
                 U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 b _ZStL8__ioinit
                 U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc

_Z3foov is a so-called mangled name (which will become important later on), in which the compiler encodes the signature of the specific overload of foo in use, and the line U _Z3foov in main.o is the undefined reference to `foo()' (that U stands for "undefined"); the 0000000000000000 T _Z3foov line in MyLib.o indicates that that file contains the definition of the same foo that main.o references.

Another way to visualize this bond between the two files that does not need an external program like nm is to look at their assembly code. Let's obtain that code first

g++ -S main.cpp -o main.s
g++ -S MyLib.cpp -o MyLib.s

and look for _Z3foov in the two files main.s and MyLib.s.

.file	"main.cpp"
	.text
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	call	_Z3foov@PLT
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (GNU) 12.2.0"
	.section	.note.GNU-stack,"",@progbits

	.file	"static.cpp"
	.text
	.local	_ZStL8__ioinit
	.comm	_ZStL8__ioinit,1,1
	.section	.rodata
	.align 8
.LC0:
	.string	"Hello world, static library speaking"
	.text
	.globl	_Z3foov
	.type	_Z3foov, @function
_Z3foov:
.LFB1761:
# more stuff

Going back to the building process, once again, it can be executed in one step, but I think it's instructive to see what happens if you try linking without telling the linker where MyLib.o is. What you get is an error from the linker, ld, complaining about an undefined reference to foo():

$ g++ main.o -o main # forgot MyLib.o
/usr/bin/ld: main.o: in function `main':
main.cpp:(.text+0x5): undefined reference to `foo()'
collect2: error: ld returned 1 exit status

And maybe it's even important to know how this error differs from the one you get if you forget to #include "MyLib.hpp" in main.cpp, which complains about foo being not declared. This is an error that happens at compile-time, rather than link time, because main.cpp doesn't even know foo's signature.

$ g++ -c main.cpp -o main.o
main.cpp: In function ‘int main()’:
main.cpp:3:5: error: ‘foo’ was not declared in this scope
    3 |     foo();
      |     ^~~

(More precisely, the error above happens during the compilation to assembly, i.e. you get the same error if you run the command $ g++ -S main.cpp -o main.s.)

At this point one question that might arise is: where did the code in MyLib.cpp end up being? You can verify, by getting rid of the *.o files, that main works well in isolation:

$ rm MyLib.o main.o
$ ./main
Hello

Therefore, you can also be sure that all the code it needs, including the one originally in MyLib.cpp and compiled in MyLib.o, is indeed embedded into the binary named main. In other words, the code of a static library becomes part of the executable file at link time.

To be accurate, by the name static libraries go those files conventionally named something.a, which are not one single *.o file, but an archive containing one or more of those.

For instance the following command creates a properly known static library containing one object file only

ar rcs libMyLib.a MyLib.o

which you can link like this

g++ main.o libMyLib.a -o main

The reason why I chose the name libMyLib.a instead of MyLib.a is that this is customary for static libraries, so much that there's another syntax for linking them which assumes the lib prefix in the filename:

g++ main.o -lMyLib -L. -o main

where -lname tells the linker to look for a file named libname.a and -Ldir tells it to look for that file in the directory dir. Indeed, if we had named the archive MyLib.a, the command above would error with a fairly understandable wording:

/usr/bin/ld: cannot find -lMyLib: No such file or directory
/usr/bin/ld: note to link with ./MyLib.a use -l:MyLib.a or rename it to libMyLib.a
collect2: error: ld returned 1 exit status

Dynamic libraries

So we've verified that when linking a static library against a client code, the library's and client's code end up in the same produced executable. Therefore, if the static library changes, i.e. if its author changes MyLib.cpp, recompiles it into MyLib.o and gives the latter to you, you still have to re-link your main.o against the new MyLib.o, otherwise you'll be still using the old one.

Is there a way that the library writer can compile the library such that they're free to change the implementation of the library without forcing us to re-link? Yes there is.

Let's make just one little change to the workflow described above. The source and header files stay the same, but we compile the library with a different command:

g++ -fPIC -shared MyLib.cpp -o libMyLib.so

Then we link using the same command as before:

$ g++ main.o libMyLib.so -o main

If we try to run the executable straight away as we did earlier, though, we get an error,

$ ./main
./main: error while loading shared libraries: libMyLib.so: cannot open shared object file: No such file or directory

This error tells us two things:

"error while loading shared libraries" tells us that linking doesn't embed the library executable code into the main executable;
"No such file or directory" tells us that the runtime system that attempted to start the program couldn't find the library, so we need a way to instruct it about where to look for it.

The solution here is easy: we can pass the missing information via the LD_LIBRARY_PATH macro,

$ LD_LIBRARY_PATH='.' ./main # . is the current dir, where the library is
Hello

Now it should be clear that if we delete MyLib.so and run main, we get the same error as before,

$ rm MyLib.so
$ LD_LIBRARY_PATH='.' ./main
./main: error while loading shared libraries: libMyLib.so: cannot open shared object file: No such file or directory

but a more interesting observation is that we can change MyLib.cpp (but leaving its interface intact, i.e. we don't change MyLib.hpp), recompile the library, and executing main will reflect those changes, without any need to re-link:


$ sed -i 's/Hello/Bye/' MyLib.cpp            # change Hello to Bye (you can do it any other way)
$ g++ -fPIC -shared MyLib.cpp -o libMyLib.so # recompile only the library
$ LD_LIBRARY_PATH='.' ./main
Bye

Therefore, an author of a dynamic library can do all the changes they like to the implementation of the library (but not to the interface they promised), share the binaries with the client, and the client will just have to overwrite the old shared objects with the new ones, and the executable will pick them up. In other words, the code of a dynamic library becomes part of your program at ~~runtime~~ startup time. And this means the *.so file must be there when you run main, even if main doesn't make use of it (try removing the call to foo() from inside main.cpp, recompile and relink it against libMyLib.so, then delete libMyLib.so, run LD_LIBRARY_PATH='.' ./main and you'll get same error as above).

Why have I specifically written "startup time" rather than "runtime"? Because the runtime system loads the library even before the actual program starts executing. How can we verify this? Simple: put a print statement in main.cpp before the call to foo, like this

// main.cpp
#include "MyLib.hpp"
int main() {
  std::cout << "before" << std::endl;
  foo();
}

and then

$ g++ -c main.cpp -o main.o
$ g++ main.o libMyLib.so -o main
$ rm libMyLib.so
$ LD_LIBRARY_PATH='.' ./main
./main: error while loading shared libraries: libMyLib.so: cannot open shared object file: No such file or directory

As you can see, no trace of the text before: the executable hasn't even started.

This also means, though, that once the executable has started, you can indeed delete the .so file, and the running process will not be impacted, which you can verify by repeating the above process, but changing this

  std::cout << "before" << std::endl;

to this

  int dummy;
  std::in << dummy;

re-compiling, re-linking, starting the program, deleting libMyLib.so while the executable is waiting for your input, and then entering the input. The program will do just fine.

At this point, if you cannot guess already what the next section is about, I think you probably will if I ask you two questions:

How much does loading a library cost?
How many libraries are we realistically loading in a non-toy application?

Clearly, the answers vary wildly across the use cases, but the point is that loading a library does have a cost, and for big programs, especially those with multiple functionalities that are not all necessarily needed at the same time (or not at all if the user chooses not to use them in a session of the program), there can be many of those costs; as many that they can add up at the expense of the startup time of the program and, consequently, of the user experience.

You might now be wondering whether a way exists to load a dynamic library only when it is needed. This is the question that leads us to the final section of this post…

Lazily loaded dynamic libraries

If the library is not loaded by the runtime system as soon as we start the program, then who takes of care of loading it? There's no other answer than "the program" itself. After all, that's what we want: the program logic to decide when and if to load the library. And the API to do so is part of the GNU C library.

Before showing the code, I think it is important to stress a few points:

we already have a dynamic library in place, and from the perspective of the clients of the library, that's not gonna change;
the library implementer can change the implementation of it (i.g. the MyLib.cpp file), but as long as they keep the interface (i.e. the MyLib.hpp) unchanged, for the clients it's just like the library doesn't change;
the lazy or eager policy in loading the library is a decision made on the client side, which the implementer of the library knows nothing about.

That being said, it should be clear that we will now only play the role of the library clients, with the aim of loading the library lazily, during run time, rather than eagerly, during launch time. For this reason, we'll only have to review two things:

the actual code of our application, main.cpp, to make it actually handle the loading of the library,
and the command to build our application (which is fairly straightforward, if not unsatisfying).

As regards the first point, main.cpp is now a bit more complicated, because it needs to deal with the possibility of failure when loading the library and/or reading the symbol(s) we are looking for, but the comments should make most of it understandable.

#include <iostream>
#include <dlfcn.h>

#include "MyLib.hpp"

int main() {

    // try opening the library
    void * lib = dlopen("./libMyLib.so", RTLD_LAZY);

    // verify opening was fine, or error out
    if (!lib) {
        std::cerr << "Error (when opening the library \"./libMyLib.so\"): " << dlerror() << std::endl;
        return 1; // if we return here, echo $? in the shell will print 1
    }

    // so far so good
    dlerror(); // reset the error status

    // try loading the symbol `foo`
    void * fooPtr = dlsym(lib, "foo");
    auto error = dlerror();

    // verify loading symbol was fine, or error out
    if (error) {
        std::cerr << "Error (when loading the symbol `foo`): " << error << std::endl;
        return 2; // if we return here, echo $? in the shell will print 2
    }

    // so far so good
    dlerror(); // reset the error status

    // cast foo to the function type we know and call it
    using Foo = decltype(foo);
    ((Foo*)fooPtr)();
}

As regards the second point, the solution is simple: we just don't link against the library, because we are delegating the task of "establishing a connection" with the library to our program itself.

However, surprisingly enough (at least for those of us who are new to this topic) …

$ g++ main.cpp -o main                       # compile and link main
$ g++ -fPIC -shared MyLib.cpp -o libMyLib.so # compile the library
$ ./main                                     # run
Error (when loading the symbol `foo`): ./libMyLib.so: undefined symbol: foo
$ echo $?                                    # print last return status
2

… the runtime doesn't seem to find the symbol foo, even though the library loading succeeds (indeed, we exited with return 2, not return 1).

Without dedicating enough time to think about why that was happening, I rushed at asking on StackOverflow to discover something which is obvious in hindsight: since in C++ functions can be overloaded, the name of function alone is not enough, in general, to unequivocally address what could be one of many overloads. To know what function to call, the linker has to know the function signature too, so the compiler has to provide it in the object file, and it does so by presenting each overload of foo with a differently mangled name (the thing I mentioned earlier), where the mangling encodes the signature. In short, if we change

    void * fooPtr = dlsym(lib, "foo");

    void * fooPtr = dlsym(lib, "_Z3foov");

the problem is solved.

This solution, however, is non-portable, as noted in a comment to the linked question. In the case that the library provides a single overload of foo, then an alternative solution (from the perspective of the library writer) is to declare foo like this

extern "C" void foo();

so its name will not be mangled in the shared object file (in C there's no function overloading, hence no reason to mangle names), and we can pass just "foo" to dlsym.

If, instead, the library provides more overloads, then extern "C" is not a viable solution, and one has to deal with mangled names. I'm not sure a portable solution exists, but here's a relevant question with some possible solution outlined (the lengthy comments to the question are very informative, so I suggest to have a read of those too). Incidentally, I'm not sure how much the necessity of inspecting a library for mangled names has to do, if anything at all, with the question of whether the application should still include a header from the library (there's an hint of this aspect of the matter in the comments here).

Leaving aside the topic of shared libraries providing overloaded functions, and staying in the realm of what is possible with extern "C" (or, alternatively, writing non-portable code by hard coding mangled names in the client code), it should be clear that with such a mechanism (I mean, dlopen+dlsym, not only can we load a library lazily, i.e. only when its needed, but we can also take into account the scenario where the library is missing, and provide a fallback alternative to it.

Below I've included one such example. It it is made up of

a main.cpp that tries to load a library libMyLib.so and, if that is not available, falls back on a default library;
a default library DefaultLib.cpp/DefaultLib.hpp;
a libMyLib library, MyLib.cpp/MyLib.hpp.

It can be compiled and used like this:

$ g++ -c main.cpp -o main.o                  # compile main
$ g++ -c DefaultLib.cpp -o DefaultLib.o      # compile default library
$ g++ DefaultLib.o main.o -o main            # link
$ ./main                                     # run (libMyLib.so is not available)
Slow, old, rotten version
$ g++ -fPIC -shared MyLib.cpp -o libMyLib.so # compile shared library
$ ./main                                     # run (libMyLib.so is available)
Fast version (you've I paid big money for it)

Here you can see that main can be run with or without the shared library being available, and its logic will act accordingly. (By the way, we don't need to prepend LD_LIBRARY_PATH='.' because we load the library from where we know it is, see "./libMyLib.so" in the code below.)

Below are the source codes.

// DefaultLib.hpp
#pragma once
namespace old {
void foo();
}

// DefaultLib.cpp
#include "DefaultLib.hpp"
#include <iostream>
namespace old {
void foo() {
    std::cout << "Slow, old, rotten version" << std::endl;
}
}

// MyLib.hpp
#pragma once
extern "C" void foo();

// MyLib.cpp
#include "MyLib.hpp"
#include <iostream>
void foo() {
    std::cout << "Fast version (you've I paid big money for it)" << std::endl;
}

// main.cpp
#include "DefaultLib.hpp"
#include "MyLib.hpp"
#include <dlfcn.h>
#include <iostream>

int main() {

    // try opening the library
    void * lib = dlopen("./libMyLib.so", RTLD_LAZY);

    // verify opening was fine, or use default version
    if (!lib) {
        old::foo();
        return 0;
    }

    // so far so good
    dlerror(); // reset the error status

    // try loading the symbol `foo`
    void * fooPtr = dlsym(lib, "foo");
    auto error = dlerror();

    // verify loading symbol was fine, or use default version
    if (error) {
        old::foo();
        return 0;
    }

    // so far so good
    dlerror(); // reset the error status

    // retrieve type of foo, as we expect it based on the header
    using Foo = decltype(foo);

    // and cast fooPtr to a pointer to that type before using it
    ((Foo*)fooPtr)();
}

Click on the last one to see the complete example on Compiler Explorer; comment the add_library(…) line in the CMakeLists.txt file to hide the library, thus triggering the use of the default one; also, consider that I had to change "./libMyLib.so" to "build/libMyLib.so" because that's where CE puts the shared object; I didn't know, I just asked.

Yeah, but how do I debug library code? — Demo

As far as header-only libraries are concerned, debugging is just as you would expect: you have the library code, you include it into yours, you compile the way you like, e.g. enabling debugging flags, and you step into the library code just as it was your own code. Easy-peasy.

As regards the other types of libraries we've reviewed, i.e. static and dynamic libraries, those normally come in compiled form (shared object, library archives, … whatever), is simply no C++ source code you can step through, … unless you do have the source code from which the library was compiled. So we can say that the possible scenarii are the following:

The library is not open source. In this case you just don't and can't legally have the source code, so you just can't step through it with a debugger. Is is a problem? No, as most likely you paid for the library, so you can report a bug and expect that the vendor will debug and fix the bug.
The library is open source and you installed it from pre-compiled binaries. In this case you can't step trough the code because, just as in case 1, you don't have it.
The library is open source and you have built it from source on your system. In this case you do have the source code you want to step through, with one caveat: most likely the library was built with full optimizations on, in order to target performance, so you'll probably see a lot of jumps when debugging, and you'll not be able to set breakpoints on every single executable line of code, because many have been optimized always. If that's the case, you might want to (temporarily) recompile the library with debug flags on (e.g. -g -O0 with GCC) before debugging.

A tricky bit about debugging a non-header-only library, though, is that the debug symbols have to be loaded during the debug session. Depending on what debug adapter you use, this could happen automatically or not, and the behavior of the same debug adapter can likely depending on how you configure it.

In the following screen cast I'll show how I step through the last example with the debugger; initially I'll use the vscode-cpptools adapter with setting "symbolLoadInfo": { "loadAll": false, "exceptionList": ""}, which fundamentally means "do not load debug symbols from libraries, and make no exception", so you'll see that I need to manually load the symbols from the library to enable a breakpoint in that library code. Then I'll show you the same workflow with CodeLLDB, which, for some reason I don't know, ignores the setting I mentioned earlier (maybe it uses another syntax and ignores this one because it doesn't understand it?) and loads the symbols as soon as dlopen returns.

Conclusions and summary

Creating all the examples above and playing with each of them helped me have a better understanding of what compiling and linking actually are, how different parts of a program know about each other, and what the difference is between static and dynamic libraries.

Here are the main points.

#include <foo> just means "copy-and-paste the file foo right here".
A header-only library is a library which is made up entirely of… headers, which you #include in your files, and that becomes part of your program as soon as the preprocessing stage of the building process happens.
A static library is an already compiled piece of code (often in the form of an object file, someObj.o, or of an archive, someLib.a) which you link against your compiled program; the result is one executable that contains everything it needs.
A dynamic library is, too, an already compiled piece of code, but at link time it doesn't become part of the generated executable; the latter, in fact, only has references to the entities it needs from the library, and those references are resolved when your program is launched.
A dynamic library, alternatively, can be loaded at run time by the program itself, via dlopen and dlsym, removing entirely the necessity of liking your program against it.

Referenced sources

Below are most of the sources (some of which already linked off the text above) I found invaluable in helping me understanding the topic I've been writing about today.

18/06/2022

Strategy pattern lie

A programming language that lacks lambdas, with two of its design patterns (one of them them probably being the strategy pattern).

I'm slowly but inexorably growing the idea that design patterns are, fundamentally, established workarounds to languages' weaknesses.

(If you are particularly sensitive, scroll to the bottom of this post, and see how much the strategy pattern can be shrunken in C++.)

Indeed, if I search the web for "design patterns" + weakness + language, I get a lot of results, the first one of which is a question on StackOverflow, and that question links to a blog, which I picked the following quote from:

Progress in programming languages

Had the "Design Patterns" movement been popular in 1960, its goal would have been to train programmers to recognize situations in which the "subroutine" pattern was applicable, and to implement it habitually when necessary. […], it would have been vastly inferior to what really happened, which was that the "subroutine" pattern was codified and embedded into subsequent languages.

[…]. The problem with the "Design Patterns" movement is […] programmers are trained to identify and apply the patterns when possible. Instead, the patterns should be used as signposts to the failures of the programming language. […].

[…]. The correct place to implement a common solution to a recurring design problem is in the programming language, if that is possible.

The stance of the "Design Patterns" movement seems to be that it is somehow inevitable that programmers will need to implement Visitors, Abstract Factories, Decorators, and Façades. But these are no more inevitable than the need to implement Subroutine Calls or Object-Oriented Classes in the source language. These patterns should be seen as defects or missing features in Java and C++. The best response to identification of these patterns is to ask what defects in those languages cause the patterns to be necessary, and how the languages might provide better support for solving these kinds of problems.

[…].

[…]. Helping hapless C++ and Java programmers is admirable, but it shouldn't be the end goal. Instead of seeing the use of design patterns as valuable in itself, it should be widely recognized that each design pattern is an expression of the failure of the source language.

If the Design Patterns movement had been popular in the 1980's, we wouldn't even have C++ or Java; […]. But the way to provide as much help as possible was not to train people to habitually implement Object-Oriented Classes when necessary; it was to develop languages like C++ and Java that had this pattern built in, so that programmers could concentrate on using OOP style instead of on implementing it.

I read those words and, even if I don't know the author personally, I can't help but agreeing with him, and be astonished when I realize that so many programmers disagree, instead.

Army of Darkness poster — A *programmer* (aka *"user"*) with two very powerful *programming languages* (aka *"tools"*), and the crowd behind him all use mediocre programming languages, so they don't stand a chance.

The reason for my astonishment is that those words, in my opinion, state something true by definition. Definition of what? Of what the programmer and the programming language are: the user and his/her tool, respectively. Why should they be happy with doing something over and over, when the alternative is that the language does that for them?

Nonetheless, many programmers (hugely experienced as well as newbies) strongly disagree with the above view, so praises of design patterns are sung in every corner of the internet.

Maybe the only way to convince them of the equation design patterns = weaknesses of a language is to review each single design pattern, and then cook an example written in a programming language that doesn't need that pattern because it's built into the language, or library provides a nice API for it.

To do so, I need a reference book, and I chose it to be the famous book "Head First - Design Patterns" (the one with the blond girl on the cover and many many drawings inside that look like coming from the comics), which is one of the most highly considered books on design patterns, maybe one step below its more earnest counterpart "Design Patterns - Elements of Reusable Object-Oriented Software" (at least based on what I've heard in my … ehm … 3 years in software development). The reason I chose the former is dual:

I don't have the latter;
the former is written in a language (Java) other than my main language (C++), so I'm in a better position to show that changing the language can make a big difference on whether you need a pattern or not.

Among all those patters, I'll start from the easy one, The Strategy Pattern.

Review of the "Strategy Pattern"

What is it, and how is it defined?

Let's go first to the definition (page 24), which, in all honesty, seems pretty uncontroversial:

The Strategy Pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable. Strategy lets the algorithm vary independently from clients that use it.

The good thing about it is that it does not refer to objects or to the object-oriented paradigm at all, let alone inheritance and class hierarchies.

Nonetheless, there's room for questions already! For instance,

What does it mean for an algorithm to be encapsulated? An algorithm is just a sequence of instructions, and it surely has a beginning and an end, so one can tell where it is. Why do I need to encapsulate it? Who is requiring such a thing? An most importantly: how do I actually encapsulate an algorithm?
If two or more algorithms reach the same goal (read "same types of inputs and same type of output") via different means (read the implementation differs), aren't they interchangeable already? I can give that very input to each of them, and I'll get the same result every time. That's the definition of interchangeability, no?

And it's upon the answers to those questions, that my point is based, so let's go back to the beginning of the chapter, and find them.

At page 2, the authors throw the acronym OO at you within the first five lines, before even stating what the problem they want to solve is, so if I had any hope you could find a solution without resorting to OOP, I've already lost it. They've already converted you. I really hope not. Not totally. My point here is: how can you be so confident you'll find the best solution to a problem if you confine your search territory before even analyzing the problem? Can't you see that maybe there might be something useful out of the OOP world?

But the book is stuck into the religion of OOP, so after a few centimeters down from there, they throw at you inheritance, in the form of a UML diagram. At this point, in an attempt to resist to this dogmatic approach they use, I add another question to the ones above:

Why do we need inheritance? And do we need it at all?

And the ~~answer to this~~ intention is probably clear by the time Joe adds the fly() method to the Duck base class: inheritance is used to share a common behavior between all the classes that inherit from a common base class.

public abstract class Duck {
  public Duck() {}
  public void fly() {
    /* common way of flying */
  }
}

The shortcoming of these approach is soon found, because Joe had not anticipated that maybe not all classes inheriting from Duck should be able to fly().

// some ducks are ok with the parent's fly()
public class MallardDuck extends Duck { /* other stuff */ }
// but several might not!
public class RubberDuck extends Duck { /* override fly() to do nothing */ }
public class DecoyDuck extends Duck { /* override fly() to do nothing */ }

And here is another question coming up:

Why should I make a class derive from another class, if the former is not happy with one of the methods of the latter?

Joe answers this question, but rather than giving up on inheritance, he does so by moving the fly() method out of the Duck base class into an interface, and making only some of Duck's derived classes implement the Flyable interface.

public abstract class Duck {
  public Duck() {}
}
public interface Flyable {
  public void fly();
}
public class RubberDuck extends Duck { /* no fly() whatsoever */ }
public class MallardDuck extends Duck implements Flyable {
  public void fly() {
    /* fly the mallard way */
  }
}

This turns out to be a silly choice too, because all Joe's done, with that interface, is sharing the need for a fly() behavior across the classes that can actually fly, and not the fly() behavior, so he has to implement that behavior in each and every class that needs it. In other words, the code reuse attempted via inheritance from a base class is thrown away by the attempt of using a common interface. And I've written interface in mono-space, because it is a keyword in Java, where it has a very specific meaning, and that's precisely and admittedly the reason why the attempt fails: Java interfaces have no implementation code, so no code reuse.

So let's recap what's wrong with each of the two approaches:

With a Duck base class providing the fly() behavior and all other classes inheriting from it
- we have the benefit of sharing that behavior across all derived classes,
- and the drawback that we need to manually implement fly() in each and every class that wants to diverge from the common behavior
With a Flyable interface that requires all implementers to implement fly()
- We have the benefit that not all Duck's derived classes have a fly() method, but each of them can independently decide to implement Flyable only if they want to,
- and the drawback that we must repeat the code of fly() for all the classes that want it, even if they want the same flavour of fly()ing, because, as mentioned, Java interfaces have no implementation code.

Do you see anything? I hope that you've noticed how the two approaches are complementary to each other, as they have opposite advantages and opposite drawbacks. Therefore you might agree that a couple of additional questions come up naturally:

Can't I just have multiple base classes, instead of multiple interfaces? E.g. FlyableLikeBird, FlyableLikeRocket, FlyableNoWay, and so on, each with its default implementation of fly(), and every other class derive from the flavour they want? Doing so, I could take the first approach without the drawback.
What if my language allows the implementation of methods in an interface? If that's allowed, then I'd be removing the drawback from the second approach, making it winning.

It turns out that these two questions are actually the same question looked at from two different angles, because if interfaces were allowed to have implementations, they'd be base classes, and so we'd be just trying to inherit from multiple base classes, and point is that multiple inheritance is not allowed in Java.

On the other hand, both these things (inheriting from multiple base classes and implementing interface methods) are a thing in C++, which makes me hope that, at this point, the doubt about the real value of the Strategy Pattern should be crawling in your mind. At least, you might be thinking, the path from the problem to the solution is shorter in C++ than in Java, for the simple reason that C++ has a feature that Java lacks. By simple extension, you might agree that the added value of the strategy pattern depends on the language you are using.

Now let's jump to the proposed solution at page 22, which makes, at least in part, use of composition instead of inheritance:

create an interface for each of the behavior that you want to vary at will across the objects deriving from Duck, e.g. FlyBehavior and QuackBehavior;
every time you need a new flavour of a behavior create a class implementing the appropriate interface, e.g. FlyWithWings would implement FlyBehavior thus providing one flavour of fly()ing;
add two one member for each of the behaviors into the Duck base class, e.g. Duck would have members FlyBehavior flyBehavior and QuackBehavior quackBehavior.
Whenever you need a new type of duck, create a class inheriting from Duck and assigning appropriate values to flyBehavior and quackBehavior, those values being objects of class one of the implementers of the appropriate method, e.g. RubberDuck would inherit from Duck and assign flyBehavior the value of new FlyNoWay(), and something else to the other method.

public abstract class Duck {
  public Duck() {}
  // methods that derived classes can inject
  FlyBehavior flyBehavior;
  QuackBehavior quackBehavior;
  public void doFly() {
    flyBehavior.fly();
  }
  public void doQuack() {
    quackBehavior.quack();
  }
  // methods that every derived must implement
  public abstract void display();
  // methods that none must implement
  public void swim() {
    /* impl common to all ducks */
  }
}

public interface FlyBehavior {
  public void fly();
}
public interface QuackBehavior {
  public void quack();
}

public class FlyWithWings implements FlyBehavior {
  public void fly() { /* impl */ }
}
public class FlyNoWay implements FlyBehavior {
  public void fly() { /* impl */ }
}

public class MallardDuck extends Duck {
  public MallardDuck() {
    flyBehavior = new FlyWithWings();
    quackBehavior = new Quack();
  }
  public void display() {
    /* impl */
  }
}

Trying to slim down the pattern

But now look carefully at what Duck is: it is fundamentally a collection of methods, or behaviors, of which

two, doFly() and doQuack(), are expected to be injected by the derived classes via assignment of the members flyBehavior and quackBehavior respectively;
another one, display(), is expected to be implemented by every one of the derived classes;
and the last one, swim(), is expected to be the same across all derived classes.

It should be fairly clear that display() is not fundamentally different from doFly() and doQuack(): the fact that it is implemented differently across each derived class, and that instead several different classes might inject the same concrete behavior into doFly(), is purely incidental. Therefore, display() too can be factored out as the other two, and each derived Duck will inject its concrete implementation of it.

On the other hand, swim(), since it's meant to be common to all Ducks, is not exactly like the other three behaviors, but also not that different: we can also implement it as a call to a member, just being careful to initialize it in Duck's constructor, and making it private or final, so that it will stay the same across all Ducks no matter what.

Now, in view of the above, we can rewrite the Duck class like this:

public abstract class Duck {
  final SwimBehavior swimBehavior;
  public Duck() { swimBehavior = SwimNormally(); }
  // methods that derived classes can/must inject
  DisplayBehavior displayBehavior;
  FlyBehavior flyBehavior;
  QuackBehavior quackBehavior;
  public void doDisplay() {
    displayBehavior.display();
  }
  public void doFly() {
    flyBehavior.fly();
  }
  public void doQuack() {
    quackBehavior.quack();
  }
  public void doSwim() {
    swimBehavior.swim();
  }
}

public class MallardDuck extends Duck {
  public MallardDuck() {
    flyBehavior = new FlyWithWings();
    quackBehavior = new Quack();
    displayBehavior = new MallardDuckDisplay();
  }
}

public class MallardDuck extends Duck {
  public RubberDuck() {
    flyBehavior = new FlyNoWay();
    quackBehavior = new Squeak();
    displayBehavior = new RubberDuckDisplay();
  }
}

But now what's left of Duck? Just a shell, a container, a box, in which 3 methods can be injected from outside and 1 is fixed.

And what's left of the derived classes, such as MallardDuck or RubberDuck? They only take care of selecting a flavour for each of the behaviors, so why having those classes at all? We could just as well provide Duck with methods to set its behaviors (maybe the constructor, maybe actual setters), and then spawn objects of class Duck, which are all interchangeable because have the same class, and yet are different because we've injected different behaviors in them. You'd then have call sites like this (assuming the behaviors are set via Duck's constructor, appropriately rewritten):

Duck mallard = new Duck(MallardDuckDisplay(), FlyWithWings(), Quack());

See? No need for inheritance, so we have an answer to one of the questions.

In fact, we can also answer another question, the "why do I need (in Java) to encapsulate the algorithms and how do I do it?" The answer is that the algorithms (functions, e.g. .fly()) are encapsulated by wrapping them in a classes (e.g. FlyBehavior), and the advantage of doing so is that we can assign, change, swap, delete, …, the algorithms by assigning, changing, swapping, deleting, … objects of that class (e.g. flyBehavior) at run time.

So the classes wrapping/encapsulating the algorithm, only serve the purpose of dealing with the algorithms as they were objects.

But the story is not over in Java: now that algorithms are wrapped in objects, we can't call them anymore, i.e. we can't just do the following

Duck duck = new MallardDuck();
duck.fly();         // Duck has no fly() method!
                    // Yeah, it does have a flyBehavior,
duck.flyBehavior(); // but we can't call it either!

Instead, we have to go via a chain

duck.flyBehavior.fly();

which is precisely the reason why we've put doFly() in place:

duck.doFly(); // equivalent to the above

So it should be clear that the four members displayBehavior, flyBehavior, quackBehavior, and swimBehavior precisely mirror the four member functions doDisplay(), doFly(), doQuack(), and doSwim(), and that the latter only exist to ease the syntax you need to call the methods wrapped in the former.

Turning point: pick another language

This is the point that, in my opinion, should motivate one to look for another language.

For instance, in C++, you have operator overloading, and one operator that you can overload is operator(), which enables you to make the syntax duck.flyBehavior() valid in C++, by simply renaming FlyBehavior's fly method operator() (at this point it makes sense to change the name of the object of FlyBehavior class from flyBehavior to fly, so the syntax actually looks like duck.fly()):

struct FlyBehavior {
  virtual void operator()() const = 0;
};
struct FlyWithWings : public FlyBehavior {
  virtual void operator()() const {
    /* impl */
  }
};

struct Duck {
    FlyBehavior const& fly;
    /* … */
};

int main() {
    auto rubberDuck = Duck{FlyNoWay{}};
    auto mallardDuck = Duck{FlyWithWings{}};
    rubberDuck.fly();
    mallardDuck.fly();
}

The example above, however, is still not as flexible as the one presented in Java: for the syntax duck.fly() to be possible, I couldn't make the member fly a pointer (i.e., to be modern, by wrapping it into a std::unique_ptr), otherwise the syntax would be as ugly as rubberDuck.fly->operator()(), but at the same time I couldn't make it a value member (i.e. FlyBehavior fly;) because FlyBehavior is an abstract class, as it's got a pure virtual member function (i.e. one marked virtual and ending with = 0;), so an object of that class can't be instantiated; I had to make it const& (a non-const reference would not be possible either, because it wouldn't allow passing temporaries to Duck's constructor, as in Duck{FlyNoWay{}}).

To me, this is a symptom that we are still victim of the language; or, better, we are victim of our bad choice of what features we use to model a certain thing. Indeed, even though we gave up on inheritance in favour of composition on the "duck side" of the things, we are still using inheritance on the "behavior" side of it.

So let's try to get rid of inheritance here too. Again, the inheritance was brought in as a tool to share a common behavior between all the classes that inherit from a common class. But behaviors are just functions, and, in the context of the strategy pattern, behaviors have nothing to do with what the (member or not) functions actually do internally with their implementations, but with how they can be called, which is determined solely by they signatures.

This, incidentally, answers another question: yes, for two or more algorithms to be interchangeable, they just need to be callable in the same way, i.e. they have to share the signature. And this has nothing to do with inheritance. Sure, if we confine ourselves to Java, maybe the only way for two functions to share the same call syntax is that they are member functions of classes implementing a common interface (is that the case?), but what if we change language?

Indeed, if all we have to do is plugging one of several functions, which all share the same signature, in a slot of our Duck class, then C++ does offer a tool, std::function, which need not be const (fundamentally a std::function can have a "disengaged" state, unlike a member of an abstract base class, which has to point to a concrete class and, as such, can't be other than a pointer/reference). Since it's not const its target can be changed at run time, giving the same flexibility as the Java solution, but with a much slimmer code.

A bonus advantage of using std::function to store behaviors is that we can throw at it also any lambda, making it easy to assign behavior at run time, with no need for any behavior class at all.

Below is a working example

#include <iostream>
#include <functional>

constexpr auto flyWithWings = []{
    std::cout << "flyWithWings" << std::endl;
};

constexpr auto flyNoWay = []{
    std::cout << "flyNoWay" << std::endl;
};

constexpr auto swimBehavior = []{
    std::cout << "we all swim the same" << std::endl;
};

struct Duck {
    std::function<void()> fly;
    std::function<void()> quack;
    std::function<void()> const swim; // const, to make it the same across all ducks
};

int main() {
    auto rubberDuck = Duck{
        flyNoWay,
        []{ std::cout << "squeak" << std::endl; },
        swimBehavior
    };
    auto mallardDuck = Duck{
        flyWithWings,
        []{ std::cout << "quack" << std::endl; },
        swimBehavior
    };

    rubberDuck.fly();
    mallardDuck.fly();
    // cut off their limbs (cf. Dead Space)
    mallardDuck.fly = []{ std::cout << "they cut my wings!" << std::endl; };
    mallardDuck.fly();
    //mallardDuck.swim = []{}; can't change it, ok.
}

A note about the use of std::function: I'm not advocating it, in general, because I'm convinced (by questions, answers, other blog's posts, and stuff that I've read) that it has important drawbacks (only performance-wise?). However, at least in C++, if you want a member (or even a non-member) variable to be able to store one lambda now, and a different lambda later, you do need type erasure, because the lambdas have all different types (even if you write the same lambda expression twice, you get two different types, a.k.a. static_assert(std::is_same_v<decltype([]{}), decltype([]{})>) fails!), and type erasure is a technique that std::function makes use of, and it's also the reason (one of several?) why it's not to be used with levity. Now whether you go wild and simply use void* or let other abstractions like std::function take care of type erasure, type erasure is there.

Conclusions

What I wanted to show with the above review, analysis, and alternative solution, is the following:

The so-called "Strategy Pattern" is often implemented using inheritance (in the broad sense of implementing an interface or extending a class);

the tool of inheritance (properly said in Java) is used to share common behavior between different classes: different ducks are objects instantiated for different classes, and they all have slots to plug in behaviors;
the tool of implementing an interface is used to force different classes to provide an implementation of one specific behavior;
In some way, inheritance (via implements or extends) is used to distribute different functions (e.g. the .fly() and .quack() member functions) or different objects (e.g. the flyBehavior and quackBehavior members) across distinct classes (Duck's derived classes and behavior's implementers respectively).

Another approach is the one I propose, where we abandon inheritance, and only rely on composition and the language tools (whether those use internally inheritance, it's not our concern, because that stuff would be hidden under a useful abstraction that we need not reimplement from scratch). With this approach

different ducks are just different objects instantiated from a common class, and they have slots to plug behavior;
the tool of std::function is used to force different function objects to be callable the same way.
In some way, composition and std::function are used to distribute different functions across different objects.

And inheritance hierarchies are notoriously a hard thing to get right. Why not abandoning it entirely in favour of language features that give us already baked abstractions?

Most importantly, and this is truly the conclusion, if you go back to the previous snippet of code, and hopefully agree that the whole "Strategy Pattern" is contained in these 5 lines,

struct Duck {
    std::function<void()> fly;
    std::function<void()> quack;
    std::function<void()> const swim;
};

would you say it deserved an extravagant name and a whole chapter of even just one book?

Bonus

The solution looks pretty much the same in the purely functional programming language Haskell, where the strategy pattern is enabled by the presence of first class functions and lambdas (words of a research engineer at Facebook)

data Duck = Duck {
  fly :: IO (),
  quack :: IO (),
  display :: IO ()
}

Without going too much into detail,

IO () has fundamentally the same role as void();
Haskell's datatype are immutable, so you can't truly change the behaviors stored in mallard;

but you can indeed create a new mallard' from mallard picking only some of the latter's behaviors and setting new values; that's done via pattern matching, but to limit the ability to pattern match, for instance to prevent one from creating a mallard' with a display other than mallard's, you can put Duck in a module, and expose only the interface that you want to expose:

-- Duck.hs
module Duck (makeDuck, fly, quack, display, setFly, setQuack) where

data Duck = Duck {
  fly :: IO (),
  quack :: IO (),
  display :: IO ()
}

-- Duck, the ctor is not expeorted, otherwise client code could
-- pattern match on a Duck internals; instead, we expose a make
-- function equivalent to it
makeDuck :: IO () -> IO () -> IO () -> Duck
makeDuck = Duck

setFly f' (Duck f q d) = Duck f' q d
setQuack f' (Duck f q d) = Duck f' q d
-- but not setDisplay

-- usage in another_file.hs
import Duck
mallard = makeDuck (print "(normal flying)") (print "quack") (print "I'm a mallard duck")

fly mallard      -- prints "(normal flying)"
quack mallard    -- prints "quack"
display mallard  -- prints "I'm a mallard duck"

mallard' = setFly (print "they cut my limbs!") mallard
fly mallard'     -- prints "they cut my limbs!"

24/04/2022

Big-O complexity is not all you need

This article is about the time I discovered, the hard way, that there's much more than big-O complexity in performance analysis: even if two algorithms have the same big-O complexity, the difference in how they perform can be tremendous, because of the literally huge multiplicative constant that can exist between them, as a consequence of hardware optimizations.

A simple problem — string equality

One of the simplest expressions that sooner or later all programmers write is

str == "some string"

and sometimes we don't even care much about the time complexity of such a comparison.

For instance, if this comparison is done in the context of command line parsing to recognize whether the user has selected an option, so whatever the complexity of that comparison is, we'll pay for it only once, or at least a small known number of times.

String equality — many times

However there are times when this comparison is run for all elements of a possibly long collection of strings

auto validStrings = strings | filter(not_fn(equals("<invalid>"s)));

If strings contains N strings, we'll pay the cost of string comparison N times, so we do want to know how much that cost is. And if we make those comparisons inside an algorithm which is not linear in the number of strings like filter, but, say, quadratic, then knowing the cost of the individual comparisons is even more important.

So, how much is the time complexity of string comparison?

Well, I told myself, the best case scenario is when the strings have different length, in which case the algorithm can return in constant time.

And then if their length is equal, then all characters have to be compared one by one for as long as it takes to find a character that is different between the two strings

Rushed conclusions

In summary, my conclusion was that

"hello" and "hello world" are compared in constant time because their lengths are different. (This is the best case scenario.)
"hello" and "hello" are compared in linear time, and since they are equal, it takes 6 comparisons to conclude that they are indeed equal, because 1 comparison is to check their lengths are equal and 5 is the number of characters that we need to compare. (This is the worst case scenario.)
"hello" and "help!" are also compared in linear time, but it takes just 5 comparisons to conclude so, because we can bale out as soon as the second 'l' in "hello" compares different from the 'p' in "help!". (This is an average scenario.)

A more complex problem — whitespace-trimmed string equality

Fast forward a few months, and I had to consider that, in the dummy example described above, the invalid strings could contain some leading and trailing whitespace, e.g. an invalid string could be "␣␣␣<invalid>␣␣", where I've used the ␣ symbol to mean a whitespace.

Forgetful of my conclusions above, with confidence I changed the code to something like this

auto validStrings = strings | filter(compose(not_fn(equals("<invalid>"s)), trim(isSpace)));

where compose(f, g) is a function that first runs trim(isSpace) on its input to remove any leading and trailing whitespace, and then runs not_fn(equals("<invalid>")) on the result.

In fact, the above code is still more complex than it needs; indeed I forgot that ranges::views::filter accepts a second argument that projects each element of the range before passing it to the filtering predicate, so the above can be streamlined into this:

auto validStrings = strings | filter(not_fn(equals("<invalid>"s)), trim(isSpace));

It's neat, isn't it?

And what about the complexity? Well, I thought, the trimming is a linear operation too, just like checking strings for equality, as all it does is comparing the leading characters one by one with the space character, increasing a counter at the same time; when the first non-space is met, the counter is basically the offset from which the string needs to be compared with "<invalid>" (the same holds, modulo a slightly different wording, for trailing whitespace).

The whitespace-trimmed equality comparison is much slower!

It turned out that the code above performed poorly with respect to the original one, and I initially had no idea why, because based on my analysis both functions passed to filter should run in linear time with the length of the string they are passed.

Here's why

Stepping through the code with the debugger is what eventually helped me figure out what my mistake was:

when comparing "␣␣␣hello␣␣" with itself via std::equal_to<>{}, a.k.a. the plain == operator, the function __builtin_memcmp is called once to compare the two strings, and the function __builtin_memchr is never called.
when comparing "␣␣␣hello␣␣" with itself via equal ^on^ trim(isSpace), a.k.a. the plain == operator called on the whitespace-trimmed strings, the function __builtin_memchr is called 7 times! Once for each space that is trimmed, plus once for the 'h' and once for 'o'(i.e. the first and last non-space characters); only after all these calls, does the compiler run a string comparison between what's left, the two "hello" substrings, with pretty much the same efficiency as the previous point.

So it is evident that the trimming part of the algorithm is what really costs, for the reason that it requires the processor to truly process the trimmed characters one at a time.

Fundamentally, what's wrong in the conclusion I had initially drawn is that I didn't consider at all that in the non-best case scenario, e.g. when comparing two equally long strings for equality, the comparisons between corresponding characters need not be run in any specific order; actually, they can be run at the same time. If the hardware is capable of comparing those 5 characters at the same time, it'll do, thus dividing the cost of the comparison roughly by 5.

On the contrary, trimming is an operation inherently directional, because you want to find the first character which is not equal to the characters you're discarding, so you must walk the characters from the first onward, and ask for each single one of them should I discard it?.

Conclusions

I've learned that Big-O complexity is maybe where one has to start to assess whether an algorithm is good and how much, but it is certainly not where one should stop.

As regards the fact that a processor can make, fundamentally, string equality in (what to me looks like) constant time, I don't know much about it, probably because I lack an academic background about this stuff (I'm an aerospace engineer).

For this reason, I decided to ask a question on cs.stackexchange; at the time I'm writing this post it's only been commented, not answered, but the comments mention a CPU implementation detail called "superscalarity".

06/12/2021

Bitten byScottato dalla SSO

I was working on a text processing tool that accepts text in a form as simple as

std::string input{"this is the text the algorithm is fed with"};

and does simple stuff like filtering and transforming text (e.g. removing vocals, converting tabs to spaces, …, whatever comes to mind), eventually showing the string in some GUI with colored overlays based on the outcome of those text operations.

Instead of performing these operations eagerly, by dealing with actual std::strings (which would mean that potentially a lot of std::string are constructed, copied, and moved around), the algorithm works with views, and returns a collection of views on the input string, without actually instantiating, copying, or moving any std::string. Making the simplifying assumption that the algorithm truly returns one single std::string_view, the code could look like this

std::string_view algo(std::string const& s) {
    // very complex algorithm: chops off first and last characters
    return std::string_view(s.c_str() + 1, s.size() - 1);
}

and be used like this

auto out = algo(input);

Now, in the application I was working on, the out object played the role of the model in the MVVM pattern, and it was not directly used in the GUI.

What was used in the GUI were two things:

the actual text contained in input in order to… well, show the text on screen,
and the viewmodel which, in practice, just consisted of out.length().

And how were these two things made available to the front end? By wrapping them both in a class like this

struct TextAndViewmodel {
    std::string s;
    int vm;
    TextAndViewmodel(std::string&& s, int vm)
        : s{std::move(s)}, vm{vm}
    {}
};

and that's were a silent bug was waiting for me! Can you spot it?

The bug is in that TextAndViewmodel holds the text by value (rather than via a std::unique_ptr or some other owning wrapper with pointer or reference semantics).

Because of that, when such an object was initialized, for instance via

TextAndViewmodel tv{std::move(input), out.length()};

the input was moved from under the std::string_view returned by the algorithm (out), which end up being dangling.

To be precise, this was not yet a bug: the std::string underlying out was gone forever, but out.length() was not affected, so the TextAndViewmodel object was just fine.

It became a bug as soon as, for other reasons, I had to include out itself into tv, and make use of it to decide whether vm should be used or not for each of the TextAndViewmodel objects (it was a kind of a last-minute filtering right before sending stuff to the GUI).

struct TextAndViewmodel {
    std::string s;
    std::string_view sv;
    TextAndViewmodel(std::string&& s, std::string_view vm)
        : s{std::move(s)}, sv{sv}
    {}
    // if the client needs what was `vm` in the
    // previous snippet, they can go for `sv.length()`
};

At that point, indeed, I started noticing some weird things going on in the GUI:

the text was ok, because input was moved into tv,
and for "normally sized" text files everything was fine in all respects,
but whenever I tried with a few characters long files, the GUI behaved as the filtering I mentioned above never happened, even if it should.

After spending some time (I really can't unveil how much) debugging and banging my head on the monitor, I understood what was going on: SSO was biting me. Actually, the other way around: SSO is the only reason why I could see the bug, otherwise it could have lingered there, silent for a long while, waiting for somebody other than me, maybe a customer, to bite.

So what was going on?

When input was long enough and SSO wasn't kicking in, the char const* inside it was left intact, as the "movement" of resources simply consisted in the pointer held by input was copied into the one held by tv.s, and then set to nullptr (yes, std::move doesn't move anything).

The consequence is that out was "legally" invalid, because input was std::moved from under it, but "romantically" (aka SSO not kicking in) still looking onto the correct area of memory.

On the other hand, when SSO kicked in, then the actual chars stored in the "special short-string compartment" in input were truly copied, one by one, into the corresponding member of tv.s, and then freed, which for me resulted in input being left full of garbage characters and, in turn, the filtering I mentioned not happening at all.

In the process of debugging, I wrote this little code to print the string length for which the SSO is disabled (which is an implementation detail, by the way):

#include <iostream>
#include <string>

int main()
{
    std::string str1{""};
    bool sso{true};
    while (sso) {
        str1 += "x";
        char const* ptr = str1.c_str();
        std::string str2{std::move(str1)};
        sso = (void*)ptr != (void*)str2.c_str();
        std::cout << (void*)ptr << std::endl;
        std::cout << (void*)str2.c_str() << std::endl;
        str1 = str2;
    }
    std::cout << str1.length() << std::endl;
}

20/11/2021

The importance of Item 26 from Effective Modern C++

Foreword

Some time ago I tried to partially apply std::visit to a visitor, the reason being that I had to visit several std::variants with the same visitor (in different places of the code, so passing all the std::variants at the same time to std::visit was not viable).

Sure, I could quickly write a lambda to do the job:

auto visitWith = [](auto const& visitor){
  return [](auto const&... variants){
    return std::visit(visitor, variants...);
  };
};

but why should I, if there's already the Boost.Hana library out there that offers hana::partial? However, to do so, I had the problem that std::visit is a template function, so it can't be passed around just like if it was an object. Easy peasy, I wrapped it in a lambda:

auto visit = [](auto const& visitor, auto const&... variants){
  return std::visit(visitor, variants...);
};

Yeah, I hear you: if I'm happy with this lambda, why should the previous one bother me?

Well, that's true, but the point is that stubbornly trying to use hana::partial is what made me understand, once more, that Scott Meyers' book is a holy bible!

The problem

Wrapping std::visit in the visit object, however, wasn't enough to be able to partially apply visit to a visitor; not in the naive code I wrote,

auto visitor = hana::overload(
  [](auto arg)               { /* whatever */ },
  [](double arg)             { /* whatever */ },
  [](const std::string& arg) { /* whatever */ });

auto visitWithVis = hana::partial(visit, visitor);

which was failing to compile, with a pretty confusing error, the key part of which was (in hindsight!)

error: no matching function for call to ‘hana::overload_t<main()::<lambda(double)>, main()::<lambda(const string&)> >::overload_t()’

Therefore I asked a question on StackOverflow but nobody answered (for 5 months!), so today I decided to spend some time myself to find out the answer.

The investigation

Eventually, I was able to set up a very minimal example to reproduce the problem:

#include <boost/hana/functional/overload.hpp>
auto ovl{hana::overload(l1, l2, l3)};
using Ovl = decltype(ovl);
Ovl ovl2{ovl}; // this triggers the compilation error

From the comment you understand that there's something wrong with copying the ovl object. As suggested in a comment, however, making ovl a const object solves the problem, which gave me some food for thoughts.

By peeping for some time into hana::partial and hana::overload I could later make up a less minimal, but more meaningful example:

#include <boost/hana/functional/overload.hpp>

auto l1 = [](int){};    using L1 = decltype(l1);
auto l2 = [](double){}; using L2 = decltype(l2);
auto l3 = [](float){};  using L3 = decltype(l3);

using Ovl = hana::overload_t<L1, L2, L3>

Ovl ovl{l1,l2,l3};

Ovl ovl2{ovl}; // this triggers the compilation error

where hana::overload_t is the (template) class of the object returned by hana::overload.

To understand what's going on, I had to look once more, and for more than just a few minutes, into the implementation of hana::overload_t, and to play a bit the role of the compiler, by "manually instantiating" it:

struct overload_t<L1, L2, L3>
    : L1
    , overload_t<L2, L3>
{
    using type = overload_t;
    using L1::operator();
    using overload_t<L2, L3>::operator();

    template <typename F_, typename ...G_>
    constexpr explicit overload_t(F_&& f, G_&& ...g)
        : L1(static_cast<F_&&>(f))
        , overload_t<L2, L3>(static_cast<G_&&>(g)...) // error points here
    { }
};

The solution

It is then clear what's going on: the code Ovl ovl2{ovl};, which should make use of the copy constructor, is using instead the template member function above, with F_ = overload_t<L1, L2, L3>& and G_ = {} (i.e. G_ is an empty pack), which in turn causes an attempt to initialize overload_t<L2, L3>(), which is erroneous and nicely maps to the key part of the error I copied above.

But why does the compiler try to use the template constructor above, instead of the compiler-generated copy constructor (the generation of which is not suppressed, as per Item 17 from the same book)?

The answer is at the end of Item 26 from Scott Meyers' Effective Modern C++:

cloneOfP is being initialized with a non-const lvalue (p), and that means that the templetized constructor can be instantiated to take a non-const lvalue of type Person

[…]

Calling the copy constructor would require adding const […], but calling the instantiated template requires no such addition. […] the template is thus a better match

(In our case cloneOfP corresponds to ovl2, p to ovl, and Person to overload_t<L1, L2, L3>.)

Thanks again, Scott, always!

4/1/2022

CSS-only radar diagram Diagramma a radar con solo CSS

I still remember I was flying back or away from home when I started thinking about this webpage with these thoughts coming in sequence

Oh, I really enjoyed CSS in Depth,
I think I could really leverage that to make a good blog;
I should probably start by doing what I initially planned to do: putting my CV online…
… with my working experience and all my skills…
Yeah, especially all my programming skills!
A diagram with the skills would be nice…
I should try doing it without any JavaScript!
With CSS I can certainly do it!

Well, the reality was not that smooth though. As soon as I had put my feet literally back on earth and played around a bit, I realized that there's a fundamental problem with making CSS-only plots:

ideally, the data, e.g. the x and y coordinates of a piecewise linear plot, should be contained in the HTML markup, they are part of the content of the page, having nothing to do with style (CSS), let alone dynamics (JavaScript);
it is reasonable to think that each pair (x, y) should belong to one HTML element representing a "point" of the curve;
then CSS should take care of "drawing" lines, curves, colors, and what not.

So how should that ideal HTML look like? Probably something along these lines

<div class="plot">
  <!-- 6 points of the y = x² parabola -->
  <div class="point" style="--x: 0; --y: 0;"></div>
  <div class="point" style="--x: 1; --y: 1;"></div>
  <div class="point" style="--x: 2; --y: 4;"></div>
  <div class="point" style="--x: 3; --y: 9;"></div>
  <div class="point" style="--x: 4; --y: 16;"></div>
  <div class="point" style="--x: 5; --y: 25;"></div>
</div>

where the x and y coordinates are in the style attribute so that CSS can pick it up easily.

But now the point is the following:

How the heck do I draw anything, say a line, via CSS according to quantities specific to two separate elements?

Well, the answer is simply that I cannot, the reason being that, in order to draw lines or more complex stuff via CSS, one has to leverage things like borders (to draw lines), background-colors (to draw areas), and so on. But those things are specific to one element, and cannot bridge between two elements.

Regardless of how reasonable my self-argument sounded, I couldn't really just think I had the correct answer already, and so I wanted to search a bit online:

First, searching online for "CSS plot" et similia I generally found stuff containing .js written here or there, sooner or later down the page, which already made me feel like I could be right.
Second, I found this article from CSS Tricks, which links to Chart.js in the first few lines, and later shows a solution that features, for each HTML element representing a point, not just the --x and --y CSS properties, but also properties --hypotenuse and --angle in a nested HTML element (peep into the code and you'll see). If you tamper with those, you'll see the lines detaching from the points; in other words, the lines are not bound to the points, and you have to recompute the values of --hypotenuse and --angle by hand if you change --x and --y.
Finally, I found Chart.css a CSS data visualization framework that claims
Charts.css is a modern CSS framework. It uses CSS utility classes to style HTML elements as charts.
(By the way, the very chart I was looking for, the radar chart, is under development while I'm writing these notes.)

The last result was kind of illuminating. Indeed, if you look at the documentation page of the maybe simplest example, the line, you'll see that it states the following:

When you add data, you need to supply not only the data --size variable but also the --start variable that indicates the starting point.

Rephrasing it (and you can verify it by playing around with the code), each HTML element that represents a point in the graph must provide data to the CSS not just for that point, but also for the point before it. And this is in the simple scenario of piecewise linear curve; if you think about something more complex, there might be need for one HTML element to carry data about more neighboring points.

Given this (bad) news and going back to the HTML snippet above, we could rewrite it like this (where the trailing p of --xp and --yp stands for previous),

<div class="plot">
  <!-- 5 segments (between the 6 points) of the y = x² parabola -->
  <div class="point" style="--xp: 0; --yp: 0; --x: 1; --y: 1;"></div>
  <div class="point" style="--xp: 1; --yp: 1; --x: 2; --y: 4;"></div>
  <div class="point" style="--xp: 2; --yp: 4; --x: 3; --y: 9;"></div>
  <div class="point" style="--xp: 3; --yp: 9; --x: 4; --y: 16;"></div>
  <div class="point" style="--xp: 4; --yp: 16;--x: 5; --y: 25;"></div>
</div>

giving a chance to our CSS to pick the coordinates of the two points between which line has to be drawn.

As regards how that line is drawn, I don't think there's one single solution, and to be honest, I haven't bothered making up some random plots; I targeted full steam ahead the radar plot I wanted.

Here are the few points on which that chart is based (I'm referring to the data content of the graph, and not the meta content, aka the grid):

a radar diagram looks like a polygon, with as many vertices (and sides) as the number of quantities we want to show, say N, so we need to provide that N via a CSS property, say --N that we can encode in the HTML in the root parent of the chart;
that polygon is made up of N triangles;
all triangles have one vertex in common, the one at the center of the chart (so let's call that vertex the center), and the angle insisting on that vertex is the same for all triangles, being it equal to 350° divided by N; position, left, and right CSS properties can be used to achieve the translational positioning, whereas transform and its rotate value can be used to achieve the angular positioning;
for each triangle, the lengths of the two sides starting from the center are exactly the "current" quantity and the "previous" quantity, so we need to make those quantities available to the CSS, as I've already explained and shown in the previous HTML snippet;
any triangle can be thought of as half of an appropriate parallelogram, the half on one side of the appropriate diagonal, and any parallelogram can be thought of as a square subject to anisotropic scaling (that makes it a rectangle) and shear (that makes it "skewed"), so we can make use of the background-image CSS property and setting it to something along the lines of linear-gradient(45deg, black 50%, transparent 50%); to have a black orthogonal triangle, and the transform property with its skew and scale values.

That's mostly it, but the difficult part is actually writing CSS code such that it is reusable. Doing so goes through making appropriate use of CSS custom properties.

Thoroughly inspecting my Radar plot here, both HTML-wise and CSS-wise (via dev tools or by peeping into the repo on GitHub), together with the associated KSS documentation, should be enough to understand how the various bricks fit together. But it's not the easiest of the things…

A few external references:

the general idea about how to draw triangles came from eCCSplorations
the idea about including the "previous point" in the HTML, as said, came from Chart.css
for trigonometric computations, the built on this gist from stereokai
all the rest comes from the already mentioned CSS in Depth

However, here is an example of a plot with 12 skills (in hindsight, I should probably put some effort in drawing as many diameters in the canvas as the are skills on the plot, but that's easier said than done, considering that each of those lines is done with one dedicated linear-gradient in the CSS):

…

Notice that the markup is essentially very simple, as each entry of the plot has 4 classes and 3 CSS custom properties; here are the first two entries:

<div class="trigonometry triangle center-origin--after skill" style="--i: 0; --this-skill: 5; --next-skill: 3"></div>
<div class="trigonometry triangle center-origin--after skill" style="--i: 1; --this-skill: 3; --next-skill: 4"></div>
<div class="trigonometry triangle center-origin--after skill" style="--i: 2; --this-skill: 4; --next-skill: 1"></div>
<div class="trigonometry triangle center-origin--after skill" style="--i: 3; --this-skill: 1; --next-skill: 3"></div>

The markup is just a bit more complicated, with two more nested elements, when we want to add some "labels" to each entry; here's the entry labelled as B:

<div class="trigonometry triangle center-origin--after skill" style="--i: 9; --this-skill: 2; --next-skill: 3">
  <div class="polar-diagram-icon polar-position">
    <div>B</div>
  </div>
</div>

It is understood that using JavaScript we could dynamically set --next-skill instead of hardcoding it in the HTML, which clutters it with content which is not semantically relevant.

A few external references:

the general idea about how to draw triangles came from eCCSplorations
the idea about including the "previous point" in the HTML, as said, came from Chart.css
for trigonometric computations, the built on this gist from stereokai
all the rest comes from the already mentioned CSS in Depth

21/11/2021

This blog Questo blog

Before starting the adventure of setting up a personal blog, I had never really dedicated myself to a personal programming project that needs such an amount of work.

In the Beginning, There Was Nothing. The Keith J. Grant Wrote CSS in Depth.

…and that's how I truly considered that I could write a blog myself.

To be precise, it all started with the slightly less pretentious aim of putting my CV online. Here's the first sketch

But as soon as I started to feel comfortable with CSS and HTML, I also started to target somewhat higher… Why not writing a blog?

Therefore I had to approach JavaScript as well, including meeting JQuery.

Eventually, I asked several question around the topic of web development.

That's how it all started.

Well, for now nobody knows of it, but... you never know, maybe somebody will start reading it…, and maybe find it useful too, if they want to start a similar adventure themselves!

01/09/2021

Painless window navigation in Vim Navigazione facile tra le finestre di Vim

Today my Vim plugin for easy window navigation, which I named WinZoZ, has got its first star! Given this comment on StackOverflow, the star is from the user @yolenoyer.

With this plugin you can hit <leader>w to enter a sort of "CTRL-W mode", during which each key pressed will be assumed preceded by CTRL-W.

This can be very helpful, imho, if you're the kind of Vim user who works with a big monitor and splits the terminal several times both horizontally and vertically.

Indeed, if this is the case, chances are good that you hit CTRL-W a lot of times to move from one window to another. (Search for :help CTRL-W to see how many commands start by CTRL-W.)

Below is a GIF demonstrating the use of WinZoZ:

Oggi il mio plugin Vim per agevolare la navigazione tra le finestre, che ho chiamato WinZoZ, ha guadagnato la sua prima stella! A giudicare da questo commento su StackOverflow, la stella è dall'utente @yolenoyer.

Con questo plugin puoi premere <leader>w per attivare una specie di "modo CTRL-W", durante il quale ogni tasto premuto verrà interpretato come preceduto da CTRL-W.

Questo può essere molto utile, secondo me se sei il tipo di utente Vim che lavora con un grande monitor e divide il terminale molte volte sia orizzontalmente che verticalmente.

In questo caso, infatti, ci sono buone probabilità che tu sia costretto a premere CTRL-W tantissime volte per muoversi da una finestra all'altra. (Cerca :help CTRL-W per vedere quanti comandi iniziano con CTRL-W.)

In basso c'è una GIF dimostrativa dell'utilizzo di WinZoZ:

06/01/2022

Clean Code (?) - part 1

I didn't like that book.

…

(OK, maybe I should be more verbose than that.)

First of all, it's full of just Java code. With such a premise, I can't hope that the author expresses as general ideas as the title would imply, and indeed he doesn't; what he does instead, is trying to give a recipe made up of dogmatic rules for writing presumably clean Java code. And given that languages evolve (I guess Java does too, doesn't it? I don't know Java, unfortunately), that old book is probably not applicable anymore.

Secondly, the book does have something good: I do agree with several statements across the book, and those statements are often engraved in section titles…

… but as long as examples are concerned, I think they are flat out wrong: they mislead, and generally go against what the titles express, at least based on the few chapters I could read before giving up.

And yes, this is gonna be a rant. (And it's not just my rant: It's probably time to stop recommending Clean Code.)

Take Chapter 2, Meaningful Names.

After an Introduction that is as non-controversial as it is a tautology, a section named Use Intention-Revealing Names gives an example of a badly named variable

int d; // elapsed time in days
The name d reveals nothing.

followed by examples of presumably better names for variables representing time

We should choose a name that specifies what is being measured and the unit of that measurement:
int elapsedTimeInDays;
int daysSinceCreation;
int daysSinceModification;
int fileAgeInDays;

Now, I do understand and appreciate the suggestion of giving a name that specifies what is being measured; after all, if I don't know what a variable is, how can I make use of it or understand how the surrounding code uses it?

On the other hand, I can't help notice that all those variables are ints and that their names contain the word days.

What does that make me think? That daysSinceCreation is sitting there just waiting to be subtracted to secondsSinceTheEpoch, and the difference to be assigned to dayOfCreation, resulting in a wonderfully cinematographic temporal paradox that features us travelling to a very far future to create a new file that has always existed… (I hope the joke is not too obscure… the bold text should help)

So going back to the suggestion, my question is: why should I encode the unit of measure in the name of the variable? The age of a file is just time, whether I express it in days or in seconds. The fileAge part of the name fileAgeInDays is semantically relevant, as it tells me what that variable represents in the piece of code I'm reading. But the InDays part is just an implementation detail.

I hear somebody saying once I have decided that I use an int to represent time, I have to encode in the variable names what is the unit of measure of that time, otherwise I risk adding apples and oranges. My answer is that this approach is flawed since the beginning: if you are worried about an int representing a time in days being added to an int representing time in years, that means that int is just not the right type for you, and that you should use a type that handles the unit of measurement automatically.

(To know more about this, I suggest reading the series on strong types by Jonathan Boccara, which I find excellent.)

I hear somebody else saying come on, it was just an example!, and my answer is

this is a book about about writing clean code, so it should have been damn careful when giving examples
tell me what the cost of writing the above snippet like this would have been:
```
Time elapsedTime;
Time timeSinceCreation; // a better name is probably just "age"
Time timeSinceModification;
Time fileAge;
```
Time just screams no worries, I'm the one handling the units. (No, starting a variable name with time in this case is not duplicating information already contained in the type, but is disambiguating: using a standalone sinceCreation we wouldn't know if that is a time or a number of events, just to say; fileAge encodes the "time-ness" in the Age part, so we don't need time in its name.)

Then Martin claims that it's hard to tell what the following code is doing

public List<int[]> getThem() {
  List<int[]> list1 = new ArrayList<int[]>();
  for (int[] x : theList)
  if (x[0] == 4)
    list1.add(x);
  return list1;
}

which I can't disagree with, but I feel there are two candidates for why the code is not clear:

the names of the variables are badly chosen
the code mixes two levels of abstraction and hard codes stuff that should be named appropriately and taken as input

Uncle Bob obviously picks number 1 and goes full steam on his crusade, but we disagree even earlier than that, when he writes There aren't even any fancy classes or polymorphic methods, which for me just means that somebody is easily scared by generic code.

Anyway, he eventually shows us his shiny code

public List<Cell> getFlaggedCells() {
  List<Cell> flaggedCells = new ArrayList<Cell>();
  for (Cell cell : gameBoard)
    if (cell.isFlagged())
      flaggedCells.add(cell);
  return flaggedCells;
}
With these simple name changes, it's not difficult to understand what's going on. This is the power of choosing good names.

Incidentally, where does gameBoard come from? It's a global variable, which means getFlaggedCells() function has side effects: it will return different output without you changing the inputs you pass to it (yeah, it doesn't take inputs, which is a symptom of impurity). A few pages later Uncle Bob will suggest that code should not have side effect. What a undecided man…

I don't know how understandable that code is if you don't parse it, just as you were the compiler (or the JVM, or whatever it's needed to run that crap), but I know that Uncle Bob will write the following code somewhere else (modulo I-don't-know-Java-sorry)

public Set<Book> getProgrammingBooks() {
  Set<Book> programmingBooks = new ArraySet<Book>();
  for (Book book : libraryShelf)
    if (book.isOnProgramming())
      programmingBooks.add(book);
  return programmingBooks;
}

And guess what's the difference? None! The two codes are just doing the same thing: filtering a container according to a predicate (a.k.a. I've just renamed the variables/types/names to make the latter snippet from the former).

So why should we write those two functions, instead of writing just one generic function that we use in the context we need? Again, I don't know any Java, but I guess this pseudo-Java conveys what I mean:

public Container<Item> filter(Container<Item> container, Pred pred) {
  Container<Item> newCont = new ArrayContainer<Item>();
  for (Item item : container)
    if (pred(item))
      newCont.add(item);
  return newCont;
}

Notice that this piece of code above is closer to the initial version (of which Uncle Bob says that telling what it does it's not easy) than it is to the questionably refined one:

The original version
- used specific types, such as List, ArrayList, and int,
- hard-coded the predicate,
- depended on a global variable,
- but at least used names that were not bound to the specific domain of the caller (even though they uselessly hard-code informations which is already encoded in the type, e.g. why calling a List theList?);
uncle Bob's final version
- kept using specific types, such as List, ArrayList, Cell,
- kept hard-coding the predicate,
- kept depending on a global variable,
- used the caller's domain-specific names for the variables;
my version
- well, by Container I mean a generic type, but I don't know how to do that in Java,
- takes the predicate as an input argument,
- takes the input collection as an input argument,
- uses generic names that don't commit to any specific domain the caller could choose to use it for.

And if the C++ programmer that I am were to kick in, then I would say that if your language doesn't have a library function for filtering a container, then you'd better change the language.

Here's a C++11 solution, which uses the erase-remove idiom:

template<typename Container, typename Predicate>
Container filter(Container c, Predicate p) {
    c.erase(std::remove_if(c.begin(), c.end(), p), c.end());
    return c;
}

and finally here a C++17,

#include <range/v3/view/filter.hpp>
using ranges::views::filter;

and here a C++20 solution,

#include <ranges>
using std::ranges::views::filter;

What's that?! Just an #include and a using? Yeah, because in both cases the function for filtering is just there, in the appropriate header and in the appropriate namespace, and you can easily use it

auto w = filter(v, pred);  // these two
auto w = v | filter(pred); // are equivalent

Going back to my claim that Martin's code mixes two levels of abstraction, now it should be clearer: he prompts us to name every aspect that is inside the body of that function according to the use that the caller makes of it, which clearly depends on the domain of the caller: in his example a board game, in my counter example the old library down the road. In some way, the specific use-case that the caller makes of this function, which is at a given level of abstraction, has leaked into the function, the content of which, says Martin, should be one abstraction level lower than its call site (exact quote is Notice that the three steps of the function are one level of abstraction below the stated name of the function, at the top of page 36). This is clearly in contrast with page 33, first paragraph, where he (righteously) claims that mixing too many levels of abstractions in a function is a bad thing.

But the function's implementation screams I'm not so dumb, I can do much more!, and naming stuff as Martin suggests just makes us deaf to that call.

Clearly, another thing that deafens us is designing the function such that it takes no arguments (another dogmatic, unjustified, and flat out wrong suggestion that Martin dispenses in the next chapter, Chapter 3), implying that it writes or at least reads from some global state. Because of that, we could erroneously think:

I can filter "things"; if my function takes no "things" in input, how can it be a filter?
Filtering is done according to a predicate, but my function takes no predicate.

Therefore, blindly following Martin's prescriptions will indirectly lead us to make a lot of code duplication, because the names chosen his way will prevent us from seeing common patterns. This is clearly in contrast with what he writes at page 32, first paragraph, where he (righteously) refers code duplication as a bad thing.

Then the crusade goes on in a (slightly) less controversial way:

Avoid Disinformation

[…]

A truly awful example of disinformative names would be the use of lower-case L or uppercase O as variable names, especially in combination. The problem, of course, is that they look almost entirely like the constants one and zero, respectively.
int a = l;
if ( O == l )
  l = O1;
else
  a = 01;

I can't disagree, in general, but I'd say that the example is again silly: the IDE will show O1 and 01 in different colors, because the former is a variable name and the latter a numeric literal. What if my IDE doesn't? Then why are you still using it?

Oh, I appreciate a footnote related to the use of a and the prefixes for local variables and function arguments (~~arguments~~ parameters, Uncle Bob):

Uncle Bob used to do this in C++ but has given up the practice because modern IDEs make it unnecessary.

Going forward, about Make Meaningful Distinctions,

Number-series naming (a1, a2, .. aN) is the opposite of intentional naming. Such names are not disinformative—they are noninformative; they provide no clue to the author's intention.

… except if I'm coding a mathematical function that traditionally uses that well known "number-series naming" convention.

And then:

getActiveAccount();
getActiveAccounts();
getActiveAccountInfo();
How are the programmers in this project supposed to know which of these functions to call?

Now jump at page 51, Listing 3-7 (Chapter 3), and you'll find two member functions named includeSetupPages() and includeSetupPage() respectively. It is true that the former is public whereas the latter is private, so a user of that class won't have a chance to be in doubt of which one they have to call. But the current chapter is demonizing the coexistence of a singular with its plural for the sake of it.

Anyway, while I might agree that getActiveAccountInfo vs getActiveAccount can (and maybe is) confusing, why should getActiveAccount vs getActiveAccounts be confusing at all? It's clear that the former gives one account and the latter some collection of accounts. What's the problem with that? I mean, is the following not clear?

for (Item item : items)
    // do something with this item

Then the crusade goes on and targets single-letter names, luckily admitting that they can be acceptable in short contexts (the claim that the length of a name should correspond to the size of its scope is simply hilarious, though).

Once more into madness with this code,

for (int j=0; j<34; j++) {
  s += (t[j]*4)/5;
}

which is claimed to be unclear, versus this,

int realDaysPerIdealDay = 4;
const int WORK_DAYS_PER_WEEK = 5;
int sum = 0;
for (int j=0; j < NUMBER_OF_TASKS; j++) {
  int realTaskDays = taskEstimate[j] * realDaysPerIdealDay;
  int realTaskWeeks = (realdays / WORK_DAYS_PER_WEEK);
  sum += realTaskWeeks;
}

which, in my opinion, just hides the fact that the we have just a trivial application of transform/map followed by a reduce/accumulate (unfortunately not all languages agree on how these general purpose utilities are called, sigh).

Indeed, in JavaScript, you'd write the above as

taskEstimate.map(x => x*4/5).reduce((x, y) => x + y)

or even more clearly like

_.sum(_.map(taskEstimate, _.multiply(4/5))) // lodash
_.sum(_.map(_.multiply(4/5), taskEstimate)) // lodash/fp

if you use Lodash/lodash/fp.

But it was just an example! Good, thanks. And what do I get from a book on programming which has silly examples? Philosophy?

As regards using searchable names, I agree, but if you are in need of searching the codebase for the running index of a for loop on an array, then either you have no idea what debugging is, or you're dealing with much bigger problem with the surrounding code, so finding that index won't really help you much. After all, if I have a for loop running on the indices of a vector/array (why am I doing this in 2022, by the way?), then why should I name that index other than i or j? It is just a number to loop on the array/vector, it is an implementation detail.

Good news is that Martin suggests that we shouldn't use m_ prefixes for member variables anymore. Praise the lord!

Here's another couple of suggestions which are truly Java-specific:

Classes and objects should have noun or noun phrase names
Method Names Methods should have verb or verb phrase names

Sure, and what if a class exists for the sole purpose of providing a method (especially in a language like Java, where everything is a class)? What do I call it then? ClassForMethodMethodName? The point is that objects and functions are just the same thing, artificially made different by several (most?) languages; establishing different naming rules is equivalent to encoding the weakness of the language in code. Take std::equal_to in C++: it is a class, but it has a call operator, and we are just meant to use an object of that class, std::equal_to<>{}, as it was a function.

Eventually, there's one point I strongly agree with, in the Final Words of this chapter (my emphasis):

The hardest thing about choosing good names is that it requires good descriptive skills and a shared cultural background. This is a teaching issue rather than a technical, business, or management issue.

which prompts me to ask a question:

Should I write code as if my grandma was to read it? Or should you be willing to learn continuously during your life of programmer, thus lifting your cultural background up with time?

… and this was most I had to say about Chapter 2.

30/12/2021

Clojure Programming

In this post, I'm collecting my thoughts on the Clojure Programming book published by O'Reilly.

At page 165, in the middle of the section named Promises, the authors claim that

An immediately practical application of promises is in easily making callback-based APIs synchronous.

and then propose this asynchronous function,

(defn call-service
  [arg1 arg2 callback-fn]
  ; ... do stuff
  (future (callback-fn (+ arg1 arg2) (- arg1 arg2))))

this synchronous wrapper,

(defn sync-fn
  [async-fn]
  (fn [& args]
    (let [result (promise)]
      (apply async-fn (conj (vec args) #(deliver result %&)))
      @result)))

and this use case,

((sync-fn call-service) 10 7) ; (17 3)

What puzzled me of this solution, is the fact that sync-fn doesn't really preserve call-service's API: we can't even pass a callback function to call-service anymore, because the callback is encoded in sync-fn itself, specifically in the function literal (aka lambda) #(deliver result %&) which entangles together two businesses

executing a pure function on the inputs (the two inputs that call-service passes to callback-fn);
populating the result promise via the non-pure deliver utility.

Indeed, #(deliver result %&) is a variadic function that wraps all its arguments into a sequence, via %&, and then passes that sequence as the second argument to deliver, so we can rewrite it as the composition of those two businesses

; equivalent to #(deliver result %&)
(comp (partial deliver result) list)

It is apparent, at this point, that list is just one possible function that we want call-service to use on callback-fn's arguments, so there's no reason for hard-coding it into sync-fn.

In fact, we can make it an input to sync-fn:

(defn sync-fn
  [async-fn callback-fn]
  (fn [& args]
    (let [result (promise)]
      (apply async-fn (conj (vec args)
                            (comp (partial deliver result) callback-fn)))
      @result)))

which can then be used like this

((sync-fn call-service list)          10 7) ; (17 3)
((sync-fn call-service vector)        10 7) ; [17 3]
((sync-fn call-service #(apply + %&)) 10 7) ; 20
((sync-fn call-service #(apply * %&)) 10 7) ; 51

where we pass, as sync-fn's second argument, the same lambda we would pass to call-service as a third argument.

This way, sync-fn doesn't even need any change in the face of changes in what call-service expects the callback to look like (even though the fact that call-service expects the callback function as the last argument is still hard-coded in sync-fn). For instance, if call-service expected a callback-fn that takes a list of arguments rather than distinct arguments,

(defn call-service-2
  [arg1 arg2 callback-fn]
  ; ... do stuff
  (future (callback-fn (list (+ arg1 arg2) (- arg1 arg2)))))

then we'd only need to adapt the callback lambdas by making them accept one argument instead of many:

((sync-fn call-service-2 identity)          10 7) ; (17 3)
((sync-fn call-service-2 vec)               10 7) ; [17 3]
((sync-fn call-service-2 (partial apply +)) 10 7) ; 20
((sync-fn call-service-2 (partial apply *)) 10 7) ; 51

Programming skills

Conoscenze di programmazione

Self-assessment Autovalutazione

Stack Overflow

06/2020—present

C++ Software Engineer Sviluppatore C++

MathWorks, Cambridge (UKRegno Unito)

Design, development, and testing of text comparison back-end tools and algorithms.

Design, sviluppo, e test di algoritmi e strumenti di back-end per confronto di testi

Expertise:

Esperienza con:

C++17
template meta-programming
SFINAE
concepts
Boost, Boost.Hana, Range-v3
functional programming
programmazione funzionale
Google Benchmark
Google Test

04/2019—05/2020

Application Support Engineer Supporto tecnico

MathWorks, Cambridge (UKRegno Unito)

Main roles:

Phone/e-mail tech support
training assistant at JLR
MATLAB Expo
lead computer science weekly meetings
coaching new hires
technical interviews

Development C++ projects:

Implementation of Kuo&Cross longest common subsequence algorithm
Transition 3p library → internal library for HTTP requests

Compiti principali:

Supporto tecnico telefonico e tramite e-mail
Assistente trainer presso Jaguar-Land Rover
MATLAB Expo
Direzione di incontri settimanali su temi di computer science
Training di nuovi assunti
Interviste tecniche per assunzione

Progetti di sviluppo in C++:

Scrittura di algoritmo di Kuo&Cross per il calcolo della longest common subsequence
Transizione da libreria esterna a libraria interna per richieste HTTP

2015—2018

Ph.D. in Aerospace EngineeringIngegneria Aerospaziale

Federico II, Naples (Italy)Federico II, Napoli (Italia)

Thesis Lavoro di tesi

Development of a 3D fluid-dynamic solver for turbulent flows from scratch Sviluppo da zero di un solutore fluidodinamico tridimensionale per flussi turbulent

Object-Oriented Fortran 2003/2008 + MPI 3.1-parallel Fortran 2003/2008 orientato agli oggetti, parallelizzato tramite MPI 3.1
employed MPI three-dimensional domain decomposition Utilizzo di decomposizione tridimensionale del dominio

2012—2015

Master degree in Aerospace EngineeringLaurea specialistica in Ingegneria Aerospaziale

Federico II, Naples (Italy)Federico II, Napoli (Italia)

Final gradeVoto finale: 110/110 cum laudee lode

Thesis titleTitolo della tesi: Parallel implementation of compact schemes for Navier-Stokes simulations

Description: Compact finite difference numerical schemes are implicit in space, thus requiring the solution of linear systems in order to be used. Parallelization opportunities in applying such schemes have been investigated from the algorithmic point of view and tested on 2D benchmarks.
Contribution: An original method has been developed which uses explicit estimates of derivatives to decouple the algebraic systems.
Results: The method shows dissipation properties to be reduced by 75% with respect to more standard approaches.

Favourite courses:Corsi che ho preferito:

Multimedia signal processing
Control theory
Complements of gas dynamics (e.g. shockwave structure)

2008—2011

Bachelor degree in Aerospace EngineeringLaurea triennale in Ingegneria Aerospaziale

Federico II, Naples (Italy)Federico II, Napoli (Italia)

Final gradeVoto finale: 110/110 cum laudee lode

Thesis titleTitolo della tesi: Convective heat transfer measurements in transitional circular and chevron jets

Description: Study of "cooling" capabilities of Chevron impinging jets Descrizione: Studio della capacità di raffreddamento di un getto Chevron impingente
Contribution: Experimental campaign consisted in heat transfer measurements by means of a thermographic camera Contributo: Campagna sperimentale consistita in misurazioni di scambio termico tramite camera termografica

Favourite courses:Corsi che ho preferito:

Numerical methods (FD, FV, FEM) Metodi numerici (DF, VF, MEF)
Aerodynamics Aerodinamica
Thermodynamics Termodinamica
Gasdynamics Gasdinamica

2003—2008

High school diplomaDiploma di scuola superiore

Liceo "Colombo", Marigliano (Naples, Italy)Liceo "Colombo", Marigliano (Napoli, Italia)

Final gradeVoto finale: 100/100

AwardsRiconoscimenti:

Bronze medal at the National stage of "Olimpiadi della Matematica 2008" (90th classified out of 300 participants) Medaglia di bronzo alla fase Nazionale delle "Olimpiadi della Matematica 2008" (90esimo classificato su 300 partecipanti)
2nd classified at Provincial stage of "Olimpiadi della Fisica 2008" (among ~100 participants) 2° classificato alla fase Provinciale delle "Olimpiadi della Fisica 2008" (tra ~100 partecipanti)
Top 5 classification every year at the institute competition to select participants for Tra i primi 5 classificati ogni anno alle gare d'istituto per la selezione dei partecipanti alle
- Olimpiadi della Matematica
- Olimpiadi della Fisica
- Olimpiadi della Chimica