Three things that aren’t taught early enough #2: Dynamic Link Libraries
Continuing the series started with Three things that aren’t taught early enough #1: Unicode, I now present thing number two:
Thing Two: Dynamic Link Libraries
Modern operating systems live on dynamically linked libraries (abbreviated to DLLs from here on in, which refers to the general concept, rather than the Windows-specific implementation). A huge amount of code is provided to applications this way (there’s over 800MB in my Windows Vista system32 folder), far too much to include directly in every compiled application. Unfortunately, the most a second or third year (or later?) computer science student could tell you about is DLL hell, and they probably won’t get any further than the name.
DLLs are (still) the top step in increasing levels of physical code-separation. Early (very early) programs consisted of a single monolithic source file. (Even earlier ones consisted of a single monolithic punch card…) This gave way to separate source files which were compiled separately into “object files” (commonly in COFF or ELF format) and then linked. Commonly used object files started being collected/archived into static libraries, allowing much greater code reuse than OOP has ever achieved. The big disadvantage of static libraries is that the library code an application uses gets copied into its executable. When 100 applications include the same 1MB of code, that’s 99MB worth of duplication.
Enter DLLs. When an executable starts, the loader determines whether the required DLLs are already in memory. If they are, the loaded code is used. Otherwise, the DLL is found and loaded from disk. Most advances in DLLs since then have been aimed at reducing dependency issues, including side-by-side installation of multiple versions (something that has been in *nix forever (or close enough)), tighter interface conventions (for example, ActiveX/COM) and stronger naming conventions (as in .NET assemblies).
<Windows Specific>
(The implementation of DLLs is operating system specific. I don’t have any inside knowledge of *nix style .so files (though I gather that Mac OS X .dylib files fill the same role in a different file format), so I will only discuss Windows .dll files. Presumably the concepts are applicable to all DLLs, otherwise they would be implementing different things.)
A .dll file is fundamentally the same as a .exe file. Both use the Portable Executable (PE) format and differ mainly in a header flag and in how the entry point is used, so an .exe file can even be used as a DLL if so desired (with some caveats). PE files can include import and export tables. The export table contains a list of function names and addresses which may be called from outside the file. No information on calling conventions or parameters is included, as this is meant to be shared separately (usually in a header file). C++ compilers use name mangling in various forms to encode function signatures into the exported names, though it usually doesn’t include enough information to make shared header files unnecessary. Unfortunately, different compilers perform different name mangling, making DLLs hard to share between them. For this reason, extern “C” is generally recommended for exported functions, as this disables C++ name mangling (and with it the ability to export overloaded functions).
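To make the mangling point concrete, here is a toy sketch in Python of how one mangling scheme builds symbol names. It imitates a very small subset of the Itanium C++ ABI scheme (the one gcc and clang follow) and only handles free functions taking a handful of builtin types. The names mangle and TYPE_CODES are my own inventions for illustration, not part of any real tool, and MSVC uses an entirely different scheme.

```python
# Toy illustration only: a tiny subset of the Itanium C++ ABI mangling
# scheme (used by gcc and clang). MSVC uses an unrelated scheme.

# Single-letter codes the Itanium ABI assigns to a few builtin types.
TYPE_CODES = {"void": "v", "char": "c", "int": "i", "long": "l",
              "float": "f", "double": "d"}


def mangle(name, param_types, extern_c=False):
    """Return the symbol name for a free function taking builtin types."""
    if extern_c:
        # extern "C" keeps the plain name, so the parameter types are
        # lost and two overloads would collide on the same symbol.
        return name
    params = param_types or ["void"]  # an empty parameter list encodes as "v"
    codes = "".join(TYPE_CODES[t] for t in params)
    return "_Z{}{}{}".format(len(name), name, codes)


print(mangle("add", ["int", "int"]))                  # _Z3addii
print(mangle("add", ["double", "double"]))            # _Z3adddd
print(mangle("add", ["int", "int"], extern_c=True))   # add
```

The two overloads of add get different exported names, which is what lets them coexist in one export table. With extern “C” both would be exported as plain add, which is why overloading is off the table for such functions.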
DLLs are used by another program in one of two ways. They are either dynamically loaded at run time, using functions such as LoadLibrary() and GetProcAddress(), or automatically linked when the program starts. Automatic linking is performed by the loader using the contents of the import table. Information about the functions that live in another DLL is included in the application, generally by linking against a small static library (an “import library”). When starting the program, the loader resolves these references, loading any required DLLs, and fixes up function calls to point to the right memory location. If a function or library cannot be resolved, the program will not start. An advantage of dynamically loading a library is that the application can detect a missing or broken DLL itself and handle the error.
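As a rough sketch of that last point, Python’s ctypes module wraps the same loader calls (LoadLibrary() on Windows, dlopen() elsewhere), so it is a convenient way to see a failed load being handled rather than stopping the program. The helper try_load and the library name no_such_plugin_library are made up for this example.

```python
import ctypes


def try_load(library_name):
    """Return a handle to the library, or None if it cannot be loaded."""
    try:
        # ctypes.CDLL wraps LoadLibrary() on Windows and dlopen() elsewhere.
        return ctypes.CDLL(library_name)
    except OSError:
        # Unlike a failed import-table lookup at startup, this is
        # recoverable: the caller decides what to do next.
        return None


# "no_such_plugin_library" is a made-up name that should not exist.
plugin = try_load("no_such_plugin_library")
if plugin is None:
    print("plugin not found; continuing without it")
```

An automatically linked program never gets this chance, because the loader gives up before any of the program’s own code runs.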
Dynamic loading also allows for better backwards compatibility. Newer versions of Windows provide API functions that didn’t exist in previous versions. If the loader is asked to resolve one of these functions and it cannot be found, the application will not start. If, however, the function is not essential, it can be dynamically resolved and called if available.
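The same idea can be sketched for a single optional function. Again this uses Python’s ctypes as a stand-in for GetProcAddress() (or dlsym() outside Windows). The helpers load_c_runtime and resolve_optional, and the function name hypothetical_newer_api_v2, are invented for this example. The only real export relied on is abs from the C runtime.

```python
import ctypes
import ctypes.util
import sys


def load_c_runtime():
    """Load the platform's C runtime library."""
    if sys.platform == "win32":
        return ctypes.CDLL("msvcrt")
    # find_library may return None; CDLL(None) then searches the
    # symbols already loaded into the running process.
    return ctypes.CDLL(ctypes.util.find_library("c"))


def resolve_optional(library, function_name):
    """Return the exported function, or None if the library lacks it."""
    # Attribute lookup on a CDLL calls GetProcAddress() / dlsym() and
    # raises AttributeError when the symbol is not exported.
    return getattr(library, function_name, None)


libc = load_c_runtime()
newer_api = resolve_optional(libc, "hypothetical_newer_api_v2")  # made-up name
always_there = resolve_optional(libc, "abs")

if newer_api is None:
    print("newer API unavailable; using the fallback path")
print(always_there(-5))  # 5
```

The application keeps running on a system that lacks the newer function, and simply takes the older code path.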
</Windows Specific>
So, what hasn’t been taught about DLLs? I would argue that the bare minimum is missing: the concept of a compiled software project that doesn’t actually run; the concept of exported functions and data (and also the concept of a linker); the issues arising from mismatched interfaces; a brief overview of name mangling and the issues that occur when mixing compilers; and exporting definitions for C++ applications (Microsoft’s dllimport and dllexport specifiers and gcc’s switches and conventions). The trend is so strongly away from native applications that few other languages offer the ability to make native DLLs. Students should also be introduced to, and get to play with, Dependency Walker and/or a *nix equivalent.
Issues such as redirection, forwarding, specifics about export tables, deployment and registration don’t need to be touched.
DLLs don’t require a huge amount of time dedicated to them, but no time is simply not good enough. Arguably they are not relevant in a pure computer science degree, but since nobody is teaching a pure computer science degree anymore, DLLs should definitely get a showing. Along with Jeff Atwood, I also include version control in this sort of category. It needs to be taught, however briefly, before people get a piece of paper saying they can do stuff.
(Now that I’ve mentioned version control, that should be a big hint that it isn’t thing number 3. Number 3 is a huge one though, and might be a couple of weeks off depending on how much time I get. It will also include code samples a-plenty. Apologies for the delay, but hopefully it will be worth it.)






