What are linkers and loaders in programming?

If you had to explain in the shortest and simplest way possible, how would you put it?

4 Answers

Anonymous 0 Comments

In some programming languages, program code, which is nothing but a text document, is transformed into object code. This is done with a program called a compiler.

Object code is *mostly* the binary CPU instructions that are going to be your program, but there are omissions. You see, you have one object file over here, and it has to “call” a bit of program somewhere else (a function), but when making this object file, the compiler didn’t know where that function was. No matter! The compiler just produces a placeholder, and this problem will be taken care of later.

Enter the linker. It’s going to consume object files and resolve all those placeholders. That function had better exist somewhere among all the object files (or in a library being linked in), or it’s an error. The result is an executable binary, either a program or a library.
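
To make the placeholder idea concrete, here is a toy sketch in Python. It is not a real object-file format: the dictionaries, the `link` function, and the names `main` and `greet` are all made up for illustration, and real linkers deal with far more detail.

```python
# Toy model of linking; NOT a real object-file format.
# Each "object file" has:
#   code:    a list of words (None marks a placeholder)
#   defines: symbol name -> offset inside this object's own code
#   relocs:  offset of a placeholder -> name of the symbol it must point to

main_o = {
    "code": ["call", None, "ret"],
    "defines": {"main": 0},
    "relocs": {1: "greet"},  # word 1 must become the address of greet
}
greet_o = {
    "code": ["print", "ret"],
    "defines": {"greet": 0},
    "relocs": {},
}


def link(objects):
    """Concatenate the code of all objects and patch every placeholder."""
    image, symbols, pending = [], {}, []
    for obj in objects:
        base = len(image)  # where this object's code lands in the output
        for name, offset in obj["defines"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = base + offset
        for offset, name in obj["relocs"].items():
            pending.append((base + offset, name))
        image.extend(obj["code"])
    # Only now is every final address known, so the placeholders can be filled.
    for address, name in pending:
        if name not in symbols:
            raise LookupError(f"undefined symbol: {name}")
        image[address] = symbols[name]
    return image, symbols


image, symbols = link([main_o, greet_o])
print(image)    # ['call', 3, 'ret', 'print', 'ret']
print(symbols)  # {'main': 0, 'greet': 3}
```

Linking `main_o` on its own raises the "undefined symbol" error, which is the toy version of the error you get when the function doesn't exist anywhere.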

Now, programs you run all the time. What happens is that a loader will read the program into memory, flag the memory holding the code as read-only and executable, establish some writable memory called “the stack”, initialize some data, and coordinate with the operating system to run the program.
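
Those loader steps can be mimicked with a toy model, again in Python. Everything here (the `Region` class, the `load` and `write` helpers, the shape of the `program` dictionary) is invented for the sketch; a real loader works with the operating system and the CPU's memory protection rather than Python lists.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    perms: str       # some combination of r, w, x (a dash means "not allowed")
    contents: list


def load(executable, stack_size=4):
    """Build a toy process image from a toy executable description."""
    return {
        "regions": [
            Region("code", "r-x", list(executable["code"])),          # read-only, executable
            Region("data", "rw-", list(executable["initial_data"])),  # initialized data
            Region("stack", "rw-", [0] * stack_size),                 # fresh writable memory
        ],
        "entry": executable["entry"],  # where execution should start
    }


def write(region, index, value):
    """Store a value, but only if the region is flagged writable."""
    if "w" not in region.perms:
        raise PermissionError(f"{region.name} is not writable")
    region.contents[index] = value


program = {"code": ["call", 3, "ret", "print", "ret"], "initial_data": [42], "entry": 0}
process = load(program)
code, data, stack = process["regions"]

write(stack, 0, 7)   # fine: the stack is writable
print(stack.contents)  # [7, 0, 0, 0]
```

Calling `write(code, 0, "nop")` raises `PermissionError`, which is the point of flagging the code as read-only.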

Libraries come in two flavors.

Static libraries are glorified object files; they have to be linked into other executables as those are being built by some vendor or producer. Static libraries are one way to distribute *parts* of a program as a product.

Dynamic libraries are far more standalone. A dynamic library is not a program by itself, but it can be loaded into memory and used BY programs. There is a boundary called an interface: the library has functions in it called XYZ-whatever, and the program that depends on that dynamic library has to be programmed to know those functions are there and how to use them. Likewise, a program is known to load a dynamic library and depend on an interface, so any library that conforms to that interface can be used by that program (there are layers of security involved, because this is how programs can be exploited). There are benefits to using dynamic libraries. You can use them as a plugin system for 3rd party modules and extensions, for whatever such a program might do. You can upgrade just those modules, fix bugs, etc., without having to ship a whole new program. You can reload modules at runtime without stopping the program. And there are higher-level things, like placing that module in memory at an unpredictable address to try and evade certain attack vectors (though there are other solutions to this problem, too).
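
As a loose analogy for the plugin idea, here is a Python sketch that loads a module from a file at runtime and checks that it provides the function the host program expects. This is Python's module system, not a real shared library, and the file name `shout_plugin.py` and the `transform` interface are assumptions made up for the example.

```python
import importlib.util
import pathlib
import tempfile


def load_plugin(path):
    """Load a module from a file at runtime and check its interface."""
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # The "interface": the host only requires a callable named transform.
    if not callable(getattr(module, "transform", None)):
        raise TypeError(f"{path.name} does not provide transform()")
    return module


with tempfile.TemporaryDirectory() as folder:
    # Stand-in for a 3rd party plugin dropped next to the program.
    plugin_file = pathlib.Path(folder) / "shout_plugin.py"
    plugin_file.write_text("def transform(text):\n    return text.upper()\n")

    plugin = load_plugin(plugin_file)
    result = plugin.transform("hello")

print(result)  # HELLO
```

Any other file that defines `transform` could be swapped in without touching the host code, which is the same property that lets a dynamic library be upgraded or replaced separately from the program.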

The original idea behind dynamic libraries is that you could save memory by loading a common module into memory once and letting multiple programs use it. Operating systems do still share a library’s read-only code between the programs that load the same file, but in practice the savings are often smaller than hoped because of dependency issues. You can upgrade the common version to fix one program, but it might break another. So many programs maintain their own copy of otherwise common dynamic libraries, so things don’t get changed and broken under their noses, and a private copy is not shared with anyone else.

Anonymous 0 Comments

A linker is a program that takes many small components of a program in progress and combines them into one larger program. In many cases the result is the final .EXE file, or the linker may just be used to combine smaller files into a bigger file. Some of these components may be provided by the system rather than written by the programmer. Huge programs are written as many individual components, both for organizational reasons and to reduce the effort involved in updating and rebuilding them, so they have to be combined at some step. That’s the linker’s job.

The loader is the program that is actually used to run the program with all its needs. Windows programs need DLLs to run, so a loader might be responsible for finding the DLLs needed, loading them into memory, and then combining everything together so the program can properly run. The details vary from system to system. For example, the Linux loader is stored at /lib/ld-linux.so.2 for 32-bit programs, but it is only used if the program is dynamically linked. A statically linked program needs no dynamic loader (the operating system still reads the file into memory itself), but then it doesn’t benefit from upgrades made to system components and is a much larger executable.

Anonymous 0 Comments

When you start a process, it gets its own memory space, called virtual memory. The CPU actually has a map to translate between virtual memory and physical memory for the current process. This means that the process has free rein over its entire address space and can put things wherever it wants; it does not even have to worry about the physical space available. The reason this is done is that code very often refers to a fixed place in memory, and this only works when that fixed memory address actually contains what you expect. So an executable file contains a map of all the memory areas it needs: some memory has to be filled with code or data from the file, and some memory has to be available but not populated.

However, when the compiler compiles the code, it does not know any of the memory layout. So whenever it needs to write a static memory location, it just uses a temporary value and notes the location in a table, with a label saying where it should eventually point. When you finally merge the code from multiple compiled sources, you need to go through and set the final values everywhere. This is what the linker does as it links the code together. Filling the process memory space with the correct data is what the loader does.

However, even at loading time you might not have everything linked together correctly. This is because applications often share common libraries, which are their own executable files except that they do not have an entry point. So the loader also has to link different pieces of code together. Different operating systems do this differently, because you run the risk that different libraries claim the same addresses, which does not work since they have to run in the same process. Windows will go ahead and relink (rebase) a library whenever this happens, while Linux requires shared libraries to avoid static memory addresses and instead use dynamic linking, which uses relative memory locations and lookup tables so the library can run at any memory address.
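
The two approaches can be contrasted with a toy sketch in Python. The "library" lists, the `rebase` and `data_address` helpers, and the addresses 1000 and 5000 are all invented for illustration; real relocation tables and position-independent code are considerably more involved.

```python
def rebase(image, relocations, old_base, new_base):
    """Return a copy of image with every listed absolute address shifted."""
    delta = new_base - old_base
    patched = list(image)
    for index in relocations:
        patched[index] += delta
    return patched


# Approach 1: absolute addresses plus a relocation table.
# The library was built to live at address 1000. Word 1 holds the absolute
# address of its own data (1000 + 3), so it is listed as needing a fix-up.
library = ["load", 1003, "ret", 42]
relocations = [1]

# Address 1000 is already taken, so the loader moves the library to 5000
# and patches the copy it loads.
moved = rebase(library, relocations, old_base=1000, new_base=5000)
print(moved)  # ['load', 5003, 'ret', 42]


# Approach 2: relative addressing.
# Word 1 holds only the distance from the start of the library to the data,
# so the very same image works wherever it is loaded.
relative_library = ["load_relative", 3, "ret", 42]


def data_address(base, image):
    """Compute where the data lives, given where the library was loaded."""
    return base + image[1]


print(data_address(1000, relative_library))  # 1003
print(data_address(5000, relative_library))  # 5003
```

In the first approach the loader has to modify the code before it can run at a new address; in the second nothing is patched, and the address is worked out from wherever the library happens to be.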

Anonymous 0 Comments

Compiler: Takes high-level code and turns it into assembly

Assembler: Turns assembly code into machine code

Linker: Combines different pieces of machine code into executable machine code (some pieces of code may reference other pieces; the linker connects them)

Loader: Loads the executable code into memory so it can be run