Linux ELF binary data sections

30 January 2020
Back in the days when I did Win32 programming it was possible to embed binary data into a program executable via a resource file. The only way I was aware of doing the same thing in Linux was to convert the data into an array of hexadecimal values in the source code, which has scalability problems. There were alternatives such as appending data to the program and then opening it like a file, but that requires knowing how long the compiled program will be. I recently came across a trick that is much like what I used to do with Win32, and it is described here.

The ELF (Executable and Linkable Format) program binary type, which is Linux's equivalent of Windows PE (Portable Executable), divides a program into multiple sections. Typically there is .text for program code, .data for initialised global data, and .bss for data that has no initial value, and a few others for things like debugging symbols. The basic idea is to have a small data-structure that is compiled into its own section, and this can be replaced post-compilation using various ELF utilities.
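As a rough illustration of how a compiler distributes a program across these sections, the sketch below annotates a few hypothetical declarations (none of these names are used elsewhere in this article) with the section that GCC on Linux typically picks for them. Placement can vary with compiler version and flags, so treat the comments as the usual case rather than a guarantee.

```c
#include <assert.h>

/* Hypothetical names, used only to illustrate section placement. */

int initialised_counter = 42;   /* has an initial value: typically .data */
int zeroed_counter;             /* no initial value: typically .bss, zero at startup */
const char banner[] = "hello";  /* read-only data: typically .rodata */

/* Compile-time check (C11): the array includes the terminating NUL. */
static_assert(sizeof banner == 6, "banner holds 5 characters plus NUL");

/* The machine code for a function typically ends up in .text. */
int bump(void)
{
    zeroed_counter += 1;
    return initialised_counter + zeroed_counter;
}
```

Compiling this to an object file and running objdump -h on the result lists the sections that are actually present.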

A simple program

The program below has a data structure, consisting of a length value and a payload, that is placed in its own section named .elfdata. The name is arbitrary, but use objdump -h to check which section names are already in use.

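#include <stdio.h>
#include <inttypes.h>

/* Place this structure in its own ELF section so it can be replaced later. */
__attribute__((section(".elfdata")))
struct {
    uint16_t size;       /* number of payload bytes */
    uint8_t  payload[];  /* flexible array member holding the data */
} Data = {4, "Data"};

int main(int argc, char **argv)
{
    uint16_t idx;

    printf("Data size: %i\n", Data.size);
    printf("Data: ");
    /* Print exactly Data.size bytes; the payload is not treated as a C string. */
    for (idx = 0; idx < Data.size; idx++)
        printf("%c", Data.payload[idx]);
    printf("\n");
    return 0;
}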

Running the program prints out the hard-coded data:

$ ./elfdata.exe
Data size: 4
Data: Data

You can also use objdump to see the contents of the section:

$ objdump -j .elfdata -s elfdata.exe

elfdata.exe:     file format elf64-x86-64

Contents of section .elfdata:
 601040 04004461 746100                      ..Data.

Replacing the section

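As an example, the commands below fabricate the data for a new section that has an 8-byte payload instead of the compiled 4-byte one, and then substitute the existing section with the newly created one. Note that echo appends a trailing newline, so the file is 11 bytes long; the program ignores the last byte because the size field says 8. Since the new section is larger than the old one, objcopy has to move the following .bss section to make room, which it reports as it does so: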

$ echo 0x08 0x00 | xxd -r > new.bin
$ echo "New text" >> new.bin
$ objcopy --update-section .elfdata=new.bin elfdata.exe
objcopy: stHEi05A: section .bss lma 0x601048 adjusted to 0x60104b

Once this is done running the program will now display the new data:

$ ./elfdata.exe
Data size: 8
Data: New text

It is also possible to extract a section to file:

$ objcopy --dump-section .elfdata=newdata.bin elfdata.exe
$ hexdump -C newdata.bin
00000000  08 00 4e 65 77 20 74 65  78 74 0a                 |..New text.|
0000000b
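The dump shows the layout the program expects: a two-byte length stored low byte first, followed by the payload. As a minimal sketch of building the same image in C rather than with echo and xxd, the hypothetical helper below (build_section_image is my own name, not part of any tool) fills a buffer in that layout. It assumes a little-endian target such as x86-64, where the uint16_t size field is stored low byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write a 16-bit little-endian length followed by the
 * payload bytes into 'out'. Returns the number of bytes written, or 0 if
 * the payload is too long for a 16-bit length or does not fit in 'out'.
 */
size_t build_section_image(uint8_t *out, size_t out_cap,
                           const void *payload, size_t payload_len)
{
    assert(out != NULL && payload != NULL);

    if (payload_len > UINT16_MAX || out_cap < payload_len + 2)
        return 0;

    out[0] = (uint8_t)(payload_len & 0xff);   /* low byte of the length */
    out[1] = (uint8_t)(payload_len >> 8);     /* high byte of the length */
    memcpy(out + 2, payload, payload_len);    /* payload follows directly */
    return payload_len + 2;
}
```

Writing the returned number of bytes to a file with fwrite() gives an image that can be passed to objcopy --update-section, without the trailing newline that echo adds.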

Some notes and caveats

When substituting new ELF section data you will have to pay attention to things like byte packing and binary structure, as well as coming up with some way of letting the program know how much data is present, although these are issues beyond the scope of this article. There are a few other factors, briefly covered below, that may also be complications when using ELF sections for data that is substituted post-linking.

Unrecognised format error

It is unlikely to happen if you are doing plain Linux development targeting the computer you are working on, but if you get the following error you will need to specify the ELF sub-type manually:

objcopy: Unable to recognise the format of the input file `elfdata.elf'

This is because objcopy by default assumes the ELF file is the same format as the host, which is not the case if you do cross-compiling, so the format has to be specified manually. You can use objdump to find out what architecture the file actually is:

$ objdump -h elfdata.exe | grep "file format"
elfdata.exe:     file format elf64-x86-64

In this case the format is elf64-x86-64, although an embedded ARM binary will be reported as something like elf32-little. In the latter case the format can be specified manually to objcopy with the -I option, for example -I elf32-little.
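Generic names such as elf32-little encode just two facts from the first 16 bytes of the file: the word size and the byte order. The sketch below derives such a name from those identification bytes using the constants in the Linux elf.h header. The function name elf_generic_format is hypothetical, and architecture-specific names such as elf64-x86-64 also depend on the machine field, which this sketch does not decode.

```c
#include <assert.h>
#include <elf.h>      /* Linux/glibc header with the ELF constants */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given the first bytes of a file, return the generic
 * format name implied by the ELF identification bytes, or NULL if the
 * buffer does not look like an ELF file.
 */
const char *elf_generic_format(const unsigned char *ident, size_t len)
{
    assert(ident != NULL);

    /* Need the full identification block and the "\177ELF" magic. */
    if (len < EI_NIDENT || memcmp(ident, ELFMAG, SELFMAG) != 0)
        return NULL;

    int is32   = (ident[EI_CLASS] == ELFCLASS32);
    int is64   = (ident[EI_CLASS] == ELFCLASS64);
    int little = (ident[EI_DATA] == ELFDATA2LSB);
    int big    = (ident[EI_DATA] == ELFDATA2MSB);

    if (is32 && little) return "elf32-little";
    if (is32 && big)    return "elf32-big";
    if (is64 && little) return "elf64-little";
    if (is64 && big)    return "elf64-big";
    return NULL;   /* unknown class or byte order */
}
```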

Stripped binaries

Switching around ELF sections may or may not work if the program binary has been stripped. From brief testing it works on x86-64 (i.e. 64-bit Intel/AMD desktop PCs), but I found that with embedded ARM binaries stripping removes information needed by objcopy to substitute ELF sections. The result is errors like the following:

objcopy: error: the input file 'foo.elf' has no sections

I suspect this may be down to a lack of relocation support in embedded targets, so sections are just location tags rather than something of importance to the runtime environment. This would also explain why increasing the size of a region does not work on at least some embedded devices.
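One way to check whether a binary still advertises any sections is to look at the e_shnum field of its ELF header. The sketch below is a hypothetical helper (elf_section_count is my own name) that reads that field from an image already loaded into memory. It assumes the file uses the same byte order as the machine running the check, and it ignores the rarely used extended numbering scheme for files with very many sections. I have not verified that this field is what objcopy inspects before printing the error above.

```c
#include <assert.h>
#include <elf.h>      /* Linux/glibc header with the ELF structures */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the e_shnum field of an ELF image held in
 * memory, or -1 if the buffer is not a recognisable ELF header.
 * Assumes the file's byte order matches the host's.
 */
long elf_section_count(const unsigned char *image, size_t len)
{
    assert(image != NULL);

    if (len < EI_NIDENT || memcmp(image, ELFMAG, SELFMAG) != 0)
        return -1;

    if (image[EI_CLASS] == ELFCLASS32 && len >= sizeof(Elf32_Ehdr)) {
        Elf32_Ehdr hdr;
        memcpy(&hdr, image, sizeof hdr);   /* copy to avoid unaligned access */
        return hdr.e_shnum;
    }
    if (image[EI_CLASS] == ELFCLASS64 && len >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr hdr;
        memcpy(&hdr, image, sizeof hdr);
        return hdr.e_shnum;
    }
    return -1;   /* unknown class or truncated header */
}
```

A result of 0 would be consistent with the "has no sections" complaint, whereas an unstripped desktop binary normally reports a couple of dozen.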

Aggressive optimisation

I do not know to what extent the higher optimisation levels might try to propagate initial values specified within the program code, resulting in a program that breaks when section data is substituted. For instance, if the section data is a string, GCC might for whatever reason decide to optimise away a strlen() call, particularly if the data is marked as const. If this happens you could try either putting the section stub in its own source file that is compiled without optimisation, or possibly using the volatile keyword.
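As a minimal sketch of the volatile approach, the stub below is declared volatile so that every read of its fields has to be an actual load at run time instead of a value folded in at compile time. The names Stub, ELFDATA_SECTION and stub_read are hypothetical, and the fixed 16-byte payload is a simplification of my own; the program earlier in this article uses a flexible array instead. I have not tested this against a case where optimisation actually broke a substituted section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only ELF targets understand this section name; elsewhere use the default. */
#if defined(__ELF__)
#define ELFDATA_SECTION __attribute__((section(".elfdata")))
#else
#define ELFDATA_SECTION
#endif

/* Hypothetical fixed-capacity stub. 'volatile' tells the compiler that the
 * contents may differ from the initialiser it sees here. */
volatile struct {
    uint16_t size;
    uint8_t  payload[16];
} Stub ELFDATA_SECTION = {4, "Data"};

/* Copy at most out_cap payload bytes into 'out'; returns the count copied. */
size_t stub_read(uint8_t *out, size_t out_cap)
{
    assert(out != NULL);

    size_t n = Stub.size;              /* volatile read: loaded at run time */
    if (n > sizeof Stub.payload)       /* do not trust a substituted length */
        n = sizeof Stub.payload;
    if (n > out_cap)
        n = out_cap;

    for (size_t i = 0; i < n; i++)     /* byte-wise copy keeps reads volatile */
        out[i] = Stub.payload[i];
    return n;
}
```

The clamp against the payload capacity is there because the length comes from data that may have been replaced after compilation, so the program should not assume it is sane.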