Contents
- Learning Objectives
- Introduction
- ELF Structures and Defines
- Symbol Table
- Relocations
- Steps to Load an ELF File
- Examples
- Sample Code
Learning Objectives
- Understand what the ELF format is used for.
- Be able to read an ELF file.
- Understand the different sections of an ELF file.
Introduction
Executable files are just like any other binary file, except its structure is used to determine where to put data before running instructions. The Executable and Linkable Format (ELF) is used to store information for both executable files and linkable, object files.
We can write a program in the kernel directly and execute it in user space. However, what’s far more likely to happen is a user will download an executable in ELF format onto their block drive. So, we want to be able to use our block device driver to read an ELF file, and then use that ELF file to create a process and run it.
We will mainly concern ourselves with the “execution view” of ELF, which is depicted below.
ELF Structures and Defines
All of the structures can be found in the elf.h header file. (see /usr/include/elf.h on any Tesla/Hydra machine).
Elf Header Structure
typedef struct { unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */ Elf64_Half e_type; /* Object file type */ Elf64_Half e_machine; /* Architecture */ Elf64_Word e_version; /* Object file version */ Elf64_Addr e_entry; /* Entry point virtual address */ Elf64_Off e_phoff; /* Program header table file offset */ Elf64_Off e_shoff; /* Section header table file offset */ Elf64_Word e_flags; /* Processor-specific flags */ Elf64_Half e_ehsize; /* ELF header size in bytes */ Elf64_Half e_phentsize; /* Program header table entry size */ Elf64_Half e_phnum; /* Program header table entry count */ Elf64_Half e_shentsize; /* Section header table entry size */ Elf64_Half e_shnum; /* Section header table entry count */ Elf64_Half e_shstrndx; /* Section header string table index */ } Elf64_Ehdr;
ELF Header Flags/Defines
/* Legal values for e_type (object file type). */ #define ET_NONE 0 /* No file type */ #define ET_REL 1 /* Relocatable file */ #define ET_EXEC 2 /* Executable file */ #define ET_DYN 3 /* Shared object file */ #define ET_CORE 4 /* Core file */ #define ET_NUM 5 /* Number of defined types */ #define ET_LOOS 0xfe00 /* OS-specific range start */ #define ET_HIOS 0xfeff /* OS-specific range end */ #define ET_LOPROC 0xff00 /* Processor-specific range start */ #define ET_HIPROC 0xffff /* Processor-specific range end */ /* e_machine */ #define EM_RISCV 243 /* RISC-V */ /* Legal values for e_version (version). */ #define EV_NONE 0 /* Invalid ELF version */ #define EV_CURRENT 1 /* Current version */ #define EV_NUM 2
Program Header Structure
typedef struct { Elf64_Word p_type; /* Segment type */ Elf64_Word p_flags; /* Segment flags */ Elf64_Off p_offset; /* Segment file offset */ Elf64_Addr p_vaddr; /* Segment virtual address */ Elf64_Addr p_paddr; /* Segment physical address */ Elf64_Xword p_filesz; /* Segment size in file */ Elf64_Xword p_memsz; /* Segment size in memory */ Elf64_Xword p_align; /* Segment alignment */ } Elf64_Phdr;
Program Header Types/Defines
/* Legal values for p_type (segment type). */ #define PT_NULL 0 /* Program header table entry unused */ #define PT_LOAD 1 /* Loadable program segment */ #define PT_DYNAMIC 2 /* Dynamic linking information */ #define PT_INTERP 3 /* Program interpreter */ #define PT_NOTE 4 /* Auxiliary information */ #define PT_SHLIB 5 /* Reserved */ #define PT_PHDR 6 /* Entry for header table itself */ #define PT_TLS 7 /* Thread-local storage segment */ #define PT_NUM 8 /* Number of defined types */ #define PT_LOOS 0x60000000 /* Start of OS-specific */ #define PT_GNU_EH_FRAME 0x6474e550 /* GCC .eh_frame_hdr segment */ #define PT_GNU_STACK 0x6474e551 /* Indicates stack executability */ #define PT_GNU_RELRO 0x6474e552 /* Read-only after relocation */ #define PT_LOSUNW 0x6ffffffa #define PT_HISUNW 0x6fffffff #define PT_HIOS 0x6fffffff /* End of OS-specific */ #define PT_LOPROC 0x70000000 /* Start of processor-specific */ #define PT_HIPROC 0x7fffffff /* End of processor-specific */ /* Legal values for p_flags (segment flags). */ #define PF_X (1 << 0) /* Segment is executable */ #define PF_W (1 << 1) /* Segment is writable */ #define PF_R (1 << 2) /* Segment is readable */ #define PF_MASKOS 0x0ff00000 /* OS-specific */ #define PF_MASKPROC 0xf0000000 /* Processor-specific */
Section Header Structure
typedef struct { Elf64_Word sh_name; /* Section name (string tbl index) */ Elf64_Word sh_type; /* Section type */ Elf64_Xword sh_flags; /* Section flags */ Elf64_Addr sh_addr; /* Section virtual addr at execution */ Elf64_Off sh_offset; /* Section file offset */ Elf64_Xword sh_size; /* Section size in bytes */ Elf64_Word sh_link; /* Link to another section */ Elf64_Word sh_info; /* Additional section information */ Elf64_Xword sh_addralign; /* Section alignment */ Elf64_Xword sh_entsize; /* Entry size if section holds table */ } Elf64_Shdr;
Section Types/Defines
/* Legal values for sh_type (section type). */ #define SHT_NULL 0 /* Section header table entry unused */ #define SHT_PROGBITS 1 /* Program data */ #define SHT_SYMTAB 2 /* Symbol table */ #define SHT_STRTAB 3 /* String table */ #define SHT_RELA 4 /* Relocation entries with addends */ #define SHT_HASH 5 /* Symbol hash table */ #define SHT_DYNAMIC 6 /* Dynamic linking information */ #define SHT_NOTE 7 /* Notes */ #define SHT_NOBITS 8 /* Program space with no data (bss) */ #define SHT_REL 9 /* Relocation entries, no addends */ #define SHT_SHLIB 10 /* Reserved */ #define SHT_DYNSYM 11 /* Dynamic linker symbol table */ #define SHT_INIT_ARRAY 14 /* Array of constructors */ #define SHT_FINI_ARRAY 15 /* Array of destructors */ #define SHT_PREINIT_ARRAY 16 /* Array of pre-constructors */ #define SHT_GROUP 17 /* Section group */ #define SHT_SYMTAB_SHNDX 18 /* Extended section indeces */ #define SHT_NUM 19 /* Number of defined types. */ #define SHT_LOOS 0x60000000 /* Start OS-specific. */ #define SHT_GNU_ATTRIBUTES 0x6ffffff5 /* Object attributes. */ #define SHT_GNU_HASH 0x6ffffff6 /* GNU-style hash table. */ #define SHT_GNU_LIBLIST 0x6ffffff7 /* Prelink library list */ #define SHT_CHECKSUM 0x6ffffff8 /* Checksum for DSO content. */ #define SHT_LOSUNW 0x6ffffffa /* Sun-specific low bound. */ /* Legal values for sh_flags (section flags). */ #define SHF_WRITE (1 << 0) /* Writable */ #define SHF_ALLOC (1 << 1) /* Occupies memory during execution */ #define SHF_EXECINSTR (1 << 2) /* Executable */ #define SHF_MERGE (1 << 4) /* Might be merged */ #define SHF_STRINGS (1 << 5) /* Contains nul-terminated strings */ #define SHF_INFO_LINK (1 << 6) /* `sh_info' contains SHT index */ #define SHF_LINK_ORDER (1 << 7) /* Preserve order after combining */ #define SHF_OS_NONCONFORMING (1 << 8) /* Non-standard OS specific handling required */ #define SHF_GROUP (1 << 9) /* Section is member of a group. */ #define SHF_TLS (1 << 10) /* Section hold thread-local data. */ #define SHF_COMPRESSED (1 << 11) /* Section with compressed data. */ #define SHF_MASKOS 0x0ff00000 /* OS-specific. */ #define SHF_MASKPROC 0xf0000000 /* Processor-specific */
RISC-V Specific
/* RISC-V ELF Flags */ #define EF_RISCV_RVC 0x0001 #define EF_RISCV_FLOAT_ABI 0x0006 #define EF_RISCV_FLOAT_ABI_SOFT 0x0000 #define EF_RISCV_FLOAT_ABI_SINGLE 0x0002 #define EF_RISCV_FLOAT_ABI_DOUBLE 0x0004 #define EF_RISCV_FLOAT_ABI_QUAD 0x0006 /* RISC-V relocations. */ #define R_RISCV_NONE 0 #define R_RISCV_32 1 #define R_RISCV_64 2 #define R_RISCV_RELATIVE 3 #define R_RISCV_COPY 4 #define R_RISCV_JUMP_SLOT 5 #define R_RISCV_TLS_DTPMOD32 6 #define R_RISCV_TLS_DTPMOD64 7 #define R_RISCV_TLS_DTPREL32 8 #define R_RISCV_TLS_DTPREL64 9 #define R_RISCV_TLS_TPREL32 10 #define R_RISCV_TLS_TPREL64 11 #define R_RISCV_BRANCH 16 #define R_RISCV_JAL 17 #define R_RISCV_CALL 18 #define R_RISCV_CALL_PLT 19 #define R_RISCV_GOT_HI20 20 #define R_RISCV_TLS_GOT_HI20 21 #define R_RISCV_TLS_GD_HI20 22 #define R_RISCV_PCREL_HI20 23 #define R_RISCV_PCREL_LO12_I 24 #define R_RISCV_PCREL_LO12_S 25 #define R_RISCV_HI20 26 #define R_RISCV_LO12_I 27 #define R_RISCV_LO12_S 28 #define R_RISCV_TPREL_HI20 29 #define R_RISCV_TPREL_LO12_I 30 #define R_RISCV_TPREL_LO12_S 31 #define R_RISCV_TPREL_ADD 32 #define R_RISCV_ADD8 33 #define R_RISCV_ADD16 34 #define R_RISCV_ADD32 35 #define R_RISCV_ADD64 36 #define R_RISCV_SUB8 37 #define R_RISCV_SUB16 38 #define R_RISCV_SUB32 39 #define R_RISCV_SUB64 40 #define R_RISCV_GNU_VTINHERIT 41 #define R_RISCV_GNU_VTENTRY 42 #define R_RISCV_ALIGN 43 #define R_RISCV_RVC_BRANCH 44 #define R_RISCV_RVC_JUMP 45 #define R_RISCV_RVC_LUI 46 #define R_RISCV_GPREL_I 47 #define R_RISCV_GPREL_S 48 #define R_RISCV_TPREL_I 49 #define R_RISCV_TPREL_S 50 #define R_RISCV_RELAX 51 #define R_RISCV_SUB6 52 #define R_RISCV_SET6 53 #define R_RISCV_SET8 54 #define R_RISCV_SET16 55 #define R_RISCV_SET32 56 #define R_RISCV_32_PCREL 57
Dynamic Section Table
typedef struct { Elf64_Sxword d_tag; /* Dynamic entry type */ union { Elf64_Xword d_val; /* Integer value */ Elf64_Addr d_ptr; /* Address value */ } d_un; } Elf64_Dyn; /* Legal values for d_tag (dynamic entry type). */ #define DT_NULL 0 /* Marks end of dynamic section */ #define DT_NEEDED 1 /* Name of needed library */ #define DT_PLTRELSZ 2 /* Size in bytes of PLT relocs */ #define DT_PLTGOT 3 /* Processor defined value */ #define DT_HASH 4 /* Address of symbol hash table */ #define DT_STRTAB 5 /* Address of string table */ #define DT_SYMTAB 6 /* Address of symbol table */ #define DT_RELA 7 /* Address of Rela relocs */ #define DT_RELASZ 8 /* Total size of Rela relocs */ #define DT_RELAENT 9 /* Size of one Rela reloc */ #define DT_STRSZ 10 /* Size of string table */ #define DT_SYMENT 11 /* Size of one symbol table entry */ #define DT_INIT 12 /* Address of init function */ #define DT_FINI 13 /* Address of termination function */ #define DT_SONAME 14 /* Name of shared object */ #define DT_RPATH 15 /* Library search path (deprecated) */ #define DT_SYMBOLIC 16 /* Start symbol search here */ #define DT_REL 17 /* Address of Rel relocs */ #define DT_RELSZ 18 /* Total size of Rel relocs */ #define DT_RELENT 19 /* Size of one Rel reloc */ #define DT_PLTREL 20 /* Type of reloc in PLT */ #define DT_DEBUG 21 /* For debugging; unspecified */ #define DT_TEXTREL 22 /* Reloc might modify .text */ #define DT_JMPREL 23 /* Address of PLT relocs */ #define DT_BIND_NOW 24 /* Process relocations of object */ #define DT_INIT_ARRAY 25 /* Array with addresses of init fct */ #define DT_FINI_ARRAY 26 /* Array with addresses of fini fct */ #define DT_INIT_ARRAYSZ 27 /* Size in bytes of DT_INIT_ARRAY */ #define DT_FINI_ARRAYSZ 28 /* Size in bytes of DT_FINI_ARRAY */ #define DT_RUNPATH 29 /* Library search path */ #define DT_FLAGS 30 /* Flags for the object being loaded */ #define DT_ENCODING 32 /* Start of encoded range */ #define DT_PREINIT_ARRAY 32 /* Array with addresses of preinit fct*/ #define DT_PREINIT_ARRAYSZ 33 /* size in bytes of DT_PREINIT_ARRAY */ #define DT_SYMTAB_SHNDX 34 /* Address of SYMTAB_SHNDX section */ #define DT_NUM 35 /* Number used */ #define DT_LOOS 0x6000000d /* Start of OS-specific */ #define DT_HIOS 0x6ffff000 /* End of OS-specific */ #define DT_LOPROC 0x70000000 /* Start of processor-specific */ #define DT_HIPROC 0x7fffffff /* End of processor-specific */ #define DT_PROCNUM DT_MIPS_NUM /* Most used by any processor */
Symbol Table
Labels (names given to memory addresses) are stored in symbol table. The symbol table is a mapping of the string label and the memory address that it refers to. We don’t usually care about the symbol table when it comes to loading executables; however, it might be useful if we implement an on-demand loading system, which is typically used in dynamic linking.
typedef struct { Elf64_Word st_name; /* Symbol name (string tbl index) */ unsigned char st_info; /* Symbol type and binding */ unsigned char st_other; /* Symbol visibility */ Elf64_Section st_shndx; /* Section index */ Elf64_Addr st_value; /* Symbol value */ Elf64_Xword st_size; /* Symbol size */ } Elf64_Sym; typedef struct { Elf64_Half si_boundto; /* Direct bindings, symbol bound to */ Elf64_Half si_flags; /* Per symbol flags */ } Elf64_Syminfo; /* Possible values for si_boundto. */ #define SYMINFO_BT_SELF 0xffff /* Symbol bound to self */ #define SYMINFO_BT_PARENT 0xfffe /* Symbol bound to parent */ #define SYMINFO_BT_LOWRESERVE 0xff00 /* Beginning of reserved entries */ /* Possible bitmasks for si_flags. */ #define SYMINFO_FLG_DIRECT 0x0001 /* Direct bound symbol */ #define SYMINFO_FLG_PASSTHRU 0x0002 /* Pass-thru symbol for translator */ #define SYMINFO_FLG_COPY 0x0004 /* Symbol is a copy-reloc */ #define SYMINFO_FLG_LAZYLOAD 0x0008 /* Symbol bound to object to be lazy loaded */ /* Syminfo version values. */ #define SYMINFO_NONE 0 #define SYMINFO_CURRENT 1 #define SYMINFO_NUM 2 /* Both Elf32_Sym and Elf64_Sym use the same one-byte st_info field. */ #define ELF64_ST_BIND(val) ELF32_ST_BIND (val) #define ELF64_ST_TYPE(val) ELF32_ST_TYPE (val) #define ELF64_ST_INFO(bind, type) ELF32_ST_INFO ((bind), (type)) /* Legal values for ST_BIND subfield of st_info (symbol binding). */ #define STB_LOCAL 0 /* Local symbol */ #define STB_GLOBAL 1 /* Global symbol */ #define STB_WEAK 2 /* Weak symbol */ #define STB_NUM 3 /* Number of defined types. */ #define STB_LOOS 10 /* Start of OS-specific */ #define STB_GNU_UNIQUE 10 /* Unique symbol. */ #define STB_HIOS 12 /* End of OS-specific */ #define STB_LOPROC 13 /* Start of processor-specific */ #define STB_HIPROC 15 /* End of processor-specific */ /* Legal values for ST_TYPE subfield of st_info (symbol type). */ #define STT_NOTYPE 0 /* Symbol type is unspecified */ #define STT_OBJECT 1 /* Symbol is a data object */ #define STT_FUNC 2 /* Symbol is a code object */ #define STT_SECTION 3 /* Symbol associated with a section */ #define STT_FILE 4 /* Symbol's name is file name */ #define STT_COMMON 5 /* Symbol is a common data object */ #define STT_TLS 6 /* Symbol is thread-local data object*/ #define STT_NUM 7 /* Number of defined types. */ #define STT_LOOS 10 /* Start of OS-specific */ #define STT_GNU_IFUNC 10 /* Symbol is indirect code object */ #define STT_HIOS 12 /* End of OS-specific */ #define STT_LOPROC 13 /* Start of processor-specific */ #define STT_HIPROC 15 /* End of processor-specific */ /* Symbol table indices are found in the hash buckets and chain table of a symbol hash table section. This special index value indicates the end of a chain, meaning no further symbols are found in that bucket. */ #define STN_UNDEF 0 /* End of a chain. */ /* How to extract and insert information held in the st_other field. */ #define ELF32_ST_VISIBILITY(o) ((o) & 0x03) /* For ELF64 the definitions are the same. */ #define ELF64_ST_VISIBILITY(o) ELF32_ST_VISIBILITY (o) /* Symbol visibility specification encoded in the st_other field. */ #define STV_DEFAULT 0 /* Default symbol visibility rules */ #define STV_INTERNAL 1 /* Processor specific hidden class */ #define STV_HIDDEN 2 /* Sym unavailable in other modules */ #define STV_PROTECTED 3 /* Not preemptible, not exported */
Relocations
The defines that start with R_??? are called relocation info types. A relocation means that the symbol needs to be moved prior or during execution.
#define ELF64_R_SYM(i) ((i) >> 32) #define ELF64_R_TYPE(i) ((i) & 0xffffffff) #define ELF64_R_INFO(sym,type) ((((Elf64_Xword) (sym)) << 32) + (type))
Steps to Load an ELF File
Basic Static Executable
Executable (ET_EXEC) files are the only ones loaded by the operating system directly. Other files are used for object files, shared objects (libraries), and other types of linkable formats.
- Read the ELF header (EH).
- Error check.
- Check EH.ident[0:3] (magic) against '\x7fELF'.
- Check EH.e_machine against EM_RISCV (243).
- Check EH.e_type against ET_EXEC (2).
- Go to the program header table (PH) by seeking to EH.phoff.
- Loop EH.phnum times seeking sizeof(PH) for each iteration.
- Check PH.p_type against PT_LOAD (1).
- Check PH.p_memsz to see if it is non-zero.
- Copy bytes starting at PH.p_offset into process image.
- Allocate full, mappable, pages for each 4,096-byte chunk.
- Map virtual addresses to PH.p_vaddr. (PH.p_paddr must be ignored).
- Set mapping permissions to PH.p_flags (PF_X, PF_W, PF_R).
- Be careful when mapping! Program addresses may overlap pages! Therefore, you might reset a PF_X, PF_W, or PF_R. Flags should be cumulative instead.
- Set the process image’s program counter to EH.e_entry (virtual memory address).
Examples
The following screenshots were taken of different files to show the wider range of options available.