Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding section to ARM ELF causes problem in file size #1085

Open
Miniman2718 opened this issue Jul 31, 2024 · 8 comments
Open

Adding section to ARM ELF causes problem in file size #1085

Miniman2718 opened this issue Jul 31, 2024 · 8 comments

Comments

@Miniman2718
Copy link

I'm trying to add a new section to an ARM ELF file. The section is added, but the ELF goes from about 300kB to more than 380MB.
Digging a little bit deeper, I noticed that the problems seems to be the paddr of the new segment to which the new section is mapped: paddr is set to be equal to the vaddr, and that causes the binary writing to create an unexpectedly big ELF file.

I noticed that I have this problem only on ARM ELFs, where vaddr and paddr of all the other segments are not equals. In x86 examples I tried, there is no such problem as vaddr and paddr are the same.

Looking inside LIEF code, I noticed that physical_address is set to virtual_address, and I figured it could be the cause of my problem.
Am I missing something?

To Reproduce
Code to reproduce the behavior:

#!/usr/bin/env python
from lief import ELF
from pprint import pprint

binary = ELF.parse("./blinking_led.elf")

new_section = ELF.Section(".instr")
new_section.content = [0x90] * 0x100
new_section.size = 100

binary.add(new_section)
binary.write("./blinkv2.elf")

Expected behavior
A new ELF files, which slightly bigger than the original one (and not 380MB starting from 300kB)

Environment:

  • System and Version : Arch Linux (6.10.1)
  • Target format (PE, ELF, Mach-O): ELF for ARM architecture
  • LIEF commit version: 0.15.1
@Miniman2718
Copy link
Author

Miniman2718 commented Aug 5, 2024

Looking deeper into the library, I started compiling and debugging it to find out what really happens. Turns out that the PHDR is not found because in my ELFs there is not a specific segment with type PHDR. This means relocate_phdr_table_v3 is called and the binary size explodes. I have some question regarding the PHDR relocation

  1. Why is segment type PHDR used to find the PHDR instead of using the ELF header field e_phoff?
  2. Why are virtual_addresses used to compute the offset where the PHDR is relocated to?

@romainthomas
Copy link
Member

Hi @Miniman2718

Sorry for the late reply. Actually the increased size is intended and if you can find an alternative that do not require to expand the .bss section, I will be happy to merge your PR.

Why is segment type PHDR used to find the PHDR instead of using the ELF header field e_phoff?

Because the kernel/loader is doing this.

Why are virtual_addresses used to compute the offset where the PHDR is relocated to?

Because this is what matter for the loader.

I close the issue but feel free to re-open it if you find a better approach or have suggestions to fix it.

@Miniman2718
Copy link
Author

Hi @romainthomas, thanks for the reply. However some points are still unclear to me.

Because the kernel/loader is doing this.

Which kernel/loader are you referring to? I'm working in ARM embedded and I have no problem in loading and executing ELF files with no PHDR segment. I also used the readelf command (x86 architecture), and PHDR is found even if there is no such segment type. I also found online some resources explaining that PHDR is not required, but they didn't seemed very reliable and they had no specification on the used loader version.

Because this is what matter for the loader.

I cannot really understand what you mean by this. My question is about the relationship between virtual addresses and file offset.

const uint64_t last_offset = virtual_size();
virtual_size() is used to compute a file offset, but I can't wrap my head around it. Given the specific nature of the ELF files I'm working with, this causes the last_offset to be unreasonably big. I'm now working on a solution using segment->file_offset() instead of segment->virtual_size() to keep the size of the binary small even if virtual addresses are very big but I'm afraid I'm missing the point of the current approach.

@romainthomas
Copy link
Member

Hi @Miniman2718

Which kernel/loader are you referring to?

Linux loader/kernel and yes I agree that it can works without a PHDR segment (in the case of static binary).

My question is about the relationship between virtual addresses and file offset.

Actually the ELF loader requires that va % pagesize() == offset this is a strong requirement that needs to be considered when relocating the segments.

Could you attach the binary you are trying to modify?

@Miniman2718
Copy link
Author

Hi @romainthomas

Actually the ELF loader requires that va % pagesize() == offset

I'm not sure I understood what you meant here. Quoting the documentation referred to from the LSB,

loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size.

which is different from what I think va % pagesize() == offset means. Could it be possible that you chose to use

const uint64_t last_offset = virtual_size();

because it was the easiest version to enforce the congruence?
I wrote a new function to find the first free offset for which the congruence still holds instead of using virtual addresses, and it is working fine for me while keeping the size of the binary pretty small.
May I also ask why PHDR is relocated at every new added segment, even if I open an already edited file? Is it done do avoid resizing problem which could exceed previously reserved space if too many segments are added?

Here is the file I'm trying to modify: elf_file.tar.gz

@romainthomas
Copy link
Member

which is different from what I think va % pagesize() == offset means. Could it be possible that you chose to use

Yes you are right, I missed the modulus on the right hand side.

I took a look at your branch and your update breaks this binary
docker-init.elf.zip

To observe the issue (which can run on a regular Linux x86-64)

import lief
elf = lief.ELF.parse("docker-init.elf")

segment = lief.ELF.Segment()
segment.type = lief.ELF.Segment.TYPE.LOAD
segment.content = [1] * 12
elf.add(segment)

elf.write("/tmp/out.bin")

With the current implementation in LIEF /tmp/out.bin is running but with your updates, /tmp/out.bin segfaults.

@Miniman2718
Copy link
Author

Miniman2718 commented Aug 26, 2024

Hello @romainthomas,
yes, the error is caused by

phdr_load_segment->virtual_address(imagebase() + phdr_reloc_info_.new_offset);

The newly computed offset is used to compute the new virtual address, and since I changed const uint64_t last_offset = virtual_size();, it isn't safe to use anymore. The new address is in conflict with another LOAD segment.
I would guess that the same problem can be found wherever va % pagesize() == offset has been used instead of the modulo congruence, but I didn't have the chance to take a look to other parts of the library.

I fixed it by changing the line in phdr_load_segment->virtual_address(imagebase() + virtual_size()); but this is just a patch and I'm not sure it actually solves the problem even if it is working on your elf.

@romainthomas
Copy link
Member

Indeed it works and it passes the test suite. I re-open the issue and I'll take a closer look a these updates.

Thank you pushing your idea even though I was skeptical :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants