Introduce new relocation for landing pad #452

Open
kito-cheng wants to merge 8 commits into base: complex-label-lp

kito-cheng
Collaborator

NOTE: This is based on #434

The R_RISCV_LPAD relocation can be used for PLT entry generation and also for
linker relaxation. Additionally, we defined a new mapping symbol type to help
users understand the function signature for the corresponding function.

The addend value is the label value, and it will point to the mapping symbol
placed at the beginning of the function.

e.g.
```asm
foo:         # void foo(void)
$sFvvE:
    lpad 123 # R_RISCV_LPAD $sFvvE + 123
```

We propose two linker relaxations for the landing pad. The first is removing
the entire landing pad, which can be used when symbols have local visibility,
and the address is not taken by any other reference. The second is a landing
pad scheme conversion, designed for backward compatibility (or as a workaround)
for legacy programs that may use functions without declarations.
@kito-cheng
Collaborator Author

cc @deepak0414 @ved-rivos @mylai-mtk

@aswaterman (Contributor) left a comment

I think this reloc also provides enough information for the linker to fix up direct jumps so that they skip over landing pads that were not relaxed away.

Suppose I have `call foo`, and the linker sees that the target of that call still has an R_RISCV_LPAD reloc (i.e. it was not relaxed). The linker should be able to adjust `call foo` to `call foo+4`. This should work whether the target is the PLT or the real function.
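
To illustrate the fix-up described in this comment, here is a minimal C sketch. `direct_call_destination` and `LPAD_INSN_SIZE` are hypothetical names and do not come from any linker; the only facts taken from the comment are the `+4` adjustment and the condition that the target still carries an R_RISCV_LPAD relocation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of one lpad instruction (a 32-bit encoding). */
#define LPAD_INSN_SIZE 4u

/* Hypothetical helper: pick the destination of a direct call.
 * target         : address of the callee (the real function or its PLT entry)
 * has_lpad_reloc : true if an R_RISCV_LPAD relocation still marks that
 *                  address, i.e. the landing pad was not relaxed away. */
static uint64_t direct_call_destination(uint64_t target, bool has_lpad_reloc)
{
    /* Skip the landing pad only when it is still present at the target. */
    return has_lpad_reloc ? target + LPAD_INSN_SIZE : target;
}
```

A real linker would apply this while resolving R_RISCV_CALL / R_RISCV_CALL_PLT, after deciding which landing pads survive relaxation.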

riscv-elf.adoc Outdated

Condition:: This relaxation can be performed without `R_RISCV_RELAX`, and
should not be enabled by default. The user must explicitly enable this
relaxation, and it should only be applied during static linking.

Q: what happens in case of dynamic linking ?

Collaborator Author

I gave it some more thought, and this can actually be applied beyond just static linking. However, dependent shared libraries won’t automatically convert along with it. I’ve removed that limitation and added a NOTE to explain the situation.

riscv-elf.adoc Outdated
Comment on lines 2333 to 2338
Condition:: The associated function of this lpad must have local visibility, and
it must not be referenced by any relocation other than `R_RISCV_CALL` and
`R_RISCV_CALL_PLT`.
This relaxation can also be performed when the function has global visibility,
if the symbol does not have a corresponding PLT entry and is not referenced by
the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`.
Collaborator

I think we can explain the behavior more clearly if we avoid mentioning symbol visibility. The only important thing here is whether or not a symbol is visible to other ELF modules, i.e. whether or not the symbol is in the dynamic symbol table. Symbol visibility is just one way to control it.

Collaborator Author

Good suggestion, applied 02546de :)

@@ -548,7 +548,9 @@ Description:: Additional information about the relocation
<| S - P
.2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only
<|
.2+| 66-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use
.2+| 66 .2+| LPAD .2+| Static | .2+| Annotates the landing pad instruction inserted at the beginning of the function. The addend indicates the label value of the landing pad, and the symbol value is the address of the mapping symbol for the function signature, which will have the same address as the function.
Collaborator

What is the "label value of the landing pad"?

riscv-elf.adoc Outdated
Comment on lines 2336 to 2338
This relaxation can also be performed when the function has global visibility,
if the symbol does not have a corresponding PLT entry and is not referenced by
the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`.
Collaborator

How should we find a function symbol for a given R_RISCV_LPAD relocation? Should we just look for a function symbol at the same location as the relocation refers to?

Collaborator Author

Right now, this design requires scanning through the symbol table and relocation table once to figure out which symbols have landing pads. I had previously thought about creating a new section to handle this, but I realized that when dealing with linker relaxation, the best way is still to use relocations to mark them. This approach also avoids introducing a new section format.

> Right now, this design requires scanning through the symbol table and relocation table once to figure out which symbols have landing pads.

Does this mean we should expect O(R + F) time and O(R) space, where R is the number of LPAD relocations and F is the number of global function symbols in the same section, to build a map from function symbols to lpad labels? This map is needed to synthesize the PLT.

Note: The complexities come from the following algorithm:

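HashMap<Address, LpadLabel> LpadRelocMap;
HashMap<FunctionSymbol, LpadLabel> LpadLabelMap;
// Pass 1: record the label (addend) of every LPAD relocation, keyed by address.
for (auto R : LpadRelocations)
  LpadRelocMap.insert(R.Address, R.Addend);

// Pass 2: a function symbol gets a label if an LPAD relocation sits at its address.
for (auto S : FunctionSymbols)
  if (S.Address in LpadRelocMap)
    LpadLabelMap.insert(S, LpadRelocMap[S.Address]);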
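
For reference, the same two-pass idea can be written as plain C. This is only a sketch under stated assumptions: `LpadReloc`, `FuncSym`, and `assign_lpad_labels` are hypothetical names, and it swaps the hash map for a sorted array plus binary search, so it costs O((R + F) log R) time rather than the O(R + F) of the pseudocode above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical view of one R_RISCV_LPAD relocation: r_offset and r_addend. */
typedef struct { uint64_t address; uint32_t label; } LpadReloc;

/* Hypothetical view of one function symbol and the label found for it. */
typedef struct { uint64_t address; uint32_t label; int has_label; } FuncSym;

static int cmp_reloc(const void *a, const void *b)
{
    uint64_t x = ((const LpadReloc *)a)->address;
    uint64_t y = ((const LpadReloc *)b)->address;
    return (x > y) - (x < y);
}

/* Pass 1: sort relocations by address.
 * Pass 2: look up each function symbol's address among them. */
static void assign_lpad_labels(LpadReloc *relocs, size_t nrelocs,
                               FuncSym *syms, size_t nsyms)
{
    qsort(relocs, nrelocs, sizeof *relocs, cmp_reloc);
    for (size_t i = 0; i < nsyms; i++) {
        LpadReloc key = { syms[i].address, 0 };
        const LpadReloc *hit =
            bsearch(&key, relocs, nrelocs, sizeof *relocs, cmp_reloc);
        syms[i].has_label = (hit != NULL);
        syms[i].label = hit ? hit->label : 0;
    }
}
```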

Description:: This relaxation type allows an `lpad` instruction to be relaxed
into `lpad 0`, which is a universal landing pad that ignores the label value
comparison. This relaxation is used when the label value is not computed
correctly.

What are the cases where a label may be computed incorrectly?

Collaborator Author

Some legacy programs don’t properly declare function prototypes before calling them. In these cases, the compiler will infer a function prototype based on the language standards, but it often ends up being incorrect. One common example is dhrystone[1]. In most versions you find online, Func_2 isn’t declared before it’s called, so the compiler will assume the prototype is int Func_2(char*, char*), but the correct prototype is actually void Func_2(char[31], char[31]).

[1] https://github.com/sifive/benchmark-dhrystone/blob/master/dhry_1.c#L164

Another common potential issue in C is with qsort. Function pointers can be compatible but not perfectly match the expected type. For example, here’s how qsort is declared:

void qsort(void* ptr, size_t count, size_t size, int (*comp)(const void*, const void*));

But in practice, you can pass in a compatible, but not exactly matching, type for the comparison function, and it works in most cases:

#include <stdlib.h>

int compare(int *a, int *b)  // The signature isn’t int (*)(const void*, const void*)
{
    return *(int *)a - *(int *)b;
}

void foo(int *x, size_t count, size_t size)
{
    qsort(x, count, size, compare);  // But in practice, this works fine
}

But how is the linker expected to know the incorrectness so it can perform this relaxation?

The Zicfilp mechanism is employed when issuing an indirect call through function pointers, and when calling functions through PLT:

In the first case (indirect calls through pointers), to know that an lpad insn needs to be relaxed to lpad 0 due to the original label being incorrect, the linker would need to know where the pointer points to, so the caller's label (the "correct" one) can be checked against the callee's label (the lpad insn). I'm not sure if this is the scenario you're targeting, but if it is, I think this (knowing where the pointer points to, or knowing where the call would come from) is an expectation too high for linkers. Besides, in this scenario, I would also wonder how linkers are expected to retrieve the caller's label (the "correct" one) so it can make the comparison?

In the second case (calls through PLT), the indirect call happens in the PLT, which is generated by linkers. The label which linkers use to generate PLT would come from the addend of the LPAD relocation, which should contain the same label as the one in its referenced lpad insn, so there would be no chance of mismatch and thus incorrectness identifiable by linkers.

The above is my guess and understanding of the intended usage of this relaxation. If we're not on the same page, please do let me know.

Collaborator Author

The linker never knows (or at least does not always know), and it is also not the right layer to analyze (or guess :P) this, so I expect this relaxation to be enabled only when the user passes something like `-z force-simple-landing-pad-scheme` to the linker.
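
As a rough illustration of what the scheme conversion does to the instruction word, here is a C sketch. It relies on the Zicfilp encoding of `lpad` as AUIPC with `rd = x0` and the 20-bit label in bits 31:12; `is_lpad` and `relax_to_lpad0` are hypothetical helper names, and when to call them (for example, only under a user-supplied option) is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define OPCODE_AUIPC 0x17u

/* lpad is AUIPC with rd = x0: bits 6:0 are the opcode, bits 11:7 are rd. */
static bool is_lpad(uint32_t insn)
{
    return (insn & 0xfffu) == OPCODE_AUIPC;
}

/* Clear the 20-bit label (bits 31:12), turning `lpad N` into `lpad 0`.
 * Anything that is not an lpad is returned unchanged. */
static uint32_t relax_to_lpad0(uint32_t insn)
{
    return is_lpad(insn) ? (insn & 0xfffu) : insn;
}
```

For example, `lpad 123` is `(123 << 12) | 0x17`, and the helper maps it to `0x17`.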

@@ -1582,6 +1584,7 @@ A number of symbols, named mapping symbols, describe the boundaries.
| $x.<any>
| $x<ISA> .2+| Start of a sequence of instructions with <ISA> extension.
| $x<ISA>.<any>
| $s<function-signature-string> | Marker for the landing pad instruction. This should only be used with the function signature-based scheme and should be placed only at the beginning of the function.
@mylai-mtk Oct 17, 2024

I don't quite get the purpose of this mapping symbol: it looks like the only reference to these symbols comes from the LPAD relocation, but for what? The LPAD relocation already has the 20-bit label stored in its addend, and its link to this `$s` mapping symbol provides only the additional information of the signature string, which is not needed (as of now) to link the object files.

Collaborator Author

riscv/riscv-cfi#151 (comment)

It's mostly for debugging purposes only, so it is safe to strip, like all other mapping symbols.

@mylai-mtk Oct 22, 2024

If the purpose is to display function signatures when disassembling, this mechanism seems a bit incomplete (?) I suppose since the relocation is a static one, it would not stay in the binary after static linking, thus if a user disassembles a linked ELF, it's still the label numbers instead of signatures that get displayed?

Update: Assuming it's relying on the mapping symbol having the same address as the lpad insn to associate an lpad insn to a function signature (so that the signature can be displayed when disassembling a linked binary), why do relocations refer to these symbols?

If the purpose of the mapping symbol is to provide debug/disassembling info, I think after the introduction of the .riscv.lpadinfo section, this purpose can be better served by the new section, since it contains the same information, and has a more size-compact format than the symbol table.

@mylai-mtk

Q: Does the LPAD relocation and/or the $s<func-sig> mapping symbol serve the purpose of providing labels when generating PLT? If so, how is it expected to extract the label for a given function symbol from these new entities?

@rui314
Collaborator

rui314 commented Oct 18, 2024

I don't think I fully understand this proposal, but from the linker's perspective, I believe we just need the following:

  • A R_RISCV_LPAD relocation, whose r_offset is at each removable lpad instruction location, and r_sym refers to a function symbol
  • If the function's address was not taken (i.e. the function symbol was not referenced by anything but R_RISCV_CALL or R_RISCV_CALL_PLT) and the function symbol is not in the dynamic symbol table, the linker removes the lpad instruction.

I don't think we need a mapping symbol for the linker to remove a lpad instruction.
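
The removal condition in the two bullets above can be sketched in C as follows. `can_remove_lpad` is a hypothetical helper; it assumes the caller passes the types of every relocation that references the function symbol, excluding the R_RISCV_LPAD relocation itself, and already knows whether the symbol is in the dynamic symbol table. The values 18 and 19 are the psABI numbers for R_RISCV_CALL and R_RISCV_CALL_PLT.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { R_RISCV_CALL = 18, R_RISCV_CALL_PLT = 19 };

/* ref_types : types of relocations that reference the function symbol
 *             (assumed to exclude the R_RISCV_LPAD marker itself)
 * in_dynsym : true if the symbol is in the dynamic symbol table */
static bool can_remove_lpad(const uint32_t *ref_types, size_t nrefs,
                            bool in_dynsym)
{
    if (in_dynsym)
        return false;  /* other ELF modules may call it indirectly */
    for (size_t i = 0; i < nrefs; i++) {
        if (ref_types[i] != R_RISCV_CALL && ref_types[i] != R_RISCV_CALL_PLT)
            return false;  /* any other reference may take the address */
    }
    return true;
}
```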

Rephrase to make it clear that the relaxation can remove the instruction.
- Drop the restriction to static linking
- Emphasize that it must be applied to all `R_RISCV_LPAD`
- GNU property and PLT entries must be adjusted too.
… table

- Updated the relaxation condition to apply only when the symbol is not
  exported to the dynamic symbol table.
@kito-cheng
Collaborator Author

@rui314

> I don't think I fully understand this proposal, but from the linker's perspective, I believe we just need the following:
>
> • A R_RISCV_LPAD relocation, whose r_offset is at each removable lpad instruction location, and r_sym refers to a function symbol
> • If the function's address was not taken (i.e. the function symbol was not referenced by anything but R_RISCV_CALL or R_RISCV_CALL_PLT) and the function symbol is not in the dynamic symbol table, the linker removes the lpad instruction.
>
> I don't think we need a mapping symbol for the linker to remove a lpad instruction.

The design of the mapping symbol is meant to make debugging easier for users, so it’s actually optional. This was discussed in the CFI spec issue [1]. If we set aside that purpose, I also think it’s a better approach to have r_sym point to the function symbol.

As for using the addend to record the landing pad value, it helps with PLT generation. The linker can get the value by scanning the instruction at that address, but encoding it directly in the addend avoids needing to read the instruction during relocation.
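
To make the trade-off concrete, here is a small C sketch of the alternative to the addend: decoding the label from the instruction bytes. It assumes a little-endian instruction stream and the Zicfilp layout with the label in bits 31:12; `read_insn_le`, `lpad_label_from_insn`, and `addend_matches_insn` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble a 32-bit instruction word from four little-endian bytes. */
static uint32_t read_insn_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The lpad label occupies bits 31:12 of the instruction word. */
static uint32_t lpad_label_from_insn(uint32_t insn)
{
    return insn >> 12;
}

/* Consistency check: does the relocation addend equal the encoded label? */
static bool addend_matches_insn(const unsigned char *p, int64_t addend)
{
    if (addend < 0 || addend > 0xfffff)
        return false;  /* labels are 20 bits wide */
    return lpad_label_from_insn(read_insn_le(p)) == (uint32_t)addend;
}
```

With the label in the addend, the linker can skip `read_insn_le` entirely; a check like `addend_matches_insn` would only be useful as a sanity test.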

[1] riscv/riscv-cfi#151 (comment)

@@ -548,7 +548,9 @@ Description:: Additional information about the relocation
<| S - P
.2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only
<|
.2+| 66-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use
.2+| 66 .2+| LPAD .2+| Static | .2+| Annotates the landing pad instruction inserted at the beginning of the function. The addend indicates the label value of the landing pad, and the symbol value is the address of the mapping symbol for the function signature, which will have the same address as the function.

Is this relocation only for the func-sig scheme? Based on its description, it looks like so, but the following LPAD relaxation that removes the lpad insn seems also applicable to the unlabeled scheme.

Collaborator Author

That should work for the unlabeled scheme as well; let me think about how to make that clear.

@mylai-mtk

mylai-mtk commented Oct 23, 2024

@kito-cheng

> As for using the addend to record the landing pad value, it helps with PLT generation. The linker can get the value by scanning the instruction at that address, but encoding it directly in the addend avoids needing to read the instruction during relocation.

I think this doesn't provide the lpad label we need for PLT generation? For a call target to be generated in the PLT, the target should reside in a shared library or somewhere that the linker cannot determine during static-time linking. When the target resides in a shared library, it does not have the LPAD relocation associated with it, since the relocation is a static relocation and should be consumed by the static linker that created the shared library?

@kito-cheng
Collaborator Author

@mylai-mtk

> I think this doesn't provide the lpad label we need for PLT generation? For a call target to be generated in the PLT, the target should reside in a shared library or somewhere that the linker cannot determine during static-time linking. When the target resides in a shared library, it does not have the LPAD relocation associated with it, since the relocation is a static relocation and should be consumed by the static linker that created the shared library?

Hmmmmmmmmmmmmmm, yeah, do you have any better idea than creating a new section to record the label value for those undefined symbols?

@mylai-mtk

@kito-cheng

> Hmmmmmmmmmmmmmm, yeah, do you have any better idea than creating a new section to record the label value for those undefined symbols?

Nope. After skimming through existing sections, I don't have ideas better than creating a new section.

My previous prototype of generating extra symbols for every function may work, but there are too many drawbacks to this approach, including a bloated symbol table and slower symbol resolution, so I would not seriously recommend it.

@kito-cheng
Collaborator Author

@mylai-mtk I added a new section, and also a new asm directive: riscv-non-isa/riscv-asm-manual#113

Comment on lines +1596 to +1597
Elf64_Word lpi_name; /* Symbol name (string tbl index) */
Elf64_Word lpi_sig; /* Signature for the symbol (string tbl index) */

I think we can use uint32_t here instead of the less clear ElfN_Word? It looks like other string table references also use uint32_t, e.g. st_name in struct Elf64_Sym.

Collaborator Author

It is uint32_t, but I'm using the ElfNN_* types for consistency with other ELF structures:

/* Types for signed and unsigned 32-bit quantities.  */
typedef uint32_t Elf32_Word;
typedef int32_t  Elf32_Sword;
typedef uint32_t Elf64_Word;
typedef int32_t  Elf64_Sword;

{
Elf64_Word lpi_name; /* Symbol name (string tbl index) */
Elf64_Word lpi_sig; /* Signature for the symbol (string tbl index) */
Elf64_Word lpi_value; /* Landing pad value for the symbol */

Lpad labels are only 20-bit wide, so we can use uint32_t. Same for the 32-bit version above.

Elf32_Word lpi_name; /* Symbol name (string tbl index) */
Elf32_Word lpi_sig; /* Signature for the symbol (string tbl index) */
Elf32_Word lpi_value; /* Landing pad value for the symbol */
} Elf32_Lpadinfo;

Elf32_LpadInfo? (Big I instead of small i) (Same for 64-bit version below)

Collaborator Author

That follows the naming convention used elsewhere in ELF, e.g. Elf64_Verdef, Elf64_Verdaux, Elf64_Verneed, Elf64_Nhdr...
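
For readers following the struct discussion, this is how a consumer might use the entry layout quoted in this thread. It is a sketch only: the three fields come from the quoted diff, while `find_lpadinfo` is a hypothetical helper that relies on the proposal's rule that `lpi_name` equals the symbol's name index in the string table.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Elf32_Word;

typedef struct {
    Elf32_Word lpi_name;   /* Symbol name (string tbl index) */
    Elf32_Word lpi_sig;    /* Signature for the symbol (string tbl index) */
    Elf32_Word lpi_value;  /* Landing pad value for the symbol */
} Elf32_Lpadinfo;

/* Linear search for the entry whose lpi_name matches a symbol's st_name.
 * Returns NULL when the symbol has no landing pad info entry. */
static const Elf32_Lpadinfo *find_lpadinfo(const Elf32_Lpadinfo *tab,
                                           size_t n, Elf32_Word st_name)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].lpi_name == st_name)
            return &tab[i];
    }
    return NULL;
}
```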

NOTE: Using same encoding as mapping symbol aims to reduce the size of the
string table

Every symbol with global or weak bind must has a corresponding entry in this

I think you mean "Every function symbol with global or weak"?

Collaborator Author

Oh yeah, it should be restricted to symbols of function type.

@mylai-mtk Nov 22, 2024

Also, we should restrict this to global or pointer-taken symbols, i.e. function symbols with an lpad instruction after relaxation

section, the `lpi_name` field must be the same as the symbol name string table
index.

This section can be discard after static linking stage.

Since you mentioned that "Every symbol with global or weak bind must has a corresponding entry", I think it implies that the lpad labels are provided by the object file that defines the function, right? If this is the case, we can't discard this section after static linking when creating a shared library, since library users would expect to find lpad labels later when linking against this shared library.

Collaborator Author

We can still know the signature/landing pad label value when we reference a symbol that is not yet defined, because the prototype must always be declared in the source code.

"Every symbol with global or weak bind must has a corresponding entry": this does not exclude undefined symbols, so we can link against the shared library without that info.

@mylai-mtk Nov 22, 2024

If labels can be provided by the object that uses but doesn't define the function, why require labels to be there in the defining object ("Every symbol with global or weak bind must has a corresponding entry")? For the sake of checking if the use-site and define-site agree on the same lpad label?

Comment on lines +1619 to +1621
Static linker should emit error if objects with same symbol but different
landing pad value are beging merged, however it may suppress the error if
linker enable the landing pad schem relaxation.

Is it beneficial that we also implement this check in the dynamic linker for dynamic symbols? E.g. Abort the program if label mismatches are found at program startup or dlopen()
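
The merge rule quoted above (error on mismatch, optionally suppressed when scheme relaxation is enabled) could look like this in C. `LpadMergeResult` and `check_lpad_merge` are hypothetical names; the same check could in principle run in a static or a dynamic linker.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { LPAD_MERGE_OK, LPAD_MERGE_CONFLICT } LpadMergeResult;

/* Compare the landing pad values two objects record for the same symbol.
 * A mismatch is reported unless landing pad scheme relaxation is enabled. */
static LpadMergeResult check_lpad_merge(uint32_t value_a, uint32_t value_b,
                                        bool scheme_relaxation_enabled)
{
    if (value_a == value_b || scheme_relaxation_enabled)
        return LPAD_MERGE_OK;
    return LPAD_MERGE_CONFLICT;
}
```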

Comment on lines +1607 to +1611
The string hold by `lpi_signature` field is the function signature string, which
is encoded as same as the mapping symbol of the function signature.

NOTE: Using same encoding as mapping symbol aims to reduce the size of the
string table

This means the signatures stored start with $ and is in the format of $x<function-signature-string>? (Note the additional x there)

If the goal is to save bytes in the string table, I think we can always use Symbol($x<function-signature-string>).st_name + 2 to specify the signature string to keep the referred signature precise without the $x prefix?
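
A minimal C sketch of the `st_name + 2` idea: point into the middle of the mapping symbol's name so the two-character prefix is skipped and the string table entry is shared. `signature_from_mapping_symbol` is a hypothetical helper, and the example string `$sFvvE` is the one used in the PR description.

```c
#include <stddef.h>
#include <stdint.h>

/* Given a string table and the st_name of a mapping symbol such as
 * "$sFvvE", return a pointer to the bare signature ("FvvE").
 * Returns NULL if the name does not start with a two-character
 * '$'-prefixed marker. */
static const char *signature_from_mapping_symbol(const char *strtab,
                                                 uint32_t st_name)
{
    const char *name = strtab + st_name;
    if (name[0] != '$' || name[1] == '\0')
        return NULL;
    return name + 2;  /* same bytes, no second copy in the table */
}
```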

@@ -548,7 +548,9 @@ Description:: Additional information about the relocation
<| S - P
.2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only
<|
.2+| 66-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use
.2+| 66 .2+| LPAD .2+| Static | .2+| Annotates the landing pad instruction inserted at the beginning of the function. The addend indicates the label value of the landing pad, and the symbol value is the address of the mapping symbol for the function signature, which will have the same address as the function.
@mylai-mtk Nov 22, 2024

Does this LPAD relocation intend to serve the "real" purpose of a relocation? That is, ask linkers to fill-in some value (in this case, the label of the lpad instructions) to some offset at link time.

When prototyping this relocation in LLVM, it appears to me that the LLVM backend assumes places to be relocated have a placeholder value 0 encoded, so to emit the LPAD relocations, 0s would have to be encoded at the label locations in relocatable files. This can of course be changed to encode the correct label along with relocation emitted, but I want to ask if this change is needed, or I can rely on linkers to fill-in the correct labels when relocating at static time.
