Output executable crashes with relocation-model=static #62017

Open
Krysme opened this issue Jun 21, 2019 · 10 comments
Labels
C-bug Category: This is a bug. O-linux Operating system: Linux T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Krysme

Krysme commented Jun 21, 2019

The following code reproducibly crashes when compiled with "relocation-model=static" on Linux x86-64.

#![feature(nll)]

fn main() {
    fn divide(numerator: f64, denominator: f64) -> Option<f64> {
        if denominator == 0.0 {
            None
        } else {
            Some(numerator / denominator)
        }
    }
    let result = divide(2.0, 3.0);

    match result {
        Some(x) => println!("Result: {}", x),
        None => println!("Cannot Divide by 0"),
    }
}
@tesuji

This comment has been minimized.

@mati865
Contributor

mati865 commented Jun 21, 2019

@lzutao you have to use relocation-model=static as mentioned in the issue.

LLDB backtrace
(lldb) r
Process 17646 launched: '/tmp/foo/t' (x86_64)
Process 17646 stopped
* thread #1, name = 't', stop reason = signal SIGSEGV: invalid address (fault address: 0x36a75)
    frame #0: 0x00000000004022ef t`t::main::h30886ce2ae13d6f9 + 127
t`t::main::h30886ce2ae13d6f9:
->  0x4022ef <+127>: movq   0x36a75, %rsi
    0x4022f7 <+135>: movsd  0x30(%rsp), %xmm0         ; xmm0 = mem[0],zero
    0x4022fd <+141>: movsd  %xmm0, 0x38(%rsp)
    0x402303 <+147>: leaq   0x38(%rsp), %rax
(lldb) bt
* thread #1, name = 't', stop reason = signal SIGSEGV: invalid address (fault address: 0x36a75)
  * frame #0: 0x00000000004022ef t`t::main::h30886ce2ae13d6f9 + 127
    frame #1: 0x0000000000402583 t`std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h819e5a92dacb5f12 + 3
    frame #2: 0x000000000040ad13 t`std::panicking::try::do_call::h37d577b51b92ce4b [inlined] std::rt::lang_start_internal::_$u7b$$u7b$closure$u7d$$u7d$::hc50706b6e8c189f3 at rt.rs:49:12
    frame #3: 0x000000000040ad07 t`std::panicking::try::do_call::h37d577b51b92ce4b at panicking.rs:294
    frame #4: 0x000000000040c76a t`__rust_maybe_catch_panic at lib.rs:82:7
    frame #5: 0x000000000040b7cd t`std::rt::lang_start_internal::h17a202bf41caec69 [inlined] std::panicking::try::hd2f41f2af573d056 at panicking.rs:273:12
    frame #6: 0x000000000040b78f t`std::rt::lang_start_internal::h17a202bf41caec69 [inlined] std::panic::catch_unwind::h1d1e6c4c3c3f0d9c at panic.rs:388
    frame #7: 0x000000000040b78f t`std::rt::lang_start_internal::h17a202bf41caec69 at rt.rs:48
    frame #8: 0x0000000000402568 t`std::rt::lang_start::h3d4b63434ee976c6 + 56
    frame #9: 0x000000000040240e t`main + 30
    frame #10: 0x00007ffff7d93b6b libc.so.6`__libc_start_main + 235
    frame #11: 0x000000000040216a t`_start + 42

@rustbot modify labels: +C-bug +O-linux

@rustbot rustbot added C-bug Category: This is a bug. O-linux Operating system: Linux labels Jun 21, 2019
@jonas-schievink jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 21, 2019
@nagisa
Member

nagisa commented Jun 21, 2019

Likely cause is that PIC libstd and libcore are used by the reporter – this results in assembly such as this:

_ZN4test4main17h21e0b94bad749a9eE:
  subq  $104, %rsp
  movq	_ZN4core3fmt5float52_$LT$impl$u20$core..fmt..Display$u20$for$u20$f64$GT$3fmt17he52c1166ee322c16E@GOTPCREL, %rsi
  <snip>

Which, after assembling, results in machine code like this (note the null pointer dereference):

0000000000000000 <_ZN4test4main17h21e0b94bad749a9eE>:
   0:	48 83 ec 68          	sub    $0x68,%rsp
   4:	48 8b 34 25 00 00 00 	mov    0x0,%rsi
   b:	00 
   <snip>

which the linker then relocates to the following, hiding the obvious null dereference:

Dump of assembler code for function _ZN4test4main17h21e0b94bad749a9eE:
   0x00000000004022f0 <+0>:	sub    $0x68,%rsp
   0x00000000004022f4 <+4>:	mov    0x32900,%rsi
   <snip>

From what I can tell this is very likely to be user error.

@Krysme
Author

Krysme commented Jun 24, 2019

@nagisa so, is there any way I can get a version of libstd and libcore built with relocation-model=static, or do I have to recompile them from scratch? A link to the documentation would be helpful, thanks in advance.

@nagisa
Member

nagisa commented Jun 24, 2019

You can use xargo or whatever the current maintained version of that thing is to build libstd and libcore. cargo currently does not support that.

@jsgf
Contributor

jsgf commented Aug 15, 2019

I can reproduce this easily, but only with non-optimized builds. With any level of optimization this goes away. For example, this input:

fn main() {
  let _foo = format!("{}", 4);
}

will produce a crashing executable if compiled with rustc -Crelocation-model=static --emit=asm,link --crate-type=bin reloc.rs, as it has the bogus

movq    _ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17hd0b7b005436cd7b8E@GOTPCREL, %rsi

non-rip-relative GOTPCREL relocation.

But if I add -Copt-level=1, it generates a proper rip-relative reference:

movq    _ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17hd0b7b005436cd7b8E@GOTPCREL(%rip), %rax

So I think there's more to it than just having static vs pic libstd.
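To make the difference between the two forms concrete, here is a minimal sketch of how the encodings can be told apart from the instruction bytes. It only handles the REX.W 8B /r form of mov seen in this thread. The names `Addressing`, `read_disp32` and `classify_mov_load` are made up for illustration and do not come from any tool. The first byte sequence in `main` is the one from the objdump output earlier in the thread; the second is a hand-written RIP-relative encoding with an arbitrary displacement.

```rust
/// How a `mov r64, m64` (REX.W 8B /r) instruction addresses its memory operand.
#[derive(Debug, PartialEq)]
enum Addressing {
    /// disp32 with no base register: an absolute (sign-extended) address.
    Absolute(i32),
    /// disp32 added to the address of the next instruction.
    RipRelative(i32),
    /// Anything this sketch does not decode.
    Other,
}

/// Read four little-endian displacement bytes starting at `at`, if present.
fn read_disp32(bytes: &[u8], at: usize) -> Option<[u8; 4]> {
    let s = bytes.get(at..at + 4)?;
    Some([s[0], s[1], s[2], s[3]])
}

fn classify_mov_load(bytes: &[u8]) -> Addressing {
    // Only accept REX.W (0x48) followed by opcode 0x8B.
    if bytes.len() < 3 || bytes[0] != 0x48 || bytes[1] != 0x8b {
        return Addressing::Other;
    }
    let modrm = bytes[2];
    let (mode, rm) = (modrm >> 6, modrm & 0b111);
    match (mode, rm) {
        // mod=00, rm=101: RIP + disp32 in 64-bit mode.
        (0b00, 0b101) => match read_disp32(bytes, 3) {
            Some(d) => Addressing::RipRelative(i32::from_le_bytes(d)),
            None => Addressing::Other,
        },
        // mod=00, rm=100: a SIB byte follows.
        (0b00, 0b100) => {
            let sib = match bytes.get(3) {
                Some(b) => *b,
                None => return Addressing::Other,
            };
            let (index, base) = ((sib >> 3) & 0b111, sib & 0b111);
            // index=100 (none) and base=101 with mod=00: bare disp32.
            if index == 0b100 && base == 0b101 {
                match read_disp32(bytes, 4) {
                    Some(d) => Addressing::Absolute(i32::from_le_bytes(d)),
                    None => Addressing::Other,
                }
            } else {
                Addressing::Other
            }
        }
        _ => Addressing::Other,
    }
}

fn main() {
    // Bytes from the objdump listing above: mov 0x0,%rsi
    let absolute: [u8; 8] = [0x48, 0x8b, 0x34, 0x25, 0x00, 0x00, 0x00, 0x00];
    // Hand-written example: mov 0x10(%rip),%rsi
    let rip_rel: [u8; 7] = [0x48, 0x8b, 0x35, 0x10, 0x00, 0x00, 0x00];
    println!("{:?}", classify_mov_load(&absolute));
    println!("{:?}", classify_mov_load(&rip_rel));
}
```

The broken form uses a SIB byte with no base and no index, so the 32-bit field is taken as the address itself. With a GOTPCREL relocation that field ends up holding a small PC-relative offset, which would explain the low fault address in the backtrace.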

@jesboat

jesboat commented Aug 16, 2019

I think this might be in LLVM. I made a self-contained example (i.e. one that doesn't pull in any library stuff like format!), the tl;dr of which is "call a function where an argument is a pointer to a function in another crate":

extern crate a;
  
pub fn takep(arg: fn() -> ()) -> () {}  

pub fn example1() {
  takep(a::voidfn);
} 

The last LLIR before the instruction selection phases is

define void @_ZN4main8example117h149bc290ecddb5b8E() unnamed_addr #0 {
start:
  call void @_ZN4main5takep17h851b97fbf6407c1bE(void ()* nonnull @_ZN1a6voidfn17h7d790e74ed3ff940E)
  br label %bb1

bb1:                                              ; preds = %start
  ret void
}

and immediately after the instruction selection phases, the call is lowered to:

  %0:gr64 = MOV64rm $noreg, 1, $noreg, target-flags(x86-gotpcrel) @_ZN1a6voidfn17h7d790e74ed3ff940E, $noreg
  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def $rsp, implicit-def $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp
  $rdi = COPY %0:gr64  
  CALL64pcrel32 @_ZN4main5takep17h851b97fbf6407c1bE, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rdi
  ADJCALLSTACKUP64 0, 0, implicit-def $rsp, implicit-def $eflags, implicit-def $ssp, implicit $rsp, implicit $ssp

Note how the MOV64rm has $noreg as its base register.

In contrast, when the example is modified slightly, so that it would need to unwind if takep throws:

pub fn example2() {
  let _v = Droppable{};
  takep(a::voidfn);
}

The LLIR right before ISel (excluding basic blocks which are only reached when unwinding) is

define void @_ZN4main8example217h4f443cbc5e17bbddE() unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %personalityslot = alloca { i8*, i32 }, align 8
  %_v = alloca %Droppable, align 1
  invoke void @_ZN4main5takep17h851b97fbf6407c1bE(void ()* nonnull @_ZN1a6voidfn17h7d790e74ed3ff940E)
          to label %bb2 unwind label %cleanup

bb2:                                              ; preds = %start
  call void @_ZN4core3ptr18real_drop_in_place17h6782efda6a962288E(%Droppable* nonnull align 1 %_v)
  br label %bb4

bb4:                                              ; preds = %bb2
  ret void

The LLVM language reference says this about invoke:

This instruction is designed to operate as a standard ‘call’ instruction in most regards. The primary difference is that it establishes an association with a label, which is used by the runtime library to unwind the stack.

However, here, the lowering is

  EH_LABEL <mcsymbol .Ltmp0>
  ADJCALLSTACKDOWN64 0, 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
  %0:gr64 = MOV64rm $rip, 1, $noreg, target-flags(x86-gotpcrel) @_ZN1a6voidfn17h7d790e74ed3ff940E, $noreg :: (load 8 from got)
  $rdi = COPY %0:gr64
  CALL64pcrel32 @_ZN4main5takep17h851b97fbf6407c1bE, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $rdi, implicit-def $rsp, implicit-def $ssp
  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
  EH_LABEL <mcsymbol .Ltmp1>
  JMP_1 %bb.2

where the MOV64rm has $rip as its base register.

A not-particularly-educated guess would be that FastISel has a fast path for call which has a bug, and that bug is not present in the slow path for invoke.

The full example, including source and unabridged LLIR and MIR is https://gist.github.com/jesboat/a88aa28f0a80c876872425901bdd9279
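For anyone who wants to try the same pattern without setting up two crates, a single-file variant might look like the sketch below. This is an assumption and not taken from the gist: it substitutes `std::process::id` (a non-generic function that lives in the precompiled libstd) for `a::voidfn`, and `takep` calls its argument so the result can be checked. Whether this variant triggers the bad load under relocation-model=static at opt-level 0 has not been confirmed.

```rust
// Hypothetical single-file variant of example1 from the comment above.
// `std::process::id` stands in for `a::voidfn`; this substitution is an
// assumption made for illustration.

/// Takes a function pointer and calls it.
fn takep(arg: fn() -> u32) -> u32 {
    arg()
}

/// Passes a pointer to a function defined in another crate (std).
fn example1() -> u32 {
    takep(std::process::id)
}

fn main() {
    println!("pid via fn pointer: {}", example1());
}
```

In a normal build this prints the current process ID. The interesting part is the load of the function's address in `example1`, which is the instruction the MIR dumps above are about.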

@jesboat

jesboat commented Aug 17, 2019

I am reasonably confident at this point that this is a bug in LLVM's FastISel. This LLIR

define void @_ZN4main8example117h149bc290ecddb5b8E() unnamed_addr #0 {
start:
  %ptr = alloca void ()*
  store void ()* @_ZN1a6voidfn17h7d790e74ed3ff940E, void ()** %ptr
  ret void
}

is sufficient to produce a broken binary on trunk LLVM with llc -O0 --relocation-model=static. Adding -fast-isel=false to the options produces a non-broken load.

https://godbolt.org/z/Ga7aJg

@Krysme
Author

Krysme commented Oct 10, 2019

Thanks for all the replies, and I hope that LLVM developers will resolve this one as soon as possible.

@yshui
Contributor

yshui commented Feb 25, 2021

I just hit this problem. Did anyone report this bug to LLVM?
