[AVR] Wrap out-of-bounds relative jumps #118015

Patryk27 · 2024-11-28T15:20:26Z

This commit improves the relative jumps, so that we are able to emit rjmp that wraps around the memory boundary.

Motivation

Over Rahix/avr-hal#585 it was found out that the AVR codegen sometimes crashes with the out of range branch target message - this turned out to be a fallout from #106722 related to the fact that linkers are smarter than they seem!

The issue stems from the rjmp instruction - even though it can only encode offsets in the +-4 kB range, it can be used to jump into further memory addresses by relying on the fact that AVR memory wraps around the boundaries. That is, for a CPU with 8 kB of flash, rjmp 5000 (illegal) can be actually encoded as rjmp -3192 (legal).

Linkers apply this trick automatically, but since over #106722 we've decided to go with the let's avoid emitting relocations route...

We would've caught this, but since the branch-relaxation-long.ll test generated only the textual assembly, it never went near the assertion - I've changed the test to use -filetype=obj | llvm-objdump -dr now (which causes it to crash on the main branch, but succeeds with my changes here).

Patryk27 · 2024-11-28T15:20:36Z

cc @benshi001

llvmbot · 2024-11-28T15:21:02Z

@llvm/pr-subscribers-mc

Author: Patryk Wychowaniec (Patryk27)

Changes

This commit changes shouldForceRelocation() to always return true for indirect jumps, direct jumps and calls.

Motivation

Over Rahix/avr-hal#585 it was found out that the AVR codegen sometimes crashes with the out of range branch target message - this turned out to be a fallout from #106722 related to the fact that linkers are smarter than they seem!

The issue stems from the rjmp instruction - even though it can only encode offsets in the +-4 kB range, it can be used to jump into further memory addresses by relying on the fact that AVR memory wraps around the boundaries. That is, for a CPU with 8 kB of flash, rjmp 5000 (illegal) can be actually encoded as rjmp -3192 (legal).

Linkers apply this trick automatically, but since over #106722 we've decided to go with the let's avoid emitting relocations route...

We would've caught this, but since the branch-relaxation-long.ll test generated only the textual assembly, it never went near the assertion - I've changed the test to use -filetype=obj | llvm-objdump -dr now (which causes it to crash on the main branch, but succeeds with my changes here).

Alternatives

We could re-do the wrapping logic directly within the codegen - in fact, that's what I started with! I've taught AVRDevices.td about the flash sizes (relying on data from avr-gcc):

class Device&lt;string Name, Family Fam, ELFArch Arch, int FlashSizeArg,
             list&lt;SubtargetFeature&gt; ExtraFeatures = []&gt;
    : Processor&lt;Name, NoItineraries, !listconcat([Fam, Arch], ExtraFeatures)&gt; {
  int FlashSize = FlashSizeArg;
}

def : Device&lt;"avr1", FamilyAVR1, ELFArchAVR1, 0x400&gt;;
def : Device&lt;"avr2", FamilyAVR2, ELFArchAVR2, 0x60000&gt;;
def : Device&lt;"avr25", FamilyAVR25, ELFArchAVR25, 0x2000&gt;;
def : Device&lt;"avr3", FamilyAVR3, ELFArchAVR3, 0x6000&gt;;
/* ... */

... created a new TableGen emitter:

static void EmitAVRTargetDef(RecordKeeper &amp;RK, raw_ostream &amp;OS) {
  OS &lt;&lt; "// Autogenerated by AVRTargetDefEmitter.cpp\n\n";

  OS &lt;&lt; "namespace llvm {\n";
  OS &lt;&lt; "namespace AVR {\n";
  OS &lt;&lt; "\n";
  OS &lt;&lt; "static const std::map&lt;StringRef, uint32_t&gt; FlashSizes = {\n";

  for (const Record *Rec : RK.getAllDerivedDefinitions("Device")) {
    OS &lt;&lt; "  { \"" &lt;&lt; Rec-&gt;getValueAsString("Name") &lt;&lt; "\", "
       &lt;&lt; Rec-&gt;getValueAsInt("FlashSize") &lt;&lt; " },\n";
  }

  OS &lt;&lt; "};\n";
  OS &lt;&lt; "\n";
  OS &lt;&lt; "inline uint32_t getFlashSize(StringRef cpu) {\n";
  OS &lt;&lt; "  return FlashSizes.at(cpu);\n";
  OS &lt;&lt; "}\n";
  OS &lt;&lt; "\n";
  OS &lt;&lt; "} // end of namespace AVR\n";
  OS &lt;&lt; "} // end of namespace llvm\n";
}

static TableGen::Emitter::Opt X("gen-avr-target-def", EmitAVRTargetDef,
                                "Generate the list of CPUs for AVR");

... adjusted adjustRelativeBranch() to access this brand new AVR::getFlashSize():

/// Adjusts the value of a relative branch target before fixup application.
static void adjustRelativeBranch(unsigned Size, const MCFixup &amp;Fixup,
                                 uint64_t &amp;Value, MCContext *Ctx) {
  // Jumps are relative to the current instruction.
  Value -= 2;

  // We have one extra bit of precision because the value is rightshifted by
  // one.
  Size += 1;

  if (!isIntN(Size, Value)) {
    uint64_t FlashSize =
        AVR::getFlashSize(Ctx-&gt;getSubtargetInfo()-&gt;getCPU().data());

    uint64_t ReverseValue =
        Value &gt; 0 ? (uint64_t)((int32_t)Value - (int32_t)FlashSize)
                  : (uint64_t)((int32_t)FlashSize + (int32_t)Value);

    if (isIntN(Size, ReverseValue)) {
      Value = ReverseValue;
    }
  }

  ensureSignedWidth(Size, Value, std::string("branch target"), Fixup, Ctx);

  AVR::fixups::adjustBranchTarget(Value);
}

... and it worked!

I've eventually settled on relocations, because that's both the simpler approach and what we'd like to have anyway (to resemble avr-gcc, as I remember from discussions from the past).

Patch is 30.77 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118015.diff

26 Files Affected:

(modified) llvm/lib/Target/AVR/AVRInstrInfo.cpp (+14-8)
(modified) llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp (+26-129)
(modified) llvm/test/CodeGen/AVR/branch-relaxation-long.ll (+38-36)
(modified) llvm/test/CodeGen/AVR/jmp.ll (+3-2)
(modified) llvm/test/MC/AVR/inst-brbc.s (+8-6)
(modified) llvm/test/MC/AVR/inst-brbs.s (+8-6)
(modified) llvm/test/MC/AVR/inst-brcc.s (+8-4)
(modified) llvm/test/MC/AVR/inst-brcs.s (+8-4)
(modified) llvm/test/MC/AVR/inst-breq.s (+8-4)
(modified) llvm/test/MC/AVR/inst-brge.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brhc.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brhs.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brid.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brie.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brlo.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brlt.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brmi.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brne.s (+8-4)
(modified) llvm/test/MC/AVR/inst-brpl.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brsh.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brtc.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brts.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brvc.s (+6-3)
(modified) llvm/test/MC/AVR/inst-brvs.s (+6-3)
(modified) llvm/test/MC/AVR/inst-rcall.s (+9-5)
(modified) llvm/test/MC/AVR/inst-rjmp.s (+25-15)

diff --git a/llvm/lib/Target/AVR/AVRInstrInfo.cpp b/llvm/lib/Target/AVR/AVRInstrInfo.cpp
index 7d58ece95c869f..6f6da56367e0cb 100644
--- a/llvm/lib/Target/AVR/AVRInstrInfo.cpp
+++ b/llvm/lib/Target/AVR/AVRInstrInfo.cpp
@@ -557,17 +557,23 @@ void AVRInstrInfo::insertIndirectBranch(MachineBasicBlock &MBB,
                                         MachineBasicBlock &RestoreBB,
                                         const DebugLoc &DL, int64_t BrOffset,
                                         RegScavenger *RS) const {
-  // This method inserts a *direct* branch (JMP), despite its name.
-  // LLVM calls this method to fixup unconditional branches; it never calls
-  // insertBranch or some hypothetical "insertDirectBranch".
-  // See lib/CodeGen/RegisterRelaxation.cpp for details.
-  // We end up here when a jump is too long for a RJMP instruction.
+  // When an instruction's jump target is too far away¹, the branch relaxation
+  // pass might split an existing, large basic block into two pieces - if that
+  // happens, this function gets called to create a new jump between the blocks.
+  //
+  // Since this new jump might be too large for an `rjmp` anyway, let's
+  // conservatively emit `jmp` and back off to `rjmp` only if the target arch
+  // doesn't support `jmp`.
+  //
+  // TODO maybe we could use `BrOffset` to determine whether it's safe to emit
+  //      `rjmp`? (it takes one byte less to encode, so it'd be a small win)
+  //
+  // ¹ "too far" in the sense of "the instruction encoding doesn't allow for an
+  //   offset that large"
+
   if (STI.hasJMPCALL())
     BuildMI(&MBB, DL, get(AVR::JMPk)).addMBB(&NewDestBB);
   else
-    // The RJMP may jump to a far place beyond its legal range. We let the
-    // linker to report 'out of range' rather than crash, or silently emit
-    // incorrect assembly code.
     BuildMI(&MBB, DL, get(AVR::RJMPk)).addMBB(&NewDestBB);
 }
 
diff --git a/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp b/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp
index 17c48c2fc35ff8..be3ca4cfcd94f5 100644
--- a/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp
+++ b/llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp
@@ -27,129 +27,32 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
 
-// FIXME: we should be doing checks to make sure asm operands
-// are not out of bounds.
-
 namespace adjust {
 
 using namespace llvm;
 
-static void signed_width(unsigned Width, uint64_t Value,
-                         std::string Description, const MCFixup &Fixup,
-                         MCContext *Ctx = nullptr) {
-  if (!isIntN(Width, Value)) {
-    std::string Diagnostic = "out of range " + Description;
-
-    int64_t Min = minIntN(Width);
-    int64_t Max = maxIntN(Width);
-
-    Diagnostic += " (expected an integer in the range " + std::to_string(Min) +
-                  " to " + std::to_string(Max) + ")";
-
-    if (Ctx) {
-      Ctx->reportError(Fixup.getLoc(), Diagnostic);
-    } else {
-      llvm_unreachable(Diagnostic.c_str());
-    }
-  }
-}
-
-static void unsigned_width(unsigned Width, uint64_t Value,
-                           std::string Description, const MCFixup &Fixup,
-                           MCContext *Ctx = nullptr) {
+static void ensureUnsignedWidth(unsigned Width, uint64_t Value,
+                                std::string Description, const MCFixup &Fixup,
+                                MCContext *Ctx) {
   if (!isUIntN(Width, Value)) {
     std::string Diagnostic = "out of range " + Description;
 
     int64_t Max = maxUIntN(Width);
 
-    Diagnostic +=
-        " (expected an integer in the range 0 to " + std::to_string(Max) + ")";
+    Diagnostic += " (expected an unsigned integer in the range 0 to " +
+                  std::to_string(Max) + ", got " + std::to_string(Value) + ")";
 
-    if (Ctx) {
-      Ctx->reportError(Fixup.getLoc(), Diagnostic);
-    } else {
-      llvm_unreachable(Diagnostic.c_str());
-    }
+    Ctx->reportError(Fixup.getLoc(), Diagnostic);
   }
 }
 
-/// Adjusts the value of a branch target before fixup application.
-static void adjustBranch(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                         MCContext *Ctx = nullptr) {
-  // We have one extra bit of precision because the value is rightshifted by
-  // one.
-  unsigned_width(Size + 1, Value, std::string("branch target"), Fixup, Ctx);
-
-  // Rightshifts the value by one.
-  AVR::fixups::adjustBranchTarget(Value);
-}
-
-/// Adjusts the value of a relative branch target before fixup application.
-static void adjustRelativeBranch(unsigned Size, const MCFixup &Fixup,
-                                 uint64_t &Value, MCContext *Ctx = nullptr) {
-  // Jumps are relative to the current instruction.
-  Value -= 2;
-
-  // We have one extra bit of precision because the value is rightshifted by
-  // one.
-  signed_width(Size + 1, Value, std::string("branch target"), Fixup, Ctx);
-
-  // Rightshifts the value by one.
-  AVR::fixups::adjustBranchTarget(Value);
-}
-
-/// 22-bit absolute fixup.
-///
-/// Resolves to:
-/// 1001 kkkk 010k kkkk kkkk kkkk 111k kkkk
-///
-/// Offset of 0 (so the result is left shifted by 3 bits before application).
-static void fixup_call(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                       MCContext *Ctx = nullptr) {
-  adjustBranch(Size, Fixup, Value, Ctx);
-
-  auto top = Value & (0xf00000 << 6);   // the top four bits
-  auto middle = Value & (0x1ffff << 5); // the middle 13 bits
-  auto bottom = Value & 0x1f;           // end bottom 5 bits
-
-  Value = (top << 6) | (middle << 3) | (bottom << 0);
-}
-
-/// 7-bit PC-relative fixup.
-///
-/// Resolves to:
-/// 0000 00kk kkkk k000
-/// Offset of 0 (so the result is left shifted by 3 bits before application).
-static void fixup_7_pcrel(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                          MCContext *Ctx = nullptr) {
-  adjustRelativeBranch(Size, Fixup, Value, Ctx);
-
-  // Because the value may be negative, we must mask out the sign bits
-  Value &= 0x7f;
-}
-
-/// 12-bit PC-relative fixup.
-/// Yes, the fixup is 12 bits even though the name says otherwise.
-///
-/// Resolves to:
-/// 0000 kkkk kkkk kkkk
-/// Offset of 0 (so the result isn't left-shifted before application).
-static void fixup_13_pcrel(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                           MCContext *Ctx = nullptr) {
-  adjustRelativeBranch(Size, Fixup, Value, Ctx);
-
-  // Because the value may be negative, we must mask out the sign bits
-  Value &= 0xfff;
-}
-
 /// 6-bit fixup for the immediate operand of the STD/LDD family of
 /// instructions.
 ///
 /// Resolves to:
 /// 10q0 qq10 0000 1qqq
-static void fixup_6(const MCFixup &Fixup, uint64_t &Value,
-                    MCContext *Ctx = nullptr) {
-  unsigned_width(6, Value, std::string("immediate"), Fixup, Ctx);
+static void fixup_6(const MCFixup &Fixup, uint64_t &Value, MCContext *Ctx) {
+  ensureUnsignedWidth(6, Value, std::string("immediate"), Fixup, Ctx);
 
   Value = ((Value & 0x20) << 8) | ((Value & 0x18) << 7) | (Value & 0x07);
 }
@@ -160,8 +63,8 @@ static void fixup_6(const MCFixup &Fixup, uint64_t &Value,
 /// Resolves to:
 /// 0000 0000 kk00 kkkk
 static void fixup_6_adiw(const MCFixup &Fixup, uint64_t &Value,
-                         MCContext *Ctx = nullptr) {
-  unsigned_width(6, Value, std::string("immediate"), Fixup, Ctx);
+                         MCContext *Ctx) {
+  ensureUnsignedWidth(6, Value, std::string("immediate"), Fixup, Ctx);
 
   Value = ((Value & 0x30) << 2) | (Value & 0x0f);
 }
@@ -170,9 +73,8 @@ static void fixup_6_adiw(const MCFixup &Fixup, uint64_t &Value,
 ///
 /// Resolves to:
 /// 0000 0000 AAAA A000
-static void fixup_port5(const MCFixup &Fixup, uint64_t &Value,
-                        MCContext *Ctx = nullptr) {
-  unsigned_width(5, Value, std::string("port number"), Fixup, Ctx);
+static void fixup_port5(const MCFixup &Fixup, uint64_t &Value, MCContext *Ctx) {
+  ensureUnsignedWidth(5, Value, std::string("port number"), Fixup, Ctx);
 
   Value &= 0x1f;
 
@@ -183,9 +85,8 @@ static void fixup_port5(const MCFixup &Fixup, uint64_t &Value,
 ///
 /// Resolves to:
 /// 1011 0AAd dddd AAAA
-static void fixup_port6(const MCFixup &Fixup, uint64_t &Value,
-                        MCContext *Ctx = nullptr) {
-  unsigned_width(6, Value, std::string("port number"), Fixup, Ctx);
+static void fixup_port6(const MCFixup &Fixup, uint64_t &Value, MCContext *Ctx) {
+  ensureUnsignedWidth(6, Value, std::string("port number"), Fixup, Ctx);
 
   Value = ((Value & 0x30) << 5) | (Value & 0x0f);
 }
@@ -195,8 +96,8 @@ static void fixup_port6(const MCFixup &Fixup, uint64_t &Value,
 /// Resolves to:
 /// 1010 ikkk dddd kkkk
 static void fixup_lds_sts_16(const MCFixup &Fixup, uint64_t &Value,
-                             MCContext *Ctx = nullptr) {
-  unsigned_width(7, Value, std::string("immediate"), Fixup, Ctx);
+                             MCContext *Ctx) {
+  ensureUnsignedWidth(7, Value, std::string("immediate"), Fixup, Ctx);
   Value = ((Value & 0x70) << 8) | (Value & 0x0f);
 }
 
@@ -213,7 +114,7 @@ namespace ldi {
 /// 0000 KKKK 0000 KKKK
 /// Offset of 0 (so the result isn't left-shifted before application).
 static void fixup(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                  MCContext *Ctx = nullptr) {
+                  MCContext *Ctx) {
   uint64_t upper = Value & 0xf0;
   uint64_t lower = Value & 0x0f;
 
@@ -223,25 +124,25 @@ static void fixup(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
 static void neg(uint64_t &Value) { Value *= -1; }
 
 static void lo8(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                MCContext *Ctx = nullptr) {
+                MCContext *Ctx) {
   Value &= 0xff;
   ldi::fixup(Size, Fixup, Value, Ctx);
 }
 
 static void hi8(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                MCContext *Ctx = nullptr) {
+                MCContext *Ctx) {
   Value = (Value & 0xff00) >> 8;
   ldi::fixup(Size, Fixup, Value, Ctx);
 }
 
 static void hh8(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                MCContext *Ctx = nullptr) {
+                MCContext *Ctx) {
   Value = (Value & 0xff0000) >> 16;
   ldi::fixup(Size, Fixup, Value, Ctx);
 }
 
 static void ms8(unsigned Size, const MCFixup &Fixup, uint64_t &Value,
-                MCContext *Ctx = nullptr) {
+                MCContext *Ctx) {
   Value = (Value & 0xff000000) >> 24;
   ldi::fixup(Size, Fixup, Value, Ctx);
 }
@@ -263,13 +164,9 @@ void AVRAsmBackend::adjustFixupValue(const MCFixup &Fixup,
   default:
     llvm_unreachable("unhandled fixup");
   case AVR::fixup_7_pcrel:
-    adjust::fixup_7_pcrel(Size, Fixup, Value, Ctx);
-    break;
   case AVR::fixup_13_pcrel:
-    adjust::fixup_13_pcrel(Size, Fixup, Value, Ctx);
-    break;
   case AVR::fixup_call:
-    adjust::fixup_call(Size, Fixup, Value, Ctx);
+    // Handled by linker, see `shouldForceRelocation()`
     break;
   case AVR::fixup_ldi:
     adjust::ldi::fixup(Size, Fixup, Value, Ctx);
@@ -330,13 +227,15 @@ void AVRAsmBackend::adjustFixupValue(const MCFixup &Fixup,
     adjust::ldi::ms8(Size, Fixup, Value, Ctx);
     break;
   case AVR::fixup_16:
-    adjust::unsigned_width(16, Value, std::string("port number"), Fixup, Ctx);
+    adjust::ensureUnsignedWidth(16, Value, std::string("port number"), Fixup,
+                                Ctx);
 
     Value &= 0xffff;
     break;
   case AVR::fixup_16_pm:
     Value >>= 1; // Flash addresses are always shifted.
-    adjust::unsigned_width(16, Value, std::string("port number"), Fixup, Ctx);
+    adjust::ensureUnsignedWidth(16, Value, std::string("port number"), Fixup,
+                                Ctx);
 
     Value &= 0xffff;
     break;
@@ -517,8 +416,6 @@ bool AVRAsmBackend::shouldForceRelocation(const MCAssembler &Asm,
     return Fixup.getKind() >= FirstLiteralRelocationKind;
   case AVR::fixup_7_pcrel:
   case AVR::fixup_13_pcrel:
-    // Always resolve relocations for PC-relative branches
-    return false;
   case AVR::fixup_call:
     return true;
   }
diff --git a/llvm/test/CodeGen/AVR/branch-relaxation-long.ll b/llvm/test/CodeGen/AVR/branch-relaxation-long.ll
index bca505e5edd5f1..11c79dbccc1773 100644
--- a/llvm/test/CodeGen/AVR/branch-relaxation-long.ll
+++ b/llvm/test/CodeGen/AVR/branch-relaxation-long.ll
@@ -1,24 +1,25 @@
-; RUN: llc < %s -march=avr -mattr=avr3 | FileCheck %s
-; RUN: llc < %s -march=avr -mattr=avr2 | FileCheck --check-prefix=AVR2 %s
+; RUN: llc < %s -march=avr -mcpu=avr25 -filetype=obj -o - | llvm-objdump --mcpu=avr25 -dr --no-show-raw-insn --no-leading-addr - | FileCheck --check-prefix=AVR25 %s
+; RUN: llc < %s -march=avr -mcpu=avr3 -filetype=obj -o - | llvm-objdump --mcpu=avr3 -dr --no-show-raw-insn --no-leading-addr - | FileCheck --check-prefix=AVR3 %s
 
-; CHECK-LABEL: relax_to_jmp:
-; CHECK: cpi     r{{[0-9]+}}, 0
-; CHECK: brne    [[BB1:.LBB[0-9]+_[0-9]+]]
-; CHECK: jmp     [[BB2:.LBB[0-9]+_[0-9]+]]
-; CHECK: [[BB1]]:
-; CHECK: nop
-; CHECK: [[BB2]]:
+; AVR25: <relax_to_jmp>:
+; AVR25-NEXT: andi r24, 0x1
+; AVR25-NEXT: cpi r24, 0x0
+; AVR25-NEXT: brne .+0
+; AVR25-NEXT: R_AVR_7_PCREL .text+0x8
+; AVR25-NEXT: rjmp .+0
+; AVR25-NEXT: R_AVR_13_PCREL .text+0x100c
+; AVR25: ldi r24, 0x3
+; AVR25-NEXT: ret
 
-;; A `RJMP` is generated instead of expected `JMP` for AVR2,
-;; and it is up to the linker to report 'out of range' or
-;; 'exceed flash maximum size'.
-; AVR2-LABEL: relax_to_jmp:
-; AVR2: cpi     r{{[0-9]+}}, 0
-; AVR2: brne    [[BB1:.LBB[0-9]+_[0-9]+]]
-; AVR2: rjmp    [[BB2:.LBB[0-9]+_[0-9]+]]
-; AVR2: [[BB1]]:
-; AVR2: nop
-; AVR2: [[BB2]]:
+; AVR3: <relax_to_jmp>:
+; AVR3-NEXT: andi r24, 0x1
+; AVR3-NEXT: cpi r24, 0x0
+; AVR3-NEXT: brne .+0
+; AVR3-NEXT: R_AVR_7_PCREL .text+0xa
+; AVR3-NEXT: jmp 0x0
+; AVR3-NEXT: R_AVR_CALL .text+0x100e
+; AVR3: ldi r24, 0x3
+; AVR3-NEXT: ret
 
 define i8 @relax_to_jmp(i1 %a) {
 entry-block:
@@ -2081,24 +2082,25 @@ finished:
   ret i8 3
 }
 
-; CHECK-LABEL: relax_to_jmp_backwards:
-; CHECK: [[BB1:.LBB[0-9]+_[0-9]+]]
-; CHECK: nop
-; CHECK: cpi     r{{[0-9]+}}, 0
-; CHECK: breq    [[BB2:.LBB[0-9]+_[0-9]+]]
-; CHECK: jmp     [[BB1]]
-; CHECK: [[BB2]]:
+; AVR25: <relax_to_jmp_backwards>:
+; AVR25-NEXT: andi r24, 0x1
+; AVR25: cpi r24, 0x0
+; AVR25-NEXT: breq .+0
+; AVR25-NEXT: R_AVR_7_PCREL .text+0x201c
+; AVR25-NEXT: rjmp .+0
+; AVR25-NEXT: R_AVR_13_PCREL .text+0x1012
+; AVR25: ldi r24, 0x3
+; AVR25-NEXT: ret
 
-;; A `RJMP` is generated instead of expected `JMP` for AVR2,
-;; and it is up to the linker to report 'out of range' or
-;; 'exceed flash maximum size'.
-; AVR2-LABEL: relax_to_jmp_backwards:
-; AVR2: [[BB1:.LBB[0-9]+_[0-9]+]]
-; AVR2: nop
-; AVR2: cpi     r{{[0-9]+}}, 0
-; AVR2: breq    [[BB2:.LBB[0-9]+_[0-9]+]]
-; AVR2: rjmp    [[BB1]]
-; AVR2: [[BB2]]:
+; AVR3: <relax_to_jmp_backwards>:
+; AVR3-NEXT: andi r24, 0x1
+; AVR3: cpi r24, 0x0
+; AVR3-NEXT: breq .+0
+; AVR3-NEXT: R_AVR_7_PCREL .text+0x2020
+; AVR3-NEXT: jmp 0x0
+; AVR3-NEXT: R_AVR_CALL .text+0x1014
+; AVR3: ldi r24, 0x3
+; AVR3-NEXT: ret
 
 define i8 @relax_to_jmp_backwards(i1 %a) {
 entry-block:
diff --git a/llvm/test/CodeGen/AVR/jmp.ll b/llvm/test/CodeGen/AVR/jmp.ll
index 95dfff4836b4e8..f3d6724074e845 100644
--- a/llvm/test/CodeGen/AVR/jmp.ll
+++ b/llvm/test/CodeGen/AVR/jmp.ll
@@ -15,10 +15,11 @@ bb2:
 
 declare i8 @bar(i8);
 
-; CHECK: rcall   .-2
+; CHECK: rcall   .+0
 ; CHECK-NEXT: 00000000: R_AVR_13_PCREL bar
 ; CHECK-NEXT: cpi     r24, 0x7b
-; CHECK-NEXT: brne    .+4
+; CHECK-NEXT: brne    .+0
+; CHECK-NEXT: 00000004: R_AVR_7_PCREL .text+0xa
 ; CHECK-NEXT: ldi     r24, 0x64
 ; CHECK-NEXT: ret
 ; CHECK-NEXT: ldi     r24, 0xc8
diff --git a/llvm/test/MC/AVR/inst-brbc.s b/llvm/test/MC/AVR/inst-brbc.s
index 4edbbfd8580248..6df400068f1024 100644
--- a/llvm/test/MC/AVR/inst-brbc.s
+++ b/llvm/test/MC/AVR/inst-brbc.s
@@ -17,9 +17,11 @@ foo:
 ; CHECK-NEXT:                ; fixup A - offset: 0, value: (.Ltmp1-16)+2, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: 23 f4   brvc .+8
-; INST-NEXT: c0 f7   brsh .-16
-; INST-NEXT: 59 f7   brne .-42
-; INST-NEXT: 52 f7   brpl .-44
-; INST-NEXT: 4c f7   brge .-46
-; INST-NEXT: c7 f4   brid .+48
+; INST-NEXT: 03 f4 brvc .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0xa
+; INST-NEXT: 00 f4 brsh .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0xc
+; INST-NEXT: 59 f7 brne .-42
+; INST-NEXT: 52 f7 brpl .-44
+; INST-NEXT: 4c f7 brge .-46
+; INST-NEXT: c7 f4 brid .+48
diff --git a/llvm/test/MC/AVR/inst-brbs.s b/llvm/test/MC/AVR/inst-brbs.s
index 3f4b134aef682c..e6c14ce3c255ee 100644
--- a/llvm/test/MC/AVR/inst-brbs.s
+++ b/llvm/test/MC/AVR/inst-brbs.s
@@ -16,9 +16,11 @@ foo:
 ; CHECK-NEXT:                ; fixup A - offset: 0, value: (.Ltmp1-12)+2, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: 23 f0   brvs .+8
-; INST-NEXT: d0 f3   brlo .-12
-; INST-NEXT: 59 f3   breq .-42
-; INST-NEXT: 52 f3   brmi .-44
-; INST-NEXT: 4c f3   brlt .-46
-; INST-NEXT: 77 f0   brie .+28
+; INST-NEXT: 03 f0 brvs .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0xa
+; INST-NEXT: 00 f0 brlo .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0x8
+; INST-NEXT: 59 f3 breq .-42
+; INST-NEXT: 52 f3 brmi .-44
+; INST-NEXT: 4c f3 brlt .-46
+; INST-NEXT: 77 f0 brie .+28
diff --git a/llvm/test/MC/AVR/inst-brcc.s b/llvm/test/MC/AVR/inst-brcc.s
index dd1b2b11a6d306..e9482c5cf1e656 100644
--- a/llvm/test/MC/AVR/inst-brcc.s
+++ b/llvm/test/MC/AVR/inst-brcc.s
@@ -22,7 +22,11 @@ bar:
 ; CHECK-NEXT:                ;   fixup A - offset: 0, value: bar, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: 08 f5      brsh .+66
-; INST-NEXT: a8 f7      brsh .-22
-; INST-NEXT: 08 f5      brsh .+66
-; INST-NEXT: 00 f4      brsh .+0
+; INST-NEXT: 00 f4 brsh .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x44
+; INST-NEXT: 00 f4 brsh .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0x12
+; INST-NEXT: 00 f4 brsh .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x48
+; INST-NEXT: 00 f4 brsh .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x8
diff --git a/llvm/test/MC/AVR/inst-brcs.s b/llvm/test/MC/AVR/inst-brcs.s
index 3fafccdb49257c..52a20a8e334d28 100644
--- a/llvm/test/MC/AVR/inst-brcs.s
+++ b/llvm/test/MC/AVR/inst-brcs.s
@@ -22,7 +22,11 @@ bar:
 ; CHECK-NEXT:               ;   fixup A - offset: 0, value: bar, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: 20 f0      brlo .+8
-; INST-NEXT: 10 f0      brlo .+4
-; INST-NEXT: 20 f0      brlo .+8
-; INST-NEXT: 00 f0      brlo .+0
+; INST-NEXT: 00 f0 brlo .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0xa
+; INST-NEXT: 00 f0 brlo .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x8
+; INST-NEXT: 00 f0 brlo .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0xe
+; INST-NEXT: 00 f0 brlo .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x8
diff --git a/llvm/test/MC/AVR/inst-breq.s b/llvm/test/MC/AVR/inst-breq.s
index 7a6eac6f01ad07..4e10438762daba 100644
--- a/llvm/test/MC/AVR/inst-breq.s
+++ b/llvm/test/MC/AVR/inst-breq.s
@@ -22,7 +22,11 @@ bar:
 ; CHECK-NEXT:                      ;   fixup A - offset: 0, value: bar, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: b9 f3      breq .-18
-; INST-NEXT: d1 f3      breq .-12
-; INST-NEXT: b9 f3      breq .-18
-; INST-NEXT: 01 f0      breq .+0
+; INST-NEXT: 01 f0 breq .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0x10
+; INST-NEXT: 01 f0 breq .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0x8
+; INST-NEXT: 01 f0 breq .+0
+; INST-NEXT: R_AVR_7_PCREL .text-0xc
+; INST-NEXT: 01 f0 breq .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x8
diff --git a/llvm/test/MC/AVR/inst-brge.s b/llvm/test/MC/AVR/inst-brge.s
index 6cf79db4dbd659..9657413d14895b 100644
--- a/llvm/test/MC/AVR/inst-brge.s
+++ b/llvm/test/MC/AVR/inst-brge.s
@@ -19,6 +19,9 @@ bar:
 ; CHECK-NEXT:                ;   fixup A - offset: 0, value: bar, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: cc f4      brge .+50
-; INST-NEXT: ac f4      brge .+42
-; INST-NEXT: 04 f4      brge .+0
+; INST-NEXT: 04 f4 brge .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x34
+; INST-NEXT: 04 f4 brge .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x2e
+; INST-NEXT: 04 f4 brge .+0
+; INST-NEXT: R_AVR_7_PCREL .text+0x6
diff --git a/llvm/test/MC/AVR/inst-brhc.s b/llvm/test/MC/AVR/inst-brhc.s
index 924895e4bf5df1..f6c2b66a89a66c 100644
--- a/llvm/test/MC/AVR/inst-brhc.s
+++ b/llvm/test/MC/AVR/inst-brhc.s
@@ -19,6 +19,9 @@ bar:
 ; CHECK-NEXT:                ;   fixup A - offset: 0, value: bar, kind: fixup_7_pcrel
 
 ; INST-LABEL: <foo>:
-; INST-NEXT: 35 f4      brhc .+12
-; INST-NEXT: 3d f4      brhc .+14
-; INST-NEXT...
[truncated]

benshi001

I think this change is so aggressive, just for some special devices with 8KB but without long range JMP/CALL.

I suggest this way:

Introduce a new target feature, something like "RjmpWrap" (or a better name??)
Add this target feature to these special devices.
And then we only perform force relocation with this new target feature specified.

Patryk27 · 2024-11-29T13:07:12Z

Hmm, so we'd like to diverge from avr-gcc's behavior, right?

benshi001 · 2024-12-02T10:12:48Z

Hmm, so we'd like to diverge from avr-gcc's behavior, right?

Is there a strong reason for us to strictly follow avr-gcc? My key concern is that the AVR backend has been stable, we need not do aggressive functional changes.

benshi001 · 2024-12-03T00:33:23Z

But I also think supporting the flash space wrapping is necessary. If you have no time for my suggested solution, I will try it later this month.

Patryk27 · 2024-12-03T15:17:55Z

Is there a strong reason for us to strictly follow avr-gcc?

No, I don't think so - I'm up for avoiding relocations as well 🙂

If you have no time for my suggested solution, I will try it later this month.

I can do that, no problem - I'm just reflecting on the approach:

Option A: Create a new feature (like RjmpWrap) and enable it for the smaller (8 kB) microcontrollers,
Option B: Teach LLVM flash sizes of all of the AVRs and implement logic that automatically applies the memory wrapping trick.

Option A is simpler to implement, while option B (as described in the Alternatives approach at the top) is potentially more versatile (we'll be able to apply the wrapping trick even for larger microcontrollers, if we happen to generate a branch large enough). Note that for larger uCs this trick is just a program-size optimization, not a requirement.

I'm leaning towards option B, since it feels less hacky to me, but I don't have any strong preference here - care to choose?

benshi001 · 2024-12-04T02:24:42Z

Is there a strong reason for us to strictly follow avr-gcc?

No, I don't think so - I'm up for avoiding relocations as well 🙂

If you have no time for my suggested solution, I will try it later this month.

I can do that, no problem - I'm just reflecting on the approach:

Option A: Create a new feature (like RjmpWrap) and enable it for the smaller (8 kB) microcontrollers,

Option B: Teach LLVM flash sizes of all of the AVRs and implement logic that automatically applies the memory wrapping trick.

Option A is simpler to implement, while option B (as described in the Alternatives approach at the top) is potentially more versatile (we'll be able to apply the wrapping trick even for larger microcontrollers, if we happen to generate a branch large enough). Note that for larger uCs this trick is just a program-size optimization, not a requirement.

I'm leaning towards option B, since it feels less hacky to me, but I don't have any strong preference here - care to choose?

I prefer A to B, because:

It is simpler, and less code changes.
For larger devices, we use long JMP, and let the linker to do the branch relaxation.

What's more,

we only add this new TargetFeature to devices with larger flash but without long JMP. Devices with flash size less than 4KB or with long JMP are unnecessary.
Generally I prefer force relocation, but if the devices have a unique flash size (for example 8KB), I also accept hard coding it into adjustRelativeBranch .

Patryk27 · 2024-12-10T16:30:24Z

Okie, ready! I've adjusted the description above as well. cc @benshi001

Patryk27 · 2024-12-22T11:37:07Z

Rebased due to a conflict over 806761a.

benshi001 · 2024-12-23T00:50:04Z

Rebased due to a conflict over 806761a.

Are you now on the Christmas holiday? What's your opinion on my comments? Do you mind I do the modification based on your code?

Patryk27 · 2024-12-23T06:33:44Z

What's your opinion on my comments?

Which comments? 👀

You mentioned using the feature-based approach, which I did implement - I don’t see any more comments 😅

benshi001 · 2024-12-24T01:06:49Z

benshi001 · 2024-12-24T01:11:13Z

Sorry, it is my fault to not post them.

benshi001 · 2024-12-11T00:54:36Z

llvm/lib/Target/AVR/AVRDevices.td

@@ -239,7 +251,7 @@ class Device<string Name, Family Fam, ELFArch Arch,
 // in AVR binutils. We do not replicate this.
 def : Device<"avr1", FamilyAVR1, ELFArchAVR1>;
 def : Device<"avr2", FamilyAVR2, ELFArchAVR2>;
-def : Device<"avr25", FamilyAVR25, ELFArchAVR25>;
+def : Device<"avr25", FamilyAVR25, ELFArchAVR25, [FeatureWrappingRjmp]>;


It seems not all of avr25 serial devices have a large SRAM, such as attiny13, so I would rather leave device avr25 unchanged.

Sure - fwiw this was synchronized with gcc:

https://github.com/gcc-mirror/gcc/blob/d061c681814917c2208f383bd6069e6ab90db220/gcc/config/avr/avr-mcus.def#L88

llvm/lib/Target/AVR/MCTargetDesc/AVRAsmBackend.cpp

llvm/test/CodeGen/AVR/branch-relaxation-long-backward.ll

Patryk27 · 2024-12-24T13:38:09Z

Cool, thanks! I'll take a look at the comments tomorrow 🙂

Patryk27 · 2024-12-26T15:04:59Z

Ready!

benshi001

LGTM. Thanks.

mbuesch · 2024-12-27T21:04:14Z

Does this fix attiny26, too?

I get this

error: out of range branch target (expected an integer in the range -4096 to 4095)

with the nightly rust compiler on an attiny26 target, too.

Would this change fix this, even though it doesn't apply FeatureWrappingRjmp to attiny26?

Patryk27 · 2024-12-27T21:38:07Z

attiny26 has 2 KB of Flash memory, so this trick/fix shouldn't be needed there - if you get that error, it might mean that the generated binary is just too huge.

Is this an open source project, something I/we could compile locally?

mbuesch · 2024-12-27T21:54:17Z

I get this behavior with recent rust-nightly in this completely empty project:

attiny26test.tar.gz

It does build and work fine with nightly-2024-03-22. See rust-toolchain.toml.

Patryk27 · 2024-12-27T21:56:28Z

Thanks for the reproducer, I'll take a look!

llvmbot added the mc Machine (object) code label Nov 28, 2024

alexrp mentioned this pull request Nov 28, 2024

Compilation failure on latest mater for target AVR25 (ATTiny13A MCU) ziglang/zig#21487

Open

benshi001 self-requested a review November 29, 2024 11:58

benshi001 reviewed Nov 29, 2024

View reviewed changes

Patryk27 force-pushed the fix-avr-jumps branch from d519636 to 2dd5dc0 Compare December 10, 2024 16:28

Patryk27 changed the title ~~[AVR] Force relocations for jumps and calls~~ [AVR] Wrap out-of-bounds relative jumps Dec 10, 2024

benshi001 self-requested a review December 11, 2024 01:19

Patryk27 force-pushed the fix-avr-jumps branch from 2dd5dc0 to 9fc7bb3 Compare December 22, 2024 11:36

benshi001 requested changes Dec 24, 2024

View reviewed changes

[AVR] Wrap out-of-bounds relative jumps

7d8acf6

Patryk27 force-pushed the fix-avr-jumps branch from 9fc7bb3 to 7d8acf6 Compare December 26, 2024 15:04

Patryk27 requested a review from benshi001 December 26, 2024 15:04

benshi001 approved these changes Dec 27, 2024

View reviewed changes

benshi001 merged commit c6ea7fb into llvm:main Dec 27, 2024
8 checks passed

Patryk27 deleted the fix-avr-jumps branch December 27, 2024 06:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AVR] Wrap out-of-bounds relative jumps #118015

[AVR] Wrap out-of-bounds relative jumps #118015

Patryk27 commented Nov 28, 2024 •

edited

Loading

Patryk27 commented Nov 28, 2024

llvmbot commented Nov 28, 2024

Motivation

Alternatives

benshi001 left a comment •

edited

Loading

Patryk27 commented Nov 29, 2024

benshi001 commented Dec 2, 2024

benshi001 commented Dec 3, 2024

Patryk27 commented Dec 3, 2024

benshi001 commented Dec 4, 2024 •

edited

Loading

Patryk27 commented Dec 10, 2024

Patryk27 commented Dec 22, 2024

benshi001 commented Dec 23, 2024

Patryk27 commented Dec 23, 2024

benshi001 commented Dec 24, 2024

benshi001 commented Dec 24, 2024 •

edited

Loading

benshi001 Dec 11, 2024

Patryk27 Dec 26, 2024

Patryk27 commented Dec 24, 2024

Patryk27 commented Dec 26, 2024

benshi001 left a comment

mbuesch commented Dec 27, 2024

Patryk27 commented Dec 27, 2024

mbuesch commented Dec 27, 2024

Patryk27 commented Dec 27, 2024

[AVR] Wrap out-of-bounds relative jumps #118015

[AVR] Wrap out-of-bounds relative jumps #118015

Conversation

Patryk27 commented Nov 28, 2024 • edited Loading

Motivation

Patryk27 commented Nov 28, 2024

llvmbot commented Nov 28, 2024

Motivation

Alternatives

benshi001 left a comment • edited Loading

Choose a reason for hiding this comment

Patryk27 commented Nov 29, 2024

benshi001 commented Dec 2, 2024

benshi001 commented Dec 3, 2024

Patryk27 commented Dec 3, 2024

benshi001 commented Dec 4, 2024 • edited Loading

Patryk27 commented Dec 10, 2024

Patryk27 commented Dec 22, 2024

benshi001 commented Dec 23, 2024

Patryk27 commented Dec 23, 2024

benshi001 commented Dec 24, 2024

benshi001 commented Dec 24, 2024 • edited Loading

benshi001 Dec 11, 2024

Choose a reason for hiding this comment

Patryk27 Dec 26, 2024

Choose a reason for hiding this comment

Patryk27 commented Dec 24, 2024

Patryk27 commented Dec 26, 2024

benshi001 left a comment

Choose a reason for hiding this comment

mbuesch commented Dec 27, 2024

Patryk27 commented Dec 27, 2024

mbuesch commented Dec 27, 2024

Patryk27 commented Dec 27, 2024

Patryk27 commented Nov 28, 2024 •

edited

Loading

benshi001 left a comment •

edited

Loading

benshi001 commented Dec 4, 2024 •

edited

Loading

benshi001 commented Dec 24, 2024 •

edited

Loading