Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[X86] Enhance kCFI type IDs with a 3-bit arity indicator. #117121

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

scottconstable
Copy link
Contributor

@scottconstable scottconstable commented Nov 21, 2024

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits.

Like any hashing scheme, hash collisions are possible. For example, a commodity Linux kernel configured for Ubuntu 24.04 server has 141,617 total indirect call targets, with 10,903 unique function types. With a 32-bit kCFI hash, the expected number of collisions is 10,903-2^32+2^32*(1-1/(2^32))^10,903 = 0.01383765 (see https://courses.cs.duke.edu/cps102/spring09/Lectures/L-18.pdf for the formula). This number can balloon with the addition of drivers and kernel modules.

This patch reduces both the expected number of collisions and the potential impact of a collision by augmenting the hash with an arity value that indicates how many parameters the function has at the ABI level. Specifically, the patch further truncates the kCFI hash down to 29 bits, then concatenates a 3-bit arity indicator as follows:

Arity Indicator Description
0 0 parameters
1 1 parameter in RDI
2 2 parameters in RDI and RSI
3 3 parameters in RDI, RSI, and RDX
4 4 parameters in RDI, RSI, RDX, and RCX
5 5 parameters in RDI, RSI, RDX, RCX, and R8
6 6 parameters in RDI, RSI, RDX, RCX, R8, and R9
7 At least one parameter may be passed on the stack

This scheme enhances security in two ways. First, it prevents a j-arity function pointer from being used to call a k-arity function, unless j=k. The current 32-bit kCFI hash does not prevent, for example, a 2-arity fptr from calling a 3-arity target if the kCFI hashes collide. If this were to happen, then potentially malicious stale/dead data in RDX at the call site could suddenly become live as the third parameter at the call target.

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

Arity Unique Indirect Callable Function Types Number of Expected Collisions
0 32 0.00000092
1 2492 0.00578125
2 3775 0.01326841
3 2547 0.00603931
4 1169 0.00127162
5 519 0.00025038
6 221 0.00004528
7 148 0.00002026

One additional benefit of this patch is that it can benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI type ID, then this information is trivial to infer.

EDIT: I tested this PR by recompiling and running a Linux 6.11 kernel with the following parameters:

  • cfi=off
  • cfi=kcfi
  • cfi=kcfi cfi=norand
  • cfi=fineibt
  • cfi=fineibt cfi=norand

With each of these configurations the kernel runs without crashing. Furthermore, the KCFI type IDs appear to be correct. For example, in the cfi=kcfi cfi=norand configuration, the CFI header for dd_merged_requests has its low-order 3 bits set to 0b011, which matches the arity of that function (its arity is 3):

(gdb) disass dd_merged_requests-16
Dump of assembler code for function __cfi_dd_merged_requests:
   0xffffffff818d4ca0 <+0>:     mov    $0x8dfc864b,%eax

And when I disassemble a call site of the same type, I can confirm that the call site uses the additive inverse of the enhanced KCFI type ID:

(gdb) disass elv_merge_requests                                                                                                                                                                                  Dump of assembler code for function elv_merge_requests:
...
   0xffffffff8188c0ac <+44>:    mov    $0x720379b5,%r10d
   0xffffffff8188c0b2 <+50>:    add    -0xf(%r11),%r10d
   0xffffffff8188c0b6 <+54>:    je     0xffffffff8188c0ba <elv_merge_requests+58>
   0xffffffff8188c0b8 <+56>:    ud2
   0xffffffff8188c0ba <+58>:    call   *%r11

Note that 0x8dfc864b + 0x720379b5 = 0.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Nov 21, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-clang

Author: Scott Constable (scottconstable)

Changes

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits.

Like any hashing scheme, hash collisions are possible. For example, a commodity Linux kernel configured for Ubuntu 24.04 server has 141,617 total indirect call targets, with 10,903 unique function types. With a 32-bit kCFI hash, the expected number of collisions is 10,903-2^32+2^32*(1-1/(2^32))^10,903 = 0.01383765 (see https://courses.cs.duke.edu/cps102/spring09/Lectures/L-18.pdf for the formula). This number can balloon with the addition of drivers and kernel modules.

This patch reduces both the expected number of collisions and the potential impact of a collision by augmenting the hash with an arity value that indicates how many parameters the function has at the ABI level. Specifically, the patch further truncates the kCFI hash down to 29 bits, then concatenates a 3-bit arity indicator as follows:

Arity Indicator Description
0 0 parameters
1 1 parameter in RDI
2 2 parameters in RDI and RSI
3 3 parameters in RDI, RSI, and RDX
4 4 parameters in RDI, RSI, RDX, and RCX
5 5 parameters in RDI, RSI, RDX, RCX, and R8
6 6 parameters in RDI, RSI, RDX, RCX, R8, and R9
7 At least one parameter may be passed on the stack

This scheme enhances security in two ways. First, it prevents a j-arity function pointer from being used to call a k-arity function, unless j=k. The current 32-bit kCFI hash does not prevent, for example, a 2-arity fptr from calling a 3-arity target if the kCFI hashes collide. If this were to happen, then potentially malicious stale/dead data in RDX at the call site could suddenly become live as the third parameter at the call target.

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

Arity Unique Indirect Callable Function Types Number of Expected Collisions
0 32 0.00000092
1 2492 0.00578125
2 3775 0.01326841
3 2547 0.00603931
4 1169 0.00127162
5 519 0.00025038
6 221 0.00004528
7 148 0.00002026

One additional benefit of this patch is that it can benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI type ID, then this information is trivial to infer.


Full diff: https://github.com/llvm/llvm-project/pull/117121.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+28-3)
  • (modified) clang/test/CodeGen/kcfi-normalize.c (+12-6)
  • (modified) clang/test/CodeGen/kcfi.c (+19-3)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index b854eeb62a80ce..7cc6f120ec39a9 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2183,7 +2183,8 @@ llvm::ConstantInt *CodeGenModule::CreateCrossDsoCfiTypeId(llvm::Metadata *MD) {
 }
 
 llvm::ConstantInt *CodeGenModule::CreateKCFITypeId(QualType T) {
-  if (auto *FnType = T->getAs<FunctionProtoType>())
+  auto *FnType = T->getAs<FunctionProtoType>();
+  if (FnType)
     T = getContext().getFunctionType(
         FnType->getReturnType(), FnType->getParamTypes(),
         FnType->getExtProtoInfo().withExceptionSpec(EST_None));
@@ -2196,8 +2197,32 @@ llvm::ConstantInt *CodeGenModule::CreateKCFITypeId(QualType T) {
   if (getCodeGenOpts().SanitizeCfiICallNormalizeIntegers)
     Out << ".normalized";
 
-  return llvm::ConstantInt::get(Int32Ty,
-                                static_cast<uint32_t>(llvm::xxHash64(OutName)));
+  uint32_t OutHash = static_cast<uint32_t>(llvm::xxHash64(OutName));
+  const auto &Triple = getTarget().getTriple();
+  if (Triple.isX86() && Triple.isArch64Bit() && Triple.isOSLinux()) {
+    // Estimate the function's arity (i.e., the number of arguments) at the ABI
+    // level by counting the number of parameters that are likely to be passed
+    // as registers, such as pointers and 64-bit (or smaller) integers. The
+    // Linux x86-64 ABI allows up to 6 parameters to be passed in GPRs.
+    // Additional parameters or parameters larger than 64 bits may be passed on
+    // the stack, in which case the arity is denoted as 7.
+    bool MayHaveStackArgs = FnType->getNumParams() > 6;
+
+    for (unsigned int i = 0; !MayHaveStackArgs && i < FnType->getNumParams();
+         ++i) {
+      const Type *PT = FnType->getParamType(i).getTypePtr();
+      if (!(PT->isPointerType() || (PT->isIntegralOrEnumerationType() &&
+                                    getContext().getTypeSize(PT) <= 64)))
+        MayHaveStackArgs = true;
+    }
+
+    // The 3-bit arity is concatenated with the lower 29 bits of the KCFI hash
+    // to form an enhanced KCFI type ID. This can prevent, for example, a
+    // 3-arity function's ID from ever colliding with a 2-arity function's ID.
+    OutHash = (OutHash << 3) | (MayHaveStackArgs ? 7 : FnType->getNumParams());
+  }
+
+  return llvm::ConstantInt::get(Int32Ty, OutHash);
 }
 
 void CodeGenModule::SetLLVMFunctionAttributes(GlobalDecl GD,
diff --git a/clang/test/CodeGen/kcfi-normalize.c b/clang/test/CodeGen/kcfi-normalize.c
index b9150e88f6ab5f..8b7445fc85e490 100644
--- a/clang/test/CodeGen/kcfi-normalize.c
+++ b/clang/test/CodeGen/kcfi-normalize.c
@@ -10,25 +10,31 @@
 void foo(void (*fn)(int), int arg) {
     // CHECK-LABEL: define{{.*}}foo
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE1:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1){{.*}}[ "kcfi"(i32 1162514891) ]
+    // KCFI ID = 0x2A548E59
+    // CHECK: call void %0(i32 noundef %1){{.*}}[ "kcfi"(i32 710184537) ]
     fn(arg);
 }
 
 void bar(void (*fn)(int, int), int arg1, int arg2) {
     // CHECK-LABEL: define{{.*}}bar
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE2:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1, i32 noundef %2){{.*}}[ "kcfi"(i32 448046469) ]
+    // KCFI ID = 0xD5A52C2A
+    // CHECK: call void %0(i32 noundef %1, i32 noundef %2){{.*}}[ "kcfi"(i32 -710595542) ]
     fn(arg1, arg2);
 }
 
 void baz(void (*fn)(int, int, int), int arg1, int arg2, int arg3) {
     // CHECK-LABEL: define{{.*}}baz
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE3:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1, i32 noundef %2, i32 noundef %3){{.*}}[ "kcfi"(i32 -2049681433) ]
+    // KCFI ID = 0x2EA2BF3B
+    // CHECK: call void %0(i32 noundef %1, i32 noundef %2, i32 noundef %3){{.*}}[ "kcfi"(i32 782417723) ]
     fn(arg1, arg2, arg3);
 }
 
 // CHECK: ![[#]] = !{i32 4, !"cfi-normalize-integers", i32 1}
-// CHECK: ![[TYPE1]] = !{i32 -1143117868}
-// CHECK: ![[TYPE2]] = !{i32 -460921415}
-// CHECK: ![[TYPE3]] = !{i32 -333839615}
+// KCFI ID = DEEB3EA2
+// CHECK: ![[TYPE1]] = !{i32 -555008350}
+// KCFI ID = 24372DCB
+// CHECK: ![[TYPE2]] = !{i32 607595979}
+// KCFI ID = 0x60D0180C
+// CHECK: ![[TYPE3]] = !{i32 1624250380}
diff --git a/clang/test/CodeGen/kcfi.c b/clang/test/CodeGen/kcfi.c
index 622843cedba50f..dc9e818a9f8cca 100644
--- a/clang/test/CodeGen/kcfi.c
+++ b/clang/test/CodeGen/kcfi.c
@@ -7,7 +7,6 @@
 
 /// Must emit __kcfi_typeid symbols for address-taken function declarations
 // CHECK: module asm ".weak __kcfi_typeid_[[F4:[a-zA-Z0-9_]+]]"
-// CHECK: module asm ".set __kcfi_typeid_[[F4]], [[#%d,HASH:]]"
 /// Must not __kcfi_typeid symbols for non-address-taken declarations
 // CHECK-NOT: module asm ".weak __kcfi_typeid_{{f6|_Z2f6v}}"
 
@@ -29,7 +28,7 @@ int __call(fn_t f) __attribute__((__no_sanitize__("kcfi"))) {
 
 // CHECK: define dso_local{{.*}} i32 @{{call|_Z4callPFivE}}(ptr{{.*}} %f){{.*}}
 int call(fn_t f) {
-  // CHECK: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
+  // CHECK: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#%d,HASH:]]) ]
   return f();
 }
 
@@ -48,6 +47,20 @@ static int f5(void) { return 2; }
 // CHECK-DAG: declare !kcfi_type ![[#TYPE]]{{.*}} i32 @{{f6|_Z2f6v}}()
 extern int f6(void);
 
+typedef struct {
+  int *p1;
+  int *p2[16];
+} s_t;
+
+// CHECK: define internal{{.*}} i32 @{{f7|_ZL2f7PFi3s_tEPS_}}(ptr{{.*}} %f, ptr{{.*}} %s){{.*}}
+static int f7(int (*f)(s_t), s_t *s) {
+  // CHECK: call{{.*}} i32 %{{.*}} [ "kcfi"(i32 [[#%d,HASH4:]]) ]
+  return f(*s) + 1;
+}
+
+// CHECK: define internal{{.*}} i32 @{{f8|_ZL2f83s_t}}(ptr{{.*}} %s){{.*}} !kcfi_type ![[#%d,TYPE4:]]
+static int f8(s_t s) { return 0; }
+
 #ifndef __cplusplus
 // C: define internal ptr @resolver1() #[[#]] !kcfi_type ![[#]] {
 int ifunc1(int) __attribute__((ifunc("resolver1")));
@@ -59,12 +72,14 @@ long ifunc2(long) __attribute__((ifunc("resolver2")));
 #endif
 
 int test(void) {
+  s_t s;
   return call(f1) +
          __call((fn_t)f2) +
          call(f3) +
          call(f4) +
          f5() +
-         f6();
+         f6() +
+         f7(f8, &s);
 }
 
 #ifdef __cplusplus
@@ -85,3 +100,4 @@ void test_member_call(void) {
 // CHECK-DAG: ![[#TYPE]] = !{i32 [[#HASH]]}
 // CHECK-DAG: ![[#TYPE2]] = !{i32 [[#%d,HASH2:]]}
 // MEMBER-DAG: ![[#TYPE3]] = !{i32 [[#HASH3]]}
+// CHECK-DAG: ![[#TYPE4]] = !{i32 [[#HASH4]]}

@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2024

@llvm/pr-subscribers-clang-codegen

Author: Scott Constable (scottconstable)

Changes

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits.

Like any hashing scheme, hash collisions are possible. For example, a commodity Linux kernel configured for Ubuntu 24.04 server has 141,617 total indirect call targets, with 10,903 unique function types. With a 32-bit kCFI hash, the expected number of collisions is 10,903-2^32+2^32*(1-1/(2^32))^10,903 = 0.01383765 (see https://courses.cs.duke.edu/cps102/spring09/Lectures/L-18.pdf for the formula). This number can balloon with the addition of drivers and kernel modules.

This patch reduces both the expected number of collisions and the potential impact of a collision by augmenting the hash with an arity value that indicates how many parameters the function has at the ABI level. Specifically, the patch further truncates the kCFI hash down to 29 bits, then concatenates a 3-bit arity indicator as follows:

Arity Indicator Description
0 0 parameters
1 1 parameter in RDI
2 2 parameters in RDI and RSI
3 3 parameters in RDI, RSI, and RDX
4 4 parameters in RDI, RSI, RDX, and RCX
5 5 parameters in RDI, RSI, RDX, RCX, and R8
6 6 parameters in RDI, RSI, RDX, RCX, R8, and R9
7 At least one parameter may be passed on the stack

This scheme enhances security in two ways. First, it prevents a j-arity function pointer from being used to call a k-arity function, unless j=k. The current 32-bit kCFI hash does not prevent, for example, a 2-arity fptr from calling a 3-arity target if the kCFI hashes collide. If this were to happen, then potentially malicious stale/dead data in RDX at the call site could suddenly become live as the third parameter at the call target.

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

Arity Unique Indirect Callable Function Types Number of Expected Collisions
0 32 0.00000092
1 2492 0.00578125
2 3775 0.01326841
3 2547 0.00603931
4 1169 0.00127162
5 519 0.00025038
6 221 0.00004528
7 148 0.00002026

One additional benefit of this patch is that it can benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI type ID, then this information is trivial to infer.


Full diff: https://github.com/llvm/llvm-project/pull/117121.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+28-3)
  • (modified) clang/test/CodeGen/kcfi-normalize.c (+12-6)
  • (modified) clang/test/CodeGen/kcfi.c (+19-3)
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index b854eeb62a80ce..7cc6f120ec39a9 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2183,7 +2183,8 @@ llvm::ConstantInt *CodeGenModule::CreateCrossDsoCfiTypeId(llvm::Metadata *MD) {
 }
 
 llvm::ConstantInt *CodeGenModule::CreateKCFITypeId(QualType T) {
-  if (auto *FnType = T->getAs<FunctionProtoType>())
+  auto *FnType = T->getAs<FunctionProtoType>();
+  if (FnType)
     T = getContext().getFunctionType(
         FnType->getReturnType(), FnType->getParamTypes(),
         FnType->getExtProtoInfo().withExceptionSpec(EST_None));
@@ -2196,8 +2197,32 @@ llvm::ConstantInt *CodeGenModule::CreateKCFITypeId(QualType T) {
   if (getCodeGenOpts().SanitizeCfiICallNormalizeIntegers)
     Out << ".normalized";
 
-  return llvm::ConstantInt::get(Int32Ty,
-                                static_cast<uint32_t>(llvm::xxHash64(OutName)));
+  uint32_t OutHash = static_cast<uint32_t>(llvm::xxHash64(OutName));
+  const auto &Triple = getTarget().getTriple();
+  if (Triple.isX86() && Triple.isArch64Bit() && Triple.isOSLinux()) {
+    // Estimate the function's arity (i.e., the number of arguments) at the ABI
+    // level by counting the number of parameters that are likely to be passed
+    // as registers, such as pointers and 64-bit (or smaller) integers. The
+    // Linux x86-64 ABI allows up to 6 parameters to be passed in GPRs.
+    // Additional parameters or parameters larger than 64 bits may be passed on
+    // the stack, in which case the arity is denoted as 7.
+    bool MayHaveStackArgs = FnType->getNumParams() > 6;
+
+    for (unsigned int i = 0; !MayHaveStackArgs && i < FnType->getNumParams();
+         ++i) {
+      const Type *PT = FnType->getParamType(i).getTypePtr();
+      if (!(PT->isPointerType() || (PT->isIntegralOrEnumerationType() &&
+                                    getContext().getTypeSize(PT) <= 64)))
+        MayHaveStackArgs = true;
+    }
+
+    // The 3-bit arity is concatenated with the lower 29 bits of the KCFI hash
+    // to form an enhanced KCFI type ID. This can prevent, for example, a
+    // 3-arity function's ID from ever colliding with a 2-arity function's ID.
+    OutHash = (OutHash << 3) | (MayHaveStackArgs ? 7 : FnType->getNumParams());
+  }
+
+  return llvm::ConstantInt::get(Int32Ty, OutHash);
 }
 
 void CodeGenModule::SetLLVMFunctionAttributes(GlobalDecl GD,
diff --git a/clang/test/CodeGen/kcfi-normalize.c b/clang/test/CodeGen/kcfi-normalize.c
index b9150e88f6ab5f..8b7445fc85e490 100644
--- a/clang/test/CodeGen/kcfi-normalize.c
+++ b/clang/test/CodeGen/kcfi-normalize.c
@@ -10,25 +10,31 @@
 void foo(void (*fn)(int), int arg) {
     // CHECK-LABEL: define{{.*}}foo
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE1:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1){{.*}}[ "kcfi"(i32 1162514891) ]
+    // KCFI ID = 0x2A548E59
+    // CHECK: call void %0(i32 noundef %1){{.*}}[ "kcfi"(i32 710184537) ]
     fn(arg);
 }
 
 void bar(void (*fn)(int, int), int arg1, int arg2) {
     // CHECK-LABEL: define{{.*}}bar
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE2:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1, i32 noundef %2){{.*}}[ "kcfi"(i32 448046469) ]
+    // KCFI ID = 0xD5A52C2A
+    // CHECK: call void %0(i32 noundef %1, i32 noundef %2){{.*}}[ "kcfi"(i32 -710595542) ]
     fn(arg1, arg2);
 }
 
 void baz(void (*fn)(int, int, int), int arg1, int arg2, int arg3) {
     // CHECK-LABEL: define{{.*}}baz
     // CHECK-SAME: {{.*}}!kcfi_type ![[TYPE3:[0-9]+]]
-    // CHECK: call void %0(i32 noundef %1, i32 noundef %2, i32 noundef %3){{.*}}[ "kcfi"(i32 -2049681433) ]
+    // KCFI ID = 0x2EA2BF3B
+    // CHECK: call void %0(i32 noundef %1, i32 noundef %2, i32 noundef %3){{.*}}[ "kcfi"(i32 782417723) ]
     fn(arg1, arg2, arg3);
 }
 
 // CHECK: ![[#]] = !{i32 4, !"cfi-normalize-integers", i32 1}
-// CHECK: ![[TYPE1]] = !{i32 -1143117868}
-// CHECK: ![[TYPE2]] = !{i32 -460921415}
-// CHECK: ![[TYPE3]] = !{i32 -333839615}
+// KCFI ID = DEEB3EA2
+// CHECK: ![[TYPE1]] = !{i32 -555008350}
+// KCFI ID = 24372DCB
+// CHECK: ![[TYPE2]] = !{i32 607595979}
+// KCFI ID = 0x60D0180C
+// CHECK: ![[TYPE3]] = !{i32 1624250380}
diff --git a/clang/test/CodeGen/kcfi.c b/clang/test/CodeGen/kcfi.c
index 622843cedba50f..dc9e818a9f8cca 100644
--- a/clang/test/CodeGen/kcfi.c
+++ b/clang/test/CodeGen/kcfi.c
@@ -7,7 +7,6 @@
 
 /// Must emit __kcfi_typeid symbols for address-taken function declarations
 // CHECK: module asm ".weak __kcfi_typeid_[[F4:[a-zA-Z0-9_]+]]"
-// CHECK: module asm ".set __kcfi_typeid_[[F4]], [[#%d,HASH:]]"
 /// Must not __kcfi_typeid symbols for non-address-taken declarations
 // CHECK-NOT: module asm ".weak __kcfi_typeid_{{f6|_Z2f6v}}"
 
@@ -29,7 +28,7 @@ int __call(fn_t f) __attribute__((__no_sanitize__("kcfi"))) {
 
 // CHECK: define dso_local{{.*}} i32 @{{call|_Z4callPFivE}}(ptr{{.*}} %f){{.*}}
 int call(fn_t f) {
-  // CHECK: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
+  // CHECK: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#%d,HASH:]]) ]
   return f();
 }
 
@@ -48,6 +47,20 @@ static int f5(void) { return 2; }
 // CHECK-DAG: declare !kcfi_type ![[#TYPE]]{{.*}} i32 @{{f6|_Z2f6v}}()
 extern int f6(void);
 
+typedef struct {
+  int *p1;
+  int *p2[16];
+} s_t;
+
+// CHECK: define internal{{.*}} i32 @{{f7|_ZL2f7PFi3s_tEPS_}}(ptr{{.*}} %f, ptr{{.*}} %s){{.*}}
+static int f7(int (*f)(s_t), s_t *s) {
+  // CHECK: call{{.*}} i32 %{{.*}} [ "kcfi"(i32 [[#%d,HASH4:]]) ]
+  return f(*s) + 1;
+}
+
+// CHECK: define internal{{.*}} i32 @{{f8|_ZL2f83s_t}}(ptr{{.*}} %s){{.*}} !kcfi_type ![[#%d,TYPE4:]]
+static int f8(s_t s) { return 0; }
+
 #ifndef __cplusplus
 // C: define internal ptr @resolver1() #[[#]] !kcfi_type ![[#]] {
 int ifunc1(int) __attribute__((ifunc("resolver1")));
@@ -59,12 +72,14 @@ long ifunc2(long) __attribute__((ifunc("resolver2")));
 #endif
 
 int test(void) {
+  s_t s;
   return call(f1) +
          __call((fn_t)f2) +
          call(f3) +
          call(f4) +
          f5() +
-         f6();
+         f6() +
+         f7(f8, &s);
 }
 
 #ifdef __cplusplus
@@ -85,3 +100,4 @@ void test_member_call(void) {
 // CHECK-DAG: ![[#TYPE]] = !{i32 [[#HASH]]}
 // CHECK-DAG: ![[#TYPE2]] = !{i32 [[#%d,HASH2:]]}
 // MEMBER-DAG: ![[#TYPE3]] = !{i32 [[#HASH3]]}
+// CHECK-DAG: ![[#TYPE4]] = !{i32 [[#HASH4]]}

@scottconstable
Copy link
Contributor Author

@phoebewang @lvwr Please review this PR.

static_cast<uint32_t>(llvm::xxHash64(OutName)));
uint32_t OutHash = static_cast<uint32_t>(llvm::xxHash64(OutName));
const auto &Triple = getTarget().getTriple();
if (Triple.isX86() && Triple.isArch64Bit() && Triple.isOSLinux()) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> if (FnType && Triple.isX86() && Triple.isArch64Bit() && Triple.isOSLinux())

Just to make sure that FnType isn't a null pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sirmc Thank you for the suggestion. I looked at the code and as far as I can tell, CreateKCFITypeId() is only called here:

Ctx, MDB.createConstant(CreateKCFITypeId(FD->getType()))));
, which seems to guarantee that T will always be a function type. Regardless, I think your suggestion is reasonable, in case CreateKCFITypeID() is called elsewhere in the future, so I incorporated this additional check into the PR.

@scottconstable
Copy link
Contributor Author

@phoebewang and @lvwr I also noticed that there is this code in LLVM:

void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
if (!M.getModuleFlag("kcfi"))
return;
// Matches CodeGenModule::CreateKCFITypeId in Clang.
LLVMContext &Ctx = M.getContext();
MDBuilder MDB(Ctx);
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";
F.setMetadata(LLVMContext::MD_kcfi_type,
MDNode::get(Ctx, MDB.createConstant(ConstantInt::get(
Type::getInt32Ty(Ctx),
static_cast<uint32_t>(xxHash64(Type))))));
. As far as I can tell, this code is not triggered when I build the Linux kernel with -fsanitize=kcfi.

When is this code triggered? And do you think it is necessary to additionally implement the arity-enhancement to this code?

@phoebewang
Copy link
Contributor

@phoebewang and @lvwr I also noticed that there is this code in LLVM:

void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
if (!M.getModuleFlag("kcfi"))
return;
// Matches CodeGenModule::CreateKCFITypeId in Clang.
LLVMContext &Ctx = M.getContext();
MDBuilder MDB(Ctx);
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";
F.setMetadata(LLVMContext::MD_kcfi_type,
MDNode::get(Ctx, MDB.createConstant(ConstantInt::get(
Type::getInt32Ty(Ctx),
static_cast<uint32_t>(xxHash64(Type))))));

. As far as I can tell, this code is not triggered when I build the Linux kernel with -fsanitize=kcfi.
When is this code triggered? And do you think it is necessary to additionally implement the arity-enhancement to this code?

I'm not familar with KCFI. I find it's added by @samitolvanen in e1c36bd. I think you should triger it with attached test case.

@phoebewang
Copy link
Contributor

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

The collisions vary a lot with different number of function types. It looks to me more smooth if we use 2 bits to distinguish 4 cases: 1, 2, 3 and 0 or others.

Copy link

github-actions bot commented Nov 23, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@scottconstable
Copy link
Contributor Author

@phoebewang and @lvwr I also noticed that there is this code in LLVM:

void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
if (!M.getModuleFlag("kcfi"))
return;
// Matches CodeGenModule::CreateKCFITypeId in Clang.
LLVMContext &Ctx = M.getContext();
MDBuilder MDB(Ctx);
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";
F.setMetadata(LLVMContext::MD_kcfi_type,
MDNode::get(Ctx, MDB.createConstant(ConstantInt::get(
Type::getInt32Ty(Ctx),
static_cast<uint32_t>(xxHash64(Type))))));

. As far as I can tell, this code is not triggered when I build the Linux kernel with -fsanitize=kcfi.
When is this code triggered? And do you think it is necessary to additionally implement the arity-enhancement to this code?

I'm not familar with KCFI. I find it's added by @samitolvanen in e1c36bd. I think you should triger it with attached test case.

It looks to me like this code might be triggered in some LTO configurations, and/or when linking code compiled from multiple source languages with the expectation that the KCFI type IDs will be compatible. Is my understanding correct?

The comment in the code says "Matches CodeGenModule::CreateKCFITypeId in Clang," which I interpret to mean that this code should produce identical KCFI type IDs for identical function types, which might be tricky if the target binary is compiled from different languages. I added some code to llvm::setKCFIType that I hope will produce consistent output, but admittedly I'm not sure that my treatment of clang::Type and llvm::Type is consistent.

@@ -208,10 +209,34 @@ void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";

uint32_t OutHash = static_cast<uint32_t>(llvm::xxHash64(Type));
auto T = Triple(Twine(M.getTargetTriple()));
Copy link
Contributor Author

@scottconstable scottconstable Nov 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Triple line looks awkward and I regret that I needed to include another header to enable this. Maybe there is a workable API within the existing headers, but I couldn't find one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion! This does look tidier than what I had written. I updated the PR.


for (unsigned int i = 0; !MayHaveStackArgs && i < NumParams; ++i) {
const llvm::Type *PT = F.getArg(i)->getType();
if (!(PT->isPointerTy() || PT->getIntegerBitWidth() <= 64))
Copy link
Contributor Author

@scottconstable scottconstable Nov 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the if condition equivalent to what I wrote in CodeGenModule::CreateKCFITypeId with clang::Type? Specifically, is

      // typeof(*PT) = clang::Type
      if (!(PT->isPointerType() || (PT->isIntegralOrEnumerationType() &&
                                    getContext().getTypeSize(PT) <= 64)))

equivalent to:

      // typeof(*PT) = llvm::Type
      if (!(PT->isPointerTy() || PT->getIntegerBitWidth() <= 64))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Front end like Clang has solved it already. I think we can simply checking the number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that clang does not reserve stack for large arguments and instead this is done later by the LLVM X86 backend. For example:

struct S {
    int *p1;
    int *p2;
    int array[8];
};

int foo(struct S s, struct S *sp) {
    return *s.p1 + *s.p2 + *sp->p1 + *sp->p2;
}

Then when I compile to LLVM IR I see:

define dso_local i32 @foo(ptr noundef byval(%struct.S) align 8 %s, ptr noundef %sp) #0 {

Which suggests an arity of 2. But the X86 backend transforms foo to pass s on the stack, and then sp becomes the sole argument and is passed in rdi. Hence, by the chart in the PR description, this should be treated as an arity-7 function:

Arity Indicator Description
0 0 parameters
1 1 parameter in RDI
2 2 parameters in RDI and RSI
3 3 parameters in RDI, RSI, and RDX
4 4 parameters in RDI, RSI, RDX, and RCX
5 5 parameters in RDI, RSI, RDX, RCX, and R8
6 6 parameters in RDI, RSI, RDX, RCX, R8, and R9
7 At least one parameter may be passed on the stack

This predicate:

      // typeof(*PT) = llvm::Type
      if (!(PT->isPointerTy() || PT->getIntegerBitWidth() <= 64))
        MayHaveStackArgs = true;

should prevent s from being counted as a register argument and correctly set the arity field to 7.

@scottconstable
Copy link
Contributor Author

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

The collisions vary a lot with different number of function types. It looks to me more smooth if we use 2 bits to distinguish 4 cases: 1, 2, 3 and 0 or others.

I re-ran the numbers with a 30-bit hash and 2-bit arity, and you are correct that the distribution of expected collisions is more smooth:

Arity Unique Indirect Callable Function Types Number of Expected Collisions
0 or >3 2089 0.00201654
1 2492 0.00287330
2 3775 0.00660789
3 2547 0.00300181

However, a 2-bit arity would undermine what is arguably the more desirable property:

This scheme enhances security in two ways. First, it prevents a j-arity function pointer from being used to call a k-arity function, unless j=k. The current 32-bit kCFI hash does not prevent, for example, a 2-arity fptr from calling a 3-arity target if the kCFI hashes collide. If this were to happen, then potentially malicious stale/dead data in RDX at the call site could suddenly become live as the third parameter at the call target.

For example, if the 30-bit hash of a 0-arity function type collides with the 30-bit hash of the type of a 4-arity function type, then the RDI, RSI, RDX, and RCX registers that die when calling a function of the 0-arity type will unexpectedly become live if a COP attack redirects the call to a function of the 4-arity type.

Therefore, I believe that the 29-bit hash and 3-bit arity offers a more favorable security posture.

@phoebewang
Copy link
Contributor

@phoebewang and @lvwr I also noticed that there is this code in LLVM:

void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
if (!M.getModuleFlag("kcfi"))
return;
// Matches CodeGenModule::CreateKCFITypeId in Clang.
LLVMContext &Ctx = M.getContext();
MDBuilder MDB(Ctx);
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";
F.setMetadata(LLVMContext::MD_kcfi_type,
MDNode::get(Ctx, MDB.createConstant(ConstantInt::get(
Type::getInt32Ty(Ctx),
static_cast<uint32_t>(xxHash64(Type))))));

. As far as I can tell, this code is not triggered when I build the Linux kernel with -fsanitize=kcfi.
When is this code triggered? And do you think it is necessary to additionally implement the arity-enhancement to this code?

I'm not familar with KCFI. I find it's added by @samitolvanen in e1c36bd. I think you should triger it with attached test case.

It looks to me like this code might be triggered in some LTO configurations, and/or when linking code compiled from multiple source languages with the expectation that the KCFI type IDs will be compatible. Is my understanding correct?

Looks like the latter, see 71c7313

@phoebewang
Copy link
Contributor

Second, this scheme reduces the expected number of hash collisions within each arity, compared against the expected number of collisions (0.01383765) for the 32-bit hashing scheme that includes all arities. The table below shows the expected number of collisions for each arity, given the number of unique indirect callable function types within that arity in the same Ubuntu 24.04 server kernel discussed above.

The collisions vary a lot with different number of function types. It looks to me more smooth if we use 2 bits to distinguish 4 cases: 1, 2, 3 and 0 or others.

I re-ran the numbers with a 30-bit hash and 2-bit arity, and you are correct that the distribution of expected collisions is more smooth:

Arity Unique Indirect Callable Function Types Number of Expected Collisions
0 or >3 2089 0.00201654
1 2492 0.00287330
2 3775 0.00660789
3 2547 0.00300181
However, a 2-bit arity would undermine what is arguably the more desirable property:

This scheme enhances security in two ways. First, it prevents a j-arity function pointer from being used to call a k-arity function, unless j=k. The current 32-bit kCFI hash does not prevent, for example, a 2-arity fptr from calling a 3-arity target if the kCFI hashes collide. If this were to happen, then potentially malicious stale/dead data in RDX at the call site could suddenly become live as the third parameter at the call target.

For example, if the 30-bit hash of a 0-arity function type collides with the 30-bit hash of the type of a 4-arity function type, then the RDI, RSI, RDX, and RCX registers that die when calling a function of the 0-arity type will unexpectedly become live if a COP attack redirects the call to a function of the 4-arity type.

Therefore, I believe that the 29-bit hash and 3-bit arity offers a more favorable security posture.

Although the default calling convention uses 6 registers, others like RegCall uses more. Do you want to check calling convention as well?

@scottconstable
Copy link
Contributor Author

@phoebewang and @lvwr I also noticed that there is this code in LLVM:

void llvm::setKCFIType(Module &M, Function &F, StringRef MangledType) {
if (!M.getModuleFlag("kcfi"))
return;
// Matches CodeGenModule::CreateKCFITypeId in Clang.
LLVMContext &Ctx = M.getContext();
MDBuilder MDB(Ctx);
std::string Type = MangledType.str();
if (M.getModuleFlag("cfi-normalize-integers"))
Type += ".normalized";
F.setMetadata(LLVMContext::MD_kcfi_type,
MDNode::get(Ctx, MDB.createConstant(ConstantInt::get(
Type::getInt32Ty(Ctx),
static_cast<uint32_t>(xxHash64(Type))))));

. As far as I can tell, this code is not triggered when I build the Linux kernel with -fsanitize=kcfi.
When is this code triggered? And do you think it is necessary to additionally implement the arity-enhancement to this code?

I'm not familar with KCFI. I find it's added by @samitolvanen in e1c36bd. I think you should triger it with attached test case.

It looks to me like this code might be triggered in some LTO configurations, and/or when linking code compiled from multiple source languages with the expectation that the KCFI type IDs will be compatible. Is my understanding correct?

Looks like the latter, see 71c7313

Actually, I think this code was introduced to address a compatibility issue with KASAN, which apparently must generate KCFI-enabled code without clang. I found this explanation at 3b14862 and ClangBuiltLinux/linux#1742.

Regardless, it looks like llvm::setKCFIType is intended to always produce the same KCFI type ID as CodeGenModule::CreateKCFITypeId for equivalent function types. For this PR, this implies that llvm::setKCFIType and CodeGenModule::CreateKCFITypeId must always infer the same arity for the same function type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants