Optimised out capability loads and stores making code slower #292

Closed
jonwoodruff opened this issue Aug 31, 2017 · 6 comments

@jonwoodruff

jonwoodruff commented Aug 31, 2017

<Both C code and generated assembly updated in light of RichardsonAlex's comment>
C code:

void doLoop(char * in, char * out, int i)
{
	char * __capability inLocal = (char * __capability)in;
	char * __capability outLocal = (char * __capability)out;
	outLocal[i+1]=inLocal[i+1];
	outLocal[i+2]=inLocal[i+2];
}

Assembly generated:

	daddiu	$sp, $sp, -16
	sd	$fp, 8($sp)             # 8-byte Folded Spill
	move	 $fp, $sp
	addiu	$1, $6, 1
	sll	$1, $1, 0
	daddu	$2, $4, $1
	daddu	$1, $5, $1
	lb	$2, 0($2)
	sb	$2, 0($1)
	addiu	$1, $6, 2
	sll	$1, $1, 0
	daddu	$2, $4, $1
	daddu	$1, $5, $1
	lb	$2, 0($2)
	sb	$2, 0($1)
	move	 $sp, $fp
	ld	$fp, 8($sp)             # 8-byte Folded Reload
	jr	$ra
	daddiu	$sp, $sp, 16

Assembly desired:

	daddiu	$sp, $sp, -16
	sd	$fp, 8($sp)             # 8-byte Folded Spill
	move	 $fp, $sp
	cfromptr	$c1, $c0, $4    # in ($4) as a capability derived from $c0
	cfromptr	$c2, $c0, $5    # out ($5) as a capability derived from $c0
	clb	$1, $6, 1($c1)
	csb	$1, $6, 1($c2)
	clb	$1, $6, 2($c1)
	csb	$1, $6, 2($c2)
	move	 $sp, $fp
	ld	$fp, 8($sp)             # 8-byte Folded Reload
	jr	$ra
	daddiu	$sp, $sp, 16

Optimal assembly desired:

	daddiu	$sp, $sp, -16
	sd	$fp, 8($sp)             # 8-byte Folded Spill
	move	 $fp, $sp
	cfromptr	$c1, $c0, $4    # in ($4) as a capability derived from $c0
	clb	$1, $6, 1($c1)
	clb	$2, $6, 2($c1)
	cfromptr	$c2, $c0, $5    # out ($5) as a capability derived from $c0
	csb	$1, $6, 1($c2)
	csb	$2, $6, 2($c2)
	move	 $sp, $fp
	ld	$fp, 8($sp)             # 8-byte Folded Reload
	jr	$ra
	daddiu	$sp, $sp, 16
@arichardson
Member

You aren't actually using the capabilities, so the compiler doesn't load via them.

@arichardson
Member

Hmm, actually it also happens if I use inLocal instead of in.

@arichardson
Member

This is the IR at -O2:

; Function Attrs: norecurse nounwind
define void @doLoop(i8* nocapture readonly, i8* nocapture, i32 signext) local_unnamed_addr #0 {
  %4 = add nsw i32 %2, 1
  %5 = sext i32 %4 to i64
  %6 = getelementptr inbounds i8, i8* %0, i64 %5
  %7 = addrspacecast i8* %6 to i8 addrspace(200)*
  %8 = load i8, i8 addrspace(200)* %7, align 1, !tbaa !3
  %9 = getelementptr inbounds i8, i8* %1, i64 %5
  %10 = addrspacecast i8* %9 to i8 addrspace(200)*
  store i8 %8, i8 addrspace(200)* %10, align 1, !tbaa !3
  %11 = add nsw i32 %2, 2
  %12 = sext i32 %11 to i64
  %13 = getelementptr inbounds i8, i8* %0, i64 %12
  %14 = addrspacecast i8* %13 to i8 addrspace(200)*
  %15 = load i8, i8 addrspace(200)* %14, align 1, !tbaa !3
  %16 = getelementptr inbounds i8, i8* %1, i64 %12
  %17 = addrspacecast i8* %16 to i8 addrspace(200)*
  store i8 %15, i8 addrspace(200)* %17, align 1, !tbaa !3
  ret void
}
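
Read back into C, the IR above seems to do the index arithmetic on the plain pointer (getelementptr on i8*) and only then cast the resulting address into address space 200, once per access. The sketch below puts that shape next to the shape the source was written in. The function names, the self-check, and the `__capability` fallback macro are assumptions added for illustration so the snippet also builds with a non-CHERI compiler; none of them come from the original report.

```c
#include <assert.h>
#include <string.h>

/* Assumption: a CHERI compiler defines __CHERI__ and treats __capability as a
 * keyword. Elsewhere, make it a no-op so the sketch still builds. */
#ifndef __CHERI__
#define __capability
#endif

/* As written in the report: cast once, then index the capability. */
static void copy_as_written(char *in, char *out, int i)
{
	char * __capability inLocal = (char * __capability)in;
	char * __capability outLocal = (char * __capability)out;
	outLocal[i + 1] = inLocal[i + 1];
	outLocal[i + 2] = inLocal[i + 2];
}

/* Roughly the shape of the -O2 IR: index the plain pointer first
 * (getelementptr), then cast each address right before the access
 * (addrspacecast). */
static void copy_as_in_ir(char *in, char *out, int i)
{
	*(char * __capability)(out + (long)(i + 1)) =
	    *(char * __capability)(in + (long)(i + 1));
	*(char * __capability)(out + (long)(i + 2)) =
	    *(char * __capability)(in + (long)(i + 2));
}

/* Hypothetical self-check: both shapes should copy the same two bytes. */
static void check_shapes_agree(void)
{
	char in[4] = { 'a', 'b', 'c', 'd' };
	char out1[4] = { 0 }, out2[4] = { 0 };
	copy_as_written(in, out1, 0);
	copy_as_in_ir(in, out2, 0);
	assert(memcmp(out1, out2, sizeof out1) == 0);
}
```

Both functions store the same bytes. The difference is only in where the cast sits relative to the indexing, which is what determines whether the back end sees a capability access with an offset or a plain pointer computation followed by a cast.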

@davidchisnall
Member

This looks like a phase ordering problem in the CHERI addressing mode folder. It appears that we're trying to replace capability loads and stores with MIPS ones before trying to use complex addressing modes.

@davidchisnall
Member

There are a number of interesting things here, especially if you put it back into a loop:

void doLoop(char * in, char * out, long i)
{
	char * __capability inLocal = (char * __capability)in;
	char * __capability outLocal = (char * __capability)out;
	do {
		outLocal[i]=inLocal[i];
		i--;
	} while (i>0);
}

InstCombine appears to move the address space cast into the loop, which costs us an extra instruction on every iteration.
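
To make the cost concrete, here is a sketch of the two loop shapes in C. The second function is only a reading of what "moving the cast into the loop" would look like at source level, not dumped compiler output. The function names, the self-check, and the `__capability` fallback macro are hypothetical and exist only so the snippet builds with a non-CHERI compiler too.

```c
#include <assert.h>
#include <string.h>

/* Assumption: a CHERI compiler defines __CHERI__ and treats __capability as a
 * keyword. Elsewhere, make it a no-op so the sketch still builds. */
#ifndef __CHERI__
#define __capability
#endif

/* Source form: both casts happen once, before the loop. */
static void copy_cast_hoisted(char *in, char *out, long i)
{
	char * __capability inLocal = (char * __capability)in;
	char * __capability outLocal = (char * __capability)out;
	do {
		outLocal[i] = inLocal[i];
		i--;
	} while (i > 0);
}

/* Assumed shape after the transformation: the plain pointer is indexed
 * first and the cast is repeated on every iteration. */
static void copy_cast_in_loop(char *in, char *out, long i)
{
	do {
		*(char * __capability)(out + i) = *(char * __capability)(in + i);
		i--;
	} while (i > 0);
}

/* Hypothetical self-check: both loops should copy indices 3, 2 and 1. */
static void check_loops_agree(void)
{
	char in[5] = "abcd";
	char a[5] = { 0 }, b[5] = { 0 };
	copy_cast_hoisted(in, a, 3);
	copy_cast_in_loop(in, b, 3);
	assert(memcmp(a, b, sizeof a) == 0);
}
```

The two functions produce the same result, so the transformation is legal. The concern in this thread is only that the per-iteration cast is not free on CHERI-MIPS.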

@arichardson transferred this issue from CTSRD-CHERI/llvm on Feb 18, 2019
@arichardson
Member

Closing this since we don't care about improving MIPS codegen anymore.

@arichardson closed this as not planned on Feb 20, 2024