-
Notifications
You must be signed in to change notification settings - Fork 12.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMDGPU: Remove wavefrontsize64 feature from dummy target #117410
AMDGPU: Remove wavefrontsize64 feature from dummy target #117410
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) ChangesThis is a refinement for the existing hack. With this, Continue to assume the value is 64 with no wavesize feature. Introduce an isWaveSizeKnown helper to check if we know the Full diff: https://github.com/llvm/llvm-project/pull/117410.diff 4 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/GCNProcessors.td b/llvm/lib/Target/AMDGPU/GCNProcessors.td
index 3403cbab526d46..6241fa6e22ab8b 100644
--- a/llvm/lib/Target/AMDGPU/GCNProcessors.td
+++ b/llvm/lib/Target/AMDGPU/GCNProcessors.td
@@ -9,11 +9,11 @@
// The code produced for "generic" is only useful for tests and cannot
// reasonably be expected to execute on any particular target.
def : ProcessorModel<"generic", NoSchedModel,
- [FeatureWavefrontSize64, FeatureGDS, FeatureGWS]
+ [FeatureGDS, FeatureGWS]
>;
def : ProcessorModel<"generic-hsa", NoSchedModel,
- [FeatureWavefrontSize64, FeatureGDS, FeatureGWS, FeatureFlatAddressSpace]
+ [FeatureGDS, FeatureGWS, FeatureFlatAddressSpace]
>;
//===------------------------------------------------------------===//
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.cpp b/llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
index 6233ca2eb4f1dd..51361b75940560 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
@@ -100,14 +100,16 @@ GCNSubtarget &GCNSubtarget::initializeSubtargetDependencies(const Triple &TT,
if (Gen == AMDGPUSubtarget::INVALID) {
Gen = TT.getOS() == Triple::AMDHSA ? AMDGPUSubtarget::SEA_ISLANDS
: AMDGPUSubtarget::SOUTHERN_ISLANDS;
- }
-
- if (!hasFeature(AMDGPU::FeatureWavefrontSize32) &&
- !hasFeature(AMDGPU::FeatureWavefrontSize64)) {
+ // Assume wave64 for the unknown target, if not explicitly set.
+ if (getWavefrontSizeLog2() == 0)
+ WavefrontSizeLog2 = 6;
+ } else if (!hasFeature(AMDGPU::FeatureWavefrontSize32) &&
+ !hasFeature(AMDGPU::FeatureWavefrontSize64)) {
// If there is no default wave size it must be a generation before gfx10,
// these have FeatureWavefrontSize64 in their definition already. For gfx10+
// set wave32 as a default.
ToggleFeature(AMDGPU::FeatureWavefrontSize32);
+ WavefrontSizeLog2 = getGeneration() >= AMDGPUSubtarget::GFX10 ? 5 : 6;
}
// We don't support FP64 for EG/NI atm.
@@ -147,10 +149,6 @@ GCNSubtarget &GCNSubtarget::initializeSubtargetDependencies(const Triple &TT,
!getFeatureBits().test(AMDGPU::FeatureCuMode))
LocalMemorySize *= 2;
- // Don't crash on invalid devices.
- if (WavefrontSizeLog2 == 0)
- WavefrontSizeLog2 = 5;
-
HasFminFmaxLegacy = getGeneration() < AMDGPUSubtarget::VOLCANIC_ISLANDS;
HasSMulHi = getGeneration() >= AMDGPUSubtarget::GFX9;
@@ -166,7 +164,7 @@ GCNSubtarget &GCNSubtarget::initializeSubtargetDependencies(const Triple &TT,
void GCNSubtarget::checkSubtargetFeatures(const Function &F) const {
LLVMContext &Ctx = F.getContext();
- if (hasFeature(AMDGPU::FeatureWavefrontSize32) ==
+ if (hasFeature(AMDGPU::FeatureWavefrontSize32) &&
hasFeature(AMDGPU::FeatureWavefrontSize64)) {
Ctx.diagnose(DiagnosticInfoUnsupported(
F, "must specify exactly one of wavefrontsize32 and wavefrontsize64"));
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index f3f96940c1f44b..5eada4f003ece7 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1564,6 +1564,14 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
return getWavefrontSize() == 64;
}
+ /// Returns if the wavesize of this subtarget is known reliable. This is false
+ /// only for the a default target-cpu that does not have an explicit
+ /// +wavefrontsize target feature.
+ bool isWaveSizeKnown() const {
+ return hasFeature(AMDGPU::FeatureWavefrontSize32) ||
+ hasFeature(AMDGPU::FeatureWavefrontSize64);
+ }
+
const TargetRegisterClass *getBoolRC() const {
return getRegisterInfo()->getBoolRC();
}
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index 344028c4b48689..e21aa70c9859a0 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -649,9 +649,9 @@ void AMDGPUInstPrinter::printDefaultVccOperand(bool FirstOperand,
raw_ostream &O) {
if (!FirstOperand)
O << ", ";
- printRegOperand(STI.hasFeature(AMDGPU::FeatureWavefrontSize64)
- ? AMDGPU::VCC
- : AMDGPU::VCC_LO,
+ printRegOperand(STI.hasFeature(AMDGPU::FeatureWavefrontSize32)
+ ? AMDGPU::VCC_LO
+ : AMDGPU::VCC,
O, MRI);
if (FirstOperand)
O << ", ";
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this target get invoked with empty -mcpu
?
Yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! libc tests are still happy so I'm fairly confident this doesn't break anything.
The disassembler seems unhappy on CI (misclicked the comment button and accidentally closed it, sorry) |
dad0898
to
a9b9b65
Compare
bccd646
to
8c00007
Compare
0f6254a
to
79ff6e6
Compare
8c00007
to
a4f1256
Compare
This is a refinement for the existing hack. With this, the default target will have neither wavefrontsize feature present, unless it was explicitly specified. That is, getWavefrontSize() == 64 no longer implies +wavefrontsize64. getWavefrontSize() == 32 does imply +wavefrontsize32. Continue to assume the value is 64 with no wavesize feature. This maintains the codegenable property without any code that directly cares about the wavesize needs to worry about it. Introduce an isWaveSizeKnown helper to check if we know the wavesize is accurate based on having one of the features explicitly set, or a known target-cpu. I'm not sure what's going on in wave_any.s. It's testing what happens when both wavesizes are enabled, but this is treated as an error in codegen. We now treat wave32 as the winning case, so some cases that were previously printed as vcc are now vcc_lo.
79ff6e6
to
6fa5b9a
Compare
This is a refinement for the existing hack. With this, the default target will have neither wavefrontsize feature present, unless it was explicitly specified. That is, getWavefrontSize() == 64 no longer implies +wavefrontsize64. getWavefrontSize() == 32 does imply +wavefrontsize32. Continue to assume the value is 64 with no wavesize feature. This maintains the codegenable property without any code that directly cares about the wavesize needing to worry about it. Introduce an isWaveSizeKnown helper to check if we know the wavesize is accurate based on having one of the features explicitly set, or a known target-cpu. I'm not sure what's going on in wave_any.s. It's testing what happens when both wavesizes are enabled, but this is treated as an error in codegen. We now treat wave32 as the winning case, so some cases that were previously printed as vcc are now vcc_lo.
This is a refinement for the existing hack. With this,
the default target will have neither wavefrontsize feature
present, unless it was explicitly specified. That is,
getWavefrontSize() == 64 no longer implies +wavefrontsize64.
getWavefrontSize() == 32 does imply +wavefrontsize32.
Continue to assume the value is 64 with no wavesize feature.
This maintains the codegenable property without any code
that directly cares about the wavesize needing to worry about it.
Introduce an isWaveSizeKnown helper to check if we know the
wavesize is accurate based on having one of the features explicitly
set, or a known target-cpu.
I'm not sure what's going on in wave_any.s. It's testing what
happens when both wavesizes are enabled, but this is treated
as an error in codegen. We now treat wave32 as the winning
case, so some cases that were previously printed as vcc are now
vcc_lo.