use of the same substitution for different template parameters is very hard to demangle #106
It looks like this problem manifests in non-lambda situations too. For example:
produces the mangling
and libc++abi's demangler produces:
... which are both wrong in the same way: the second |
See also #68 |
I think on reflection this might be a bug in the various implementations. In particular, we say: "Note that substitutable components are the represented symbolic constructs, not their associated mangling character strings." It seems to me that the template parameters of distinct template specializations (such as If that's right, then the first above example should mangle as |
What about function parameters? For example:
is mangled by GCC and Clang as
and by ICC as
(the A demangler that chooses to demangle the above as
... would encounter exactly the same issues that we see for template parameters: the second |
The ABI has an example which clearly supports ICC about your unrelated issue:
I wonder if GCC and Clang are just failing to adjust L when mangling. |
Well, I mean, they don't have nothing to do with each other. They're references to the same parameter, just being redundantly mangled. So I think the question here is whether that relationship is actually reasonable to ask demanglers to handle. |
I agree that that's how it should work. I do worry about whether mangler implementations are going to end up having trouble with template redeclarations if we change this, though. Also, fixing this is ABI-affecting. Do we think this case is marginal enough that this is actually acceptable to change? |
One of them is the first parameter of
I've implemented an approximation of this for Clang as follows: track for each substitution the innermost <encoding> whose function/template parameters it refers to, and don't allow a substitution to be used once we leave its innermost referenced <encoding>. That's not entirely faithful to the idea that we use substitutions for the same symbolic construct (different <encoding>s that refer to entities in the same scope won't be able to share substitutions), but it's a lot easier to implement than doing it correctly would be. I think it's also a lot easier to specify (a substitution that refers to a function or template parameter in an <encoding> cannot be used outside of that <encoding>). Maybe that's good enough?
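As a rough illustration of the approximation described above, here is a minimal C++ sketch of a substitution table that records, for each entry, the <encoding> whose parameters it references, and refuses to return the entry once that <encoding> has been left. This is not Clang's code: the names SubstitutionTable, enterEncoding, leaveEncoding, add, and find are hypothetical, string keys stand in for the mangler's real notion of symbolic identity, and the caller is assumed to already know which <encoding> a component refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a mangler-side substitution table (not Clang's API).
class SubstitutionTable {
public:
    static constexpr int kNoOwner = -1;   // entry references no parameters
    static constexpr int kNotFound = -1;  // returned by find() on a miss

    // Start mangling a (possibly nested) <encoding>; returns its id.
    int enterEncoding() {
        int id = nextEncodingId_++;
        openEncodings_.push_back(id);
        return id;
    }

    // Finish the innermost <encoding>. Entries are kept, so the numbering of
    // later substitutions is unchanged; they just stop being findable.
    void leaveEncoding() {
        assert(!openEncodings_.empty());
        openEncodings_.pop_back();
    }

    // Record a substitution candidate. `owner` is the id of the innermost
    // <encoding> whose function/template parameters the component refers to,
    // or kNoOwner if it refers to none. Returns the entry's index.
    int add(const std::string& component, int owner) {
        entries_.push_back(Entry{component, owner});
        return static_cast<int>(entries_.size()) - 1;
    }

    // Look up a usable substitution: the key must match, and the owning
    // <encoding> (if any) must still be open.
    int find(const std::string& component) const {
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            const Entry& e = entries_[i];
            if (e.component == component &&
                (e.owner == kNoOwner || isOpen(e.owner)))
                return static_cast<int>(i);
        }
        return kNotFound;
    }

private:
    struct Entry {
        std::string component;  // stand-in for the symbolic construct
        int owner;              // owning <encoding> id, or kNoOwner
    };

    bool isOpen(int id) const {
        for (int open : openEncodings_)
            if (open == id) return true;
        return false;
    }

    std::vector<Entry> entries_;
    std::vector<int> openEncodings_;
    int nextEncodingId_ = 0;
};
```

In this model, an entry for OT_ added while f's <encoding> is open can be found until that <encoding> is left and not afterwards. The entry still occupies its slot, so later indices line up with what a demangler keeping a flat list would compute.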
Well, I raised this issue because we hit this in practice -- one of our internal systems was unhappy that we were generating symbols we couldn't demangle (item 3 in the original comment), and while fixing that I noticed that the demangling was still wrong after fixing the demangler. The setup there was passing a local generic lambda to a function template that passed another local generic lambda to another template, resulting in both lambdas appearing in the mangling of the same symbol. So it's not theoretical; such manglings do arise in practice. I've not seen any cases where this really matters for ABI reasons (where the same symbol would be generated in more than one translation unit), nor any cases where the identity of the symbol matters and duplicates with different manglings would be a real problem. But I would not discount the possibility that they're out there somewhere. I think on balance using a rule that can be reasonably demangled is worth the risk of breaking something here. These cases are very uncommon right now but are going to become increasingly common with increased adoption of generic lambdas and the use of lambdas in function signatures. Implementations could emit both symbols for a time if they're concerned (I think GCC does this sort of thing in some cases already). |
I've tried to implement this correctly (not merely approximately) -- that is, treat substitutions as symbolically different if they refer to function or template parameters of different entities. It does indeed seem to be difficult. It seems like we could either say:
Option 3 seems most promising to me. As noted above, my current foray into this space says that a substitution created within an <encoding> can only be used within that same <encoding> if it references a function or template parameter. That seems suitably easy to implement, and requires no changes to existing demanglers. If there's interest in that direction, I can try to gather some data on how common such manglings are. If we don't want to risk an ABI change, then formalizing option 1 (the de facto current rule) and leaving demangler implementers with a headache seems like the path to take. |
Wearing my Apple hat, I'm tentatively comfortable with an ABI change here. Apple's interest as a platform owner is in (1) maintaining binary compatibility with libc++ as a first-party library vendor and (2) allowing third parties to vend their own reasonable binary interfaces, and it doesn't seem like either of those is implicated here — I certainly hope users don't expect this kind of template declaration with lambdas and type inference to be a stable part of a library interface. |
Another place where this comes up is when mangling requirements for member-like constrained friend templates:
template<typename T, typename U> concept C = true;
template<typename T, typename U> struct X {};
template<typename T> struct A {
template<typename U> requires C<T, U> friend X<T, U> f(...) {}
};
void g(A<int> a) { f<float>(a); }
Clang mangles |
There's a case now with friend function templates where we need to be mangling the declaring type because friends of different types are different entities by the ODR, isn't there? Does this reliably fall into that case, or is it a more general problem? Or am I misremembering how that DR got resolved? |
Yes, there is such a case, and it applies in the prior example (Clang currently gets this wrong for friend function templates, but we're working on fixing that -- the mangling of
template<typename T> concept C = true;
template<typename T> struct A {
template<typename U> requires C<T> && C<U> void f(T, U) {}
};
void g(A<int> a) { a.f(0, 0); }
Clang mangles this as
Then in the function parameters:
|
I see. It seems wrong that |
If we could do everything all over again, it might make sense to number template depths from the inside out, not from the outside in, so that substituting outer levels (or ignoring them for friend declarations) doesn't change things that aren't dependent on the outer parameters -- and in particular, so that
Following on from my prior suggestion, here's a possible approach:
So, broadly, we don't reuse substitutions when they contain a symbolic reference to a (function or template) parameter by depth and index unless we're in the same substitution scope where the substitution was created. This would purely be a compiler change: demanglers can continue to keep a flat list of substitutions. I think that would address all the issues in this PR. It should also mean that compilers that use different notions of symbolic identity for function or template parameters (e.g., depth and index versus de Bruijn indexes) should be able to use their own internal notions of "same type" for substitutions and still produce the same manglings as each other. It also seems pretty straightforward and efficient to implement, and shouldn't change any manglings that don't already run into problems where their substitutions don't make sense.
We would then mangle my previous example as
For my second comment the mangling would change from |
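For readers less familiar with the demangler side, the "flat list of substitutions" mentioned above is indexed by the seq-id in each S..._ reference. The sketch below decodes such a reference into a zero-based list index, following the ABI's rule that S_ names the first candidate and S<seq-id>_ uses base 36 with digits and upper-case letters. The function name decodeSubstitutionIndex and the -1 error convention are my own for illustration, and overflow on very long seq-ids is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode "S_", "S0_", "S1_", ..., "SZ_", "S10_", ... into a zero-based index
// into a demangler's flat substitution list. Returns -1 for anything else,
// including the standard abbreviations such as "St" or "Ss".
// Assumes an ASCII-compatible character set (contiguous 'A'..'Z').
long decodeSubstitutionIndex(const std::string& ref) {
    if (ref.size() < 2 || ref.front() != 'S' || ref.back() != '_')
        return -1;
    if (ref.size() == 2)
        return 0;  // "S_" is the first substitution candidate

    long value = 0;
    for (std::size_t i = 1; i + 1 < ref.size(); ++i) {
        char c = ref[i];
        int digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'A' && c <= 'Z')
            digit = c - 'A' + 10;
        else
            return -1;  // not a base-36 seq-id character
        value = value * 36 + digit;
    }
    return value + 1;  // "S0_" is the second candidate, and so on
}
```

Under this numbering, the S2_ in the mangling from the original report selects index 3 of the flat list. The proposal changes which references a compiler emits, not how a demangler resolves them.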
Example:
GCC, Clang, and EDG agree that the operator() mangling comes out as _ZN1XIZ1fIiEvOT_EUlS2_DpT0_E_EclIJEEEvDpT_, which no-one can demangle. At least three different things go wrong here:
1. The OT_ referring to the first parameter in the lambda-sig got rewritten to a substitution S2_. This completely breaks LLVM's demangling strategy, which rewrites template parameters (and substitutions) to the corresponding type as they're parsed, and so has no way to even represent a substitution that might mean two completely different things in different contexts, as happens here.
2. The Dp apparently confuses GCC's demangler, leaving it unable to see that T0_ means auto.
3. The (hidden by substitution) T_ and T0_ appear in a context where there is a level of template parameters in scope already. That confuses LLVM's demangler (but that seems like a comparatively straightforward bug).
My focus here is problem 1: allowing references to the lambda's implicit template parameter to be rewritten as a substitution referring to f's template parameter seems problematic. It's not clear whether the rules intend that, but it's at least what three different compilers do. That choice means that we can't expand substitutions as we parse during demangling -- we must preserve the original form of the substitution string and re-process it, because a T_ appearing within it can mean different things for different uses of the same substitution.
Perhaps distinct template parameters should never be considered the same for the purpose of forming substitutions, even if they have the same depth and index.
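To make problem 1 concrete, here is a toy C++ sketch of why expanding a substitution once and caching the result goes wrong. It assumes a hypothetical helper, expand, that understands only O (rvalue reference), T_ (first template parameter), and T<digit>_ (later parameters); it is not how LLVM's or GCC's demangler is written, and the CachedSubstitution struct is likewise only illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Recursive helper: expands the component starting at s[pos] using the
// template parameters currently in scope. Unknown input yields "?".
static std::string expandAt(const std::string& s, std::size_t& pos,
                            const std::vector<std::string>& params) {
    if (pos >= s.size())
        return "?";
    if (s[pos] == 'O') {  // rvalue reference to the following type
        ++pos;
        return expandAt(s, pos, params) + "&&";
    }
    if (s[pos] == 'T') {  // T_ is parameter 0, T0_ is parameter 1, ...
        ++pos;
        std::size_t index = 0;
        if (pos < s.size() && s[pos] >= '0' && s[pos] <= '9') {
            index = static_cast<std::size_t>(s[pos] - '0') + 1;
            ++pos;
        }
        if (pos < s.size() && s[pos] == '_')
            ++pos;
        return index < params.size() ? params[index] : "?";
    }
    ++pos;
    return "?";
}

// Expand a tiny subset of mangled type syntax in a given parameter context.
std::string expand(const std::string& encoded,
                   const std::vector<std::string>& params) {
    std::size_t pos = 0;
    return expandAt(encoded, pos, params);
}

// A substitution entry that keeps both the raw text and the eager expansion.
struct CachedSubstitution {
    std::string raw;       // original mangled text, e.g. "OT_"
    std::string expanded;  // result of expanding when first parsed
};
```

If OT_ is first parsed inside f<int>, the cached expansion is int&&. When the same entry is later referenced from the lambda-sig, re-expanding the raw text with the lambda's own parameters in scope gives auto&& instead, so only the raw form can serve both uses.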