Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested catch statements that span multiple basic blocks lead to invalid decompilations/exceptions #43

Open
aweinstock314 opened this issue Feb 7, 2022 · 4 comments

Comments

@aweinstock314
Copy link

aweinstock314 commented Feb 7, 2022

The following classes (manually assembled by Krakatau) fail to decompile with the develop branch of Procyon.

.version 49 0
.class super public NestedCatchA
.super java/lang/Object

.method static public a : (Ljava/lang/Object;)V
    .limit stack 3
    .limit locals 2

    L1: 
        aload_0
        ifnonnull L3
        aconst_null
        athrow
        goto L3
    L2: 
        .stack stack_1 Object java/lang/Throwable
        athrow
        .stack same
    L3: 
        getstatic java/lang/System out Ljava/io/PrintStream;
        ldc "A"
        invokevirtual java/io/PrintStream println (Ljava/lang/String;)V
        .stack same
        return
    L4: 
        .stack stack_1 Object java/lang/Throwable
        astore_1
        getstatic java/lang/System out Ljava/io/PrintStream;
        ldc "B"
        invokevirtual java/io/PrintStream println (Ljava/lang/String;)V
        .stack same
        return

    .catch java/lang/Throwable from L1 to L2 using L2
    .catch java/lang/Throwable from L2 to L3 using L4
.end method

.method static public main : ([Ljava/lang/String;)V
    .limit stack 2
    .limit locals 2
    ldc "foo"
    invokestatic NestedCatchA a (Ljava/lang/Object;)V
    aconst_null
    invokestatic NestedCatchA a (Ljava/lang/Object;)V
    return
.end method
.end class
.version 49 0
.class super public NestedCatchB
.super java/lang/Object

.method static public a : (Ljava/lang/Object;)V
    .limit stack 3
    .limit locals 2

    L1: 
        aload_0
        ifnonnull L3
        aconst_null
        athrow
    L3: 
        getstatic java/lang/System out Ljava/io/PrintStream;
        ldc "A"
        invokevirtual java/io/PrintStream println (Ljava/lang/String;)V
        .stack same
        return
    L2: 
        .stack stack_1 Object java/lang/Throwable
        athrow
        .stack same
    L4: 
        astore_1
        getstatic java/lang/System out Ljava/io/PrintStream;
        ldc "B"
        invokevirtual java/io/PrintStream println (Ljava/lang/String;)V
        .stack same
        return

    .catch java/lang/Throwable from L1 to L3 using L2
    .catch java/lang/Throwable from L3 to L2 using L4
.end method

.method static public main : ([Ljava/lang/String;)V
    .limit stack 2
    .limit locals 2
    ldc "foo"
    invokestatic NestedCatchB a (Ljava/lang/Object;)V
    aconst_null
    invokestatic NestedCatchB a (Ljava/lang/Object;)V
    return
.end method
.end class

The former (NestedCatchA), decompiles to source code that javac rejects due to throw; not providing an expression.

// 
// Decompiled by Procyon v1.0-SNAPSHOT
// 

public class NestedCatchA
{
    public static void a(final Object o) {
        Label_0010: {
            try {
                if (o == null) {
                    throw null;
                }
                break Label_0010;
            }
            catch (Throwable t) {
                throw t;
            }
            try {
                throw;
                System.out.println("A");
            }
            catch (Throwable t2) {
                System.out.println("B");
            }
        }
    }

    public static void main(final String[] array) {
        a("foo");
        a(null);
    }
}

The latter (NestedCatchB) causes the exception from com.strobel.decompiler.ast.Error.expressionLinkedFromMultipleLocations to be thrown.

// 
// Decompiled by Procyon v1.0-SNAPSHOT
// 

public class NestedCatchB
{
    public static void a(final Object p0) {
        // 
        // This method could not be decompiled.
        // 
        // Original Bytecode:
        // 
        //     1: ifnonnull       6
        //     4: aconst_null    
        //     5: athrow         
        //     6: getstatic       java/lang/System.out:Ljava/io/PrintStream;
        //     9: ldc             "A"
        //    11: invokevirtual   java/io/PrintStream.println:(Ljava/lang/String;)V
        //    14: return         
        //    15: athrow         
        //    16: astore_1       
        //    17: getstatic       java/lang/System.out:Ljava/io/PrintStream;
        //    20: ldc             "B"
        //    22: invokevirtual   java/io/PrintStream.println:(Ljava/lang/String;)V
        //    25: return         
        //    StackMapTable: 00 04 0E 40 07 00 0B 00 08
        //    Exceptions:
        //  Try           Handler
        //  Start  End    Start  End    Type                 
        //  -----  -----  -----  -----  ---------------------
        //  0      6      15     16     Ljava/lang/Throwable;
        //  6      15     16     26     Ljava/lang/Throwable;
        // 
        // The error that occurred was:
        // 
        // java.lang.IllegalStateException: Expression is linked from several locations: Label_0006:
        //     at com.strobel.decompiler.ast.Error.expressionLinkedFromMultipleLocations(Error.java:27)
        //     at com.strobel.decompiler.ast.AstOptimizer.mergeDisparateObjectInitializations(AstOptimizer.java:2596)
        //     at com.strobel.decompiler.ast.AstOptimizer.optimize(AstOptimizer.java:235)
        //     at com.strobel.decompiler.ast.AstOptimizer.optimize(AstOptimizer.java:42)
        //     at com.strobel.decompiler.languages.java.ast.AstMethodBodyBuilder.createMethodBody(AstMethodBodyBuilder.java:214)
        //     at com.strobel.decompiler.languages.java.ast.AstMethodBodyBuilder.createMethodBody(AstMethodBodyBuilder.java:99)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.createMethodBody(AstBuilder.java:782)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.createMethod(AstBuilder.java:675)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.addTypeMembers(AstBuilder.java:552)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.createTypeCore(AstBuilder.java:519)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.createTypeNoCache(AstBuilder.java:161)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.createType(AstBuilder.java:150)
        //     at com.strobel.decompiler.languages.java.ast.AstBuilder.addType(AstBuilder.java:125)
        //     at com.strobel.decompiler.languages.java.JavaLanguage.buildAst(JavaLanguage.java:71)
        //     at com.strobel.decompiler.languages.java.JavaLanguage.decompileType(JavaLanguage.java:59)
        //     at com.strobel.decompiler.DecompilerDriver.decompileType(DecompilerDriver.java:330)
        //     at com.strobel.decompiler.DecompilerDriver.main(DecompilerDriver.java:144)
        // 
        throw new IllegalStateException("An error occurred while decompiling this method.");
    }

    public static void main(final String[] array) {
        a("foo");
        a(null);
    }
}

I'd expect them to decompile to something similar to NestedCatch{A,B}Expected.

public class NestedCatchAExpected {
    public static void a(final Object o) {
        try {
            if(o == null) {
                throw null;
            }
            System.out.println("A");
        }
        catch (Throwable t) {
            try {
                throw t;
            } catch(Throwable t2) {
                System.out.println("B");
            }
        }
    }

    public static void main(String[] args) {
        a("foo");
        a(null);
    }
}
public class NestedCatchBExpected {
    public static void a(final Object o) {
        try {
            if(o == null) {
                throw null;
            }
            try {
                System.out.println("A");
            } catch(Throwable t2) {
                System.out.println("B");
            }
        }
        catch (Throwable t) {
            throw t;
        }
    }

    public static void main(String[] args) {
        a("foo");
        a(null);
    }
}

I'd be interested in working on a fix, if you have suggestions for how to proceed.

@aweinstock314
Copy link
Author

It seems like the exception handler's catch block gets duplicated in AstBuilder.convertToAst.

I printed the following internal structures:

--- a/Procyon.CompilerTools/src/main/java/com/strobel/decompiler/ast/AstBuilder.java
+++ b/Procyon.CompilerTools/src/main/java/com/strobel/decompiler/ast/AstBuilder.java
@@ -106,18 +106,26 @@ public final class AstBuilder {
         LOG.fine("Performing stack analysis...");

         final List<ByteCode> byteCode = builder.performStackAnalysis();

         LOG.fine("Creating bytecode AST...");
+        System.out.println("bytecode");
+        for(ByteCode b : byteCode) {
+            System.out.println(b);
+        }

         @SuppressWarnings("UnnecessaryLocalVariable")
         final List<Node> ast = builder.convertToAst(
             byteCode,
             new LinkedHashSet<>(builder._exceptionHandlers),
             0,
             new MutableInteger(byteCode.size())
         );
+        System.out.println("bytecode ast");
+        for(Node n : ast) {
+            System.out.println(n);
+        }

And got the following output:

  • NestedCatchA
bytecode
#0000: load p0 StackBefore={} StoreTo={stack_01_0} VariablesBefore={#0000,}
#0001: ifnonnull Label_0010 StackBefore={#0000} VariablesBefore={#0000,}
#0004:* aconstnull StackBefore={} StoreTo={stack_05_0} VariablesBefore={#0000,}
#0005: athrow StackBefore={#0004} VariablesBefore={#0000,}
#0009:* athrow StackBefore={#0000} VariablesBefore={#0000,}
#0010:* getstatic java/lang/System.out:Ljava/io/PrintStream; StackBefore={} StoreTo={stack_0F_0} VariablesBefore={#0000,}
#0013: ldc A StackBefore={#0010} StoreTo={stack_0F_1} VariablesBefore={#0000,}
#0015: invokevirtual java/io/PrintStream.println:(Ljava/lang/String;)V StackBefore={#0010,#0013} VariablesBefore={#0000,}
#0018: return StackBefore={} VariablesBefore={#0000,}
#0019:* store var_1_13 StackBefore={#0000} VariablesBefore={#0000,}
#0020: getstatic java/lang/System.out:Ljava/io/PrintStream; StackBefore={} StoreTo={stack_19_0} VariablesBefore={#0000,#0019}
#0023: ldc B StackBefore={#0020} StoreTo={stack_19_1} VariablesBefore={#0000,#0019}
#0025: invokevirtual java/io/PrintStream.println:(Ljava/lang/String;)V StackBefore={#0020,#0023} VariablesBefore={#0000,#0019}
#0028: return StackBefore={} VariablesBefore={#0000,#0019}
bytecode ast
try {
    stack_01_0 = p0
    __ifnonnull(Label_0010, stack_01_0)
    Label_0004:
    stack_05_0 = aconstnull()
    athrow(stack_05_0)
}
catch (java.lang.Throwable stack_09_0) {
    Label_0009:
    athrow(stack_09_0)
}

try {
    Label_0009:
    athrow(stack_09_0)
    Label_0010:
    stack_0F_0 = getstatic(System::out)
    stack_0F_1 = ldc("A")
    invokevirtual(PrintStream::println, stack_0F_0, stack_0F_1)
    return()
}
catch (java.lang.Throwable stack_13_0) {
    Label_0019:
    var_1_13 = stack_13_0
    stack_19_0 = getstatic(System::out)
    stack_19_1 = ldc("B")
    invokevirtual(PrintStream::println, stack_19_0, stack_19_1)
    return()
}
  • NestedCatchB
bytecode
#0000: load p0 StackBefore={} StoreTo={stack_01_0} VariablesBefore={#0000,}
#0001: ifnonnull Label_0006 StackBefore={#0000} VariablesBefore={#0000,}
#0004:* aconstnull StackBefore={} StoreTo={stack_05_0} VariablesBefore={#0000,}
#0005: athrow StackBefore={#0004} VariablesBefore={#0000,}
#0006:* getstatic java/lang/System.out:Ljava/io/PrintStream; StackBefore={} StoreTo={stack_0B_0} VariablesBefore={#0000,}
#0009: ldc A StackBefore={#0006} StoreTo={stack_0B_1} VariablesBefore={#0000,}
#0011: invokevirtual java/io/PrintStream.println:(Ljava/lang/String;)V StackBefore={#0006,#0009} VariablesBefore={#0000,}
#0014: return StackBefore={} VariablesBefore={#0000,}
#0015:* athrow StackBefore={#0000} VariablesBefore={#0000,}
#0016:* store var_1_10 StackBefore={#0000} VariablesBefore={#0000,}
#0017: getstatic java/lang/System.out:Ljava/io/PrintStream; StackBefore={} StoreTo={stack_16_0} VariablesBefore={#0000,#0016}
#0020: ldc B StackBefore={#0017} StoreTo={stack_16_1} VariablesBefore={#0000,#0016}
#0022: invokevirtual java/io/PrintStream.println:(Ljava/lang/String;)V StackBefore={#0017,#0020} VariablesBefore={#0000,#0016}
#0025: return StackBefore={} VariablesBefore={#0000,#0016}
bytecode ast
try {
    stack_01_0 = p0
    __ifnonnull(Label_0006, stack_01_0)
    Label_0004:
    stack_05_0 = aconstnull()
    athrow(stack_05_0)
    Label_0006:
    stack_0B_0 = getstatic(System::out)
    stack_0B_1 = ldc("A")
    invokevirtual(PrintStream::println, stack_0B_0, stack_0B_1)
    return()
}
catch (java.lang.Throwable stack_0F_0) {
    Label_0015:
    athrow(stack_0F_0)
}

try {
    Label_0006:
    stack_0B_0 = getstatic(System::out)
    stack_0B_1 = ldc("A")
    invokevirtual(PrintStream::println, stack_0B_0, stack_0B_1)
    return()
    Label_0015:
    athrow(stack_0F_0)
}
catch (java.lang.Throwable stack_10_0) {
    Label_0016:
    var_1_10 = stack_10_0
    stack_16_0 = getstatic(System::out)
    stack_16_1 = ldc("B")
    invokevirtual(PrintStream::println, stack_16_0, stack_16_1)
    return()
}

In the former, I think that the presence of

    Label_0009:
    athrow(stack_09_0)

in the second try block is incorrect, and that in the latter, the presence of

    Label_0015:
    athrow(stack_0F_0)

in the second try block is incorrect.

I suspect they're being included because they're within the bounds [startOffset, endOffset] for the ExceptionTableEntrys, but so far it seems like most of AstBuilder is correctly using [startOffset, endOffset).

@aweinstock314
Copy link
Author

It seems like the call to ensureDesiredProtectedRanges in AstBuilder.pruneExceptionHandlers is enlarging the range of the second try block in NestedCatchA from [9, 10) to [9, 19).
Commenting that out causes procyon to give the following outputs, which are closer to correct (though there's a spurious goto in the former, and I'm not 100% sure of the nesting structure on the latter):

// Decompiled by Procyon v1.0-SNAPSHOT
// 

public class NestedCatchA
{   
    public static void a(final Object o) {
        try {
            if (o == null) {
                throw null;
            }
            goto Label_0010;
        }
        catch (Throwable t) {
            try {
                throw t;
            }
            catch (Throwable t2) {
                System.out.println("B");
            }
        }
    }

    public static void main(final String[] array) {
        a("foo");
        a(null);
    }
}
// 
// Decompiled by Procyon v1.0-SNAPSHOT
// 

public class NestedCatchB
{   
    public static void a(final Object o) {
        try {
            if (o == null) {
                throw null;
            }
        }
        catch (Throwable t) {
            throw t;
        }
        try {
            System.out.println("A");
        }
        catch (Throwable t2) {
            System.out.println("B");
        }
    }

    public static void main(final String[] array) {
        a("foo");
        a(null);
    }
}

This would break other test cases though, so I'll still need to understand ensureDesiredProtectedRanges to figure out how to make it work for the sorts of try-catches in these examples.

@mstrobel
Copy link
Owner

mstrobel commented Feb 8, 2022

Oh lord, you've stumbled into Procyon's weakest link--it's exception handler processing. It's pretty broken, and I'm honestly surprised it works as well as it does (most of the time). It's really not hard to break it, especially if you're feeding it classes that weren't emitted by a Java compiler.

I've promised myself I'll rewrite it at some point, but I've been telling myself that for years, so 🤷🏻‍♂️.

@aweinstock314
Copy link
Author

I'd still like to fix this (this is a minimized reproducer for a larger program that fails to decompile some methods, with the same exception as NestedCatchB).

Do you think it's likely that local fixes can work, or that a partial rewrite is needed?

Current local fixes I'm considering:

  • Without ensureDesiredProtectionRanges, testFinallyWithinFinally duplicates one of the inner subtrees outside the finally. I think this symptom can be addressed by keeping track of which ranges of instructions have already been processed, and using that information when calculating the start/end indices in convertToAst.
  • The goto that's retained in the most recent output of NestedCatchA also shows up in testComplexNestedTryCatchFinallyWithThrowingOuterFinally. (I realize now that my NestedCatchAExpected is incorrect, and that System.out.println("A"); is not contained within either try block). I think this is because the target of the conditional jump isn't added to the AST, so it should probably be added to the end, with its label, after a return, to be cleaned up later by GotoRemoval.
  • If there's some invariants that javac-generated try/catch/finally blocks always follow in terms of which basic blocks are next to each other, perhaps a preprocessing step can re-order basic blocks (inserting jumps and patching offsets as needed) to make the input look more javac-like (using information from the control flow graph, since it looks from the graphviz like that's correct and faithful to the class file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants