Poor performance on function boundary detection. #1

Open
5c4lar opened this issue Dec 25, 2021 · 12 comments

5c4lar commented Dec 25, 2021

We evaluated the function boundary detection accuracy of DeepDi on our dataset with the following code:

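import ctypes
import os

# DD is the DeepDi Python binding from the master branch (its import is omitted in this post).

def process(file):
    """Write every function start DeepDi reports for `file` to output/<basename>."""
    lines = []
    with DD.Open(file.encode()) as file_data:
        for sec in file_data.sections.iter(DD.Section):
            # Only non-empty executable sections are disassembled.
            if not sec.executable or sec.end - sec.start <= 0:
                continue
            text_result = file_data.disassemble(sec.start, sec.end, False)
            for data in text_result.functions.iter(ctypes.c_int64):
                lines.append(str(data.value))

    os.makedirs('output', exist_ok=True)
    with open(os.path.join('output', os.path.basename(file)), "w") as f:
        f.write(''.join(line + '\n' for line in lines))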

The results are much worse than those reported in your paper: average precision is 0.19743204989853363 and average recall is 0.10910939236737424. We also tested ddisasm, which performs much better, with both precision and recall close to 1.0.
Is there anything wrong with my implementation? If so, could you please give an example of the right way to detect function boundaries with DeepDi? Or is the model simply overfitting and failing to generalize?
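
For readers who want to reproduce this kind of measurement, precision and recall for function-start detection are usually computed by comparing the set of predicted entry addresses with a ground-truth set. The sketch below is only an illustration of that comparison, not the evaluation script used in this thread; the helper name `score_function_starts` and the sample addresses are hypothetical.

```python
def score_function_starts(predicted, ground_truth):
    """Return (precision, recall) for two collections of function start addresses."""
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    true_positives = len(predicted & ground_truth)
    # Convention assumed here: an empty denominator yields 0.0.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical addresses, for illustration only.
truth = {0x401000, 0x401040, 0x4010A0, 0x401100}
found = {0x401000, 0x401040, 0x401200}
precision, recall = score_function_starts(found, truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

When scores come out unexpectedly low, it may be worth checking that both sets use the same address space (for example, virtual addresses rather than file offsets), since a systematic mismatch would make almost every prediction count as wrong.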

Member

Nifury commented Dec 25, 2021

Your implementation looks good to me. The poor accuracy could be caused by a bug in my code, or overfitting, but I’m on vacation and cannot look into this issue immediately. I’ll keep you updated when I get home. It would also help a lot if you could share one of your binaries.

Happy Holidays!

Author

5c4lar commented Dec 25, 2021

Also, we tested on the master branch; the interface on the release branch seems to be very different. I tried to test on the release branch but always get "OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library" when trying to initialize the disassembler. Could that be related to this problem? Will the release branch perform better? How should I fix the problem? Or when will the CPU-only version be available?

Author

5c4lar commented Dec 25, 2021

On SPEC 2006, which is one of the datasets you tested, the results are:

precision 0.3034157479771696
recall 0.2690107340170151

That's strange... Maybe we can discuss this later 🤨

Happy Holidays!

Author

5c4lar commented Dec 25, 2021

Member

Nifury commented Dec 28, 2021

Thanks for providing the binary! I briefly went over the assembly code, and I think the endbr64 instruction is the reason for DeepDi's low accuracy. endbr64 was introduced in GCC 8 and never appeared in the training set. DeepDi doesn't recognize this instruction and thus won't classify it as a function entry point.
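
A quick way to check whether a binary is affected is to look for the endbr64 byte pattern (F3 0F 1E FA) in its code bytes. The following is a rough sketch under stated assumptions: `count_endbr64` is a hypothetical helper, the sample buffer is hand-assembled for illustration, and a raw byte search can also match data or the middle of another instruction, so the count is only an indication.

```python
ENDBR64 = bytes.fromhex("f30f1efa")  # machine encoding of the endbr64 instruction


def count_endbr64(code: bytes) -> int:
    """Count occurrences of the endbr64 byte pattern in a buffer of code bytes."""
    # The pattern cannot overlap itself, so a non-overlapping count is exact.
    return code.count(ENDBR64)


# Hypothetical buffer: endbr64; push rbp; mov rbp, rsp; pop rbp; ret; endbr64
sample = bytes.fromhex("f30f1efa" "55" "4889e5" "5d" "c3" "f30f1efa")
print(count_endbr64(sample))
```

To apply this to a real file, one would pass the bytes of its executable sections instead of the sample buffer. A count close to the number of functions suggests the binary was built with control-flow protection enabled.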

The master branch is more or less obsolete, and it is CPU-only. The release branch is for Artifact Evaluation and should offer the same performance as reported in the paper. The error "OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library" indicates that some dependencies are missing. The release branch is GPU-only and requires an NVIDIA GPU. Please refer to the README for more details.

> when will the CPU-only version be available?

Unfortunately, we don't currently have a timeline for the CPU-only version.

Happy New Year!

BinMl commented Jan 16, 2022

@Nifury Hi! Thanks for providing the artifact code. May I know whether any Linux-based version would be provided? The README says "Native support for Windows and Linux", but the library seems to be Windows-only.

BinMl commented Jan 16, 2022

@Nifury Hi, I also ran into the low precision and recall issue. I have attached the non-stripped binary at the end of this post. I strip the binary before running DeepDi.

I use the code proposed by @ucasqsl and the library from the master branch. I cannot run DeepDi with the release branch since my Windows laptop does not have a GPU. My GCC version is 7.5.0.

If I understand correctly, the function boundaries should match the output of readelf -s <non-stripped-binary>. Would you mind taking a look at it? Thanks in advance for your help.

spec2000-clang-64-O0-gcc.zip
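
As an illustration of the readelf-based ground truth mentioned above, the sketch below extracts the addresses of defined FUNC symbols from the text printed by readelf -s. It is a minimal example, not the script used in this thread: `function_starts_from_readelf` is a hypothetical helper, the sample symbol table is made up, and the parser assumes the usual column layout (Num, Value, Size, Type, Bind, Vis, Ndx, Name).

```python
def function_starts_from_readelf(symtab_text):
    """Extract addresses of defined FUNC symbols from `readelf -s` text output."""
    starts = set()
    for line in symtab_text.splitlines():
        fields = line.split()
        # Symbol rows look like: Num: Value Size Type Bind Vis Ndx [Name]
        if len(fields) < 7 or not fields[0].endswith(":"):
            continue
        value, sym_type, section = fields[1], fields[3], fields[6]
        # Skip non-functions and undefined (imported) symbols.
        if sym_type != "FUNC" or section == "UND":
            continue
        starts.add(int(value, 16))
    return starts


# Made-up symbol table in the layout assumed above.
SAMPLE = """\
Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000401000    42 FUNC    GLOBAL DEFAULT   14 main
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf
     3: 0000000000404028     4 OBJECT  GLOBAL DEFAULT   25 counter
"""
starts = function_starts_from_readelf(SAMPLE)
print(sorted(hex(address) for address in starts))
```

In practice the text could come from running readelf -s --wide on the non-stripped binary (for example through subprocess.run with capture_output=True and text=True), and the resulting set can then be compared with the addresses reported by the disassembler.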

Member

Nifury commented Jan 16, 2022

@BinMl Thanks for your interest in DeepDi, and thanks for providing your binary!

It seems that the library from the master branch is indeed buggy. I also tested the code from the release branch, and it gave a more reasonable result: 5 false positives and 1 false negative.

> May I know whether any Linux-based version would be provided?

Currently only the Windows library is provided. I could try to fix the code in the master branch or port it to Linux in my free time, but I may not be able to do it in a timely manner.

BinMl commented Jan 16, 2022

@Nifury Thanks for your prompt reply!

> It seems that the library from the master branch is indeed buggy. I also tested the code from the release branch, and it gave a more reasonable result: 5 false positives and 1 false negative.

That sounds great! I found that the APIs of the master branch and the release branch are a bit different. Would you mind sharing your code so that I can verify the correctness of my demo (although I know the code is simple, lol)?

On the other hand, it is sad that I cannot play with DeepDi for a while, since I can hardly find a Windows machine with CUDA enabled ;(.

I also noticed that DeepDi is a commercial tool (https://www.deepbitstech.com/deepdi.html). May I know whether the personal edition includes a bug-free CPU-only version? I am considering buying it but am a bit hesitant because of the issues with the CPU-only version.

Member

Nifury commented Jan 17, 2022

Okay, I'm working on releasing a Docker image so that it will be easier for everyone to evaluate it. The image will contain both CPU and GPU versions and will be free for non-commercial use.

If everything goes well, it will be released in the next few days 😀

Member

Nifury commented Jan 27, 2022

Just committed the new version. Feel free to test it and report any issues.

@lo-cloud

Could you please explain in more detail how to detect function boundaries with DeepDi? I don't know how to do it, and I really need it.
