Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bp_prefix not set on breakpoints declared with function name instead of hex_address ( #1039

Closed
Angelo942 opened this issue Jan 4, 2024 · 5 comments

Comments

@Angelo942
Copy link
Contributor

gef/gef.py

Line 7571 in deeab2f

def context_code(self) -> None:

gef/gef.py

Lines 7578 to 7579 in deeab2f

breakpoints = gdb.breakpoints() or []
bp_locations = [b.location for b in breakpoints if b.location and b.location.startswith("*")]

gef/gef.py

Line 7589 in deeab2f

for insn in self.instruction_iterator(pc, nb_insn, nb_prev=nb_insn_prev):

gef/gef.py

Line 7593 in deeab2f

bp_prefix = Color.redify(BP_GLYPH) if self.addr_has_breakpoint(insn.address, bp_locations) else " "

gef/gef.py

Lines 7568 to 7569 in deeab2f

def addr_has_breakpoint(self, address: int, bp_locations: List[str]) -> bool:
return any(hex(address) in b for b in bp_locations)

b.location is just the string used by the user to set the breakpoint not its address, so the current code relies on the user using an hexadecimal address and doesn't handle breakpoints set on a function.

Would it be worth it to parse info breakpoints to extract all the addresses instead of ignoring some breakpoints ?

I also considered parsing b.location and look for the address from the symbol name, but I still have troubles with that part of the API and we would have to handle ourselves the differences between declarations such as b <fun>, b *<fun> and b *'<fun>'+<offset>.

GDB 13.1 did introduce Breakpoint.locations that can return the exact address, but I don't think it's an option since it relies on a recent version of gdb.

@Grazfather
Copy link
Collaborator

so the current code relies on the user using an hexadecimal address and doesn't handle breakpoints set on a function.

Yeah that is not good. We could try to resolve the bp.location, but maybe check for Breakpoint.locations and use it if we can. If we can get the bp number, then using info breakpoints might be easy, but we probably don't want to run commands and parse its output too often, because it can slow down single-stepping (since it runs on every context print).

@Angelo942
Copy link
Contributor Author

For info breakpoints why do you want the bp number first ? I was thinking about a simple regex such as:

        breakpoints = gdb.execute("info breakpoints", to_string=True)
        bp_locations = re.findall(r'0x[0-9a-f]+', breakpoints)

For parsing bp.location I can start working on it, but I haven't figured out yet how to find the address of a symbol without relying on commands and it will never be perfect, because we have no way of parsing something like b *$rip.
So far I have sketched something like this:

    if GDB_VERSION >= (13, 1):
        bp_locations = [location.address for b in breakpoints for location in b.locations if location is not None]
    else:
        bp_locations = [self.parse_location(b.location) for b in breakpoints if b.location is not None] 

    @lru_cache()
    def parse_location(self, location: str) -> int:
        exact = False
        if location.startswith("*"):
            location = location[1:].strip() # Remove leading spaces
            if location.startswith("0x"):
                return int(location, 16)
            elif location.isdigit():
                return int(location)
            exact = True
        location = location.split("+") # Can a symbol have a 
        if len(location) == 2:
            symbol, offset = map(lambda x: x.strip(), location)
        else:
            symbol, offset = location[0], 0
        base_address = self.parse_symbol(symbol, exact=exact)
        
    def parse_symbol(self, symbol: str, exact=True) -> int:
        #TODO

@Grazfather
Copy link
Collaborator

For info breakpoints why do you want the bp number first ? I

So that you can info breakpoint NUMBER so we know which line to parse. I was thinking if we wanted to look for a specific one.

For registers, that is tough, since the register value basically becomes irrelevant immediately after it's set.

I think that we should use hasattr instead of checking the GDB_VERSION, since it better guards against some distros maybe having patches that add the feature early or don't have it in 13.1.

What is your part with b.locations? Why would a breakpoint have more than one location, and from what I see, that attribute is not set.

@Angelo942
Copy link
Contributor Author

I think that we should use hasattr instead of checking the GDB_VERSION, since it better guards against some distros maybe having patches that add the feature early or don't have it in 13.1.

I didn't think about these cases. Okay 👍

What is your part with b.locations? Why would a breakpoint have more than one location, and from what I see, that attribute is not set.

I'm still on version 12.1 so I was looking for a docker to test it and for now I only sketched it based on the documentation. "Variable: Breakpoint.locations. Get the most current list of breakpoint locations that are inserted for this breakpoint". You may end up with a breakpoint in multiple locations if you set it on a line of the source code.

Regarding what to do now, I looked at the performances of info breakpoints and I saw how slow it is, making context take twice as long and getting slower with each additional breakpoint. I don't feel like I can handle the parsing though. Do I still test the code for b.locations and try to push a patch for those who have the new versions of gdb or do we wait to have something that works for everyone ?

@Grazfather
Copy link
Collaborator

Thank you for testing the speed. I say we add just the hasattr, and basically behave better if we can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants