Persistent Checkpointing PR (#2184) #3406

Open: wants to merge 20 commits into base: master

Commits on Jan 10, 2024

  1. Functionality that will be shared, moved from TraceStream.cc

    - Moved into util.cc
    - Added forward_to to skip trace data to some arbitrary point in time
    theIDinside committed Jan 10, 2024
    743c719
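
The idea behind `forward_to` can be sketched as follows. This is only an illustration under stated assumptions: `Frame` and `FrameReader` are hypothetical stand-ins over an in-memory list, not rr's trace reader types, and the real function may differ in signature and behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a trace frame; rr's real frame type is richer.
struct Frame {
  uint64_t time;  // global event time, strictly increasing within a trace
};

// Hypothetical reader over an in-memory list of frames.
class FrameReader {
public:
  explicit FrameReader(std::vector<Frame> frames) : frames_(std::move(frames)) {
    // The skip loop below relies on frames being ordered by time.
    for (std::size_t i = 1; i < frames_.size(); ++i) {
      assert(frames_[i - 1].time < frames_[i].time);
    }
  }

  bool at_end() const { return pos_ == frames_.size(); }

  const Frame& peek() const {
    assert(!at_end());
    return frames_[pos_];
  }

  // Skip frames until the next frame to be read has time >= target.
  // Returns the number of frames skipped.
  std::size_t forward_to(uint64_t target) {
    std::size_t skipped = 0;
    while (!at_end() && frames_[pos_].time < target) {
      ++pos_;
      ++skipped;
    }
    return skipped;
  }

private:
  std::vector<Frame> frames_;
  std::size_t pos_ = 0;
};
```

If the target lies past the last frame, the reader simply ends up at the end of the data, so callers would need to check `at_end()` before reading.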
  2. Getters required to expose data

    We need to be able to expose this data so it can
    be serialized.
    theIDinside committed Jan 10, 2024
    dd76fad
  3. Find original exe for ReplayTask

    Digs out original executable image that this task was forked
    from, or in the case of exec, exec'd on.
    
    This is required for persistent checkpointing, so that the names in the
    proc fs correspond to the correct names at replay time (i.e. they behave
    and look the same in the proc fs as in a normal replay). What should show
    up in /proc/tid/comm is not the thread name but the actual executable, so
    we need to be able to find this "original exe" of the task.
    theIDinside committed Jan 10, 2024
    cff2fcf
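
One way to picture the lookup described above is a walk up the fork chain until a task that exec'd is found. The sketch below assumes a hypothetical `TaskRecord` table keyed by tid; it is not rr's `ReplayTask`, and the actual commit may locate the image differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-task record; not rr's ReplayTask.
struct TaskRecord {
  int forked_from;         // tid of the task this one was forked from, or -1
  std::string exec_image;  // path of the image this task exec'd, or empty
};

// Walk from `tid` up the fork chain until a task that exec'd is found.
// Returns the empty string if no task on the chain has a known image.
std::string find_original_exe(const std::map<int, TaskRecord>& tasks, int tid) {
  // A fork chain can be no longer than the number of known tasks, so this
  // bound also guards against a malformed (cyclic) table.
  for (std::size_t steps = 0; steps <= tasks.size(); ++steps) {
    std::map<int, TaskRecord>::const_iterator it = tasks.find(tid);
    if (it == tasks.end()) {
      return std::string();
    }
    if (!it->second.exec_image.empty()) {
      return it->second.exec_image;
    }
    tid = it->second.forked_from;
  }
  return std::string();
}
```

A task that exec'd reports its own image, while a task that was only forked inherits the image of its nearest ancestor that did exec.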
  4. Check if Event is checkpointable

    Required for the create checkpoints command, etc. to determine what
    events in the trace are checkpointable, when not having a live session.
    
    In future commits/PRs, remove the static function in ReplaySession.cc
    that does the same thing and use this member function on Event instead.
    theIDinside committed Jan 10, 2024
    5c40e5a
  5. Additional proc fs query paths

    Gets additional proc fs paths for a task, in this case
    /mem. Required for persistent checkpointing to figure out
    how to handle mappings and what to serialize (and what not to
    serialize).
    theIDinside committed Jan 10, 2024
    cc41c3f
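
As a rough sketch of what such a query path helper could look like, assuming "/mem" above refers to the per-task `/proc/<tid>/mem` file. The function names `proc_path` and `proc_mem_path` are hypothetical and not taken from rr.

```cpp
#include <cassert>
#include <string>
#include <sys/types.h>

// Build the path of a per-task proc fs entry, e.g. /proc/1234/mem.
// Hypothetical helper for illustration only.
std::string proc_path(pid_t tid, const std::string& entry) {
  assert(tid > 0);  // proc fs directories exist only for positive ids
  return "/proc/" + std::to_string(tid) + "/" + entry;
}

// Convenience wrapper for the entry mentioned in the commit message.
std::string proc_mem_path(pid_t tid) {
  return proc_path(tid, "mem");
}
```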
  6. Lifted CloneCompletion out of Session

    The function extract_name will also be required for setting up syscall
    buffer stuff in coming commits.
    theIDinside committed Jan 10, 2024
    9be51d2
  7. Getters/setters required for PCP

    Need to be able to set this data when restoring an address space.
    theIDinside committed Jan 10, 2024
    9e7e27e

Commits on Jan 15, 2024

  1. Persistent checkpointing

    Added the persistent checkpoint schema for capnproto, rr_pcp.capnp,
    as well as a compile command for it in CMakeLists.txt that works like
    the existing one for rr_trace.capnp.
    
    The CheckpointInfo and MarkData types work as intermediaries between a
    serialized checkpoint and a deserialized "live" one. MarkData holds a copy
    of the contents of Mark, InternalMark, ProtoMark and their various data,
    both for serialization and, when deserializing, for reconstructing
    those types.
    
    The reason for adding MarkData is to avoid intruding on the
    Mark/InternalMark/ProtoMark interfaces and possibly breaking guarantees or
    invariants they provide. If something goes wrong now, the damage is
    confined to persistent checkpointing not reconstituting a session properly.
    
    GDB spawned by RR now has two additional commands: write-checkpoints, which
    serializes any checkpoints set by the `checkpoint` command, and
    load-checkpoints.
    
    Added the rr create-checkpoints command, which creates persistent checkpoints
    at a specified interval that it attempts to honor as closely as possible.
    
    RerunCommand and ReplayCommand are now aware of PCPs.
    
    Replay sessions get spawned from persistent checkpoints if they
    exist on disk when using `-g <evt>` or when using `-f <pid>` and that
    "task" was created some time after a persistent checkpoint.
    
    Added the --ignore-pcp flag to these commands, which ignores pcps
    and spawns sessions normally.
    theIDinside committed Jan 15, 2024
    a1ffa19
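
Two pieces of the behavior described in this commit can be sketched in isolation. Both functions below are hypothetical illustrations, not code from the PR. The first assumes the interval is measured in trace events and that checkpoints can only be taken at certain events, so it picks the first checkpointable event at or after each target. The second assumes that `-g <evt>` starts from the latest persistent checkpoint at or before the requested event, and that `--ignore-pcp` simply disables the lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Pick the events at which to create periodic checkpoints. `checkpointable`
// holds the event numbers where a checkpoint is possible, in increasing
// order. The interval is honored as closely as the checkpointable events
// allow: each pick is the first usable event at or after the target.
std::vector<uint64_t> pick_checkpoint_events(
    const std::vector<uint64_t>& checkpointable, uint64_t interval) {
  assert(interval > 0);
  assert(std::is_sorted(checkpointable.begin(), checkpointable.end()));
  std::vector<uint64_t> picked;
  uint64_t next_target = interval;
  for (uint64_t ev : checkpointable) {
    if (ev >= next_target) {
      picked.push_back(ev);
      // Measure the next interval from the event actually used.
      next_target = ev + interval;
    }
  }
  return picked;
}

// Hypothetical on-disk index: event number -> checkpoint directory name.
using CheckpointIndex = std::map<uint64_t, std::string>;

// Return the latest checkpoint at or before `target_event`, or nullptr if
// there is none or persistent checkpoints are being ignored.
const std::string* checkpoint_for_goto(const CheckpointIndex& index,
                                       uint64_t target_event,
                                       bool ignore_pcp) {
  if (ignore_pcp) {
    return nullptr;
  }
  // upper_bound finds the first checkpoint strictly after the target;
  // the entry before it (if any) is the one to start from.
  CheckpointIndex::const_iterator it = index.upper_bound(target_event);
  if (it == index.begin()) {
    return nullptr;
  }
  --it;
  return &it->second;
}
```

When the lookup returns nullptr, a caller would fall back to spawning the session normally from the start of the trace.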
  2. fixup for can_checkpoint_at

    Restored the comments that existed in the static function in ReplaySession.cc.
    Changed all uses of it to Event::can_checkpoint_at.
    Removed the static can_checkpoint_at in ReplaySession.cc.
    theIDinside committed Jan 15, 2024
    e9460c7
  3. Fix preferred include & unnecessary check for partial init

    Since checkpoints are partially initialized, checking that they are is pointless.
    theIDinside committed Jan 15, 2024
    085d5c3
  4. 42f1e29
  5. 45b3abf
  6. 2c4b0b2
  7. 436e241
  8. 71805fc
  9. Moved responsibility of de/ser into FdTable and FileMonitor

    Deserializing and serializing an FdTable is now performed by the class
    itself instead of in a free function.
    
    FileMonitor has a public member function that is used for serialization.
    Each derived type that requires special/additional logic overrides
    the virtual member function serialize_type.
    theIDinside committed Jan 15, 2024
    d2f7418
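
The split between a public serialization entry point and a virtual `serialize_type` hook can be sketched like this. The classes `Monitor` and `OffsetMonitor` and the text output format are hypothetical; the PR itself serializes through capnproto, which is not shown here.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical base class standing in for FileMonitor.
class Monitor {
public:
  virtual ~Monitor() = default;

  // Public entry point: writes the data common to all monitors, then lets
  // the dynamic type append whatever extra state it needs.
  std::string serialize() const {
    std::ostringstream out;
    out << "type=" << type_name();
    serialize_type(out);
    return out.str();
  }

protected:
  virtual const char* type_name() const { return "base"; }
  // Default: no type-specific data.
  virtual void serialize_type(std::ostream&) const {}
};

// Hypothetical derived monitor that carries one extra field.
class OffsetMonitor : public Monitor {
public:
  explicit OffsetMonitor(unsigned long offset) : offset_(offset) {}

protected:
  const char* type_name() const override { return "offset"; }
  void serialize_type(std::ostream& out) const override {
    out << " offset=" << offset_;
  }

private:
  unsigned long offset_;
};
```

With this shape, the container that owns the monitors only ever calls `serialize()` and never needs to know which derived types exist.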
  10. Remove skipMonitoringMappedFd

    Not necessary for serialization, as FdTable is restored separately.
    theIDinside committed Jan 15, 2024
    6c803d4
  11. Refactor task OS-name setting

    Task::copy_state sets the OS name of a task in the same fashion that
    persistent checkpointing sets the name. Refactored this functionality into
    Task::set_name.
    
    Also removed the unnecessary `update_prname` from Task::copy_state.
    
    update_prname is not a "write to tracee" operation but a "read from tracee"
    operation. Since we already know what name we want to set Task::prname to,
    we skip reading it from the tracee in Task::copy_state and just set it to
    the parameter passed in to Task::set_name.
    theIDinside committed Jan 15, 2024
    fedda0c
  12. Add const qualifier

    theIDinside committed Jan 15, 2024
    bc524da
  13. Fixes rr-debugger#3678

    Refactor so that marks_with_checkpoints is changed in just one place instead of being accessed arbitrarily. Ref counts got the same change in a previous commit.
    
    Fixes a bug for loaded persistent checkpoints where the re-created checkpoints did not get correct reference counts.
    
    This closes rr-debugger#3678
    theIDinside committed Jan 15, 2024
    f1d5e90
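
The general pattern behind this fix, keeping every change to a reference-counted collection behind one small interface, can be sketched as below. `CheckpointRefs` is a hypothetical class keyed by an opaque mark id; it does not reproduce rr's marks_with_checkpoints or its Mark types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical owner of per-mark checkpoint reference counts. The map is
// private, so add_ref and remove_ref are the only places it can change.
class CheckpointRefs {
public:
  void add_ref(uint64_t mark) { ++counts_[mark]; }

  // Drops one reference. Returns true if that was the last one, in which
  // case the caller would release the underlying checkpoint.
  bool remove_ref(uint64_t mark) {
    std::map<uint64_t, std::size_t>::iterator it = counts_.find(mark);
    assert(it != counts_.end() && it->second > 0);
    if (--it->second == 0) {
      counts_.erase(it);
      return true;
    }
    return false;
  }

  std::size_t count(uint64_t mark) const {
    std::map<uint64_t, std::size_t>::const_iterator it = counts_.find(mark);
    return it == counts_.end() ? 0 : it->second;
  }

private:
  std::map<uint64_t, std::size_t> counts_;
};
```

Code paths that re-create checkpoints, such as loading them from disk, would then go through `add_ref` like every other caller, which is the kind of mismatch the commit message describes fixing.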