Change dist ckpt defaults #10913

Merged: 9 commits, Oct 24, 2024

Commits on Oct 24, 2024

  1. Enable ckpt features by default, ckpt every 15mins and reduce preemption time to 1min

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    08b9dc6
  2. fix ssm tests

    Signed-off-by: Shriya Palsamudram <[email protected]>
    JRD971000 authored and ShriyaPalsamudram committed Oct 24, 2024
    c30b36e
  3. Make note that ckpt_async_save is disabled for SSMs

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    00218b9
  4. Enable async ckpt for SSMs with fix

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    3b26274
  5. d4e5310
  6. Fix failing unit tests

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    9a2b837
  7. Ashors/peft async ckpt (#11010)

    * [WIP] prototype for supporting async checkpointing with peft
    
    Signed-off-by: ashors1 <[email protected]>
    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    54a082b
  8. Enable async ckpt for the peft test

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    8e52279
  9. Fix peft setup test

    Signed-off-by: Shriya Palsamudram <[email protected]>
    ShriyaPalsamudram committed Oct 24, 2024
    13a41a0
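
For readers unfamiliar with time-based checkpointing, the "ckpt every 15mins" default named in the first commit can be pictured with a small sketch. This is not code from the PR: `TimeIntervalTrigger`, its `clock` argument, and `should_save` are hypothetical names used only for illustration, and the actual change may wire the interval through the framework's own checkpoint callback instead.

```python
import time
from datetime import timedelta


class TimeIntervalTrigger:
    """Hypothetical helper (not from the PR): reports when a checkpoint is due."""

    def __init__(self, interval: timedelta, clock=time.monotonic):
        # Store the interval in seconds; the clock is injectable for testing.
        self.interval_s = interval.total_seconds()
        self.clock = clock
        self.last_save = clock()

    def should_save(self) -> bool:
        # Return True once per elapsed interval, then restart the timer.
        now = self.clock()
        if now - self.last_save >= self.interval_s:
            self.last_save = now
            return True
        return False


# Assumed usage: call should_save() at the end of each training step.
trigger = TimeIntervalTrigger(timedelta(minutes=15))
```

Keying the save on elapsed wall-clock time rather than step count keeps the amount of work lost on preemption roughly bounded by the interval, regardless of how long each step takes.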