feat: allow interrupts on proof generation #324

Closed
wants to merge 2 commits into from

acud (Contributor) commented Aug 7, 2024

This change adds support for interrupting proof generation via a new RPC call. The call sets the AtomicBool on the service to true, which cancels proof generation the same way a shutdown does.

It might be useful to also join the thread handle, the same way shutdown does. wdyt @poszu?
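
To make the flag-plus-join idea concrete, here is a minimal sketch using only the standard library. It is not the post-service code: `Prover`, `start`, and `interrupt_and_join` are hypothetical names, and the worker loop is a placeholder for real proving.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the service; only the shared stop flag mirrors the PR.
pub struct Prover {
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<u64>>,
}

impl Prover {
    /// Spawns a worker that does placeholder "proving" until the flag is set.
    pub fn start() -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop);
        let worker = thread::spawn(move || {
            let mut steps = 0u64;
            while !flag.load(Ordering::Relaxed) {
                steps += 1; // placeholder for one unit of proving work
                thread::sleep(Duration::from_millis(1));
            }
            steps
        });
        Prover { stop, worker: Some(worker) }
    }

    /// Sets the flag, then joins the worker so the caller knows it has really stopped.
    /// Returns the number of completed steps, or None if the worker was already joined.
    pub fn interrupt_and_join(&mut self) -> Option<u64> {
        self.stop.store(true, Ordering::Relaxed);
        self.worker.take().map(|h| h.join().expect("worker panicked"))
    }
}
```

Without the join, the RPC could return while the worker is still finishing its current step. Joining makes "interrupted" mean "no longer running".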

This doesn't help with cancelling proof generation via FFI, which I think can still be used by the post Go tool. I tried to add that as well but got lost in the Go->C->Rust rabbit hole: I tried to inject a callback that would return a bool from the Go code to stop the execution, the same way the AtomicBool does today.
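
For reference, the Rust half of such a callback could look roughly like the sketch below. All names (`ShouldStop`, `prove_with_callback`, `stop_requested`, `STOP`) are made up for illustration, the loop is placeholder work, and the cgo wiring on the Go side is not shown. A real export would also need a `no_mangle` attribute and a C header.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical callback type supplied by the caller; returns true to request a stop.
pub type ShouldStop = extern "C" fn() -> bool;

/// Hypothetical entry point: runs up to `max_steps` units of placeholder work,
/// polling `should_stop` before each one. Returns the number of completed steps.
pub extern "C" fn prove_with_callback(max_steps: u64, should_stop: ShouldStop) -> u64 {
    let mut done = 0;
    while done < max_steps {
        if should_stop() {
            break; // the caller asked us to stop
        }
        done += 1; // placeholder for one unit of proving work
    }
    done
}

/// Stand-in for the caller's side: a flag that another thread could set.
pub static STOP: AtomicBool = AtomicBool::new(false);

/// Callback that simply reports the current value of `STOP`.
pub extern "C" fn stop_requested() -> bool {
    STOP.load(Ordering::Relaxed)
}
```

The callback is polled between units of work, so the interrupt latency is bounded by how long a single unit takes.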

Closes #81
Depends on https://github.com/spacemeshos/api/pull/366/files

poszu (Collaborator) left a comment

#81 was created before we moved proving out of go-spacemesh into the post-service, which connects to the node via gRPC. Back then proving was called over FFI, and it was a problem that it couldn't be interrupted. That's not an issue anymore.

What's the benefit of having this gRPC endpoint? The same thing can be achieved by shutting down the post-service (SIGINT).

I think the more interesting part is preservation of the progress:

It should be possible to interrupt proof generation and continue later from the point it stopped.

That would be useful for users who need to stop proving for some reason (e.g. hardware issues or a power outage), so they could continue proving instead of starting over.
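
One way to picture "continue from the point it stopped" is a prover that reports a checkpoint when interrupted and accepts one when started. This is only a sketch under the assumption that the work can be split into independent, ordered units. `Checkpoint`, `Outcome`, and `prove_from` are hypothetical and say nothing about how post-rs actually structures proving.

```rust
/// Hypothetical checkpoint: the index of the next unit of work to process.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Checkpoint {
    pub next_index: u64,
}

/// Result of one proving run.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Finished,
    Interrupted(Checkpoint),
}

/// Processes units `from.next_index..total`, asking `should_stop` before each one.
/// On interruption it returns the checkpoint from which a later call can resume.
pub fn prove_from(
    from: Checkpoint,
    total: u64,
    mut should_stop: impl FnMut(u64) -> bool,
) -> Outcome {
    let mut i = from.next_index;
    while i < total {
        if should_stop(i) {
            return Outcome::Interrupted(Checkpoint { next_index: i });
        }
        // placeholder for the real work on unit `i`
        i += 1;
    }
    Outcome::Finished
}
```

A first call interrupted at unit 4 returns `Interrupted(Checkpoint { next_index: 4 })`, and passing that checkpoint back in resumes at unit 4 rather than 0.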

Regarding interrupting the FFI proving call, there are two options. You could pass in a function that Rust calls periodically to check whether it should stop; the Go side could then set an atomic from a different thread to interrupt proving. Alternatively, Rust could spawn a thread in which proving runs and return an opaque handle, which could then be passed into a C/Rust function to interrupt it. Something like:

handle := C.generate_proof(....) // actual proving runs async on a thread
ticker := time.NewTicker(time.Minute)
defer ticker.Stop()
for {
  select { // select, not switch: we are waiting on channels
  case <-ctx.Done():
    C.interrupt_proving(handle)
    return // stop polling once the interrupt has been requested
  case <-ticker.C:
    result := C.proof(handle)
    // ... handle _in progress_ vs _finished_
  }
}
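
The Rust side of the handle variant might be sketched as follows, in safe Rust for brevity. The function names follow the Go snippet above, but their signatures, `ProvingHandle`, and `Status` are assumptions. An actual FFI export would hand the handle out as a raw pointer and need matching C declarations.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    InProgress,
    Finished(u64), // placeholder "proof" value
    Interrupted,
}

/// Hypothetical opaque handle owning the proving thread and its stop flag.
pub struct ProvingHandle {
    stop: Arc<AtomicBool>,
    worker: Option<JoinHandle<Option<u64>>>,
    last: Status,
}

/// Starts placeholder proving of `steps` units on a background thread.
pub fn generate_proof(steps: u64) -> ProvingHandle {
    let stop = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&stop);
    let worker = thread::spawn(move || {
        for _ in 0..steps {
            if flag.load(Ordering::Relaxed) {
                return None; // interrupted before finishing
            }
            thread::sleep(Duration::from_millis(1)); // placeholder work
        }
        Some(steps)
    });
    ProvingHandle { stop, worker: Some(worker), last: Status::InProgress }
}

/// Requests an interrupt; the thread stops at its next check of the flag.
pub fn interrupt_proving(handle: &ProvingHandle) {
    handle.stop.store(true, Ordering::Relaxed);
}

/// Non-blocking poll: reports progress and caches the final status once known.
pub fn proof(handle: &mut ProvingHandle) -> Status {
    if let Some(worker) = handle.worker.take() {
        if !worker.is_finished() {
            handle.worker = Some(worker);
            return Status::InProgress;
        }
        handle.last = match worker.join().expect("proving thread panicked") {
            Some(p) => Status::Finished(p),
            None => Status::Interrupted,
        };
    }
    handle.last
}
```

The polling function never blocks, which matches the ticker loop on the Go side: each tick either sees `InProgress` or picks up the final result.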

@@ -213,6 +216,11 @@ impl crate::client::PostService for PostService {
    fn get_metadata(&self) -> &PostMetadata {
        &self.metadata
    }

    fn interrupt_proof(&self) -> eyre::Result<()> {
        self.stop.clone().store(true, Ordering::Relaxed);

The Arc doesn't need to be cloned here: AtomicBool::store takes &self, so it can be called through the existing Arc.

Suggested change
- self.stop.clone().store(true, Ordering::Relaxed);
+ self.stop.store(true, Ordering::Relaxed);
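
A small standalone check of why the clone is redundant, assuming `stop` is an `Arc<AtomicBool>` as the comment implies. `Service`, `interrupt`, and `demo` are illustrative names, not the PR's types.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Illustrative stand-in for a service holding a shared stop flag.
pub struct Service {
    stop: Arc<AtomicBool>,
}

impl Service {
    pub fn interrupt(&self) {
        // Arc<AtomicBool> derefs to AtomicBool, and `store` only needs `&self`,
        // so no new Arc (and no reference-count bump) is required.
        self.stop.store(true, Ordering::Relaxed);
    }
}

/// Returns the flag value seen through the other Arc, plus the strong count.
pub fn demo() -> (bool, usize) {
    let stop = Arc::new(AtomicBool::new(false));
    let service = Service { stop: Arc::clone(&stop) };
    service.interrupt();
    (stop.load(Ordering::Relaxed), Arc::strong_count(&stop))
}
```

`demo()` returns `(true, 2)`: the store is visible through the other `Arc`, and the count stays at the two owners that already existed.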

acud (Contributor, Author) commented Aug 8, 2024

Thanks for reviewing, @poszu. How would you envision persisting the "paused" state? On the post-rs side, e.g. in some folder that corresponds to the challenge? Or by returning it to the caller somehow?
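
To make the first option concrete, a per-challenge checkpoint file could be sketched like this. The layout is entirely hypothetical: the `.checkpoint` file name, the hex encoding of the challenge, and storing a single `u64` progress index are assumptions for illustration, not anything post-rs does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Builds `<dir>/<hex(challenge)>.checkpoint` (hypothetical naming scheme).
fn checkpoint_path(dir: &Path, challenge: &[u8]) -> PathBuf {
    let name: String = challenge.iter().map(|b| format!("{b:02x}")).collect();
    dir.join(format!("{name}.checkpoint"))
}

/// Persists the index of the next unit of work for this challenge.
pub fn save_checkpoint(dir: &Path, challenge: &[u8], next_index: u64) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(checkpoint_path(dir, challenge), next_index.to_le_bytes())
}

/// Loads the saved index, or None if no checkpoint exists for this challenge.
pub fn load_checkpoint(dir: &Path, challenge: &[u8]) -> io::Result<Option<u64>> {
    match fs::read(checkpoint_path(dir, challenge)) {
        Ok(bytes) => {
            if bytes.len() != 8 {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "unexpected checkpoint length",
                ));
            }
            let mut raw = [0u8; 8];
            raw.copy_from_slice(&bytes);
            Ok(Some(u64::from_le_bytes(raw)))
        }
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

Keying the file by challenge means a checkpoint left over from an old challenge is simply never found, so it cannot be resumed by mistake.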

acud closed this Oct 15, 2024
acud deleted the interrupt-prf branch October 15, 2024 17:29
Successfully merging this pull request may close these issues.

Interruptable proof generation