Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow offchain worker requests to all TSS nodes in entropy-tss test environment #1147

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

ameba23
Copy link
Contributor

@ameba23 ameba23 commented Nov 5, 2024

This PR sets the TSS node endpoints associated with all chain nodes in our test setup for entropy-tss.

It means that http requests coming from the propagation pallet will be made to all four TSS nodes. Previously, only Alice got a request from the chain and the other 3 TSS server's had similar requests made by an http client in the test code itself.

This had the advantage that we could test the handling of bad inputs given to these endpoints. But a lot of care had to be taken to make the mock requests appear at the right time, and we have had problems with the tests occasionally failing, eg: #1119

I am hoping that we can make the tests more reliable and more closely emulate what will happen in production by doing this.

This PR also makes a change to the genesis config making it possible to start in a pre-jumpstarted state to test signing. For an explainantion see the (merged) sub-PR which introduces that change: #1162

A couple of related changes:

  • There is now only one set of pre-generated keyshares, as we know beforehand who the initial signer set will be as it is part of the chainspec.
  • I've used a different helper function when setting up a lot of our tests: testing_utils::helpers::spawn_tss_nodes_and_start_chain. This has the advantage that it sets up both the chain and TSS nodes in one function, making a bit less boilerplate code in the tests. But it has a gotcha, which is that because it lives in testing-utils and not entropy-tss, cfg(test) is false meaning the unsafe api does not get loaded. So it can only be used in the tests which don't need the unsafe API. I could move it into entropy-tss, but then we would not be able to use it in integration tests.
  • We had a test which did a DKG, then signs a message, then a reshare, then signs another message. The problem with this is that if doing a real DKG, the signer set is not known at genesis. Before, we had chosen the next_signers for the reshare to match what signers get chosen, but there is no guarantee that this will continue to work if the initial chain state changes. So now we only test resharing following a mock DKG where we know the initial signers beforehand.

@ameba23 ameba23 marked this pull request as draft November 5, 2024 11:46
@ameba23
Copy link
Contributor Author

ameba23 commented Nov 7, 2024

I am a bit stuck with this.

DKG is working fine but the reshare protocol has problems.

To be specific its the aux gen protocol when run from within the 'reshare' protocol:

The DKG protocol is composed of the key init, reshare and aux gen protocol - so we know all of these will work with this setup.

'Reshare' is composed of reshare and aux gen. It seems from looking at logs that the reshare runs, but only two of the three parties finalize successfully, then during aux gen a connection gets dropped.

I tried switching things around so the aux gen protocol gets run before the reshare, and it seems that aux gen is the problem as a connection gets dropped and it seems only one peer runs the protocol loop.

There are also issues with mocking the DKG using pre-generated keyshares which we do in the signing tests, as now when the jump start extrinsic is submitted, it starts a real DKG, and things get messy when the mock DKG confirmations are also submitted. But i think this should be not to hard to fix.

@ameba23
Copy link
Contributor Author

ameba23 commented Nov 11, 2024

I had a hunch that the problems i am seeing are related to problems introduced by my recent PR: #1136

I have now tried rolling back to before that commit and re-applying these changes and running the tests, but i am seeing exactly the same behavior as on this branch. So thats not the problem.

Here is the relevant part of the test logging output:

  2024-11-11T09:45:06.271344Z  INFO entropy_tss::signing_client::protocol_transport: Got ws connection, with message: SubscribeMessage { session_id: Reshare { verifying_key: [2, 24, 213, 93, 160, 174, 33, 208, 3, 29, 227, 227, 175, 53, 23, 229, 3, 220, 55, 68, 234, 118, 51, 129, 176, 175, 185, 68, 247, 93, 250, 222, 233], block_number: 21 }, public_key: 2cbc68e8bf0fbc1c28c282d1263fc9d29267dc12a1044fb730e8b65abc37524c (5D5Mw6Wb...), signature: 8259ea5bbf1308232f884f7eaf85d494c664e318ef21f1ee26b5575c0fa542789c2ca902e78fb74447016c549ff35b4cc67353360041b295da14ab0e00348781 }
    at crates/threshold-signature-server/src/signing_client/protocol_transport.rs:159

  2024-11-11T09:45:06.271898Z  INFO entropy_tss::signing_client::protocol_transport: Got ws connection, with message: SubscribeMessage { session_id: Reshare { verifying_key: [2, 24, 213, 93, 160, 174, 33, 208, 3, 29, 227, 227, 175, 53, 23, 229, 3, 220, 55, 68, 234, 118, 51, 129, 176, 175, 185, 68, 247, 93, 250, 222, 233], block_number: 21 }, public_key: 946140d3d5ddb980c74ffa1bb64353b5523d2d77cdf3dc617fd63de9d3b66338 (5FRFqLyd...), signature: 9aa6c7204d2135cadb94878b4117617a9336c4becd2ceb04903d570a7ad6fa38d223dec1ca25f4279a49191296a8cfeb386117be6b33e912e55fc1e22aaba288 }
    at crates/threshold-signature-server/src/signing_client/protocol_transport.rs:159

  2024-11-11T09:45:06.272088Z  INFO entropy_tss::signing_client::protocol_transport: Got ws connection, with message: SubscribeMessage { session_id: Reshare { verifying_key: [2, 24, 213, 93, 160, 174, 33, 208, 3, 29, 227, 227, 175, 53, 23, 229, 3, 220, 55, 68, 234, 118, 51, 129, 176, 175, 185, 68, 247, 93, 250, 222, 233], block_number: 21 }, public_key: 946140d3d5ddb980c74ffa1bb64353b5523d2d77cdf3dc617fd63de9d3b66338 (5FRFqLyd...), signature: b4a22cbf3746335a7a757824c63cbeb01a31a3fa428d4a860c47ae790257a118bdabe88fe31e4ad8a854d8d49b6770c7fd192a7a88813753dc116a16bd726b82 }
    at crates/threshold-signature-server/src/signing_client/protocol_transport.rs:159

  2024-11-11T09:45:06.272212Z  INFO entropy_protocol::execute_protocol: Executing reshare
    at crates/protocol/src/execute_protocol.rs:343
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: d4731f8e-5b02-4f75-8218-6a612164990a, uri: /validator/reshare, method: POST

  2024-11-11T09:45:06.272554Z  INFO entropy_protocol::execute_protocol: Executing reshare
    at crates/protocol/src/execute_protocol.rs:343
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: 829e963e-68a0-4c46-b628-a42aa0c1bba4, uri: /validator/reshare, method: POST

  2024-11-11T09:45:06.272964Z  INFO entropy_protocol::execute_protocol: Executing reshare
    at crates/protocol/src/execute_protocol.rs:343
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: badb679b-ba0f-4287-bc39-f65952c250f2, uri: /validator/reshare, method: POST

  2024-11-11T09:45:06.274708Z  INFO entropy_protocol::execute_protocol: Finished reshare
    at crates/protocol/src/execute_protocol.rs:361
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: 829e963e-68a0-4c46-b628-a42aa0c1bba4, uri: /validator/reshare, method: POST

  2024-11-11T09:45:06.274719Z  INFO entropy_protocol::execute_protocol: Starting aux gen
    at crates/protocol/src/execute_protocol.rs:365
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: 829e963e-68a0-4c46-b628-a42aa0c1bba4, uri: /validator/reshare, method: POST

  2024-11-11T09:45:10.401568Z  INFO entropy_protocol::execute_protocol: Finished reshare
    at crates/protocol/src/execute_protocol.rs:361
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: d4731f8e-5b02-4f75-8218-6a612164990a, uri: /validator/reshare, method: POST

  2024-11-11T09:45:10.401590Z  INFO entropy_protocol::execute_protocol: Starting aux gen
    at crates/protocol/src/execute_protocol.rs:365
    in entropy_tss::validator::api::new_reshare
    in entropy_tss::http-request with uuid: d4731f8e-5b02-4f75-8218-6a612164990a, uri: /validator/reshare, method: POST

  2024-11-11T09:45:17.100457Z  WARN entropy_tss::signing_client::api: Websocket connection closed unexpectedly MessageAfterProtocolFinish
    at crates/threshold-signature-server/src/signing_client/api.rs:140

@ameba23
Copy link
Contributor Author

ameba23 commented Nov 21, 2024

Had a look at this problem with @HCastano thismorning. He made a suggestion to try running the DKG with an additional reshare immediately after the DKG finishes. To see if it is something to do with the test conditions for the reshare test rather than the reshare protocol itself.

I tried doing this, but got into a muddle with new holders and old holders, and was having synedrion's no entry found for key error.

But if i run an extra aux gen protocol at the end of the DKG protocol, it does work.

@ameba23 ameba23 self-assigned this Nov 26, 2024
ameba23 and others added 5 commits November 26, 2024 13:01
* Add pre-jumpstart option to chainspec and helper fn, update tests

* Dont include JumpStartDetails in staking pallet config, only ids of the jumpstarted signer set

* Update mock config for propagation and registry pallets

* Comments, tidy

* Add new field to attestation pallet mock

* Update client tests

* Fix issue with test helper fn

* Add pre-generated keyshares directly to the kvdb, rather than using the unsafe API

* Update integratiion test chain spec

* Fixes for pre-jumpstarted state in tests - mocking jump start no longer works

* Update reshare test

* Update client tests

* Typo

* Tidy test

* Tidy test

* Fix for deadline issue

* Rm unused variable

* Fix local storage for rotate network key endpoint

* Update test

* Clippy

* Clippy

* Fix remaining tests

* Fix tests, tidy

* Update generated keyshares

* Update script to generate keyshares

* Update script to generate keyshares readme

* Tidy

* Undo testnet reshare fix
@ameba23 ameba23 marked this pull request as ready for review November 26, 2024 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

1 participant