
test: fix "it_removes_lassie_temp_on_start" #631

Merged 1 commit into main on Dec 6, 2024
Conversation

@bajtos (Member) commented on Dec 6, 2024
A bunch of Dependabot PRs are blocked from landing because of this test failing.

In this PR, I am fixing the test by changing the CID we download, to avoid the Lassie bug described in filecoin-project/lassie#492.

I am also adding more diagnostics to simplify troubleshooting in the future.

Here is how to run this single test and get the additional info:

```
cargo test --test daemon_tests -- --nocapture
```

Signed-off-by: Miroslav Bajtoš <[email protected]>
@bajtos requested a review from @juliangruber on December 6, 2024, 15:43
@bajtos (Member, Author) commented on Dec 6, 2024:
```diff
@@ -1,6 +1,6 @@
 // Signal that we are going to start the retrieval
 Zinnia.activity.info("fetch:start");
-const response = await fetch("ipfs://QmdmQXB2mzChmMeKY47C43LxUdg1NDJ5MWcKMKxDu7RgQm");
+const response = await fetch("ipfs://bafybeiazvkej6ou3w6xmva5ed6suonxjv3jkhq4ke73q5hgmcjmf76uos4");
```
juliangruber (Member) commented:
Why does changing the CID fix the tests?

bajtos (Member, Author) replied:

It's a Lassie bug. I thought I linked to it somewhere in the changes, but it looks like I did not.

filecoin-project/lassie#492

juliangruber (Member) replied:
Thanks! I subscribed to this issue as well. Could this be impacting Spark production data?

juliangruber (Member) commented:

I'm getting a slightly different result with Spark's manual check:

```
[...]
Querying IPNI to find retrieval providers for QmdmQXB2mzChmMeKY47C43LxUdg1NDJ5MWcKMKxDu7RgQm
IPNI returned 56 provider results
Failed to fetch QmdmQXB2mzChmMeKY47C43LxUdg1NDJ5MWcKMKxDu7RgQm from /dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZEv4RJSxZ3rDF2c7 using http
Error: Cannot parse "/dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZEv4RJSxZ3rDF2c7": unsupported protocol "https"
    at multiaddrToHttpUrl (file:///Users/julian/dev/filecoin-station/spark/lib/multiaddr.js:10:7)
    at getRetrievalUrl (file:///Users/julian/dev/filecoin-station/spark/lib/spark.js:272:21)
    at Spark.fetchCAR (file:///Users/julian/dev/filecoin-station/spark/lib/spark.js:106:19)
    at Spark.executeRetrievalCheck (file:///Users/julian/dev/filecoin-station/spark/lib/spark.js:80:16)
    at eventLoopTick (ext:core/01_core.js:178:11)
    at async file:///Users/julian/dev/filecoin-station/spark/manual-check.js:23:1 {
  code: "UNSUPPORTED_MULTIADDR_PROTO"
}
Measurement: {
  cid: "QmdmQXB2mzChmMeKY47C43LxUdg1NDJ5MWcKMKxDu7RgQm",
  minerId: "f0frisbii",
  indexerResult: "OK",
  statusCode: 702,
  byteLength: 0,
  providerId: "12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZEv4RJSxZ3rDF2c7",
  protocol: "http",
  providerAddress: "/dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZE"... 14 more characters,
  startAt: 2024-12-07T07:22:18.898Z,
  endAt: 2024-12-07T07:22:18.899Z
}

The retrieval failed.
The provider address %j cannot be converted to a URL: /dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZEv4RJSxZ3rDF2c7 Cannot parse "/dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F12D3KooWNXvbyvLUUd1qQEqhzjTpVoT5fdYUZEv4RJSxZ3rDF2c7": unsupported protocol "https"
```
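Editor's note: a minimal sketch of why the `/dns/<host>/https/http-path/<path>` form gets rejected. It assumes a strict multiaddr-to-URL converter in the spirit of the `multiaddrToHttpUrl` from the stack trace above (which only understands the `/dns*/<host>/tcp/<port>/http[s]` shape); this is an illustrative reconstruction, not Spark's actual lib/multiaddr.js.

```ts
// Hypothetical strict converter: accepts only "/dns*/<host>/tcp/<port>/http[s]".
// Anything else, including the http-path form seen above, is rejected with
// UNSUPPORTED_MULTIADDR_PROTO, matching the error in the log.
class UnsupportedMultiaddrError extends Error {
  code = "UNSUPPORTED_MULTIADDR_PROTO";
  constructor(addr: string, proto: string) {
    super(`Cannot parse "${addr}": unsupported protocol "${proto}"`);
  }
}

function multiaddrToHttpUrl(addr: string): string {
  const [, hostType, host, transport, port, scheme, ...rest] = addr.split("/");
  if (!["dns", "dns4", "dns6", "ip4", "ip6"].includes(hostType)) {
    throw new UnsupportedMultiaddrError(addr, hostType);
  }
  if (transport !== "tcp") {
    // "/dns/http.f02620.devtty.eu/https/http-path/..." fails here:
    // "https" appears where "tcp" is expected.
    throw new UnsupportedMultiaddrError(addr, transport);
  }
  if (!["http", "https"].includes(scheme) || rest.length > 0) {
    throw new UnsupportedMultiaddrError(addr, scheme ?? "");
  }
  return `${scheme}://${host}:${port}`;
}

// multiaddrToHttpUrl("/dns/http.f02620.devtty.eu/https/http-path/%2Fipni-provider%2F...")
// -> throws UNSUPPORTED_MULTIADDR_PROTO: unsupported protocol "https"
// multiaddrToHttpUrl("/dns4/example.com/tcp/443/https")
// -> "https://example.com:443"
```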

juliangruber (Member) commented:

Therefore I believe the Lassie issue doesn't affect Spark data, as the check fails before talking to Lassie.

bajtos (Member, Author) replied:
I need to take a closer look.

bajtos (Member, Author) commented:

I ran the manual check while substituting the Frisbii PeerID with various Provider.ID values returned by https://cid.contact/cid/QmdmQXB2mzChmMeKY47C43LxUdg1NDJ5MWcKMKxDu7RgQm.

I tested both Graphsync and HTTP retrievals.

I verified that both cases called Lassie in a way that does not trigger the issue with parsing http-path multiaddrs.

To explain: we tell Lassie the protocol and provider address to use for the retrieval, so Lassie skips the provider lookup (where the http-path parsing issue happens) and goes directly to the retrieval step.

Of course, if an SP advertises retrievals to IPNI via an http-path multiaddr, then Spark checks will fail, but only for that SP. The measurements will indicate status code 702 (error code UNSUPPORTED_MULTIADDR_PROTO), which we will see in our data.
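Editor's note: a minimal sketch of the point above, pinning the provider and protocol when requesting a CAR from a local Lassie daemon so that Lassie does not perform its own IPNI lookup. The port number and function name are assumptions for this example, not Spark's or Zinnia's actual code; as I understand it, Lassie's HTTP interface accepts `providers` and `protocols` query parameters for exactly this kind of hint.

```ts
// Fetch a CID through a local Lassie daemon with explicit provider hints.
// Because the provider and protocol are supplied up front, Lassie skips the
// provider-lookup step where the http-path multiaddr parsing bug lives.
async function fetchCarWithProviderHint(
  cid: string,
  providerAddress: string, // multiaddr, typically including /p2p/<peer-id>
  protocol: "http" | "graphsync" | "bitswap",
): Promise<Response> {
  // Hypothetical local Lassie endpoint; the port is made up for this sketch.
  const url = new URL(`http://127.0.0.1:62156/ipfs/${cid}`);
  url.searchParams.set("providers", providerAddress);
  url.searchParams.set("protocols", protocol);
  return fetch(url, { headers: { Accept: "application/vnd.ipld.car" } });
}
```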

@bajtos merged commit 3a82951 into main on Dec 6, 2024. 13 checks passed.
@bajtos deleted the fix-test-timing-out branch on December 6, 2024, 16:52.