Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: prevent duplicate trustless-gateway reqs #503

Merged
merged 6 commits into from
Apr 20, 2024

Conversation

SgtPooki
Copy link
Member

@SgtPooki SgtPooki commented Apr 18, 2024

  • chore: add @types/polka
  • feat: getStats() returns trustless gateway stats

Title

fix: prevent duplicate trustless-gateway reqs

Description

Fixes ipfs/service-worker-gateway#104

This PR fixes issues brought up in service-worker-gateway where sub-resources end up causing multiple requests to a trustless gateway for the root CID.

Notes & open questions

I feel like the blockstore should handle this and ensure requests that depend on a root CID are deduped, but since trustless-gateways may be used outside of the context of blockstores, I believe we still need this improvement.

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

@SgtPooki SgtPooki requested a review from a team as a code owner April 18, 2024 23:14
Copy link
Member Author

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self review of the code so far that should fail CI test because fix isn't pushed yet.

@SgtPooki
Copy link
Member Author

SgtPooki commented Apr 18, 2024

note the previous failure in CI: https://github.com/ipfs/helia/actions/runs/8745897909/job/24001821684?pr=503#step:5:793

And it should be fixed by 97a05e4 (#503)

Copy link
Member Author

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self review of code entirety

Copy link
Member

@2color 2color left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

@SgtPooki
Copy link
Member Author

One thing I thought of after stepping away last night: we should probably use finally to remove entry from map

if (!res.ok) {
this.#errors++
throw new Error(`unable to fetch raw block for CID ${cid} from gateway ${this.url}`)
let pendingResponse: Promise<Uint8Array> | undefined = this.#pendingResponses.get(gwUrl.toString())
Copy link
Member

@achingbrain achingbrain Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should dedupe requests based on a canonical stringification of the multihash of the CID (base64 or something) rather than then requested URL, because you can request two CIDs with different versions/codecs but they'd resolve to the same block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put a codepen together to try and play with how this is done, and had some helpful tips from Adin to get me going

https://codepen.io/SgtPooki/pen/XWQygKB?editors=1111

it would be interesting to see where this falls down.

Also, is it just me that thinks it would be useful to export a blessed "are these CIDs referencing the same bytes" toString() function from multiformats?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this PR open to make sure you take a peek at this before I merge

Copy link
Member

@achingbrain achingbrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable, comments inline but please ensure things are removed from the pending responses map as it'll cause a memory leak as implemented.

Copy link
Member Author

@SgtPooki SgtPooki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self review after addressing all comments

@@ -190,4 +190,33 @@ describe('trustless-gateway-block-broker', () => {

await expect(sessionBlockstore?.retrieve?.(blocks[0].cid)).to.eventually.deep.equal(blocks[0].block)
})

it('does not trigger new network requests if the same cid request is in-flight', async function () {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test was renamed in the last iteration from "[...] same block [...]" to "[...] same cid [...]" because we're not actually testing the uniqueBlock functionality here where we would prevent requests given separate CIDs if they're referencing the same bytes.

@SgtPooki SgtPooki requested a review from achingbrain April 20, 2024 00:13
@SgtPooki
Copy link
Member Author

CI failed but not due to changes here AFAICT: https://github.com/ipfs/helia/actions/runs/8760997666/job/24046923998?pr=503#step:5:240

@achingbrain achingbrain merged commit 338885f into main Apr 20, 2024
18 checks passed
@achingbrain achingbrain deleted the fix/duplicate-trustless-gateway-requests branch April 20, 2024 16:07
@achingbrain achingbrain mentioned this pull request Apr 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate block requests
3 participants