Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hoverboard can run with env.CONTENT_CLAIMS url and will use content-claims when handling bitswap requests #20

Closed
wants to merge 40 commits into from

Conversation

gobengo
Copy link
Contributor

@gobengo gobengo commented Nov 30, 2023

Motivation:

WIP

@gobengo gobengo changed the title Use content claims 1701302316 hoverboard can run with env.CONTENT_CLAIMS url and will use content-claims client to connect to it Nov 30, 2023
@gobengo gobengo self-assigned this Nov 30, 2023
package.json Outdated Show resolved Hide resolved
@gobengo gobengo marked this pull request as ready for review December 15, 2023 00:30
}
const miniflare = new Miniflare({
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the workers-sdk testWorker fn that is unstable has a way of accessing the r2 bucket from outside the worker so I had to pul this in, and iiuc this new Miniflare 3 can only run on the built script in dist.

@gobengo gobengo requested a review from travis December 19, 2023 20:17
Copy link
Member

@travis travis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! Would definitely feel comfier with another review from someone who's more familiar with this stuff, but I think this is mergable with or without the recommendations I've made.

* @param {import('@web3-storage/content-claims/client/api').LocationClaim} claim
*/
async transform (claim, controller) {
for (const location of claim.location) { controller.enqueue([claim, location]) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason to do this here rather than just doing const location = claim.location at the beginning of the transform function on line 182? it seems to me like it's just creating a bunch of extra datastructures, but wouldn't be surprised if I'm missing some nuance

Copy link
Contributor Author

@gobengo gobengo Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By doing them in separate streams, the second stream can process each location in parallel and also with separate error handling logic. instead of thinking about what to do if 2 locations are resolved but the 3rd gets an error or something.

*/
const locatedBlocks = createReadableStream(claims)
// claim -> [claim, location]
.pipeThrough(new TransformStream({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this pipe syntax keeps tripping me up, feels verbose, maybe this could be something like

.pipeThrough(transforms(
  async extractLocation (claim, controller) {
    for (const location of claim.location) { controller.enqueue([claim, location]) }
  },
  async fetchLocation ([claim, location], controller) {
    const response = await fetch(location, { headers: { accept: 'application/vnd.ipld.car' } })
    if (response.ok) {
      return controller.enqueue([claim, location, response])
    }
    throw new Error(`unexpected not-ok response ${response}`)
  },
  async readToBlocks ([claim, location, response], controller) {
    await response.body?.pipeThrough(new CARReaderStream()).pipeTo(new WritableStream({
      write: (block) => {
        // @todo validate that block.bytes hashes to `link`
        controller.enqueue({ claim, location, response, block })
      }
    }))
))

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does transforms come from?

Copy link
Contributor Author

@gobengo gobengo Dec 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited to use stream.compose in node but also trying here to use web streams and have as much as possible be easy to move to browsers. lmk if you know of a good library with those properties but personally I'd prefer not to make one rn.

}

/**
* @param {import('../api.js').RelationClaim[]} claims
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would love a good comment here summarizing what's going on - there's some complicated logic below and I'd love an abstract

Copy link
Contributor

@olizilla olizilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While reading this it struck me that it's a faff (programatically speaking) to have to unpack a relation claim + index + location claim to determine where to get a block from.

We can specify a location and a range in the location claim and have all the info we need to fetch the block bytes, so I'm making that work for materialised claims in storacha/content-claims#43

I'd like to land that and see if we can simplify things here.

package.json Show resolved Hide resolved
src/blocks.js Outdated Show resolved Hide resolved
@gobengo
Copy link
Contributor Author

gobengo commented Dec 20, 2023

@olizilla sounds good I added these TODOs in the issue body above
Screenshot 2023-12-20 at 2 52 59 PM

@gobengo gobengo changed the title hoverboard can run with env.CONTENT_CLAIMS url and will use content-claims client to connect to it hoverboard can run with env.CONTENT_CLAIMS url and will use content-claims when handling bitswap requests Jan 8, 2024
@alanshaw
Copy link
Member

superseded by #26

@alanshaw alanshaw closed this May 28, 2024
@alanshaw alanshaw deleted the use-content-claims-1701302316 branch May 28, 2024 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants