
propose project merge with my align-block-file #12


Closed
dominictarr opened this issue Jul 18, 2017 · 3 comments


@dominictarr

we are trying to do the same thing here, and if we merge our APIs then yak shaving (like getting it fast in IndexedDB) will benefit all our projects.

so, this would mean adding methods from each other's apis, and writing a test suite (and benchmark suite!) that can be reused.

so, take the union of https://github.com/flumedb/aligned-block-file plus https://github.com/mafintosh/random-access-file.

methods already shared: read, except that RAF uses (start, length) and ABF uses (start, end)
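
For concreteness, a minimal sketch of the two read conventions described above (the `raf`/`abf` names are just placeholders, not taken from either codebase):

```js
// random-access-file convention: read(offset, length, cb)
raf.read(32, 16, function (err, buf) {
  // buf holds bytes 32..47
})

// aligned-block-file convention: read(start, end, cb)
abf.read(32, 48, function (err, buf) {
  // buf holds bytes 32..47
})
```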

then RAF has the following methods

  • write (ABF currently only has append)
  • del (ABF more accurately calls this truncate)
  • end (curious about how this is used, because not every RAF implementation has a place for this data?)
  • close
  • unlink
  • events "open", "close" (ABF uses observables instead; I'd be content to switch to observables or support both - see the sketch after this list)
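
A rough illustration of the two notification styles (the names are placeholders, not the exact APIs):

```js
// random-access-file style: a node EventEmitter
file.on('open', function () { console.log('file is open') })
file.on('close', function () { console.log('file is closed') })

// aligned-block-file style: an observable is a function you subscribe to
blocks.offset(function (value) { console.log('offset is now', value) })
```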

ABF has

  • readUInt{32,48,64}BE - since ABF internally keeps an aligned buffer cache, these methods can be made really fast: instead of reading a 4-8 byte Buffer, converting it to a number, and throwing it away, the number is read straight from the cached block. No buffer memory is allocated, and a small Buffer object adds more memory to the heap than the data it actually contains! (see the sketch after this list)
  • append (would be easy to add this to RAF)
  • size returns current size... should probably remove this method?
  • offset observable.
  • truncate could be an alias for RAF.del
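
A sketch of why the readUInt methods can skip allocation, assuming the file is cached as fixed-size blocks the way ABF keeps it (the `getBlock` helper is hypothetical, and this ignores reads that straddle a block boundary):

```js
// read a big-endian 48-bit unsigned int directly out of a cached block
function readUInt48BE (getBlock, blockSize, offset, cb) {
  var i = Math.floor(offset / blockSize) // which cached block holds the offset
  var o = offset % blockSize             // position inside that block
  getBlock(i, function (err, block) {
    if (err) return cb(err)
    // read the integer straight from the cached Buffer:
    // no temporary 6-byte Buffer is created and discarded
    cb(null, block.readUIntBE(o, 6))
  })
}
```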

The main difficulty here is that we need to settle on either (start, end) or (start, length), and that is a breaking change for someone... we could choose based on which has less code to fix, or which pattern is used more consistently in the node APIs (it seems that both are present, but I'm not clear on the rationale for when to use a particular one).
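
Whichever way the decision goes, a thin adapter could soften the break for the other side; a hypothetical sketch (`withStartEnd` is not a real module):

```js
// wrap an (offset, length) style reader so callers can use (start, end)
function withStartEnd (raf) {
  return {
    read: function (start, end, cb) {
      raf.read(start, end - start, cb) // translate (start, end) -> (offset, length)
    }
  }
}
```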

@mafintosh
Collaborator

Combining efforts sounds good to me. We are already heavily invested in this abstraction across all my BT projects and hypercore/hyperdrive (https://github.com/mafintosh/multi-random-access, https://github.com/mafintosh/random-access-memory, lots more).

Couple of notes

  • .del(offset, length, callback) - the intent here is actually to support in-place deletes in addition to simple truncates. The memory driver already does that, and I plan on adding support for fs block hole punching where available (https://lwn.net/Articles/415889/) - see the sketch after this list

  • .end(callback) - legacy, we can just remove this

  • .unlink(callback) - legacy, we use a new method called .destroy(callback) in case you want to destroy all underlying storage
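
A minimal sketch of the in-place delete semantics for a memory-backed store (illustrative only; a real driver would drop whole pages or punch holes rather than zero-fill):

```js
// bytes in the deleted range read back as zeros; the file length is unchanged
function del (buffer, offset, length, cb) {
  buffer.fill(0, offset, offset + length)
  process.nextTick(function () { cb(null) })
}
```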

For the readUInt methods, I want to support passing in a buffer to read into (.read(offset, buf, cb)) in addition to .read(offset, length, cb) - there are a bunch of other memory optimisation use cases this enables, including completely statically allocated network piping using https://github.com/mafintosh/turbo-net in the future.
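
A sketch of what that overload could look like on top of fs (this is a proposal in the comment above, not an existing API; fd lifecycle handling is omitted):

```js
var fs = require('fs')

// pass either a length (a fresh buffer is allocated) or a preallocated
// buffer to read into (no allocation at all)
function read (fd, offset, lengthOrBuf, cb) {
  var buf = Buffer.isBuffer(lengthOrBuf) ? lengthOrBuf : Buffer.allocUnsafe(lengthOrBuf)
  fs.read(fd, buf, 0, buf.length, offset, function (err) {
    if (err) return cb(err)
    cb(null, buf)
  })
}
```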

@dominictarr
Author

what does .del() do, does it just write zeros? oh hmm, or are they read as zeros, but are actually not there! that might be good for sparse hash tables (hmm, although, this would be a tradeoff because it would work better with smaller blocks...)

I notice you also have one for doing sparse data. I've actually been thinking about building stuff on random writes, but I'm not sure about the durability: if you do a bunch of updates and those are only partially written, I want to know at what point the file was last valid (so that I can replay subsequent writes)... but that is another discussion.

I mostly only need append-only and truncate at the moment, so what if we separate this into parts: random reads, append-only writes + truncate, then random writes/del? then it's easy to specify which set of APIs a module supports.
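
A rough sketch of that split as capability sets (the names here are made up for illustration):

```js
// each storage module mixes in only the capability sets it supports
var randomReads = {
  read: function (start, length, cb) { /* ... */ }
}
var appendOnly = {
  append: function (buf, cb) { /* ... */ },
  truncate: function (length, cb) { /* ... */ }
}
var randomWrites = {
  write: function (offset, buf, cb) { /* ... */ },
  del: function (offset, length, cb) { /* ... */ }
}

// a concrete store declares what it implements:
var store = Object.assign({}, randomReads, appendOnly)

// consumers can feature-detect before use:
var supportsRandomWrites = typeof store.write === 'function'
```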

@mafintosh
Collaborator

Closing because this is old, but always happy to discuss
