Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for storr caches? #68

Open
wlandau opened this issue Aug 21, 2018 · 6 comments
Open

Support for storr caches? #68

wlandau opened this issue Aug 21, 2018 · 6 comments
Labels
feature a feature request or enhancement help wanted ❤️ we'd love your help!

Comments

@wlandau
Copy link

wlandau commented Aug 21, 2018

I love @richfitz's storr API, and it opens up all sorts of storage backends without requiring extra work from memoise itself.

@jimhester jimhester added feature a feature request or enhancement help wanted labels Aug 21, 2018
@jennybc jennybc added help wanted ❤️ we'd love your help! and removed help wanted labels May 10, 2019
@dirkschumacher
Copy link

Couldn't that go into a separate package? But happy to implement it whereever sensible.

@DarwinAwardWinner
Copy link

This proof of concept seems to work:

library(memoise)
library(storr)
library(cachem)
library(rlang)

st <- storr_environment()

wrap_storr_cache <- function(st, missing = cachem::key_missing(), namespace = st$default_namespace) {
  missing_ <- enquo(missing)
  structure(
    list(
      get = function(key, missing = missing_) {
        tryCatch(
          st$get(key, namespace = namespace),
          error = function(e) eval_tidy(as_quosure(missing))
        )
      },
      set = function(key, value) st$set(key, value, namespace = namespace),
      exists = function(key) st$exists(key, namespace = namespace),
      remove = function(key) st$del(key, namespace = namespace),
      reset = function() st$clear(namespace = namespace),
      keys = function() st$list(namespace = namespace),
      prune = st$gc, # gc is not namespaced
      size = function() st$list(namespace = namespace),
      ## Hack for debugging: keep a reference to the storr object
      ## itself as well
      .st = st
    ),
    ## Not sure if this is correct, but it makes it print nicely
    class = c("cachem")
  )
}

st_cache <- wrap_storr_cache(st)

mysqrt <- function(x) {
  message("Computing sqrt of ", deparse1(x))
  sqrt(x)
}

mysqrt_memo <- memoise(mysqrt, cache = st_cache)

mysqrt_memo(1:10)
mysqrt_memo(1:10)
mysqrt_memo(1:10)

@wch
Copy link
Member

wch commented Apr 26, 2021

That seems like a reasonable implementation. I don't know which package it should live in, though.

A few other notes:

  • I think the tryCatch() should catch KeyError rather than the more general error (though I could be wrong in my understanding of how storr works).
  • If this function were to go in storr, that package could avoid taking a dependency on cachem. Instead of calling cachem::key_missing(), it could simply return an object that has the same structure.
  • With disk cache that is implemented in the cachem package, the get() method is atomic -- it doesn't do the check for the key's existence and the fetching the value in two separate steps. This is important for avoiding race conditions. I don't know if storr's object stores work the same way, but it is best if they do. Otherwise, if multiple R processes are sharing the same object store, there can be race conditions where after the existence check and before value fetch, an object is deleted from the store.

@DarwinAwardWinner
Copy link

I don't know about the rds backend, but I believe the more interesting storr backends all use databases that provide proper atomicity guarantees, even under concurrent access from multiple R processes.

@DarwinAwardWinner
Copy link

Oh, it looks like storr specifically allows for the possibility of a key being stored but the corresponding data being deleted from the object store: in this case, it returns HashError as described here: https://richfitz.github.io/storr/articles/storr.html#classed-exceptions

@DarwinAwardWinner
Copy link

As for where this code should live, perhaps it should be added to cachem, along with additional code to implement the appropriate cache size/age expiry logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement help wanted ❤️ we'd love your help!
Projects
None yet
Development

No branches or pull requests

6 participants