Problem testing "transition" behaviour on modern CouchDB

I'm writing this issue up as an explanation for the Neighbourhoodie team who
have been working on updating this library, to elaborate on a testing problem we
ran into.

This library works by loading data expressed in this format into a PouchDB
instance:

    {"docs":[{"_id":"foo","_rev":"1-x"}]}
    {"seq":1}
    {"docs":[{"_id":"bar","_rev":"1-y"}]}
    {"seq":2}
    {"docs":[{"_id": "baz", "_rev": "1-w"}]}
    {"seq":3} 

Any line with a `docs` array has those docs added to the target DB. The last
line with a `seq` has that `seq` put into a "fake" checkpoint doc in
`_local/{id}`, where the ID is the CouchDB-compatible replication ID for the
pair formed by the target database and some other remote database. The remote
is addressed by its URL and referred to in the API as a `proxy`.

    let proxy = 'http://couch.example.com:5984/foo' 

    let db = new PouchDB('foo')
    await db.load(textDumpUrl, { proxy })
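
As a rough illustration of the dump format described above, here is a minimal sketch that splits a dump into the docs to load and the final `seq`. The `parseDump` name is hypothetical and this is not the library's actual loader; it only models the "collect every `docs` array, keep the last `seq`" rule.

```javascript
// Hypothetical helper, not the library's code: collects every doc from the
// `docs` lines and remembers the `seq` from the last `seq` line.
function parseDump(text) {
  const docs = []
  let lastSeq = null
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue
    const entry = JSON.parse(line)
    if (Array.isArray(entry.docs)) docs.push(...entry.docs)
    if ('seq' in entry) lastSeq = entry.seq
  }
  return { docs, lastSeq }
}

const dump = [
  '{"docs":[{"_id":"foo","_rev":"1-x"}]}',
  '{"seq":1}',
  '{"docs":[{"_id":"bar","_rev":"1-y"}]}',
  '{"seq":2}'
].join('\n')

const parsed = parseDump(dump)
console.log(parsed.docs.map((doc) => doc._id), parsed.lastSeq)
```

The `lastSeq` value is what would end up in the checkpoint doc.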

The checkpoint is created so that when we later replicate from that "proxy", we
don't start from scratch but resume from the last `seq` in the dump file.

    let remote = new PouchDB(proxy)
    await db.replicate.from(remote) // resumes from since=3

This behaviour for switching from using a text dump to doing normal replication
is referred to as "transitioning" in the test suite. The tests for this
behaviour will do the above steps to load a fixture file into a PouchDB, and
then do something like this:

    let remote = new PouchDB(proxy)
    await remote.put({ _id: 'quux' })
    await db.replicate.from(remote)

    let info = await db.info()
    assert.equal(info.doc_count, 3)

The idea is that since `remote` only contains a single doc, it will have
`last_seq=1`, and therefore the PouchDB replicator trying to resume at `seq=3`
will skip the `quux` doc and not add it to PouchDB.

This worked on CouchDB 1.x which had `seq` values that were just integers. It
does not work on later versions where `seq` is more complex and
non-deterministic, since it encodes data about per-shard-replica source `seq`,
and the database UUID. For example, if you run these steps multiple times, you
will get a different output each time, even with a single shard on a single-node
instance.

    cdb '/asd' -X DELETE
    cdb '/asd?q=1' -X PUT
    cdb '/asd/doc-1' -X PUT -d '{"n":1}'
    cdb '/asd/doc-2' -X PUT -d '{"n":2}'
    cdb '/asd/doc-3' -X PUT -d '{"n":3}'
    cdb '/asd/_changes' | jq '.last_seq'

    -> e.g. "3-g1AAAABPeJzLYWBgYMxgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTLksTD8B4KsDOZE5lygEHuyiWWqhVkaNi1ZAJ57GOI"
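
To make the difference between the two `seq` shapes concrete, here is a small sketch that splits a value at its first hyphen. The `splitSeq` helper and the `'3-abc'` string are made up for illustration. Something like this is only useful for debugging, since replication code is expected to treat the whole value as opaque and pass it back unchanged.

```javascript
// Hypothetical debugging helper: separates the leading number from the
// opaque remainder. A plain integer (CouchDB 1.x style) has no opaque part.
function splitSeq(seq) {
  const text = String(seq)
  const dash = text.indexOf('-')
  if (dash === -1) return { count: Number(text), opaque: null }
  return { count: Number(text.slice(0, dash)), opaque: text.slice(dash + 1) }
}

console.log(splitSeq(3))
console.log(splitSeq('3-abc')) // 'abc' is a placeholder, not a real encoded value
```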

The initial problem we ran into is that something inside PouchDB, or the
checkpoint package used in this library, wants to parse and inspect the
internals of `seq` values, and this was failing on plain integers. So we changed
the `foobar.txt` fixture to use real `seq` values from a CouchDB database.

We now have the problem that the test won't work, because when the single doc is
added to `remote` it will have an unpredictable `seq` which will not appear in
our fixture. When the PouchDB replicator tries to resume with
`since=3-g1AAAABP...`, the remote CouchDB will not recognise that value and will
return the change feed ignoring the `since` param.

On CouchDB 1.x, it may have been the case that passing `since=3` to a database
with only 1 change in it would return an empty feed, rather than behaving like
`since=0` had been given. The test certainly seems to assume this was the case,
but I have not verified it. In any case, this type of assumption is definitely
not true today. Besides, `seq` values _should_ be unique per database to prevent
mis-using a `seq` value from one database against another where it would not
mean anything.

On modern CouchDB, calling `GET /db/_changes?since=X` with a `seq` the DB does
not recognise will just give you the whole feed. So you will not get the
behaviour of skipping the `quux` doc that the test relies on. The final state
after this replication would be the same as if you didn't have any checkpoint to
begin with, so this is not a good test that the library works correctly.
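
A toy model may help show why the final state says nothing about the checkpoint. The `changesSince` function below is not CouchDB code. It only assumes the behaviour described above, where an unrecognised `since` is treated like no `since` at all, and the `seq` strings are placeholders.

```javascript
// Toy model, not CouchDB code: a feed is an ordered list of { seq, id }.
function changesSince(feed, since) {
  const index = feed.findIndex((change) => change.seq === since)
  // Unrecognised (or absent) `since`: behave as if starting from the beginning.
  if (index === -1) return feed
  return feed.slice(index + 1)
}

// The remote holds one doc, with a seq that cannot appear in our fixture.
const remoteFeed = [{ seq: '1-remote', id: 'quux' }]

const withCheckpoint = changesSince(remoteFeed, '3-fixture')
const withoutCheckpoint = changesSince(remoteFeed, undefined)
console.log(withCheckpoint.map((change) => change.id))
console.log(withoutCheckpoint.map((change) => change.id))
```

Both calls return the same changes, so a test that only inspects the resulting docs cannot tell the two cases apart.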

Instead, we need to check the replication actually resumes from where we expect.
I see two options for doing this:

- Reading the `_local/{id}` doc from PouchDB and checking the expected `seq` is
  in there.
- Running a replication but mocking the CouchDB API to check what `_changes` is
  called with.
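
A minimal sketch of the first option, under two assumptions that would need checking: that the library exposes the full `_local/...` doc ID somehow, and that the checkpoint doc stores its value in a `last_seq` field. `readCheckpointSeq` and `fakeDb` are hypothetical names. The fake stands in for a loaded PouchDB so the example runs on its own.

```javascript
// Hypothetical helper. `db` is anything with a PouchDB-style `get(id)` that
// returns a promise. `checkpointDocId` is the full `_local/...` ID.
// Assumption: the checkpoint value lives in a `last_seq` field.
async function readCheckpointSeq(db, checkpointDocId) {
  const doc = await db.get(checkpointDocId)
  return doc.last_seq
}

// Stand-in for a PouchDB that has already loaded a dump.
const fakeDb = {
  docs: {
    '_local/example-id': { _id: '_local/example-id', last_seq: '3-fixture' }
  },
  async get(id) {
    if (!(id in this.docs)) {
      throw Object.assign(new Error('missing'), { status: 404 })
    }
    return this.docs[id]
  }
}

readCheckpointSeq(fakeDb, '_local/example-id').then((seq) => {
  console.log(seq === '3-fixture')
})
```

In a real test, `fakeDb` would be the PouchDB instance that `load()` was called on, and the expected value would be the last `seq` in the fixture file.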

The first option is simpler to implement but requires exposing a function to
compute the checkpoint doc ID, since PouchDB does not seem to have a method for
enumerating local docs. The second would require mocking all the APIs the
replicator interacts with and would be substantially more complicated.
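
For the second option, the assertion itself could be small even if the mocking is not. The sketch below assumes the mock server records the full URL of each request and that the replicator sends `since` as a query parameter. If it were sent in a request body instead, the mock would need to inspect that. `sinceValues` and the recorded URLs are hypothetical.

```javascript
// Hypothetical helper: given the request URLs a mock server recorded, return
// the `since` value of every `_changes` request, in order.
function sinceValues(requestUrls) {
  return requestUrls
    .map((url) => new URL(url))
    .filter((url) => url.pathname.endsWith('/_changes'))
    .map((url) => url.searchParams.get('since'))
}

const recorded = [
  'http://couch.example.com:5984/foo/',
  'http://couch.example.com:5984/foo/_changes?style=all_docs&since=3-fixture'
]
console.log(sinceValues(recorded))
```

A test would then compare the first returned value against the last `seq` in the fixture.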

Finally, I'm not sure if putting a checkpoint _only_ in PouchDB, and not in the
remote database, is sufficient to cause the replication to skip the data added
by `load()`, because the CouchDB replication protocol uses checkpoints from
_both_ the source and target databases. So the library's behaviour of just
putting a checkpoint in the PouchDB end might not be correct.
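
To illustrate the concern, here is a simplified model of how a replicator could pick its starting point from the two replication logs. It follows the general shape of the CouchDB replication protocol description, with field names (`session_id`, `source_last_seq`, `history`, `recorded_seq`) taken from that description. It is not PouchDB's or CouchDB's actual code. That a log missing on either side leads to a full replication is an assumption here, and the behaviour of the real PouchDB checkpointer would need to be verified separately.

```javascript
// Simplified model only. Returns the seq to resume from, or 0 for a full
// replication. Assumes `history` lists the most recent session first.
function startupCheckpoint(sourceLog, targetLog) {
  // Assumption: a missing log on either side means no common history.
  if (!sourceLog || !targetLog) return 0
  if (sourceLog.session_id === targetLog.session_id) {
    return sourceLog.source_last_seq
  }
  const targetSessions = new Set(
    (targetLog.history || []).map((entry) => entry.session_id)
  )
  const common = (sourceLog.history || []).find((entry) =>
    targetSessions.has(entry.session_id)
  )
  return common ? common.recorded_seq : 0
}

// Only the target (PouchDB) side has a log, as after `load()`.
const targetOnly = { session_id: 'abc', source_last_seq: '3-fixture', history: [] }
console.log(startupCheckpoint(null, targetOnly))
```

Under this model, a checkpoint written only to the target would not be enough to resume, which is the situation that needs checking.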

