Refactor listPaginated tests + make README clearer (#312)
## Problem

The core DB previously had [a race-condition
bug](https://app.asana.com/0/1203260648987893/1207992392793971/f) that
caused [pagination tokens to be returned
non-deterministically](#266)
whenever a user called the `listPaginated` endpoint. The bug affected
every client, not just the TS client.

## Solution

The core DB team has [fixed this
bug](pinecone-io/pinecone-db#7285). Now that
it's fixed, this PR re-enables the previously commented-out test
assertions and updates the README to describe more precisely how this
method works.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [x] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)

## Test Plan

CI passes.

---
- To see the specific tasks where the Asana app for GitHub is being
used, see below:
  - https://app.asana.com/0/0/1208246198758491
aulorbe authored Nov 1, 2024
1 parent 26d31d7 commit 92b7e66
Showing 2 changed files with 34 additions and 11 deletions.
39 changes: 31 additions & 8 deletions README.md
@@ -923,19 +923,33 @@ await index.update({

### List records

The `listPaginated` method can be used to list record ids matching a particular id prefix in a paginated format. With clever assignment
of record ids, this can be used to help model hierarchical relationships between different records such as when there are embeddings for multiple chunks or fragments related to the same document.
The `listPaginated` method can be used to list record IDs matching a particular ID prefix in a paginated format. With
[clever assignment
of record IDs](https://docs.pinecone.io/guides/data/manage-rag-documents#use-id-prefixes), this can help model hierarchical relationships between records, such as when there are embeddings for multiple chunks or fragments of the same document.

Notes:

- When you do not specify a `prefix`, the default prefix is an empty string, which matches all vector IDs
  in your index
- If no `limit` is specified, at most `100` vector IDs are returned per call. Consequently, if there are fewer than `100`
  vector IDs that match a given `prefix` in your index, and you do not specify a `limit`, the response's `pagination`
  field (and with it the pagination token) will be `undefined`

The following example shows how to fetch both pages of vector IDs for vectors whose IDs start with the prefix `doc1#`,
assuming a `limit` of `3` and the `doc1` document being [chunked](https://www.pinecone.io/learn/chunking-strategies/) into `4` vectors.

```typescript
const pc = new Pinecone();
const index = pc.index('my-index').namespace('my-namespace');
const results = await index.listPaginated({ prefix: 'doc1#' });

// Fetch the 1st 3 vector IDs matching prefix 'doc1#'
const results = await index.listPaginated({ limit: 3, prefix: 'doc1#' });
console.log(results);
// {
// vectors: [
// { id: 'doc1#01' }, { id: 'doc1#02' }, { id: 'doc1#03' },
// { id: 'doc1#04' }, { id: 'doc1#05' }, { id: 'doc1#06' },
// { id: 'doc1#07' }, { id: 'doc1#08' }, { id: 'doc1#09' },
// { id: 'doc1#01' }
// { id: 'doc1#02' }
// { id: 'doc1#03' }
// ...
// ],
// pagination: {
@@ -945,11 +959,20 @@ console.log(results);
// usage: { readUnits: 1 }
// }

// Fetch the next page of results
await index.listPaginated({
// Fetch the final vector ID matching prefix 'doc1#' using the paginationToken returned by the previous call
const nextResults = await index.listPaginated({
prefix: 'doc1#',
paginationToken: results.pagination?.next,
});
console.log(nextResults);
// {
// vectors: [
// { id: 'doc1#04' }
// ],
// pagination: undefined,
// namespace: 'my-namespace',
// usage: { readUnits: 1 }
// }
```
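
The two-call example above generalizes to a loop that keeps following `pagination.next` until no token comes back. The sketch below is illustrative and is not part of this commit: `listAllIds`, `makeFakeIndex`, and the `Listable`, `ListOptions`, and `ListResponse` shapes are hypothetical names that mirror only the fields shown in the README example. The in-memory fake uses a plain offset as its token, whereas real tokens should be treated as opaque, and the helper has only been exercised against the fake, so its compatibility with the real client is an assumption.

```typescript
// Minimal shapes mirroring the fields used in the README example (assumed, not the SDK's own types).
interface ListResponse {
  vectors?: Array<{ id?: string }>;
  pagination?: { next?: string };
}

interface ListOptions {
  prefix?: string;
  limit?: number;
  paginationToken?: string;
}

// Anything exposing a compatible listPaginated method.
interface Listable {
  listPaginated(options?: ListOptions): Promise<ListResponse>;
}

// Hypothetical helper: requests pages until the response carries no pagination token.
async function listAllIds(index: Listable, prefix: string, limit?: number): Promise<string[]> {
  const ids: string[] = [];
  let paginationToken: string | undefined = undefined;
  do {
    const page: ListResponse = await index.listPaginated({ prefix, limit, paginationToken });
    for (const v of page.vectors ?? []) {
      if (v.id !== undefined) ids.push(v.id);
    }
    paginationToken = page.pagination?.next;
  } while (paginationToken !== undefined);
  return ids;
}

// In-memory stand-in for an index, for illustration only.
// Its token is the next offset encoded as a string; this is NOT how real tokens look.
function makeFakeIndex(allIds: string[]): Listable {
  return {
    async listPaginated(options: ListOptions = {}): Promise<ListResponse> {
      const { prefix = '', limit = 100, paginationToken } = options;
      const matches = allIds.filter((id) => id.startsWith(prefix));
      const start = paginationToken ? Number(paginationToken) : 0;
      const end = start + limit;
      return {
        vectors: matches.slice(start, end).map((id) => ({ id })),
        // No token once the last matching ID has been returned.
        pagination: end < matches.length ? { next: String(end) } : undefined,
      };
    },
  };
}

// Usage against the fake: four 'doc1#' IDs fetched in pages of 3, so two calls are made.
listAllIds(makeFakeIndex(['doc1#01', 'doc1#02', 'doc1#03', 'doc1#04', 'doc2#01']), 'doc1#', 3)
  .then((ids) => console.log(ids));
```

Stopping when the token is `undefined`, rather than comparing the page size against `limit`, keeps the loop independent of how many IDs the service decides to return per page.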

### Fetch records by ID(s)
6 changes: 3 additions & 3 deletions src/integration/data/vectors/list.test.ts
@@ -18,7 +18,7 @@ describe('listPaginated, serverless index', () => {
test('test listPaginated with no arguments', async () => {
const listResults = await serverlessIndex.listPaginated();
expect(listResults).toBeDefined();
// expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
expect(listResults.pagination).toBeUndefined(); // Only 11 records in the index, so no pagination token is returned
expect(listResults.vectors?.length).toBe(11);
expect(listResults.namespace).toBe(globalNamespaceOne);
});
@@ -29,7 +29,7 @@ describe('listPaginated, serverless index', () => {
});
expect(listResults.namespace).toBe(globalNamespaceOne);
expect(listResults.vectors?.length).toBe(1);
// expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
expect(listResults.pagination).toBeUndefined();
});

test('test listPaginated with limit and pagination', async () => {
@@ -39,7 +39,7 @@ describe('listPaginated, serverless index', () => {
});
expect(listResults.namespace).toBe(globalNamespaceOne);
expect(listResults.vectors?.length).toBe(3);
// expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
expect(listResults.pagination).toBeDefined();

const listResultsPg2 = await serverlessIndex.listPaginated({
prefix,