Refactor listPaginated tests + make README clearer (#312)

## Problem

The core DB previously had [a race-condition bug](https://app.asana.com/0/1203260648987893/1207992392793971/f) that resulted in [pagination tokens being non-deterministically returned to the user](#266) when they'd call the `listPaginated` endpoint (not just in the TS client).

## Solution

The core DB team has [fixed this bug](pinecone-io/pinecone-db#7285). Now that it's fixed, this PR updates previously-commented-out tests + updates the README to be more precise regarding how this method works.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] This change requires a documentation update
- [ ] Infrastructure change (CI configs, etc)
- [ ] Non-code change (docs, etc)
- [ ] None of the above: (explain here)

## Test Plan

CI passes.

---

- To see the specific tasks where the Asana app for GitHub is being used, see below:
  - https://app.asana.com/0/0/1208246198758491
pinecone-io · Nov 1, 2024 · 92b7e66
1 parent 26d31d7
Showing 2 changed files with 34 additions and 11 deletions.
diff --git a/README.md b/README.md
@@ -923,19 +923,33 @@ await index.update({
 
 ### List records
 
-The `listPaginated` method can be used to list record ids matching a particular id prefix in a paginated format. With clever assignment
-of record ids, this can be used to help model hierarchical relationships between different records such as when there are embeddings for multiple chunks or fragments related to the same document.
+The `listPaginated` method can be used to list record IDs matching a particular ID prefix in a paginated format. With
+[clever assignment
+of record ids](https://docs.pinecone.io/guides/data/manage-rag-documents#use-id-prefixes), this can be used to help model hierarchical relationships between different records such as when there are embeddings for multiple chunks or fragments related to the same document.
+
+Notes:
+
+- When you do not specify a `prefix`, the default prefix is an empty string, which returns all vector IDs
+  in your index
+- There is a hard limit of `100` vector IDs if no `limit` is specified. Consequently, if there are fewer than `100`
+  vector IDs that match a given `prefix` in your index, and you do not specify a `limit`, your `paginationToken`
+  will be `undefined`
+
+The following example shows how to fetch both pages of vector IDs for vectors whose IDs contain the prefix `doc1#`,
+assuming a `limit` of `3` and `doc1` document being [chunked](https://www.pinecone.io/learn/chunking-strategies/) into `4` vectors.
 
 ```typescript
 const pc = new Pinecone();
 const index = pc.index('my-index').namespace('my-namespace');
-const results = await index.listPaginated({ prefix: 'doc1#' });
+
+// Fetch the 1st 3 vector IDs matching prefix 'doc1#'
+const results = await index.listPaginated({ limit: 3, prefix: 'doc1#' });
 console.log(results);
 // {
 //   vectors: [
-//     { id: 'doc1#01' }, { id: 'doc1#02' }, { id: 'doc1#03' },
-//     { id: 'doc1#04' }, { id: 'doc1#05' },  { id: 'doc1#06' },
-//     { id: 'doc1#07' }, { id: 'doc1#08' }, { id: 'doc1#09' },
+//     { id: 'doc1#01' }
+//     { id: 'doc1#02' }
+//     { id: 'doc1#03' }
 //     ...
 //   ],
 //   pagination: {
@@ -945,11 +959,20 @@ console.log(results);
 //   usage: { readUnits: 1 }
 // }
 
-// Fetch the next page of results
-await index.listPaginated({
+// Fetch the final vector ID matching prefix 'doc1#' using the paginationToken returned by the previous call
+const nextResults = await index.listPaginated({
   prefix: 'doc1#',
   paginationToken: results.pagination?.next,
 });
+console.log(nextResults);
+// {
+//   vectors: [
+//     { id: 'doc1#04' }
+//   ],
+//   pagination: undefined,
+//   namespace: 'my-namespace',
+//   usage: { readUnits: 1 }
+// }
 ```
 
 ### Fetch records by ID(s)

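The README change above shows two pages being fetched by hand. As a rough sketch of how that generalizes, a caller could keep following `pagination.next` until it comes back `undefined`. The names `listAllIds`, `FetchPage`, `ListPage`, and `makeFakeIndex` are hypothetical and not part of this commit or the client; the page shape is assumed from the example output in the README diff, and the fake index below is only an in-memory stand-in so the loop can be run without a real index.

```typescript
// Assumed shape of one page, based on the example output in the README diff.
interface ListPage {
  vectors?: Array<{ id?: string }>;
  pagination?: { next?: string };
}

// Hypothetical: any function that fetches a single page of IDs.
type FetchPage = (options: {
  prefix?: string;
  limit?: number;
  paginationToken?: string | undefined;
}) => Promise<ListPage>;

// Collects every ID matching `prefix` by following `pagination.next`
// until a page arrives without a token.
async function listAllIds(
  fetchPage: FetchPage,
  prefix: string,
  limit: number
): Promise<string[]> {
  const ids: string[] = [];
  let paginationToken: string | undefined = undefined;
  do {
    const page: ListPage = await fetchPage({ prefix, limit, paginationToken });
    for (const vector of page.vectors ?? []) {
      if (vector.id !== undefined) ids.push(vector.id);
    }
    paginationToken = page.pagination?.next;
  } while (paginationToken !== undefined);
  return ids;
}

// Hypothetical in-memory stand-in for an index. The token here is just an
// offset encoded as a string; real tokens are opaque.
function makeFakeIndex(allIds: string[]): FetchPage {
  return async ({ prefix = '', limit = 100, paginationToken }) => {
    const matching = allIds.filter((id) => id.startsWith(prefix));
    const start = paginationToken === undefined ? 0 : Number(paginationToken);
    const end = start + limit;
    const page: ListPage = {
      vectors: matching.slice(start, end).map((id) => ({ id })),
    };
    // Only return a token when more matching IDs remain.
    if (end < matching.length) page.pagination = { next: String(end) };
    return page;
  };
}
```

Against a real index, the same loop could presumably be driven with something like `listAllIds((opts) => index.listPaginated(opts), 'doc1#', 3)`, assuming the client's response type is compatible with `ListPage`.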
diff --git a/src/integration/data/vectors/list.test.ts b/src/integration/data/vectors/list.test.ts
@@ -18,7 +18,7 @@ describe('listPaginated, serverless index', () => {
   test('test listPaginated with no arguments', async () => {
     const listResults = await serverlessIndex.listPaginated();
     expect(listResults).toBeDefined();
-    // expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
+    expect(listResults.pagination).toBeUndefined(); // Only 11 records in the index, so no pag token returned
     expect(listResults.vectors?.length).toBe(11);
     expect(listResults.namespace).toBe(globalNamespaceOne);
   });
@@ -29,7 +29,7 @@ describe('listPaginated, serverless index', () => {
     });
     expect(listResults.namespace).toBe(globalNamespaceOne);
     expect(listResults.vectors?.length).toBe(1);
-    // expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
+    expect(listResults.pagination).toBeUndefined();
   });
 
   test('test listPaginated with limit and pagination', async () => {
@@ -39,7 +39,7 @@ describe('listPaginated, serverless index', () => {
     });
     expect(listResults.namespace).toBe(globalNamespaceOne);
     expect(listResults.vectors?.length).toBe(3);
-    // expect(listResults.pagination).toBeDefined(); todo: re-enable this once pagination bug is fixed (https://app.asana.com/0/1204819992273155/1207992392793971/f)
+    expect(listResults.pagination).toBeDefined();
 
     const listResultsPg2 = await serverlessIndex.listPaginated({
       prefix,