Commit 6f43b93

Add embed endpoint (#153)
## Problem

Add an embed endpoint to the Java SDK so that users can create embeddings for text data, such as a passage or a query, using a specified model. More details on the Inference API can be found [here](https://docs.pinecone.io/guides/inference/understanding-inference).

## Solution

Users can now call the embed endpoint with the following parameters:

1. `String model`: Accepts a string from the specified [models](https://docs.pinecone.io/models/overview).
2. `Map<String, Object> parameters`: Accepts `input_type` and `truncate` as keys with their corresponding values in a Map. The values are expected to be scalar. Note that `truncate` defaults to `END` if not specified.
3. `List<String> inputs`: The list must contain at least one element.

As part of this change, I have added a `getInferenceClient()` method to the `Pinecone` class, following the pattern used by the other SDKs. The underlying client uses `OkHttpClient` for REST calls. `getInferenceClient()` returns an instance of the `Inference` class, which I added as a wrapper around the Inference API. So far, this wrapper contains only the `embed()` endpoint, with `rerank()` planned soon.

Lastly, I have added docstrings and updated the README with an example of the embed endpoint.

## Type of Change

- [X] New feature (non-breaking change which adds functionality)

## Test Plan

Added integration tests.
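The parameter rules in the Solution section (a scalar-valued `parameters` map, `truncate` defaulting to `END`, and at least one input) could also be checked on the caller's side before invoking `embed()`. The following is a minimal sketch under assumptions: `EmbedArguments`, `withDefaults`, and `requireInputs` are hypothetical names that are not part of the SDK, and the commit does not say whether the `END` default is applied by the SDK or by the API, so the client-side default here is purely illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Pinecone SDK.
final class EmbedArguments {

    private EmbedArguments() {
    }

    // Returns a copy of the caller's parameters, adding "truncate" = "END"
    // when the key is absent and rejecting non-scalar values.
    static Map<String, Object> withDefaults(Map<String, Object> parameters) {
        Map<String, Object> copy = new HashMap<>();
        if (parameters != null) {
            copy.putAll(parameters);
        }
        copy.putIfAbsent("truncate", "END");

        for (Map.Entry<String, Object> entry : copy.entrySet()) {
            Object value = entry.getValue();
            boolean scalar = value instanceof String
                    || value instanceof Number
                    || value instanceof Boolean;
            if (!scalar) {
                throw new IllegalArgumentException(
                        "Parameter '" + entry.getKey() + "' must be a scalar value");
            }
        }
        return copy;
    }

    // Rejects a null or empty input list and returns a defensive copy otherwise.
    static List<String> requireInputs(List<String> inputs) {
        if (inputs == null || inputs.isEmpty()) {
            throw new IllegalArgumentException("Must specify at least one input");
        }
        return new ArrayList<>(inputs);
    }

    public static void main(String[] args) {
        Map<String, Object> parameters = new HashMap<>();
        parameters.put("input_type", "query");

        // The merged map carries the caller's key plus the assumed default.
        System.out.println(withDefaults(parameters));
    }
}
```

Copying the map rather than mutating it keeps the caller's `parameters` object unchanged, which matters if the same map is reused across several `embed()` calls.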
1 parent 27ea488 commit 6f43b93

4 files changed: +186 -5 lines changed

README.md

Lines changed: 44 additions & 4 deletions
@@ -47,8 +47,11 @@ The `Pinecone` class is your main entry point into the Pinecone Java SDK. You ca
 your `apiKey`, either by passing it as an argument in your code or by setting it as an environment variable called
 `PINECONE_API_KEY`.
 
-Note: for pod-based indexes, you will also need an `environment` variable. You can set pass this as an argument in
-your code or set it as an environment variable called `PINECONE_ENVIRONMENT`.
+This internally instantiates a single shared OkHttpClient instance, which is used for both control plane and inference
+operations. Note that the OkHttpClient performs best when you create a single `OkHttpClient` instance and reuse it
+for all of your HTTP calls. This is because each client holds its own connection pool and thread pools. Reusing
+connections and threads reduces latency and saves memory. Conversely, creating a client for each request wastes
+resources on idle pools. More details on the OkHttpClient can be found [here](https://github.com/square/okhttp/blob/f2771425cb714a5b0b27238bd081b2516b4d640f/okhttp/src/main/kotlin/okhttp3/OkHttpClient.kt#L54).
 
 ```java
 import io.pinecone.clients.Pinecone;
@@ -542,7 +545,44 @@ Pinecone pinecone = new Pinecone.Builder("PINECONE_API_KEY").build();
 pinecone.deleteCollection("example-collection");
 ```
 
+## Inference
+
+The Pinecone SDK now supports creating embeddings via the [Inference API](https://docs.pinecone.io/guides/inference/understanding-inference).
+
+```java
+import io.pinecone.clients.Pinecone;
+import org.openapitools.control.client.ApiException;
+import org.openapitools.control.client.model.Embedding;
+import org.openapitools.control.client.model.EmbeddingsList;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+...
+
+Pinecone pinecone = new Pinecone.Builder("PINECONE_API_KEY").build();
+Inference inference = pinecone.getInferenceClient();
+
+// Prepare input sentences to be embedded
+List<String> inputs = new ArrayList<>();
+inputs.add("The quick brown fox jumps over the lazy dog.");
+inputs.add("Lorem ipsum");
+
+// Specify the embedding model and parameters
+String embeddingModel = "multilingual-e5-large";
+
+Map<String, Object> parameters = new HashMap<>();
+parameters.put("input_type", "query");
+parameters.put("truncate", "END");
+
+// Generate embeddings for the input data
+EmbeddingsList embeddings = inference.embed(embeddingModel, parameters, inputs);
+
+// Get embedded data
+List<Embedding> embeddedData = embeddings.getData();
+```
+
 ## Examples
 
-- The data and control plane operation examples can be found in `io/pinecone/integration` folder.
-- A full end-to-end Semantic Search example can be found in the [Java Examples](https://github.com/pinecone-io/java-examples/tree/main) repo on Github.
+- The data and control plane operation examples can be found in `io/pinecone/integration` folder.
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+package io.pinecone.integration.inference;
+
+import io.pinecone.clients.Inference;
+import io.pinecone.clients.Pinecone;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+import org.openapitools.control.client.ApiException;
+import org.openapitools.control.client.model.EmbeddingsList;
+
+import java.util.*;
+
+import static org.junit.Assert.*;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+
+public class EmbedTest {
+
+    private static final Pinecone pinecone = new Pinecone
+            .Builder(System.getenv("PINECONE_API_KEY"))
+            .withSourceTag("pinecone_test")
+            .build();
+    private static final Inference inference = pinecone.getInferenceClient();
+
+    @Test
+    public void testGenerateEmbeddings() throws ApiException {
+        List<String> inputs = new ArrayList<>(1);
+        inputs.add("The quick brown fox jumps over the lazy dog.");
+        inputs.add("Lorem ipsum");
+
+        String embeddingModel = "multilingual-e5-large";
+
+        Map<String, Object> parameters = new HashMap<>();
+        parameters.put("input_type", "query");
+        parameters.put("truncate", "END");
+        EmbeddingsList embeddings = inference.embed(embeddingModel, parameters, inputs);
+
+        assertNotNull(embeddings, "Expected embedding to be not null");
+        Assertions.assertEquals(embeddingModel, embeddings.getModel());
+        Assertions.assertEquals(1024, embeddings.getData().get(0).getValues().size());
+        Assertions.assertEquals(2, embeddings.getData().size());
+    }
+
+    @Test
+    public void testGenerateEmbeddingsInvalidInputs() throws ApiException {
+        String embeddingModel = "multilingual-e5-large";
+        List<String> inputs = new ArrayList<>();
+        Map<String, Object> parameters = new HashMap<>();
+        parameters.put("input_type", "query");
+        parameters.put("truncate", "END");
+
+        Exception exception = assertThrows(Exception.class, () -> {
+            inference.embed(embeddingModel, parameters, inputs);
+        });
+
+        Assertions.assertTrue(exception.getMessage().contains("Must specify at least one input"));
+    }
+}
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+package io.pinecone.clients;
+
+import org.openapitools.control.client.ApiClient;
+import org.openapitools.control.client.ApiException;
+import org.openapitools.control.client.api.InferenceApi;
+import org.openapitools.control.client.model.EmbedRequest;
+import org.openapitools.control.client.model.EmbedRequestInputsInner;
+import org.openapitools.control.client.model.EmbedRequestParameters;
+import org.openapitools.control.client.model.EmbeddingsList;
+
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * The Inference class provides methods to interact with Pinecone's inference API through the Java SDK. It allows users
+ * to send input data to generate embeddings using a specified model.
+ * <p>
+ * This class utilizes the {@link InferenceApi} to make API calls to the Pinecone inference service.
+ *
+ */
+
+public class Inference {
+
+    private final InferenceApi inferenceApi;
+
+    /**
+     * Constructs an instance of {@link Inference} class.
+     *
+     * @param apiClient The ApiClient object used to configure the API connection.
+     */
+    public Inference(ApiClient apiClient) {
+        inferenceApi = new InferenceApi(apiClient);
+    }
+
+    /**
+     * Sends input data and parameters to the embedding model and returns a list of embeddings.
+     *
+     * @param model The embedding model to use.
+     * @param parameters A map containing model-specific parameters.
+     * @param inputs A list of input strings to generate embeddings for.
+     * @return EmbeddingsList containing the embeddings for the provided inputs.
+     * @throws ApiException If the API call fails, an ApiException is thrown.
+     */
+    public EmbeddingsList embed(String model, Map<String, Object> parameters, List<String> inputs) throws ApiException {
+        EmbedRequestParameters embedRequestParameters = new EmbedRequestParameters();
+        parameters.forEach(embedRequestParameters::putAdditionalProperty);
+
+        EmbedRequest embedRequest = new EmbedRequest()
+                .model(model)
+                .parameters(embedRequestParameters)
+                .inputs(convertToEmbedInputs(inputs));
+
+        return inferenceApi.embed(embedRequest);
+    }
+
+    /**
+     * Converts a list of input strings to EmbedRequestInputsInner objects.
+     *
+     * @param inputs A list of input strings.
+     * @return A list of EmbedRequestInputsInner objects containing the input data.
+     */
+    private List<EmbedRequestInputsInner> convertToEmbedInputs(List<String> inputs) {
+        return inputs.stream()
+                .map(input -> new EmbedRequestInputsInner().text(input))
+                .collect(Collectors.toList());
+    }
+}
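The conversion step in `convertToEmbedInputs` is a plain stream mapping from each input string to a request object built with a fluent setter. A standalone sketch of that step, assuming a hypothetical `TextInput` class in place of the generated `EmbedRequestInputsInner` model (only the `text(...)` setter is taken from the diff; `getText()` is added here for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the generated EmbedRequestInputsInner model.
final class TextInput {
    private String text;

    // Fluent setter, mirroring the text(...) call used in the diff.
    TextInput text(String text) {
        this.text = text;
        return this;
    }

    String getText() {
        return text;
    }
}

final class InputConversionSketch {

    private InputConversionSketch() {
    }

    // Wraps each input string in its own request object, preserving order.
    static List<TextInput> toTextInputs(List<String> inputs) {
        return inputs.stream()
                .map(input -> new TextInput().text(input))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TextInput> converted = toTextInputs(
                Arrays.asList("The quick brown fox jumps over the lazy dog.", "Lorem ipsum"));
        for (TextInput input : converted) {
            System.out.println(input.getText());
        }
    }
}
```

Because the stream is sequential and collected into a list, the output order matches the input order, so the embedding at index `i` of the response can be paired with the string at index `i` of the request, assuming the API returns results in request order.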

src/main/java/io/pinecone/clients/Pinecone.java

Lines changed: 18 additions & 1 deletion
@@ -18,7 +18,12 @@
 
 /**
  * The Pinecone class is the main entry point for interacting with Pinecone via the Java SDK.
- * It is used to create, delete, and manage your indexes and collections.
+ * It is used to create, delete, and manage your indexes and collections, along with the inference api.
+ * Note that the Pinecone class instantiates a single shared {@link OkHttpClient} instance,
+ * which is used for both control plane and inference operations.The OkHttpClient performs best when you create a single
+ * `OkHttpClient` instance and reuse it for all of your HTTP calls. This is because each client holds its own connection
+ * pool and thread pools. Reusing connections and threads reduces latency and saves memory. Conversely, creating a
+ * client for each request wastes resources on idle pools.
  * <p>
  * To instantiate the Pinecone class, use the {@link Pinecone.Builder} class to pass
  * an API key and any other optional configuration.
@@ -871,6 +876,18 @@ public AsyncIndex getAsyncIndexConnection(String indexName) throws PineconeValid
         return new AsyncIndex(connection, indexName);
     }
 
+    /**
+     * A method to create and return a new instance of the {@link Inference} client.
+     * <p>
+     * This method initializes the Inference client using the current ApiClient
+     * from the {@link ManageIndexesApi}. The {@link Inference} client can then be used
+     * to interact with Pinecone's inference API.
+     * @return A new {@link Inference} client instance.
+     */
+    public Inference getInferenceClient() {
+        return new Inference(manageIndexesApi.getApiClient());
+    }
+
     PineconeConnection getConnection(String indexName) {
         return connectionsMap.computeIfAbsent(indexName, key -> new PineconeConnection(config));
     }
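`getInferenceClient()` builds a new `Inference` wrapper on every call but hands it the `ApiClient` already held by `manageIndexesApi`, so every wrapper reuses the same underlying HTTP client. The pattern can be sketched in isolation. `SharedHttpClient`, `InferenceWrapper`, and `SdkEntryPoint` below are hypothetical stand-ins for `ApiClient`, `Inference`, and `Pinecone`, and are not SDK classes.

```java
// Hypothetical stand-in for the shared ApiClient / OkHttpClient.
final class SharedHttpClient {
}

// Hypothetical stand-in for the Inference wrapper: it stores the client it is given.
final class InferenceWrapper {
    private final SharedHttpClient client;

    InferenceWrapper(SharedHttpClient client) {
        this.client = client;
    }

    SharedHttpClient client() {
        return client;
    }
}

// Hypothetical stand-in for the Pinecone entry point.
final class SdkEntryPoint {
    // Created once and reused by every wrapper this entry point hands out.
    private final SharedHttpClient client = new SharedHttpClient();

    // A new lightweight wrapper per call, always backed by the same client.
    InferenceWrapper getInferenceClient() {
        return new InferenceWrapper(client);
    }

    public static void main(String[] args) {
        SdkEntryPoint entryPoint = new SdkEntryPoint();
        InferenceWrapper first = entryPoint.getInferenceClient();
        InferenceWrapper second = entryPoint.getInferenceClient();

        System.out.println("distinct wrappers: " + (first != second));
        System.out.println("same client: " + (first.client() == second.client()));
    }
}
```

Under this design the wrappers are cheap to create, while the expensive resources (the connection pool and thread pools described in the Javadoc above) live in the one shared client.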
