llms/mistral: Implementing embeddings.EmbedderClient for Mistral and an example with PGVector #1086

Open
wants to merge 8 commits into base: main

Conversation

mathiasb

@mathiasb mathiasb commented Dec 9, 2024

This PR implements the embeddings.EmbedderClient interface for Mistral, using the "mistral-embed" model. It also includes an example, heavily inspired by the openai-embeddings-example, of how it can be used with pgvector.

PR Checklist

  • Read the Contributing documentation.
  • Read the Code of conduct documentation.
  • Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
  • Describes the source of new concepts.
  • References existing implementations as appropriate.
  • Contains test coverage for new functions.
  • Passes all golangci-lint checks.

@mathiasb
Author

mathiasb commented Dec 9, 2024

I added a replace directive to my go.mod file for local development and was unsure how to handle it in the PR; I think it is the cause of the CI failures. I'm happy to take pointers on whether I should change something in my repo to fix it. KR

@mathiasb
Author

OK, I think this is ready for review. As mentioned earlier, any pointers or feedback are welcome, as this is my first PR in this project.

	"errors"
)

func ConvertFloat64ToFloat32(input []float64) []float32 {
Collaborator

I don't think this function needs to be exported.
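
For illustration, a minimal sketch of what the unexported variant might look like. The function body is not shown in the diff above, so the loop here is an assumption about what the conversion does; only the lowercase name is the point.

```go
package main

import "fmt"

// convertFloat64ToFloat32 starts with a lowercase letter, so it is only
// visible inside its own package. The body is an assumed implementation:
// the original function body is not part of the quoted diff.
func convertFloat64ToFloat32(input []float64) []float32 {
	output := make([]float32, len(input))
	for i, v := range input {
		output[i] = float32(v) // narrowing conversion; precision may be lost
	}
	return output
}

func main() {
	fmt.Println(convertFloat64ToFloat32([]float64{1, 2.5, -3}))
}
```

Tests living in the same package can still call the unexported name directly; tests in an external _test package cannot.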

allEmbds := make([][]float32, len(embsRes.Data))
for i, embs := range embsRes.Data {
	if len(embs.Embedding) == 0 {
		return nil, errors.New("empty embeddings")
Collaborator

This could be an exported error variable. var ErrEmptyEmbeddings = errors.New("...")

t.Run(tt.name, func(t *testing.T) {
	t.Parallel()
	output := sdk.ConvertFloat64ToFloat32(tt.input)
	if len(output) != len(tt.expected) {
Collaborator

We should use stretchr/testify in tests.

value := os.Getenv(envVar)

// Check if it is set (non-empty)
if value == "" {
Collaborator

Use t.Skip instead.


model, err := sdk.New()
if err != nil {
	panic(err)
Collaborator

Do not panic on errors. Use require.NoError(t, err) from testify instead.

if err != nil {
	panic(err)
}
t.Logf("Document embeddings: %v\n", docEmbeddings)
Collaborator

I don't think we need to include the log statements.

}
fmt.Println("store.SimilaritySearch1:\n", docs)

time.Sleep(2 * time.Second) // Don't trigger cloudflare
Collaborator

Is there no better solution for this?

2 participants