
Commit df4741c

Update 2023-09-13-text-embedding-and-cosine-similarity.markdown
Fixed the pooling sentences
1 parent a8235b2 commit df4741c

File tree

1 file changed (+1, -1 lines changed)


_posts/2023-09-13-text-embedding-and-cosine-similarity.markdown

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ The following is a two-dimensional graph showing sample embedding vectors for ca
 <figcaption>Example of Car, Cat and Dog embedding vectors and the cosine similarity between cat and dog as well as between cat and car.</figcaption>
 </figure>
 
-While cosine similarity has a range from -1.0 to 1.0, users of the OpenAI embedding API will typically not see values greater than 0.4. This is a side effect of max pooling, a technique often used with neural networks to reduce long input, such as text, down to the highlights, allowing the network to focus on the important parts of the data. In this case, it's an efficient compression technique for processing natural language in the embedding algorithm. A thorough explanation of max pooling and the reason for the shift toward the positive is beyond the scope of this article.
+While cosine similarity has a range from -1.0 to 1.0, users of the OpenAI embedding API will typically not see values less than 0.4. A thorough explanation of the reasons behind this is beyond the scope of this article, but you can learn more by searching for articles about text embedding pooling.
 
 ## Obtaining Embeddings and Cosine Similarity
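For reference, the cosine similarity the changed paragraph refers to is the dot product of two vectors divided by the product of their magnitudes. A minimal sketch in plain Python (the `cat` and `dog` vectors are hypothetical 2-D values in the spirit of the post's figure, not taken from it):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1.0 to 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical unit-length "cat" and "dog" embedding vectors.
cat = [0.8, 0.6]
dog = [0.6, 0.8]
print(cosine_similarity(cat, dog))  # close to 1.0: the vectors point in similar directions
```

Real embedding vectors from an API have hundreds or thousands of dimensions, but the formula is identical.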
