Embeddings
Embeddings are numerical representations of text strings. Comparing two embeddings gives a measure of how similar the underlying texts are, which makes embeddings useful for the following tasks:
- Search (results are sorted by relevance to the query)
- Clustering (grouping text strings by similarity)
- Recommendations (recommending items with similar text strings)
- Anomaly detection (finding items that differ significantly from others)
- Diversity measurement (analyzing similarity distribution)
- Classification (classifying text strings based on their similarity to labels)
An embedding is a vector (list) of numbers. The distance between two vectors measures their degree of similarity: a small distance indicates high similarity, and a large distance indicates low similarity.
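The text above does not say which distance is meant. A common choice for text embeddings is cosine distance, and the sketch below assumes it. The three-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """Small distance means high similarity, large distance means low similarity."""
    return 1.0 - cosine_similarity(a, b)

# Toy vectors invented for this example (not real model output)
soup = [0.9, 0.1, 0.0]
borscht = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

print(cosine_distance(soup, borscht))  # small: the vectors point the same way
print(cosine_distance(soup, car))      # large: the vectors are nearly orthogonal
```

Euclidean distance is another option. For vectors normalized to unit length, both metrics produce the same ordering.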
Creating Embeddings for Single and Multiple Text Objects
- Python (LangChain)
- Python (OpenAI client)
- cURL
# pip install langchain-openai (if you don't have this package yet)
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(
    model="Compressa-Embedding",
    base_url="http://localhost:5000/v1",
    api_key="Your_API_key_Compressa",
)
# Create embedding for a single query
query_embedding = embeddings.embed_query("How to cook borscht?")
# Create embeddings for multiple documents
docs_embeddings = embeddings.embed_documents([
    "Borscht is a traditional Slavic soup",
    "Beets are needed to make borscht",
    "Borscht is usually served with sour cream",
    "Meat is often added to borscht",
    "Borscht has a characteristic red color",
])
# pip install openai (if you don't have this package yet)
from openai import OpenAI
client = OpenAI(
    api_key="Your_API_key_Compressa",
    base_url="http://localhost:5000/v1",
)
# Create embedding for a single query
embedding = client.embeddings.create(
    model="Compressa-Embedding",
    input="How to cook borscht?",
    encoding_format="float",
)
# Create embeddings for multiple documents
docs = [
    "Borscht is a traditional Slavic soup",
    "Beets are needed to make borscht",
    "Borscht is usually served with sour cream",
    "Meat is often added to borscht",
    "Borscht has a characteristic red color",
]
embeddings = client.embeddings.create(
    model="Compressa-Embedding",
    input=docs,
    encoding_format="float",
)
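The `embedding` and `embeddings` objects returned by the OpenAI client calls are response wrappers, not bare vectors. The sketch below shows one way to pull the vectors out. It assumes the server returns an OpenAI-style response whose `data` items carry `index` and `embedding` fields. The helper name `vectors_from_response` is hypothetical, and a stand-in object replaces the real response so the snippet runs without a server.

```python
from types import SimpleNamespace

def vectors_from_response(response):
    """Return the embedding vectors in input order.

    Assumes an OpenAI-style response: response.data is a list of items,
    each with an integer .index and a list of floats in .embedding.
    """
    items = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in items]

# Stand-in for a real response (hypothetical values), deliberately out of order
fake_response = SimpleNamespace(data=[
    SimpleNamespace(index=1, embedding=[0.3, 0.4]),
    SimpleNamespace(index=0, embedding=[0.1, 0.2]),
])

vectors = vectors_from_response(fake_response)
print(len(vectors), "vectors of dimension", len(vectors[0]))
```

With a live server, `vectors_from_response(embeddings)` would return one vector per entry in `docs`, and `vectors_from_response(embedding)[0]` would be the query vector.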
curl -X 'POST' \
  'http://localhost:5000/v1/embeddings' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Your_API_key_Compressa' \
  -d '{
    "model": "Compressa-Embedding",
    "input": ["text_one", "text_two"]
  }'
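Once a query vector and document vectors are available, the search task from the list at the top of this page reduces to sorting documents by similarity to the query. The following is a minimal sketch, assuming cosine similarity as the relevance score. The function name `rank_documents` and the toy vectors are hypothetical and stand in for real model output.

```python
import math

def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs, highest cosine similarity first."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0:
            raise ValueError("cannot normalize a zero vector")
        return [x / norm for x in v]

    q = normalize(query_vec)
    scored = []
    for i, doc in enumerate(doc_vecs):
        d = normalize(doc)
        # Dot product of unit vectors equals their cosine similarity
        scored.append((i, sum(x * y for x, y in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors invented for this example (not real model output)
query_vec = [0.9, 0.1, 0.1]
doc_vecs = [
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]

ranking = rank_documents(query_vec, doc_vecs, top_k=2)
for doc_index, score in ranking:
    print(doc_index, round(score, 3))
```

For larger collections, a vector index or a matrix library would likely replace this linear scan. The loop is kept explicit here for readability.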