REST API
Compressa API (1.0.0)
Create Chat Completion
Request Body schema: application/json (required)
messages required | Array of ChatCompletionSystemMessageParam (object) or ChatCompletionUserMessageParam (object) or ChatCompletionAssistantMessageParam (object) or ChatCompletionToolMessageParam (object) or ChatCompletionFunctionMessageParam (object) or CustomChatCompletionMessageParam (object) (Messages) |
model required | string (Model) |
frequency_penalty | number or null (Frequency Penalty) Default: 0 |
logit_bias | object or null (Logit Bias) |
logprobs | boolean or null (Logprobs) Default: false |
top_logprobs | integer or null (Top Logprobs) Default: 0 |
max_tokens | integer or null (Max Tokens) |
n | integer or null (N) Default: 1 |
presence_penalty | number or null (Presence Penalty) Default: 0 |
response_format | ResponseFormat (object) or null |
seed | integer or null (Seed) |
stop | string or Array of strings or null (Stop) |
stream | boolean or null (Stream) Default: false |
stream_options | StreamOptions (object) or null |
temperature | number or null (Temperature) Default: 0.7 |
top_p | number or null (Top P) Default: 1 |
tools | Array of objects or null (Tools) |
tool_choice | "none" (string) or ChatCompletionNamedToolChoiceParam (object) or null (Tool Choice) Default: "none" |
user | string or null (User) |
best_of | integer or null (Best Of) |
use_beam_search | boolean or null (Use Beam Search) Default: false |
top_k | integer or null (Top K) Default: -1 |
min_p | number or null (Min P) Default: 0 |
repetition_penalty | number or null (Repetition Penalty) Default: 1 |
length_penalty | number or null (Length Penalty) Default: 1 |
early_stopping | boolean or null (Early Stopping) Default: false |
ignore_eos | boolean or null (Ignore Eos) Default: false |
min_tokens | integer or null (Min Tokens) Default: 0 |
stop_token_ids | Array of integers or null (Stop Token Ids) |
skip_special_tokens | boolean or null (Skip Special Tokens) Default: true |
spaces_between_special_tokens | boolean or null (Spaces Between Special Tokens) Default: true |
echo | boolean or null (Echo) Default: false If true, the new message will be prepended with the last message if they belong to the same role. |
add_generation_prompt | boolean or null (Add Generation Prompt) Default: true If true, the generation prompt is added to the chat template. This parameter is used by the chat template in the model's tokenizer config. |
add_special_tokens | boolean or null (Add Special Tokens) Default: false If true, special tokens (e.g. BOS) are added to the prompt on top of what the chat template adds. For most models the chat template already adds the special tokens, so this should stay false (the default). |
include_stop_str_in_output | boolean or null (Include Stop Str In Output) Default: false Whether to include the stop string in the output. Applies only when stop or stop_token_ids is set. |
guided_json | string or object or BaseModel (object) or null (Guided Json) If specified, the output will follow the JSON schema. |
guided_regex | string or null (Guided Regex) If specified, the output will follow the regex pattern. |
guided_choice | Array of strings or null (Guided Choice) If specified, the output will be exactly one of the choices. |
guided_grammar | string or null (Guided Grammar) If specified, the output will follow the context-free grammar. |
guided_decoding_backend | string or null (Guided Decoding Backend) If specified, overrides the server's default guided decoding backend for this request. If set, must be either 'outlines' or 'lm-format-enforcer'. |
guided_whitespace_pattern | string or null (Guided Whitespace Pattern) If specified, overrides the default whitespace pattern for guided JSON decoding. |
enforced_str | string or null (Enforced Str) |
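As a minimal sketch of how the guided decoding fields fit into a request body, the following builds a chat completion payload whose answer is restricted to a fixed set of choices. Only the field names come from the schema above. The helper names, the model name, and the `/v1/chat/completions` path are assumptions made for illustration (the path is guessed from OpenAI conventions and the host from the layout sample on this page).

```python
import json
import urllib.request

# Assumptions: the path and the model name are hypothetical placeholders.
API_URL = "https://compressa-api.mil-team.ru/v1/chat/completions"
MODEL = "your-model-name"


def build_chat_payload(question, choices, temperature=0.7):
    """Build a request body whose output is restricted to one of `choices`."""
    if not choices:
        raise ValueError("guided_choice needs at least one option")
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "Answer with a single label."},
            {"role": "user", "content": question},
        ],
        "temperature": temperature,
        "guided_choice": list(choices),
    }


def send(payload, token):
    """POST the payload as JSON. Not called here because it needs a real token."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_chat_payload(
    "Is this review positive? 'Great product.'",
    ["positive", "negative"],
)
print(json.dumps(payload, indent=2))
```

Optional fields that are left out of the body fall back to the defaults listed in the schema, so the payload only needs the fields it changes.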
Responses
Request samples
- Payload
{
  "messages": [
    {
      "content": "string",
      "role": "system",
      "name": "string"
    }
  ],
  "model": "string",
  "frequency_penalty": 0,
  "logit_bias": {
    "property1": 0,
    "property2": 0
  },
  "logprobs": false,
  "top_logprobs": 0,
  "max_tokens": 0,
  "n": 1,
  "presence_penalty": 0,
  "response_format": {
    "type": "text"
  },
  "seed": -9223372036854776000,
  "stop": "string",
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 0.7,
  "top_p": 1,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "string",
        "description": "string",
        "parameters": { }
      }
    }
  ],
  "tool_choice": "none",
  "user": "string",
  "best_of": 0,
  "use_beam_search": false,
  "top_k": -1,
  "min_p": 0,
  "repetition_penalty": 1,
  "length_penalty": 1,
  "early_stopping": false,
  "ignore_eos": false,
  "min_tokens": 0,
  "stop_token_ids": [
    0
  ],
  "skip_special_tokens": true,
  "spaces_between_special_tokens": true,
  "echo": false,
  "add_generation_prompt": true,
  "add_special_tokens": false,
  "include_stop_str_in_output": false,
  "guided_json": "string",
  "guided_regex": "string",
  "guided_choice": [
    "string"
  ],
  "guided_grammar": "string",
  "guided_decoding_backend": "string",
  "guided_whitespace_pattern": "string",
  "enforced_str": "string"
}
Response samples
- 200
- 422
null
Create Completion
Request Body schema: application/json (required)
model required | string (Model) |
prompt required | Array of Prompt (integers) or Array of Prompt (integers) or Prompt (string) or Array of Prompt (strings) (Prompt) |
best_of | integer or null (Best Of) |
echo | boolean or null (Echo) Default: false |
frequency_penalty | number or null (Frequency Penalty) Default: 0 |
logit_bias | object or null (Logit Bias) |
logprobs | integer or null (Logprobs) |
max_tokens | integer or null (Max Tokens) Default: 16 |
n | integer (N) Default: 1 |
presence_penalty | number or null (Presence Penalty) Default: 0 |
seed | integer or null (Seed) |
stop | string or Array of strings or null (Stop) |
stream | boolean or null (Stream) Default: false |
stream_options | StreamOptions (object) or null |
suffix | string or null (Suffix) |
temperature | number or null (Temperature) Default: 1 |
top_p | number or null (Top P) Default: 1 |
user | string or null (User) |
use_beam_search | boolean or null (Use Beam Search) Default: false |
top_k | integer or null (Top K) Default: -1 |
min_p | number or null (Min P) Default: 0 |
repetition_penalty | number or null (Repetition Penalty) Default: 1 |
length_penalty | number or null (Length Penalty) Default: 1 |
early_stopping | boolean or null (Early Stopping) Default: false |
stop_token_ids | Array of integers or null (Stop Token Ids) |
ignore_eos | boolean or null (Ignore Eos) Default: false |
min_tokens | integer or null (Min Tokens) Default: 0 |
skip_special_tokens | boolean or null (Skip Special Tokens) Default: true |
spaces_between_special_tokens | boolean or null (Spaces Between Special Tokens) Default: true |
truncate_prompt_tokens | integer or null (Truncate Prompt Tokens) |
include_stop_str_in_output | boolean or null (Include Stop Str In Output) Default: false Whether to include the stop string in the output. Applies only when stop or stop_token_ids is set. |
response_format | ResponseFormat (object) or null As in chat completion, this parameter specifies the output format. Only {'type': 'json_object'} and {'type': 'text'} are supported. |
guided_json | string or object or BaseModel (object) or null (Guided Json) If specified, the output will follow the JSON schema. |
guided_regex | string or null (Guided Regex) If specified, the output will follow the regex pattern. |
guided_choice | Array of strings or null (Guided Choice) If specified, the output will be exactly one of the choices. |
guided_grammar | string or null (Guided Grammar) If specified, the output will follow the context-free grammar. |
guided_decoding_backend | string or null (Guided Decoding Backend) If specified, overrides the server's default guided decoding backend for this request. If set, must be either 'outlines' or 'lm-format-enforcer'. |
guided_whitespace_pattern | string or null (Guided Whitespace Pattern) If specified, overrides the default whitespace pattern for guided JSON decoding. |
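The interaction between `stop`, `include_stop_str_in_output`, and `guided_regex` can be sketched as a small payload builder. This is an illustrative helper, not part of the API: the function name and the model name are hypothetical, and only the JSON field names are taken from the schema above.

```python
import json

MODEL = "your-model-name"  # hypothetical placeholder


def build_completion_payload(prompt, max_tokens=16, stop=None,
                             include_stop=False, guided_regex=None):
    """Build a completion request body, omitting optional fields that are unset."""
    payload = {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}
    if stop is not None:
        payload["stop"] = stop
        # The schema says this flag only takes effect when a stop condition is set,
        # so it is sent only together with `stop`.
        payload["include_stop_str_in_output"] = include_stop
    if guided_regex is not None:
        payload["guided_regex"] = guided_regex
    return payload


payload = build_completion_payload(
    "Order ID: ",
    max_tokens=8,
    stop=["\n"],
    guided_regex=r"[A-Z]{3}-\d{4}",
)
print(json.dumps(payload, indent=2))
```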
Responses
Request samples
- Payload
{
  "model": "string",
  "prompt": [
    0
  ],
  "best_of": 0,
  "echo": false,
  "frequency_penalty": 0,
  "logit_bias": {
    "property1": 0,
    "property2": 0
  },
  "logprobs": 0,
  "max_tokens": 16,
  "n": 1,
  "presence_penalty": 0,
  "seed": -9223372036854776000,
  "stop": "string",
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "suffix": "string",
  "temperature": 1,
  "top_p": 1,
  "user": "string",
  "use_beam_search": false,
  "top_k": -1,
  "min_p": 0,
  "repetition_penalty": 1,
  "length_penalty": 1,
  "early_stopping": false,
  "stop_token_ids": [
    0
  ],
  "ignore_eos": false,
  "min_tokens": 0,
  "skip_special_tokens": true,
  "spaces_between_special_tokens": true,
  "truncate_prompt_tokens": 1,
  "include_stop_str_in_output": false,
  "response_format": {
    "type": "text"
  },
  "guided_json": "string",
  "guided_regex": "string",
  "guided_choice": [
    "string"
  ],
  "guided_grammar": "string",
  "guided_decoding_backend": "string",
  "guided_whitespace_pattern": "string"
}
Response samples
- 200
- 422
null
The Compressa Embeddings API is an OpenAI-compatible API for creating embeddings.
Create Embedding
Request Body schema: application/json (required)
model required | string (Model) |
input required | Array of Input (integers) or Array of Input (integers) or Input (string) or Array of Input (strings) (Input) |
encoding_format | string or null (Encoding Format) Default: "float" |
dimensions | integer or null (Dimensions) |
user | string or null (User) |
additional_data | any or null (Additional Data) |
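A typical use of this endpoint is to embed several texts in one request and compare the resulting vectors. The sketch below builds such a request body and computes cosine similarity on a mocked response. It assumes the response follows the OpenAI embeddings shape (`data[i].index` and `data[i].embedding`), which this page does not show; the helper names and the model name are hypothetical.

```python
import math


def build_embedding_payload(texts, model="your-embedding-model"):
    """Build a request body that embeds a batch of strings."""
    return {"model": model, "input": list(texts), "encoding_format": "float"}


def extract_vectors(response):
    """Return embeddings in input order.

    Assumption: an OpenAI-style body {"data": [{"index": 0, "embedding": [...]}]}.
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


payload = build_embedding_payload(["first text", "second text"])

# Mocked response used instead of a live call; the values are made up.
mock_response = {
    "data": [
        {"index": 1, "embedding": [0.0, 1.0]},
        {"index": 0, "embedding": [1.0, 0.0]},
    ]
}
vectors = extract_vectors(mock_response)
print(cosine_similarity(vectors[0], vectors[1]))
```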
Responses
Request samples
- Payload
{
  "model": "string",
  "input": [
    0
  ],
  "encoding_format": "float",
  "dimensions": 0,
  "user": "string",
  "additional_data": { }
}
Response samples
- 200
- 422
null
Rerank
Request Body schema: application/json (required)
documents required | Array of strings (Documents) [ 1 .. 2048 ] items [ items <= 122880 characters ] |
query required | string (Query) <= 122880 characters |
return_documents | any (Return Documents) Default: false |
model | any (Model) Default: "default/not-specified" |
Responses
Request samples
- Payload
{
  "query": "string",
  "documents": [
    "string"
  ],
  "return_documents": false,
  "model": "default/not-specified"
}
Response samples
- 200
- 422
{
  "object": "rerank",
  "results": [
    {
      "relevance_score": null,
      "index": null,
      "document": { }
    }
  ],
  "model": null,
  "usage": {
    "prompt_tokens": null,
    "total_tokens": null
  },
  "id": null,
  "created": null
}
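The request limits and the response fields above can be combined into a small client-side helper. The sketch below checks the documented limits before sending and then picks the top-scoring documents from a response. It assumes that `index` refers to the position of a document in the submitted `documents` list, which the sample (all `null` values) does not confirm; the function names and the mocked scores are made up for illustration.

```python
MAX_DOCUMENTS = 2048   # from the schema: [ 1 .. 2048 ] items
MAX_CHARS = 122880     # from the schema: <= 122880 characters


def validate_rerank_request(query, documents):
    """Raise ValueError if the request would violate the documented limits."""
    if not 1 <= len(documents) <= MAX_DOCUMENTS:
        raise ValueError("documents must contain between 1 and 2048 items")
    if len(query) > MAX_CHARS:
        raise ValueError("query is too long")
    for position, document in enumerate(documents):
        if len(document) > MAX_CHARS:
            raise ValueError(f"document {position} is too long")
    return {"query": query, "documents": list(documents), "return_documents": False}


def top_documents(documents, response, k=3):
    """Return (document, score) pairs for the k highest relevance scores."""
    ranked = sorted(
        response["results"],
        key=lambda result: result["relevance_score"],
        reverse=True,
    )
    # Assumption: `index` points back into the submitted `documents` list.
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked[:k]]


documents = ["cats purr", "rust is a language", "dogs bark"]
request_body = validate_rerank_request("programming", documents)

# Mocked response used instead of a live call.
mock_response = {
    "object": "rerank",
    "results": [
        {"index": 0, "relevance_score": 0.1},
        {"index": 1, "relevance_score": 0.9},
        {"index": 2, "relevance_score": 0.4},
    ],
}
best = top_documents(documents, mock_response, k=2)
print(best)
```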
Summary
Description
Request Body schema: multipart/form-data (required)
files required | string <binary> The file to extract |
xml_keep_tags | boolean (Xml Keep Tags) Default: false If true, retains the XML tags in the output. Otherwise only the text within the tags is extracted. Only applies to partition_xml. |
languages | Array of strings (OCR Languages) Default: [] The languages present in the document, for use in partitioning and/or OCR |
ocr_languages | Array of strings (OCR Languages) Default: [] The languages present in the document, for use in partitioning and/or OCR |
skip_infer_table_types | Array of strings (Skip Infer Table Types) Default: [] The document types for which table extraction is skipped. |
Uncompressed Content Type (string) or Uncompressed Content Type (null) (Uncompressed Content Type) If the file is gzipped, use this content type after unzipping. | |
output_format | string (Output Format) Default: "application/json" Enum: "application/json" "text/csv" The format of the response. |
coordinates | boolean (Coordinates) Default: false If true, returns coordinates for each element. |
Content type (string) or Content type (null) (Content type) A hint about the content type to use (such as text/markdown) when there are problems processing a specific file. This value is a MIME type in the format type/subtype. | |
encoding | string (Encoding) Default: "utf-8" The encoding used to decode the text input. |
Hi Res Model Name (string) or Hi Res Model Name (null) (Hi Res Model Name) The name of the inference model used when strategy is hi_res. | |
include_page_breaks | boolean (Include Page Breaks) Default: false If true, the output includes page breaks if the file type supports them. |
pdf_infer_table_structure | boolean (Pdf Infer Table Structure) Default: true Deprecated. Use skip_infer_table_types to opt out of table extraction for any file type. If false and strategy=hi_res, no Table elements are extracted from PDF files, regardless of the contents of skip_infer_table_types. |
strategy | string (Strategy) Default: "auto" Enum: "fast" "hi_res" "auto" "ocr_only" The strategy to use for partitioning PDF and image files. Options are fast, hi_res, auto, and ocr_only. |
extract_image_block_types | Array of strings (Image block types to extract) Default: [] The types of elements to extract, for use in extracting image blocks as base64 encoded data stored in metadata fields |
unique_element_ids | boolean (unique_element_ids) Default: false When |
"by_title" (string) or Chunking Strategy (null) (Chunking Strategy) Use one of the supported strategies to chunk the returned elements. Currently supports: by_title | |
Combine Under N Chars (integer) or Combine Under N Chars (null) (Combine Under N Chars) If chunking strategy is set, combine elements until a section reaches a length of n chars. Default: 500 | |
max_characters | integer (Max Characters) Default: 500 If chunking strategy is set, cut off new sections after reaching a length of n chars (hard max). Default: 1500 |
multipage_sections | boolean (Multipage Sections) Default: true If chunking strategy is set, determines whether sections can span multiple pages. |
New after n chars (integer) or New after n chars (null) (New after n chars) If chunking strategy is set, cut off new sections after reaching a length of n chars (soft max). Default: 1500 | |
overlap | integer (Overlap) Default: 0 The length of a string ("tail") taken from the end of each chunk and prefixed to the next chunk to preserve context. By default, this applies only to split chunks, where an oversized element is divided into multiple chunks by text splitting. |
overlap_all | boolean (Overlap all) Default: false When |
PDF Starting Page Number (integer) or PDF Starting Page Number (null) (PDF Starting Page Number) When PDF is split into pages before sending it into the API, providing this information will allow the page number to be assigned correctly. |
Responses
Request samples
- Python
import requests

url = "https://compressa-api.mil-team.ru/v1/layout"
headers = {
    "Authorization": "Bearer TOKEN",
    "accept": "application/json",
}
# Form fields are sent as strings; list values become repeated fields.
data = {
    "xml_keep_tags": "false",
    "output_format": "application/json",
    "coordinates": "true",
    "strategy": "auto",
    "languages": ["rus", "eng"],
}

# Open the file in a context manager so the handle is closed after the upload.
with open("path/to/file.pdf", "rb") as f:
    response = requests.post(url, headers=headers, files={"files": f}, data=data)

print(response.json())
Response samples
- 200
- 422
[
  {
    "type": "string",
    "element_id": "string",
    "metadata": { },
    "text": "string"
  }
]
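Since the 200 response is a flat list of elements, a common next step is to group or join their text. The sketch below works on a mocked response with the four fields shown above. The helper names are hypothetical, and the `type` values ("Title", "NarrativeText", "Footer") are illustrative assumptions, since this page does not list the possible element types.

```python
from collections import defaultdict


def text_by_type(elements):
    """Group element texts by their `type` field, preserving document order."""
    grouped = defaultdict(list)
    for element in elements:
        grouped[element["type"]].append(element["text"])
    return dict(grouped)


def plain_text(elements, skip_types=()):
    """Join non-empty element texts with newlines, skipping unwanted types."""
    return "\n".join(
        element["text"]
        for element in elements
        if element["type"] not in skip_types and element["text"]
    )


# Mocked response used instead of a live call; type names are assumptions.
mock_elements = [
    {"type": "Title", "element_id": "a1", "metadata": {}, "text": "Report"},
    {"type": "NarrativeText", "element_id": "b2", "metadata": {}, "text": "First paragraph."},
    {"type": "Footer", "element_id": "c3", "metadata": {}, "text": "Page 1"},
]

grouped = text_by_type(mock_elements)
body = plain_text(mock_elements, skip_types=("Footer",))
print(body)
```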