Skip to main content

Platform Management

Creating Pods (Requires Admin Authorization Token)

Reads config.yaml content and creates automatic compose file

curl -X 'POST' 'http://localhost:8080/create-pods/'

Deploy Pods (Requires Admin Authorization Token)

Reads config.yaml content and sends model deployment requests

curl -X 'POST' 'http://localhost:8080/deploy-pods/'

Running Autotests (Requires User Authorization Token)

Depending on deployed models, runs a set of autotests

curl -X 'POST' 'http://localhost:8080/test/'

Running Performance Tests (Requires User Authorization Token)

GET /v1/performance/

Get information about a running test (if any) or indication that the test is completed and result is saved.

Example:

curl -X 'GET' 'http://localhost:8080/v1/performance/?pod_id=0'

Response schema if test is in progress:

{"model":"Compressa-LLM","job":{"id":"b83cb091-5acd-4254-84b5-e1141ec01711","name":"performance TEST","status":"RUNNING","started_at":"2025-08-03T10:07:30.818815","exception_details":null,"retry":0},"message":null}

Response schema if test is completed

{"model":"Compressa-LLM","job":{"id":"b83cb091-5acd-4254-84b5-e1141ec01711","name":"performance TEST","status":"FINISHED","started_at":"2025-08-03T10:07:30.818815","exception_details":null,"retry":0},"message":"Results are available in /app/resources/performance_results"}

POST /v1/performance/

Start a test. Test parameters are passed in the request, such as number of examples, number of threads, report format.

Example

import requests


response = requests.post(
url="http://localhost:8080/v1/performance",
headers={
"accept": "application/json",
"Content-Type": "application/json"
},
json={
"pod_id": "0",
"payload": {
"report_mode": "pdf",
"num_tasks": 500,
"num_runners": 10,
"generate_prompts": False,
"num_prompts": 0,
"prompt_length": 0,
"max_tokens": 1000,
"report_file": "experiment",
"experiment_name": "Performance Test"
}
}
)

Response schema:

{
"id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"name": "string",
"status": "RUNNING",
"started_at": "2024-03-21T10:10:07.521Z"
}

After testing is complete, reports are saved in the selected format (pdf, md or csv) in the previously mounted RESOURCES_PATH folder

POST /v1/performance/interrupt/

Interrupt a running test.

Example:

curl -X 'POST' \
'http://localhost:8080/v1/performance/interrupt/?pod_id=0' \
-H 'accept: application/json' \
-d ''

Response schema:

{"id":"6be69577-ebe8-4062-a5cb-ad8085c11393","name":"performance TEST","status":"INTERRUPTED","started_at":"2025-08-03T10:18:00.368896","exception_details":null,"retry":0}

Running Quality Tests (Requires User Authorization Token)

GET /v1/metrics/

GET /v1/metrics/status

Get information about a running test or test result if the test is completed.

Example:

curl -X 'GET' \
'http://localhost:8080/v1/metrics/'

Response schema if test is in progress:

{"model":"Compressa-LLM","message":null,"job":{"id":"26908ef2-60b4-4211-9e7f-3a649c200419","name":"QUALITY TEST","status":"RUNNING","started_at":"2025-08-03T10:30:18.573467","exception_details":null,"retry":0}}

Response schema if test is completed

{"model":"Compressa-LLM","job":{"id":"b83cb091-5acd-4254-84b5-e1141ec01711","name":"observability TEST","status":"FINISHED","started_at":"2025-08-03T10:07:30.818815","exception_details":null,"retry":0},"message":"Results are available in /app/resources/quality_tests"}

POST /v1/metrics/

Start a test. Test parameters are passed in the request, such as dataset (standard or custom) and number of randomly selected examples (default - 10, if -1 is specified - all)

Request for standard dataset

curl -X 'POST' 'http://localhost:8080/v1/metrics/?dataset=medical&num_examples=5'

List of available standard datasets

Request for custom dataset

import requests

url = 'http://localhost:8080/v1/metrics/?dataset=custom'
payload = {
"question": [
"What is important when welding titanium and its alloys?",
"What is boric acid needed for in reactor coolant?",
"In what year did the Komsomolets submarine sink?",
],
"answers": [
"Ensure an inert atmosphere in the welding zone, prevent titanium oxidation",
"Boric acid is used as a liquid absorber to control the chain fission reaction",
"In 1989",
]
}

response = requests.post(
url,
headers={
"accept": "application/json",
"Content-Type": "application/json"
},
json=payload
)

Response schema:

{
{"id":"ce79eaee-4541-4341-ac21-b79d9bb7f9bb","name":"QUALITY TEST","status":"RUNNING","started_at":"2025-08-03T10:34:09.113636","exception_details":null,"retry":0}
}

After testing is complete, reports are saved in pdf, csv and json format in the previously mounted RESOURCES_PATH folder

POST /v1/metrics/interrupt/

Interrupt a running test.

Example:

curl -X 'POST' \
'http://localhost:8080/v1/metrics/interrupt/' \
-H 'accept: application/json' \
-d ''

Response schema:

{"id":"ce79eaee-4541-4341-ac21-b79d9bb7f9bb","name":"QUALITY TEST","status":"INTERRUPTED","started_at":"2025-08-03T10:34:09.113636","exception_details":null,"retry":0}