Server Requirements

Specifications

To install the Compressa platform, a server with the following specifications is recommended:

Linux server
3 A100 40GB GPUs (graphics cards)
- GPU 1: LLM
- GPU 2: embeddings
- GPU 3: ETL, Rerank, Audio
8 CPU threads
160 GB RAM
1 TB disk space

It is possible to install the embedding, reranking, and audio models on one GPU, but performance and reliability may be lower.

If lower performance or answer quality is acceptable, or if you won't install this module, the server requirements can be reduced.

This configuration has load limitations. When scaling, you may need to expand available resources.
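
As a rough pre-install check, the recommended figures above can be compared with what the server itself reports. The sketch below is an illustration, not part of the Compressa tooling: the `check` and `report` helper names are hypothetical, it assumes a Linux host where `nvidia-smi` is available once the driver is installed, and it checks only the number of GPUs, not their model or memory size. Reported RAM and filesystem sizes are usually slightly below the nominal hardware figures, so a LOW result close to the threshold deserves a manual look.

```shell
#!/bin/sh
# Sketch: compare this server with the recommended specifications.
# Thresholds come from the list above: 3 GPUs, 8 CPU threads, 160 GB RAM, 1 TB disk.

# check NAME ACTUAL REQUIRED
# Prints one status line; returns 1 when ACTUAL is below REQUIRED.
check() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (recommended: $3)"
  else
    echo "LOW  $1: $2 (recommended: $3)"
    return 1
  fi
}

report() {
  status=0
  # Count GPUs listed by the driver; 0 if nvidia-smi is missing or fails.
  gpus=$(nvidia-smi -L 2>/dev/null | grep -c '^GPU ')
  # Logical CPUs available to this process.
  cpus=$(nproc 2>/dev/null || echo 0)
  # MemTotal is reported in KiB; convert to decimal GB.
  ram_gb=$(awk '/^MemTotal:/ {printf "%d", $2 * 1024 / 1e9}' /proc/meminfo 2>/dev/null)
  # Size of the root filesystem in decimal GB (-P keeps each entry on one line).
  disk_gb=$(df -Pk / | awk 'NR==2 {printf "%d", $2 * 1024 / 1e9}')

  check "GPUs" "${gpus:-0}" 3 || status=1
  check "CPU threads" "${cpus:-0}" 8 || status=1
  check "RAM (GB)" "${ram_gb:-0}" 160 || status=1
  check "Disk (GB)" "${disk_gb:-0}" 1000 || status=1
  return $status
}

report || echo "Some values are below the recommended configuration."
```

The disk check looks only at the root filesystem; if the platform's data will live on another mount point, replace `/` with that path.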

CUDA Drivers

You need to install the latest compatible drivers.

Note: The default CUDA driver version can be installed using the following commands:

sudo apt update
sudo apt install software-properties-common -y
sudo apt install ubuntu-drivers-common -y
sudo ubuntu-drivers autoinstall
sudo apt install nvidia-cuda-toolkit
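
After installation, it helps to confirm that the driver and the toolkit are actually visible. This is a minimal sketch, assuming the commands above completed and the machine was rebooted if the driver was newly installed. `have_cmd` and `show_cuda_status` are hypothetical helper names; `nvidia-smi` and `nvcc` are the standard NVIDIA tools.

```shell
#!/bin/sh
# Sketch: confirm that the NVIDIA driver and CUDA toolkit are installed.

# have_cmd NAME: succeeds when NAME is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

show_cuda_status() {
  if have_cmd nvidia-smi; then
    # One CSV line per GPU: index, model, driver version, total memory.
    nvidia-smi --query-gpu=index,name,driver_version,memory.total --format=csv,noheader
  else
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
  fi

  if have_cmd nvcc; then
    # Print only the line that names the toolkit release.
    nvcc --version | grep release
  else
    echo "nvcc not found: nvidia-cuda-toolkit is not installed or not on PATH"
  fi
}

show_cuda_status
```

If `nvidia-smi` is present but reports that it cannot communicate with the driver, a reboot is usually the first thing to try.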

Docker

Installation instructions for Ubuntu:
https://docs.docker.com/engine/install/ubuntu/

You need to install a version that supports Docker Compose V2.
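
To check whether the installed Docker already provides Compose V2, the reported version can be inspected. The sketch below assumes Compose is invoked as the `docker compose` plugin (V2) rather than the legacy `docker-compose` binary. `is_compose_v2` is a hypothetical helper that looks only at the major version number.

```shell
#!/bin/sh
# Sketch: detect whether Docker Compose V2 is available.

# is_compose_v2 VERSION: succeeds when the major version is 2 or higher.
# Accepts strings such as "2.24.5" or "v2.24.5".
is_compose_v2() {
  version=${1#v}          # drop an optional leading "v"
  major=${version%%.*}    # keep the text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not a usable version
  esac
  [ "$major" -ge 2 ]
}

# "docker compose version --short" prints only the version number.
compose_version=$(docker compose version --short 2>/dev/null)

if is_compose_v2 "$compose_version"; then
  echo "Docker Compose V2 found: $compose_version"
else
  echo "Docker Compose V2 not found; follow the Docker installation instructions"
fi
```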

Nvidia Container Toolkit

Installation instructions for Linux:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
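
Once the toolkit is installed, a quick way to confirm that containers can see the GPUs is to run `nvidia-smi` inside a container. The sketch below assumes Docker has already been configured for the NVIDIA runtime as described in the guide linked above, and that the current user is allowed to run `docker` (otherwise run the script with `sudo`). `gpu_smoke_test` is a hypothetical helper name, and `ubuntu` is used only as a convenient public image.

```shell
#!/bin/sh
# Sketch: check that a container can access the GPUs.

# gpu_smoke_test [IMAGE]: runs nvidia-smi inside a throwaway container.
gpu_smoke_test() {
  image=${1:-ubuntu}
  # --rm removes the container afterwards; --gpus all exposes every GPU.
  if docker run --rm --gpus all "$image" nvidia-smi; then
    echo "GPU access from containers: OK"
  else
    echo "GPU access from containers: FAILED"
    return 1
  fi
}

if command -v docker >/dev/null 2>&1; then
  gpu_smoke_test || echo "Check the driver and Nvidia Container Toolkit setup."
else
  echo "docker not found; install Docker first"
fi
```

On a correctly configured server, the output should list the same GPUs that `nvidia-smi` shows on the host.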
