NVIDIA Partner
Limited Stock

GPU Servers for AI and Machine Learning

High-performance GPU servers with NVIDIA RTX 6000 Server Edition and H200. Specifically designed for AI training, inference, deep learning, and HPC workloads.

96GB
vRAM per GPU
8x
GPUs per Server
768GB
Total vRAM

NVIDIA GPU Technologies

Enterprise-grade accelerators for the most demanding workloads

Up to 8 GPUs
RTX 6000 Server Edition
  • 96GB GDDR7 ECC Memory
  • Blackwell Architecture
  • PCIe Gen5 x16
  • Up to 768GB total vRAM
  • Server-Grade Reliability
Ideal for: LLM Inference, GenAI, AI Training

HPC Enterprise
NVIDIA H200
  • 141GB HBM3e Memory
  • 4.8 TB/s Memory Bandwidth
  • Hopper Architecture
  • SXM5 / NVLink
  • Multi-Instance GPU (MIG)
Ideal for: Large-Scale LLM Training, HPC, Scientific Computing

GPU Server Configurations

Pre-configured GPU servers ready for AI workloads

Inference
GPU Starter
NVIDIA
2x RTX 6000 Server 96GB
192GB Total vRAM
Intel Xeon 6740E 64-Core
128 Threads @ 2.4 GHz
256 GB DDR5 ECC
Up to 1.5 TB
2x 1.92TB NVMe SSD
RAID 1
25 Gbps Unmetered
Dual Port
1,999 EUR/mo

Training
POPULAR
GPU Pro
NVIDIA
4x RTX 6000 Server 96GB
384GB Total vRAM
Intel Xeon 6740E 64-Core
128 Threads @ 2.4 GHz
512 GB DDR5 ECC
Up to 3 TB
4x 3.84TB NVMe SSD
RAID 10
100 Gbps Unmetered
Dual Port
3,499 EUR/mo

Maximum Performance
GPU Max
NVIDIA
8x RTX 6000 Server 96GB
768GB Total vRAM
Intel Xeon 6740E 64-Core
128 Threads @ 2.4 GHz
1 TB DDR5 ECC
Up to 4 TB
8x 7.68TB NVMe SSD
RAID 10
200 Gbps Unmetered
Dark Fiber Available
5,999 EUR/mo
HPC Enterprise

NVIDIA H200 Servers

H200 accelerators with 141GB HBM3e for large-scale model training, scientific computing, and the most demanding HPC workloads.

H200 141GB HBM3e
4.8 TB/s Bandwidth
NVLink
Multi-Instance GPU (MIG)
Pricing: Contact Us
View GPU Servers

Compatible AI Models

Inference performance estimates with vLLM. Tok/s figures represent per-user generation speed.

Note: actual performance varies with context length, batch size, and quantization.
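The per-model vRAM figures in the tables below follow, to a first approximation, from parameter count times bytes per parameter. A minimal sketch of that arithmetic (weights only; KV cache and activations add overhead that depends on context length and batch size):

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Weights only -- KV cache and activations consume additional vRAM,
# how much depends on context length, batch size, and quantization.
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}

def weight_vram_gb(params_billions: float, dtype: str) -> float:
    """Approximate GB of vRAM needed just to hold the model weights."""
    return params_billions * BYTES_PER_PARAM[dtype]

print(weight_vram_gb(8, "FP16"))    # Llama 3.1 8B   -> 16.0  (~16GB in the table)
print(weight_vram_gb(70, "FP16"))   # Llama 3.1 70B  -> 140.0 (~140GB)
print(weight_vram_gb(405, "FP8"))   # Llama 3.1 405B -> 405.0 (~405GB)
```

This is why, for example, a 70B model in FP16 fits comfortably on the 192GB GPU Starter but leaves little room for long contexts or large batches.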
GPU Starter
2x RTX 6000 — 192GB vRAM
Llama 3.1 8B · FP16 · ~16GB vRAM · ~110 tok/s
Mistral 7B · FP16 · ~14GB vRAM · ~120 tok/s
Mixtral 8x7B · FP16 · ~93GB vRAM · ~45 tok/s
Llama 3.1 70B · FP16 · ~140GB vRAM · ~30 tok/s
Qwen2.5 72B · FP8 · ~72GB vRAM · ~40 tok/s
Ideal for inference of models up to 70B and for serving multiple 7-8B models in parallel.
GPU Pro
POPULAR
4x RTX 6000 — 384GB vRAM
Llama 3.1 70B · FP16 · ~140GB vRAM · ~55 tok/s
Mixtral 8x22B · FP16 · ~262GB vRAM · ~25 tok/s
Llama 3.1 405B · INT4 · ~203GB vRAM · ~18 tok/s
DeepSeek-V3 · INT4 · ~170GB vRAM · ~35 tok/s
Qwen2.5 72B · FP16 · ~144GB vRAM · ~50 tok/s
Can run 405B models with quantization, plus multiple concurrent 70B models.
GPU Max
8x RTX 6000 — 768GB vRAM
Llama 3.1 405B · FP16 · ~810GB vRAM · ~30 tok/s
Llama 3.1 405B · FP8 · ~405GB vRAM · ~45 tok/s
DeepSeek-V3 · FP8 · ~340GB vRAM · ~30 tok/s
Falcon 180B · FP16 · ~360GB vRAM · ~35 tok/s
3x Llama 3.1 70B · FP16 · ~420GB vRAM · ~40 tok/s each
Capacity for 405B models in full FP16 or multiple simultaneous 70B instances for high throughput.
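Which tier a given model lands on follows from the same arithmetic: divide the weight footprint, plus some headroom for KV cache, by the 96GB each RTX 6000 Server Edition card provides. A rough sizing sketch, where the 15% headroom figure is an assumption rather than a measured value:

```python
import math

GPU_VRAM_GB = 96  # RTX 6000 Server Edition, per the specs above

def gpus_needed(weight_gb: float, headroom: float = 1.15) -> int:
    """Minimum GPU count whose combined vRAM holds the weights plus
    ~15% headroom for KV cache (assumed; varies with context/batch).
    Note: tensor parallelism in serving stacks like vLLM typically
    wants a power-of-two GPU count, so round up accordingly."""
    return math.ceil(weight_gb * headroom / GPU_VRAM_GB)

print(gpus_needed(140))  # Llama 3.1 70B FP16  -> 2 (GPU Starter)
print(gpus_needed(262))  # Mixtral 8x22B FP16  -> 4 (GPU Pro)
print(gpus_needed(405))  # Llama 3.1 405B FP8  -> 5, served on the 8-GPU GPU Max tier
```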
Aggregate Throughput with Batching

With vLLM and continuous batching, aggregate throughput scales with the number of concurrent users:

1,500+ tok/s · 8B model, batched
400+ tok/s · 70B model, batched
150+ tok/s · 405B model, batched

High-Performance GPU Infrastructure

Every GPU server includes NVIDIA drivers, CUDA, premium connectivity, and 24/7 support at no additional cost.

NVIDIA AI Enterprise (Optional)
DDoS Protection
Unlimited Traffic
24/7 Support
CUDA 12.x

Latest Generation GPUs

  • NVIDIA RTX 6000 Server Edition
  • Up to 8 GPUs per server
  • Up to 768GB total vRAM

High-Speed Network

  • Up to 200 Gbps connectivity
  • Dark fiber available
  • DDoS protection included

NVMe Storage

  • High-performance NVMe SSDs
  • Up to 61TB per server
  • Hardware RAID

Pre-installed AI Stack

  • CUDA, cuDNN, TensorRT
  • Optimized NGC containers
  • Pre-configured NVIDIA drivers

Hardware SLA

  • Hardware replacement within 4 hours
  • Spare parts kept permanently on-site
  • Proactive 24/7 monitoring

Built-in Security

  • Secure Boot
  • Volumetric DDoS protection
  • Private VLAN available

AI & HPC Use Cases

GPU servers optimized for modern AI workloads

LLM Training

Train large language models with multi-GPU clusters

GenAI Inference

Deploy ChatGPT-like models at scale

Computer Vision

Image recognition and video analysis

Scientific Computing

Drug discovery, molecular dynamics, simulations

Voice AI

Speech recognition and synthesis

3D Rendering

Ray tracing and real-time rendering

Data Analytics

Accelerated data processing with RAPIDS

Blockchain

Web3 and cryptographic workloads

Pre-installed AI Software Stack

Every GPU server comes with pre-installed NVIDIA tools. NVIDIA AI Enterprise available as an option.

NVIDIA AI Enterprise (Optional)
CUDA
Toolkit 12.x
cuDNN
Deep Neural Networks
TensorRT
Inference Optimizer
Triton
Inference Server
NCCL
Multi-GPU Communication
RAPIDS
Data Science
PyTorch
NGC Container
TensorFlow
NGC Container
vLLM
LLM Serving

Operating Systems

Choose your operating system. All images come pre-installed with NVIDIA drivers and the AI stack, ready to use.

Ubuntu Server
22.04 / 24.04 LTS
Recommended for AI
Debian
11 / 12
Rocky Linux
8 / 9
AlmaLinux
8 / 9
Windows Server
2022 / 2025

Datacenter Location

All our GPU servers are hosted in our own data center in Madrid, with Tier III+ certification and direct connectivity to major Internet exchange points.

Tier III+
DC Classification
N+1
Power Redundancy
400G
Network Capacity
24/7
Surveillance & Security
Discover our facilities
EDH Madrid Data Center

Frequently Asked Questions

We answer your questions about GPU servers

What NVIDIA GPUs are available?
We offer NVIDIA RTX 6000 Server Edition (96GB GDDR7 ECC, Blackwell architecture) with up to 8 GPUs per server reaching 768GB of vRAM, and NVIDIA H200 (141GB HBM3e, Hopper architecture) for the most demanding enterprise HPC workloads.
What NVIDIA software is included?
All GPU servers include CUDA Toolkit 12.x, cuDNN, TensorRT, NCCL for multi-GPU communication, and pre-installed NVIDIA drivers. Optionally, NVIDIA AI Enterprise with Triton Inference Server, RAPIDS for data science, and NGC containers optimized for PyTorch, TensorFlow, and vLLM can be added.
Can I customize the GPU server configuration?
Yes. In addition to pre-defined configurations, we can build custom GPU servers: number and model of GPUs, processor, RAM, NVMe storage, network, and dark fiber. Contact sales for custom configurations.
Which operating systems are supported on GPU servers?
We offer Ubuntu Server (recommended for AI), Debian, Rocky Linux, AlmaLinux, and Windows Server. All include pre-installed NVIDIA drivers. Ubuntu Server is the most popular choice for AI and machine learning workloads.
What network connectivity do GPU servers offer?
GPU servers include unmetered connectivity from 25 Gbps up to 200 Gbps. For distributed multi-node training we offer dedicated dark fiber. All servers include DDoS protection, with private VLANs available.
How long does GPU server provisioning take?
Pre-defined catalog configurations are deployed in 24-48 hours with the operating system and NVIDIA stack pre-installed. Custom configurations may require 5-10 business days depending on hardware availability.
Start Your AI Journey

Ready to Accelerate Your AI Workloads?

Get expert guidance to select the right GPU configuration for your use case.