
VoltageGPU API Reference

Complete documentation for integrating with the VoltageGPU platform. Access powerful GPU resources through our comprehensive REST API.

Base URL: https://api.voltagegpu.com/v1

🚀 Quick Start

Get started with the VoltageGPU AI Inference API in minutes. Here's how to make your first chat completion request:

# Chat Completions - OpenAI Compatible
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

# Response example:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1704715200,
  "model": "deepseek-ai/DeepSeek-R1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing great, thank you for asking..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "total_tokens": 67
  }
}

💡 Tip: Our API is OpenAI-compatible! You can use existing OpenAI SDKs by simply changing the base URL to https://api.voltagegpu.com/v1

🔐 Authentication

All API requests require authentication via Bearer token. Generate your API key from the Dashboard Settings.

# Include your API key in the Authorization header
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer vgpu_sk_xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-ai/DeepSeek-R1", "messages": [...]}'

# Or use with OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="vgpu_sk_xxxxxxxxxxxx",
    base_url="https://api.voltagegpu.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Hello!"}]
)

🔑 Keep your API key secure and never share it publicly. Rotate keys regularly from your dashboard.

📚 API Reference

Chat Completions

Generate conversational responses using state-of-the-art language models. OpenAI-compatible endpoint.

Method   Endpoint                 Description                Auth
POST     /v1/chat/completions     Create a chat completion   Required

Request Body Parameters

model         required   Model ID (e.g., "deepseek-ai/DeepSeek-R1")
messages      required   Array of message objects with role and content
max_tokens    optional   Maximum tokens to generate (default: 1024)
temperature   optional   Sampling temperature, 0-2 (default: 0.7)
stream        optional   Enable streaming responses (default: false)
top_p         optional   Nucleus sampling parameter (default: 1)
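The documented defaults can be captured in a small client-side helper that fills in unspecified parameters before sending a request. This is a sketch, not part of any official SDK; the parameter names and default values come from the table above:

```python
def build_chat_payload(model, messages, **overrides):
    """Build a /v1/chat/completions request body, filling in documented defaults."""
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": 1024,    # defaults per the parameter table above
        "temperature": 0.7,
        "stream": False,
        "top_p": 1,
    }
    payload.update(overrides)  # caller-supplied values win
    return payload

body = build_chat_payload(
    "deepseek-ai/DeepSeek-R1",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
)
```

The resulting dict can be passed as the JSON body of the curl request shown in the Quick Start.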

Text Completions

Generate text completions from a prompt. Legacy endpoint for non-chat models.

Method   Endpoint            Description                Auth
POST     /v1/completions     Create a text completion   Required

Embeddings

Generate vector embeddings for text. Useful for semantic search, clustering, and RAG applications.

Method   Endpoint           Description                  Auth
POST     /v1/embeddings     Create embeddings for text   Required

# Generate embeddings
curl -X POST "https://api.voltagegpu.com/v1/embeddings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-large-en-v1.5",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
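For the semantic-search use case mentioned above, embedding vectors are typically compared with cosine similarity. A minimal standard-library sketch; the 3-dimensional vectors here are toy stand-ins (real bge-large-en-v1.5 embeddings are much higher-dimensional):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding responses.
query_vec = [0.1, 0.3, 0.9]
doc_vec = [0.1, 0.2, 0.8]
score = cosine_similarity(query_vec, doc_vec)
```

A score near 1.0 indicates semantically similar text; ranking documents by this score is the core of a basic semantic-search or RAG retrieval step.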

Image Generation

Generate images from text prompts using Stable Diffusion, FLUX, and other image models.

Method   Endpoint                   Description                 Auth
POST     /v1/images/generations     Generate images from text   Required

# Generate an image
curl -X POST "https://api.voltagegpu.com/v1/images/generations" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.1-schnell",
    "prompt": "A beautiful sunset over mountains, digital art",
    "n": 1,
    "size": "1024x1024"
  }'
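OpenAI-compatible image endpoints typically return each generated image as either a URL or a base64-encoded payload (`b64_json`); which of the two VoltageGPU returns is an assumption here, not confirmed by this document. A sketch for saving a base64 payload to disk:

```python
import base64
from pathlib import Path

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload, write it to disk, return bytes written."""
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)

# Toy payload standing in for response["data"][0]["b64_json"].
fake_png = base64.b64encode(b"\x89PNG...").decode()
n = save_b64_image(fake_png, "sunset.png")
```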

Models

List and retrieve information about available AI models.

Method   Endpoint           Description                 Auth
GET      /v1/models         List all available models   Required
GET      /v1/models/:id     Get model details           Required
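With the OpenAI SDK, `client.models.list()` returns the catalog; a common follow-up is filtering model IDs by namespace. A sketch over IDs taken from the catalog in this document (pure filtering, so it runs offline):

```python
def filter_models(model_ids, prefix):
    """Return the model IDs that belong to a given namespace prefix."""
    return [m for m in model_ids if m.startswith(prefix + "/")]

# IDs taken from the model catalog below.
catalog = [
    "deepseek-ai/DeepSeek-R1",
    "deepseek-ai/DeepSeek-V3",
    "Qwen/Qwen2.5-72B-Instruct",
    "black-forest-labs/FLUX.1-schnell",
]
deepseek_models = filter_models(catalog, "deepseek-ai")
```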

🧠 Available Models

Access 140+ state-of-the-art AI models through our unified API:

💬 Large Language Models

  • deepseek-ai/DeepSeek-R1
  • deepseek-ai/DeepSeek-V3
  • Qwen/Qwen2.5-72B-Instruct
  • meta-llama/Llama-3.3-70B-Instruct
  • mistralai/Mixtral-8x22B-Instruct-v0.1

🎨 Image Generation

  • black-forest-labs/FLUX.1-schnell
  • black-forest-labs/FLUX.1-dev
  • stabilityai/stable-diffusion-xl-base-1.0

🔍 Embeddings

  • BAAI/bge-large-en-v1.5
  • sentence-transformers/all-MiniLM-L6-v2

🎬 Video Generation

  • Lightricks/LTX-Video
  • genmo/mochi-1-preview

📋 View the full model catalog at voltagegpu.com/models

📦 SDK Integration

Use our API with popular SDKs by simply changing the base URL:

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="vgpu_sk_xxxxxxxxxxxx",
    base_url="https://api.voltagegpu.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)

JavaScript/TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'vgpu_sk_xxxxxxxxxxxx',
  baseURL: 'https://api.voltagegpu.com/v1'
});

const response = await client.chat.completions.create({
  model: 'deepseek-ai/DeepSeek-R1',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing' }
  ],
  max_tokens: 1024
});

console.log(response.choices[0].message.content);

⚡ Streaming Responses

Enable real-time streaming for chat completions:

# Enable streaming with stream: true
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Write a poem"}],
    "stream": true
  }'

# Response is Server-Sent Events (SSE):
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"The"}}]}
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sun"}}]}
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sets"}}]}
...
data: [DONE]
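Each SSE line above carries a JSON chunk whose `choices[0].delta.content` holds the next text fragment, and the terminal `data: [DONE]` line is a sentinel, not JSON. A sketch that reassembles the streamed text from raw SSE lines (the sample lines mirror the response above; with the OpenAI SDK, passing `stream=True` yields these chunks as objects you can iterate over directly):

```python
import json

def collect_stream(sse_lines):
    """Concatenate delta content from 'data:' SSE lines, stopping at [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

lines = [
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"The"}}]}',
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sun"}}]}',
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sets"}}]}',
    "data: [DONE]",
]
text = collect_stream(lines)
```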

⚠️ Error Handling

All API errors follow a consistent JSON format:

{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key",
    "status": 401
  }
}

Common Status Codes

200   Success - Request completed successfully
400   Bad Request - Invalid parameters
401   Unauthorized - Invalid or missing API key
403   Forbidden - Insufficient credits or permissions
404   Not Found - Model or resource not found
429   Too Many Requests - Rate limit exceeded
500   Internal Server Error - Server-side issue
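Of the codes above, 429 and 500 are usually worth retrying with exponential backoff, while 4xx client errors are fatal. A minimal sketch; the status codes come from the table above, but the backoff schedule is an illustrative choice, not a documented requirement:

```python
RETRYABLE = {429, 500}  # rate limit and server-side errors

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base * 2^attempt seconds, capped."""
    return min(base * (2 ** attempt), cap)

def should_retry(status, attempt, max_attempts=5):
    """Retry only transient errors, up to a fixed number of attempts."""
    return status in RETRYABLE and attempt < max_attempts
```

A request loop would call `should_retry` after each failure and sleep for `backoff_delay(attempt)` before the next try.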

🚦 Rate Limits & Pricing

Our API uses a pay-per-token pricing model with competitive rates:

  • Rate Limits: 1000 requests per minute (contact us for higher limits)
  • Pricing: Based on model and token usage
  • Billing: Deducted from your account balance in real-time

Rate limit headers are included in all responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1704715200
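X-RateLimit-Reset is a Unix timestamp, so when X-RateLimit-Remaining reaches 0 a client can sleep until the reset time. A sketch using the header names from the example above (it assumes the header values are plain integers, which this document implies but does not state explicitly):

```python
def seconds_until_reset(headers, now):
    """How long to wait before the next request, based on rate-limit headers."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    reset = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)

headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1704715200"}
wait = seconds_until_reset(headers, now=1704715190)
```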

💰 Check your usage and add credits at voltagegpu.com/billing

🆘 Support & Resources