
VoltageGPU API Reference

Complete documentation for integrating with the VoltageGPU platform. Access powerful GPU resources through our comprehensive REST API.

Base URL: https://api.voltagegpu.com/v1

🚀 Quick Start

Get started with the VoltageGPU AI Inference API in minutes. Here's how to make your first chat completion request:

# Chat Completions - OpenAI Compatible
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

# Response example:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1704715200,
  "model": "deepseek-ai/DeepSeek-R1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing great, thank you for asking..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 42,
    "total_tokens": 67
  }
}

💡 Tip: Our API is OpenAI-compatible! You can use existing OpenAI SDKs by simply changing the base URL to https://api.voltagegpu.com/v1

🔐 Authentication

All API requests require authentication via Bearer token. Generate your API key from the Dashboard Settings.

# Include your API key in the Authorization header
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer vgpu_sk_xxxxxxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-ai/DeepSeek-R1", "messages": [...]}'

# Or use with OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    api_key="vgpu_sk_xxxxxxxxxxxx",
    base_url="https://api.voltagegpu.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Hello!"}]
)

🔑 Keep your API key secure and never share it publicly. Rotate keys regularly from your dashboard.

📚 API Reference

Chat Completions

Generate conversational responses using state-of-the-art language models. OpenAI-compatible endpoint.

Method   Endpoint                 Description                Auth
POST     /v1/chat/completions     Create a chat completion   Required

Request Body Parameters

model         required   Model ID (e.g., "deepseek-ai/DeepSeek-R1")
messages      required   Array of message objects with role and content
max_tokens    optional   Maximum tokens to generate (default: 1024)
temperature   optional   Sampling temperature, 0-2 (default: 0.7)
stream        optional   Enable streaming responses (default: false)
top_p         optional   Nucleus sampling parameter (default: 1)
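The documented defaults can be captured in a small client-side helper that fills in unspecified parameters before sending a request. This is a sketch, not part of any official SDK; the parameter names and default values come from the table above:

```python
def build_chat_payload(model, messages, **overrides):
    """Build a /v1/chat/completions request body, filling in documented defaults."""
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": 1024,    # defaults per the parameter table above
        "temperature": 0.7,
        "stream": False,
        "top_p": 1,
    }
    payload.update(overrides)  # caller-supplied values win
    return payload

body = build_chat_payload(
    "deepseek-ai/DeepSeek-R1",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
)
```

The resulting dict can be passed as the JSON body of the curl request shown in the Quick Start.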

Text Completions

Generate text completions from a prompt. Legacy endpoint for non-chat models.

Method   Endpoint            Description                Auth
POST     /v1/completions     Create a text completion   Required

Embeddings

Generate vector embeddings for text. Useful for semantic search, clustering, and RAG applications.

Method   Endpoint           Description                  Auth
POST     /v1/embeddings     Create embeddings for text   Required

# Generate embeddings
curl -X POST "https://api.voltagegpu.com/v1/embeddings" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-large-en-v1.5",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
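For the semantic-search use case mentioned above, embedding vectors are typically compared with cosine similarity. A minimal standard-library sketch; the 3-dimensional vectors here are toy stand-ins (real bge-large-en-v1.5 embeddings are much higher-dimensional):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding responses.
query_vec = [0.1, 0.3, 0.9]
doc_vec = [0.1, 0.2, 0.8]
score = cosine_similarity(query_vec, doc_vec)
```

A score near 1.0 indicates semantically similar text; ranking documents by this score is the core of a basic semantic-search or RAG retrieval step.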

Image Generation

Generate images from text prompts using Stable Diffusion, FLUX, and other image models.

Method   Endpoint                   Description                 Auth
POST     /v1/images/generations     Generate images from text   Required

# Generate an image
curl -X POST "https://api.voltagegpu.com/v1/images/generations" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "black-forest-labs/FLUX.1-schnell",
    "prompt": "A beautiful sunset over mountains, digital art",
    "n": 1,
    "size": "1024x1024"
  }'
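OpenAI-compatible image endpoints typically return each generated image as either a URL or a base64-encoded payload (`b64_json`); which of the two VoltageGPU returns is an assumption here, not confirmed by this document. A sketch for saving a base64 payload to disk:

```python
import base64
from pathlib import Path

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload, write it to disk, return bytes written."""
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)

# Toy payload standing in for response["data"][0]["b64_json"].
fake_png = base64.b64encode(b"\x89PNG...").decode()
n = save_b64_image(fake_png, "sunset.png")
```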

Models

List and retrieve information about available AI models.

Method   Endpoint           Description                 Auth
GET      /v1/models         List all available models   Required
GET      /v1/models/:id     Get model details           Required
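With the OpenAI SDK, `client.models.list()` returns the catalog; a common follow-up is filtering model IDs by namespace. A sketch over IDs taken from the catalog in this document (pure filtering, so it runs offline):

```python
def filter_models(model_ids, prefix):
    """Return the model IDs that belong to a given namespace prefix."""
    return [m for m in model_ids if m.startswith(prefix + "/")]

# IDs taken from the model catalog below.
catalog = [
    "deepseek-ai/DeepSeek-R1",
    "deepseek-ai/DeepSeek-V3",
    "Qwen/Qwen2.5-72B-Instruct",
    "black-forest-labs/FLUX.1-schnell",
]
deepseek_models = filter_models(catalog, "deepseek-ai")
```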

🧠 Available Models

Access 140+ state-of-the-art AI models through our unified API:

💬 Large Language Models

  • deepseek-ai/DeepSeek-R1
  • deepseek-ai/DeepSeek-V3
  • Qwen/Qwen2.5-72B-Instruct
  • meta-llama/Llama-3.3-70B-Instruct
  • mistralai/Mixtral-8x22B-Instruct-v0.1

🎨 Image Generation

  • black-forest-labs/FLUX.1-schnell
  • black-forest-labs/FLUX.1-dev
  • stabilityai/stable-diffusion-xl-base-1.0

🔍 Embeddings

  • BAAI/bge-large-en-v1.5
  • sentence-transformers/all-MiniLM-L6-v2

🎬 Video Generation

  • Lightricks/LTX-Video
  • genmo/mochi-1-preview

📋 View the full model catalog at voltagegpu.com/models

📦 SDK Integration

Use our API with popular SDKs by simply changing the base URL:

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    api_key="vgpu_sk_xxxxxxxxxxxx",
    base_url="https://api.voltagegpu.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)

JavaScript/TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'vgpu_sk_xxxxxxxxxxxx',
  baseURL: 'https://api.voltagegpu.com/v1'
});

const response = await client.chat.completions.create({
  model: 'deepseek-ai/DeepSeek-R1',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing' }
  ],
  max_tokens: 1024
});

console.log(response.choices[0].message.content);

⚡ Streaming Responses

Enable real-time streaming for chat completions:

# Enable streaming with stream: true
curl -X POST "https://api.voltagegpu.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Write a poem"}],
    "stream": true
  }'

# Response is Server-Sent Events (SSE):
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"The"}}]}
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sun"}}]}
data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sets"}}]}
...
data: [DONE]
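Each SSE line above carries a JSON chunk whose `choices[0].delta.content` holds the next text fragment, and the terminal `data: [DONE]` line is a sentinel, not JSON. A sketch that reassembles the streamed text from raw SSE lines (the sample lines mirror the response above; with the OpenAI SDK, passing `stream=True` yields these chunks as objects you can iterate over directly):

```python
import json

def collect_stream(sse_lines):
    """Concatenate delta content from 'data:' SSE lines, stopping at [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

lines = [
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":"The"}}]}',
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sun"}}]}',
    'data: {"id":"chatcmpl-123","choices":[{"delta":{"content":" sets"}}]}',
    "data: [DONE]",
]
text = collect_stream(lines)
```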

⚠️ Error Handling

All API errors follow a consistent JSON format:

{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key",
    "status": 401
  }
}

Common Status Codes

200   Success - Request completed successfully
400   Bad Request - Invalid parameters
401   Unauthorized - Invalid or missing API key
403   Forbidden - Insufficient credits or permissions
404   Not Found - Model or resource not found
429   Too Many Requests - Rate limit exceeded
500   Internal Server Error - Server-side issue
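Of the codes above, 429 and 500 are usually worth retrying with exponential backoff, while 4xx client errors are fatal. A minimal sketch; the status codes come from the table above, but the backoff schedule is an illustrative choice, not a documented requirement:

```python
RETRYABLE = {429, 500}  # rate limit and server-side errors

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: base * 2^attempt seconds, capped."""
    return min(base * (2 ** attempt), cap)

def should_retry(status, attempt, max_attempts=5):
    """Retry only transient errors, up to a fixed number of attempts."""
    return status in RETRYABLE and attempt < max_attempts
```

A request loop would call `should_retry` after each failure and sleep for `backoff_delay(attempt)` before the next try.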

🚦 Rate Limits & Pricing

Our API uses a pay-per-token pricing model with competitive rates:

  • Rate Limits: 1000 requests per minute (contact us for higher limits)
  • Pricing: Based on model and token usage
  • Billing: Deducted from your account balance in real-time

Rate limit headers are included in all responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1704715200
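X-RateLimit-Reset is a Unix timestamp, so when X-RateLimit-Remaining reaches 0 a client can sleep until the reset time. A sketch using the header names from the example above (it assumes the header values are plain integers, which this document implies but does not state explicitly):

```python
def seconds_until_reset(headers, now):
    """How long to wait before the next request, based on rate-limit headers."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    reset = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)

headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1704715200"}
wait = seconds_until_reset(headers, now=1704715190)
```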

💰 Check your usage and add credits at voltagegpu.com/billing

🆘 Support & Resources