API Documentation

Prompt for AI

Paste this prompt into Claude, ChatGPT, or any AI assistant to have it implement the Vibe Earning API integration for you.

You are integrating with the Vibe Earning LLM Grid API — a distributed AI inference service.
The API lets you submit prompts to open-source LLMs (llama3, mistral, gemma, etc.) running on volunteer hardware.

## Authentication
All requests require a Bearer token in the Authorization header:
  Authorization: Bearer YOUR_API_KEY

## Base URL
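  https://www.llmondemand.com/api/v1
  Paths in this document are written with a /v1 prefix: POST /v1/jobs means POST https://www.llmondemand.com/api/v1/jobs.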

## Submit an Inference Job
POST /v1/jobs
Content-Type: application/json

Body fields:
  model        (string, required)  — Ollama model name, e.g. "llama3:8b", "mistral:7b", "gemma:2b"
  prompt       (string, required)  — The prompt to run
  model_match  (string, optional)  — "exact" (default) or "family" to allow compatible model variants
  tag          (string, optional)  — Label for grouping usage stats (max 64 chars)
  webhook_url  (string, optional)  — URL to POST the completed result to (async callback)
  priority     (integer, optional) — Higher = processed sooner; defaults to your subscription's priority boost when omitted
  timeout_seconds (integer, optional) — Auto-cancelled if still claimed/running this long after being claimed; defaults to 600 (10 minutes)

Success response (201):
{
  "job_id": "uuid",
  "status": "pending",
  "created_at": "2026-06-20T12:00:00Z"
}

Jobs are always accepted (201) as long as your API key is active — quota never blocks
submission, it only pauses processing (see "Token / Prompt Limits" below).

Error responses:
  422 — validation error (missing model/prompt, unsupported model, etc.)

## Poll Job Status
GET /v1/jobs/:id

Response when completed:
{
  "job_id": "uuid",
  "status": "completed",
  "output": "The model's response text...",
  "output_tokens": 142,
  "input_tokens": 34,
  "duration_ms": 8200,
  "completed_at": "2026-06-20T12:01:23Z"
}

Statuses: pending → claimed → running → completed | failed
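
The status flow above can be turned into a small polling loop. The following is a minimal sketch, not an official client: the names `fetch_job` and `wait_for_job`, the 2-second interval, and the 15-minute ceiling are illustrative assumptions. The ceiling also guards against a job that never reaches one of the two terminal statuses listed above.

```python
import json
import time
import urllib.request

BASE_URL = "https://www.llmondemand.com/api/v1"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_job(job_id, api_key):
    """GET /v1/jobs/:id and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, api_key, fetch=fetch_job, interval=2.0,
                 max_wait=900.0, sleep=time.sleep):
    """Poll until the job is completed or failed, or until max_wait seconds pass."""
    waited = 0.0
    while True:
        job = fetch(job_id, api_key)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if waited >= max_wait:
            raise TimeoutError(
                f"job {job_id} still {job['status']} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

A caller would typically check `job["status"]` on the returned dictionary before reading `job["output"]`, since a failed job is also returned rather than raised.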

## Webhook Callback (optional)
If you supply webhook_url, the API will POST to it on completion with the same payload as GET /v1/jobs/:id.

Every webhook request includes an X-Vibe-Signature header for verification:
  X-Vibe-Signature: sha256=<hex digest>

The signature is HMAC-SHA256 of the raw JSON body, keyed with your API key.
Verify it on your server before trusting the payload:

  Ruby:
    expected = "sha256=" + OpenSSL::HMAC.hexdigest("SHA256", YOUR_API_KEY, request.raw_post)
    # .to_s guards against a missing header (nil) before the comparison
    halt 401 unless Rack::Utils.secure_compare(expected, request.env["HTTP_X_VIBE_SIGNATURE"].to_s)

  Python:
    import hmac, hashlib
    expected = "sha256=" + hmac.new(api_key.encode(), request.data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request.headers.get("X-Vibe-Signature", "")):
        abort(401)

  Node.js:
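    const crypto = require("crypto");
    const sig = req.get("X-Vibe-Signature") || "";
    const expected = "sha256=" + crypto.createHmac("sha256", API_KEY).update(rawBody).digest("hex");
    // timingSafeEqual throws when lengths differ, so check the lengths first
    const a = Buffer.from(expected), b = Buffer.from(sig);
    if (a.length !== b.length || !crypto.timingSafeEqual(a, b)) return res.sendStatus(401);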

Always use a constant-time comparison to prevent timing attacks.
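
One detail that is easy to miss is that the signature covers the raw request bytes, so re-serializing the parsed JSON before hashing will usually produce a different digest. The sketch below is framework-independent and uses only the standard library; `sign_body` and `verify_signature` are hypothetical helper names, not part of the API.

```python
import hashlib
import hmac


def sign_body(api_key: str, raw_body: bytes) -> str:
    """Return the expected X-Vibe-Signature value for the raw request bytes."""
    digest = hmac.new(api_key.encode(), raw_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(api_key: str, raw_body: bytes, header_value) -> bool:
    """Constant-time check of the header against the raw body; False if missing."""
    if not header_value:
        return False
    expected = sign_body(api_key, raw_body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), header_value.encode())


# Round trip: even a one-byte change to the body invalidates the signature.
body = b'{"job_id":"abc","status":"completed"}'
signature = sign_body("YOUR_API_KEY", body)
print(verify_signature("YOUR_API_KEY", body, signature))         # True
print(verify_signature("YOUR_API_KEY", body + b" ", signature))  # False
```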

## List Jobs
GET /v1/jobs
Query params: status=completed|failed|pending, tag=my-project, page=1
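
As a rough illustration of the query parameters above, the sketch below builds the list URL and fetches one page. The helper names `jobs_url` and `list_jobs` are assumptions, and because the response shape is not described here, the decoded JSON is returned unchanged.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://www.llmondemand.com/api/v1"


def jobs_url(status=None, tag=None, page=1):
    """Build the GET /v1/jobs URL, leaving out filters that were not given."""
    params = {"page": page}
    if status:
        params["status"] = status
    if tag:
        params["tag"] = tag
    return f"{BASE_URL}/jobs?{urllib.parse.urlencode(params)}"


def list_jobs(api_key, **filters):
    """Fetch one page of jobs and return the decoded JSON as-is."""
    req = urllib.request.Request(
        jobs_url(**filters),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

For example, `jobs_url(status="failed", tag="my-project", page=2)` yields a URL ending in `/jobs?page=2&status=failed&tag=my-project`.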

## Cancel a Job
DELETE /v1/jobs/:id          — cancel one pending job
DELETE /v1/jobs/cancel_all   — cancel all pending jobs

## Retry a Failed Job
POST /v1/jobs/:id/retry

## Increase or Decrease Priority
PATCH /v1/jobs/:id/priority
Body: { "priority": 10 }    — positive to increase, negative to decrease

## List Available Models
GET /v1/models
Returns array of models currently online in the grid with worker_count.
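
A client might use this endpoint to choose a model that currently has workers before submitting. The sketch below assumes each array entry is an object with a `name` field alongside `worker_count`; the `name` key and both helper names are assumptions, since only `worker_count` is documented here.

```python
import json
import urllib.request

BASE_URL = "https://www.llmondemand.com/api/v1"


def fetch_models(api_key):
    """GET /v1/models and return the decoded JSON array."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def pick_online_model(models, preferred):
    """Return the first preferred model with at least one worker, else None."""
    # "name" is an assumed field; adjust to the actual response shape.
    online = {m["name"] for m in models if m.get("worker_count", 0) > 0}
    for name in preferred:
        if name in online:
            return name
    return None


sample = [
    {"name": "llama3:8b", "worker_count": 0},
    {"name": "mistral:7b", "worker_count": 3},
]
print(pick_online_model(sample, ["llama3:8b", "mistral:7b"]))  # mistral:7b
```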

## Usage Statistics
GET /v1/usage
Returns token usage and job counts grouped by day/tag.

## Token / Prompt Limits
There are no per-model or per-request prompt length or output token limits.
Each model runs with its own baked-in context window on the worker's Ollama instance.
Rate limits are subscription-level quotas on total tokens used (daily/weekly/monthly).
Your current quota headroom is visible at GET /v1/usage.
If you exceed your quota, submission still succeeds — the job is accepted and stays
"pending" — but it will not be picked up by a worker until your quota resets.

## File Input (Blob Upload)
Blobs must be attached to an existing job. The required sequence is:

  1. Submit the job first:
     POST /v1/jobs   →  { "job_id": "uuid", ... }

  2. Create a blob slot for that job:
     POST /v1/blobs
     Body: { "job_id": "uuid", "blob_type": "image" }
     Response: { "blob_id": "uuid", "upload_url": "https://s3...", "expires_in": 900 }

  3. Upload the file directly to the presigned S3 URL (PUT, no auth header):
     PUT <upload_url>   (binary body; the URL expires in 15 minutes)

  4. Confirm the upload:
     POST /v1/blobs/:blob_id/confirm
     Response: { "status": "confirmed" }

After confirmation the blob is attached to the job and visible to the assigned worker.
Supported blob_type values: "image" (default). Do NOT pass blob_ids in the job creation
payload — blobs must reference an existing job_id at creation time.
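
The four-step sequence above could be sketched as follows. This is an illustration under stated assumptions rather than a reference client: `api_post`, `put_bytes`, and `submit_job_with_image` are hypothetical helpers, and the `Content-Type` sent on the PUT is a guess that may need to match whatever the presigned URL was signed with.

```python
import json
import urllib.request

BASE_URL = "https://www.llmondemand.com/api/v1"


def api_post(path, api_key, body=None):
    """POST JSON to the API with the Bearer token and return the decoded reply."""
    req = urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(body or {}).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def put_bytes(url, payload):
    """PUT raw bytes to the presigned URL; no Authorization header is sent."""
    req = urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        # Assumed content type; the presigned URL may require a specific one.
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.status


def submit_job_with_image(api_key, model, prompt, image_bytes,
                          post=api_post, put=put_bytes):
    """Run steps 1 to 4: submit job, create blob slot, upload, confirm."""
    job = post("/jobs", api_key, {"model": model, "prompt": prompt})
    blob = post("/blobs", api_key,
                {"job_id": job["job_id"], "blob_type": "image"})
    put(blob["upload_url"], image_bytes)  # must happen within expires_in seconds
    confirm = post(f"/blobs/{blob['blob_id']}/confirm", api_key)
    if confirm.get("status") != "confirmed":
        raise RuntimeError(f"blob {blob['blob_id']} was not confirmed")
    return job["job_id"]
```

Passing `post` and `put` as parameters keeps the sequencing logic separate from the HTTP calls, which makes the order of operations easy to check with stand-in functions.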

## Best Practices
- For fast responses use model_match "family" so the grid can route to any compatible variant.
- Supply a webhook_url for async flows instead of polling.
- Use the tag field to track usage per feature/project.
- Check GET /v1/models first to see which models are currently online before submitting.
- Prefer smaller models (7b–8b) for lower latency; use larger models only when quality demands it.
- Always verify X-Vibe-Signature on incoming webhooks using a constant-time comparison.

Base URL: https://www.llmondemand.com/api/v1

Authentication

Include your API key as a Bearer token on every request:

Authorization: Bearer YOUR_API_KEY

Submit a Job

curl -X POST https://www.llmondemand.com/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "model_match": "family",
    "prompt": "Summarize this document",
    "tag": "my-project",
    "webhook_url": "https://yourapp.com/hooks/llm"
  }'

Ruby:
require 'net/http'
require 'json'

uri = URI('https://www.llmondemand.com/api/v1/jobs')
req = Net::HTTP::Post.new(uri)
req['Authorization'] = 'Bearer YOUR_API_KEY'
req['Content-Type'] = 'application/json'
req.body = {
  model: 'llama3:8b',
  model_match: 'family',
  prompt: 'Summarize this document',
  tag: 'my-project'
}.to_json

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
job = JSON.parse(res.body)
puts job['job_id']

Python:
import httpx

client = httpx.Client(headers={"Authorization": "Bearer YOUR_API_KEY"})

response = client.post(
    "https://www.llmondemand.com/api/v1/jobs",
    json={
        "model": "llama3:8b",
        "model_match": "family",
        "prompt": "Summarize this document",
        "tag": "my-project"
    }
)
job = response.json()
print(job["job_id"])

JavaScript:
const res = await fetch("https://www.llmondemand.com/api/v1/jobs", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "llama3:8b",
    model_match: "family",
    prompt: "Summarize this document",
    tag: "my-project"
  })
});
const job = await res.json();
console.log(job.job_id);

Poll for Results

Jobs are processed asynchronously. Poll GET /v1/jobs/:id until status is completed or failed, or supply a webhook_url to receive a POST callback.

curl https://www.llmondemand.com/api/v1/jobs/JOB_ID \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response (completed):
{
  "job_id": "JOB_ID",
  "status": "completed",
  "output": "Here is the summary...",
  "output_tokens": 142,
  "completed_at": "2026-06-20T12:34:56Z"
}

Job Parameters

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model name, e.g. `llama3:8b` |
| `prompt` | string | Yes | The prompt text |
| `model_match` | string | No | `exact` (default) or `family` to allow compatible variants |
| `tag` | string | No | Label for grouping usage stats |
| `webhook_url` | string | No | URL to POST the result to when done |
| `priority` | integer | No | Higher = processed sooner (defaults to your subscription's priority boost) |
| `timeout_seconds` | integer | No | Auto-cancelled if still claimed/running this long after being claimed (default 600 = 10 minutes) |
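
The parameters above can be validated client-side before submission. The sketch below is one possible approach, with `build_job_payload` as a hypothetical helper: it omits unset optional fields so the server-side defaults apply, and it enforces the 64-character tag limit stated in the prompt section earlier on this page.

```python
def build_job_payload(model, prompt, model_match=None, tag=None,
                      webhook_url=None, priority=None, timeout_seconds=None):
    """Build the JSON body for POST /v1/jobs, omitting unset optional fields."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    if model_match is not None and model_match not in ("exact", "family"):
        raise ValueError("model_match must be 'exact' or 'family'")
    if tag is not None and len(tag) > 64:
        raise ValueError("tag must be at most 64 characters")

    payload = {"model": model, "prompt": prompt}
    optional = {
        "model_match": model_match,
        "tag": tag,
        "webhook_url": webhook_url,
        "priority": priority,
        "timeout_seconds": timeout_seconds,
    }
    # Leave out anything not set so the documented defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


print(build_job_payload("llama3:8b", "Summarize this document",
                        model_match="family", tag="my-project"))
```

The returned dictionary can be passed directly as the `json=` argument in the Python example shown earlier.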

All API Endpoints

| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/jobs` | Submit a new inference job |
| `GET` | `/v1/jobs` | List your jobs (filterable by status, tag) |
| `GET` | `/v1/jobs/:id` | Poll job status and result |
| `DELETE` | `/v1/jobs/:id` | Cancel a pending job |
| `DELETE` | `/v1/jobs/cancel_all` | Cancel all pending jobs |
| `POST` | `/v1/jobs/:id/retry` | Retry a failed job |
| `PATCH` | `/v1/jobs/:id/priority` | Increase or decrease job priority |
| `GET` | `/v1/models` | List available grid models |
| `GET` | `/v1/usage` | Usage statistics |
| `POST` | `/v1/blobs` | Get presigned S3 upload URL for file input |
| `POST` | `/v1/blobs/:id/confirm` | Confirm blob upload complete |