Paste this prompt into Claude, ChatGPT, or any AI assistant to have it implement the Vibe Earning API integration for you.
You are integrating with the Vibe Earning LLM Grid API — a distributed AI inference service.
The API lets you submit prompts to open-source LLMs (llama3, mistral, gemma, etc.) running on volunteer hardware.
## Authentication
All requests require a Bearer token in the Authorization header:
Authorization: Bearer YOUR_API_KEY
## Base URL
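https://www.llmondemand.com/api/v1
Endpoint paths in this document include the /v1 prefix, so POST /v1/jobs means POST https://www.llmondemand.com/api/v1/jobs.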
## Submit an Inference Job
POST /v1/jobs
Content-Type: application/json
Body fields:
model (string, required) — Ollama model name, e.g. "llama3:8b", "mistral:7b", "gemma:2b"
prompt (string, required) — The prompt to run
model_match (string, optional) — "exact" (default) or "family" to allow compatible model variants
tag (string, optional) — Label for grouping usage stats (max 64 chars)
webhook_url (string, optional) — URL to POST the completed result to (async callback)
priority (integer, optional) — Higher = processed sooner; defaults to your subscription's priority boost when omitted
timeout_seconds (integer, optional) — Auto-cancelled if still claimed/running this long after being claimed; defaults to 600 (10 minutes)
Success response (201):
{
"job_id": "uuid",
"status": "pending",
"created_at": "2026-06-20T12:00:00Z"
}
Jobs are always accepted (201) as long as your API key is active. Quota never blocks
submission; it only pauses processing (see "Token / Prompt Limits" below).
Error responses:
422 — validation error (missing model/prompt, unsupported model, etc.)
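As a minimal sketch of the body fields above, the helper below assembles a job payload and catches a few of the documented constraints on the client before the request is sent. The name `build_job_payload` is hypothetical and not part of the API, and the server-side validation behind a 422 may check more than this does.

```python
from typing import Any, Dict, Optional

ALLOWED_MODEL_MATCH = {"exact", "family"}
MAX_TAG_LENGTH = 64  # documented limit for the tag field


def build_job_payload(
    model: str,
    prompt: str,
    model_match: Optional[str] = None,
    tag: Optional[str] = None,
    webhook_url: Optional[str] = None,
    priority: Optional[int] = None,
    timeout_seconds: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for POST /v1/jobs (hypothetical helper)."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    if model_match is not None and model_match not in ALLOWED_MODEL_MATCH:
        raise ValueError('model_match must be "exact" or "family"')
    if tag is not None and len(tag) > MAX_TAG_LENGTH:
        raise ValueError("tag must be at most 64 characters")

    payload: Dict[str, Any] = {"model": model, "prompt": prompt}
    optional = {
        "model_match": model_match,
        "tag": tag,
        "webhook_url": webhook_url,
        "priority": priority,
        "timeout_seconds": timeout_seconds,
    }
    # Omit unset fields so the server applies its documented defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Leaving out unset optional fields, rather than sending nulls, lets the documented defaults (such as the subscription's priority boost) apply.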
## Poll Job Status
GET /v1/jobs/:id
Response when completed:
{
"job_id": "uuid",
"status": "completed",
"output": "The model's response text...",
"output_tokens": 142,
"input_tokens": 34,
"duration_ms": 8200,
"completed_at": "2026-06-20T12:01:23Z"
}
Statuses: pending → claimed → running → completed | failed
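The status flow above can be followed with a small polling loop. This is a sketch under assumptions: `wait_for_job`, its parameters, and the backoff values are illustrative and not part of the API. It treats only "completed" and "failed" as terminal, as listed above; if the API reports other final states (for example for cancelled jobs), add them to `TERMINAL_STATUSES`.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    max_interval: float = 30.0,
    deadline: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status or the deadline passes.

    fetch_job is any callable returning the parsed JSON of GET /v1/jobs/:id.
    """
    waited = 0.0
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if waited >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)  # exponential backoff
```

With an httpx client that already carries the Authorization header, `fetch_job` could be `lambda: client.get(f"https://www.llmondemand.com/api/v1/jobs/{job_id}").json()`. The deadline matters because a job submitted over quota stays "pending" until the quota resets.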
## Webhook Callback (optional)
If you supply webhook_url, the API will POST to it on completion with the same payload as GET /v1/jobs/:id.
Every webhook request includes an X-Vibe-Signature header for verification:
X-Vibe-Signature: sha256=<hex-encoded HMAC-SHA256 digest>
The signature is HMAC-SHA256 of the raw JSON body, keyed with your API key.
Verify it on your server before trusting the payload:
Ruby:
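raw_body = request.body.read  # Sinatra; in Rails use request.raw_post
signature = request.env["HTTP_X_VIBE_SIGNATURE"].to_s  # "" if the header is missing
expected = "sha256=" + OpenSSL::HMAC.hexdigest("SHA256", YOUR_API_KEY, raw_body)
halt 401 unless Rack::Utils.secure_compare(expected, signature)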
Python:
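import hmac, hashlib
from flask import abort, request  # this snippet assumes a Flask handler
signature = request.headers.get("X-Vibe-Signature", "")  # "" if the header is missing
expected = "sha256=" + hmac.new(api_key.encode(), request.data, hashlib.sha256).hexdigest()
if not hmac.compare_digest(expected.encode(), signature.encode()):
    abort(401)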
Node.js:
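const crypto = require("crypto");
// Express-style handler; rawBody must be the unparsed request bytes.
const sig = Buffer.from(req.get("X-Vibe-Signature") || "");
const expected = Buffer.from("sha256=" + crypto.createHmac("sha256", API_KEY).update(rawBody).digest("hex"));
// timingSafeEqual throws on a length mismatch, so compare lengths first.
if (sig.length !== expected.length || !crypto.timingSafeEqual(expected, sig)) return res.sendStatus(401);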
Always use a constant-time comparison to prevent timing attacks.
## List Jobs
GET /v1/jobs
Query params: status=completed|failed|pending, tag=my-project, page=1
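For illustration, the query parameters above can be combined into a request URL as follows. The helper name `list_jobs_url` is hypothetical; only the status, tag, and page parameters come from this document.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://www.llmondemand.com/api/v1"


def list_jobs_url(status: Optional[str] = None, tag: Optional[str] = None, page: int = 1) -> str:
    """Build the GET /v1/jobs URL with only the filters that were provided."""
    params = {"status": status, "tag": tag, "page": page}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/jobs?{query}"
```

For example, `list_jobs_url(status="failed", tag="my-project", page=2)` yields a URL for the second page of failed jobs tagged my-project.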
## Cancel a Job
DELETE /v1/jobs/:id — cancel one pending job
DELETE /v1/jobs/cancel_all — cancel all pending jobs
## Retry a Failed Job
POST /v1/jobs/:id/retry
## Increase or Decrease Priority
PATCH /v1/jobs/:id/priority
Body: { "priority": 10 } — positive to increase, negative to decrease
## List Available Models
GET /v1/models
Returns array of models currently online in the grid with worker_count.
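One way to use this list is to pick a model that currently has workers before submitting. The sketch below assumes each array entry is an object with a "name" field alongside worker_count; the "name" key and the helper `pick_online_model` are assumptions, since the exact response shape is not specified here.

```python
from typing import Any, Dict, List, Optional


def pick_online_model(models: List[Dict[str, Any]], preferred: List[str]) -> Optional[str]:
    """Return the first preferred model that has at least one worker online."""
    # Assumed shape: [{"name": "llama3:8b", "worker_count": 3}, ...]
    online = {m["name"] for m in models if m.get("worker_count", 0) > 0}
    for name in preferred:
        if name in online:
            return name
    return None
```

Passing the parsed GET /v1/models response and an ordered preference list such as `["llama3:8b", "mistral:7b"]` returns the first entry that is online, or None if none are.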
## Usage Statistics
GET /v1/usage
Returns token usage and job counts grouped by day/tag.
## Token / Prompt Limits
There are no per-model or per-request prompt length or output token limits.
Each model runs with its own baked-in context window on the worker's Ollama instance.
Rate limits are subscription-level quotas on total tokens used (daily/weekly/monthly).
Your current quota headroom is visible at GET /v1/usage.
If you exceed your quota, submission still succeeds — the job is accepted and stays
"pending" — but it will not be picked up by a worker until your quota resets.
## File Input (Blob Upload)
Blobs must be attached to an existing job. The required sequence is:
1. Submit the job first:
POST /v1/jobs → { "job_id": "uuid", ... }
2. Create a blob slot for that job:
POST /v1/blobs
Body: { "job_id": "uuid", "blob_type": "image" }
Response: { "blob_id": "uuid", "upload_url": "https://s3...", "expires_in": 900 }
3. Upload the file directly to the presigned S3 URL (PUT, no auth header):
PUT <upload_url> (binary body; the URL expires in 15 minutes)
4. Confirm the upload:
POST /v1/blobs/:blob_id/confirm
Response: { "status": "confirmed" }
After confirmation the blob is attached to the job and visible to the assigned worker.
Supported blob_type values: "image" (default). Do NOT pass blob_ids in the job creation
payload — blobs must reference an existing job_id at creation time.
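The four steps above can be sketched as a single function. The HTTP calls are injected so the ordering is easy to see and test: `api_post` and `put_bytes` are hypothetical callables you would back with your HTTP client, and paths are relative to the base URL https://www.llmondemand.com/api/v1.

```python
from typing import Any, Callable, Dict

# api_post(path, json_body) -> parsed JSON; must send the Bearer token.
ApiPost = Callable[[str, Dict[str, Any]], Dict[str, Any]]
# put_bytes(url, data) -> None; must NOT send the Authorization header.
PutBytes = Callable[[str, bytes], None]


def submit_job_with_image(
    api_post: ApiPost,
    put_bytes: PutBytes,
    job_payload: Dict[str, Any],
    image: bytes,
) -> str:
    """Run the four documented steps in order and return the job_id."""
    # 1. Submit the job first; blobs can only reference an existing job.
    job = api_post("/jobs", job_payload)
    job_id = job["job_id"]

    # 2. Create a blob slot for that job.
    slot = api_post("/blobs", {"job_id": job_id, "blob_type": "image"})

    # 3. Upload straight to the presigned URL (valid for expires_in seconds).
    put_bytes(slot["upload_url"], image)

    # 4. Confirm, which attaches the blob to the job.
    confirmation = api_post(f"/blobs/{slot['blob_id']}/confirm", {})
    if confirmation.get("status") != "confirmed":
        raise RuntimeError(f"blob not confirmed: {confirmation!r}")
    return job_id
```

Keeping the upload in its own callable makes it harder to accidentally send the API key to the presigned S3 URL.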
## Best Practices
- For fast responses use model_match "family" so the grid can route to any compatible variant.
- Supply a webhook_url for async flows instead of polling.
- Use the tag field to track usage per feature/project.
- Check GET /v1/models first to see which models are currently online before submitting.
- Prefer smaller models (7b–8b) for lower latency; use larger models only when quality demands it.
- Always verify X-Vibe-Signature on incoming webhooks using a constant-time comparison.
Base URL: https://www.llmondemand.com/api/v1
Include your API key as a Bearer token on every request:
Authorization: Bearer YOUR_API_KEY
curl:
curl -X POST https://www.llmondemand.com/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3:8b",
    "model_match": "family",
    "prompt": "Summarize this document",
    "tag": "my-project",
    "webhook_url": "https://yourapp.com/hooks/llm"
  }'
Ruby:
require 'net/http'
require 'json'

uri = URI('https://www.llmondemand.com/api/v1/jobs')
req = Net::HTTP::Post.new(uri)
req['Authorization'] = 'Bearer YOUR_API_KEY'
req['Content-Type'] = 'application/json'
req.body = {
  model: 'llama3:8b',
  model_match: 'family',
  prompt: 'Summarize this document',
  tag: 'my-project'
}.to_json

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
job = JSON.parse(res.body)
puts job['job_id']
Python:
import httpx

client = httpx.Client(headers={"Authorization": "Bearer YOUR_API_KEY"})
response = client.post(
    "https://www.llmondemand.com/api/v1/jobs",
    json={
        "model": "llama3:8b",
        "model_match": "family",
        "prompt": "Summarize this document",
        "tag": "my-project",
    },
)
job = response.json()
print(job["job_id"])
Node.js:
// Top-level await requires an ES module; otherwise wrap this in an async function.
const res = await fetch("https://www.llmondemand.com/api/v1/jobs", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "llama3:8b",
    model_match: "family",
    prompt: "Summarize this document",
    tag: "my-project"
  })
});
const job = await res.json();
console.log(job.job_id);
Jobs are processed asynchronously. Poll GET /v1/jobs/:id until status is completed or failed, or supply a webhook_url to receive a POST callback.
curl https://www.llmondemand.com/api/v1/jobs/JOB_ID \
-H "Authorization: Bearer YOUR_API_KEY"
# Response (completed):
{
"job_id": "JOB_ID",
"status": "completed",
"output": "Here is the summary...",
"output_tokens": 142,
"completed_at": "2026-06-20T12:34:56Z"
}
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name, e.g. llama3:8b |
| prompt | string | Yes | The prompt text |
| model_match | string | No | exact (default) or family to allow compatible variants |
| tag | string | No | Label for grouping usage stats |
| webhook_url | string | No | URL to POST the result to when done |
| priority | integer | No | Higher = processed sooner (defaults to your subscription's priority boost) |
| timeout_seconds | integer | No | Auto-cancelled if still claimed/running this long after being claimed (default 600 = 10 minutes) |
| Method | Path | Description |
|---|---|---|
| POST | /v1/jobs | Submit a new inference job |
| GET | /v1/jobs | List your jobs (filterable by status, tag) |
| GET | /v1/jobs/:id | Poll job status and result |
| DELETE | /v1/jobs/:id | Cancel a pending job |
| DELETE | /v1/jobs/cancel_all | Cancel all pending jobs |
| POST | /v1/jobs/:id/retry | Retry a failed job |
| PATCH | /v1/jobs/:id/priority | Increase or decrease job priority |
| GET | /v1/models | List available grid models |
| GET | /v1/usage | Usage statistics |
| POST | /v1/blobs | Get presigned S3 upload URL for file input |
| POST | /v1/blobs/:id/confirm | Confirm blob upload complete |