Flash API
Consumer URL: https://api.timepointai.com (routed via API Gateway)
Direct URL: https://flash.timepointai.com (for service-to-service calls)
Consumer access goes through the Gateway, which handles authentication and proxies requests to Flash with an X-User-ID header. Flash has AUTH_ENABLED=false — it does not perform any authentication itself. It trusts the X-User-ID injected by the Gateway.
For service-to-service calls that bypass the Gateway, use X-Service-Key with the direct URL.
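A sketch of the two call modes described above. The `Bearer` scheme for consumer tokens is an assumption (this document does not specify the consumer auth scheme); the header names and URLs otherwise come from the text:

```python
GATEWAY_URL = "https://api.timepointai.com"   # consumer traffic
DIRECT_URL = "https://flash.timepointai.com"  # service-to-service

def consumer_headers(token: str) -> dict:
    # The Gateway authenticates this token, then proxies the request
    # to Flash with an injected X-User-ID header (assumed Bearer scheme).
    return {"Authorization": f"Bearer {token}"}

def service_headers(service_key: str) -> dict:
    # Direct calls bypass the Gateway, so they authenticate with
    # X-Service-Key instead.
    return {"X-Service-Key": service_key}
```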
Health Check
Render Endpoints
POST /api/v1/timepoints/generate/sync
Synchronous render — blocks until the full scene is generated.

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Historical moment description (3–500 chars) |
| generate_image | boolean | No | Generate AI image (default: false) |
| preset | string | No | Quality preset: hyper, balanced (default), hd, gemini3 |
| text_model | string | No | Text model ID — OpenRouter format (org/model) or Google native (gemini-*). Overrides preset. |
| image_model | string | No | Image model ID — Stability AI (stabilityai/*), pollinations, or Google native. Overrides preset. |
| model_policy | string | No | "permissive" for open-weight-only, Google-free generation. |
| llm_params | object | No | Fine-grained LLM hyperparameters (see below). |
| visibility | string | No | public (default) or private |
| callback_url | string | No | URL to POST results when generation completes (async only) |
| request_context | object | No | Opaque context passed through to the response |
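For illustration, a minimal synchronous request body using the table above. Only `query` is required; the example values are placeholders:

```python
import json

# Minimal synchronous render request; only "query" is required.
payload = {
    "query": "The signing of the Magna Carta, June 1215",
    "generate_image": True,   # default is false
    "preset": "balanced",     # default preset
    "visibility": "private",  # default is public
}

body = json.dumps(payload)
# POST this body to /api/v1/timepoints/generate/sync
```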
POST /api/v1/timepoints/generate/stream
Server-Sent Events (SSE) stream — returns pipeline progress in real time as each agent completes:

Judge → Timeline → Grounding → Scene → Characters → Moment → Camera → Dialog → Critique → ImagePrompt → Optimizer → ImageGen
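A minimal SSE frame parser, to sketch what a client of the stream endpoint handles. The event name `agent_complete` and the data payloads below are illustrative; this document does not specify the exact frame format:

```python
def parse_sse(raw: str) -> list:
    """Parse a raw SSE stream into (event, data) tuples.
    Per the SSE spec, a blank line terminates each frame."""
    frames = []
    event, data = None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and (event or data):
            frames.append((event, "\n".join(data)))
            event, data = None, []
    return frames

# Illustrative frames in the shape an agent-progress stream might use.
sample = (
    'event: agent_complete\ndata: {"agent": "Judge"}\n\n'
    'event: agent_complete\ndata: {"agent": "Timeline"}\n\n'
)
```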
Downstream Model Control
Downstream apps (Web App, iPhone App, Clockchain, Billing, Enterprise integrations) have full control over model selection and generation hyperparameters on every request. All 14 pipeline agents respect these parameters.

Model Selection Priority
Model selection follows this precedence (highest first):

1. Explicit text_model / image_model — exact model by name
2. model_policy: "permissive" — auto-selects open-weight models, skips Google grounding
3. preset — uses the preset's default models
4. Server defaults

You can combine model_policy, explicit models, preset, and llm_params in the same request.
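The precedence above can be sketched as a resolver. This is illustrative only — the real resolution happens server-side, and the preset defaults and auto-selected model names below are assumptions, not documented values:

```python
# Illustrative preset→model mapping; actual preset defaults are not
# documented here.
PRESET_DEFAULTS = {
    "hyper": "gemini-2.5-flash",
    "balanced": "gemini-2.5-flash",
    "hd": "gemini-2.5-pro",
    "gemini3": "gemini-3",
}
SERVER_DEFAULT = "gemini-2.5-flash"  # assumed server default

def resolve_text_model(req: dict) -> str:
    if req.get("text_model"):                    # 1. explicit model wins
        return req["text_model"]
    if req.get("model_policy") == "permissive":  # 2. policy auto-selects open-weight
        return "deepseek/deepseek-r1-0528"       # illustrative pick
    if req.get("preset"):                        # 3. preset's default model
        return PRESET_DEFAULTS[req["preset"]]
    return SERVER_DEFAULT                        # 4. server defaults
```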
Google-Free Generation
Set model_policy: "permissive" to route all 14 pipeline agents through open-weight models (DeepSeek R1, Llama, Qwen, Mistral) via OpenRouter, with Stability AI (SD3.5 Large Turbo) for images — zero Google API calls, including grounding.
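A Google-free request is just the flag plus your query (sketch; the query value is a placeholder):

```python
# model_policy: "permissive" — all text agents run on OpenRouter
# open-weight models, images on Stability AI SD3.5 Large Turbo,
# with zero Google API calls.
payload = {
    "query": "A Viking longship departing Hedeby at dawn, c. 900",
    "model_policy": "permissive",
    "generate_image": True,
}
```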
Explicit Model Override
Use text_model and image_model to specify any OpenRouter-compatible model ID (e.g. qwen/qwen3-235b-a22b, deepseek/deepseek-r1-0528), Stability AI model (e.g. stabilityai/sd3.5-large-turbo), or Google native model (e.g. gemini-2.5-flash). Explicit overrides take priority over model_policy.
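Explicit overrides in a request body (sketch; models drawn from the examples above, query value is a placeholder):

```python
# Explicit models beat model_policy: this request uses Qwen for text and
# SD3.5 Large for images even though the permissive policy is also set.
payload = {
    "query": "The premiere of Beethoven's Ninth Symphony, Vienna 1824",
    "text_model": "qwen/qwen3-235b-a22b",
    "image_model": "stabilityai/sd3.5-large",
    "model_policy": "permissive",  # lower priority than explicit models
    "generate_image": True,
}
```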
| Model | Description |
|---|---|
| stabilityai/sd3.5-large-turbo | Fast generation (default for permissive mode) |
| stabilityai/sd3.5-large | Best quality |
| stabilityai/sd3.5-medium | Balanced speed/quality |
LLM Parameters (llm_params)
Fine-grained control over generation hyperparameters, applied uniformly across every agent in the pipeline. Request-level llm_params override each agent’s built-in defaults (e.g. setting temperature: 0.3 overrides the scene agent’s default of 0.7, the dialog agent’s default of 0.85, etc.).
| Parameter | Type | Range | Providers | Description |
|---|---|---|---|---|
| temperature | float | 0.0–2.0 | All | Sampling temperature. Overrides per-agent defaults (0.2 for factual, 0.85 for creative). |
| max_tokens | int | 1–32768 | All | Max output tokens per agent call. Preset defaults: hyper=1024, balanced=2048, hd=8192. |
| top_p | float | 0.0–1.0 | All | Nucleus sampling threshold. |
| top_k | int | >= 1 | All | Top-k sampling — consider only the k most likely tokens. |
| frequency_penalty | float | -2.0–2.0 | OpenRouter | Penalize tokens proportionally to their frequency in the output. |
| presence_penalty | float | -2.0–2.0 | OpenRouter | Penalize tokens that have appeared at all in the output. |
| repetition_penalty | float | 0.0–2.0 | OpenRouter | Multiplicative penalty for repeated tokens. |
| stop | string[] | max 4 | All | Stop sequences — generation halts when one is produced. |
| thinking_level | string | — | — | Reasoning depth: "none", "low", "medium", "high". |
| system_prompt_prefix | string | max 2000 | All | Text prepended to every agent's system prompt. |
| system_prompt_suffix | string | max 2000 | All | Text appended to every agent's system prompt. |
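A request body exercising llm_params within the documented ranges (sketch; the query and suffix text are placeholders):

```python
payload = {
    "query": "Apollo 11 lunar module ascent, July 1969",
    "llm_params": {
        "temperature": 0.3,   # overrides every agent's built-in default
        "max_tokens": 4096,
        "top_p": 0.9,
        "stop": ["###"],      # at most 4 sequences
        "system_prompt_suffix": "Answer in formal English.",
    },
}
```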
Image Generation
Image URLs are included in the response when generate_image: true.
Commercial presets (hd, balanced, hyper, gemini3):
- Google Imagen (primary) → Stability AI (fallback) → OpenRouter (fallback)
Permissive mode (model_policy: "permissive"):
- Stability AI — SD3.5 Large Turbo (open-weight, no Google dependency)
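The two provider chains above can be sketched as follows. This is illustrative only — the actual fallback logic is server-side, and the provider identifiers are placeholders:

```python
# Commercial presets try Google Imagen first, then fall back in order;
# permissive mode uses Stability AI only, with no Google dependency.
COMMERCIAL_CHAIN = ["google-imagen", "stability-ai", "openrouter"]
PERMISSIVE_CHAIN = ["stability-ai"]  # SD3.5 Large Turbo

def image_provider_chain(model_policy=None):
    return PERMISSIVE_CHAIN if model_policy == "permissive" else COMMERCIAL_CHAIN
```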
LLM Providers
| Provider | Role |
|---|---|
| Google Gemini | Default LLM for all agents (configurable via text_model or model_policy) |
| OpenRouter | Open-weight text models (DeepSeek, Llama, Qwen, Mistral) — used with model_policy: "permissive" or explicit text_model |
| Stability AI | Open-weight image generation (SD3.5) — default for model_policy: "permissive", or explicit image_model: "stabilityai/*" |
| Pollinations | Free image generation — available via image_model: "pollinations" |