


AI Models

Explore AI models available through our platform

390 total models · showing 20 · page 1


Z.ai: GLM 5 Turbo
z-ai/glm-5-turbo

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply optimized for real-world agent workflows involving long execution chains, with improved complex instruction decomposition, tool use, scheduled and persistent execution, and overall stability across extended tasks.

Z.AI · $0.96 in / $3.20 out per 1M tokens · Context: 203K · Output: 131K · Type: text->text · Available
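Every listing prices usage the same way: a per-million-token input rate and a per-million-token output rate. A minimal sketch of how a single request's cost works out under that scheme, using the GLM 5 Turbo rates above (the helper function is illustrative, not part of any API):

```python
def request_cost(tokens_in: int, tokens_out: int,
                 rate_in: float, rate_out: float) -> float:
    """Cost in dollars, given per-1M-token input and output rates."""
    return tokens_in / 1e6 * rate_in + tokens_out / 1e6 * rate_out

# GLM 5 Turbo: $0.96 in / $3.20 out per 1M tokens
cost = request_cost(tokens_in=50_000, tokens_out=10_000,
                    rate_in=0.96, rate_out=3.20)
print(f"${cost:.4f}")  # $0.0800
```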
xAI: Grok 4.20 Multi-Agent Beta
x-ai/grok-4.20-multi-agent-beta

Grok 4.20 Multi-Agent Beta is a variant of xAI's Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. Reasoning effort behavior:

- low / medium: 4 agents
- high / xhigh: 16 agents

xAI · $2.00 in / $6.00 out per 1M tokens · Context: 2.0M · Type: text+image->text · Available
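The reasoning-effort-to-agent-count behavior described above is a simple lookup; this small helper just restates the listing's mapping (the function name is illustrative, not part of any API):

```python
# Agent counts per reasoning effort, per the Grok 4.20 Multi-Agent Beta listing.
AGENTS_BY_EFFORT = {
    "low": 4, "medium": 4,    # 4 parallel agents
    "high": 16, "xhigh": 16,  # 16 parallel agents
}

def agent_count(effort: str) -> int:
    """Number of parallel agents run for a given reasoning effort."""
    try:
        return AGENTS_BY_EFFORT[effort]
    except KeyError:
        raise ValueError(f"unknown reasoning effort: {effort!r}")

print(agent_count("medium"))  # 4
print(agent_count("xhigh"))   # 16
```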
xAI: Grok 4.20 Beta
x-ai/grok-4.20-beta

Grok 4.20 Beta is xAI's newest flagship model with industry-leading speed and agentic tool-calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delivering consistently precise and truthful responses. Reasoning can be enabled or disabled via the `enabled` field of the `reasoning` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

xAI · $2.00 in / $6.00 out per 1M tokens · Context: 2.0M · Type: text+image->text · Available
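As a sketch of the `reasoning` toggle mentioned above, here is an assumed chat-completions request body. The `reasoning.enabled` shape follows the listing's wording and the linked docs' general pattern; treat the exact field names as assumptions to verify against those docs:

```python
import json

# Assumed request body for an OpenRouter-style /chat/completions call.
# The `reasoning.enabled` field follows the listing's description; verify
# against the linked reasoning-tokens docs before relying on it.
payload = {
    "model": "x-ai/grok-4.20-beta",
    "messages": [{"role": "user", "content": "Summarize this RFC."}],
    "reasoning": {"enabled": False},  # disable reasoning tokens for this call
}
print(json.dumps(payload, indent=2))
```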
Hunter Alpha
openrouter/hunter-alpha

Hunter Alpha is a 1-trillion-parameter frontier intelligence model with a 1M-token context, built for agentic use. It excels at long-horizon planning, complex reasoning, and sustained multi-step task execution, with the reliability and instruction-following precision that frameworks like OpenClaw need. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

OpenRouter · $— in / $— out per 1M tokens · Context: 1.0M · Output: 32K · Type: text+image->text · Available
Healer Alpha
openrouter/healer-alpha

Healer Alpha is a frontier omni-modal model with vision, hearing, reasoning, and action capabilities. It brings the full power of agentic intelligence into the real world: natively perceiving visual and audio inputs, reasoning across modalities, and executing complex multi-step tasks with precision and reliability. **Note:** All prompts and completions for this model are logged by the provider and may be used to improve the model.

OpenRouter · $— in / $— out per 1M tokens · Context: 262K · Output: 32K · Type: text+image+audio+video->text · Available
NVIDIA: Nemotron 3 Super (free)
nvidia/nemotron-3-super-120b-a12b:free

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model that activates just 12B parameters per token for maximum compute efficiency and accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50% higher token-generation throughput than leading open models. The model features a 1M-token context window for long-term agent coherence, cross-document reasoning, and multi-step task planning. Latent MoE enables calling 4 experts for the inference cost of only one, improving intelligence and generalization. Multi-environment RL training across 10+ environments delivers leading accuracy on benchmarks including AIME 2025, TerminalBench, and SWE-Bench Verified. Fully open with weights, datasets, and recipes under the NVIDIA Open License, Nemotron 3 Super allows easy customization and secure deployment anywhere, from workstation to cloud.

NVIDIA · $— in / $— out per 1M tokens · Context: 262K · Output: 262K · Type: text->text · Available
ByteDance Seed: Seed-2.0-Lite
bytedance-seed/seed-2.0-lite

Seed-2.0-Lite is a versatile, cost-efficient enterprise workhorse that delivers strong multimodal and agent capabilities at noticeably lower latency, making it a practical default for most production workloads across text, vision, and tools. Engineered for high-frequency visual understanding and agentic workflows, it is well suited to deployment at scale.

ByteDance Seed · $0.25 in / $2.00 out per 1M tokens · Context: 262K · Output: 131K · Type: text+image+video->text · Available
Qwen: Qwen3.5-9B
qwen/qwen3.5-9b

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified vision-language design with early fusion of multimodal tokens, allowing the model to process and reason across text and images within the same context.

Qwen · $0.05 in / $0.15 out per 1M tokens · Context: 256K · Type: text+image+video->text · Available
OpenAI: GPT-5.4 Pro
openai/gpt-5.4-pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs. Optimized for step-by-step reasoning, instruction following, and accuracy, GPT-5.4 Pro excels at agentic coding, long-context workflows, and multi-step problem solving.

OpenAI · $30.00 in / $180.00 out per 1M tokens · Context: 1.1M · Output: 128K · Type: text+image+file->text · Available
OpenAI: GPT-5.4
openai/gpt-5.4

GPT-5.4 is OpenAI's latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for text and image inputs, enabling high-context reasoning, coding, and multimodal analysis within the same workflow. The model delivers improved performance in coding, document understanding, tool use, and instruction following. It is designed as a strong default for both general-purpose tasks and software engineering, capable of generating production-quality code, synthesizing information across multiple sources, and executing complex multi-step workflows with fewer iterations and greater token efficiency.

OpenAI · $2.50 in / $15.00 out per 1M tokens · Context: 1.1M · Output: 128K · Type: text+image+file->text · Available
Inception: Mercury 2
inception/mercury-2

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens sequentially, Mercury 2 produces and refines multiple tokens in parallel, achieving >1,000 tokens/sec on standard GPUs. Mercury 2 is 5x+ faster than leading speed-optimized LLMs like Claude 4.5 Haiku and GPT 5 Mini, at a fraction of the cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. Built for coding workflows where latency compounds, real-time voice/search, and agent loops. OpenAI API compatible. Read more in the [blog post](https://www.inceptionlabs.ai/blog/introducing-mercury-2).

Inception · $0.25 in / $0.75 out per 1M tokens · Context: 128K · Output: 50K · Type: text->text · Available
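Since Mercury 2 is described as OpenAI API compatible with schema-aligned JSON output, a request might look like the following sketch. The `response_format` shape mirrors the OpenAI `json_schema` structured-outputs convention and is an assumption here, not something the listing confirms:

```python
import json

# Assumed OpenAI-compatible chat-completions body targeting Mercury 2.
# `response_format` follows the OpenAI `json_schema` convention (assumption);
# verify against Inception's docs before relying on it.
payload = {
    "model": "inception/mercury-2",
    "messages": [{"role": "user", "content": "Extract the city and country."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}
print(json.dumps(payload)[:40])
```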
OpenAI: GPT-5.3 Chat
openai/gpt-5.3-chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualization and significantly reduces unnecessary refusals, caveats, and overly cautious phrasing that can interrupt conversational flow.

OpenAI · $1.75 in / $14.00 out per 1M tokens · Context: 128K · Output: 16K · Type: text+image+file->text · Available
Google: Gemini 3.1 Flash Lite Preview
google/gemini-3.1-flash-lite-preview

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance across key capabilities. Improvements span audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. Supports full thinking levels (minimal, low, medium, high) for fine-grained cost/performance trade-offs. Priced at half the cost of Gemini 3 Flash.

Google · $0.25 in / $1.50 out per 1M tokens · Context: 1.0M · Output: 66K · Type: text+image+file+audio+video->text · Available
ByteDance Seed: Seed-2.0-Mini
bytedance-seed/seed-2.0-mini

Seed-2.0-Mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference deployment. It delivers performance comparable to ByteDance-Seed-1.6, supports 256K context, four reasoning effort modes (minimal/low/medium/high), and multimodal understanding, and is optimized for lightweight tasks where cost and speed take priority.

ByteDance Seed · $0.10 in / $0.40 out per 1M tokens · Context: 262K · Output: 131K · Type: text+image+video->text · Available
Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)
google/gemini-3.1-flash-image-preview

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google's latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the [image_config API parameter](https://openrouter.ai/docs/features/multimodal/image-generation#image-aspect-ratio-configuration).

Google · $0.50 in / $3.00 out per 1M tokens · Context: 66K · Output: 66K · Type: text+image->text+image · Available
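As a sketch of the `image_config` parameter linked above, here is an assumed request body for an image-generation call. The `aspect_ratio` value and the `modalities` field follow the linked docs' general pattern; treat both as assumptions to verify there:

```python
import json

# Assumed request body for image generation with Nano Banana 2.
# `image_config.aspect_ratio` follows the linked docs' parameter name;
# the "16:9" value and the `modalities` field are illustrative assumptions.
payload = {
    "model": "google/gemini-3.1-flash-image-preview",
    "messages": [{"role": "user", "content": "A banana in a spacesuit."}],
    "modalities": ["image", "text"],          # request image output
    "image_config": {"aspect_ratio": "16:9"},  # control aspect ratio
}
print(json.dumps(payload, indent=2))
```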
Qwen: Qwen3.5-35B-A3B
qwen/qwen3.5-35b-a3b

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall performance is comparable to that of the Qwen3.5-27B.

Qwen · $0.163 in / $1.30 out per 1M tokens · Context: 262K · Output: 66K · Type: text+image+video->text · Available
Qwen: Qwen3.5-27B
qwen/qwen3.5-27b

The Qwen3.5 27B native vision-language dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities are comparable to those of the Qwen3.5-122B-A10B.

Qwen · $0.195 in / $1.56 out per 1M tokens · Context: 262K · Output: 66K · Type: text+image+video->text · Available
Qwen: Qwen3.5-122B-A10B
qwen/qwen3.5-122b-a10b

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. In overall performance, this model is second only to Qwen3.5-397B-A17B. Its text capabilities significantly outperform those of Qwen3-235B-2507, and its visual capabilities surpass those of Qwen3-VL-235B.

Qwen · $0.26 in / $2.08 out per 1M tokens · Context: 262K · Output: 66K · Type: text+image+video->text · Available
Qwen: Qwen3.5-Flash
qwen/qwen3.5-flash-02-23

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the 3 series, these models deliver a leap forward in performance for both pure text and multimodal tasks, offering fast response times while balancing inference speed and overall performance.

Qwen · $0.065 in / $0.26 out per 1M tokens · Context: 1.0M · Output: 66K · Type: text+image+video->text · Available
LiquidAI: LFM2-24B-A2B
liquid/lfm-2-24b-a2b

LFM2-24B-A2B is the largest model in the LFM2 family of hybrid architectures designed for efficient on-device deployment. Built as a 24B-parameter Mixture-of-Experts model with only 2B active parameters per token, it delivers high-quality generation while maintaining low inference costs. The model fits within 32 GB of RAM, making it practical to run on consumer laptops and desktops without sacrificing capability.

Liquid · $0.03 in / $0.12 out per 1M tokens · Context: 33K · Type: text->text · Available
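The claim that a 24B-parameter model fits within 32 GB of RAM is easy to sanity-check with a back-of-envelope estimate. This counts weight memory only, ignoring KV cache and runtime overhead, at a few common precisions (the precision actually used for the 32 GB figure is not stated in the listing):

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * bytes_per_param

for label, bpp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{label}: {weights_gb(24, bpp):.0f} GB")
# fp16: 48 GB -> exceeds 32 GB
# int8: 24 GB -> fits in 32 GB
# int4: 12 GB -> fits comfortably
```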
1–20 of 390 · page 1 / 20