Chrome Extension AI: How to Use Built-in On-Device and Cloud AI APIs in 2026
The AI landscape for developers just shifted. Android introduced a hybrid inference API that lets apps switch seamlessly between on-device and cloud AI depending on context. Chrome has done the same, quietly, and in a way that directly benefits Chrome extension developers.
As of Chrome 138+, Gemini Nano runs locally in the browser. No API key. No server round-trip. No per-token billing. Your extension can call natural language AI directly from a content script or background service worker.
This guide covers everything: what Chrome’s built-in AI APIs are, how they compare to cloud inference, when to use each, and practical code you can ship today.
What Is Chrome’s Built-in On-Device AI?
Chrome ships with Gemini Nano — a small but capable language model that runs entirely on the user’s device using the GPU and NPU. It powers a family of task-specific APIs exposed to extensions and web apps through the window.ai namespace (and equivalent APIs accessible from extension contexts).
The model never leaves the device. Processing happens locally. That means:
- Zero API cost per request
- No data sent to external servers
- Works offline
- Near-zero latency (no network round-trip)
The tradeoff is capability. Gemini Nano is not GPT-4o or Gemini Ultra. It is optimized for common, well-defined tasks — summarizing, translating, rewriting, classifying — not open-ended reasoning or complex multi-step tasks.
Chrome exposes five built-in AI APIs:
| API | Purpose |
|---|---|
| Prompt API | General-purpose language model prompting |
| Summarizer API | Condensing long text into summaries |
| Translator API | Language translation between 12+ languages |
| Language Detection API | Detecting the language of a text string |
| Writer / Rewriter API | Generating or rephrasing content |
Let us walk through each.
The 5 Chrome Built-in AI APIs for Extension Developers
1. Prompt API — General-Purpose On-Device Inference
The Prompt API is the most flexible of the five. It lets you send free-form prompts to Gemini Nano and receive a response — just like calling an LLM API, but entirely on-device.
```javascript
// Check availability before using
const { available } = await window.ai.languageModel.capabilities();

if (available === 'readily') {
  const session = await window.ai.languageModel.create({
    systemPrompt: 'You are a helpful assistant for Chrome extension users.'
  });

  const result = await session.prompt(
    'Summarize the key points from this support ticket: ' + ticketText
  );

  console.log(result);
  session.destroy(); // Free memory when done
}
```

Best for: Classification, intent detection, Q&A, custom instructions that do not fit a specialized API.
Availability check is mandatory. Not all devices have Gemini Nano ready — older hardware or users who have never triggered a download may return 'after-download' or 'no'. Always handle gracefully.
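Handling those three states consistently is easier with a small helper. The sketch below is a hypothetical utility (not part of the API); it only maps the `capabilities().available` values used in this guide — `'readily'`, `'after-download'`, `'no'` — to an action your extension can take.

```javascript
// Map a capabilities().available value to a plan of action.
// States follow the shape used throughout this guide.
function planForAvailability(available) {
  switch (available) {
    case 'readily':
      return { useOnDevice: true, needsDownload: false };
    case 'after-download':
      // Model must be fetched first; creating a session triggers the download.
      return { useOnDevice: true, needsDownload: true };
    default:
      // 'no' (or an unknown value): fall back to cloud or disable the feature.
      return { useOnDevice: false, needsDownload: false };
  }
}
```

Usage sketch (browser-only): `const plan = planForAvailability((await window.ai.languageModel.capabilities()).available);` — if `plan.useOnDevice` is false, route the request to your cloud fallback instead.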
2. Summarizer API — Condense Any Text
The Summarizer API is purpose-built for distillation. Pass it a block of text and get back a summary in the format you choose: `key-points`, `tl;dr`, `teaser`, or `headline`.
```javascript
const summarizerCapabilities = await window.ai.summarizer.capabilities();

if (summarizerCapabilities.available !== 'no') {
  const summarizer = await window.ai.summarizer.create({
    type: 'key-points',
    format: 'markdown',
    length: 'short'
  });

  // Trigger download if needed
  if (summarizerCapabilities.available === 'after-download') {
    await summarizer.ready;
  }

  const summary = await summarizer.summarize(articleText);
  console.log(summary);
  summarizer.destroy();
}
```

Best for: Article digests, email triage, meeting notes, content previews in sidepanels.
3. Translator API — Multilingual Without an API Key
The Translator API handles translation between a growing list of language pairs. Unlike Google Translate or DeepL, there is no cost and no network dependency after the language pack is downloaded.
```javascript
const languagePair = { sourceLanguage: 'en', targetLanguage: 'es' };
const translatorCapabilities = await translation.canTranslate(languagePair);

if (translatorCapabilities !== 'no') {
  const translator = await translation.createTranslator(languagePair);

  if (translatorCapabilities === 'after-download') {
    await translator.ready;
  }

  const translated = await translator.translate('Hello, how can I help you?');
  console.log(translated); // "Hola, ¿cómo puedo ayudarte?"
}
```

Best for: Real-time sidepanel translation, multilingual form helpers, customer support extensions.
4. Language Detection API — Know What Language You Are Working With
Before translating or routing content, you need to know what language it is in. The Language Detection API gives you that with confidence scores — entirely on-device.
```javascript
const detector = await translation.createDetector();
const results = await detector.detect(userInputText);

// results is an array of { detectedLanguage, confidence }
const topResult = results[0];
console.log(topResult.detectedLanguage); // e.g. "fr"
console.log(topResult.confidence);       // e.g. 0.97
```

Best for: Auto-routing to the correct translation pair, tagging content by language, skipping translation when unnecessary.
5. Writer and Rewriter APIs — Generate and Refine Content
The Writer API generates new content from a prompt and context. The Rewriter API takes existing content and transforms it — changing tone, length, or formality — without rewriting intent.
```javascript
// Writer: generate from scratch
const writer = await window.ai.writer.create({
  tone: 'formal',
  length: 'short',
  format: 'plain-text'
});

const reply = await writer.write(
  'Write a professional response to a refund request',
  { context: 'Customer says: "I want a refund for order #1234"' }
);

// Rewriter: transform existing content
const rewriter = await window.ai.rewriter.create({
  tone: 'more-casual',
  length: 'shorter'
});

const simplified = await rewriter.rewrite(technicalDocText);
```

Best for: Smart reply suggestions, tone adjustment for emails, content rephrasing for accessibility.
On-Device vs Cloud AI: The Full Comparison
This is the core decision every extension developer faces. Here is the unvarnished comparison:
| | Built-in AI (On-Device) | Cloud API (OpenAI, Gemini, Claude) |
|---|---|---|
| Latency | ~50–200ms (local GPU) | 500ms–3s+ (network) |
| Privacy | Data never leaves device | Data sent to provider servers |
| Cost | Free | Per-token billing |
| Offline Support | Full (after model download) | None |
| Model Quality | Good for simple tasks | Excellent for complex reasoning |
| Context Window | Limited (~4K tokens typical) | Large (8K–200K+ tokens) |
| Availability | Chrome 138+, compatible hardware | Any browser, any device |
| Streaming | Supported | Supported |
| Fine-tuning | Not available | Available (some providers) |
| Setup Complexity | No keys, no backend | API keys, CORS handling, rate limits |
Neither option is universally better. The right answer depends on what you are building.
When to Use Built-in AI vs External APIs
Use Chrome’s built-in AI when:
- The task is well-defined. Summarization, translation, classification, and rewriting are exactly what these APIs are optimized for. Do not use the cloud for what Gemini Nano does well locally.
- Privacy is a selling point. If your extension handles sensitive documents, emails, or health data, on-device inference is a hard requirement, not a nice-to-have. Users and enterprise buyers will ask.
- Your users include power users with no internet. Researchers, writers, and analysts who use extensions offline need AI that works offline.
- You have zero infrastructure budget. Building a side project or MVP? On-device AI means no API billing, no key management, no backend proxy.
- Latency matters more than accuracy. Real-time sidepanel suggestions, autocomplete, or live translation require sub-second responses that cloud APIs cannot reliably deliver.
Use cloud APIs when:
- You need high-quality reasoning. Complex analysis, multi-step instructions, code generation, or nuanced content creation require a larger model. Gemini Nano will not outperform GPT-4o or Gemini 1.5 Pro on hard tasks.
- You need large context windows. Analyzing a full PDF, summarizing an entire repository, or processing long legal documents requires more than Gemini Nano’s context limit.
- Your users are on older hardware or non-Chrome browsers. Built-in AI requires Chrome 138+ and compatible GPU/NPU. Firefox and Safari users get nothing.
- You need model-specific capabilities. Vision inputs, structured JSON output with complex schemas, or tool use at scale require cloud providers.
- Consistency and auditability matter. Cloud providers offer versioned models, audit logs, and compliance certifications that on-device inference cannot match.
The hybrid approach (recommended)
The smartest architecture mirrors what Android’s hybrid inference API does: attempt on-device first, fall back to cloud when necessary.
```javascript
async function smartSummarize(text) {
  // Try on-device first
  const capabilities = await window.ai.summarizer.capabilities();

  if (capabilities.available === 'readily') {
    const summarizer = await window.ai.summarizer.create({
      type: 'tl;dr',
      length: 'short'
    });
    const result = await summarizer.summarize(text);
    summarizer.destroy();
    return { result, source: 'on-device' };
  }

  // Fall back to cloud. apiKey should come from secure storage or a backend
  // proxy — never hard-code it in the extension bundle.
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${apiKey}`
    },
    body: JSON.stringify({
      model: 'gpt-4o-mini',
      messages: [
        { role: 'system', content: 'Summarize the following text in 2-3 sentences.' },
        { role: 'user', content: text }
      ]
    })
  });

  const data = await response.json();
  return { result: data.choices[0].message.content, source: 'cloud' };
}
```

This pattern gives users the best of both: fast, private, free inference when available, with a reliable fallback when not.
Practical Use Cases for On-Device AI in Extensions
Smart Email Assistant
A Gmail or Outlook extension that reads the current email thread and suggests a reply — entirely on-device. Use the Prompt API for intent detection, Writer API for draft generation, and Rewriter API for tone adjustment.
No email content ever leaves the browser. That is a feature, not a footnote.
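A minimal sketch of that pipeline is below. The `ai` parameter stands in for `window.ai` (injected so the flow is testable outside the browser); the option names and session shapes mirror the API forms shown earlier in this guide, and the intent labels are illustrative.

```javascript
// On-device reply pipeline: classify intent, then draft a reply.
// `ai` is injected (window.ai in the extension) so the flow stays testable.
async function suggestReply(ai, threadText) {
  // 1. Intent detection via the Prompt API.
  const session = await ai.languageModel.create({
    systemPrompt: 'Classify the email intent as one word: question, complaint, or request.'
  });
  const intent = (await session.prompt(threadText)).trim().toLowerCase();
  session.destroy(); // Free memory when done

  // 2. Draft a reply with the Writer API, passing the thread as context.
  const writer = await ai.writer.create({ tone: 'formal', length: 'short' });
  const draft = await writer.write(
    `Reply to this ${intent} email`,
    { context: threadText }
  );

  return { intent, draft };
}
```

In the extension itself you would call `suggestReply(window.ai, threadText)` and hand the draft to the Rewriter API if the user asks for a tone change.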
Content Summarizer Sidepanel
A reading extension that adds a sidepanel to any article page. When opened, it calls the Summarizer API with the page’s main content and displays a key-points list in under 200ms. Works offline. Zero cost at scale.
```javascript
// content script
chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
  if (msg.type === 'summarize') {
    // Do the async work in a separate function: an async listener returns a
    // Promise, not the literal `true` Chrome needs to keep the channel open.
    (async () => {
      const articleText =
        document.querySelector('article')?.innerText || document.body.innerText;
      const capabilities = await window.ai.summarizer.capabilities();

      if (capabilities.available !== 'no') {
        const summarizer = await window.ai.summarizer.create({
          type: 'key-points',
          format: 'markdown',
          length: 'medium'
        });
        if (capabilities.available === 'after-download') {
          await summarizer.ready;
        }
        const summary = await summarizer.summarize(articleText.slice(0, 8000));
        summarizer.destroy();
        sendResponse({ summary });
      } else {
        sendResponse({ summary: null });
      }
    })();
    return true; // Keep message channel open for async response
  }
});
```

Real-Time Translator Sidebar
A sidepanel extension that translates selected text as the user highlights it. The Language Detection API identifies the source language automatically. The Translator API handles conversion. The result appears in the sidebar within milliseconds.
Add a cloud fallback for unsupported language pairs and you have a production-grade translation tool built entirely with browser-native APIs.
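That detect-then-translate flow can be sketched as a single function. The `translation` parameter stands in for Chrome's global `translation` object (injected so the logic is testable); the method names match the Translator and Language Detection examples above, and returning `translated: null` is this sketch's convention for "use the cloud fallback".

```javascript
// Detect the source language of selected text, then translate it on-device.
async function translateSelection(translation, text, targetLanguage) {
  const detector = await translation.createDetector();
  const [top] = await detector.detect(text); // highest-confidence result first

  // Skip translation when the text is already in the target language.
  if (top.detectedLanguage === targetLanguage) {
    return { translated: text, sourceLanguage: top.detectedLanguage, skipped: true };
  }

  const pair = { sourceLanguage: top.detectedLanguage, targetLanguage };
  if (await translation.canTranslate(pair) === 'no') {
    // Unsupported pair: signal the caller to use a cloud fallback.
    return { translated: null, sourceLanguage: top.detectedLanguage, skipped: false };
  }

  const translator = await translation.createTranslator(pair);
  return {
    translated: await translator.translate(text),
    sourceLanguage: top.detectedLanguage,
    skipped: false
  };
}
```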
Manifest V3 Compatibility Notes
All five Chrome built-in AI APIs are compatible with Manifest V3. A few things to keep in mind:
- Service workers cannot use streaming responses the same way as pages. Use `promptStreaming()` in content scripts or sidepanels, not background service workers.
- Session objects are not persistent. Service workers are terminated when idle. Create new sessions on demand; do not try to persist `window.ai.languageModel` sessions across wake cycles.
- No special permissions required for built-in AI APIs. You do not need to declare any new permissions in `manifest.json` to use these APIs.
- Content Security Policy. Built-in AI APIs make no external network requests, so they are fully compatible with strict CSP settings.
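For streaming in a sidepanel, a plain reader loop is enough. This sketch assumes `promptStreaming()` returns a `ReadableStream` of text chunks; note that depending on the Chrome version, each chunk may be a delta or the cumulative text so far, so check before concatenating yourself.

```javascript
// Consume a ReadableStream of text chunks (as returned by promptStreaming())
// and forward each chunk to a UI callback as it arrives.
async function consumeStream(stream, onChunk) {
  const reader = stream.getReader();
  const chunks = [];
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      chunks.push(value);
      onChunk(value); // e.g. append to the sidepanel DOM
    }
  } finally {
    reader.releaseLock();
  }
  return chunks;
}
```

Usage sketch in a sidepanel: `await consumeStream(session.promptStreaming(userPrompt), chunk => output.append(chunk));`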
Getting Started Today
Chrome’s built-in AI APIs are available in Chrome 138+ with the “Prompt API for Chrome Extensions” flag enabled for development. For production, Summarizer, Translator, and Language Detection are stable. The Prompt API, Writer, and Rewriter are in origin trial as of early 2026.
Check the Chrome AI APIs origin trial status for the latest availability.
To test your extension’s AI features across different device capabilities, use ExtensionBooster’s free developer tools — including a Chrome extension AI compatibility checker and automated store listing analyzer that helps you communicate your extension’s AI features clearly to users.
The era of “AI tax” — where every inference request burns API budget — is ending for common browser tasks. Start with on-device. Fall back to cloud only when you must.