Scribe

Name: Scribe
Author: anthropics

anthropics/knowledge-work-plugins

1.4k installs
23.1k repo stars
Updated July 28, 2026

scribe is an agent skill that serves as a reference for Zoom AI Services Scribe. Use it after routing to a transcription workflow that involves uploaded or stored media, Build-platform JWT auth, or fast mode.

About

The scribe skill is a reference for Zoom AI Services Scribe. Use it after routing to a transcription workflow that involves uploaded or stored media, Build-platform JWT auth, or fast mode. If the user needs live meeting media without file-based upload/batch jobs, route to ../rtms/SKILL.md. If the user needs Zoom REST API inventory for AI Services paths, chain ../rest-api/SKILL.md. Invoke when the user asks about scribe or related SKILL.md workflows. It covers:

synchronous single-file transcription (POST /aiservices/scribe/transcribe).
asynchronous batch jobs (/aiservices/scribe/jobs*).
browser microphone pseudo-streaming via repeated short file uploads.
webhook-driven batch status updates.
Build-platform JWT generation and credential handling.

Scribe by the numbers

1,432 all-time installs (skills.sh)
+85 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #384 of 1,881 Marketing & SEO skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/anthropics/knowledge-work-plugins --skill scribe

Add your badge

Show developers this skill is listed on Skillselion. Paste this into your README.

[![Listed on Skillselion](https://skillselion.com/badge/skills/anthropics/knowledge-work-plugins/scribe.svg)](https://skillselion.com/skills/anthropics/knowledge-work-plugins/scribe)

Installs	1.4k
repo stars	★ 23.1k
Security audit	2 / 3 scanners passed
Last updated	July 28, 2026
Repository	anthropics/knowledge-work-plugins ↗

How do I use the Zoom AI Services Scribe reference skill?

Use it after routing to a transcription workflow that involves uploaded or stored media, Build-platform JWT auth, or fast mode.

Who is it for?

Developers using scribe workflows documented in SKILL.md.

Skip if: the task falls outside scribe scope or needs a different stack.

When should I use this skill?

User asks about scribe or related SKILL.md workflows.

What you get

Completed scribe workflow with documented commands, files, and expected deliverables.

JWT generation code
transcription integration config

By the numbers

JWT expiration should be kept to one hour or less
Uses HS256 algorithm for bearer token signing

Files

SKILL.md (Markdown, GitHub ↗)

Zoom AI Services Scribe

Background reference for Zoom AI Services Scribe across:

synchronous single-file transcription (POST /aiservices/scribe/transcribe)
asynchronous batch jobs (/aiservices/scribe/jobs*)
browser microphone pseudo-streaming via repeated short file uploads
webhook-driven batch status updates
Build-platform JWT generation and credential handling

Official docs:

https://developers.zoom.us/docs/ai-services/
https://developers.zoom.us/docs/ai-services/scribe/
https://developers.zoom.us/docs/api/ai-services/
https://developers.zoom.us/api-hub/ai-services/methods/endpoints.json
Quickstart sample: https://github.com/zoom/scribe-quickstart/

Routing Guardrail

If the user needs uploaded or stored media transcribed into text, route here first.
If the user needs live meeting media without file-based upload/batch jobs, route to ../rtms/SKILL.md.
If the user needs Zoom REST API inventory for AI Services paths, chain ../rest-api/SKILL.md.
If the user needs webhook signature patterns or generic HMAC receiver hardening, optionally chain ../webhooks/SKILL.md.

Quick Links

1. concepts/auth-and-processing-modes.md
2. scenarios/high-level-scenarios.md
3. examples/fast-mode-node.md
4. examples/batch-webhook-pipeline.md
5. references/api-reference.md
6. references/environment-variables.md
7. references/samples-validation.md
8. references/versioning-and-drift.md
9. troubleshooting/common-drift-and-breaks.md
10. RUNBOOK.md

Core Workflow

1. Get Build-platform credentials and generate an HS256 JWT.
2. Choose fast mode for one short file or batch mode for stored archives / large sets.
3. Submit the transcription request.
4. For batch jobs, poll job/file status or receive webhook notifications.
5. Persist and post-process transcript JSON.

Hosted Fast-Mode Guardrail

The formal fast-mode API limits are 100 MB and 2 hours, but hosted browser flows can still time out before the upstream response returns.
Current deployed-sample observations:
~17.2 MB MP4 completed in about 26s
~38.6 MB MP4 completed in about 26-37s
~59.2 MB MP4 completed in about 32-34s on the backend
some ~59.2 MB browser requests still surfaced as frontend 504 while backend logs later showed 200
Treat frontend 504 plus backend 200 as a browser/edge timeout race, not an automatic transcription failure.
For hosted UIs, prefer an async request/polling wrapper for fast mode instead of holding the browser open for the full upstream response.
For larger or less predictable media, prefer batch mode even when the file is still within the formal fast-mode size limit.
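
The async request/polling wrapper recommended above could be sketched roughly as follows. This is a minimal in-memory illustration, not the quickstart's implementation: `createAsyncWrapper`, `submit`, `poll`, the `req-N` ID format, and the status strings are all hypothetical names, and `transcribe` stands in for whatever function performs the real fast-mode call.

```javascript
// Hypothetical in-memory async wrapper around a fast-mode call.
// `transcribe` is any function returning a Promise of the upstream JSON;
// it stands in for the real POST /aiservices/scribe/transcribe request.
function createAsyncWrapper(transcribe) {
  const requests = new Map(); // requestId -> { status, result?, error? }
  let nextId = 1;

  // Called by the upload route: start the work and return immediately (HTTP 202).
  function submit(input) {
    const requestId = `req-${nextId++}`;
    requests.set(requestId, { status: 'processing' });
    Promise.resolve()
      .then(() => transcribe(input))
      .then((result) => requests.set(requestId, { status: 'done', result }))
      .catch((err) =>
        requests.set(requestId, { status: 'failed', error: String((err && err.message) || err) }),
      );
    return { httpStatus: 202, requestId };
  }

  // Called by the polling route: report the current state for a request ID.
  function poll(requestId) {
    return requests.get(requestId) || { status: 'unknown' };
  }

  return { submit, poll };
}
```

Because the browser only waits for the 202, a slow upstream response can no longer turn into a frontend 504. A production version would likely need persistent storage and expiry for finished requests, which this sketch leaves out.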

Browser Microphone Pattern

scribe does not expose a documented real-time streaming API surface.
If you want a browser microphone experience, use pseudo-streaming:

1. capture microphone audio in short chunks
2. upload each chunk through the async fast-mode wrapper
3. poll for completion
4. append chunk transcripts in sequence

Recommended starting cadence:
chunk size: 5 seconds
acceptable range: 5-10 seconds
in-flight chunk requests: 2-3
This is a practical UI pattern for incremental transcript updates, not a substitute for rtms.
Treat this as a fallback demo pattern, not the preferred production architecture.
It adds repeated upload overhead, chunk-boundary drift, browser codec/container variability, and transcript stitching complexity.
If the user asks for actual live stream ingestion, low-latency continuous media, or server-push media transport, route to ../rtms/SKILL.md instead.
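
With 2-3 chunk requests in flight, later chunks can finish before earlier ones, so "append chunk transcripts in sequence" needs a small reorder buffer. The sketch below shows one way to do that; `createTranscriptStitcher` and its zero-based chunk index are illustrative assumptions, not part of the Scribe API.

```javascript
// Buffers chunk transcripts that finish out of order and releases them in sequence.
function createTranscriptStitcher() {
  const pending = new Map(); // chunk index -> transcript text
  const parts = [];          // transcripts already released, in order
  let nextIndex = 0;         // next chunk index we are allowed to append

  function add(index, text) {
    pending.set(index, text);
    // Flush every contiguous chunk starting at the next expected index.
    while (pending.has(nextIndex)) {
      parts.push(pending.get(nextIndex));
      pending.delete(nextIndex);
      nextIndex += 1;
    }
    // Skip empty chunk transcripts so they do not add stray spaces.
    return parts.filter(Boolean).join(' ');
  }

  return { add };
}

const stitcher = createTranscriptStitcher();
console.log(stitcher.add(1, 'world')); // "" (chunk 0 has not arrived yet)
console.log(stitcher.add(0, 'hello')); // "hello world"
```

This only fixes ordering. It does not repair words split across chunk boundaries, which is part of the stitching complexity noted above.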

Endpoint Surface

Mode	Method	Path	Use
Fast	`POST`	`/aiservices/scribe/transcribe`	Synchronous transcription for one file
Batch	`POST`	`/aiservices/scribe/jobs`	Submit asynchronous batch job
Batch	`GET`	`/aiservices/scribe/jobs`	List jobs
Batch	`GET`	`/aiservices/scribe/jobs/{jobId}`	Inspect job summary/state
Batch	`DELETE`	`/aiservices/scribe/jobs/{jobId}`	Cancel queued/processing job
Batch	`GET`	`/aiservices/scribe/jobs/{jobId}/files`	Inspect per-file results

High-Level Scenarios

On-demand clip transcription after a user uploads one recording.
Batch transcription of stored S3 call archives.
Webhook-driven ETL pipeline that writes transcripts to your database/search index.
Re-transcription of Zoom-managed recordings after exporting them to your own storage.
Offline compliance or QA workflows that need timestamps, channel separation, and speaker hints.

Chaining

Stored Zoom recordings -> ../rest-api/SKILL.md + scribe
Webhook verification hardening -> ../webhooks/SKILL.md
Real-time live transcript/media -> ../rtms/SKILL.md
Cross-product routing -> ../general/SKILL.md

Operations

RUNBOOK.md - 5-minute preflight and debugging checklist.

Auth and Processing Modes

Authentication Model

Scribe uses a Build-platform JWT bearer token.

JWT shape:

algorithm: HS256
issuer claim: Build-platform credential identifier used by the Scribe API
expiration: keep to one hour or less

Node example:

import { KJUR } from 'jsrsasign';

export function generateJWT(apiKey, apiSecret) {
  const iat = Math.round(Date.now() / 1000) - 30;
  const exp = iat + 60 * 60;
  return KJUR.jws.JWS.sign(
    'HS256',
    JSON.stringify({ alg: 'HS256', typ: 'JWT' }),
    JSON.stringify({ iss: apiKey, iat, exp }),
    apiSecret,
  );
}

Credential Naming Drift

Zoom docs currently use inconsistent labels across AI Services pages:

API key / API secret
SDK key / SDK secret
Build platform credentials

For implementation, treat them as the Build-platform JWT issuer/secret pair used to sign Scribe requests. Verify the exact labels in the current portal UI before shipping.

Fast Mode vs Batch Mode

Mode	Best for	Transport	Result timing
Fast mode	One short file, interactive UX	`POST /transcribe`	Immediate synchronous JSON
Batch mode	Archives, long media, many files	`POST /jobs` then status/webhook	Asynchronous

Fast Mode Request Shape

required: file, config
common config: language, word_time_offsets, channel_separation, timestamps, output_format, profanity_filter, diarization

Batch Mode Request Shape

required: input, output, config
input modes: SINGLE, PREFIX, MANIFEST
storage provider currently surfaced in the OpenAPI as S3
optional webhook callback: notifications.webhook_url + notifications.secret

Operational Choice

Choose fast mode when:

user uploads one file
latency matters more than throughput
file size and duration are manageable
you are building pseudo-streaming over short microphone chunks from a browser UI

Choose batch mode when:

many files must be processed
transcripts can arrive later
storage-centric workflows fit better than direct upload
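
The choice above can be captured in a small helper. This is only a sketch of the decision rules as written in this document: `chooseMode` and its parameter names are hypothetical, and reading "100 MB" as 100 × 1024 × 1024 bytes is an assumption that should be checked against the current API spec.

```javascript
// Fast-mode limits as quoted in this document.
const FAST_MAX_BYTES = 100 * 1024 * 1024; // assumption: "100 MB" interpreted as MiB
const FAST_MAX_SECONDS = 2 * 60 * 60;     // 2 hours

// Returns 'fast' or 'batch' for a hypothetical request description.
function chooseMode({ fileCount, sizeBytes, durationSec, needsImmediateResult }) {
  if (fileCount > 1) return 'batch'; // many files must be processed
  if (sizeBytes > FAST_MAX_BYTES || durationSec > FAST_MAX_SECONDS) return 'batch';
  return needsImmediateResult ? 'fast' : 'batch'; // transcripts can arrive later -> batch
}

console.log(chooseMode({ fileCount: 1, sizeBytes: 5_000_000, durationSec: 60, needsImmediateResult: true }));
// "fast"
```

For hosted browser UIs, a file that passes these checks can still hit an edge timeout, so the thresholds here are upper bounds rather than a guarantee that fast mode is comfortable.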

Browser Microphone Pseudo-Streaming

Scribe is file-oriented, so a browser microphone UX should be modeled as repeated short uploads, not a long-lived stream.

Recommended pattern:

1. capture browser microphone audio with MediaRecorder
2. flush short chunks to your backend
3. submit each chunk through the async fast-mode wrapper
4. poll by request ID
5. append transcript chunks in order

Recommended starting values:

chunk size: 5 seconds
acceptable range: 5-10 seconds
concurrent in-flight chunks: 2-3

Why this works:

lowers the chance of frontend 504 on longer synchronous requests
gives incremental transcript updates without waiting for one long request

Guardrail:

this is pseudo-streaming over file uploads
this is not the preferred production design for live audio capture
use it only when a lightweight browser demo or rough incremental transcript is acceptable
avoid it when you need stable low-latency live transcription, lower overhead, or stronger continuity across utterances
for true live media streams, low-latency server ingest, or continuous in-meeting audio, use rtms

Batch Job + Webhook Pipeline

Use batch mode when you need to process stored archives asynchronously.

Flow

submit batch job
  -> receive job_id
  -> poll /jobs or wait for webhook
  -> inspect /jobs/{jobId}/files
  -> ingest transcript outputs
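
The polling branch of this flow might look like the sketch below. The set of job `state` values is not listed in this document, so the terminal-state check is supplied by the caller. `waitForJob`, `getJob`, `isTerminal`, and the default interval and attempt count are all illustrative assumptions.

```javascript
// Polls a job until the caller-supplied predicate says its state is terminal.
// `getJob` is any async function returning the job JSON, for example a wrapper
// around GET /aiservices/scribe/jobs/{jobId}.
async function waitForJob(getJob, isTerminal, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const job = await getJob();
    if (isTerminal(job.state)) return job;
    // Wait before the next poll so the job endpoint is not hammered.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job still not terminal after ${maxAttempts} polls`);
}
```

Once the job is terminal, inspect `/jobs/{jobId}/files` for per-file results rather than relying on the job summary alone. For large volumes, a webhook is usually a better fit than a tight polling loop.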

Submit Example

curl -X POST https://api.zoom.us/v2/aiservices/scribe/jobs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "mode": "PREFIX",
      "source": "S3",
      "uri": "s3://example-bucket/audio/",
      "auth": {
        "aws": {
          "access_key_id": "...",
          "secret_access_key": "...",
          "session_token": "..."
        }
      }
    },
    "output": {
      "destination": "S3",
      "uri": "s3://example-bucket/transcripts/",
      "layout": "PREFIX",
      "auth": {
        "aws": {
          "access_key_id": "...",
          "secret_access_key": "...",
          "session_token": "..."
        }
      }
    },
    "config": {
      "language": "en-US",
      "word_time_offsets": true,
      "channel_separation": true
    },
    "notifications": {
      "webhook_url": "https://example.com/webhooks/scribe",
      "secret": "replace-me"
    }
  }'

Webhook Verification Pattern

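import crypto from 'crypto';

function verifyZoomWebhook(rawBody, timestamp, signature, secret) {
  // Sign the exact bytes received, before any JSON parsing.
  const message = `v0:${timestamp}:${rawBody}`;
  const expected = `sha256=${crypto.createHmac('sha256', secret).update(message).digest('hex')}`;
  const received = Buffer.from(String(signature ?? ''));
  const computed = Buffer.from(expected);
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === computed.length && crypto.timingSafeEqual(received, computed);
}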

Fast Mode Node Example

Minimal backend proxy for synchronous transcription.

import express from 'express';
import multer from 'multer';
import { KJUR } from 'jsrsasign';

const app = express();
const upload = multer({ storage: multer.memoryStorage() });
app.use(express.json());

function generateJWT() {
  const iat = Math.round(Date.now() / 1000) - 30;
  const exp = iat + 60 * 60;
  return KJUR.jws.JWS.sign(
    'HS256',
    JSON.stringify({ alg: 'HS256', typ: 'JWT' }),
    JSON.stringify({ iss: process.env.ZOOM_API_KEY, iat, exp }),
    process.env.ZOOM_API_SECRET,
  );
}

app.post('/transcribe', upload.single('file'), async (req, res) => {
  const token = generateJWT();
  const config = {
    language: req.body.language || 'en-US',
    word_time_offsets: true,
    channel_separation: false,
  };

  let response;
  if (req.file) {
    const form = new FormData();
    form.append('file', new Blob([new Uint8Array(req.file.buffer)]), req.file.originalname);
    form.append('config', JSON.stringify(config));
    response = await fetch('https://api.zoom.us/v2/aiservices/scribe/transcribe', {
      method: 'POST',
      headers: { Authorization: `Bearer ${token}` },
      body: form,
    });
  } else {
    response = await fetch('https://api.zoom.us/v2/aiservices/scribe/transcribe', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        file: req.body.file,
        config,
      }),
    });
  }

  const text = await response.text();
  res.status(response.status).type('application/json').send(text);
});

Use this pattern when:

the caller uploads a file to your backend and you forward it as multipart
or the caller already has a URL-accessible media file and you submit the JSON URL form

Zoom AI Services Scribe API Reference

Canonical sources:

OpenAPI JSON: https://developers.zoom.us/api-hub/ai-services/methods/endpoints.json
Docs overview: https://developers.zoom.us/docs/ai-services/scribe/
Base URL: https://api.zoom.us/v2

Endpoint Inventory

Method	Endpoint	Summary	Operation ID
POST	`/aiservices/scribe/transcribe`	Scribe (Synchronous)	`createFastAsr`
POST	`/aiservices/scribe/jobs`	Submit Batch Scribe Job	`submitBatchAsr`
GET	`/aiservices/scribe/jobs`	List Batch Jobs	`listBatchJobs`
GET	`/aiservices/scribe/jobs/{jobId}`	Get Batch Job Status	`getBatchJobStatus`
DELETE	`/aiservices/scribe/jobs/{jobId}`	Cancel Batch Job	`cancelBatchJob`
GET	`/aiservices/scribe/jobs/{jobId}/files`	List Batch Job Files	`listBatchJobFiles`

Request Shapes

`POST /aiservices/scribe/transcribe`

Required top-level fields:

file
config

Common config fields:

language
word_time_offsets
channel_separation
timestamps
output_format
profanity_filter
diarization

Response keys:

request_id
duration_sec
model
result

`POST /aiservices/scribe/jobs`

Required top-level fields:

input
output
config

Input subfields:

mode (SINGLE, PREFIX, MANIFEST)
source (S3 in current spec)
uri
manifest
filters.include_globs
filters.exclude_globs
auth.aws.access_key_id
auth.aws.secret_access_key
auth.aws.session_token

Output subfields:

destination
uri
layout (SINGLE, PREFIX, ADJACENT)
auth.aws.*

Config subfields:

language
word_time_offsets
channel_separation
diarization
profanity_filter
output_format
segmentation_mode

Optional:

reference_id
notifications.webhook_url
notifications.secret

Response keys:

job_id
state
submitted_at

`GET /aiservices/scribe/jobs`

Query params:

state
page_size
next_page_token

Response keys:

jobs
next_page_token

`GET /aiservices/scribe/jobs/{jobId}`

Path params:

jobId

Response keys:

job_id
state
submitted_at
summary

`GET /aiservices/scribe/jobs/{jobId}/files`

Path params:

jobId

Query params:

page_size
next_page_token

Response keys:

files
next_page_token
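
Both list endpoints page with `next_page_token`. A minimal sketch of draining every page of the files listing is shown below; `listAllFiles` and `fetchPage` are hypothetical names, and `fetchPage(token)` is assumed to wrap the real `GET /aiservices/scribe/jobs/{jobId}/files` call and return its parsed JSON.

```javascript
// Collects `files` across all pages by following next_page_token.
async function listAllFiles(fetchPage) {
  const files = [];
  let token; // undefined on the first request
  do {
    const page = await fetchPage(token);
    files.push(...(page.files || []));
    token = page.next_page_token; // empty or missing means no more pages
  } while (token);
  return files;
}
```

The same loop shape applies to `GET /aiservices/scribe/jobs` by reading `jobs` instead of `files`.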

Current Limits and Constraints Observed in Sources

Batch manifest max: 1000 file URIs.
include_globs max items: 10.
exclude_globs max items: 10.
Audio/media formats called out in docs: WAV, MP3, M4A, MP4.
Batch job rate limit label in the OpenAPI description: LIGHT.
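
The numeric limits above are cheap to check before submitting a job. The sketch below assumes `input.manifest` is an array of file URIs, which this document does not confirm; `checkBatchInputLimits` is a hypothetical helper name.

```javascript
// Returns a list of human-readable problems; an empty list means the limits are respected.
// Assumption: `input.manifest` is an array of URIs (verify against the OpenAPI spec).
function checkBatchInputLimits(input) {
  const problems = [];
  const manifest = input.manifest || [];
  const filters = input.filters || {};
  const include = filters.include_globs || [];
  const exclude = filters.exclude_globs || [];

  if (manifest.length > 1000) problems.push(`manifest has ${manifest.length} URIs (max 1000)`);
  if (include.length > 10) problems.push(`include_globs has ${include.length} items (max 10)`);
  if (exclude.length > 10) problems.push(`exclude_globs has ${exclude.length} items (max 10)`);
  return problems;
}

console.log(checkBatchInputLimits({ mode: 'PREFIX', uri: 's3://example-bucket/audio/' })); // []
```

Failing locally with a clear message is usually easier to debug than a schema error from the API.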

Environment Variables

Required for JWT Auth

Variable	Required	Description
`ZOOM_API_KEY`	Yes	Build-platform issuer key used in the JWT `iss` claim
`ZOOM_API_SECRET`	Yes	Build-platform signing secret for `HS256` JWT generation

Do not treat shell placeholders such as ${ZOOM_API_KEY} as valid configured values.
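
One way to enforce this is a small guard that runs at startup or in the health check. The sketch below is illustrative: `isRealCredential` is a hypothetical helper, and the pattern only catches unresolved shell-style placeholders such as `${NAME}` or `$NAME`.

```javascript
// Returns true only for non-empty values that are not unresolved shell placeholders.
function isRealCredential(value) {
  if (typeof value !== 'string') return false;
  const trimmed = value.trim();
  if (trimmed === '') return false;
  // Matches literal placeholders such as ${ZOOM_API_KEY} or $ZOOM_API_KEY.
  return !/^\$\{?[A-Za-z_][A-Za-z0-9_]*\}?$/.test(trimmed);
}

console.log(isRealCredential('${ZOOM_API_KEY}')); // false
console.log(isRealCredential('abc123'));          // true
```

A health check can call this for `process.env.ZOOM_API_KEY` and `process.env.ZOOM_API_SECRET` and fail fast with a clear credential error before any Zoom call is attempted.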

Common App Variables

Variable	Required	Description
`PORT`	No	Local server port
`LANGUAGE`	No	Default language code such as `en-US`

Batch / S3 Variables

Variable	Required for batch	Description
`S3_INPUT_URI`	Usually	Input prefix or file URI
`S3_OUTPUT_URI`	Usually	Output transcript destination
`AWS_ACCESS_KEY_ID`	If not using pre-signed access	AWS credential
`AWS_SECRET_ACCESS_KEY`	If not using pre-signed access	AWS credential
`AWS_SESSION_TOKEN`	Often	Temporary credential token

Webhook Variables

Variable	Required	Description
`WEBHOOK_URL`	Optional	Public HTTPS callback for batch notifications
`WEBHOOK_SECRET`	Optional but recommended	HMAC secret used to verify Zoom callback signatures

Where to Find These Values

Build-platform credentials: Zoom developer portal / Build app credential page.
S3 URIs: your cloud storage path design.
AWS credentials: IAM or STS-issued temporary credentials.
Webhook URL: public HTTPS endpoint you control.

Scribe 5-Minute Preflight Runbook

Use this before deep debugging.

1) Confirm the Right Product

File-based or storage-based transcription -> stay on scribe.
Live meeting media stream or botless live transcription -> use rtms instead.
Meeting bot that joins and records before transcription -> chain Meeting SDK Linux first.

2) Confirm Credentials

Build-platform issuer credential pair available.
JWT generation uses HS256 with one-hour-or-less expiry.
Secret stays server-side.
Reject placeholder values such as ${ZOOM_API_KEY} and ${ZOOM_API_SECRET}. They can make a naive health check look configured while every real call still fails.

3) Confirm Mode Selection

Fast mode for one short file and immediate JSON response.
Batch mode for many files, long recordings, or archive-style processing.
Browser microphone pseudo-streaming for short repeated chunks uploaded through the async fast-mode wrapper.
Fast mode current limits from the API spec:
maximum file size: 100 MB
maximum duration: 2 hours
If fast mode is exposed through a hosted browser UI, prefer an async wrapper:
browser uploads once
backend returns 202 with a request ID
frontend polls for completion

This avoids losing successful transcriptions to edge/client timeout races.

Observed hosted timing from the deployed sample:
~17.2 MB MP4 completed in ~26s
~38.6 MB MP4 completed in ~26-37s
~59.2 MB MP4 completed in ~32-34s on the backend
some ~59.2 MB requests still surfaced as frontend 504 even though the backend later completed with 200

Treat these as deployment observations, not hard API guarantees.

Recommended starting browser mic cadence:
chunk size: 5 seconds
acceptable range: 5-10 seconds
keep only 2-3 chunks in flight at once

This gives incremental transcript updates without trying to hold a single long browser request open.

For browser mic capture, rotate the recorder per chunk so each uploaded blob is a standalone file.

Do not assume that later chunks from a single MediaRecorder.start(timeslice) session will be independently transcribable.

Do not treat this as the default production solution for live transcription.

Prefer rtms when the user actually needs a live-audio product instead of a browser demo.

4) Confirm Storage / Webhook Inputs

Fast mode file URL or upload path resolves.
Batch input/output URIs are valid.
AWS or pre-signed access is set correctly for S3 mode.
Webhook URL is public HTTPS if you expect notifications.

5) Confirm Post-Processing Contract

Decide whether downstream code expects text_display, segments, or word-level timings.
Decide whether channel separation or diarization is required before shipping.

6) Quick Probes

JWT generation works locally.
POST /aiservices/scribe/transcribe succeeds with a known small file.
For browser-uploaded files, backend forwarding should use multipart/form-data to Zoom, not a JSON data: URI wrapper.
Batch submit returns 201 with job_id.
Webhook signature verification works with the configured secret.

7) Fast Decision Tree

401/auth failure -> wrong credential pair or expired JWT.
Fast mode returns schema error -> wrong request body or config fields.
Fast mode returns 413 Request Entity Too Large before the app logs anything -> reverse proxy limit, not Scribe.
Frontend returns 504 but backend logs later show 200 -> browser/edge timeout race; poll by request ID instead of assuming failure.
Browser mic feature needs true continuous low-latency media instead of chunked uploads -> switch to rtms, not scribe.
Browser mic chunk 1 works but chunk 2 onward is empty -> recorder/container boundary issue; restart the recorder for each chunk.
Batch jobs queue but never complete -> storage auth / URI / webhook issues.
Missing transcripts for some files -> inspect /jobs/{jobId}/files before re-submitting whole batch.

8) Source Checkpoints

Official docs

https://developers.zoom.us/docs/ai-services/
https://developers.zoom.us/docs/ai-services/scribe/
https://developers.zoom.us/api-hub/ai-services/methods/endpoints.json

Raw docs in repo tooling output

tools/zoom-crawler/raw-docs/developers.zoom.us/docs/ai-services/

High-Level Scenarios

Scenario 1: On-Demand Upload Transcription

Use fast mode when a user uploads one file and expects a transcript immediately.

Flow:

1. Browser uploads file to your backend.
2. Backend generates Build JWT.
3. Backend calls POST /aiservices/scribe/transcribe.
4. Backend returns transcript JSON to the caller.

Common downstream uses:

post-call summaries
ticket enrichment
searchable clip libraries
internal review or handoff notes

Scenario 2: Batch S3 Archive Transcription

Use batch mode when call archives or media libraries already live in S3.

Flow:

1. Build a batch request with input prefix and output prefix.
2. Submit POST /aiservices/scribe/jobs.
3. Track state by webhook or polling.
4. Read /jobs/{jobId}/files for per-file success/failure.
5. Ingest outputs into search, analytics, or storage.

Common downstream uses:

compliance and audit logging
searchable webinar or podcast archives
bulk transcript backfills
QA scoring inputs

Scenario 3: Zoom Recording Export + Re-Transcription

Use when you must re-process Zoom-managed recordings with your own transcript settings.

Skill chain:

zoom-rest-api to fetch/download recordings
scribe to transcribe exported media

Typical reasons:

you need your own retention/search pipeline
you need different transcript settings than Zoom-managed defaults
you want to enrich recordings with your own summarization or tagging flow

Scenario 4: Compliance / QA Processing

Use batch mode when transcripts must be generated offline for audits, QA scoring, or archival search.

Prefer:

word_time_offsets=true when reviewers need precise excerpts
channel_separation=true for stereo call recordings
webhook + queue ingestion instead of synchronous polling for large volumes

Scenario 5: Customer Support Voice-to-Insights Pipeline

Use when support call recordings should feed operational analytics instead of stopping at raw transcript text.

Flow:

1. Ingest call recordings from storage or exported meeting assets.
2. Transcribe with scribe.
3. Store transcript plus speaker/timing metadata.
4. Run downstream sentiment, keyword, escalation, or QA logic in your own pipeline.

Guardrail:

keep scribe focused on transcription
do sentiment analysis, keyword detection, or scoring in downstream services after transcript generation

Scenario 6: Browser Microphone Incremental Transcript

Use when a web page should capture microphone audio and show transcript updates every few seconds without switching to RTMS.

Flow:

1. Browser captures microphone audio with MediaRecorder.
2. Browser flushes one chunk every 5 seconds.
3. Backend accepts each chunk as a normal fast-mode upload through the async wrapper.
4. Frontend polls by request ID and appends transcript chunks in order.

Guardrail:

this is pseudo-streaming over repeated file uploads
this is best kept as a lightweight demo or constrained fallback
do not choose it first for a true live-transcription product
if the requirement is truly live media stream ingestion or lower-latency continuous audio, route to rtms

Common Drift and Breaks

1. Auth fails even though credentials look correct

Likely causes:

wrong credential pair from the portal
expired JWT
mixing Build-platform credentials with non-Build Zoom app credentials
valid-looking key/secret pair that is not authorized for AI Services Scribe

Check:

iss value
exp window
current credential labels in portal
if the API responds with {"code":124,"message":"Invalid Access token"}, treat that as a real upstream auth failure, not a transport problem
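
To check the `iss` value and `exp` window quickly, the token payload can be decoded locally. The sketch below is a debugging aid under stated assumptions: `inspectJwtWindow` is a hypothetical helper, it relies on Node's `Buffer` with `base64url` support, and it does not verify the signature.

```javascript
// Decodes a compact JWT payload and reports its issuer and lifetime.
// This does NOT verify the signature; it only helps inspect the claims.
function inspectJwtWindow(token, nowSec = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('not a compact JWT');
  const payload = JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
  const lifetimeSec = payload.exp - payload.iat;
  return {
    iss: payload.iss,
    lifetimeSec,
    withinOneHour: lifetimeSec <= 3600, // expiration should be one hour or less
    expired: payload.exp <= nowSec,
  };
}
```

If `iss` is not the Build-platform key you expect, or `expired` is true, fix the credentials or regenerate the token before looking at transport-level causes.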

2. Fast mode request shape mismatch

Docs show JSON with file URL, but the official quickstart also proxies multipart upload and forwards a FormData request.

Use one clear model per service boundary:

client upload -> your backend multipart
backend upload proxy -> multipart/form-data to /aiservices/scribe/transcribe
backend URL-based submit -> JSON body with file URL

Symptoms:

browser request stays pending for a long time
backend eventually returns timeout or empty upstream reply

Preferred fix:

treat uploaded files and URL-based files as two separate request paths instead of forcing both through one JSON shape

3. Fast mode returns `413 Request Entity Too Large`

Likely cause:

reverse proxy rejected the upload before the request reached your app

Known deployment check:

if nginx fronts the app, raise client_max_body_size to match or exceed your server-side upload limit

Current Scribe fast-mode API limits:

maximum file size: 100 MB
maximum duration: 2 hours

4. Fast mode returns `504 Gateway time-out`

Likely cause:

the request reached your backend, but synchronous processing took too long for the edge/proxy path

Observed deployment behavior:

public HTTPS can time out even when the same request path is valid and the backend is healthy
observed hosted sample timings:
~17.2 MB MP4: ~26s
~38.6 MB MP4: ~26-37s
~59.2 MB MP4: ~32-34s backend completion, but some browser requests still timed out first

Guardrail:

use fast mode for smaller, interactive files
use batch mode for large uploads or longer media where waiting synchronously through the web UI is brittle
add request-level logging for the following fields, so you can tell whether the origin completed successfully while the browser/edge timed out:
file name
file size
mime type
upstream elapsed time
response payload size and top-level keys

for hosted UIs, wrap fast mode in an async request/polling flow instead of holding the browser open for the entire upstream response
if nginx access logs show 499 while app logs later show zoom_request_finished status: 200, the transcription succeeded and only the browser-side request path was lost

5. Batch job accepted but outputs never appear

Likely causes:

S3 URI/auth mismatch
expired STS credentials
output layout/URI mismatch
webhook endpoint unreachable if you rely on callbacks

Check:

/jobs/{jobId} summary
/jobs/{jobId}/files
cloud storage permissions

6. Webhook verification fails

Current sample pattern uses:

x-zm-signature
x-zm-request-timestamp
HMAC-SHA256 with sha256= prefix

If verification fails:

confirm raw body capture before JSON parsing
confirm timestamp header was included in the signed string
confirm the shared secret matches the job notification config
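
The raw-body point is the most common cause, and a short demonstration makes it concrete. The sketch below reuses the `v0:{timestamp}:{rawBody}` signing string from the sample pattern. The secret, timestamp, and payload are made-up values, `sign` is a hypothetical helper, and CommonJS `require` is used only so the snippet runs standalone.

```javascript
const crypto = require('node:crypto');

// Same scheme as the sample pattern: HMAC-SHA256 over `v0:{timestamp}:{rawBody}`.
function sign(rawBody, timestamp, secret) {
  const message = `v0:${timestamp}:${rawBody}`;
  return `sha256=${crypto.createHmac('sha256', secret).update(message).digest('hex')}`;
}

const secret = 'example-secret';                   // hypothetical value
const timestamp = '1700000000';                    // hypothetical header value
const rawBody = '{"job_id": "abc",  "state":"X"}'; // hypothetical payload, exactly as received

const fromRaw = sign(rawBody, timestamp, secret);
// Parsing and re-serializing drops the original whitespace, so the bytes differ.
const fromReserialized = sign(JSON.stringify(JSON.parse(rawBody)), timestamp, secret);

console.log(fromRaw === fromReserialized); // false
```

If your framework parses JSON before the handler runs, capture the raw request bytes separately and sign those.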

7. Health check says credentials exist, but API calls still fail

Likely cause:

environment file contains literal placeholders such as ${ZOOM_API_KEY} or ${ZOOM_API_SECRET}

Guardrail:

only treat credentials as present if they are real values, not unresolved shell placeholders
fail fast with a clear credential error before attempting Zoom calls

8. Wrong product chosen

Symptoms:

trying to use Scribe for live in-meeting media
trying to use RTMS for offline archive transcription

Guardrail:

file/storage transcription -> scribe
live meeting media -> rtms

9. Browser microphone chunk 1 works but later chunks are empty

Likely cause:

the browser emitted a valid first container chunk, but later MediaRecorder timeslice blobs were partial WebM/Opus clusters without fresh container headers

Symptoms:

chunk 1 transcribes normally
chunk 2 onward returns empty transcript text or much weaker results
auth and request flow still look healthy

Preferred fix:

do not rely on one long MediaRecorder.start(timeslice) session for standalone chunk uploads
rotate the recorder per chunk instead:
start recorder
record one chunk window
stop recorder
upload that blob
start a new recorder for the next chunk
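
The rotation steps above can be sketched as follows for the browser. `recordOneChunk` and `recordChunks` are hypothetical helpers, and the `RecorderCtor` parameter exists only so the code can be exercised with a stand-in outside a browser; in a page it defaults to the real `MediaRecorder`.

```javascript
// Records one standalone blob with a fresh recorder, so the blob carries its own
// container headers instead of being a partial cluster from a longer session.
function recordOneChunk(stream, chunkMs, RecorderCtor = MediaRecorder) {
  return new Promise((resolve, reject) => {
    const recorder = new RecorderCtor(stream);
    const parts = [];
    recorder.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) parts.push(event.data);
    };
    recorder.onerror = (event) => reject((event && event.error) || new Error('recorder error'));
    recorder.onstop = () => resolve(new Blob(parts, { type: recorder.mimeType }));
    recorder.start(); // no timeslice: the whole window becomes one blob
    setTimeout(() => recorder.stop(), chunkMs);
  });
}

// Rotates recorders: record a window, hand the blob to `onChunk`, start again.
async function recordChunks(stream, chunkMs, count, onChunk, RecorderCtor = MediaRecorder) {
  for (let index = 0; index < count; index += 1) {
    const blob = await recordOneChunk(stream, chunkMs, RecorderCtor);
    onChunk(index, blob); // e.g. upload through the async fast-mode wrapper
  }
}
```

Restarting the recorder leaves a short gap between chunks, which is one source of the chunk-boundary drift mentioned elsewhere in this document. That trade-off is acceptable for a demo but is another reason to prefer rtms for real live audio.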

Guardrail:

treat browser microphone pseudo-streaming as a file-container problem first, not a Scribe-language-model problem

Forks & variants (1)

Scribe has 1 known copy in the catalog totaling 13 installs. They canonicalize to this original listing.

zoom - 13 installs

How it compares

Pick scribe when you need documented Scribe API auth and processing modes inside agent pipelines rather than generic speech-to-text library setup.

FAQ

Is scribe safe to install?

Review the Security Audits panel on this page before installing in production.
