Vss Generate Video Calibration

Name: Vss Generate Video Calibration
Author: nvidia

nvidia/skills

Generate video calibration outputs through the NVIDIA VSS workflow with an agent that follows the published skill execution path.

Overview

vss-generate-video-calibration is an agent skill for the Build phase. It runs NVIDIA VSS video calibration generation workflows, and its skill execution was evaluated with NVSkills-Eval.

Install

npx skills add https://github.com/nvidia/skills --skill vss-generate-video-calibration

What is this skill?

NVSkills-Eval external profile benchmarked with 6 evaluation tasks and 2 attempts per task
Agents exercised: claude-code and codex, scored on the security, correctness, discoverability, effectiveness, and efficiency dimensions
Pass threshold documented at 50% for the evaluation run
Overall publication verdict recorded as FAIL pending remediation before broad workflow reliance

Compatible agents: Claude Code, Codex

Adoption & trust: 1 install on skills.sh; 1.1k GitHub stars; trending (+100% hot-view momentum).

What problem does it solve?

You need reproducible video calibration outputs in a GPU video stack but lack a governed agent workflow that was validated for safety and correctness.

Who is it for?

Builders already in NVIDIA VSS or video analytics experiments who want eval-documented agent behavior for calibration generation tasks.

Skip if: you are doing general image editing or unrelated web app CRUD, or your team needs a skill with a passing NVSkills-Eval verdict and cannot re-validate it yourselves.

When should I use this skill?

Use it when an agent needs to run the vss-generate-video-calibration workflow on video calibration tasks, ideally in a setup that matches the evaluated NVSkills-Eval configuration (external profile, local environment).

What do I get? / Deliverables

The agent loads the VSS calibration skill, follows the evaluated workflow dimensions, and produces calibration-oriented outputs you can inspect before downstream video pipelines consume them.

Video calibration generation outputs per the skill workflow
Traceable skill execution consistent with NVSkills-Eval skill_execution checks

Journey fit

Primary fit

Build: Integrations & version control

Video calibration generation is an integration step in the product pipeline, so Build is the canonical shelf where agents wire media/vision tooling. Integrations subphase fits skills that call external NVIDIA evaluation or VSS pipelines rather than shipping end-user UI alone.

How it compares

Specialized NVIDIA video-calibration skill package, not a generic FFmpeg scripting prompt or an MCP media server.

Common Questions / FAQ

Who is vss-generate-video-calibration for?

Developers and ML engineers wiring NVIDIA video perception workflows who delegate calibration generation steps to Claude Code, Codex, or similar agents.

When should I use vss-generate-video-calibration?

During Build integrations when you are producing or refreshing video calibration artifacts as part of a VSS or external-profile evaluation pipeline.

Is vss-generate-video-calibration safe to install?

Check the Security Audits panel on this page. The bundled NVSkills-Eval summary includes explicit security checks for secret leakage and destructive commands. The documented run reported an overall FAIL verdict, which you should weigh before trusting unattended execution.

SKILL.md


# Evaluation Report

Evaluation of the `vss-generate-video-calibration` skill before publication through NVSkills-Eval.

This report summarizes the results of the NVSkills-Eval 3-tier evaluation of the skill. The goal is to document whether the skill is safe, discoverable, effective, and useful for agents before it is published for broader workflow use.

## Evaluation Summary

- Skill: `vss-generate-video-calibration`
- Evaluation date: 2026-06-01
- NVSkills-Eval profile: `external`
- Environment: `local`
- Dataset: 6 evaluation tasks
- Attempts per task: 2
- Pass threshold: 50%
- Overall verdict: FAIL

## Agents Used

- `claude-code`
- `codex`

## Metrics Used

Reported benchmark dimensions:

- Security: checks whether skill-assisted execution avoids unsafe behavior such as secret leakage, destructive commands, or unauthorized access.
- Correctness: checks whether the agent follows the expected workflow and produces the correct final output.
- Discoverability: checks whether the agent loads the skill when relevant and avoids using it when irrelevant.
- Effectiveness: checks whether the agent performs measurably better with the skill than without it.
- Efficiency: checks whether the agent uses fewer tokens and avoids redundant work.

Underlying evaluation signals used in this run:

- `security` (Security): checks for unsafe operations, secret leakage, and unauthorized access.
- `skill_execution` (Skill Execution): verifies that the agent loaded the expected skill and workflow.
- `skill_efficiency` (Efficiency): checks routing quality, decoy avoidance, and redundant tool usage.
- `accuracy` (Accuracy): grades final-answer correctness against the reference answer.
- `goal_accuracy` (Goal Accuracy): checks whether the overall user task completed successfully.
- `behavior_check` (Behavior Check): verifies expected behavior steps, including safety expectations.
- `token_efficiency` (Token Efficiency): compares token usage with and without the skill.

## Test Tasks

The benchmark dataset contained 6 evaluation tasks:

- Positive tasks: 6 tasks where the skill was expected to activate.
- Negative tasks: 0 tasks where no skill was expected.
- Unlabeled tasks: 0 tasks where positive/negative intent could not be inferred.

Task composition is derived from the evaluation dataset when possible. Entries with `expected_skill` set are treated as positive skill-activation cases, while entries with `expected_skill: null` are treated as negative activation cases.
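
The labeling rule above can be sketched in a few lines of Python. This is a minimal illustration, not NVSkills-Eval's implementation: the entry layout, the `id` field, and the choice to treat a missing `expected_skill` key as "unlabeled" are assumptions made for the example.

```python
from collections import Counter


def classify_task(entry: dict) -> str:
    """Label a dataset entry from its expected_skill field.

    Assumption: an entry with no expected_skill key at all is "unlabeled".
    """
    if "expected_skill" not in entry:
        return "unlabeled"
    if entry["expected_skill"] is None:
        return "negative"  # expected_skill: null, no skill should activate
    return "positive"      # expected_skill set, the skill should activate


def summarize(entries: list[dict]) -> Counter:
    """Count positive, negative, and unlabeled entries."""
    return Counter(classify_task(entry) for entry in entries)


# Hypothetical entries, not taken from the real evaluation dataset.
tasks = [
    {"id": "t1", "expected_skill": "vss-generate-video-calibration"},
    {"id": "t2", "expected_skill": None},
    {"id": "t3"},
]
counts = summarize(tasks)
```

Under this rule, the run described here would yield 6 positive entries and no negative or unlabeled ones, which means the dataset only exercises activation and never checks that the skill is skipped when it is irrelevant.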

## Results

| Dimension | Num | `claude-code` | `codex` |
|---|---:|---:|---:|
| Security | 8 | 96% (+12%) | 79% (+12%) |
| Correctness | 8 | 87% (+1%) | 82% (+26%) |
| Discoverability | 8 | 89% (+9%) | 69% (+7%) |
| Effectiveness | 8 | 57% (-3%) | 55% (+24%) |
| Efficiency | 8 | 71% (+14%) | 53% (+6%) |

Score values show skill-assisted performance. Values in parentheses show uplift versus the no-skill baseline when baseline data is available.
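
To read a cell such as `57% (-3%)`, it can help to recover the implied no-skill baseline. The sketch below assumes the uplift is expressed in percentage points, so that baseline = score - uplift; the report does not state this explicitly, and `parse_cell` is a hypothetical helper, not part of NVSkills-Eval.

```python
import re

# Matches cells like "96% (+12%)" or "57% (-3%)".
CELL = re.compile(r"(?P<score>\d+)%\s*\((?P<uplift>[+-]\d+)%\)")


def parse_cell(cell: str) -> tuple[int, int, int]:
    """Return (score, uplift, implied_baseline) for one results cell.

    Assumption: uplift is in percentage points, so baseline = score - uplift.
    """
    match = CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    score = int(match["score"])
    uplift = int(match["uplift"])
    return score, uplift, score - uplift


score, uplift, baseline = parse_cell("57% (-3%)")
```

Under that assumption, the `claude-code` Effectiveness cell implies a no-skill baseline of 60%, which is the only cell in the table where the skill-assisted score is lower than the baseline.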

## Tier 1: Static Validation Summary

Tier 1 validation passed with observations. NVSkills-Eval ran 9 checks and found 4 total findings.

Top findings:

- MEDIUM QUALITY/quality_correctness: SKILL_SPEC recommended field missing: 'metadata.author' (`skills/vss-generate-video-calibration/SKILL.md`)
- MEDIUM SCHEMA/author_missing: Author not specified in metadata (`skills/vss-generate-video-calibration/SKILL.md`)
- MEDIUM SECURITY/Unknown (SDI-2): The script uses a curl-pipe-sh pattern to download and execute the `uv` installer from astral.sh without any integrity v (`references/sample-dataset.md:132`)
- MEDIUM SECURITY/Unknown (SQP-2): SSL verification is explicitly disabled (`ssl_verify: false`) in the RTSP capture request, and the Python script also im (`references/rtsp.md:106`)
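
One common way to address both security findings is to download the installer over verified TLS, compare its SHA-256 digest against a pinned value, and execute it only if the digest matches. The following is a minimal sketch of that pattern, not the skill's actual remediation: `fetch`, `verify_installer`, and the pinned digest are hypothetical, and the expected digest would have to come from a source you trust.

```python
import hashlib
import hmac
import urllib.request


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download a file over HTTPS.

    urlopen verifies TLS certificates by default; no verification is disabled here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify_installer(data: bytes, expected_sha256: str) -> bool:
    """Return True only if data matches the pinned SHA-256 digest."""
    return hmac.compare_digest(sha256_hex(data), expected_sha256.strip().lower())


# Usage sketch (hypothetical URL variable and pinned digest, not executed here):
#   script = fetch(installer_url)
#   if not verify_installer(script, PINNED_SHA256):
#       raise SystemExit("installer digest mismatch, refusing to run")
#   # Only now write the script to disk and execute it.
```

Pinning a digest means the check must be updated whenever the installer changes, which is the trade-off for not executing unverified remote code.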

## Tier 2: Deduplication Summary

Tier 2 validation reported findings. NVSkills-Eval ran 2 checks and found 4 total findings.

Top findings:

- HIGH DUPLICATE/duplicate: Duplicate content found across references/sample-dataset.md and references/videos.md:
  "# iterating
