Ocr Super Surya

Name: Ocr Super Surya
Author: aktsmm

aktsmm/agent-skills

Give your agent a named OCR workflow around the Surya stack when ingesting scans, PDFs, or screenshots into text pipelines.

Overview

OCR Super Surya is an agent skill for the Build phase. It wraps OCR with Surya so solo builders can extract text from images and documents inside agent workflows.

Install

npx skills add https://github.com/aktsmm/agent-skills --skill ocr-super-surya

What is this skill?

- Skill slug ocr-super-surya signals Surya-based OCR for agent-driven document workflows.
- Suited to turning images and scanned pages into machine-readable text in dev pipelines.
- Pairs with content and knowledge-base builds that need local or scripted OCR steps.
- Licensed CC BY-NC-SA 4.0, with an explicit AI/ML training restriction in the upstream readme.

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 506 installs on skills.sh; 17 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You have image or scan inputs but no repeatable OCR step your coding agent can invoke while building document features.

Who is it for?

Indie builders adding OCR to personal tools, research notebooks, or internal doc pipelines where Surya is the chosen engine.

Skip if: you need guaranteed commercial licensing, turnkey cloud OCR with SLAs, or a skill whose full procedural docs are already visible in Prism.

When should I use this skill?

Building pipelines that need OCR on images or scans before downstream text processing.

What do I get? / Deliverables

You can run a documented Surya OCR path so extracted text feeds downstream parsing, RAG, or validation in your project.

Extracted plain or structured text from inputs
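As a rough illustration of how extracted text could feed a RAG or parsing step, the sketch below splits per-page OCR output into overlapping chunks that keep their page number. The names `chunk_text` and `chunk_pages` and the default sizes are hypothetical and are not part of the skill; the sketch only assumes the OCR step yields one string per page.

```python
from typing import Iterator, List, Tuple


def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> List[str]:
    """Split OCR output into overlapping character windows (hypothetical helper)."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    # Collapse OCR line breaks and runs of whitespace into single spaces.
    cleaned = " ".join(text.split())
    if not cleaned:
        return []
    step = max_chars - overlap
    # Stop early so the final window is not just the previous window's overlap.
    last_start = max(len(cleaned) - overlap, 1)
    return [cleaned[i:i + max_chars] for i in range(0, last_start, step)]


def chunk_pages(
    pages: List[str], max_chars: int = 500, overlap: int = 50
) -> Iterator[Tuple[int, str]]:
    """Yield (page_number, chunk) pairs so each chunk keeps its source page."""
    for page_number, page_text in enumerate(pages, start=1):
        for chunk in chunk_text(page_text, max_chars, overlap):
            yield page_number, chunk
```

Keeping the page number next to each chunk makes it possible to cite the source page later when a retrieval step returns that chunk.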

Recommended Skills

Lark Mail (larksuite/cli)

Feishu email skill covering compose, send, reply, forward, search, drafts, attachments, contacts, and mail rules via lar… 209k installs · 13.7k stars

Lark Slides (larksuite/cli)

Template and markup for building themed Lark Office slide presentations, including title slide styling for company meeti… 162k installs · 13.7k stars

Pptx (anthropics/skills)

pptx is Anthropic’s agent skill for PowerPoint work inside Claude-powered coding and assistant flows. Solo builders reac… 138k installs · 148k stars

Pdf (anthropics/skills)

pdf is a journey-wide Anthropic agent skill for anything involving PDF files: reading and extracting text or tables, mer… 130k installs · 148k stars

Lark Markdown (larksuite/cli)

CLI-oriented skill for Lark Drive native Markdown: create, read, overwrite, diff, and localized patch with clear boundar… 125k installs · 13.7k stars

Docx (anthropics/skills)

End-to-end Word document skill for creation, extraction, and structured editing of professional .docx files using pandoc… 118k installs · 148k stars

Journey fit

Primary fit

Build: Agent skills & templates

Build is the primary shelf because OCR is applied while constructing document ingestion, RAG, or automation features. Agent-tooling captures skills that equip the coding agent with specialized document perception capabilities rather than generic UI work.

Also useful

Grow: Content & marketing

How it compares

Skill-packaged OCR guidance, not a hosted document API marketplace entry.

Common Questions / FAQ

Who is ocr-super-surya for?

Solo builders and agents automating text extraction from scans and images during product development.

When should I use ocr-super-surya?

In Build when implementing ingestion, CLI tools, or agent actions that must OCR images before search or LLM processing.
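One way to wire an OCR step in front of search or LLM processing is to route inputs by file type and pass the OCR functions in as parameters. The sketch below is an assumption-laden illustration, not the skill's own code: `ocr_inputs` and `IMAGE_SUFFIXES` are hypothetical names, and the callable signatures (an image function returning one string, a PDF function returning a list of page strings) are guessed from the usage shown in the helper's docstring.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Hypothetical set of image extensions to send to the image OCR function.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".webp"}


def ocr_inputs(
    paths: Iterable[str],
    ocr_image: Callable[[str], str],
    ocr_pdf: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Map each supported input path to a list of page texts."""
    results: Dict[str, List[str]] = {}
    for raw in paths:
        suffix = Path(raw).suffix.lower()
        if suffix == ".pdf":
            # Assumed: the PDF function returns one string per page.
            results[raw] = list(ocr_pdf(raw))
        elif suffix in IMAGE_SUFFIXES:
            # Treat a single image as a one-page document.
            results[raw] = [ocr_image(raw)]
        # Unsupported file types are skipped rather than guessed at.
    return results


if __name__ == "__main__":
    # Stub callables stand in for a real OCR backend in this demo.
    demo = ocr_inputs(
        ["scan.PNG", "report.pdf", "notes.txt"],
        ocr_image=lambda path: f"text from {path}",
        ocr_pdf=lambda path: ["page one", "page two"],
    )
    print(demo)
```

Passing the OCR functions as arguments keeps the routing logic testable without a model installed; in a real project the two callables might be the `ocr_image` and `ocr_pdf` helpers imported from `ocr_helper`, provided their return types match the assumptions above.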

Is ocr-super-surya safe to install?

Check the Security Audits panel on this page and read the upstream CC BY-NC-SA license plus AI-training restrictions before relying on it commercially.

SKILL.md


# Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

## English

Copyright (c) 2025-2026 yamapan (aktsmm)

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0
International License.

You are free to:

- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material

Under the following terms:

- **Attribution** — You must give appropriate credit, provide a link to the
  license, and indicate if changes were made. You may do so in any reasonable manner,
  but not in any way that suggests the licensor endorses you or your use.

- **NonCommercial** — You may not use the material for commercial purposes.
  *(Please contact the author if you wish to use this material for commercial purposes.)*

- **ShareAlike** — If you remix, transform, or build upon the material, you must
  distribute your contributions under the same license as the original.

No additional restrictions — You may not apply legal terms or technological
measures that legally restrict others from doing anything the license permits.

**AI/ML Training Restriction** — Use of this content for AI/ML training, data
mining, or other analytical purposes is prohibited without explicit permission.

Full license text: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode

---

Full license text (Japanese): https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.ja

---

## Special Permission for Microsoft Employees

### English

Microsoft Corporation employees are granted permission to use, copy, modify, and
distribute this material for any purpose within the scope of their employment
duties at Microsoft, including internal business use and customer-facing
activities, without the NonCommercial restriction of this license.

This special permission applies only to work performed as part of official
Microsoft business activities.


---

## Disclaimer

### English

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR
A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.



#!/usr/bin/env python3
"""
OCR Helper - Surya OCR wrapper for common tasks.

Usage:
    from ocr_helper import ocr_image, ocr_pdf
    
    # Single image
    text = ocr_image("screenshot.png")
    
    # PDF (all pages)
    results = ocr_pdf("document.pdf")
    
    # With verbose logging
    text = ocr_image("image.png", verbose=True)
"""

import os
import logging
from pathlib import Path
from typing import Optional

# Configure logging
logger = logging.getLogger(__name__)  # assumed completion: the source listing is cut off at this line

