Sf Datacloud Retrieve

Name: Sf Datacloud Retrieve
Author: jaganpro

jaganpro/sf-skills

Configure and retrieve Salesforce Data Cloud hybrid search indexes on structured DMOs for agent and app retrieval pipelines.

Overview

sf-datacloud-retrieve is an agent skill for the Build phase that helps configure Salesforce Data Cloud hybrid search retrieval indexes on structured DMOs.

Install

npx skills add https://github.com/jaganpro/sf-skills --skill sf-datacloud-retrieve

What is this skill?

Hybrid search index JSON scaffold on structured Data Cloud DMOs
Chunk DMO and vector DMO naming patterns for index and chunk artifacts
Passage extraction chunking with max_tokens 512 and strip_html option
e5_large_v2 embeddings with 1024 dimensions and HNSW index configuration
Part of sf-datacloud-* family with shared CREDITS and UPSTREAM maintenance docs

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 872 installs on skills.sh; 418 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You need a correct Data Cloud hybrid index definition across source, chunk, and vector DMOs without guessing embedding and chunking JSON.

Who is it for?

Indie builders or consultants shipping Salesforce Data Cloud search or RAG features who already work in the Data Cloud metadata model.

Skip if: Greenfield apps with no Salesforce org, or teams that only need generic vector DB setup outside Data Cloud.

When should I use this skill?

Defining or troubleshooting Salesforce Data Cloud retrieve/hybrid search index metadata on structured DMOs.

What do I get? / Deliverables

You leave with a structured index configuration template—search type, chunking, and vector embedding blocks—ready to adapt to your DMO and text fields.

Hybrid search index configuration JSON
Chunk and vector DMO naming pattern for your index

Journey fit

Primary fit

Build: Integrations & version control

Data Cloud index and retrieve setup happens while wiring CRM data into product features and agents. Integrations is the canonical shelf for Salesforce Data Cloud DMO, chunk, and vector index configuration skills.

How it compares

Salesforce Data Cloud–specific retrieve scaffolding—not a generic Pinecone or pgvector integration skill.

Common Questions / FAQ

Who is sf-datacloud-retrieve for?

It is for developers and admins building hybrid search on Salesforce Data Cloud structured DMOs, especially alongside other sf-datacloud-* skills.

When should I use sf-datacloud-retrieve?

Use it during the Build phase, while defining chunk and vector indexes, embedding models, and field-level chunking, and before you connect agents or apps to Data Cloud retrieve APIs.

Is sf-datacloud-retrieve safe to install?

Treat it as configuration guidance for production orgs; review the Security Audits panel on this page and validate JSON against your org policies before deploy.
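
One way to approach that validation is a small pre-flight script run before deploy. The sketch below is illustrative only: `check_index_config`, `REQUIRED_KEYS`, and `OBSERVED_SEARCH_TYPES` are hypothetical names, and the rules are inferred from the two example definitions in the SKILL.md section of this page, not from an official Salesforce schema.

```python
# Hypothetical pre-deploy check; the rules mirror this page's examples only.
REQUIRED_KEYS = {
    "label", "developerName", "sourceDmoDeveloperName",
    "chunkDmoName", "chunkDmoDeveloperName",
    "vectorDmoName", "vectorDmoDeveloperName",
    "searchType", "chunkingConfiguration", "vectorEmbeddingConfiguration",
}
# Assumption: only the values that appear in the examples are listed here.
OBSERVED_SEARCH_TYPES = {"HYBRID", "VECTOR"}


def check_index_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    missing = sorted(REQUIRED_KEYS - set(cfg))
    if missing:
        problems.append("missing keys: " + ", ".join(missing))

    # Chunk and vector DMO names are derived from developerName in both examples.
    name = cfg.get("developerName", "")
    expected = {
        "chunkDmoName": f"{name} chunk",
        "chunkDmoDeveloperName": f"{name}_chunk",
        "vectorDmoName": f"{name} index",
        "vectorDmoDeveloperName": f"{name}_index",
    }
    for key, want in expected.items():
        if key in cfg and cfg[key] != want:
            problems.append(f"{key} is {cfg[key]!r}, expected {want!r}")

    if "searchType" in cfg and cfg["searchType"] not in OBSERVED_SEARCH_TYPES:
        problems.append(f"unrecognized searchType: {cfg['searchType']!r}")

    return problems
```

Load the definition with `json.load` and treat a non-empty result as a reason to stop and review. Org-specific policies (naming prefixes, allowed embedding models) would need their own rules.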

SKILL.md

# Credits & Acknowledgments

Primary contributor: **Gnanasekaran Thoppae**

This skill is part of the `sf-datacloud-*` family. Shared attribution, upstream source mapping, and maintenance notes live in:
- [../sf-datacloud/CREDITS.md](../sf-datacloud/CREDITS.md)
- [../sf-datacloud/UPSTREAM.md](../sf-datacloud/UPSTREAM.md)


{
  "label": "<INDEX_NAME>",
  "developerName": "<INDEX_NAME>",
  "description": "Hybrid search index on a structured Data Cloud DMO",
  "sourceDmoDeveloperName": "<SOURCE_DMO>__dlm",
  "chunkDmoName": "<INDEX_NAME> chunk",
  "chunkDmoDeveloperName": "<INDEX_NAME>_chunk",
  "vectorDmoName": "<INDEX_NAME> index",
  "vectorDmoDeveloperName": "<INDEX_NAME>_index",
  "searchType": "HYBRID",
  "vectorEmbedding": {
    "vectorEmbeddingRelatedFields": []
  },
  "rankingConfigurations": [],
  "chunkingConfiguration": {
    "fieldLevelConfigurations": [
      {
        "sourceDmoDeveloperName": "<SOURCE_DMO>__dlm",
        "sourceDmoFieldDeveloperName": "<TEXT_FIELD>__c",
        "config": {
          "id": "passage_extraction",
          "userValues": [
            { "id": "max_tokens", "value": "512" },
            { "id": "strip_html", "value": "true" }
          ]
        }
      }
    ]
  },
  "vectorEmbeddingConfiguration": {
    "embeddingModel": {
      "id": "e5_large_v2",
      "userValues": [
        { "id": "dimension", "value": "1024" },
        { "id": "max_token_limit", "value": "512" }
      ]
    },
    "index": {
      "id": "HNSW",
      "userValues": []
    },
    "similarityMetric": "COSINE"
  }
}
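
The placeholders in the template above can be filled programmatically. The following is a minimal sketch, assuming you only vary the index name, source DMO, and text field; `build_index_config` is a hypothetical helper, not part of the skill or of any Salesforce library, and every other value is copied from the template as-is.

```python
import json


def build_index_config(index_name: str, source_dmo: str, text_field: str) -> dict:
    """Fill <INDEX_NAME>, <SOURCE_DMO>, and <TEXT_FIELD> in the hybrid template."""
    source = f"{source_dmo}__dlm"  # template suffix for the source DMO
    return {
        "label": index_name,
        "developerName": index_name,
        "description": "Hybrid search index on a structured Data Cloud DMO",
        "sourceDmoDeveloperName": source,
        "chunkDmoName": f"{index_name} chunk",
        "chunkDmoDeveloperName": f"{index_name}_chunk",
        "vectorDmoName": f"{index_name} index",
        "vectorDmoDeveloperName": f"{index_name}_index",
        "searchType": "HYBRID",
        "vectorEmbedding": {"vectorEmbeddingRelatedFields": []},
        "rankingConfigurations": [],
        "chunkingConfiguration": {
            "fieldLevelConfigurations": [
                {
                    "sourceDmoDeveloperName": source,
                    "sourceDmoFieldDeveloperName": f"{text_field}__c",
                    "config": {
                        "id": "passage_extraction",
                        "userValues": [
                            {"id": "max_tokens", "value": "512"},
                            {"id": "strip_html", "value": "true"},
                        ],
                    },
                }
            ]
        },
        "vectorEmbeddingConfiguration": {
            "embeddingModel": {
                "id": "e5_large_v2",
                "userValues": [
                    {"id": "dimension", "value": "1024"},
                    {"id": "max_token_limit", "value": "512"},
                ],
            },
            "index": {"id": "HNSW", "userValues": []},
            "similarityMetric": "COSINE",
        },
    }


if __name__ == "__main__":
    cfg = build_index_config("My_kav", "ssot__KnowledgeArticleVersion", "ssot__Name")
    print(json.dumps(cfg, indent=2))
```

Building the definition from a function keeps the chunk and vector DMO names in step with the index name, which is the part most easily mistyped when editing the JSON by hand.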


{
  "label": "My_kav",
  "developerName": "My_kav",
  "sourceDmoDeveloperName": "ssot__KnowledgeArticleVersion__dlm",
  "chunkDmoName": "My_kav chunk",
  "chunkDmoDeveloperName": "My_kav_chunk",
  "vectorDmoName": "My_kav index",
  "vectorDmoDeveloperName": "My_kav_index",
  "searchType": "VECTOR",
  "vectorEmbedding": {
    "vectorEmbeddingRelatedFields": []
  },
  "chunkingConfiguration": {
    "fieldLevelConfigurations": [
      {
        "sourceDmoDeveloperName": "ssot__KnowledgeArticleVersion__dlm",
        "sourceDmoFieldDeveloperName": "ssot__Name__c",
        "config": {
          "id": "passage_extraction",
          "userValues": [
            { "id": "strip_html", "value": "true" },
            { "id": "max_tokens", "value": "512" }
          ]
        }
      }
    ]
  },
  "vectorEmbeddingConfiguration": {
    "embeddingModel": {
      "id": "e5_large_v2",
      "userValues": [
        { "id": "dimension", "value": "1024" },
        { "id": "max_token_limit", "value": "512" }
      ]
    },
    "index": {
      "id": "HNSW",
      "userValues": []
    },
    "similarityMetric": "COSINE"
  },
  "rankingConfigurations": []
}
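
Both examples set the chunking `max_tokens` and the embedding model's `max_token_limit` to 512. A short sketch of how one might read the `userValues` lists and confirm that chunks do not exceed the model limit follows. The helper names `user_values` and `chunks_fit_model` are hypothetical, and the rule that `max_tokens` should not exceed `max_token_limit` is an assumption drawn from the matching values in the examples, not a documented constraint.

```python
def user_values(block: dict) -> dict:
    """Turn [{"id": ..., "value": ...}, ...] into {id: value}; values stay strings."""
    return {item["id"]: item["value"] for item in block.get("userValues", [])}


def chunks_fit_model(cfg: dict) -> bool:
    """True if every field's chunk max_tokens is <= the model's max_token_limit."""
    model = user_values(cfg["vectorEmbeddingConfiguration"]["embeddingModel"])
    limit = int(model["max_token_limit"])
    for field in cfg["chunkingConfiguration"]["fieldLevelConfigurations"]:
        chunk = user_values(field["config"])
        # Assumed rule: a chunk larger than the model limit is a likely misconfiguration.
        if int(chunk["max_tokens"]) > limit:
            return False
    return True
```

Because `userValues` entries store every value as a string, the comparison converts them with `int` first; comparing the raw strings would order "1024" before "512".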


MIT License

Copyright (c) 2024-2025 Jag Valaiyapathy

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


# sf-datacloud-retrieve

Query and search workflows for Salesforce Data Cloud.

## Use this skill for

- quick SQL counts
- paginated SQL (`sqlv2`)
- async query lifecycles
- table describe
- vector search
