Redis Semantic Cache

Name: Redis Semantic Cache
Author: redis

redis/agent-skills

Wire Redis LangCache so LLM calls use semantic cache-aside with tuned thresholds and separate caches per task.

Overview

redis-semantic-cache is an agent skill for the Build phase that configures Redis LangCache similarity thresholds and per-task caches for LLM responses.

Install

npx skills add https://github.com/redis/agent-skills --skill redis-semantic-cache

What is this skill?

Documents cache-aside flow for LLM responses via Redis LangCache (preview)
Tunable similarity thresholds (e.g. 0.95 strict vs 0.8 higher hit rate)
Separate cache IDs per task (support vs code generation)
Python LangCache client examples with server URL, cache_id, and API key env vars
Similarity threshold examples: 0.95 (stricter) and 0.8 (looser) documented in SKILL.md
Recommends separate cache IDs for distinct LLM tasks (e.g. support vs code)

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1 install on skills.sh; 70 GitHub stars; trending (+100% hot-view momentum).

What problem does it solve?

Your agent app repeats similar LLM prompts and burns tokens because every request hits the model with no semantic reuse.

Who is it for?

Builders using the LangCache preview on Redis Cloud who want Python-ready cache-aside patterns for distinct LLM workloads.

Skip if: your project does not use Redis LangCache, or you only need exact-string-match caching with no embedding similarity.

When should I use this skill?

When adding or tuning Redis LangCache for LLM cache-aside with similarity search and per-use-case cache separation.

What do I get? / Deliverables

LangCache searches run with task-appropriate thresholds and isolated cache IDs before falling back to live generation.

Configured LangCache client calls with thresholds
Per-task cache_id separation pattern

Journey fit

Primary fit

Build: Integrations & version control

Semantic caching is implemented while integrating the LLM stack into your product backend. LangCache client setup, thresholds, and multi-cache IDs are integration work against Redis Cloud, not generic prompt tuning alone.

Also useful

Operate: Infrastructure & cost

Ship: Performance

How it compares

This is an integration skill for a managed semantic cache. It is not a substitute for application-level prompt versioning or eval harnesses.

Common Questions / FAQ

Who is redis-semantic-cache for?

Solo and indie developers integrating LLM APIs who already use, or plan to use, Redis LangCache on Redis Cloud.

When should I use redis-semantic-cache?

During Build integrations when you wire production LLM routes and need threshold and multi-cache tuning before Ship load testing.

Is redis-semantic-cache safe to install?

Treat API keys and cache IDs as secrets; confirm preview-service terms and review Security Audits on this Prism page before install.

SKILL.md

SKILL.md - Redis Semantic Cache

{
  "name": "redis-semantic-cache",
  "version": "1.0.0",
  "description": "Redis LangCache — cache-aside flow for LLM responses, similarity threshold tuning, per-task cache separation.",
  "author": {
    "name": "Redis",
    "email": "support@redis.com"
  },
  "homepage": "https://redis.io",
  "repository": "https://github.com/redis/agent-skills",
  "license": "MIT",
  "keywords": ["redis", "semantic-cache", "langcache", "llm", "ai"]
}


# Configure Semantic Cache Properly

> **Note:** LangCache is currently in preview on Redis Cloud. Features and behavior may change.

Tune the similarity threshold and separate caches by task so that LangCache returns relevant results.

**Correct:** Tune similarity threshold for your use case.

```python
import os

from langcache import LangCache

lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY")
)

# Stricter matching - fewer false positives (0.95 = very similar)
result = lang_cache.search(
    prompt="What is Redis?",
    similarity_threshold=0.95
)

# Looser matching - higher hit rate (0.8 = somewhat similar)
result = lang_cache.search(
    prompt="What is Redis?",
    similarity_threshold=0.8
)
```
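
To make the effect of the two thresholds concrete, here is a minimal local sketch. It assumes cosine similarity as the scoring measure, which this document does not confirm for LangCache. The vectors and the `is_hit` helper are hypothetical stand-ins, not part of the SDK.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_hit(score, similarity_threshold):
    """Hypothetical rule: an entry is a hit when its score reaches the threshold."""
    return score >= similarity_threshold

# Made-up embeddings for a cached prompt and a new, related prompt.
cached_prompt_vec = [1.0, 0.0]
new_prompt_vec = [9.0, 4.0]

score = cosine_similarity(cached_prompt_vec, new_prompt_vec)  # roughly 0.91
strict_hit = is_hit(score, 0.95)  # rejected by the stricter threshold
loose_hit = is_hit(score, 0.8)    # accepted by the looser threshold
```

The same pair of prompts is a miss at 0.95 and a hit at 0.8, which is the trade-off between false positives and hit rate described in the comments above.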

**Correct:** Use separate caches for different use cases.

```python
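import os

from langcache import LangCache

# Shared connection settings, read from the same env vars as the other examples
server_url = f"https://{os.getenv('HOST')}"
api_key = os.getenv("API_KEY")

# Create different cache IDs in Redis Cloud for different LLM tasks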
support_cache = LangCache(
    server_url=server_url,
    cache_id="support-cache-id",
    api_key=api_key
)

code_cache = LangCache(
    server_url=server_url,
    cache_id="code-cache-id",
    api_key=api_key
)
```
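
One possible way to keep callers from picking the wrong cache is to resolve the cache ID from a task name in one place. The sketch below is only an illustration: `CACHE_IDS` and `cache_id_for_task` are hypothetical names, and the IDs are the placeholder values from the example above. It stops at resolving the ID so that it runs without the LangCache SDK.

```python
# Hypothetical mapping from task name to the cache ID created in Redis Cloud.
CACHE_IDS = {
    "support": "support-cache-id",
    "code": "code-cache-id",
}

def cache_id_for_task(task):
    """Return the cache ID for a task, failing loudly on unknown tasks."""
    try:
        return CACHE_IDS[task]
    except KeyError:
        raise ValueError(f"no cache configured for task {task!r}") from None

support_id = cache_id_for_task("support")
code_id = cache_id_for_task("code")
```

Raising on an unknown task, instead of falling back to a default cache, avoids silently mixing unrelated responses into one cache.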

**Incorrect:** Using a single cache for all LLM tasks.

```python
# All tasks share one cache - responses may not be relevant
result = lang_cache.search(prompt="How do I reset my password?")
# Could return a code snippet if someone asked a similar coding question
```

**Best practices:**
- Start with threshold 0.9, adjust based on your use case
- Use custom attributes to filter results within a single cache
- Monitor cache hit rates to evaluate effectiveness
- Use separate cache IDs for fundamentally different LLM tasks
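
As a rough illustration of the hit-rate monitoring mentioned above, the following sketch counts hits and misses in application code. `HitRateTracker` is a hypothetical helper, not something LangCache provides. It assumes you record one outcome per search.

```python
from dataclasses import dataclass

@dataclass
class HitRateTracker:
    """Counts cache hits and misses observed by the application."""
    hits: int = 0
    misses: int = 0

    def record(self, hit):
        """Record the outcome of one cache search."""
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        """Fraction of searches that were hits; 0.0 before any search."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

tracker = HitRateTracker()
for was_hit in [True, False, True, True]:
    tracker.record(was_hit)
```

Tracking the rate per cache ID and per threshold makes it easier to judge whether a threshold change actually helped.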

Reference: [LangCache Best Practices](https://redis.io/docs/latest/develop/ai/langcache/)


# Use LangCache for LLM Response Caching

> **Note:** LangCache is currently in preview on Redis Cloud. Features and behavior may change.

LangCache is a fully managed semantic caching service on Redis Cloud that reduces LLM costs and latency.

**How it works:**
1. Your app sends a prompt to LangCache via `POST /v1/caches/{cacheId}/entries/search`
2. LangCache generates an embedding and searches for similar cached responses
3. If a match is found (cache hit), LangCache returns the cached response and no LLM call is needed
4. If not found (cache miss), your app calls the LLM and stores the response
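
The four steps above can be sketched locally without the managed service. Everything in this example is a hypothetical stand-in: `InMemorySemanticCache` replaces LangCache, `embed` is a toy bag-of-words vector in place of a real embedding model, and `fake_llm` replaces the model call. It only illustrates the cache-aside control flow.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().replace("?", "").split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class InMemorySemanticCache:
    """Hypothetical local stand-in for the managed cache."""

    def __init__(self):
        self.entries = []  # list of (embedding, response) pairs

    def search(self, prompt, similarity_threshold):
        """Return the best-matching response, or None below the threshold."""
        query = embed(prompt)
        best = max(self.entries, key=lambda e: similarity(query, e[0]), default=None)
        if best and similarity(query, best[0]) >= similarity_threshold:
            return best[1]
        return None

    def set(self, prompt, response):
        """Store a response under the prompt's embedding."""
        self.entries.append((embed(prompt), response))

llm_calls = []

def fake_llm(prompt):
    """Placeholder for the real model call; records each invocation."""
    llm_calls.append(prompt)
    return f"answer to: {prompt}"

def answer(cache, prompt, similarity_threshold=0.9):
    cached = cache.search(prompt, similarity_threshold)  # steps 1-2: search by similarity
    if cached is not None:                               # step 3: cache hit
        return cached
    response = fake_llm(prompt)                          # step 4: miss, call the model
    cache.set(prompt, response)                          # and store for next time
    return response

cache = InMemorySemanticCache()
first = answer(cache, "What is Redis?")   # miss: calls fake_llm and stores
second = answer(cache, "what is redis")   # hit: served from the cache
```

Only the first call reaches `fake_llm`; the second, differently worded prompt is answered from the cache.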

**Correct:** Use the LangCache Python SDK.

```python
from langcache import LangCache
import os

lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY")
)

prompt = "What is Redis?"

# Search for cached response
result = lang_cache.search(
    prompt=prompt,
    similarity_threshold=0.9
)

if result:
    # Cache hit: reuse the stored response
    response = result[0]["response"]
else:
    # Cache miss: call the model. `llm` is a placeholder for your own LLM client.
    response = llm.generate(prompt)
    # Store for future queries
    lang_cache.set(
        prompt=prompt,
        response=response
    )
```

**LangCache REST API:**

```bash
# Search cache
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries/search" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?"}'

# Store a response
curl -X POST "https://$HOST/v1/caches/$CACHE_ID/entries" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is Redis?", "response": "Redis is an in-memory database..."}'
```
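
For callers that prefer Python's standard library over the SDK, the search call above could be built roughly as follows. This sketch only constructs the request and does not send it. `build_search_request` and the placeholder host, cache ID, and key are hypothetical; the path and headers mirror the curl example.

```python
import json
import urllib.request

def build_search_request(host, cache_id, api_key, prompt):
    """Build (but do not send) the search request shown in the curl example."""
    url = f"https://{host}/v1/caches/{cache_id}/entries/search"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Placeholder values; real ones would come from the HOST, CACHE_ID, and API_KEY env vars.
request = build_search_request("example.invalid", "my-cache-id", "my-api-key", "What is Redis?")
# Passing `request` to urllib.request.urlopen would send it.
```

Building the request separately from sending it keeps the URL and header logic easy to unit test.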

**With custom attributes for filtering:**

```python
# Store with attributes
lang_cache.set(
    prompt="What is Redis?",
    response="Redis is an in-memory database...",
    attributes=...  # attribute values are cut off in the source
)
```

