
langchain-ai/langsmith-skills
3 skills · 6.9k installs · 390 stars
Install
npx skills add https://github.com/langchain-ai/langsmith-skills

Skills in this repo

1. Langsmith Trace
Langsmith-trace is a procedural agent skill for solo builders who ship LangChain, LangGraph, or custom LLM stacks and need LangSmith as the system of record for runs. It splits cleanly into two jobs: instrumenting applications so spans and chains land in the right project, and using the langsmith CLI plus API patterns to list, filter, and export traces when something misbehaves in staging or production. The skill stresses checking LANGSMITH_PROJECT in the environment or .env before any query so you are not debugging the wrong workspace. Setup covers required API keys, optional default project naming, and workspace IDs for organization keys. For OSS LangChain apps, enabling LANGSMITH_TRACING is positioned as the fast path, while the skill still supports deeper CLI-driven analysis. It fits indie operators who wear every hat: you add tracing during integration work, then rely on the same skill months later when support tickets reference bad completions or runaway token use.
2.3k installs

2. Langsmith Evaluator
LangSmith Evaluator is an agent skill for builders who treat LLM apps and agents as products that need measurable quality, not one-off prompt tweaks. It walks through the three pillars LangSmith expects: defining evaluators (including LLM-as-Judge and custom code), wiring run functions that capture outputs and trajectories from your agent, and executing evaluations locally with evaluate() or through LangSmith's auto-run path after upload. The skill anchors setup on environment variables (LANGSMITH_API_KEY as required, LANGSMITH_PROJECT to know which trace project holds your data, optional workspace scoping, and OpenAI for judge models) and prefers passing --api-key to CLI commands when keys live outside the shell. Python and TypeScript snippets reduce copy-paste friction for solo developers who already trace in LangSmith but have not packaged a repeatable eval harness.
Invoke it when you are building or hardening evaluation pipelines, not when you only need generic unit tests with no trace linkage. It complements ship-phase testing and grow-phase analytics by making pass rates and judge scores first-class artifacts you can cite in release notes.
2.3k installs

3. Langsmith Dataset
Langsmith-dataset is an agent skill that walks you through creating, uploading, and maintaining evaluation datasets in LangSmith, so you can test LLM apps and agents with repeatable inputs instead of ad-hoc chat checks. It is aimed at solo and indie builders shipping agent features who already use or plan to use LangSmith for traces and evals. Invoke it when you need a structured dataset (final response, single-step, full trajectory, or RAG-oriented examples), when you want CLI-first dataset operations, or when you are extending datasets from the Python or JavaScript SDK. The skill stresses authentication setup, reading LANGSMITH_PROJECT to align data with the right trace project, and the langsmith CLI installer. That matters because weak eval data hides regressions until users hit production; a named dataset in LangSmith becomes the backbone for later experiment runs, comparisons, and safer iteration on prompts and tools.
2.2k installs
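
All three skills lean on the same environment check before any query or upload. A minimal sketch of that pre-flight step follows, using only the Python standard library. The helper name check_langsmith_env is hypothetical and not part of the skills or the LangSmith SDK, and treating "true" as the enabling value for LANGSMITH_TRACING is an assumption. The sketch reads the process environment only; it does not parse a .env file.

```python
import os


def check_langsmith_env(env=None):
    """Return a list of problems found in LangSmith-related settings.

    Hypothetical helper for illustration; not part of the skills or the SDK.
    """
    # Default to the real process environment; accept a dict for testing.
    env = os.environ if env is None else env
    problems = []

    # The skill descriptions name the API key as required.
    if not env.get("LANGSMITH_API_KEY"):
        problems.append("LANGSMITH_API_KEY is not set (required)")

    # Checking the project first avoids querying the wrong trace project.
    if not env.get("LANGSMITH_PROJECT"):
        problems.append("LANGSMITH_PROJECT is not set; confirm the target project")

    # Assumption: tracing is switched on with the literal value "true".
    if env.get("LANGSMITH_TRACING", "").lower() != "true":
        problems.append("LANGSMITH_TRACING is not 'true'; tracing may be off")

    return problems


if __name__ == "__main__":
    for problem in check_langsmith_env():
        print("warning:", problem)
```

Running the file directly prints one warning per missing or unexpected setting and prints nothing when all three variables look as expected.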