
Python Testing
Write, run, and structure pytest suites for Microsoft Agent Framework Python packages with the repo’s uv/poe workflow and coverage expectations.
Overview
Python Testing is an agent skill most often used in Ship (also Build, Operate) that enforces Agent Framework Python pytest conventions, coverage goals, and uv/poe test commands.
Install
npx skills add https://github.com/microsoft/agent-framework --skill python-testingWhat is this skill?
- 85% test coverage target across core packages and critical paths
- Two-stage CI: unit-only (`-m "not integration"`) per PR commit, full suite on merge
- `asyncio_mode = auto`—use async def without `@pytest.mark.asyncio`
- Parallel package testing via `uv run poe test` with `-P`, `-A`, and `-C` coverage flags
- Separation of unit vs integration tests using pytest markers
- Strives for at least 85% test coverage on core packages
- PR commits run unit tests only; full suite including integration runs on merge
Adoption & trust: 30 installs on skills.sh; 11.2k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are changing Agent Framework Python code but do not know how this repo runs tests, marks integration cases, or expects async pytest to be written.
Who is it for?
Developers extending or patching microsoft/agent-framework Python packages who need correct pytest structure and Poe entrypoints.
Skip if: Greenfield apps unrelated to Agent Framework, or teams using only JavaScript test stacks without this monorepo layout.
When should I use this skill?
Creating, modifying, or running tests in the Agent Framework Python codebase.
What do I get? / Deliverables
Tests follow repo conventions, hit the staged unit-then-integration workflow, and align with the documented coverage focus before you open a PR.
- New or updated pytest modules
- Correct unit vs integration markers
- Coverage-aware test run commands for CI parity
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Ship/testing because the skill governs verification before merge and release of framework code. It encodes pytest markers, async auto mode, integration gates, and `uv run poe test` commands—not product feature implementation.
Where it fits
Add a new core agent helper and scaffold matching unit tests before opening a PR.
Run `uv run poe test -A -m "not integration"` to mirror pre-merge CI on a feature branch.
Reproduce a flaky integration test locally with `uv run poe test -A -m integration` after a production regression report.
How it compares
Repo-specific testing playbook, not a generic “write unit tests” chat prompt.
Common Questions / FAQ
Who is python-testing for?
Contributors and advanced solo builders working inside the Agent Framework Python workspace who need maintainer-aligned pytest and coverage practices.
When should I use python-testing?
In Ship when creating, modifying, or running tests; in Build when adding packages features that need new tests; in Operate when reproducing CI failures locally with `uv run poe test`.
Is python-testing safe to install?
It is documentation for test authoring and local test execution; review Security Audits on this page and limit agent shell access to trusted repos.
SKILL.md
READMESKILL.md - Python Testing
# Python Testing We strive for at least 85% test coverage across the codebase, with a focus on core packages and critical paths. Tests should be fast, reliable, and maintainable. When adding new code, check that the relevant sections of the codebase are covered by tests, and add new tests as needed. When modifying existing code, update or add tests to cover the changes. We run tests in two stages, for a PR each commit is tested with unit tests only (using `-m "not integration"`), and the full suite including integration tests is run when merging. ## Running Tests ```bash # Run tests for all packages in parallel uv run poe test # Run tests for a specific workspace package uv run poe test -P core # Run all selected tests in a single pytest invocation uv run poe test -A # With coverage uv run poe test -A -C uv run poe test -P core -C # Run only unit tests (exclude integration tests) uv run poe test -A -m "not integration" # Run only integration tests uv run poe test -A -m integration ``` Direct package execution still works when you need it: ```bash uv run --directory packages/core poe test ``` ## Test Configuration - **Async mode**: `asyncio_mode = "auto"` is enabled — do NOT use `@pytest.mark.asyncio`, but do mark tests with `async def` and use `await` for async calls - **Timeout**: Default 60 seconds per test - **Import mode**: `importlib` for cross-package isolation - **Parallelization**: Large packages (core, ag-ui, orchestrations, anthropic) use `pytest-xdist` (`-n auto --dist worksteal`) in their `poe test` task. The aggregate `uv run poe test -A` sweep also uses xdist across the selected packages. ## Test Directory Structure Test directories must NOT contain `__init__.py` files. Non-core packages must place tests in a uniquely-named subdirectory: ``` packages/anthropic/ ├── tests/ │ └── anthropic/ # Unique subdirectory matching package name │ ├── conftest.py │ └── test_client.py ``` Core package can use `tests/` directly with topic subdirectories: ``` packages/core/ ├── tests/ │ ├── conftest.py │ ├── core/ │ │ └── test_agents.py │ └── openai/ │ └── test_client.py ``` ## Fixture Guidelines - Use `conftest.py` for shared fixtures within a test directory - Before adding new fixtures, check if existing ones can be reused or extended - Use descriptive names: `mapper`, `test_request`, `mock_client` ## File Naming - Files starting with `test_` are test files — do not use this prefix for helpers - Use `conftest.py` for shared utilities ## Integration Tests Integration tests require external services (OpenAI, Azure, etc.) and are controlled by three markers: 1. **`@pytest.mark.flaky`** — marks the test as potentially flaky since it depends on external services 2. **`@pytest.mark.integration`** — used for test selection, so integration tests can be included/excluded with `-m integration` / `-m "not integration"` 3. **`@skip_if_..._integration_tests_disabled`** decorator — skips the test when the required API keys or service endpoints are missing ### Adding New Integration Tests All three markers must be applied to every new integration test: ```python @pytest.mark.flaky @pytest.mark.integration @skip_if_openai_integration_tests_disabled async def test_openai_chat_completion() -> None: ... ``` For test files where all tests are integration tests (e.g., Azure Functions, Durable Task), use the module-level `pytestmark` list: ```python pytestmark = [ pytest.mark.flaky, pytest.mark.integration, pytest.mark.sample("01_single_agent"), pytest.mark.usefixtures("function_app_for_test"), ] ``` ### CI Workflow The merge CI workflow (`python-merge-tests.yml`) splits integration tests into parallel jobs by provider with change-based detection: - **Unit tests** — always run all n