
Model Quantization
Shrink LLM weights with INT8, INT4, or GPTQ quantization to cut VRAM use, speed up inference, and ship smaller models to edge devices or budget GPUs.
npx skills add https://github.com/martinholovsky/claude-skills-generator --skill model-quantization

| Installs | 142 |
|---|---|
| Repository | martinholovsky/claude-skills-generator |
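
To make the INT8 idea in the description concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only and is not taken from the skill itself; the helper names `quantize_int8` and `dequantize_int8` are hypothetical, and real toolchains (including GPTQ-style methods) use more elaborate per-channel or per-group schemes.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to the range [-127, 127].

    Hypothetical helper for illustration; returns the integer codes and the
    scale needed to map them back to floats.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map INT8 codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Each restored weight differs from the original by at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of four (FP32) or two (FP16), which is where the VRAM saving comes from; the cost is a rounding error bounded by half the scale per weight.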