Ali1688 Sourcing

Name: Ali1688 Sourcing
Author: zhuhongyin

zhuhongyin/global-ecom-skills

Scrape 1688.com listings to compare factory MOQs, tiered wholesale prices, and production hubs before you commit to a SKU.

Overview

Ali1688 Sourcing is an agent skill for the Validate phase. It uses a Python scraper to find factory and wholesale price listings on 1688.com for ecommerce supplier research.

Install

npx skills add https://github.com/zhuhongyin/global-ecom-skills --skill ali1688-sourcing

What is this skill?

CLI scraper for 1688.com keyword search with optional province filtering
Structured product records with MOQ-style quantity tiers and per-tier pricing
Reference maps for major China production clusters by product category
Certification hints (CE, FCC, ROHS, FDA, UL) for export-minded sourcing checks
Falls back to mock data when requests/BeautifulSoup are not installed
Reference data covers 6 product-category production hub groupings (e.g. office furniture, small appliances)
5 export certification types documented (CE, FCC, ROHS, FDA, UL)
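To make the tiered-pricing idea concrete, here is a minimal sketch of looking up the unit price that applies to a given order quantity. It reuses the `PriceTier` shape from the script (`quantity` as a string, `price` as a float). The quantity formats ("2-99", "≥500"), the helper names `min_quantity` and `unit_price_for`, and the sample prices are all assumptions for illustration, not values taken from 1688.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PriceTier:
    quantity: str  # assumed formats: "2-99", "100-499", "≥500"
    price: float   # unit price for that tier


def min_quantity(tier: PriceTier) -> int:
    """Return the first integer in the quantity string as the tier's minimum."""
    match = re.search(r"\d+", tier.quantity)
    if match is None:
        raise ValueError(f"no quantity found in {tier.quantity!r}")
    return int(match.group())


def unit_price_for(tiers: List[PriceTier], order_qty: int) -> Optional[float]:
    """Price of the highest tier whose minimum is <= order_qty.

    Returns None when the order is below the smallest tier (below MOQ).
    """
    eligible = [t for t in tiers if min_quantity(t) <= order_qty]
    if not eligible:
        return None
    return max(eligible, key=min_quantity).price


# Illustrative tiers only; real listings will differ.
tiers = [
    PriceTier("2-99", 185.0),
    PriceTier("100-499", 172.0),
    PriceTier("≥500", 160.0),
]
print(unit_price_for(tiers, 150))
```

Returning `None` below the smallest tier keeps "cannot order this few" distinct from a price of zero, which matters when the result feeds a margin model.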

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 543 installs on skills.sh; 1 GitHub star; 2 of 3 security scanners passed (skills.sh audits).

What problem does it solve?

You cannot estimate real wholesale tiers or factory options for a product idea without manually trawling Chinese wholesale marketplaces.

Who is it for?

Indie ecommerce operators comparing MOQs and factory regions on 1688 before ordering samples or listing on Shopify/Amazon.

Skip if: you need US/EU retail analytics, automated purchase orders, or compliance filing without human supplier vetting.

When should I use this skill?

You need 1688 factory or wholesale price research with keyword search, optional province filter, and structured JSON output for margin modeling.
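The usage notes in the script show three flags: `--keyword`, `--limit`, and `--province`. As a rough sketch of how such a command line might be declared with `argparse`, consider the following. The function names, help strings, and the default limit of 20 are assumptions; the real script may define its arguments differently.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags that appear in the script's usage notes."""
    parser = argparse.ArgumentParser(
        description="Search 1688.com listings by keyword (illustrative sketch)."
    )
    parser.add_argument("--keyword", required=True,
                        help="search term, Chinese or English")
    # Assumption: the source only shows `--limit 20` in an example,
    # so 20 is used here as a placeholder default.
    parser.add_argument("--limit", type=int, default=20,
                        help="maximum number of listings to return")
    parser.add_argument("--province", default=None,
                        help="optional province filter, e.g. 浙江 (Zhejiang)")
    return parser


def parse_cli(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


args = parse_cli(["--keyword", "desk converter", "--province", "浙江"])
print(args.keyword, args.limit, args.province)
```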

What do I get? / Deliverables

You get JSON-structured 1688 product candidates with tiered prices and metadata you can plug into margin spreadsheets and shortlists.

JSON export of 1688 products with price tiers and URLs
Console or file output from scrape_1688.py runs for supplier shortlists
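One way to move such a JSON export into a margin spreadsheet is to flatten it into rows and write CSV. The sketch below assumes the export mirrors the script's dataclasses (a top-level `factories` list, each with `company_name` and a `products` list carrying `title`, `moq`, `starting_price`, and `url`). The actual output layout is not shown in the preview, so treat the field paths, the helper names, and the sample record as hypothetical.

```python
import csv
import io
import json
from typing import Dict, List

# Hypothetical sample shaped like the Factory/Product dataclasses.
SAMPLE = json.loads("""
{
  "keyword": "desk converter",
  "factories": [
    {"company_name": "Example Factory A",
     "products": [
       {"title": "Desk converter 80cm", "moq": 2,
        "starting_price": 185.0, "url": "https://example.com/a1"},
       {"title": "Desk converter 95cm", "moq": 10,
        "starting_price": 210.0, "url": "https://example.com/a2"}
     ]}
  ]
}
""")

FIELDS = ["company_name", "title", "moq", "starting_price", "url"]


def flatten_products(result: Dict) -> List[Dict]:
    """One row per product, carrying the factory name alongside it."""
    rows = []
    for factory in result.get("factories", []):
        for product in factory.get("products", []):
            rows.append({
                "company_name": factory.get("company_name", ""),
                "title": product.get("title", ""),
                "moq": product.get("moq"),
                "starting_price": product.get("starting_price"),
                "url": product.get("url", ""),
            })
    return rows


def rows_to_csv(rows: List[Dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_products(SAMPLE)
print(rows_to_csv(rows))
```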

Journey fit

Primary fit

Validate / Pricing & offer

Validate is where solo ecommerce builders prove unit economics and supplier fit before taking on inventory risk. The Pricing & offer subphase fits because the skill supports wholesale tier comparison and landed-cost estimates based on factory listings.
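As a worked illustration of landed-cost thinking, the sketch below converts a CNY unit price to USD, adds duty and per-unit freight, and derives a gross margin against a retail price. Every number here (exchange rate, freight, duty rate, marketplace fee, retail price) is a placeholder assumption, and the formula is a simplified model rather than anything the skill itself computes.

```python
def landed_cost_usd(unit_price_cny: float, cny_per_usd: float,
                    freight_per_unit_usd: float, duty_rate: float) -> float:
    """Product cost in USD plus duty on that cost plus per-unit freight."""
    product_cost_usd = unit_price_cny / cny_per_usd
    duty_usd = product_cost_usd * duty_rate
    return product_cost_usd + duty_usd + freight_per_unit_usd


def gross_margin(retail_price_usd: float, landed_usd: float,
                 fee_rate: float) -> float:
    """Share of the retail price left after landed cost and selling fees."""
    fees_usd = retail_price_usd * fee_rate
    return (retail_price_usd - landed_usd - fees_usd) / retail_price_usd


# Placeholder inputs: 172 CNY unit price, 7.2 CNY per USD,
# 6 USD freight per unit, 10% duty, 79 USD retail, 15% marketplace fee.
landed = landed_cost_usd(172.0, 7.2, 6.0, 0.10)
margin = gross_margin(79.0, landed, 0.15)
print(f"landed cost: ${landed:.2f}, gross margin: {margin:.1%}")
```

Rerunning the same calculation for each price tier shows how much margin a larger first order would buy, which is the comparison this phase is meant to support.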

How it compares

Lightweight Python scraper skill, not a licensed sourcing agent, ERP, or customs brokerage workflow.

Common Questions / FAQ

Who is ali1688-sourcing for?

Solo and small-shop sellers researching factory wholesale prices and production areas on 1688 during product validation.

When should I use ali1688-sourcing?

Use it in Validate when modeling unit economics, comparing tiered MOQ pricing, or shortlisting suppliers before sample orders.

Is ali1688-sourcing safe to install?

Check the Security Audits panel on this page. The skill performs outbound HTTP scraping, so review the code, any credentials it uses, and robots.txt and terms-of-service compliance before use.
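For the robots.txt part of that review, Python's standard library includes `urllib.robotparser`. The sketch below parses robots rules from a string so it runs without network access. The rules, user-agent name, and URLs are made-up examples and say nothing about what 1688.com actually permits. Against a live site you would instead call `set_url(...)` and `read()` on the parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(EXAMPLE_ROBOTS, "my-research-bot", "https://example.com/public/page"))
print(is_allowed(EXAMPLE_ROBOTS, "my-research-bot", "https://example.com/private/page"))
```

A robots.txt check covers only one piece of compliance; a site's terms of service can restrict scraping even where robots.txt allows a path.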

SKILL.md


#!/usr/bin/env python3
"""
1688 Factory & Wholesale Price Scraper
Find factories and wholesale prices on 1688.com

Usage:
    python scrape_1688.py --keyword "升降桌" --limit 20              # "升降桌" = height-adjustable desk
    python scrape_1688.py --keyword "desk converter" --province 浙江   # 浙江 = Zhejiang
"""

import argparse
import json
import re
import time
import random
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import List, Optional
from urllib.parse import quote_plus

try:
    import requests
    from bs4 import BeautifulSoup
    HAS_REQUESTS = True
except ImportError:
    HAS_REQUESTS = False
    print("Warning: requests/beautifulsoup4 not installed, will use mock data")


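# Product category -> main production hubs. The Chinese strings are kept as data;
# English glosses are given in the comments.
PRODUCTION_AREAS = {
    "办公家具": ["浙江安吉", "广东佛山", "江苏苏州"],  # office furniture: Anji (Zhejiang), Foshan (Guangdong), Suzhou (Jiangsu)
    "小家电": ["广东顺德", "浙江慈溪"],  # small appliances: Shunde (Guangdong), Cixi (Zhejiang)
    "箱包": ["浙江平湖", "广东花都"],  # bags and luggage: Pinghu (Zhejiang), Huadu (Guangdong)
    "玩具": ["广东澄海", "浙江云和"],  # toys: Chenghai (Guangdong), Yunhe (Zhejiang)
    "纺织品": ["浙江绍兴", "江苏南通"],  # textiles: Shaoxing (Zhejiang), Nantong (Jiangsu)
    "五金工具": ["浙江永康", "广东东莞"],  # hardware tools: Yongkang (Zhejiang), Dongguan (Guangdong)
}

# Certification code -> short description.
CERTIFICATIONS = {
    "CE": "欧洲安全认证",  # European safety certification
    "FCC": "美国联邦通信认证",  # US Federal Communications certification
    "ROHS": "环保认证",  # environmental certification
    "FDA": "美国食品药品认证",  # US Food and Drug certification
    "UL": "美国安全认证",  # US safety certification
}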

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


@dataclass
class PriceTier:
    quantity: str
    price: float


@dataclass
class Product:
    product_id: str
    title: str
    url: str
    image_url: str
    price_tiers: List[PriceTier]
    starting_price: float
    moq: int
    material: str
    colors: List[str]
    size: str
    weight: str
    packaging: str
    lead_time: str
    certifications: List[str]
    customization: bool
    sample_available: bool
    sample_price: Optional[float]


@dataclass
class Factory:
    rank: int
    company_name: str
    company_url: str
    verified: bool
    verification_type: str
    location: dict
    main_products: List[str]
    factory_info: dict
    products: List[Product]
    trade_info: dict
    ratings: dict
    transaction_history: dict
    contact: dict
    notes: str


@dataclass
class SourcingGuide:
    recommended_factories: List[dict]
    negotiation_tips: List[str]
    quality_checklist: List[str]
    shipping_options: List[dict]


@dataclass
class Ali1688SearchResult:
    keyword: str
    search_time: str
    total_results: int
    returned_results: int
    lowest_price: float
    highest_price: float
    average_price: float
    median_price: float
    main_production_areas: List[str]
    recommended_starting_price: float
    wholesale_price: dict
    factories: List[Factory]
    sourcing_guide: SourcingGuide
    price_for_calculator: dict


class Ali1688Scraper:
    
    def __init__(self):
        self.base_url = "https://www.1688.com"
        self.session = None
        if HAS_REQUESTS:
            self.session = requests.Session()
            self.session.headers.update({
                "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
                "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
                "Accept-Encoding": "gzip, deflate, br",
                "Connection": "keep-alive",
            })
    
    def _get_headers(self) -> dict:
        return {
            "User-Agent": random.choice(USER_AGENTS),
            "Referer": self.base_url,
        }
    
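    def _fetch_page(self, url: str, retries: int = 3) -> Optional[str]:
        """Fetch a page and return its HTML, or None if every attempt fails."""
        if not HAS_REQUESTS or not self.session:
            return None
        
        for attempt in range(retries):
            try:
                # Pause 1-2 seconds before each request to avoid hammering the site.
                time.sleep(random.uniform(1, 2))
                response = self.session.get(
                    url,
                    headers=self._get_headers(),
                    timeout=30,
                    allow_redirects=True
                )
                
                if response.status_code == 200:
                    response.encoding = 'utf-8'
                    # NOTE: the listing preview is truncated at this point. Everything
                    # below is an assumed completion so the method parses and runs;
                    # it is not the author's code.
                    return response.text
            except requests.RequestException:
                # Assumed handling: move on to the next attempt on network errors.
                continue
        
        return None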
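The `Ali1688SearchResult` dataclass above carries `lowest_price`, `highest_price`, `average_price`, and `median_price` fields, and the script imports `asdict` and `json`. The preview does not show how those fields are filled, so the following is only one plausible sketch: it derives the four statistics from a list of starting prices and serializes them with `ensure_ascii=False` so Chinese keywords stay readable in the output. `PriceSummary` and `summarize` are hypothetical helpers, and the prices are invented.

```python
import json
import statistics
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PriceSummary:  # hypothetical helper, not part of the source script
    lowest_price: float
    highest_price: float
    average_price: float
    median_price: float


def summarize(prices: List[float]) -> PriceSummary:
    """Compute the four summary statistics from a non-empty list of prices."""
    if not prices:
        raise ValueError("prices must not be empty")
    return PriceSummary(
        lowest_price=min(prices),
        highest_price=max(prices),
        average_price=round(statistics.mean(prices), 2),
        median_price=statistics.median(prices),
    )


# Invented starting prices (CNY) for illustration.
summary = summarize([160.0, 172.0, 185.0, 210.0])

# ensure_ascii=False keeps "升降桌" as-is instead of \uXXXX escapes.
payload = json.dumps({"keyword": "升降桌", **asdict(summary)},
                     ensure_ascii=False, indent=2)
print(payload)
```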
