# PolySearch

Multi-engine web + image search with smart proxy distribution, circuit breakers, structured AI agent output, and a REST API.

```bash
node src/index.js -q "quantum computing" -t both -l 10 -m agent
```

## Features

- **Web + Image search**: both result types, one tool
- **Smart proxy distribution**: least-used selection per hour, balanced across providers
- **Circuit breaker per proxy**: exponential backoff on failure, auto-recovery
- **Multi-provider proxy system**: add Webshare, Oxylabs, BrightData in one file each
- **Multi-engine architecture**: add Brave, Bing, Google in one file each
- **Per-provider metrics**: requests, success rate, latency, hourly usage grouped by provider
- **Dual output modes**:
  - `human`: colorized terminal
  - `agent`: structured JSON with statistics
- **REST API**: single search, batch search, auth with API keys. See [API.md](API.md)

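To make the proxy features above concrete, here is a minimal sketch of least-used-per-hour selection combined with a per-proxy circuit breaker. It is not the code in `src/http/proxy.js`: the `ProxyPool` class, its method names, the cooldown values, and the example proxy URLs are all assumptions made for illustration.

```javascript
// Hypothetical sketch, not PolySearch's actual implementation.
class ProxyPool {
  constructor(urls, { baseCooldownMs = 1000, maxCooldownMs = 60000, now = Date.now } = {}) {
    this.now = now; // injectable clock, handy for tests
    this.baseCooldownMs = baseCooldownMs;
    this.maxCooldownMs = maxCooldownMs;
    this.proxies = urls.map((url) => ({
      url,
      hour: -1,        // hour bucket the usage counter belongs to
      usedThisHour: 0, // requests handed out in that hour
      failures: 0,     // consecutive failures
      openUntil: 0,    // breaker stays open until this timestamp
    }));
  }

  // Return the available proxy with the fewest uses in the current hour,
  // or null when every breaker is open.
  acquire() {
    const t = this.now();
    const hour = Math.floor(t / 3_600_000);
    let best = null;
    for (const p of this.proxies) {
      if (p.hour !== hour) { p.hour = hour; p.usedThisHour = 0; }
      if (p.openUntil > t) continue; // breaker open: skip until cooldown ends
      if (best === null || p.usedThisHour < best.usedThisHour) best = p;
    }
    if (best === null) return null;
    best.usedThisHour += 1;
    return best.url;
  }

  // A success closes the breaker and resets the backoff.
  reportSuccess(url) {
    const p = this.find(url);
    p.failures = 0;
    p.openUntil = 0;
  }

  // Each consecutive failure doubles the cooldown, up to maxCooldownMs.
  reportFailure(url) {
    const p = this.find(url);
    p.failures += 1;
    const cooldown = Math.min(this.baseCooldownMs * 2 ** (p.failures - 1), this.maxCooldownMs);
    p.openUntil = this.now() + cooldown;
  }

  find(url) {
    const p = this.proxies.find((x) => x.url === url);
    if (!p) throw new Error(`unknown proxy: ${url}`);
    return p;
  }
}

// Usage with a fake clock so the behaviour is deterministic.
let clock = 0;
const pool = new ProxyPool(
  ["http://proxy-a.example:8080", "http://proxy-b.example:8080"],
  { now: () => clock },
);
console.log(pool.acquire()); // http://proxy-a.example:8080
console.log(pool.acquire()); // http://proxy-b.example:8080
pool.reportFailure("http://proxy-a.example:8080");
console.log(pool.acquire()); // http://proxy-b.example:8080 (a is cooling down)
clock += 1000;
console.log(pool.acquire()); // http://proxy-a.example:8080 (cooldown elapsed)
```

Auto-recovery here simply means the proxy becomes eligible again once its cooldown expires. A real pool might instead send a single probe request before fully reopening the proxy.
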
## Requirements

Node.js 18+

## Quick start

```bash
# Image search (the default type)
node src/index.js -q "vintage radio"

# Web search
node src/index.js -q "quantum computing" -t web

# Both types, AI agent JSON
node src/index.js -q "spacex starship" -t both -l 10 -m agent

# Show proxy metrics after a search
node src/index.js -q "mars rover" -M
```

## CLI

| Flag | Long | Description | Default |
|------|------|-------------|---------|
| `-q` | `--query` | Search query | - |
| `-t` | `--type` | `web`, `image`, or `both` | `image` |
| `-l` | `--limit` | Max results per type | `10` |
| `-m` | `--mode` | `human` or `agent` | `human` |
| `-p` | `--proxy` | Single proxy URL override | - |
| `-c` | `--config` | Path to config file | auto-detect |
| `-M` | `--metrics` | Dump proxy pool metrics | - |
| | `--serve` | Start REST API server | - |
| | `--port` | API server port | `9876` |
| | `--generate-key` | Generate API key | - |
| `-h` | `--help` | Show help | - |

## REST API

The REST API is intended for AI agent consumption. See [API.md](API.md) for full documentation.

```bash
node src/index.js --generate-key          # create an API key
node src/index.js --serve --port 9876     # start the server
```

**Endpoints:** `GET /health`, `POST /search`, `POST /batch`, `GET /metrics`

---

## Proxy providers

Providers are auto-discovered from environment variables:

| Provider | Env vars | Type |
|----------|----------|------|
| Webshare | `WEBSHARE_API_KEY` | API-fetched, 10 rotating IPs |
| Oxylabs | `OXYLABS_USERNAME`, `OXYLABS_PASSWORD`, `OXYLABS_COUNTRY` | Single datacenter endpoint |

Add a new provider by creating a file in `src/http/providers/` that calls `registerProvider(name, fetcher)`. The fetcher returns an array of proxy URL strings.

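As a rough illustration of that registration step, the sketch below defines a stand-in registry and registers one provider. Only the `registerProvider(name, fetcher)` call shape comes from this README. The `providers` map, the `loadAllProxies` helper, the `example` provider, and the `EXAMPLE_PROXY_URLS` variable are hypothetical, and the real registry in `src/http/providers/index.js` may behave differently.

```javascript
// Stand-in for the provider registry (assumed shape, for illustration only).
const providers = new Map();

function registerProvider(name, fetcher) {
  if (typeof fetcher !== "function") throw new TypeError("fetcher must be a function");
  providers.set(name, fetcher);
}

// Hypothetical provider: reads a comma-separated proxy list from the
// EXAMPLE_PROXY_URLS environment variable (a made-up name). When the
// variable is unset, it contributes no proxies.
registerProvider("example", () => {
  const raw = process.env.EXAMPLE_PROXY_URLS || "";
  return raw
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
});

// Collect proxies from every registered provider, grouped by provider name.
async function loadAllProxies() {
  const result = {};
  for (const [name, fetcher] of providers) {
    result[name] = await fetcher(); // await tolerates sync and async fetchers
  }
  return result;
}
```

Returning an empty array when the environment variable is missing mirrors the auto-discovery idea: a provider without credentials simply adds nothing to the pool.
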
## Engine architecture

Engines are registered in `src/engines/setup.js`. Each engine supports `web`, `image`, or both. DuckDuckGo is the default engine. To add Brave, Bing, or a custom engine, implement the `search(query, opts)` interface.

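The following sketch shows what an engine implementing `search(query, opts)` could look like, along with a simple fallback loop over several engines. Only the `search(query, opts)` signature comes from this README. The `StaticEngine` class, the `types` field, the `type` and `limit` option names, the result shape, and `searchWithFallback` are assumptions, and the real base class in `src/engines/base.js` may require more.

```javascript
// Hypothetical engine that searches an in-memory list instead of the network.
class StaticEngine {
  constructor(name, docs) {
    this.name = name;
    this.types = ["web"]; // result types this engine supports (assumed field)
    this.docs = docs;
  }

  // opts.type and opts.limit are assumed to mirror the CLI's --type and --limit.
  async search(query, opts = {}) {
    const { type = "web", limit = 10 } = opts;
    if (!this.types.includes(type)) return [];
    const q = query.toLowerCase();
    return this.docs
      .filter((d) => d.title.toLowerCase().includes(q))
      .slice(0, limit)
      .map((d) => ({ type, title: d.title, url: d.url, engine: this.name }));
  }
}

// Try each engine in order and return the first non-empty result set.
// If every engine fails or returns nothing, rethrow the last error (if any).
async function searchWithFallback(engines, query, opts) {
  let lastError = null;
  for (const engine of engines) {
    try {
      const results = await engine.search(query, opts);
      if (results.length > 0) return results;
    } catch (err) {
      lastError = err;
    }
  }
  if (lastError) throw lastError;
  return [];
}
```

Passing the engines as an ordered array keeps the fallback policy explicit: the default engine goes first, and alternatives are consulted only when it fails or returns nothing.
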
## Project structure

```
src/
├── index.js              # CLI + programmatic API + API server dispatch
├── api.js                # REST API server (/search, /batch, /metrics, /health)
├── api-key.js            # Key generation + env storage
├── cli.js                # Argument parsing
├── config.js             # Config loader (json + env + providers)
├── run.js                # Search orchestration + engine fallback
├── engines/
│   ├── base.js           # Abstract engine interface
│   ├── index.js          # Engine registry
│   ├── setup.js          # Built-in engine registration
│   └── duckduckgo.js     # DuckDuckGo (web + image)
├── http/
│   ├── client.js         # Fetch wrapper (proxy, retry, timeout, UA)
│   ├── proxy.js          # Proxy pool (least-used, circuit breaker, metrics)
│   └── providers/
│       ├── index.js      # Provider registry
│       ├── webshare.js   # Webshare.io
│       └── oxylabs.js    # Oxylabs datacenter
├── output/
│   ├── human.js          # Terminal formatting
│   └── agent.js          # JSON formatting
└── utils/
    ├── logger.js         # Pino structured logging
    ├── retry.js          # Exponential backoff + jitter
    └── ua.js             # User-agent pool
```