Proxy Formats and Parsers

Supported proxy formats, parser profiles, JSON paths, HTML selectors, and dedupe behavior.

ZeroTrace Proxy accepts a broad set of proxy input formats across extractor, checker, benchmark, leak test, chain tester, and viewer workflows.

Supported Proxy Formats

| Format | Example |
| --- | --- |
| Host and port | 192.0.2.10:8080 |
| Scheme URL | http://192.0.2.10:8080 |
| SOCKS URL | socks5://198.51.100.20:1080 |
| User/pass URL | http://user:pass@192.0.2.10:8080 |
| Host-port-user-pass | 192.0.2.10:8080:user:pass |
| IPv6 bracketed | http://[2001:db8::10]:8080 |

Supported schemes include:

  • HTTP
  • HTTPS
  • SOCKS4
  • SOCKS4a
  • SOCKS5
  • SOCKS5h

Scheme variants with missing or malformed :// are normalized when possible.
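
The formats above can be illustrated with a small parser. This is a minimal sketch, not ZeroTrace Proxy's actual implementation: the `Proxy` tuple, the `parse_proxy` function, and the specific repairs applied to malformed separators (`:/`, `//`, `:///`) are assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional


class Proxy(NamedTuple):
    """Hypothetical container for one parsed proxy entry."""
    scheme: Optional[str]
    host: str
    port: int
    username: Optional[str] = None
    password: Optional[str] = None


# Longest names first; accepts "://" plus a few malformed variants (assumed set).
_SCHEME_RE = re.compile(
    r"^(socks5h|socks5|socks4a|socks4|https|http)(?::/{1,3}|//)(.+)$",
    re.IGNORECASE,
)
_BRACKETED_RE = re.compile(r"^\[([^\]]+)\]:(\d+)$", re.ASCII)


def _to_port(text: str) -> Optional[int]:
    """Return the port as an int if it is in 1..65535, else None."""
    if text.isascii() and text.isdigit() and 1 <= int(text) <= 65535:
        return int(text)
    return None


def parse_proxy(line: str) -> Optional[Proxy]:
    """Parse one proxy string in any of the documented formats."""
    rest = line.strip()
    scheme = None
    match = _SCHEME_RE.match(rest)
    if match:
        scheme, rest = match.group(1).lower(), match.group(2)
    rest = rest.rstrip("/")

    user = password = None
    if "@" in rest:  # user:pass@host:port
        userinfo, rest = rest.rsplit("@", 1)
        user, _, password = userinfo.partition(":")

    match = _BRACKETED_RE.match(rest)
    if match:  # [IPv6]:port
        host, port_text = match.group(1), match.group(2)
    else:
        parts = rest.split(":")
        if len(parts) == 4 and user is None:  # host:port:user:pass
            host, port_text, user, password = parts
        elif len(parts) == 2:  # host:port
            host, port_text = parts
        else:
            return None

    port = _to_port(port_text)
    if not host or port is None:
        return None
    return Proxy(scheme, host, port, user, password or None)
```

Entries that do not fit any documented shape return `None` in this sketch, so a caller can count them as rejected candidates.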

Endpoint Validation

The parser validates:

  • IPv4 hosts
  • IPv6 hosts
  • domains
  • localhost
  • ports from 1 to 65535
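
As a rough illustration of these checks, the sketch below validates a host and a port with the Python standard library. The function names are hypothetical, and the domain rules (at least two labels, letters, digits, and hyphens only, no all-numeric final label) are assumptions; the real parser may accept or reject different edge cases.

```python
import ipaddress
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with "-".
_LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_host(host: str) -> bool:
    """Accept IPv4, IPv6 (without brackets), localhost, and dotted domains."""
    if host.lower() == "localhost":
        return True
    try:
        ipaddress.ip_address(host)  # handles both IPv4 and IPv6
        return True
    except ValueError:
        pass
    labels = host.rstrip(".").split(".")
    if len(host) > 253 or len(labels) < 2:
        return False
    if labels[-1].isdigit():  # e.g. "256.1.1.1" is a malformed IPv4, not a domain
        return False
    return all(_LABEL_RE.fullmatch(label) for label in labels)


def is_valid_port(port: int) -> bool:
    """Ports must fall in the documented 1 to 65535 range."""
    return 1 <= port <= 65535
```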

Parser Profiles

| Profile | Behavior |
| --- | --- |
| Auto | Chooses JSON, HTML, or plain text based on content type, body shape, selector, and JSON path. |
| Plain text | Runs text extraction directly. |
| HTML | Uses selectors, common elements, attributes, tables, lists, and script-like blocks. |
| JSON | Walks JSON payloads and proxy-shaped objects. |
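
One way to picture the Auto profile is as an ordered set of checks over the four signals named in the table. The sketch below is an assumption about how such a decision could work; the precedence (explicit JSON path, then selector, then content type, then body shape) and the `choose_profile` name are illustrative, not documented behavior.

```python
import json


def _looks_like_json(body: str) -> bool:
    """Body shape check: starts like JSON and actually parses as JSON."""
    text = body.lstrip()
    if not text.startswith(("{", "[")):
        return False
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False
    return True


def choose_profile(content_type="", body="", selector=None, json_path=None):
    """Return "json", "html", or "text" (assumed precedence, see lead-in)."""
    if json_path:
        return "json"
    if selector:
        return "html"
    media_type = content_type.split(";")[0].strip().lower()
    if media_type.endswith("json"):
        return "json"
    if media_type in ("text/html", "application/xhtml+xml"):
        return "html"
    if _looks_like_json(body):
        return "json"
    head = body.lstrip()[:256].lower()
    if head.startswith(("<!doctype html", "<html")):
        return "html"
    return "text"
```

Parsing the body rather than only checking its first character keeps a plain-text line such as `[2001:db8::10]:8080` from being mistaken for a JSON array.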

JSON Path Examples

data.items[*]
data.items[*].proxy
sources[0].items[*]

Supported path features:

  • dot paths
  • numeric indexes like [0]
  • wildcards like [*]
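
To show how these three features combine, here is a minimal path walker. It is a sketch under stated assumptions: the `select` function is hypothetical, `[*]` is applied to arrays only, and missing keys or out-of-range indexes simply produce no match.

```python
import re

# A path token is either a key (between dots) or a bracketed index / wildcard.
_TOKEN_RE = re.compile(r"([^.\[\]]+)|\[(\d+|\*)\]")


def select(data, path):
    """Return every value in `data` that matches `path`."""
    nodes = [data]
    for key, index in _TOKEN_RE.findall(path):
        next_nodes = []
        for node in nodes:
            if key:  # dot path segment
                if isinstance(node, dict) and key in node:
                    next_nodes.append(node[key])
            elif index == "*":  # wildcard over an array
                if isinstance(node, list):
                    next_nodes.extend(node)
            elif isinstance(node, list) and int(index) < len(node):
                next_nodes.append(node[int(index)])  # numeric index
        nodes = next_nodes
    return nodes


payload = {
    "data": {"items": [{"proxy": "192.0.2.10:8080"}, {"proxy": "198.51.100.20:1080"}]},
    "sources": [{"items": ["203.0.113.5:3128"]}],
}
proxies = select(payload, "data.items[*].proxy")
```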

HTML Selector Examples

table tbody tr
pre
code
[data-proxy]

Use selectors when Auto mode finds too much or too little.

Extractor Sources and Fetch Controls

| Feature | Detail |
| --- | --- |
| Sources | Single source, many sources, and local file. |
| HTTP controls | Timeout seconds, concurrency, redirects, random User-Agent, custom User-Agent, custom headers, and cookies. |
| Body handling | Gzip, zlib, and deflate-style decompression plus an extraction body cap around 4 MiB. |
| Per-source metrics | HTTP status, duration, content type, response bytes, parser used, extraction scope count, candidate count, duplicate count, proxy count, and error. |
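
The body handling row can be sketched with Python's `zlib` module, which decodes gzip, zlib-wrapped, and raw deflate streams and can cap the decompressed size. The `decode_body` function, the exact 4 MiB constant, and the choice to key off the `Content-Encoding` value are assumptions for illustration; the tool's real cap is described only as "around 4 MiB".

```python
import gzip
import zlib

MAX_BODY_BYTES = 4 * 1024 * 1024  # assumed value for the "around 4 MiB" cap


def decode_body(raw: bytes, content_encoding: str = "", limit: int = MAX_BODY_BYTES) -> bytes:
    """Decompress a response body and return at most `limit` bytes."""
    encoding = content_encoding.strip().lower()
    if encoding in ("gzip", "x-gzip"):
        candidates = [16 + zlib.MAX_WBITS]  # gzip container
    elif encoding == "deflate":
        # Usually zlib-wrapped, but some servers send raw deflate; try both.
        candidates = [zlib.MAX_WBITS, -zlib.MAX_WBITS]
    else:
        return raw[:limit]
    for wbits in candidates:
        try:
            # max_length stops decompression once `limit` bytes are produced.
            return zlib.decompressobj(wbits).decompress(raw, limit)
        except zlib.error:
            continue
    return raw[:limit]  # undecodable: fall back to the raw bytes


if __name__ == "__main__":
    sample = gzip.compress(b"192.0.2.10:8080\n")
    print(decode_body(sample, "gzip"))
```

Capping during decompression, rather than after it, avoids expanding a very large compressed body in memory before truncating it.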

HTML Extraction Details

HTML mode can read selected elements, table tr, pre, code, textarea, ordered lists, unordered lists, data-ip, data-host, data-proxy, and data-address. It also reads useful attributes such as data-*, href, content, value, and title, understands table cells and child elements, and inspects embedded scripts that look like JSON or contain proxy-shaped values.
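
A reduced version of this behavior can be sketched with the standard library's `html.parser`. The example covers only a subset, namely text inside `pre`, `code`, `textarea`, and `li`, plus the four `data-*` attributes named above; the class and function names are hypothetical, and selector matching, table-cell joining, and script inspection are left out.

```python
from html.parser import HTMLParser

TEXT_TAGS = {"pre", "code", "textarea", "li"}
PROXY_ATTRS = {"data-ip", "data-host", "data-proxy", "data-address"}


class CandidateCollector(HTMLParser):
    """Collect raw candidate strings; parsing and validation happen later."""

    def __init__(self):
        super().__init__()
        self.candidates = []
        self._depth = 0  # number of currently open TEXT_TAGS elements

    def handle_starttag(self, tag, attrs):
        if tag in TEXT_TAGS:
            self._depth += 1
        for name, value in attrs:
            if name in PROXY_ATTRS and value:
                self.candidates.append(value.strip())

    def handle_endtag(self, tag):
        if tag in TEXT_TAGS and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:  # only keep text inside the tags of interest
            for line in data.splitlines():
                if line.strip():
                    self.candidates.append(line.strip())


def collect_candidates(html: str) -> list:
    collector = CandidateCollector()
    collector.feed(html)
    collector.close()
    return collector.candidates
```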

JSON Extraction Details

JSON mode collects strings, numbers, arrays, and object fields. It can infer proxy-shaped objects from host-like fields, port fields, scheme/protocol/type fields, username/user/login fields, and password/pass fields.
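
The inference step can be sketched as a recursive walk that turns proxy-shaped objects into proxy strings and passes plain strings through as candidates. The key lists below are assumptions: the scheme, username, and password names come from the description above, while `ip`, `address`, and `server` are guesses at what "host-like" might cover. The function names are hypothetical.

```python
HOST_KEYS = ("host", "ip", "address", "server")  # assumed "host-like" names
PORT_KEYS = ("port",)
SCHEME_KEYS = ("scheme", "protocol", "type")
USER_KEYS = ("username", "user", "login")
PASS_KEYS = ("password", "pass")


def _first(obj, keys):
    """Return the first non-empty string or number found under `keys`."""
    for key in keys:
        value = obj.get(key)
        if isinstance(value, (str, int)) and not isinstance(value, bool) and value != "":
            return str(value)
    return None


def proxy_from_object(obj):
    """Build a proxy string from a dict with at least a host and a port."""
    host, port = _first(obj, HOST_KEYS), _first(obj, PORT_KEYS)
    if host is None or port is None:
        return None
    scheme = _first(obj, SCHEME_KEYS)
    user, password = _first(obj, USER_KEYS), _first(obj, PASS_KEYS)
    prefix = f"{scheme.lower()}://" if scheme else ""
    auth = f"{user}:{password}@" if user and password else ""
    if ":" in host:  # bracket IPv6 literals
        host = f"[{host}]"
    return f"{prefix}{auth}{host}:{port}"


def walk(node):
    """Yield proxy candidates from nested JSON data."""
    if isinstance(node, dict):
        found = proxy_from_object(node)
        if found:
            yield found  # proxy-shaped: do not descend further
        else:
            for value in node.values():
                yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)
    elif isinstance(node, str):
        yield node
```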

Dedupe Modes

| Mode | Best for |
| --- | --- |
| Full | Keeping scheme/auth variants separate. |
| Host + port | Reducing duplicates where only scheme or auth varies. |
| Host | Aggressive cleanup by host. |

For mixed public proxy sources, start with the Auto parser profile and Host + port dedupe. Tighten to an HTML selector or a JSON path only when the per-source metrics show noise.
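
The three dedupe modes differ only in which fields form the comparison key. The sketch below assumes each proxy is a `(scheme, host, port, username, password)` tuple, that hosts compare case-insensitively, and that the first occurrence is the one kept; the mode names `full`, `host_port`, and `host` are illustrative identifiers, not the tool's own.

```python
def dedupe_key(proxy, mode):
    """Return the tuple used to decide whether two proxies are duplicates."""
    scheme, host, port, username, password = proxy
    host = host.lower()
    if mode == "full":
        return (scheme, host, port, username, password)
    if mode == "host_port":
        return (host, port)
    if mode == "host":
        return (host,)
    raise ValueError(f"unknown dedupe mode: {mode}")


def dedupe(proxies, mode="host_port"):
    """Keep the first proxy seen for each key, preserving input order."""
    seen = set()
    unique = []
    for proxy in proxies:
        key = dedupe_key(proxy, mode)
        if key not in seen:
            seen.add(key)
            unique.append(proxy)
    return unique


sample = [
    ("http", "192.0.2.10", 8080, None, None),
    ("socks5", "192.0.2.10", 8080, None, None),  # same host and port, new scheme
    ("http", "192.0.2.10", 3128, None, None),    # same host, new port
]
```

With this sample, `full` keeps all three entries, `host_port` keeps two, and `host` keeps one.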