gsearch

Capturing the API

The capture command records the raw AJAX responses Google returns while a SERP loads, one file per response, for research.

gsearch capture is a research aid. While a Google Search page loads, the browser makes a series of background requests: the batchexecute and /_/search XHR calls that carry Google's internal data. capture records the raw JSON responses to those calls, one file per response, so you can study Google's internal structure offline. Everyday extraction does not need this, because search reads the rendered page directly.

Recording a capture

gsearch capture "epl" --no-headless --out /tmp/epl-capture
capturing network for 'epl'...
    1 https://www.google.com/search?q=epl&...
       text/html  →  <!doctype html><html...
    2 https://www.google.com/_/search/...
       application/json  →  )]}'  [["...
  ...

12 responses captured → /tmp/epl-capture

Each response is written as a JSON file named HHMMSS-<slug>-NNN.json, carrying the request URL, the content type, and the raw body. The console lists each one as it lands.
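For offline work it can help to split a capture filename into its parts and to decode a raw body. The sketch below is illustrative only and is not part of gsearch. It assumes NNN is a run of digits, that the slug may itself contain hyphens, and that a JSON body may start with the )]}' guard prefix visible in the console output above. The helper names parse_capture_name and decode_body are hypothetical.

```python
import json
import re

# Pattern for the HHMMSS-<slug>-NNN.json naming described above.
# Assumption: NNN is all digits and the slug may contain hyphens.
CAPTURE_NAME = re.compile(r"^(?P<time>\d{6})-(?P<slug>.+)-(?P<seq>\d+)\.json$")

# Guard prefix seen at the start of the JSON bodies in the console output.
XSSI_PREFIX = ")]}'"


def parse_capture_name(name):
    """Split a capture filename into time, slug and sequence number.

    Returns None when the name does not follow the pattern.
    """
    match = CAPTURE_NAME.match(name)
    if match is None:
        return None
    return {
        "time": match.group("time"),
        "slug": match.group("slug"),
        "seq": int(match.group("seq")),
    }


def decode_body(body):
    """Strip the guard prefix, if present, and parse the rest as JSON.

    Raises ValueError when the remainder is not a single JSON document.
    """
    text = body.lstrip()
    if text.startswith(XSSI_PREFIX):
        text = text[len(XSSI_PREFIX):]
    return json.loads(text)
```

Some bodies may not be a single JSON document (HTML responses, for example, or payloads split into several chunks). decode_body raises ValueError for those, so callers should be ready to fall back to the raw text.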

Where it writes

--out sets the output directory. Without it, captures go under ~/data/gsearch/capture:

gsearch capture "weather london" --out /tmp/capture
gsearch capture "epl"                          # defaults to ~/data/gsearch/capture

The directory is created if it does not exist.

Headless or visible

Like search, capture runs headlessly by default. Pass --no-headless to watch the window, which is the common choice the first time, both to see what loads and to clear any consent or CAPTCHA page in the same session:

gsearch capture "epl" --no-headless

capture also accepts --profile-dir, --no-profile, and --timeout, with the same meaning as on search. See the CLI reference for the full flag list.

When to use it

Reach for capture when you are investigating how Google assembles a page, for example to find which response carries a particular block, or to compare the raw payload against what extract.js pulled out of the rendered DOM. For getting the structured content of a search, search is the command you want; capture sits underneath it as a way to look at the raw traffic.
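One way to find which response carries a particular block is to search every capture file for a string you can see on the rendered page. The sketch below is a hypothetical helper, not a gsearch feature. It makes no assumption about the field names inside a capture file: it loads each file as JSON and checks every string value it finds, wherever it is nested.

```python
import json
from pathlib import Path


def iter_strings(value):
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def find_in_captures(capture_dir, needle):
    """Return the names of capture files with a string value containing needle."""
    hits = []
    for path in sorted(Path(capture_dir).glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or malformed file: skip it rather than abort the search.
            continue
        if any(needle in text for text in iter_strings(record)):
            hits.append(path.name)
    return hits
```

For example, find_in_captures("/tmp/epl-capture", "Premier League") would list the files whose contents mention that phrase, which narrows down where to look before reading a payload by hand.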