Use Case: AI Agent Development
When an AI Agent needs to control a browser for tasks like "search competitor prices", "fill out and submit forms", or "extract structured data from web pages", the traditional approach has the Agent generate Puppeteer/Playwright code directly, which is brittle and hard to debug. Browser Forest offers two more direct integration paths: the AI Browser SDK, which exposes a high-level act/extract/observe API, and an MCP Server, which lets LLM tools such as Claude and Cursor invoke browser operations directly.
Approach A: AI Browser SDK
The SDK encapsulates the full logic of "operating a browser using natural language": it uses the LLM to translate natural language instructions into CDP (Chrome DevTools Protocol) operations, executes them, and returns the results. You only need a Browser Forest API Key and an LLM Provider (OpenAI, Anthropic, etc.).
Installation
npm install @browser-forest/ai-sdk
Initialization
import { AIBrowser, OpenAIProvider } from '@browser-forest/ai-sdk';
import OpenAI from 'openai';

const llm = new OpenAIProvider({
  client: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o', // or 'gpt-4o-mini' for lower cost
});

const browser = new AIBrowser({
  apiKey: process.env.BROWSER_FOREST_API_KEY, // bf_live_xxx
  llm,
});
act — Execute Natural Language Operations
act() has the LLM analyze the current page DOM, then generate and execute the corresponding CDP click, type, and navigation actions. It is ideal for interaction tasks like "click a button", "fill out a form", or "navigate to a page".
const session = await browser.session();
// Navigate
await session.act('navigate to https://news.ycombinator.com');
// Click
await session.act('click on the first story link');
// Fill a form and submit
await session.act('type "browser automation" in the search box and press Enter');
await session.close();
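Because each act() call depends on an LLM interpreting a live page, an individual step can fail intermittently. The sketch below wraps a step in a small retry loop. It assumes that act() rejects its promise when a step fails, which the text above does not confirm; `ActSession` and `actWithRetry` are hypothetical names for illustration and are not part of the SDK.

```typescript
// Minimal shape of a session for this sketch; only act() is needed.
interface ActSession {
  act(instruction: string): Promise<unknown>;
}

// Hypothetical helper (not part of the SDK): retry an act() call a few times,
// waiting a little longer after each failed attempt.
// Assumption: act() rejects when the step cannot be carried out.
async function actWithRetry(
  session: ActSession,
  instruction: string,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<unknown> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await session.act(instruction);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Linear backoff: baseDelayMs, then 2 * baseDelayMs, and so on.
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  // Every attempt failed: rethrow the most recent error.
  throw lastError;
}
```

With a real session this could be called as `await actWithRetry(session, 'click on the first story link')`.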
extract — Extract Structured Data
extract() has the LLM extract data from the current page DOM according to the structure you describe and returns a type-safe object. Pairing it with a Zod schema pins down the expected shape, so the returned data matches the fields and types you asked for.
import { z } from 'zod';

const session = await browser.session();
await session.act('navigate to https://news.ycombinator.com');

// Extract structured data
const stories = await session.extract({
  instruction: 'extract the top 10 stories: title, URL, points, and comment count',
  schema: z.array(z.object({
    title: z.string(),
    url: z.string().url(),
    points: z.number(),
    comments: z.number(),
  })),
});

// stories is typed as Array<{ title: string; url: string; points: number; comments: number }>
console.log(stories);
await session.close();
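Once extract() has returned typed rows, ordinary TypeScript can take over for post-processing. The sketch below removes rows that share a URL and keeps the highest-scoring stories. The `Story` interface mirrors the schema in the example above; `rankStories` is a hypothetical helper, and whether duplicates actually occur in practice is an assumption.

```typescript
// Mirrors the Zod schema used in the extract() example.
interface Story {
  title: string;
  url: string;
  points: number;
  comments: number;
}

// Hypothetical helper: drop rows with a repeated URL, then return the
// `limit` stories with the most points, highest first.
function rankStories(stories: Story[], limit: number): Story[] {
  const seen = new Set<string>();
  const unique: Story[] = [];
  for (const story of stories) {
    if (!seen.has(story.url)) {
      seen.add(story.url);
      unique.push(story);
    }
  }
  // `unique` is a fresh array, so sorting in place leaves the input untouched.
  return unique.sort((a, b) => b.points - a.points).slice(0, limit);
}
```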
observe — Observe Page State
observe() lets the LLM describe the current page content and list available interactive actions. Great for Agents that need to "look at what's on the page" before deciding the next step.
const session = await browser.session();
await session.act('navigate to https://example.com/checkout');

const state = await session.observe('what forms and buttons are on this page?');
console.log(state.description);
// "The page shows a checkout form with fields for shipping address,
//  payment method, and a 'Place Order' button."
console.log(state.actions);
// [
//   { action: 'type', element: '#shipping-name', description: 'Enter recipient name' },
//   { action: 'click', element: '#place-order-btn', description: 'Submit the order' },
// ]

// Decide next step based on observation
if (state.actions.some(a => a.element === '#place-order-btn')) {
  await session.act('fill in the shipping address and place the order');
}
await session.close();
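The "observe, then decide" step can be factored into a small pure function, which keeps the decision logic testable without a browser. The sketch below assumes the observation has the shape shown in the sample output above (`description` plus an `actions` array); `PageState`, `Goal`, and `planNextStep` are hypothetical names, not SDK exports.

```typescript
// Shapes inferred from the sample observe() output above (an assumption).
interface ObservedAction {
  action: string;
  element: string;
  description: string;
}

interface PageState {
  description: string;
  actions: ObservedAction[];
}

// A goal pairs a selector the Agent is waiting for with the instruction to run once it appears.
interface Goal {
  selector: string;
  instruction: string;
}

// Hypothetical helper: return the instruction of the first goal whose selector
// is present in the observed actions, or null if none of them is available yet.
function planNextStep(state: PageState, goals: Goal[]): string | null {
  const available = new Set(state.actions.map((a) => a.element));
  for (const goal of goals) {
    if (available.has(goal.selector)) {
      return goal.instruction;
    }
  }
  return null;
}
```

The returned string can be passed straight to `session.act(...)`; a null result signals that the Agent should navigate elsewhere or observe again.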
Full Agent Example: Competitor Price Monitoring
import { AIBrowser, OpenAIProvider } from '@browser-forest/ai-sdk';
import { z } from 'zod';
import OpenAI from 'openai';

const browser = new AIBrowser({
  apiKey: process.env.BROWSER_FOREST_API_KEY!,
  llm: new OpenAIProvider({
    client: new OpenAI({ apiKey: process.env.OPENAI_API_KEY! }),
    model: 'gpt-4o-mini',
  }),
});

async function monitorPrice(productName: string) {
  const session = await browser.session({ os: 'windows' });
  try {
    // Search for product
    await session.act('navigate to https://www.amazon.com');
    await session.act(`search for "${productName}" and press Enter`);
    await session.act('wait for search results to load');

    // Extract pricing info
    const results = await session.extract({
      instruction: 'extract the first 5 products: name, price, rating, review count',
      schema: z.array(z.object({
        name: z.string(),
        price: z.string(),
        rating: z.number().optional(),
        reviewCount: z.number().optional(),
      })),
    });
    return results;
  } finally {
    // Always release the session, even if a step above throws
    await session.close();
  }
}

const prices = await monitorPrice('wireless earbuds');
console.table(prices);
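The schema above keeps `price` as a string, so comparing prices needs a parsing step. The following sketch converts US-style price strings such as "$1,299.00" to numbers and picks the cheapest row. `parsePrice` and `cheapest` are hypothetical helpers; the parser assumes commas are thousands separators, so it would misread formats like "29,99 €".

```typescript
// Mirrors the row shape returned by monitorPrice() above.
interface ProductRow {
  name: string;
  price: string;
  rating?: number;
  reviewCount?: number;
}

// Hypothetical helper: pull the first number out of a US-style price string.
// Assumption: commas are thousands separators and "." is the decimal point.
function parsePrice(text: string): number | null {
  const match = text.replace(/,/g, '').match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// Hypothetical helper: return the row with the lowest parseable price,
// skipping rows whose price text contains no number.
function cheapest(rows: ProductRow[]): ProductRow | null {
  let best: ProductRow | null = null;
  let bestPrice = Infinity;
  for (const row of rows) {
    const value = parsePrice(row.price);
    if (value !== null && value < bestPrice) {
      best = row;
      bestPrice = value;
    }
  }
  return best;
}
```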
Approach B: MCP Server
Browser Forest includes a built-in MCP (Model Context Protocol) Server. MCP-compatible LLM tools (Claude Desktop, Cursor, Continue, etc.) can invoke browser operations directly, with no code required. The LLM itself decides when to create a Session, where to navigate, and what data to extract.
Configure in Claude Desktop
Open the Claude Desktop config file (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json) and add:
{
  "mcpServers": {
    "browser-forest": {
      "url": "https://bf.mktindex.com/api/mcp"
    }
  }
}
After restarting Claude Desktop, you can simply say in conversation:
"Open Hacker News and extract the top 5 article titles and links"
"Go to Amazon and search for AirPods Pro, tell me the current price"
"Open https://example.com/login, log in with [email protected] / pass123,
then go to Dashboard and take a screenshot"
Configure in Cursor
Open Cursor Settings → MCP → Add Server, and enter:
{
  "name": "browser-forest",
  "url": "https://bf.mktindex.com/api/mcp"
}
Available Tools
| Tool Name | Description | Key Parameters |
|---|---|---|
| mirage_create_session | Create a browser Session | apiKey, contextId, timeout |
| mirage_close_session | Close a Session | sessionId |
| mirage_navigate | Navigate to a URL | sessionId, url, waitUntil |
| mirage_screenshot | Take a screenshot (returns base64) | sessionId |
| mirage_click | Click an element | sessionId, selector |
| mirage_type | Type text into an element | sessionId, selector, text |
| mirage_extract_text | Extract page text content | sessionId, selector? |
| mirage_extract_structured | Extract structured data via natural language | sessionId, instruction |
| mirage_scroll | Scroll the page | sessionId, x, y |
| mirage_wait | Wait for specified milliseconds | sessionId, ms |
| mirage_scrape | One-shot URL scrape (no Session needed) | url, format |
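The session-based tools in the table compose into a create, use, close lifecycle. The sketch below expresses that sequence against a generic `callTool` function, which stands in for whatever MCP client invokes the tools. `CallTool` and `extractFromPage` are hypothetical, and the assumption that mirage_create_session returns an object with a `sessionId` field is inferred from the parameter names in the table, not confirmed by it.

```typescript
// Stand-in for an MCP client call: tool name plus arguments in, tool result out.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical helper: open a session, navigate, extract, and always close the session.
async function extractFromPage(
  callTool: CallTool,
  apiKey: string,
  url: string,
  instruction: string,
): Promise<unknown> {
  // Assumption: the create call resolves to an object carrying a sessionId.
  const created = (await callTool('mirage_create_session', { apiKey })) as { sessionId: string };
  const sessionId = created.sessionId;
  try {
    await callTool('mirage_navigate', { sessionId, url });
    return await callTool('mirage_extract_structured', { sessionId, instruction });
  } finally {
    // Runs on success and on failure, so the session is not left open.
    await callTool('mirage_close_session', { sessionId });
  }
}
```

Putting the close call in `finally` mirrors the try/finally pattern used in the SDK example, so a failed navigation does not leak a session.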
Integrate in LangChain (REST API)
In Python / LangChain Agents, call Browser Forest via the REST API by wrapping the scrape and session endpoints as LangChain Tools:
import requests
from langchain.tools import tool

API_KEY = "bf_live_xxxxxxxx"
BASE = "https://bf.mktindex.com/api/v1"
HEADERS = {"X-API-Key": API_KEY, "Content-Type": "application/json"}


@tool
def scrape_url(url: str) -> str:
    """Scrape a URL and return the page content as markdown.
    Use this when you need to read content from a web page."""
    res = requests.post(
        f"{BASE}/scrape",
        headers=HEADERS,
        json={"url": url, "format": "markdown", "waitFor": "networkidle"},
        timeout=30,
    )
    res.raise_for_status()  # surface HTTP errors instead of returning an empty string
    data = res.json()
    return data.get("content", "")


@tool
def scrape_and_extract(url: str, instruction: str) -> str:
    """Navigate to a URL and extract specific information.
    url: the page to visit
    instruction: what data to extract in plain English"""
    # Step 1: Create session
    session = requests.post(
        f"{BASE}/sessions",
        headers=HEADERS,
        json={"os": "windows", "timeout": 120},
        timeout=30,
    ).json()
    session_id = session["id"]
    try:
        # Step 2: Get page content via scrape API
        content = requests.post(
            f"{BASE}/scrape",
            headers=HEADERS,
            json={"url": url, "format": "markdown"},
            timeout=30,
        ).json().get("content", "")
        # Return the instruction with the content so the calling LLM can apply it
        return f"Instruction: {instruction}\n\nPage content from {url}:\n\n{content}"
    finally:
        # Step 3: Always clean up
        requests.delete(f"{BASE}/sessions/{session_id}", headers=HEADERS, timeout=30)


# Use in a LangChain Agent
from langchain.agents import create_openai_tools_agent, AgentExecutor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
tools = [scrape_url, scrape_and_extract]
# ... create agent and run
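For Agents written in TypeScript rather than Python, the same scrape request can be sketched without any framework. The endpoint, header name, and request fields below are copied from the Python example above. `FetchLike` and `scrapeUrl` are hypothetical names, and the fetch implementation is passed in as a parameter so the function can be exercised without network access.

```typescript
const BASE = 'https://bf.mktindex.com/api/v1';

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: POST to the scrape endpoint and return the page content as markdown.
async function scrapeUrl(fetchImpl: FetchLike, apiKey: string, url: string): Promise<string> {
  const res = await fetchImpl(`${BASE}/scrape`, {
    method: 'POST',
    headers: { 'X-API-Key': apiKey, 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, format: 'markdown', waitFor: 'networkidle' }),
  });
  if (!res.ok) {
    throw new Error(`scrape failed with HTTP ${res.status}`);
  }
  // As in the Python example, the content is read from the "content" field.
  const data = (await res.json()) as { content?: string };
  return data.content ?? '';
}
```

In Node.js 18 or later, the built-in global `fetch` can be supplied as `fetchImpl`.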
For the session-based MCP tools, call mirage_create_session to get a sessionId, pass it to subsequent calls, and finally call mirage_close_session to clean up.