Code Examples & Integration Guides

Start Converting URLs to Markdown in Under 60 Seconds

Copy-paste ready examples in Python, JavaScript, TypeScript, Go, PHP, and Ruby. Real-world use cases from RAG pipelines to news aggregation.

30+ code examples · 6 languages · 15+ use cases · 100% copy-paste ready
Quick Start

Get Started in 60 Seconds

Copy, paste, and run. It's that simple. No SDKs to install, no complex setup.

Basic Request

Convert any URL to markdown in one line

bash
curl 'https://api.neoreader.dev/https://example.com'

Extract Article Content

Target specific content with CSS selectors

bash
curl -H "X-Target-Selector: article" \
     -H "X-Remove-Selector: nav, footer, .ads" \
     'https://api.neoreader.dev/https://blog.example.com/post'

Handle JavaScript Apps

Wait for dynamic content to load

bash
curl -H "X-Wait-For-Selector: .content-loaded" \
     -H "X-Timeout: 30" \
     'https://api.neoreader.dev/https://spa-app.com'

First Time Using Neo Reader?

All examples work out of the box. To authenticate, add an X-API-Key header (curl: -H "X-API-Key: YOUR_API_KEY") and replace YOUR_API_KEY with your actual key from the dashboard.

What Makes Us Different

Features Others Don't Have

Built from the ground up to solve real problems that competitors ignore.

Two-Tier Speed Architecture

Lightning-fast HTTP fetching for static pages (~500ms) with intelligent fallback to full browser rendering for SPAs (~2s). 10-20x faster than competitors.

The Problem with Competitors:

Jina Reader: 7.9s average | Neo Reader: <500ms static, <2s SPAs

Automatic speed optimization

No configuration needed - speed optimization is automatic

# Neo Reader automatically chooses the fastest method
curl 'https://api.neoreader.dev/https://static-blog.com'
# ✓ Uses HTTP fetcher: ~100-500ms

curl 'https://api.neoreader.dev/https://react-app.com'
# ✓ Detects SPA, uses browser: ~2-5s
# Still 3-4x faster than Jina Reader's 7.9s average

Transparent Pricing (No Multipliers)

One request = one request. No hidden costs for JavaScript rendering, complex pages, or premium features. All competitors use credit multipliers.

The Problem with Competitors:

Firecrawl: 5-25x multipliers | Neo Reader: Fixed per-request pricing

Predictable costs

$0.60 per 1K requests on Scale plan - no surprises

# Complex SPA with JavaScript
curl 'https://api.neoreader.dev/https://heavy-spa.com'
# Cost: 1 request

# With Firecrawl:
# Base cost: 1 credit
# JavaScript multiplier: 5x
# Premium proxy: 10x
# Total: 50 credits per request

# With Neo Reader:
# Always 1 request, regardless of complexity

Automatic Framework Detection

Zero-configuration detection and rendering of React, Vue, Next.js, Nuxt, Angular, and Svelte. Competitors require manual wait selectors.

The Problem with Competitors:

Competitors: Manual configuration | Neo Reader: Automatic detection

No configuration required

Detects _next/, _nuxt/, React root, Vue app, Angular modules automatically

# React app - works automatically
curl 'https://api.neoreader.dev/https://react-app.com'

# Vue app - works automatically
curl 'https://api.neoreader.dev/https://vue-app.com'

# Next.js app - works automatically
curl 'https://api.neoreader.dev/https://nextjs-app.com'

# No need to specify:
# - X-Wait-For-Selector
# - Framework-specific settings
# - Custom timeout logic

Production-Ready Self-Hosting

Apache 2.0 license with battle-tested Docker configs. Firecrawl uses AGPL (requires open-sourcing forks) and has unstable self-hosting.

The Problem with Competitors:

Firecrawl: AGPL + unstable | Neo Reader: Apache 2.0 + production-ready

Deploy anywhere in minutes

Docker, Kubernetes configs included. No licensing headaches.

# Clone and run
git clone https://github.com/neoreader/neo-reader-api
cd neo-reader-api

# Production deployment
docker-compose --profile prod up -d

# Or use Kubernetes
kubectl apply -f k8s/

# Full source access, no AGPL restrictions
# Modify and deploy privately

See the Difference Yourself

Sign up for our free tier and experience the speed and simplicity.
No credit card required to get started.

Real-World Use Cases

Production-Ready Examples

Battle-tested code from real applications. Copy, customize, and deploy.

RAG Pipeline Integration

Index web content for AI chatbots and knowledge bases. Perfect for documentation scraping, Q&A systems, and semantic search.

Tags: AI · RAG · Vector DB · LangChain

Feed content directly to vector database

Python
import requests

def fetch_for_rag(url: str) -> dict:
    """Fetch clean markdown optimized for RAG pipelines"""
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Target-Selector": "article, main",
            "X-Remove-Selector": "nav, footer, .ads"
        }
    )

    if response.status_code == 200:
        return {
            "content": response.text,
            "source": url,
            "tokens": len(response.text.split())
        }

    raise Exception(f"Error: {response.status_code}")

# Feed to vector database
content = fetch_for_rag("https://docs.example.com/guide")
vector_db.upsert(
    id=content["source"],
    text=content["content"],
    metadata={"source": content["source"]}
)
Key Features
  • Clean markdown output
  • CSS selector targeting
  • Remove noise automatically
  • Token counting included
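In the example above, vector_db is a placeholder for your own vector store client. Long pages are usually split into overlapping chunks before embedding and upserting; a minimal word-based chunker sketch (chunk_text is an illustrative helper, not part of the Neo Reader API, and whitespace tokens only approximate model tokens):

```python
def chunk_text(text: str, max_tokens: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-count chunks for embedding."""
    words = text.split()
    chunks = []
    # Guard against overlap >= max_tokens, which would make the step non-positive
    step = max(1, max_tokens - overlap)
    for start in range(0, len(words), step):
        chunk = words[start:start + max_tokens]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + max_tokens >= len(words):
            break
    return chunks
```

Each chunk can then be upserted with the same source metadata, so retrieval results can still be traced back to the original URL.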

News Article Extraction

Scrape news articles at scale with automatic SPA detection. Extract clean content from thousands of sources.

Tags: News · Content · Monitoring · Scale

High-volume news scraping with asyncio

Python
import asyncio
import aiohttp
from datetime import datetime

async def scrape_news_articles(urls: list[str]) -> list[dict]:
    """Scrape multiple news articles concurrently"""
    async with aiohttp.ClientSession() as session:
        tasks = []

        for url in urls:
            headers = {
                "X-API-Key": "YOUR_API_KEY",
                "X-Respond-With": "markdown",
                "X-Target-Selector": "article, [itemtype*='Article']",
                "X-Remove-Selector": ".ad, .social-share, .comments",
                "X-With-Images-Summary": "true"
            }

            task = session.get(
                f"https://api.neoreader.dev/{url}",
                headers=headers
            )
            tasks.append(task)

        responses = await asyncio.gather(*tasks, return_exceptions=True)

        articles = []
        for url, response in zip(urls, responses):
            if isinstance(response, Exception):
                continue

            content = await response.text()
            articles.append({
                "url": url,
                "content": content,
                "timestamp": datetime.now().isoformat()
            })

        return articles

# Scrape 10,000 articles daily
articles = await scrape_news_articles(news_urls)
Key Features
  • Concurrent scraping
  • Automatic SPA detection
  • Clean article extraction
  • Built-in error handling
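At the 10,000-articles-a-day scale mentioned above, firing every request at once can exhaust sockets or trip rate limits. A minimal sketch of bounding concurrency with asyncio.Semaphore; here fetch stands in for a single-URL coroutine like the session.get call in the example, and scrape_bounded is an illustrative name, not part of the API:

```python
import asyncio

async def scrape_bounded(urls, fetch, max_concurrency=20):
    """Run one fetch per URL while capping how many are in flight at once."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with sem:
            return await fetch(url)

    # Results come back in input order; failures surface as exception objects
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)
```

Tune max_concurrency to your plan's rate limit; the gather call keeps the per-URL error handling shown in the example above.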

Single Page Application Scraping

Automatically detect and render React, Vue, Next.js, Nuxt, Angular, and Svelte apps. Zero configuration required.

Tags: SPA · React · Vue · JavaScript

Scrape modern JavaScript frameworks

Python
import requests

def scrape_spa(url: str) -> str:
    """
    Automatically detects SPAs and renders them.
    Supports: React, Vue, Next.js, Nuxt, Angular, Svelte
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            # No X-Wait-For-Selector needed!
            # Automatic framework detection handles it
        }
    )

    return response.text

# Works automatically for any framework
react_content = scrape_spa("https://react-app.com")
vue_content = scrape_spa("https://vue-app.com")
next_content = scrape_spa("https://nextjs-app.com")
Key Features
  • Zero configuration
  • All frameworks supported
  • Automatic JS rendering
  • 10x faster than competitors

LangChain Document Loader

Build custom document loaders for LangChain applications. Perfect for MVPs and production RAG systems.

Tags: LangChain · LlamaIndex · AI · Integration

Custom LangChain document loader

Python
from langchain.document_loaders.base import BaseLoader
from langchain.schema import Document
import requests
from typing import List

class NeoReaderLoader(BaseLoader):
    """LangChain document loader using Neo Reader API"""

    def __init__(
        self,
        urls: List[str],
        api_key: str,
        target_selector: str = "article, main"
    ):
        self.urls = urls
        self.api_key = api_key
        self.target_selector = target_selector

    def load(self) -> List[Document]:
        """Load documents from URLs"""
        documents = []

        for url in self.urls:
            response = requests.get(
                f"https://api.neoreader.dev/{url}",
                headers={
                    "X-API-Key": self.api_key,
                    "X-Respond-With": "markdown",
                    "X-Target-Selector": self.target_selector,
                    "X-Remove-Selector": "nav, footer, .ads"
                }
            )

            if response.status_code == 200:
                documents.append(Document(
                    page_content=response.text,
                    metadata={
                        "source": url,
                        "loader": "neo-reader"
                    }
                ))

        return documents

# Usage with LangChain (assumes a FAISS import and an embeddings model are in scope)
loader = NeoReaderLoader(
    urls=["https://docs.example.com/guide"],
    api_key="YOUR_API_KEY"
)

documents = loader.load()
vector_store = FAISS.from_documents(documents, embeddings)
Key Features
  • LangChain native integration
  • Metadata preservation
  • Clean content extraction
  • Production ready

Screenshot & Visual Testing

Capture viewport or full-page screenshots for visual regression testing, archival, or documentation.

Tags: Screenshots · Testing · Visual · QA

Automated screenshot capture

Python
import requests

def capture_screenshot(url: str, full_page: bool = False) -> str:
    """
    Capture screenshot of any webpage
    Returns: URL to screenshot PNG
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "pageshot" if full_page else "screenshot"
        },
        allow_redirects=False  # Get redirect URL
    )

    # Returns 302 redirect to screenshot
    screenshot_url = response.headers.get("Location")
    return screenshot_url

# Capture viewport (1024x1024)
viewport_url = capture_screenshot("https://example.com")

# Capture full page
fullpage_url = capture_screenshot("https://example.com", full_page=True)

# Download screenshot
screenshot_response = requests.get(viewport_url)
with open("screenshot.png", "wb") as f:
    f.write(screenshot_response.content)
Key Features
  • Viewport & full-page
  • 1024x1024 standard size
  • Auto-deleted after 1 hour
  • Perfect for visual testing

Authenticated Content Scraping

Access content behind login walls using cookie injection. Perfect for private dashboards and paywalled content.

Tags: Authentication · Cookies · Private · Security

Scrape authenticated content

Python
import requests

def scrape_authenticated(url: str, session_cookie: str) -> str:
    """
    Scrape content behind authentication
    Useful for private dashboards, paywalled content
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Set-Cookie": f"session={session_cookie}; user_authenticated=true",
            "X-Target-Selector": ".premium-content"
        }
    )

    return response.text

# Example: Scrape subscriber-only content
cookie = "your_session_cookie_here"
premium_content = scrape_authenticated(
    "https://premium.example.com/article",
    cookie
)

# Works with complex cookie strings
multi_cookie = "session=abc123; _ga=GA1.2.123; pref=dark"
dashboard_content = scrape_authenticated(
    "https://app.example.com/dashboard",
    multi_cookie
)
Key Features
  • Cookie injection support
  • Multiple cookies allowed
  • Paywalls supported
  • Session cookies work
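As the examples show, the X-Set-Cookie value is a plain "name=value; name=value" string. If you hold cookies as a dict (for instance, exported from a browser session), a small helper can serialize them; cookie_header is a hypothetical name used only for this sketch:

```python
def cookie_header(cookies: dict[str, str]) -> str:
    """Serialize a cookie dict into the 'name=value; name=value' string format."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```

For example, cookie_header({"session": "abc123", "pref": "dark"}) produces "session=abc123; pref=dark", ready to pass as the X-Set-Cookie header value.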

Proxy Rotation & Geo-Targeting

Route requests through proxies for geo-restricted content, IP rotation, or privacy. Supports HTTP, HTTPS, and SOCKS5.

Tags: Proxy · Geo-Restriction · Privacy · Scale

Scrape geo-restricted content

Python
import requests

def scrape_with_proxy(url: str, proxy_url: str) -> str:
    """
    Scrape content through proxy
    Supports: HTTP, HTTPS, SOCKS5
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Proxy-Url": proxy_url,
            "X-User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
        }
    )

    return response.text

# HTTP proxy
content = scrape_with_proxy(
    "https://us-only-site.com",
    "http://us-proxy.example.com:8080"
)

# SOCKS5 proxy with authentication
content = scrape_with_proxy(
    "https://geo-restricted.com",
    "socks5://user:pass@proxy.example.com:1080"
)

# Rotate proxies for high-volume scraping
proxies = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080"
]

for i, url in enumerate(urls):  # urls: your list of target URLs
    proxy = proxies[i % len(proxies)]
    content = scrape_with_proxy(url, proxy)
Key Features
  • HTTP/HTTPS/SOCKS5 support
  • Authentication included
  • Geo-restriction bypass
  • IP rotation ready

Ready to Build?

Start with 500 free requests per month. No credit card required.