Code Examples & Integration Guides

Start Converting URLs to Markdown in Under 60 Seconds

Copy-paste ready examples in Python, JavaScript, TypeScript, Go, PHP, and Ruby. Real-world use cases from RAG pipelines to news aggregation.

30+ code examples · 6 languages · 15+ use cases · 100% copy-paste ready
Quick Start

Get Started in 60 Seconds

Copy, paste, and run. It's that simple. No SDKs to install, no complex setup.

Basic Request

Convert any URL to markdown in one line

bash
curl 'https://api.neoreader.dev/https://example.com'

Extract Article Content

Target specific content with CSS selectors

bash
curl -H "X-Target-Selector: article" \
     -H "X-Remove-Selector: nav, footer, .ads" \
     'https://api.neoreader.dev/https://blog.example.com/post'

Handle JavaScript Apps

Wait for dynamic content to load

bash
curl -H "X-Wait-For-Selector: .content-loaded" \
     -H "X-Timeout: 30" \
     'https://api.neoreader.dev/https://spa-app.com'

First Time Using Neo Reader?

All examples work out of the box. To authenticate, add an X-API-Key header (curl: -H "X-API-Key: YOUR_API_KEY") and replace YOUR_API_KEY with your actual key from the dashboard.

What Makes Us Different

Features Others Don't Have

Built from the ground up to solve real problems that competitors ignore.

Two-Tier Speed Architecture

Lightning-fast HTTP fetching for static pages (~500ms) with intelligent fallback to full browser rendering for SPAs (~2s). 10-20x faster than competitors.

The Problem with Competitors:

Jina Reader: 7.9s average | Neo Reader: <500ms static, <2s SPAs

Automatic speed optimization

No configuration needed - speed optimization is automatic

# Neo Reader automatically chooses the fastest method
curl 'https://api.neoreader.dev/https://static-blog.com'
# ✓ Uses HTTP fetcher: ~100-500ms

curl 'https://api.neoreader.dev/https://react-app.com'
# ✓ Detects SPA, uses browser: ~2-5s
# Still 3-4x faster than Jina Reader's 7.9s average

Transparent Pricing (No Multipliers)

One request = one request. No hidden costs for JavaScript rendering, complex pages, or premium features. All competitors use credit multipliers.

The Problem with Competitors:

Firecrawl: 5-25x multipliers | Neo Reader: Fixed per-request pricing

Predictable costs

$0.60 per 1K requests on Scale plan - no surprises

# Complex SPA with JavaScript
curl 'https://api.neoreader.dev/https://heavy-spa.com'
# Cost: 1 request

# With Firecrawl:
# Base cost: 1 credit
# JavaScript multiplier: 5x
# Premium proxy: 10x
# Total: 50 credits per request

# With Neo Reader:
# Always 1 request, regardless of complexity

Automatic Framework Detection

Zero-configuration detection and rendering of React, Vue, Next.js, Nuxt, Angular, and Svelte. Competitors require manual wait selectors.

The Problem with Competitors:

Competitors: Manual configuration | Neo Reader: Automatic detection

No configuration required

Detects _next/, _nuxt/, React root, Vue app, Angular modules automatically

# React app - works automatically
curl 'https://api.neoreader.dev/https://react-app.com'

# Vue app - works automatically
curl 'https://api.neoreader.dev/https://vue-app.com'

# Next.js app - works automatically
curl 'https://api.neoreader.dev/https://nextjs-app.com'

# No need to specify:
# - X-Wait-For-Selector
# - Framework-specific settings
# - Custom timeout logic

Production-Ready Self-Hosting

Apache 2.0 license with battle-tested Docker configs. Firecrawl uses AGPL (requires open-sourcing forks) and has unstable self-hosting.

The Problem with Competitors:

Firecrawl: AGPL + unstable | Neo Reader: Apache 2.0 + production-ready

Deploy anywhere in minutes

Docker, Kubernetes configs included. No licensing headaches.

# Clone and run
git clone https://github.com/neoreader/neo-reader-api
cd neo-reader-api

# Production deployment
docker-compose --profile prod up -d

# Or use Kubernetes
kubectl apply -f k8s/

# Full source access, no AGPL restrictions
# Modify and deploy privately

See the Difference Yourself

Sign up for our free tier and experience the speed and simplicity.
No credit card required to get started.

Real-World Use Cases

Production-Ready Examples

Battle-tested code from real applications. Copy, customize, and deploy.

RAG Pipeline Integration

Index web content for AI chatbots and knowledge bases. Perfect for documentation scraping, Q&A systems, and semantic search.

Tags: AI · RAG · Vector DB · LangChain

Feed content directly to vector database

Python
import requests

def fetch_for_rag(url: str) -> dict:
    """Fetch clean markdown optimized for RAG pipelines"""
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Target-Selector": "article, main",
            "X-Remove-Selector": "nav, footer, .ads"
        }
    )

    if response.status_code == 200:
        return {
            "content": response.text,
            "source": url,
            "tokens": len(response.text.split())
        }

    raise Exception(f"Error: {response.status_code}")

# Feed to vector database
content = fetch_for_rag("https://docs.example.com/guide")
vector_db.upsert(
    id=content["source"],
    text=content["content"],
    metadata={"source": content["source"]}
)
Key Features
  • Clean markdown output
  • CSS selector targeting
  • Remove noise automatically
  • Token counting included
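In the example above, vector_db is a placeholder for your own vector store client. Long pages are usually split into overlapping chunks before embedding and upserting; a minimal word-based chunker sketch (chunk_text is an illustrative helper, not part of the Neo Reader API, and whitespace tokens only approximate model tokens):

```python
def chunk_text(text: str, max_tokens: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping word-count chunks for embedding."""
    words = text.split()
    chunks = []
    # Guard against overlap >= max_tokens, which would make the step non-positive
    step = max(1, max_tokens - overlap)
    for start in range(0, len(words), step):
        chunk = words[start:start + max_tokens]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + max_tokens >= len(words):
            break
    return chunks
```

Each chunk can then be upserted with the same source metadata, so retrieval results can still be traced back to the original URL.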

News Article Extraction

Scrape news articles at scale with automatic SPA detection. Extract clean content from thousands of sources.

Tags: News · Content · Monitoring · Scale

High-volume news scraping with asyncio

Python
import asyncio
import aiohttp
from datetime import datetime

async def scrape_news_articles(urls: list[str]) -> list[dict]:
    """Scrape multiple news articles concurrently"""
    async with aiohttp.ClientSession() as session:
        tasks = []

        for url in urls:
            headers = {
                "X-API-Key": "YOUR_API_KEY",
                "X-Respond-With": "markdown",
                "X-Target-Selector": "article, [itemtype*='Article']",
                "X-Remove-Selector": ".ad, .social-share, .comments",
                "X-With-Images-Summary": "true"
            }

            task = session.get(
                f"https://api.neoreader.dev/{url}",
                headers=headers
            )
            tasks.append(task)

        responses = await asyncio.gather(*tasks, return_exceptions=True)

        articles = []
        for url, response in zip(urls, responses):
            if isinstance(response, Exception):
                continue

            content = await response.text()
            articles.append({
                "url": url,
                "content": content,
                "timestamp": datetime.now().isoformat()
            })

        return articles

# Scrape 10,000 articles daily
articles = await scrape_news_articles(news_urls)
Key Features
  • Concurrent scraping
  • Automatic SPA detection
  • Clean article extraction
  • Built-in error handling
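At the 10,000-articles-a-day scale mentioned above, firing every request at once can exhaust sockets or trip rate limits. A minimal sketch of bounding concurrency with asyncio.Semaphore; here fetch stands in for a single-URL coroutine like the session.get call in the example, and scrape_bounded is an illustrative name, not part of the API:

```python
import asyncio

async def scrape_bounded(urls, fetch, max_concurrency=20):
    """Run one fetch per URL while capping how many are in flight at once."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with sem:
            return await fetch(url)

    # Results come back in input order; failures surface as exception objects
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)
```

Tune max_concurrency to your plan's rate limit; the gather call keeps the per-URL error handling shown in the example above.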

Single Page Application Scraping

Automatically detect and render React, Vue, Next.js, Nuxt, Angular, and Svelte apps. Zero configuration required.

Tags: SPA · React · Vue · JavaScript

Scrape modern JavaScript frameworks

Python
import requests

def scrape_spa(url: str) -> str:
    """
    Automatically detects SPAs and renders them.
    Supports: React, Vue, Next.js, Nuxt, Angular, Svelte
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            # No X-Wait-For-Selector needed!
            # Automatic framework detection handles it
        }
    )

    return response.text

# Works automatically for any framework
react_content = scrape_spa("https://react-app.com")
vue_content = scrape_spa("https://vue-app.com")
next_content = scrape_spa("https://nextjs-app.com")
Key Features
  • Zero configuration
  • All frameworks supported
  • Automatic JS rendering
  • 10x faster than competitors

LangChain Document Loader

Build custom document loaders for LangChain applications. Perfect for MVPs and production RAG systems.

Tags: LangChain · LlamaIndex · AI · Integration

Custom LangChain document loader

Python
from langchain.document_loaders.base import BaseLoader
from langchain.schema import Document
import requests
from typing import List

class NeoReaderLoader(BaseLoader):
    """LangChain document loader using Neo Reader API"""

    def __init__(
        self,
        urls: List[str],
        api_key: str,
        target_selector: str = "article, main"
    ):
        self.urls = urls
        self.api_key = api_key
        self.target_selector = target_selector

    def load(self) -> List[Document]:
        """Load documents from URLs"""
        documents = []

        for url in self.urls:
            response = requests.get(
                f"https://api.neoreader.dev/{url}",
                headers={
                    "X-API-Key": self.api_key,
                    "X-Respond-With": "markdown",
                    "X-Target-Selector": self.target_selector,
                    "X-Remove-Selector": "nav, footer, .ads"
                }
            )

            if response.status_code == 200:
                documents.append(Document(
                    page_content=response.text,
                    metadata={
                        "source": url,
                        "loader": "neo-reader"
                    }
                ))

        return documents

# Usage with LangChain (assumes a FAISS import and an embeddings model are in scope)
loader = NeoReaderLoader(
    urls=["https://docs.example.com/guide"],
    api_key="YOUR_API_KEY"
)

documents = loader.load()
vector_store = FAISS.from_documents(documents, embeddings)
Key Features
  • LangChain native integration
  • Metadata preservation
  • Clean content extraction
  • Production ready

Screenshot & Visual Testing

Capture viewport or full-page screenshots for visual regression testing, archival, or documentation.

Tags: Screenshots · Testing · Visual · QA

Automated screenshot capture

Python
import requests

def capture_screenshot(url: str, full_page: bool = False) -> str:
    """
    Capture screenshot of any webpage
    Returns: URL to screenshot PNG
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "pageshot" if full_page else "screenshot"
        },
        allow_redirects=False  # Get redirect URL
    )

    # Returns 302 redirect to screenshot
    screenshot_url = response.headers.get("Location")
    return screenshot_url

# Capture viewport (1024x1024)
viewport_url = capture_screenshot("https://example.com")

# Capture full page
fullpage_url = capture_screenshot("https://example.com", full_page=True)

# Download screenshot
screenshot_response = requests.get(viewport_url)
with open("screenshot.png", "wb") as f:
    f.write(screenshot_response.content)
Key Features
  • Viewport & full-page
  • 1024x1024 standard size
  • Auto-deleted after 1 hour
  • Perfect for visual testing

Authenticated Content Scraping

Access content behind login walls using cookie injection. Perfect for private dashboards and paywalled content.

Tags: Authentication · Cookies · Private · Security

Scrape authenticated content

Python
import requests

def scrape_authenticated(url: str, session_cookie: str) -> str:
    """
    Scrape content behind authentication
    Useful for private dashboards, paywalled content
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Set-Cookie": f"session={session_cookie}; user_authenticated=true",
            "X-Target-Selector": ".premium-content"
        }
    )

    return response.text

# Example: Scrape subscriber-only content
cookie = "your_session_cookie_here"
premium_content = scrape_authenticated(
    "https://premium.example.com/article",
    cookie
)

# Works with complex cookie strings
multi_cookie = "session=abc123; _ga=GA1.2.123; pref=dark"
dashboard_content = scrape_authenticated(
    "https://app.example.com/dashboard",
    multi_cookie
)
Key Features
  • Cookie injection support
  • Multiple cookies allowed
  • Paywalls supported
  • Session cookies work
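As the examples show, the X-Set-Cookie value is a plain "name=value; name=value" string. If you hold cookies as a dict (for instance, exported from a browser session), a small helper can serialize them; cookie_header is a hypothetical name used only for this sketch:

```python
def cookie_header(cookies: dict[str, str]) -> str:
    """Serialize a cookie dict into the 'name=value; name=value' string format."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())
```

For example, cookie_header({"session": "abc123", "pref": "dark"}) produces "session=abc123; pref=dark", ready to pass as the X-Set-Cookie header value.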

Proxy Rotation & Geo-Targeting

Route requests through proxies for geo-restricted content, IP rotation, or privacy. Supports HTTP, HTTPS, and SOCKS5.

Tags: Proxy · Geo-Restriction · Privacy · Scale

Scrape geo-restricted content

Python
import requests

def scrape_with_proxy(url: str, proxy_url: str) -> str:
    """
    Scrape content through proxy
    Supports: HTTP, HTTPS, SOCKS5
    """
    response = requests.get(
        f"https://api.neoreader.dev/{url}",
        headers={
            "X-API-Key": "YOUR_API_KEY",
            "X-Respond-With": "markdown",
            "X-Proxy-Url": proxy_url,
            "X-User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
        }
    )

    return response.text

# HTTP proxy
content = scrape_with_proxy(
    "https://us-only-site.com",
    "http://us-proxy.example.com:8080"
)

# SOCKS5 proxy with authentication
content = scrape_with_proxy(
    "https://geo-restricted.com",
    "socks5://user:pass@proxy.example.com:1080"
)

# Rotate proxies for high-volume scraping
proxies = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080"
]

for i, url in enumerate(urls):  # urls: your list of target URLs
    proxy = proxies[i % len(proxies)]
    content = scrape_with_proxy(url, proxy)
Key Features
  • HTTP/HTTPS/SOCKS5 support
  • Authentication included
  • Geo-restriction bypass
  • IP rotation ready

Ready to Build?

Start with 500 free requests per month. No credit card required.