How to Use OkraPDF with ChatGPT & Claude

Talk to your documents the way DeepWiki lets you talk to code.

Two Ways to Work with Your Documents

| Method | Best For | Setup Time |
|--------|----------|------------|
| MCP Connection (Claude Desktop) | Natural conversation with your doc library | 2 minutes |
| Preprocessing + Upload (ChatGPT/Claude) | One-off analysis, maximum control | 5 minutes |

Method 1: Direct Connection via MCP (Recommended)

Just like DeepWiki lets you ask questions about any GitHub repo, OkraPDF lets you ask questions about any PDF you've uploaded—directly from Claude Desktop.

How It Works

┌─────────────────────────────────────────────────────────────────┐
│  Claude Desktop                                                  │
│                                                                  │
│  You: "What was Q3 revenue across my quarterly reports?"        │
│                                                                  │
│  Claude: → [MCP] list_documents()                               │
│          → [MCP] ask_question("Q3-Report.pdf", "total revenue") │
│          → "Based on your Q3 report, revenue was $4.2B..."      │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ OAuth (one-time login)
┌─────────────────────────────────────────────────────────────────┐
│  okrapdf.com/mcp                                                 │
│                                                                  │
│  Your uploaded PDFs → Pre-extracted tables & text → Fast answers│
└─────────────────────────────────────────────────────────────────┘

Setup (2 minutes)

Open Claude Desktop settings
Add this MCP server:

{
  "mcpServers": {
    "okrapdf": {
      "url": "https://okrapdf.com/mcp",
      "oauth": {
        "client_id": "auto",
        "scopes": ["documents:read"]
      }
    }
  }
}

Claude will prompt you to log in (Google/GitHub SSO)
Start asking questions about your documents
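If you already have other MCP servers configured, you will want to add the OkraPDF entry without overwriting them. Here is a minimal sketch of that merge, assuming your settings are a JSON object with an `mcpServers` key as in the config above. The `add_okrapdf_server` helper and the `other-server` entry are made up for illustration, and where the settings file lives on disk is not covered here.

```python
import json

# Server entry copied from the config shown above.
OKRAPDF_SERVER = {
    "url": "https://okrapdf.com/mcp",
    "oauth": {"client_id": "auto", "scopes": ["documents:read"]},
}


def add_okrapdf_server(config: dict) -> dict:
    """Return a copy of `config` with the okrapdf entry added under mcpServers."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    merged.setdefault("mcpServers", {})["okrapdf"] = dict(OKRAPDF_SERVER)
    return merged


# Hypothetical existing settings with one unrelated server already configured.
existing = {"mcpServers": {"other-server": {"url": "https://example.com/mcp"}}}
updated = add_okrapdf_server(existing)
print(json.dumps(updated, indent=2))
```

The helper returns a new dict instead of editing in place, so you can inspect the result before saving it.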

Example Conversation

You: "What are the revenue figures across my uploaded quarterly reports?"

Claude: I'll search your documents and extract the revenue data.

  → [Calling list_documents...]
  → Found: Q1-Report.pdf, Q2-Report.pdf, Q3-Report.pdf

  → [Calling ask_question on each...]

Based on your quarterly reports:
  • Q1: $3.2B
  • Q2: $3.8B
  • Q3: $4.2B

Revenue grew 31% across the three quarters.
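To make the flow concrete, here is a rough sketch of what Claude is doing in that conversation: list the documents, ask each one the same question, then do the arithmetic. The two functions below are local stand-ins, not the real MCP tools, and their signatures and return types are assumptions. The figures are the ones from the example above.

```python
# Stand-in data: the revenue figures (in $B) from the example conversation.
_EXAMPLE_ANSWERS = {"Q1-Report.pdf": 3.2, "Q2-Report.pdf": 3.8, "Q3-Report.pdf": 4.2}


def list_documents() -> list[str]:
    """Stand-in for the list_documents tool (assumed to return file names)."""
    return sorted(_EXAMPLE_ANSWERS)


def ask_question(document: str, question: str) -> float:
    """Stand-in for the ask_question tool (assumed here to return a number)."""
    return _EXAMPLE_ANSWERS[document]


# Ask every document the same question, then compare first and last quarter.
revenue = {doc: ask_question(doc, "total revenue") for doc in list_documents()}
first, last = revenue["Q1-Report.pdf"], revenue["Q3-Report.pdf"]
growth_pct = (last - first) / first * 100
print(f"Growth Q1 -> Q3: {growth_pct:.0f}%")  # Growth Q1 -> Q3: 31%
```

The 31% in the answer is simply (4.2 − 3.2) / 3.2, rounded.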

Why This Works

Unlike uploading files every time, MCP uses pre-extracted context:

Tables already parsed into structured data
OCR already run on scanned documents
Answers come back in seconds, not minutes

Method 2: Preprocessing + Upload

For ChatGPT users, or when you need maximum control over what the AI sees.

Why Preprocess PDFs?

When you upload a PDF directly to ChatGPT, it relies on basic text extraction, which means it can skip lines or misread tables. When you preprocess with OkraPDF, ChatGPT receives clean Markdown files that preserve the exact data structures, with table accuracy of roughly 95% or better.

What Gets Lost Without Preprocessing

| Document Element | What ChatGPT Sees | The Problem |
|------------------|-------------------|-------------|
| Multi-row tables | Fragmented text chunks | Rows split across chunks, losing context |
| Nested tables | Jumbled text | Structure completely destroyed |
| Footnotes & references | Disconnected text | Numbers without their explanations |
| Scanned documents | Nothing or garbage | Basic OCR fails on complex layouts |

The Workflow

┌─────────────────┐
│   Your PDF      │
└────────┬────────┘
         │
         ▼
┌──────────────────────────────────────┐
│           OkraPDF                    │
│  Visual AI → Structure Detection    │
│  → Page-by-Page Markdown Files      │
└──────────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│  Download ZIP   │  ← Individual .md files per page
│  (pages.zip)    │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌────────┐ ┌────────┐
│ChatGPT │ │ Claude │
│Projects│ │        │
└────────┘ └────────┘

Step 1: Process Your PDF in OkraPDF

Go to okrapdf.com
Upload your PDF (drag & drop or paste URL)
Wait for processing to complete (green status bar)
Review extraction in the split-view

What OkraPDF Preserves

Table structure with row/column relationships
Headers and footers in context
Multi-page table continuations
Footnote references

Step 2: Download the ZIP File

Once processing completes, click the Export button and select Download ZIP.

You'll get a pages.zip file containing:

pages.zip
├── page_001.md
├── page_002.md
├── page_003.md
├── ...
└── manifest.json
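If you want to work with the export locally before handing it to an AI, a few lines of Python are enough to read it. This is a sketch under assumptions: it relies only on the `page_NNN.md` naming shown above and leaves `manifest.json` alone, since its schema is not described here. The archive is built in memory as a stand-in so the example runs without a real export.

```python
import io
import zipfile


def read_pages(zip_bytes: bytes) -> dict[str, str]:
    """Map each .md file name to its Markdown text, sorted by page name."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = sorted(n for n in archive.namelist() if n.endswith(".md"))
        return {name: archive.read(name).decode("utf-8") for name in names}


# Stand-in archive with placeholder contents (not a real OkraPDF export).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("page_002.md", "Second page")
    archive.writestr("page_001.md", "First page")
    archive.writestr("manifest.json", "{}")

pages = read_pages(buffer.getvalue())
print(list(pages))  # ['page_001.md', 'page_002.md']
```

For a real export, read the bytes from disk first, for example `read_pages(open("pages.zip", "rb").read())`.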

Each page is a clean Markdown file with tables properly formatted:

| Segment | 2024 | 2023 |
|---------|------|------|
| Gaming  | 12.5B | 10.2B |
| Data Center | 47.5B | 15.0B |
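Because the tables are plain Markdown, they are also easy to load into your own scripts. Here is a minimal sketch that turns the table above into a list of rows. `parse_markdown_table` is an illustrative helper, not part of OkraPDF, and it assumes a simple pipe table with one header row and one separator row.

```python
def split_row(line: str) -> list[str]:
    """Split one pipe-delimited row into trimmed cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Turn a simple pipe table into a list of {header: cell} rows."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    headers = split_row(lines[0])
    # lines[1] is the |---|---| separator row, so data starts at index 2.
    return [dict(zip(headers, split_row(line))) for line in lines[2:]]


table = """
| Segment | 2024 | 2023 |
|---------|------|------|
| Gaming  | 12.5B | 10.2B |
| Data Center | 47.5B | 15.0B |
"""

rows = parse_markdown_table(table)
print(rows[1]["2024"])  # 47.5B
```

Each row keeps its column headers, so a value like `47.5B` stays attached to both "Data Center" and "2024".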

Step 3: Use with ChatGPT

Choose the method that fits your workflow:

Option A: ChatGPT Projects (Best for Recurring Work)

Use this if you need to ask questions about these reports regularly (e.g., "Compare Q1 and Q4"). It creates a persistent library for your files.

Requirements: ChatGPT Plus, Team, or Enterprise

Steps:

Open the sidebar in the ChatGPT desktop app
Click Projects → New Project
Name it (e.g., "Financial Reports 2024")
In Project settings, find Knowledge or Files
Click Add source → Upload files
Upload your pages.zip file directly (ChatGPT reads inside zips)

Example prompts:

"Search the Q3 files for the Adjusted EBITDA table and give me the rows where the value exceeds $5M."

"Compare revenue trends between page_010.md and page_045.md"

Option B: Zip & Script (Best for Precision)

Use this when you need exhaustive, verifiable results (e.g., "Find every single mention of 'churn' across 500 pages"). It has ChatGPT run Python that searches the actual text of every file.

Requirements: ChatGPT with Advanced Data Analysis (Plus/Team/Enterprise)

Steps:

Open a new ChatGPT chat
Drag the pages.zip file into the message bar
Use this exact prompt:

I have uploaded a zip file containing page-by-page markdown reports.
I need you to perform a precise search using Python.

- Unzip the file into your environment
- Iterate through every .md file in the directory
- Search specifically for the term '[YOUR KEYWORD HERE]'
- For every match found, print a list containing:
  - The Filename (e.g., page_042.md)
  - The exact line of text where the match appears

Do not summarize. Just list the matches.

ChatGPT will run Python code to grep through your files
Click "Analyzing..." to verify the actual code it ran
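For reference, the code ChatGPT writes for that prompt will usually look something like the sketch below. This is an illustration of the technique, not the exact code ChatGPT generates. The archive contents are made-up sample text, and the case-insensitive match is a choice made here.

```python
import io
import zipfile


def find_matches(zip_bytes: bytes, keyword: str) -> list[tuple[str, str]]:
    """Return (filename, line) for every line containing keyword, ignoring case."""
    needle = keyword.lower()
    matches = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.endswith(".md"):
                continue  # skip manifest.json and anything else
            text = archive.read(name).decode("utf-8", errors="replace")
            for line in text.splitlines():
                if needle in line.lower():
                    matches.append((name, line.strip()))
    return matches


# Stand-in archive with made-up sample text.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("page_001.md", "Revenue grew.\nChurn fell to 2%.")
    archive.writestr("page_002.md", "No relevant text here.")
    archive.writestr("page_003.md", "Monthly churn by segment:")

results = find_matches(buffer.getvalue(), "churn")
for filename, line in results:
    print(f"{filename}: {line}")
```

If the code you see under "Analyzing..." does not loop over every file like this, ask ChatGPT to rerun the search.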

Which Method Should I Use?

| Method | Best For | Example Question | AI |
|--------|----------|------------------|-----|
| MCP Connection | Ongoing document library | "Search my reports for revenue trends" | Claude |
| ChatGPT Projects | Big picture questions | "What's the revenue trend?" | ChatGPT |
| Zip & Script | Finding specific data | "List every invoice number" | ChatGPT |

Decision Tree

Do you use Claude Desktop?
  ├─ Yes → Use MCP Connection (fastest, no re-uploads)
  └─ No → Do you need recurring access?
            ├─ Yes → ChatGPT Projects
            └─ No → Zip & Script
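The same tree, written as a small function in case you want to drop it into an onboarding script. The function and parameter names are made up for illustration.

```python
def choose_method(uses_claude_desktop: bool, needs_recurring_access: bool) -> str:
    """Walk the decision tree above and return the recommended method."""
    if uses_claude_desktop:
        return "MCP Connection"
    if needs_recurring_access:
        return "ChatGPT Projects"
    return "Zip & Script"


print(choose_method(uses_claude_desktop=False, needs_recurring_access=True))
# ChatGPT Projects
```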

Why OkraPDF Makes AI Better

The Accuracy Difference

| Approach | Table Accuracy | Speed |
|----------|----------------|-------|
| Direct PDF Upload | ~40-60% | Slow (re-parses every time) |
| OkraPDF Preprocessing | ~95%+ | Fast (pre-extracted) |
| OkraPDF MCP | ~95%+ | Fastest (cached + streaming) |

The Key Difference

ChatGPT's text extraction doesn't understand tables. It might see:

Revenue | 2024 | 2023
Gaming | 12.5B |

If the chunk ends right there, the 2023 value is lost.
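You can reproduce this failure with a few lines of Python. The sketch below uses naive fixed-size character chunking, which is an assumption chosen to illustrate the problem and not a description of how ChatGPT actually splits documents. The chunk size of 38 is picked so the boundary lands mid-row.

```python
def chunk_text(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


table = "Revenue | 2024 | 2023\nGaming | 12.5B | 10.2B\n"
chunks = chunk_text(table, 38)

print(repr(chunks[0]))  # 'Revenue | 2024 | 2023\nGaming | 12.5B |'
print(repr(chunks[1]))  # ' 10.2B\n'
```

The first chunk has the headers and the 2024 value. The second has a bare `10.2B` with nothing to say what it measures.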

OkraPDF uses visual AI that sees document layout like a human, then extracts complete, self-contained tables ready for AI analysis.

Try It Now

For Claude Desktop Users (Recommended)

Add OkraPDF to your MCP servers (config above)
Upload PDFs at okrapdf.com
Ask Claude: "What documents do I have in OkraPDF?"

For ChatGPT Users

Upload a complex PDF at okrapdf.com
Download the ZIP when processing completes
Upload to ChatGPT Projects or use the Zip & Script method
Compare answers with and without preprocessing

Common Questions

"Can't ChatGPT just read PDFs now?"

Yes, but it uses basic text extraction. Complex layouts—especially tables with merged cells or multi-page spans—still get mangled. OkraPDF uses visual AI that "sees" the document like a human.

"How many pages can I process?"

Free tier: 5 pages
Standard plan: 2,000 pages/month

"Is my data secure?"

Yes. Your documents are protected with AES-256 encryption at rest and TLS 1.3 in transit, and they are auto-deleted after 30 days. See our Security page.

Summary

| Without OkraPDF | With OkraPDF |
|-----------------|--------------|
| Tables break across chunks | Tables stay intact |
| Numbers may be hallucinated | Numbers verified visually |
| ~40-60% accuracy on tables | ~95%+ accuracy on tables |
| Re-upload files every time | MCP: instant access to your library |

OkraPDF isn't a replacement for ChatGPT or Claude. It's the bridge that makes AI dramatically better at understanding your documents.

Think of it like DeepWiki for documents:

DeepWiki: "Explain how authentication works in this repo"
OkraPDF: "What's the revenue breakdown in my Q3 report?"

Questions? Email support@okrapdf.com