Vectify AI

⭐ What is PageIndex?

PageIndex transforms lengthy PDF documents into a searchable tree structure, similar to a smart "table of contents" but optimized for use with LLMs.

Built for Reasoning-based RAG 🧠, PageIndex enables LLMs to navigate documents logically and find exactly what they need through reasoning and structured relevance, without relying on vector similarity or arbitrary chunking. It is ideal for financial reports, legal documents, technical manuals, or any document that exceeds LLM context limits.

👉 Try it now via the API or the Web Dashboard.

💬 For support or feedback, please leave us a message or join our Discord community.

✅ Key Features

Hierarchical Tree Structure
Enables LLMs to traverse documents logically, like an intelligent, LLM-optimized table of contents.
Chunk-Free Segmentation
No arbitrary chunking. Nodes follow the natural structure of the document.
Scales to Massive Documents
Designed to handle hundreds or even thousands of pages with ease.
Precise Page Referencing
Every node contains a summary and the physical page indices where it starts and ends, allowing pinpoint retrieval.

📦 PageIndex Format

Here is an example output. See more example documents and generated trees.

{
  "title": "Financial Stability",
  "node_id": "0006",
  "start_index": 21,
  "end_index": 22,
  "summary": "The Federal Reserve ...",
  "nodes": [
    {
      "title": "Monitoring Financial Vulnerabilities",
      "node_id": "0007",
      "start_index": 22,
      "end_index": 28,
      "summary": "The Federal Reserve's monitoring ..."
    },
    {
      "title": "Domestic and International Cooperation and Coordination",
      "node_id": "0008",
      "start_index": 28,
      "end_index": 31,
      "summary": "In 2023, the Federal Reserve collaborated ..."
    }
  ]
}
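
To make the format concrete, here is a minimal Python sketch of how a consumer might read such a tree. It relies only on the fields shown in the example above (`title`, `node_id`, `start_index`, `end_index`, `nodes`). The helper names `walk`, `outline`, and `nodes_covering` are hypothetical and not part of PageIndex, and treating `end_index` as inclusive is an assumption made for illustration.

```python
import json
from typing import Iterator

# Abridged copy of the example tree above; summaries are left truncated as in the source.
EXAMPLE = """
{
  "title": "Financial Stability",
  "node_id": "0006",
  "start_index": 21,
  "end_index": 22,
  "summary": "The Federal Reserve ...",
  "nodes": [
    {"title": "Monitoring Financial Vulnerabilities", "node_id": "0007",
     "start_index": 22, "end_index": 28,
     "summary": "The Federal Reserve's monitoring ..."},
    {"title": "Domestic and International Cooperation and Coordination",
     "node_id": "0008", "start_index": 28, "end_index": 31,
     "summary": "In 2023, the Federal Reserve collaborated ..."}
  ]
}
"""


def walk(node: dict, depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Yield (depth, node) pairs in document order (pre-order traversal)."""
    yield depth, node
    # Leaf nodes in the example have no "nodes" key, so default to an empty list.
    for child in node.get("nodes", []):
        yield from walk(child, depth + 1)


def outline(tree: dict) -> str:
    """Render the tree as an indented outline, e.g. for inclusion in an LLM prompt."""
    lines = []
    for depth, node in walk(tree):
        indent = "  " * depth
        pages = f"{node['start_index']}-{node['end_index']}"
        lines.append(f"{indent}[{node['node_id']}] {node['title']} (pages {pages})")
    return "\n".join(lines)


def nodes_covering(tree: dict, page: int) -> list[str]:
    """Return node_ids whose page range contains `page`.

    Assumption: end_index is inclusive.
    """
    return [
        node["node_id"]
        for _, node in walk(tree)
        if node["start_index"] <= page <= node["end_index"]
    ]


tree = json.loads(EXAMPLE)
print(outline(tree))
print(nodes_covering(tree, 28))
```

An outline like this is one way to hand the structure to an LLM so it can choose a `node_id` by reasoning over titles and summaries, after which the page range tells you which pages to fetch.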