Cortex vs. LlamaIndex

One assembles. The other serves.

LlamaIndex is the framework you wire together — retrievers, indexes, vector stores, parsers — to build your own RAG stack. HangarX Cortex is managed memory infrastructure: a knowledge graph, vector store, claim extractor, and MCP server, ready to query.

They're not competitive — they live at different layers of the stack. The right question is which layer you want to operate.

Pick Cortex if

You want grounded agent memory shipping in a week, not a quarter.

Graph + vector + claims + retrieval pipeline + MCP server, all bundled. You ingest a corpus and start querying with cited answers — no pipeline assembly.

Pick LlamaIndex if

You want maximum flexibility and a long roadmap of custom retrieval.

100+ LLMs, 40+ vector stores, 100+ data connectors, world-class document parsing via LlamaParse. You assemble exactly the stack you want.

What we agree on

Cortex and LlamaIndex are both serious, well-engineered tools used in production by real teams. They're aligned on the fundamentals.

  • RAG over your documents — both are built around grounding LLMs in your corpus
  • Hybrid retrieval matters — keyword + vector is better than either alone
  • Knowledge graphs add value — both support graph-based retrieval over entities and relationships
  • Open source core — both have OSS variants alongside managed cloud platforms
  • Multi-LLM — neither locks you into a single model provider
  • Production-ready — both ship in real products, not demos

Where we differ

The dimensions that matter when choosing between assembling your own RAG stack and using managed memory infrastructure.

  • Layer of the stack: Cortex is a managed memory platform; LlamaIndex is a framework plus LlamaCloud (parse/extract/index services).
  • Out-of-the-box hybrid retrieval: LlamaIndex provides retrievers as primitives. Cortex bundles BM25 + vector + multi-hop graph + PPR + CRAG + reranker as a single pipeline you don't assemble.
  • Knowledge graph storage included: LlamaIndex has a KnowledgeGraphIndex abstraction; you bring your own graph DB. Cortex ships FalkorDB Cypher-compatible storage as part of the stack.
  • Vector storage included: LlamaIndex integrates with 40+ vector stores; you choose and run one. Cortex ships Postgres + pgvector preconfigured.
  • Claims with provenance (SPO triples): Cortex extracts subject-predicate-object claims linked to their source spans as a first-class feature.
  • Contradiction detection: Cortex surfaces contradictions between stored claims as part of its auditable memory.
  • MCP server (cross-tool memory): LlamaIndex agents can call MCP servers, but it doesn't expose an MCP server itself. Cortex is MCP-native: Claude, Cursor, Cline, and Windsurf all read the same memory through it.
  • Native Obsidian plugin: Cortex ships one; LlamaIndex doesn't.
  • Document parsing (LlamaParse): LlamaParse is genuinely best-in-class for complex PDFs, tables, and 50+ file types. Cortex handles common formats but doesn't try to compete with LlamaParse on parsing breadth.
  • Schema-based extraction (LlamaExtract): a LlamaCloud capability; like parsing, not something Cortex tries to match.
  • Multi-LLM provider support: Cortex supports 8+ providers; LlamaIndex supports 100+.
  • Self-host (Docker): Cortex runs self-hosted with one Docker command; LlamaIndex is a library you run wherever your code runs, with LlamaCloud as the managed option.
  • Time-to-first-query: minutes with Cortex (Docker up + ingest); hours to days with LlamaIndex (you assemble the pipeline).
  • Languages: Cortex is TypeScript / REST / MCP; LlamaIndex is Python + TypeScript.

The wedge

LlamaIndex is the dominant framework for RAG, and for good reason — it's well-designed, well-documented, and gives you a high ceiling. If you have a team that wants to build a custom retrieval pipeline tuned to your exact data and workload, LlamaIndex is the right floor to start from.

But most teams don't need that ceiling. They need grounded agent memory shipping by next sprint. That's the gap Cortex fills. We made the opinionated decisions for you: FalkorDB for the graph, Postgres + pgvector for vectors, LLM-extracted SPO claims with provenance, hybrid retrieval with CRAG-style evaluation, MCP serving on top. You ingest a corpus, you query it. The pipeline is already tuned.
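
For a concrete sense of that flow, here is a minimal sketch against a locally running Cortex instance. The base URL, endpoint paths, and response fields are illustrative assumptions, not Cortex's documented API.

```python
# Minimal sketch of the ingest-then-query flow against a local Cortex instance.
# NOTE: the base URL, endpoint paths, and response fields below are illustrative
# assumptions, not Cortex's documented API.
import requests

CORTEX = "http://localhost:3100"  # hypothetical local deployment

# Ingest one document; extraction (entities, claims, embeddings) happens server-side.
with open("docs/architecture-decisions.md", encoding="utf-8") as f:
    requests.post(
        f"{CORTEX}/documents",
        json={"title": "architecture-decisions.md", "content": f.read()},
    ).raise_for_status()

# Query through the bundled pipeline (BM25 + vector + graph + reranker).
answer = requests.post(
    f"{CORTEX}/query",
    json={"question": "Why did we pick a graph database for memory?"},
).json()

print(answer["text"])
for citation in answer.get("citations", []):
    print("  -", citation["source"], citation["span"])
```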

Think of it like Vercel vs. Next.js. Next.js is an excellent framework — but most teams ship faster on Vercel because they don't want to operate the runtime. LlamaIndex is the framework. Cortex is the runtime.

When you should pick LlamaIndex

  • You're building a custom retrieval pipeline with non-standard requirements that don't fit into a managed platform's opinions.
  • You need parsing for complex PDFs, scanned documents, tables, or 50+ file types — LlamaParse is genuinely best-in-class.
  • Your team has ML/infra engineers and the bandwidth to operate a graph DB, vector DB, retrieval logic, and observability stack themselves.
  • You need 100+ LLM provider integrations or 40+ vector store choices today.
  • You're working in Python and want the broadest ecosystem of integrations.
  • You want to start from a framework and grow into your own bespoke memory infrastructure over time.

When you should pick Cortex

  • You want grounded agent memory shipping in days, not weeks. The graph DB, vector store, retrieval pipeline, and MCP server are already wired together.
  • You want every retrieved fact to link back to its source span. Claims with provenance are first-class, not bolted on.
  • You want the same memory exposed to multiple AI tools (Claude Desktop, Cursor, Cline, Windsurf, Zed) over MCP — Cortex is MCP-native.
  • Your team is small and you'd rather not operate Postgres + a graph DB + reranking + extraction pipelines yourselves.
  • You're an Obsidian user and want a native plugin out of the box — Cortex has one; LlamaIndex doesn't.
  • You want auditable memory: contradictions surfaced, claims cited, temporal queries supported via asOf.
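
As a rough sketch of that last point, a temporal query might look like this; the asOf parameter name comes from the bullet above, while the endpoint and payload shape are assumptions.

```python
# Hypothetical temporal query: restrict the answer to claims valid at a given date.
# The asOf parameter name is Cortex's; the endpoint and payload shape are assumed.
import requests

resp = requests.post(
    "http://localhost:3100/query",  # hypothetical local deployment
    json={"question": "Who owned the billing service?", "asOf": "2024-03-01"},
)
resp.raise_for_status()
print(resp.json())
```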

FAQ

Is Cortex a replacement for LlamaIndex?

Not a 1:1 replacement — they live at different layers. LlamaIndex is a framework: a toolkit of retrievers, indexes, agents, and 100+ integrations you compose yourself. Cortex is managed infrastructure: a graph DB, vector store, claim extractor, retrieval pipeline, and MCP server bundled into a stack you can run with one Docker command. If you want to assemble your own stack with maximum flexibility, LlamaIndex. If you want memory infrastructure that just works, Cortex.

What about LlamaCloud / LlamaParse?

LlamaParse is excellent — genuinely best-in-class for parsing complex PDFs, tables, scanned documents, and 50+ file types. Cortex doesn't try to beat it on parsing breadth. Many teams pair the two: use LlamaParse to ingest gnarly source documents, then push the cleaned text into Cortex for graph extraction, claims, retrieval, and MCP serving. They're complementary, not competitive.
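
A rough sketch of that pairing: the LlamaParse calls below are its real Python client, while the Cortex endpoint is an illustrative assumption.

```python
# Parse a complex PDF with LlamaParse, then push the cleaned text into Cortex.
# The LlamaParse usage is real (pip install llama-parse, LLAMA_CLOUD_API_KEY set);
# the Cortex base URL and /documents endpoint are illustrative assumptions.
import requests
from llama_parse import LlamaParse

parser = LlamaParse(result_type="markdown")
documents = parser.load_data("./quarterly-report.pdf")

for doc in documents:
    requests.post(
        "http://localhost:3100/documents",  # hypothetical Cortex endpoint
        json={"title": "quarterly-report.pdf", "content": doc.text},
    ).raise_for_status()
```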

Can I use LlamaIndex with Cortex?

Yes. You can build a LlamaIndex agent that calls Cortex's MCP server as a memory tool, or use LlamaIndex's retrievers to query Cortex's underlying FalkorDB and pgvector stores directly. The data formats are open, so there's no integration tax.
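
Here is a minimal sketch of the MCP route using the llama-index-tools-mcp package; the Cortex MCP URL is an assumption, and the tools the agent sees depend on what the server exposes.

```python
# Sketch: a LlamaIndex agent that uses Cortex's MCP server as its memory tool.
# The Cortex MCP URL is an assumption; swap in your deployment's address.
import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec


async def main() -> None:
    cortex = BasicMCPClient("http://localhost:3100/mcp")  # hypothetical endpoint
    memory_tools = await McpToolSpec(client=cortex).to_tool_list_async()

    agent = FunctionAgent(
        tools=memory_tools,
        llm=OpenAI(model="gpt-4o-mini"),
        system_prompt="Answer from Cortex memory and cite the source spans.",
    )
    print(await agent.run("What did we decide about the retriever architecture?"))


asyncio.run(main())
```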

Why would I pay for managed memory if LlamaIndex is open source?

Same reason you'd use Vercel instead of self-hosting Next.js, or Supabase instead of running Postgres yourself. The framework is free; the operational cost of running it well — graph DB, vector DB, retrieval tuning, reranking, observability, multi-tenant isolation, MCP exposure — is not. Cortex bundles those operations. If your team has ML/infra engineers and a long roadmap, LlamaIndex gives you a higher ceiling. If you want to ship grounded agent memory in a week, Cortex gets you there faster.

Is Cortex open source?

Yes. The Cortex API stack and the Obsidian plugin are open source. Cloud mode runs the same core with managed infrastructure on top. So you have the same 'OSS or managed' choice that LlamaIndex offers — just at a different layer.

Skip the assembly. Ship grounded memory.

Point Cortex at your corpus, connect Claude or Cursor over MCP, and watch your agents start citing your docs. No retrieval pipeline to build.