Cortex vs. LlamaIndex

One assembles. The other serves.

LlamaIndex is the framework you wire together — retrievers, indexes, vector stores, parsers — to build your own RAG stack. HangarX Cortex is managed memory infrastructure: a knowledge graph, vector store, claim extractor, and MCP server, ready to query.

They're not competitive — they live at different layers of the stack. The right question is which layer you want to operate.

Pick Cortex if

You want grounded agent memory shipping in a week, not a quarter.

Graph + vector + claims + retrieval pipeline + MCP server, all bundled. You ingest a corpus and start querying with cited answers — no pipeline assembly.

Pick LlamaIndex if

You want maximum flexibility and a long roadmap of custom retrieval.

100+ LLMs, 40+ vector stores, 100+ data connectors, world-class document parsing via LlamaParse. You assemble exactly the stack you want.

What we agree on

Cortex and LlamaIndex are both serious, well-engineered tools used in production by real teams. They're aligned on the fundamentals.

  • RAG over your documents — both are built around grounding LLMs in your corpus
  • Hybrid retrieval matters — keyword + vector is better than either alone
  • Knowledge graphs add value — both support graph-based retrieval over entities and relationships
  • Open source core — both have OSS variants alongside managed cloud platforms
  • Multi-LLM — neither locks you into a single model provider
  • Production-ready — both ship in real products, not demos

Where we differ

The dimensions that matter when choosing between assembling your own RAG stack and using managed memory infrastructure.

  • Layer of the stack: Cortex is a managed memory platform; LlamaIndex is a framework plus LlamaCloud (parse/extract/index services).
  • Out-of-the-box hybrid retrieval: LlamaIndex provides retrievers as primitives. Cortex bundles BM25 + vector + multi-hop graph + PPR + CRAG + reranker as a single pipeline you don't assemble.
  • Knowledge graph storage included: LlamaIndex has a KnowledgeGraphIndex abstraction; you bring your own graph DB. Cortex ships FalkorDB Cypher-compatible storage as part of the stack.
  • Vector storage included: LlamaIndex integrates with 40+ vector stores; you choose and run one. Cortex ships Postgres + pgvector preconfigured.
  • Claims with provenance (SPO triples): Cortex extracts subject-predicate-object claims linked to their source spans as a first-class feature.
  • Contradiction detection: Cortex surfaces contradictions between stored claims as part of its auditable memory.
  • MCP server (cross-tool memory): LlamaIndex agents can call MCP servers, but it doesn't expose an MCP server itself. Cortex is MCP-native: Claude, Cursor, Cline, and Windsurf all read the same memory through it.
  • Native Obsidian plugin: Cortex ships one; LlamaIndex doesn't.
  • Document parsing (LlamaParse): LlamaParse is genuinely best-in-class for complex PDFs, tables, and 50+ file types. Cortex handles common formats but doesn't try to compete with LlamaParse on parsing breadth.
  • Schema-based extraction (LlamaExtract): a LlamaCloud capability; like parsing, not something Cortex tries to match.
  • Multi-LLM provider support: Cortex supports 8+ providers; LlamaIndex supports 100+.
  • Self-host (Docker): Cortex runs self-hosted with one Docker command; LlamaIndex is a library you run wherever your code runs, with LlamaCloud as the managed option.
  • Time-to-first-query: minutes with Cortex (Docker up + ingest); hours to days with LlamaIndex (you assemble the pipeline).
  • Languages: Cortex is TypeScript / REST / MCP; LlamaIndex is Python + TypeScript.

The wedge

LlamaIndex is the dominant framework for RAG, and for good reason — it's well-designed, well-documented, and gives you a high ceiling. If you have a team that wants to build a custom retrieval pipeline tuned to your exact data and workload, LlamaIndex is the right floor to start from.

But most teams don't need that ceiling. They need grounded agent memory shipping by next sprint. That's the gap Cortex fills. We made the opinionated decisions for you: FalkorDB for the graph, Postgres + pgvector for vectors, LLM-extracted SPO claims with provenance, hybrid retrieval with CRAG-style evaluation, MCP serving on top. You ingest a corpus, you query it. The pipeline is already tuned.
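
For a concrete sense of that flow, here is a minimal sketch against a locally running Cortex instance. The base URL, endpoint paths, and response fields are illustrative assumptions, not Cortex's documented API.

```python
# Minimal sketch of the ingest-then-query flow against a local Cortex instance.
# NOTE: the base URL, endpoint paths, and response fields below are illustrative
# assumptions, not Cortex's documented API.
import requests

CORTEX = "http://localhost:3100"  # hypothetical local deployment

# Ingest one document; extraction (entities, claims, embeddings) happens server-side.
with open("docs/architecture-decisions.md", encoding="utf-8") as f:
    requests.post(
        f"{CORTEX}/documents",
        json={"title": "architecture-decisions.md", "content": f.read()},
    ).raise_for_status()

# Query through the bundled pipeline (BM25 + vector + graph + reranker).
answer = requests.post(
    f"{CORTEX}/query",
    json={"question": "Why did we pick a graph database for memory?"},
).json()

print(answer["text"])
for citation in answer.get("citations", []):
    print("  -", citation["source"], citation["span"])
```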

Think of it like Vercel vs. Next.js. Next.js is an excellent framework — but most teams ship faster on Vercel because they don't want to operate the runtime. LlamaIndex is the framework. Cortex is the runtime.

When you should pick LlamaIndex

  • You're building a custom retrieval pipeline with non-standard requirements that don't fit into a managed platform's opinions.
  • You need parsing for complex PDFs, scanned documents, tables, or 50+ file types — LlamaParse is genuinely best-in-class.
  • Your team has ML/infra engineers and the bandwidth to operate a graph DB, vector DB, retrieval logic, and observability stack themselves.
  • You need 100+ LLM provider integrations or 40+ vector store choices today.
  • You're working in Python and want the broadest ecosystem of integrations.
  • You want to start from a framework and grow into your own bespoke memory infrastructure over time.

When you should pick Cortex

  • You want grounded agent memory shipping in days, not weeks. The graph DB, vector store, retrieval pipeline, and MCP server are already wired together.
  • You want every retrieved fact to link back to its source span. Claims with provenance are first-class, not bolted on.
  • You want the same memory exposed to multiple AI tools (Claude Desktop, Cursor, Cline, Windsurf, Zed) over MCP — Cortex is MCP-native.
  • Your team is small and you'd rather not operate Postgres + a graph DB + reranking + extraction pipelines yourselves.
  • You're an Obsidian user and want a native plugin out of the box — Cortex has one; LlamaIndex doesn't.
  • You want auditable memory: contradictions surfaced, claims cited, temporal queries supported via asOf.
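
As a rough sketch of that last point, a temporal query might look like this; the asOf parameter name comes from the bullet above, while the endpoint and payload shape are assumptions.

```python
# Hypothetical temporal query: restrict the answer to claims valid at a given date.
# The asOf parameter name is Cortex's; the endpoint and payload shape are assumed.
import requests

resp = requests.post(
    "http://localhost:3100/query",  # hypothetical local deployment
    json={"question": "Who owned the billing service?", "asOf": "2024-03-01"},
)
resp.raise_for_status()
print(resp.json())
```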

FAQ

Is Cortex a replacement for LlamaIndex?

Not a 1:1 replacement — they live at different layers. LlamaIndex is a framework: a toolkit of retrievers, indexes, agents, and 100+ integrations you compose yourself. Cortex is managed infrastructure: a graph DB, vector store, claim extractor, retrieval pipeline, and MCP server bundled into a stack you can run with one Docker command. If you want to assemble your own stack with maximum flexibility, LlamaIndex. If you want memory infrastructure that just works, Cortex.

What about LlamaCloud / LlamaParse?

LlamaParse is excellent — genuinely best-in-class for parsing complex PDFs, tables, scanned documents, and 50+ file types. Cortex doesn't try to beat it on parsing breadth. Many teams pair the two: use LlamaParse to ingest gnarly source documents, then push the cleaned text into Cortex for graph extraction, claims, retrieval, and MCP serving. They're complementary, not competitive.
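
A rough sketch of that pairing: the LlamaParse calls below are its real Python client, while the Cortex endpoint is an illustrative assumption.

```python
# Parse a complex PDF with LlamaParse, then push the cleaned text into Cortex.
# The LlamaParse usage is real (pip install llama-parse, LLAMA_CLOUD_API_KEY set);
# the Cortex base URL and /documents endpoint are illustrative assumptions.
import requests
from llama_parse import LlamaParse

parser = LlamaParse(result_type="markdown")
documents = parser.load_data("./quarterly-report.pdf")

for doc in documents:
    requests.post(
        "http://localhost:3100/documents",  # hypothetical Cortex endpoint
        json={"title": "quarterly-report.pdf", "content": doc.text},
    ).raise_for_status()
```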

Can I use LlamaIndex with Cortex?

Yes. You can build a LlamaIndex agent that calls Cortex's MCP server as a memory tool, or use LlamaIndex's retrievers to query Cortex's underlying FalkorDB and pgvector stores directly. The data formats are open, so there's no integration tax.
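
Here is a minimal sketch of the MCP route using the llama-index-tools-mcp package; the Cortex MCP URL is an assumption, and the tools the agent sees depend on what the server exposes.

```python
# Sketch: a LlamaIndex agent that uses Cortex's MCP server as its memory tool.
# The Cortex MCP URL is an assumption; swap in your deployment's address.
import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec


async def main() -> None:
    cortex = BasicMCPClient("http://localhost:3100/mcp")  # hypothetical endpoint
    memory_tools = await McpToolSpec(client=cortex).to_tool_list_async()

    agent = FunctionAgent(
        tools=memory_tools,
        llm=OpenAI(model="gpt-4o-mini"),
        system_prompt="Answer from Cortex memory and cite the source spans.",
    )
    print(await agent.run("What did we decide about the retriever architecture?"))


asyncio.run(main())
```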

Why would I pay for managed memory if LlamaIndex is open source?

Same reason you'd use Vercel instead of self-hosting Next.js, or Supabase instead of running Postgres yourself. The framework is free; the operational cost of running it well — graph DB, vector DB, retrieval tuning, reranking, observability, multi-tenant isolation, MCP exposure — is not. Cortex bundles those operations. If your team has ML/infra engineers and a long roadmap, LlamaIndex gives you a higher ceiling. If you want to ship grounded agent memory in a week, Cortex gets you there faster.

Is Cortex open source?

Yes. The Cortex API stack and the Obsidian plugin are open source. Cloud mode runs the same core with managed infrastructure on top. So you have the same 'OSS or managed' choice that LlamaIndex offers — just at a different layer.

Skip the assembly. Ship grounded memory.

Point Cortex at your corpus, connect Claude or Cursor over MCP, and watch your agents start citing your docs. No retrieval pipeline to build.