Show HN: MCP Document Indexer – Local AI search for your documents using Ollama

Hacker News - AI 28 min read Article

Summary

The MCP Document Indexer is a Python-based MCP (Model Context Protocol) server for local document indexing and search. It uses Ollama for document summarization and keyword extraction, and LanceDB for semantic search.

Why It Matters

As digital documentation proliferates, the MCP Document Indexer lets users index and search their documents locally. Summarization and keyword extraction run on local LLMs through Ollama rather than a cloud service, so document contents stay on the user's machine, which addresses data-security concerns.

Key Takeaways

  • Supports multiple document formats including PDF, Word, and Markdown.
  • Integrates local LLMs (via Ollama) for document summarization and keyword extraction, and monitors configured folders in real time.
  • Utilizes LanceDB for efficient semantic search and indexing.
  • Designed for decent performance on standard laptops (e.g. M1/M2 MacBook), making it accessible for everyday users.
  • Allows for incremental indexing, processing only changed files to save resources.
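
The incremental indexing mentioned in the last takeaway can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes changes are detected by comparing SHA-256 content hashes against a stored manifest, and the names file_fingerprint and find_changed_files are hypothetical.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large documents are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(folder: Path, manifest: dict) -> list:
    """Return files under `folder` that are new or modified since the last run.

    `manifest` maps file paths (as strings) to the fingerprint recorded on the
    previous run. It is updated in place (hypothetical design, assumed here).
    """
    changed = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        fingerprint = file_fingerprint(path)
        if manifest.get(str(path)) != fingerprint:
            changed.append(path)
            manifest[str(path)] = fingerprint
    return changed
```

On a first run with an empty manifest every file is reported. On later runs only files whose contents differ from the recorded fingerprint are returned, so unchanged documents could skip re-embedding and re-summarization.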

MCP Document Indexer

A Python-based MCP (Model Context Protocol) server for local document indexing and search using LanceDB vector database and local LLMs.

Features

  • Real-time Document Monitoring: Automatically indexes new and modified documents in configured folders
  • Multi-format Support: Handles PDF, Word (docx/doc), text, Markdown, and RTF files
  • Local LLM Integration: Uses Ollama for document summarization and keyword extraction. Nothing ever leaves your computer
  • Vector Search: Semantic search using LanceDB and sentence transformers
  • MCP Integration: Exposes search and catalog tools via Model Context Protocol
  • Incremental Indexing: Only processes changed files to save resources
  • Performance Optimized: Designed for decent performance on standard laptops (e.g. M1/M2 MacBook)

Installation

Prerequisites

  • Python 3.9+ installed
  • uv package manager:

      curl -LsSf https://astral.sh/uv/install.sh | sh

  • Ollama (for local LLM):

      # Install Ollama
      curl -fsSL https://ollama.com/install.sh | sh
      # Pull a model (e.g., llama3.2)
      ollama pull llama3.2:3b

Install MCP Document Indexer

    # Clone the repository
    git clone https://github.com/yairwein/mcp-doc-indexer.git
    cd mcp-doc-indexer
    # Install with uv
    uv sync
    # Or install as a package
    uv add mcp-doc-indexer

Configuration

Configure the indexer using environment variables or a .env file:

    # Folders to monitor (comma-separated)
    WATCH_FOLDERS="/Users/me/Documents,/Users/me/Research"
    # LanceDB storage path
    LANCEDB_PATH="./vector_index"
    # Ollama model for summ...
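
The environment-variable configuration described above could be read roughly like this. It is a sketch under stated assumptions, not the project's code: only WATCH_FOLDERS and LANCEDB_PATH come from the README excerpt, while IndexerConfig, load_config, and the fallback default for LANCEDB_PATH are hypothetical.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import List, Mapping, Optional


@dataclass
class IndexerConfig:
    """Hypothetical container for the two settings shown in the README."""
    watch_folders: List[Path]
    lancedb_path: Path


def load_config(env: Optional[Mapping[str, str]] = None) -> IndexerConfig:
    """Build a config from environment variables (or any mapping, for testing)."""
    env = os.environ if env is None else env
    # WATCH_FOLDERS is comma-separated; ignore empty entries and stray spaces.
    raw_folders = env.get("WATCH_FOLDERS", "")
    folders = [Path(part.strip()) for part in raw_folders.split(",") if part.strip()]
    # Assumed default: the example value from the README, not a confirmed default.
    lancedb_path = Path(env.get("LANCEDB_PATH", "./vector_index"))
    return IndexerConfig(watch_folders=folders, lancedb_path=lancedb_path)
```

Passing a plain dictionary instead of relying on os.environ keeps the function easy to exercise in isolation, for example load_config({"WATCH_FOLDERS": "/Users/me/Documents"}).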
