What is ProbonoBonobo sui mcp server
MCP Server with FAISS for RAG
This project provides a proof-of-concept implementation of a Machine Conversation Protocol (MCP) server that allows an AI agent to query a vector database and retrieve relevant documents for Retrieval-Augmented Generation (RAG).
Features
- FastAPI server with MCP endpoints
- FAISS vector database integration
- Document chunking and embedding
- GitHub Move file extraction and processing
- LLM integration for complete RAG workflow
- Simple client example
- Sample documents
Installation
Using pipx (Recommended)
pipx is a tool to help you install and run Python applications in isolated environments.
- First, install pipx if you don't have it:
# On macOS
brew install pipx
pipx ensurepath
# On Ubuntu/Debian
sudo apt update
sudo apt install python3-pip python3-venv
python3 -m pip install --user pipx
python3 -m pipx ensurepath
# On Windows with pip
pip install pipx
pipx ensurepath
- Install the MCP Server package directly from the project directory:
# Navigate to the directory containing the mcp_server folder
cd /path/to/mcp-server-project
# Install in editable mode
pipx install -e .
- (Optional) Configure environment variables:
- Copy .env.example to .env
- Add your GitHub token for higher rate limits:
GITHUB_TOKEN=your_token_here
- Add your OpenAI or other LLM API key for RAG integration:
OPENAI_API_KEY=your_key_here
Manual Installation
If you prefer not to use pipx:
- Clone the repository
- Install dependencies:
cd mcp_server
pip install -r requirements.txt
Usage with pipx
After installing with pipx, you'll have access to the following commands:
Downloading Move Files from GitHub
# Download Move files with default settings
mcp-download --query "use sui" --output-dir docs/move_files
# Download with more options
mcp-download --query "module sui::coin" --max-results 50 --new-index --verbose
Improved GitHub Search and Indexing (Recommended)
# Search GitHub and index files with default settings
mcp-search-index --keywords "sui move"
# Search multiple keywords and customize options
mcp-search-index --keywords "sui move,move framework" --max-repos 30 --output-results --verbose
# Save search results and use a custom index location
mcp-search-index --keywords "sui coin,sui::transfer" --index-file custom/path/index.bin --output-results
The mcp-search-index command provides enhanced GitHub repository search capabilities:
- Searches repositories first, then recursively extracts Move files
- Supports multiple search keywords (comma-separated)
- Intelligently filters for Move files containing "use sui" references
- Always rebuilds the vector database after downloading
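The "use sui" filter in the list above can be pictured roughly as follows. This is a minimal sketch, not the project's code: the helper name `find_sui_move_files` and the plain substring match are assumptions made for illustration, and the real filter may inspect Move sources more carefully.

```python
from pathlib import Path


def find_sui_move_files(root: str, needle: str = "use sui") -> list[Path]:
    """Return .move files under `root` whose text contains `needle`.

    Hypothetical helper for illustration only.
    """
    matches = []
    for path in sorted(Path(root).rglob("*.move")):
        # errors="ignore" keeps a stray non-UTF-8 byte from aborting the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text:
            matches.append(path)
    return matches
```

Calling `find_sui_move_files("docs/move_files")` would list the downloaded files that reference Sui, assuming they were saved under that directory.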
Indexing Move Files
# Index files in the default location
mcp-index
# Index with custom options
mcp-index --docs-dir path/to/files --index-file path/to/index.bin --verbose
Querying the Vector Database
# Basic query
mcp-query "What is a module in Sui Move?"
# Advanced query with options
mcp-query "How do I define a struct in Sui Move?" -k 3 -f
Using RAG with LLM Integration
# Basic RAG query (will use simulated LLM if no API key is provided)
mcp-rag "What is a module in Sui Move?"
# Using with a specific LLM API
mcp-rag "How do I define a struct in Sui Move?" --api-key your_api_key --top-k 3
# Output as JSON for further processing
mcp-rag "What are the benefits of sui::coin?" --output-json > rag_response.json
Running the Server
# Start the server with default settings
mcp-server
# Start with custom settings
mcp-server --host 127.0.0.1 --port 8080 --index-file custom/path/index.bin
Manual Usage (without pipx)
Starting the server
cd mcp_server
python main.py
The server will start on http://localhost:8000
Downloading Move Files from GitHub
To download Move files from GitHub and populate your vector database:
# Download Move files with default query "use sui"
./run.sh --download-move
# Customize the search query
./run.sh --download-move --github-query "module sui::coin" --max-results 50
# Download, index, and start the server
./run.sh --download-move --index
You can also use the Python script directly:
python download_move_files.py --query "use sui" --output-dir docs/move_files
Indexing documents
Before querying, you need to index your documents. You can place your text files (.txt), Markdown files (.md), or Move files (.move) in the docs directory.
To index the documents, you can either:
- Use the run script with the --index flag:
./run.sh --index
- Use the index script directly:
python index_move_files.py --docs-dir docs/move_files --index-file data/faiss_index.bin
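Before anything is embedded, indexing typically means collecting the supported files and splitting them into chunks. The sketch below shows one way that step could look. The function names, the 500-character window, and the 50-character overlap are illustrative assumptions, not values taken from index_move_files.py, and the embedding and FAISS steps are left out.

```python
from pathlib import Path

# File types the README says can be indexed.
SUPPORTED_SUFFIXES = {".txt", ".md", ".move"}


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def load_chunks(docs_dir: str) -> list[tuple[str, str]]:
    """Return (file name, chunk) pairs for every supported file in docs_dir."""
    pairs = []
    for path in sorted(Path(docs_dir).rglob("*")):
        if path.is_file() and path.suffix in SUPPORTED_SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="ignore")
            pairs.extend((path.name, chunk) for chunk in chunk_text(text))
    return pairs
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of some duplicated text in the index.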
Querying documents
You can use the local query script:
python local_query.py "What is RAG?"
# With more options
python local_query.py -k 3 -f "How to define a struct in Sui Move?"
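Conceptually, a query is embedded and compared against the stored chunk vectors, and the closest few are returned. The sketch below does an exact brute-force search by squared Euclidean distance in pure Python, which is the same computation a flat (exhaustive) FAISS index performs at scale. It assumes `-k` sets the number of results, as `top_k` does in the API payload. The index type the project actually uses is not stated here, and the function names are hypothetical.

```python
import heapq


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query: list[float], vectors: list[list[float]], k: int = 3) -> list[tuple[int, float]]:
    """Return (index, distance) for the k stored vectors closest to query."""
    scored = ((i, squared_l2(query, v)) for i, v in enumerate(vectors))
    # nsmallest keeps only the k best candidates, ordered nearest first.
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Toy 2-D "embeddings"; real ones come from an embedding model.
vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
print(top_k([1.0, 1.0], vectors, k=2))
```

The returned indices point back into the list of chunks, so the caller can look up the matching text.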
Using RAG with LLM Integration
# Direct RAG query with an LLM
python rag_integration.py "What is a module in Sui Move?" --index-file data/faiss_index.bin
# With API key (if you have one)
OPENAI_API_KEY=your_key_here python rag_integration.py "How do coins work in Sui?"
MCP API Endpoint
The MCP API endpoint is available at /mcp/action. You can use it to perform different actions:
- retrieve_documents: Retrieve relevant documents for a query
- index_documents: Index documents from a directory
Example:
curl -X POST "http://localhost:8000/mcp/action" -H "Content-Type: application/json" -d '{"action_type": "retrieve_documents", "payload": {"query": "What is RAG?", "top_k": 3}}'
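The same request can be sent from Python using only the standard library. This is a minimal sketch: the URL and the payload fields mirror the curl example, while the helper names are hypothetical and the shape of the JSON reply is not documented here, so it is returned as-is.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8000/mcp/action"


def build_request(query: str, top_k: int = 3, url: str = MCP_URL) -> urllib.request.Request:
    """Build the POST request for a retrieve_documents action."""
    body = {
        "action_type": "retrieve_documents",
        "payload": {"query": query, "top_k": top_k},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve_documents(query: str, top_k: int = 3) -> dict:
    """Send the request and decode the JSON reply (shape not documented here)."""
    with urllib.request.urlopen(build_request(query, top_k), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Requires a running server:
# print(retrieve_documents("What is RAG?"))
```

Keeping `build_request` separate from the network call makes the payload easy to inspect or test without starting the server.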
Complete RAG Pipeline
The full RAG (Retrieval-Augmented Generation) pipeline works as follows:
- Search Query: The user submits a question
- Retrieval: The system searches the vector database for relevant documents
- Context Formation: Retrieved documents are formatted into a prompt
- LLM Generation: The prompt is sent to an LLM with the retrieved context
- Enhanced Response: The LLM provides an answer based on the retrieved information
This workflow is fully implemented in the rag_integration.py module, which can be used either through the command line or as a library in your own applications.
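The steps above can be sketched in a few lines. This is an illustration of the pipeline's shape, not the interface of rag_integration.py: `format_prompt`, `rag_answer`, the prompt wording, and the two stand-in functions are all assumptions, chosen so the example runs without a vector index or an API key.

```python
from typing import Callable


def format_prompt(question: str, documents: list[str]) -> str:
    """Context formation: number the retrieved documents and append the question."""
    context = "\n\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def rag_answer(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Run retrieval, context formation, and generation in order."""
    documents = retrieve(question, top_k)
    prompt = format_prompt(question, documents)
    return generate(prompt)


# Stand-ins for the vector search and the LLM call.
def fake_retrieve(question: str, top_k: int) -> list[str]:
    corpus = ["A Move module groups types and functions.", "sui::coin defines fungible tokens."]
    return corpus[:top_k]


def fake_generate(prompt: str) -> str:
    return f"(simulated answer from a {len(prompt)}-character prompt)"


print(rag_answer("What is a module in Sui Move?", fake_retrieve, fake_generate))
```

Passing the retriever and generator in as functions keeps the pipeline independent of any particular vector store or LLM provider.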
GitHub Move File Extraction
The system can extract Move files from GitHub based on search queries. It implements two methods:
- GitHub API (preferred): Requires a GitHub token for higher rate limits
- Web scraping fallback: Used when the API method fails or when no token is provided
To configure your GitHub token, set it in the .env file or as an environment variable:
GITHUB_TOKEN=your_github_token_here
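The API-first, scraping-second behaviour described above could be wired up along these lines. This is a sketch under stated assumptions: `github_headers` and `fetch_with_fallback` are hypothetical names, and the two fetchers are passed in as plain callables because the project's own extractor functions are not shown here. The `Accept` and `Authorization: Bearer` headers are the standard ones for the GitHub REST API.

```python
import os
from typing import Callable, Optional


def github_headers(token: Optional[str] = None) -> dict[str, str]:
    """Build GitHub REST API headers, adding auth only when a token exists."""
    token = token or os.environ.get("GITHUB_TOKEN")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def fetch_with_fallback(
    query: str,
    via_api: Callable[[str, dict[str, str]], list[str]],
    via_scraping: Callable[[str], list[str]],
) -> list[str]:
    """Prefer the API when a token is set; scrape if it is missing or the API fails."""
    headers = github_headers()
    if "Authorization" in headers:
        try:
            return via_api(query, headers)
        except Exception:
            # Any API failure (rate limit, network error) triggers the fallback.
            pass
    return via_scraping(query)
```

Reading the token from the environment at call time means a value placed in .env and loaded into the process is picked up without code changes.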
Project Structure
mcp_server/
├── __init__.py              # Package initialization
├── main.py                  # Main server file
├── mcp_api.py               # MCP API implementation
├── index_move_files.py      # File indexing utility
├── local_query.py           # Local query utility
├── download_move_files.py   # GitHub Move file extractor
├── rag_integration.py       # LLM integration for RAG
├── pyproject.toml           # Package configuration
├── requirements.txt         # Dependencies
├── .env.example             # Example environment variables
├── README.md                # This file
├── data/                    # Storage for the FAISS index
├── docs/                    # Sample documents
│   └── move_files/          # Downloaded Move files
├── models/                  # Model implementations
│   └── vector_store.py      # FAISS vector store implementation
└── utils/
    ├── document_processor.py   # Document processing utilities
    └── github_extractor.py     # GitHub file extraction utilities
Extending the Project
To extend this proof-of-concept:
- Add authentication and security features
- Implement more sophisticated document processing
- Add support for more document types
- Integrate with other LLM providers
- Add monitoring and logging
- Improve the Move language parsing for more structured data extraction
License
MIT
Frequently Asked Questions
What is MCP?
MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications, providing a standardized way to connect AI models to different data sources and tools.
What are MCP Servers?
MCP Servers are lightweight programs that expose specific capabilities through the standardized Model Context Protocol. They act as bridges between LLMs like Claude and various data sources or services, allowing secure access to files, databases, APIs, and other resources.
How do MCP Servers work?
MCP Servers follow a client-server architecture where a host application (like Claude Desktop) connects to multiple servers. Each server provides specific functionality through standardized endpoints and protocols, enabling Claude to access data and perform actions through the standardized protocol.
Are MCP Servers secure?
Yes, MCP Servers are designed with security in mind. They run locally with explicit configuration and permissions, require user approval for actions, and include built-in security features to prevent unauthorized access and ensure data privacy.
Related MCP Servers
chrisdoc hevy mcp
sylphlab pdf reader mcp
An MCP server built with Node.js/TypeScript that allows AI agents to securely read PDF files (local or URL) and extract text, metadata, or page counts. Uses pdf-parse.
aashari mcp server atlassian bitbucket
Node.js/TypeScript MCP server for Atlassian Bitbucket. Enables AI systems (LLMs) to interact with workspaces, repositories, and pull requests via tools (list, get, comment, search). Connects AI directly to version control workflows through the standard MCP interface.
aashari mcp server atlassian confluence
Node.js/TypeScript MCP server for Atlassian Confluence. Provides tools enabling AI systems (LLMs) to list/get spaces & pages (content formatted as Markdown) and search via CQL. Connects AI seamlessly to Confluence knowledge bases using the standard MCP interface.
prisma prisma
Next-generation ORM for Node.js & TypeScript | PostgreSQL, MySQL, MariaDB, SQL Server, SQLite, MongoDB and CockroachDB
Zzzccs123 mcp sentry
mcp sentry for typescript sdk
zhuzhoulin dify mcp server
zhongmingyuan mcp my mac
zhixiaoqiang desktop image manager mcp
MCP server for managing desktop images: viewing details, compressing, moving, and so on (implemented entirely by Trae).
zhixiaoqiang antd components mcp
An MCP service for Ant Design component queries. It aims to reduce hallucinations when generating Ant Design component code and includes system prompts, component documentation, API documentation, code examples, and changelog queries.