MCP Deep Web Research Server (v0.3.0)
A Model Context Protocol (MCP) server for advanced web research.
Latest Changes
- Added visit_page tool for direct webpage content extraction
- Optimized performance to work within MCP timeout limits
- Reduced default maxDepth and maxBranching parameters
- Improved page loading efficiency
- Added timeout checks throughout the process
- Enhanced error handling for timeouts
This project is a fork of mcp-webresearch by mzxrai, enhanced with additional features for deep web research capabilities. We're grateful to the original creators for their foundational work.
Bring real-time info into Claude with intelligent search queuing, enhanced content extraction, and deep research capabilities.
Features
- Intelligent Search Queue System
  - Batch search operations with rate limiting
  - Queue management with progress tracking
  - Error recovery and automatic retries
  - Search result deduplication
- Enhanced Content Extraction
  - TF-IDF based relevance scoring
  - Keyword proximity analysis
  - Content section weighting
  - Readability scoring
  - Improved HTML structure parsing
  - Structured data extraction
  - Better content cleaning and formatting
- Core Features
  - Google search integration
  - Webpage content extraction
  - Research session tracking
  - Markdown conversion with improved formatting
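To give a feel for what "TF-IDF based relevance scoring" means in practice, here is a minimal sketch of the general technique. It is not the server's actual implementation: the function names `tokenize` and `tfIdfScores`, the tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch: none of these names come from the package itself.

/** Lowercase a text and split it into alphanumeric tokens. */
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

/**
 * Score each document against the query with a basic TF-IDF sum.
 * tf  = occurrences of the term in the document / document length
 * idf = ln((1 + N) / (1 + df)) + 1, a smoothed variant that stays positive
 */
function tfIdfScores(query: string, documents: string[]): number[] {
  const docs = documents.map(tokenize);
  const terms = Array.from(new Set(tokenize(query)));
  const n = docs.length;

  // Document frequency: in how many documents does each query term occur?
  const df = new Map<string, number>();
  for (const term of terms) {
    df.set(term, docs.filter((d) => d.some((t) => t === term)).length);
  }

  return docs.map((tokens) => {
    if (tokens.length === 0) return 0; // empty pages score zero
    let score = 0;
    for (const term of terms) {
      const count = tokens.filter((t) => t === term).length;
      const tf = count / tokens.length;
      const idf = Math.log((1 + n) / (1 + (df.get(term) ?? 0))) + 1;
      score += tf * idf;
    }
    return score;
  });
}
```

A page that never mentions any query term scores 0, and pages that use the query terms more densely score higher. A real extractor would likely combine this with the other signals listed above, such as keyword proximity and section weighting.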
Prerequisites
- Node.js >= 18 (includes npm and npx)
- Claude Desktop app
Installation
Installing via Smithery
To install Deep Web Research Server for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @PedroDnT/mcp-deepwebresearch --client claude
Global Installation (Recommended)
# Install globally using npm
npm install -g mcp-deepwebresearch
# Or using yarn
yarn global add mcp-deepwebresearch
# Or using pnpm
pnpm add -g mcp-deepwebresearch
Local Project Installation
# Using npm
npm install mcp-deepwebresearch
# Using yarn
yarn add mcp-deepwebresearch
# Using pnpm
pnpm add mcp-deepwebresearch
Claude Desktop Integration
After installing the package, add this entry to your claude_desktop_config.json:
Windows
{
  "mcpServers": {
    "deepwebresearch": {
      "command": "mcp-deepwebresearch",
      "args": []
    }
  }
}
Location: %APPDATA%\Claude\claude_desktop_config.json
macOS
{
  "mcpServers": {
    "deepwebresearch": {
      "command": "mcp-deepwebresearch",
      "args": []
    }
  }
}
Location: ~/Library/Application Support/Claude/claude_desktop_config.json
This config allows Claude Desktop to automatically start the web research MCP server when needed.
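If your claude_desktop_config.json already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge on an already-parsed config object. The type names and the `addDeepWebResearch` helper are hypothetical and exist only for this example; reading and writing the file itself is left out.

```typescript
// Hypothetical helper for illustration; not part of the package.

interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

/** Return a copy of the config with the deepwebresearch entry added. */
function addDeepWebResearch(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}), // keep servers that are already configured
      deepwebresearch: { command: "mcp-deepwebresearch", args: [] },
    },
  };
}
```

The function returns a new object and leaves its input unchanged, so the original config can be kept as a backup before anything is written to disk.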
First-time Setup
After installation, run this command to install required browser dependencies:
npx playwright install chromium
Usage
Simply start a chat with Claude and send a prompt that would benefit from web research. If you'd like a prebuilt prompt customized for deeper web research, you can use the agentic-research prompt that we provide through this package. Access that prompt in Claude Desktop by clicking the Paperclip icon in the chat input and then selecting Choose an integration → deepwebresearch → agentic-research.
Tools
- deep_research
  - Performs comprehensive research with content analysis
  - Arguments:
    {
      topic: string;
      maxDepth?: number;           // default: 2
      maxBranching?: number;       // default: 3
      timeout?: number;            // default: 55000 (55 seconds)
      minRelevanceScore?: number;  // default: 0.7
    }
  - Returns:
    {
      findings: {
        mainTopics: Array<{name: string, importance: number}>;
        keyInsights: Array<{text: string, confidence: number}>;
        sources: Array<{url: string, credibilityScore: number}>;
      };
      progress: {
        completedSteps: number;
        totalSteps: number;
        processedUrls: number;
      };
      timing: {
        started: string;
        completed?: string;
        duration?: number;
        operations?: {
          parallelSearch?: number;
          deduplication?: number;
          topResultsProcessing?: number;
          remainingResultsProcessing?: number;
          total?: number;
        };
      };
    }
- parallel_search
  - Performs multiple Google searches in parallel with intelligent queuing
  - Arguments:
    { queries: string[], maxParallel?: number }
  - Note: maxParallel is limited to 5 to ensure reliable performance
- visit_page
  - Visit a webpage and extract its content
  - Arguments:
    { url: string }
  - Returns:
    {
      url: string;
      title: string;
      content: string;  // Markdown formatted content
    }
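The optional arguments above can be read as "use the documented default unless the caller says otherwise". The following sketch spells that out for deep_research and for the maxParallel cap on parallel_search. The helper names `resolveDeepResearchArgs` and `clampMaxParallel` are hypothetical, and treating an omitted maxParallel as 5 is an assumption; the server's real argument handling may differ.

```typescript
// Illustrative only: the interface mirrors the documented arguments,
// but the helper functions are assumptions, not the server's code.

interface DeepResearchArgs {
  topic: string;
  maxDepth?: number;
  maxBranching?: number;
  timeout?: number;
  minRelevanceScore?: number;
}

// Defaults as documented in the Tools list.
const DEEP_RESEARCH_DEFAULTS = {
  maxDepth: 2,
  maxBranching: 3,
  timeout: 55000,
  minRelevanceScore: 0.7,
};

/** Fill in every omitted optional argument with its documented default. */
function resolveDeepResearchArgs(args: DeepResearchArgs): Required<DeepResearchArgs> {
  return {
    topic: args.topic,
    maxDepth: args.maxDepth ?? DEEP_RESEARCH_DEFAULTS.maxDepth,
    maxBranching: args.maxBranching ?? DEEP_RESEARCH_DEFAULTS.maxBranching,
    timeout: args.timeout ?? DEEP_RESEARCH_DEFAULTS.timeout,
    minRelevanceScore: args.minRelevanceScore ?? DEEP_RESEARCH_DEFAULTS.minRelevanceScore,
  };
}

/** Keep maxParallel between 1 and the documented limit of 5. */
function clampMaxParallel(requested?: number): number {
  const limit = 5;
  // Assumption: an omitted or non-finite value falls back to the limit.
  if (requested === undefined || !Number.isFinite(requested)) return limit;
  return Math.min(limit, Math.max(1, Math.floor(requested)));
}
```

For example, a request of `{ topic: "MCP servers", maxDepth: 1 }` would resolve to maxDepth 1 with the remaining values at their defaults, and a parallel_search call asking for 10 parallel queries would be held to 5.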
Prompts
agentic-research
A guided research prompt that helps Claude conduct thorough web research. The prompt instructs Claude to:
- Start with broad searches to understand the topic landscape
- Prioritize high-quality, authoritative sources
- Iteratively refine the research direction based on findings
- Keep you informed and let you guide the research interactively
- Always cite sources with URLs
Configuration Options
The server can be configured through environment variables:
- MAX_PARALLEL_SEARCHES: Maximum number of concurrent searches (default: 5)
- SEARCH_DELAY_MS: Delay between searches in milliseconds (default: 200)
- MAX_RETRIES: Number of retry attempts for failed requests (default: 3)
- TIMEOUT_MS: Request timeout in milliseconds (default: 55000)
- LOG_LEVEL: Logging level (default: 'info')
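As a rough illustration of how these variables and their defaults fit together, the sketch below parses them from an environment-like object. The `ServerConfig` shape and the `loadConfig` and `intFromEnv` helpers are hypothetical; only the variable names and default values come from the list above.

```typescript
// Hypothetical config loader; only the variable names and defaults are documented.

interface ServerConfig {
  maxParallelSearches: number;
  searchDelayMs: number;
  maxRetries: number;
  timeoutMs: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

/** Parse an integer variable, falling back when it is unset or not a number. */
function intFromEnv(env: Env, name: string, fallback: number): number {
  const parsed = Number.parseInt(env[name] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

/** Build the full configuration from an environment-like object. */
function loadConfig(env: Env): ServerConfig {
  return {
    maxParallelSearches: intFromEnv(env, "MAX_PARALLEL_SEARCHES", 5),
    searchDelayMs: intFromEnv(env, "SEARCH_DELAY_MS", 200),
    maxRetries: intFromEnv(env, "MAX_RETRIES", 3),
    timeoutMs: intFromEnv(env, "TIMEOUT_MS", 55000),
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test with a plain object.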
Error Handling
Common Issues
- Rate Limiting
  - Symptom: "Too many requests" error
  - Solution: Increase SEARCH_DELAY_MS or decrease MAX_PARALLEL_SEARCHES
- Network Timeouts
  - Symptom: "Request timed out" error
  - Solution: Keep each request within the 60-second MCP timeout, for example by lowering maxDepth or maxBranching in deep_research
- Browser Issues
  - Symptom: "Browser failed to launch" error
  - Solution: Ensure Playwright is properly installed (npx playwright install)
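Staying inside the MCP timeout usually comes down to checking a time budget before each expensive step. The sketch below shows one simple way to track such a budget. The `createDeadline` helper is hypothetical and is not taken from the server's source; the 55000 ms figure in the usage note is the documented default for TIMEOUT_MS.

```typescript
// Hypothetical time-budget helper for illustration.

interface Deadline {
  /** Milliseconds left in the budget, never below zero. */
  remainingMs(): number;
  /** True once the whole budget has been used up. */
  expired(): boolean;
}

/**
 * Start a deadline with the given budget. The clock is injectable so the
 * helper can be tested without waiting for real time to pass.
 */
function createDeadline(budgetMs: number, now: () => number = Date.now): Deadline {
  const startedAt = now();
  return {
    remainingMs: () => Math.max(0, budgetMs - (now() - startedAt)),
    expired: () => now() - startedAt >= budgetMs,
  };
}
```

A research loop could create `createDeadline(55000)` when a request starts, check `expired()` before visiting each page, and return the findings gathered so far instead of running into the 60-second limit.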
Debugging
This is beta software. If you run into issues:
- Check Claude Desktop's MCP logs:
  # On macOS
  tail -n 20 -f ~/Library/Logs/Claude/mcp*.log
  # On Windows
  Get-Content -Path "$env:APPDATA\Claude\logs\mcp*.log" -Tail 20 -Wait
- Enable debug logging:
  export LOG_LEVEL=debug
Development
Setup
# Install dependencies
pnpm install
# Build the project
pnpm build
# Watch for changes
pnpm watch
# Run in development mode
pnpm dev
Testing
# Run all tests
pnpm test
# Run tests in watch mode
pnpm test:watch
# Run tests with coverage
pnpm test:coverage
Code Quality
# Run linter
pnpm lint
# Fix linting issues
pnpm lint:fix
# Type check
pnpm type-check
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Coding Standards
- Follow TypeScript best practices
- Maintain test coverage above 80%
- Document new features and APIs
- Update CHANGELOG.md for significant changes
- Follow semantic versioning
Performance Considerations
- Use batch operations where possible
- Implement proper error handling and retries
- Consider memory usage with large datasets
- Cache results when appropriate
- Use streaming for large content
Requirements
- Node.js >= 18
- Playwright (automatically installed as a dependency)
Verified Platforms
- macOS
- Windows
- Linux
License
MIT
Credits
This project builds upon the excellent work of mcp-webresearch by mzxrai. The original codebase provided the foundation for our enhanced features and capabilities.
Author
qpd-v