nakamurau1 tts mcp

by nakamurau1

What is nakamurau1 tts mcp

tts-mcp

A Model Context Protocol (MCP) server and command-line tool for high-quality text-to-speech generation using the OpenAI TTS API.

Main Features

  • MCP Server: Integrate text-to-speech capabilities with Claude Desktop and other MCP-compatible clients
  • Voice Options: Support for multiple voice characters (alloy, nova, echo, etc.)
  • High-Quality Audio: Support for various output formats (MP3, WAV, OPUS, AAC)
  • Customizable: Configure speech speed, voice character, and additional instructions
  • CLI Tool: Also available as a command-line utility for direct text-to-speech conversion

Installation

Method 1: Install from Repository

# Clone the repository
git clone https://github.com/nakamurau1/tts-mcp.git
cd tts-mcp

# Install dependencies
npm install

# Optional: Install globally
npm install -g .

Method 2: Run Directly with npx (No Installation Required)

# Start the MCP server directly
npx -p tts-mcp tts-mcp-server --voice nova --model tts-1-hd

# Use the CLI tool directly
npx tts-mcp -t "Hello, world" -o hello.mp3

MCP Server Usage

The MCP server allows you to integrate text-to-speech functionality with Model Context Protocol (MCP) compatible clients like Claude Desktop.

Starting the MCP Server

# Start with default settings
npm run server

# Start with custom settings
npm run server -- --voice nova --model tts-1-hd

# Or directly with API key
node bin/tts-mcp-server.js --voice echo --api-key your-openai-api-key

MCP Server Options

Options:
  -V, --version          Display version information
  -m, --model <model>    TTS model to use (default: "gpt-4o-mini-tts")
  -v, --voice <voice>    Voice character (default: "alloy")
  -f, --format <format>  Audio format (default: "mp3")
  --api-key <key>        OpenAI API key (can also be set via environment variable)
  -h, --help             Display help information

Integrating with MCP Clients

The MCP server can be used with Claude Desktop and other MCP-compatible clients. For Claude Desktop integration:

  1. Open the Claude Desktop configuration file (typically at ~/Library/Application Support/Claude/claude_desktop_config.json)
  2. Add the following configuration, including your OpenAI API key:
{
  "mcpServers": {
    "tts-mcp": {
      "command": "node",
      "args": ["full/path/to/bin/tts-mcp-server.js", "--voice", "nova", "--api-key", "your-openai-api-key"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}

Alternatively, you can use npx for easier setup:

{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-p", "tts-mcp", "tts-mcp-server", "--voice", "nova", "--model", "gpt-4o-mini-tts"],
      "env": {
        "OPENAI_API_KEY": "your-openai-api-key"
      }
    }
  }
}

You can provide the API key in two ways:

  1. Direct method (recommended for testing): Include it in the args array using the --api-key parameter
  2. Environment variable method (more secure): Set it in the env object as shown above

Security Note: Make sure to secure your configuration file when including API keys.

  3. Restart Claude Desktop
  4. When you ask Claude to "read this text aloud" or make a similar request, the text is converted to speech
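
The environment variable method can also be scripted. As a minimal sketch, the function below prints the npx-based configuration shown above, taking the key from the caller's OPENAI_API_KEY instead of hard-coding it. The name print_mcp_config is hypothetical and not part of tts-mcp, and the output still has to be merged into claude_desktop_config.json by hand.

```shell
# print_mcp_config [VOICE] [MODEL]
# Hypothetical helper (not shipped with tts-mcp): print an mcpServers entry
# for the npx setup. Values are inserted as-is and are not JSON-escaped, so
# this assumes they contain no quotes or backslashes.
print_mcp_config() {
  voice="${1:-alloy}"
  model="${2:-gpt-4o-mini-tts}"
  key="${OPENAI_API_KEY:-your-openai-api-key}"   # placeholder when unset
  cat <<EOF
{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-p", "tts-mcp", "tts-mcp-server", "--voice", "$voice", "--model", "$model"],
      "env": {
        "OPENAI_API_KEY": "$key"
      }
    }
  }
}
EOF
}
```

Because the key appears only in the env object, it stays out of the args array and therefore out of the server's command line.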

Available MCP Tools

  • text-to-speech: Tool for converting text to speech and playing it

CLI Tool Usage

You can also use tts-mcp as a standalone command-line tool:

# Convert text directly
tts-mcp -t "Hello, world" -o hello.mp3

# Convert from a text file
tts-mcp -f speech.txt -o speech.mp3

# Specify custom voice
tts-mcp -t "Welcome to the future" -o welcome.mp3 -v nova
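
For more than a handful of files, the -f/-o form above can be wrapped in a loop. The following is a sketch rather than a tts-mcp feature: convert_dir and the TTS_CMD override are hypothetical names introduced here, and the loop assumes the input files end in .txt.

```shell
# convert_dir DIR [VOICE]
# Hypothetical helper (not shipped with tts-mcp): run the documented
# `tts-mcp -f <in> -o <out> -v <voice>` form once per .txt file in DIR.
# TTS_CMD is an illustrative override so the loop can be pointed at a stub.
convert_dir() {
  dir="$1"
  voice="${2:-alloy}"
  cmd="${TTS_CMD:-tts-mcp}"
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue            # the glob matched nothing
    "$cmd" -f "$f" -o "${f%.txt}.mp3" -v "$voice" || return 1
  done
}
```

Calling convert_dir ./scripts nova would write each .mp3 next to its .txt source. Pointing TTS_CMD at a stub command makes it possible to preview the calls without sending any API requests.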

CLI Tool Options

Options:
  -V, --version              Display version information
  -t, --text <text>          Text to convert
  -f, --file <path>          Path to input text file
  -o, --output <path>        Path to output audio file (required)
  -m, --model <n>            Model to use (default: "gpt-4o-mini-tts")
  -v, --voice <n>            Voice character (default: "alloy")
  -s, --speed <number>       Speech speed (0.25-4.0) (default: 1)
  --format <format>          Output format (default: "mp3")
  -i, --instructions <text>  Additional instructions for speech generation
  --api-key <key>            OpenAI API key (can also be set via environment variable)
  -h, --help                 Display help information
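
The --speed option is documented as accepting values from 0.25 to 4.0. A wrapper script can reject out-of-range values before any request is made. The check below is a sketch under that assumption; valid_speed is a hypothetical name, and tts-mcp may well perform its own validation.

```shell
# valid_speed VALUE
# Hypothetical pre-flight check (not part of tts-mcp): succeed only when
# VALUE is a plain decimal number inside the documented 0.25-4.0 range.
valid_speed() {
  awk -v s="$1" 'BEGIN {
    ok = (s ~ /^[0-9]*\.?[0-9]+$/) && (s + 0 >= 0.25) && (s + 0 <= 4.0)
    exit ok ? 0 : 1
  }'
}

# Typical use (commented out so the snippet has no side effects):
#   valid_speed "$speed" && tts-mcp -t "Hello, world" -o hello.mp3 -s "$speed"
```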

Supported Voices

The following voice characters are supported:

  • alloy (default)
  • ash
  • coral
  • echo
  • fable
  • onyx
  • nova
  • sage
  • shimmer

Supported Models

  • tts-1
  • tts-1-hd
  • gpt-4o-mini-tts (default)

Output Formats

The following output formats are supported:

  • mp3 (default)
  • opus
  • aac
  • flac
  • wav
  • pcm
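
The output path (-o) and the --format option are given separately, so a script may want to derive one from the other. Whether tts-mcp infers the format from the file extension is not stated here, so the sketch below simply maps an extension to one of the formats listed above. format_for is a hypothetical helper, not a tts-mcp feature.

```shell
# format_for PATH
# Hypothetical helper (not part of tts-mcp): print the --format value that
# matches PATH's extension, or fail if the extension is not a listed format.
format_for() {
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|opus|aac|flac|wav|pcm) printf '%s\n' "$ext" ;;
    *) echo "unsupported extension: $ext" >&2; return 1 ;;
  esac
}

# Typical use (commented out so the snippet has no side effects):
#   out=talk.wav
#   tts-mcp -t "Hello, world" -o "$out" --format "$(format_for "$out")"
```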

Environment Variables

You can also configure the tool using system environment variables:

OPENAI_API_KEY=your-api-key-here
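
In scripts it can help to fail early when the variable is missing rather than wait for the API call to be rejected. The guard below is a generic shell sketch, not something tts-mcp provides; require_api_key is a hypothetical name.

```shell
# require_api_key
# Hypothetical guard (not part of tts-mcp): return non-zero with a message
# on stderr when OPENAI_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
}

# Typical use (commented out so the snippet has no side effects):
#   export OPENAI_API_KEY=your-api-key-here
#   require_api_key && tts-mcp -t "Hello, world" -o hello.mp3
```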

License

MIT

Frequently Asked Questions

What is MCP?

MCP (Model Context Protocol) is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications, providing a standardized way to connect AI models to different data sources and tools.

What are MCP Servers?

MCP Servers are lightweight programs that expose specific capabilities through the standardized Model Context Protocol. They act as bridges between LLMs like Claude and various data sources or services, allowing secure access to files, databases, APIs, and other resources.

How do MCP Servers work?

MCP Servers follow a client-server architecture in which a host application (like Claude Desktop) connects to multiple servers. Each server exposes specific functionality through the protocol's standardized interface, which lets Claude access data and perform actions.

Are MCP Servers secure?

Yes, MCP Servers are designed with security in mind. They run locally with explicit configuration and permissions, require user approval for actions, and include built-in security features to prevent unauthorized access and ensure data privacy.
