MCP 业务落地实践

纸上得来终觉浅,绝知此事要躬行。在上篇近期LLM的一些趋势之二——MCP之后,很快发现这不只是趋势,而是已经成为行业前仆后继落地的现实。虽然本心是不想为了热点和流行而去追逐,但近期有好几个业务场景在分析和demo之后,都发现可以通过MCP以成本较低、有标准可循且可持续迭代演进的方式显著改善旧有问题的解决效果。因此,这个热点算是一拍即合。

网上关于MCP的文章很多了,但是更多是停留在MCP本身,很多其实一眼可以看出也是AI的产物。这本身是时代发展的趋势,但这些文章都缺乏如何把MCP与LLM结合落地的全局视角。更进一步的,真正落地时候需呀处理的关键细节也是当前缺失的。以前提起躬身入局是一种行动,更是一种姿态的表达,而在AI时代来临面前这可能是留给human的不多的一块自留地。下面以自己实践的过程,结合协议底层拆解下MCP的落地细节和一些关键问题思考。

MCP 协议

MCP RPC 目前发布了两个版本。都可以在官网specification中找到。
* 2024-11-05: https://modelcontextprotocol.io/specification/2024-11-05
* 2025-03-26: https://modelcontextprotocol.io/specification/2025-03-26

官方的文档整体质量很高,我推荐所有要做MCP业务落地的都花一点时间全文阅读。尤其是其中的Best Practice部分,里面有很多非常实用的建议和最佳实践。

  • 当前时间点上,生态上对2024-11-05的支持更加完善,因此生产级别上使用的话,建议先采纳这个版本的实现,兼容性上也会更好。
  • transport 层支持stdio 和 HTTP with SSE(后简称SSE) 两种模式。
  • stdio 适合在本地运行的agent client
  • HTTP with SSE 适合远程运行的agent client
  • 因此,大部分B/S模式的业务落地应该使用 SSE 作为 transport 层。因此MCP 网关是架构上很自然要建设的关键基建之一。下文我们单独讲这部分内容。
  • 两种 transport 均使用 JSON-RPC 2.0 作为消息格式进行通信。这部分内容也强烈建议阅读一下 specification, 对后面无论是落地建设 MCP Server 还是 MCP gateway 都有很大帮助。

如何把构建或将已有服务转换为 MCP Server

这两个问题之所以放在一起,是因为在SSE 作为 transport 的前提下,本质上这是同一个问题。在这种模式下的 MCP Server 是一个 支持 MCP 协议的 HTTP server,负责处理来自 client 的请求,并将结果返回给 client。MCP Server 需要实现以下几个关键点:

  • 主要处理两个端点 endpoint: SSE 连接端点 以及 消息通信端点。
  • SSE连接端点:一般实现使用的端点为 /sse, 该端点负责与client建立长连接session, 走 SSE. 这里可以做鉴权认证,同时会返回给client 消息通信的端点。
  • 消息通信端点:一般实现使用的端点为 /message 进行消息的发送和接收,走标准的 HTTP POST, 这里也可以做鉴权认证。
  • 其他详情可参考参考spec中的(Transport)[https://modelcontextprotocol.io/specification/2024-11-05/basic/transports].

作为一名gopher, 简单看了下mcp-go相关部分的实现:

SSE 端点建立连接,返回消息通信端点:

func (s *SSEServer) handleSSE(w http.ResponseWriter, r *http.Request) {
...
// Send the initial endpoint event
fmt.Fprintf(w, "event: endpoint\ndata: %s\r\n\r\n", s.GetMessageEndpointForClient(sessionID))
flusher.Flush()

// Main event loop - this runs in the HTTP handler goroutine
for {
select {
case event := <-session.eventQueue:
            // Write the event to the response
            fmt.Fprint(w, event)
            flusher.Flush()
        case <-r.Context().Done():
            close(session.done)
            return
        case <-session.done:
            return
        }
    }

注:其中SSE 消息格式可参考阮一峰老师的Server-Sent Events 教程

消息通信端点HTTP POST请求处理:


    switch baseMessage.Method {
    case mcp.MethodInitialize:
        var request mcp.InitializeRequest
        var result *mcp.InitializeResult
        if unmarshalErr := json.Unmarshal(message, &request); unmarshalErr != nil {
            err = &requestError{
                id:   baseMessage.ID,
                code: mcp.INVALID_REQUEST,
                err:  &UnparsableMessageError{message: message, err: unmarshalErr, method: baseMessage.Method},
            }
        } else {
            s.hooks.beforeInitialize(ctx, baseMessage.ID, &request)
            result, err = s.handleInitialize(ctx, baseMessage.ID, request)
        }
        if err != nil {
            s.hooks.onError(ctx, baseMessage.ID, baseMessage.Method, &request, err)
            return err.ToJSONRPCError()
        }
        s.hooks.afterInitialize(ctx, baseMessage.ID, &request, result)
        return createResponse(baseMessage.ID, *result)
    case mcp.MethodPing:
        var request mcp.PingRequest
        var result *mcp.EmptyResult
        if unmarshalErr := json.Unmarshal(message, &request); unmarshalErr != nil {
            err = &requestError{
                id:   baseMessage.ID,
                code: mcp.INVALID_REQUEST,
                err:  &UnparsableMessageError{message: message, err: unmarshalErr, method: baseMessage.Method},
            }
        } else {
            s.hooks.beforePing(ctx, baseMessage.ID, &request)
            result, err = s.handlePing(ctx, baseMessage.ID, request)
        }
        if err != nil {
            s.hooks.onError(ctx, baseMessage.ID, baseMessage.Method, &request, err)
            return err.ToJSONRPCError()
        }
        s.hooks.afterPing(ctx, baseMessage.ID, &request, result)
        return createResponse(baseMessage.ID, *result)
    case mcp.MethodResourcesList:

从上面代码可以看出,无论是初始化、接口列表发现、还是工具调用,都是通过HTTP POST请求来实现的。这里需要注意的是,MCP Server 需要同时支持长连接的SSE端点和短连接的HTTP POST端点。

但是在实际的业务系统中,不太可能把每个应用服务都按照以上方式进行改造。一个推荐的架构方式:

MCP client –> MCP server 网关 –> application server

但是从效果上看,以为直接套用这个架构思路就可以大杀四方就是一种幻想了。从实践上看,至少要做好一下节点:

  • 如何进行领域MCP Server的划分。映射到tools上,就是如何做接口的领域分层和治理。
  • tools中每个接口到应用层接口的编排
  • 每个tool的准确描述、参数说明,以及返回参数的剪裁,确保值返回必要、明确的字段。
  • 接口返回结果错误的精细处理和返回。这部分需要一些耐心,确保LLM能够尽可能的理解异常情况,持续react, 而不阻断后续推理。

如何在agent client中使用MCP

MCP Client 作为桥梁,按照协议与 MCP Server 进行通信。然后将MCP server 提供的tools, resources, prompts 提供给LLM。

实际应用中,LLM 通过prompt发现和使用mcp server:

{
"model": "deepseek-reasoner",
"messages": [
{
"role": "system",
"content": "In this environment you have access to a set of tools you can use to answer the user's question. You can use one tool per message, and will receive the result of that tool use in the user's response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use.\n\n## Tool Use Formatting\n\nTool use is formatted using XML-style tags. The tool name is enclosed in opening and closing tags, and each parameter is similarly enclosed within its own set of tags. Here's the structure:\n\n<tool_use>\n  <name>{tool_name}</name>\n  <arguments>{json_arguments}</arguments>\n</tool_use>\n\nThe tool name should be the exact name of the tool you are using, and the arguments should be a JSON object containing the parameters required by that tool. For example:\n<tool_use>\n  <name>python_interpreter</name>\n  <arguments>{\"code\": \"5 + 3 + 1294.678\"}</arguments>\n</tool_use>\n\nThe user will respond with the result of the tool use, which should be formatted as follows:\n\n<tool_use_result>\n  <name>{tool_name}</name>\n  <result>{result}</result>\n</tool_use_result>\n\nThe result should be a string, which can represent a file or any other output type. You can use this result as input for the next action.\nFor example, if the result of the tool use is an image file, you can use it in the next action like this:\n\n<tool_use>\n  <name>image_transformer</name>\n  <arguments>{\"image\": \"image_1.jpg\"}</arguments>\n</tool_use>\n\nAlways adhere to this format for the tool use to ensure proper parsing and execution.\n\n## Tool Use Examples\n\nHere are a few examples using notional tools:\n---\nUser: Generate an image of the oldest person in this document.\n\nAssistant: I can use the document_qa tool to find out who the oldest person is in the document.\n<tool_use>\n  <name>document_qa</name>\n  <arguments>{\"document\": \"document.pdf\", \"question\": \"Who is the oldest person mentioned?\"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>document_qa</name>\n  <result>John Doe, a 55 year old lumberjack living in Newfoundland.</result>\n</tool_use_result>\n\nAssistant: I can use the image_generator tool to create a portrait of John Doe.\n<tool_use>\n  <name>image_generator</name>\n  <arguments>{\"prompt\": \"A portrait of John Doe, a 55-year-old man living in Canada.\"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>image_generator</name>\n  <result>image.png</result>\n</tool_use_result>\n\nAssistant: the image is generated as image.png\n\n---\nUser: \"What is the result of the following operation: 5 + 3 + 1294.678?\"\n\nAssistant: I can use the python_interpreter tool to calculate the result of the operation.\n<tool_use>\n  <name>python_interpreter</name>\n  <arguments>{\"code\": \"5 + 3 + 1294.678\"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>python_interpreter</name>\n  <result>1302.678</result>\n</tool_use_result>\n\nAssistant: The result of the operation is 1302.678.\n\n---\nUser: \"Which city has the highest population , Guangzhou or Shanghai?\"\n\nAssistant: I can use the search tool to find the population of Guangzhou.\n<tool_use>\n  <name>search</name>\n  <arguments>{\"query\": \"Population Guangzhou\"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>search</name>\n  <result>Guangzhou has a population of 15 million inhabitants as of 2021.</result>\n</tool_use_result>\n\nAssistant: I can use the search tool to find the population of Shanghai.\n<tool_use>\n  <name>search</name>\n  <arguments>{\"query\": \"Population Shanghai\"}</arguments>\n</tool_use>\n\nUser: <tool_use_result>\n  <name>search</name>\n  <result>26 million (2019)</result>\n</tool_use_result>\nAssistant: The population of Shanghai is 26 million, while Guangzhou has a population of 15 million. Therefore, Shanghai has the highest population.\n\n\n## Tool Use Available Tools\nAbove example were using notional tools that might not exist for you. You only have access to these tools:\n<tools>\n\n<tool>\n  <name>fj5EiLHfjeWQhDMmQTXZUr</name>\n  <description>获取当前时间</description>\n  <arguments>\n    {\"type\":\"object\"}\n  </arguments>\n</tool>\n\n\n<tool>\n  <name>fq-YVmKZQ2p6aciTm5WVtX</name>\n  <description>获取技能组指标的当前值</description>\n  <arguments>\n    {\"type\":\"object\",\"properties\":{\"qualifierCodeList\":{\"description\":\"指标code列表\",\"type\":\"array\"},\"skillGroupId\":{\"description\":\"技能组ID\",\"type\":\"string\"}},\"required\":[\"skillGroupId\",\"qualifierCodeList\"]}\n  </arguments>\n</tool>\n\n\n<tool>\n  <name>fTlRF3f4Q22VGUdO0xbIxt</name>\n  <description>获取技能组指标的历史值</description>\n  <arguments>\n    {\"type\":\"object\",\"properties\":{\"beginTime\":{\"description\":\"开始时间,格式:2024-11-28 18:00:00,必须为整分钟\",\"type\":\"string\"},\"endTime\":{\"description\":\"结束时间,格式:2024-11-28 18:00:00,必须为整分钟\",\"type\":\"string\"},\"qualifierCode\":{\"description\":\"指标code\",\"type\":\"string\"},\"skillGroupId\":{\"description\":\"技能组ID\",\"type\":\"string\"}},\"required\":[\"skillGroupId\",\"qualifierCode\",\"beginTime\",\"endTime\"]}\n  </arguments>\n</tool>\n\n\n<tool>\n  <name>fU6szvcmSkfRaFBoMbptOz</name>\n  <description>获取技能组的指标列表</description>\n  <arguments>\n    {\"type\":\"object\"}\n  </arguments>\n</tool>\n\n</tools>\n\n## Tool Use Rules\nHere are the rules you should always follow to solve your task:\n1. Always use the right arguments for the tools. Never use variable names as the action arguments, use the value instead.\n2. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself.\n3. If no tool call is needed, just answer the question directly.\n4. Never re-do a tool call that you previously did with the exact same parameters.\n5. For tool use, MARK SURE use XML tag format as shown in the examples above. Do not use any other format.\n\n# User Instructions\n\n\nNow Begin! If you solve the task correctly, you will receive a reward of $1,000,000.\n"
},
{
"role": "user",
"content": "分析一下饿了么技能组 1221000000006593548 今天18点以来的平均产能趋势,用表格展示\n"
}
],
"stream": true
}

系统prompt部分格式化后如下:

In this environment you have access to a set of tools you can use to answer the user's question. You can use one tool per message, and will receive the result of that tool use in the user's response. You use tools step-by-step to accomplish a given task, with each tool use informed by the result of the previous tool use.

## Tool Use Formatting

Tool use is formatted using XML-style tags. The tool name is enclosed in opening and closing tags, and each parameter is similarly enclosed within its own set of tags. Here's the structure:

<tool_use>
<name>{tool_name}</name>
<arguments>{json_arguments}</arguments>
</tool_use>

The tool name should be the exact name of the tool you are using, and the arguments should be a JSON object containing the parameters required by that tool. For example:
<tool_use>
<name>python_interpreter</name>
<arguments>{"code": "5 + 3 + 1294.678"}</arguments>
</tool_use>

The user will respond with the result of the tool use, which should be formatted as follows:

<tool_use_result>
<name>{tool_name}</name>
<result>{result}</result>
</tool_use_result>

The result should be a string, which can represent a file or any other output type. You can use this result as input for the next action.
For example, if the result of the tool use is an image file, you can use it in the next action like this:

<tool_use>
<name>image_transformer</name>
<arguments>{"image": "image_1.jpg"}</arguments>
</tool_use>

Always adhere to this format for the tool use to ensure proper parsing and execution.

## Tool Use Examples

Here are a few examples using notional tools:
---
User: Generate an image of the oldest person in this document.

Assistant: I can use the document_qa tool to find out who the oldest person is in the document.
<tool_use>
<name>document_qa</name>
<arguments>{"document": "document.pdf", "question": "Who is the oldest person mentioned?"}</arguments>
</tool_use>

User: <tool_use_result>
<name>document_qa</name>
<result>John Doe, a 55 year old lumberjack living in Newfoundland.</result>
</tool_use_result>

Assistant: I can use the image_generator tool to create a portrait of John Doe.
<tool_use>
<name>image_generator</name>
<arguments>{"prompt": "A portrait of John Doe, a 55-year-old man living in Canada."}</arguments>
</tool_use>

User: <tool_use_result>
<name>image_generator</name>
<result>image.png</result>
</tool_use_result>

Assistant: the image is generated as image.png

---
User: "What is the result of the following operation: 5 + 3 + 1294.678?"

Assistant: I can use the python_interpreter tool to calculate the result of the operation.
<tool_use>
<name>python_interpreter</name>
<arguments>{"code": "5 + 3 + 1294.678"}</arguments>
</tool_use>

User: <tool_use_result>
<name>python_interpreter</name>
<result>1302.678</result>
</tool_use_result>

Assistant: The result of the operation is 1302.678.

---
User: "Which city has the highest population , Guangzhou or Shanghai?"

Assistant: I can use the search tool to find the population of Guangzhou.
<tool_use>
<name>search</name>
<arguments>{"query": "Population Guangzhou"}</arguments>
</tool_use>

User: <tool_use_result>
<name>search</name>
<result>Guangzhou has a population of 15 million inhabitants as of 2021.</result>
</tool_use_result>

Assistant: I can use the search tool to find the population of Shanghai.
<tool_use>
<name>search</name>
<arguments>{"query": "Population Shanghai"}</arguments>
</tool_use>

User: <tool_use_result>
<name>search</name>
<result>26 million (2019)</result>
</tool_use_result>
Assistant: The population of Shanghai is 26 million, while Guangzhou has a population of 15 million. Therefore, Shanghai has the highest population.

## Tool Use Available Tools
Above example were using notional tools that might not exist for you. You only have access to these tools:
<tools></tools>

<tool>
<name>fj5EiLHfjeWQhDMmQTXZUr</name>
<description>获取当前时间</description>
<arguments>
{"type":"object"}
</arguments>
</tool>

<tool>
<name>fq-YVmKZQ2p6aciTm5WVtX</name>
<description>获取技能组指标的当前值</description>
<arguments>
{"type":"object","properties":{"qualifierCodeList":{"description":"指标code列表","type":"array"},"skillGroupId":{"description":"技能组ID","type":"string"}},"required":["skillGroupId","qualifierCodeList"]}
</arguments>
</tool>

<tool>
<name>fTlRF3f4Q22VGUdO0xbIxt</name>
<description>获取技能组指标的历史值</description>
<arguments>
{"type":"object","properties":{"beginTime":{"description":"开始时间,格式:2024-11-28 18:00:00,必须为整分钟","type":"string"},"endTime":{"description":"结束时间,格式:2024-11-28 18:00:00,必须为整分钟","type":"string"},"qualifierCode":{"description":"指标code","type":"string"},"skillGroupId":{"description":"技能组ID","type":"string"}},"required":["skillGroupId","qualifierCode","beginTime","endTime"]}
</arguments>
</tool>

<tool>
<name>fU6szvcmSkfRaFBoMbptOz</name>
<description>获取技能组的指标列表</description>
<arguments>
{"type":"object"}
</arguments>
</tool>

## Tool Use Rules
Here are the rules you should always follow to solve your task:
1. Always use the right arguments for the tools. Never use variable names as the action arguments, use the value instead.
2. Call a tool only when needed: do not call the search agent if you do not need information, try to solve the task yourself.
3. If no tool call is needed, just answer the question directly.
4. Never re-do a tool call that you previously did with the exact same parameters.
5. For tool use, MARK SURE use XML tag format as shown in the examples above. Do not use any other format.

# User Instructions

Now Begin! If you solve the task correctly, you will receive a reward of $1,000,000.

当然最简单的方式就是直接使用支持MCP的客户端应用,比如 Cursor、Cherry Studio.

一些工具

  • MCP Server调试工具:(MCP inspector)[https://github.com/modelcontextprotocol/inspector]
  • Cherry Studio:很灵活、用户体验很好的支持MCP 的 agent client.

–EOF–

版权声明
转载请注明出处,本文原始链接