A Guide to Self-Hosting copilot-gpt4-service Behind DDNS

Posted on 2024-01-31 by daniel

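Using ChatGPT from mainland China runs into two main obstacles: payment and network access. GitHub Copilot, on the other hand, works without any such trouble, and an individual developer pays only $10 a month. Copilot is built for programming, though. Is there a way to talk to it in natural language instead? There is, and what sits underneath is ChatGPT itself. The tool that makes this possible is copilot-gpt4-service.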

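The project is a lot of fun, and its own documentation is quite detailed. I tried installing it myself and jotted down a few notes on the issues specific to my setup.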

Installing copilot-gpt4-service

Docker is the recommended route:

docker run -d \
--name copilot-gpt4-service \
--restart always \
-p 9010:8080 \
-e HOST=0.0.0.0 \
aaamoon/copilot-gpt4-service:latest

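If you are comfortable with Golang, you can also build from source. In that case you need to supply the configuration file .configenv yourself. A reference layout: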

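HOST=0.0.0.0 # Address the service listens on. Default: 0.0.0.0.
PORT=8080 # Port the service listens on. Default: 8080.
CACHE=true # Whether to enable persistence. Default: true.
CACHE_PATH=db/cache.sqlite3 # Path of the persistent cache (only used when CACHE=true). Default: db/cache.sqlite3.
DEBUG=false # Whether to enable debug mode, which produces more log output. Default: false.
LOGGING=true # Whether to enable logging. Default: true.
LOG_LEVEL=info # Log level, one of panic, fatal, error, warn, info, debug, trace (only used when LOGGING=true). Default: info.
COPILOT_TOKEN=ghp_xxxxxxx # Default GitHub Copilot token. If set, the token carried by incoming requests is ignored. Default: empty.
SUPER_TOKEN=randomtoken,randomtoken2 # Super Tokens are user-defined tokens used to authenticate requests; on success the request is served with the COPILOT_TOKEN above. Separate multiple tokens with commas. Default: empty. This lets you share the service with others without leaking COPILOT_TOKEN.
ENABLE_SUPER_TOKEN=false # Whether to enable Super Token authentication. Default: false. If it is disabled but COPILOT_TOKEN is set, every request is served with COPILOT_TOKEN without any authentication.
CORS_PROXY_NEXTCHAT=false # When enabled, the route /cors-proxy-nextchat/ acts as a proxy for NextChat. When configuring NextChat cloud sync against a local deployment, set the proxy address to http://localhost:8080/cors-proxy-nextchat/
RATE_LIMIT=0 # Requests allowed per minute; 0 means unlimited. Default: 0.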
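Once the container is running, a quick way to check it is to send it an OpenAI-style chat request. The Go sketch below only builds such a request. It assumes the service exposes the OpenAI-compatible path /v1/chat/completions and reads the token from an Authorization: Bearer header; the model name gpt-4, the helper newChatRequest, and the token value are placeholders chosen for illustration, not something taken from the project's code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// message and chatRequest mirror the OpenAI chat completion payload.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// newChatRequest builds, but does not send, a chat request for the service.
// The path and the Bearer header are assumptions based on the OpenAI API.
func newChatRequest(baseURL, token, prompt string) (*http.Request, error) {
	payload, err := json.Marshal(chatRequest{
		Model:    "gpt-4", // placeholder model name
		Messages: []message{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// 9010 is the host port published by the docker command above.
	req, err := newChatRequest("http://localhost:9010", "ghp_xxxxxxx", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// Send it with http.DefaultClient.Do(req) once the service is up.
}
```

If COPILOT_TOKEN is set on the server and Super Token authentication is off, the token sent here is ignored, as the configuration comments above describe.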

Installing a client: OpenCat

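I like using OpenCat on my phone. Note that only the version downloaded from the US App Store lets you configure OpenAI; the China-region version does not. Also, as of 2024-01-31 the premium features appear to have moved to a subscription, whereas they used to be a one-time purchase, which is a bit of a pity. I still recommend the app: even the free tier is enough, the interaction and visual design are solid, and there are no junk ads. What more could you ask for?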

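One thing to note: the OpenAI address configured in the app must use https. From a security and privacy standpoint that is a perfectly reasonable requirement. As a gopher, I naturally reach for Caddy. The only wrinkle is that I run copilot-gpt4-service on a server inside my home network, so there is a DDNS layer in between.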

Solving the https problem with Caddy

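Because of the DDNS setup, and because my domain is hosted at DNSPod, I need a Caddy build that bundles the matching DNS provider module. You can download such a build, or compile one yourself with xcaddy: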

xcaddy build --with github.com/caddy-dns/REPOSITORY

With everything in place, here is a reference Caddyfile:

{
email YOUR_EMAIL
}

YOUR_DOMAIN:PORT {
reverse_proxy localhost:9010
tls {
dns dnspod ID,KEY
}
}

Have fun!

Revisiting the Familiar: https proxy

Posted on 2022-07-31 by daniel

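On a drowsy Saturday afternoon I threw together a small tool with goproxy. The functional part took less than two hours, but when I tried to implement a custom feature in the https proxy path, debugging took quite a while. That was a real blow to the confidence of someone who claims he can take https/tls apart by hand 😝. So I took the opportunity to read the implementation.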

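In principle, https proxying always starts with a CONNECT request from the client, and every later request is a request/response exchange over the connection that CONNECT established. One sentence and you're done. Simple, right? Yes and no. Yes, because that really is the principle. No, because how that connection is used, and how the security questions around it are handled, leave plenty of details to think about. The devil is in the details, and that is also what makes this area the most interesting.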

Taking goproxy's implementation as an example, once the connection is established via CONNECT, the exchange can proceed in four main ways:

ConnectAccept
ConnectHijack
ConnectHTTPMitm
ConnectMitm

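ConnectAccept is the most basic mode. The proxy only connects the remote end and the client at the TCP level; how the connection is used is up to the client and the remote end. Every platform and every client supports this mode, so there are essentially no compatibility issues.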

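ConnectHijack and the two modes after it are essentially the same kind of thing; all of them can be regarded as Hijack types. This brings in an important concept, Hijack, which the Golang standard library explains accurately and in detail. Quoting it here: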

// Hijack lets the caller take over the connection.
// After a call to Hijack the HTTP server library
// will not do anything else with the connection.
//
// It becomes the caller's responsibility to manage
// and close the connection.
//
// The returned net.Conn may have read or write deadlines
// already set, depending on the configuration of the
// Server. It is the caller's responsibility to set
// or clear those deadlines as needed.
//
// The returned bufio.Reader may contain unprocessed buffered
// data from the client.
//
// After a call to Hijack, the original Request.Body must not
// be used. The original Request's Context remains valid and
// is not canceled until the Request's ServeHTTP method
// returns.
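To make that contract concrete, here is a minimal sketch of a plain CONNECT tunnel written with only the standard library, which is roughly the behavior described for ConnectAccept. It is not goproxy's code: the handler name tunnel and the demo function are made up for this example, and error handling is kept to a minimum. The demo routes an https request to a local test server through the tunnel.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// tunnel answers CONNECT by splicing the client and the target together at
// the TCP level. It never inspects the (normally TLS) bytes that flow through.
func tunnel(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodConnect {
		http.Error(w, "CONNECT only", http.StatusMethodNotAllowed)
		return
	}
	upstream, err := net.DialTimeout("tcp", r.Host, 5*time.Second)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	hj, ok := w.(http.Hijacker)
	if !ok {
		upstream.Close()
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	client, buf, err := hj.Hijack()
	if err != nil {
		upstream.Close()
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// From here on the connection is ours: clear inherited deadlines,
	// answer the CONNECT by hand, and close both ends when done.
	client.SetDeadline(time.Time{})
	if _, err := io.WriteString(client, "HTTP/1.1 200 Connection Established\r\n\r\n"); err != nil {
		client.Close()
		upstream.Close()
		return
	}
	go func() {
		// Read via buf.Reader: it may already hold bytes sent by the client.
		io.Copy(upstream, buf.Reader)
		upstream.Close()
	}()
	io.Copy(client, upstream)
	client.Close()
}

// demo fetches a page from a local TLS server through the tunnel.
func demo() (string, error) {
	backend := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello via tunnel")
	}))
	defer backend.Close()
	proxy := httptest.NewServer(http.HandlerFunc(tunnel))
	defer proxy.Close()

	proxyURL, err := url.Parse(proxy.URL)
	if err != nil {
		return "", err
	}
	client := backend.Client() // already trusts the test certificate
	client.Transport.(*http.Transport).Proxy = http.ProxyURL(proxyURL)

	resp, err := client.Get(backend.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demo()
	fmt.Println(body, err)
}
```

Note that the TLS handshake happens between the client and the backend; the tunnel only moves bytes, which is why this mode needs no CA certificate on the proxy.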

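In other words, compared with ConnectAccept, the proxy is no longer an invisible bystander in the middle. It can take over the connection and implement custom, interesting behavior there, which is also the foundation of MITM. The three modes differ as follows: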

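ConnectHijack requires the application to implement this logic itself at the application layer. The upside is full control, which makes many interesting features possible.
ConnectHTTPMitm downgrades https to http: every https request that passes through the proxy is sent to the remote end as plain http. The upside is that the encryption, decryption and signing cost of the TLS layer is offloaded, but there is no security at all. This old approach is therefore no longer supported by the vast majority of clients.
ConnectMitm is the classic implementation: the proxy relays the client's https request in the middle and still sends it to the remote end over https. Clients support this widely, so compatibility is good.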

A few things worth noting:

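The Hijack types all depend on CA certificates, which is why the root CA certificates on your phone and computer matter so much, and why you should not casually trust or install certificates of unknown origin.
In both ConnectHTTPMitm and ConnectMitm, goproxy calls filterRequest again to run the OnRequest handlers. Your request handler therefore has to recognize and handle these converted requests, otherwise you will see loops and failed requests.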

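That covers it from principle to implementation. None of it is particularly deep. As with many things, it is just a matter of revisiting the familiar and learning something new, and every inch of progress brings its own joy.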

Analyzing http response decompression in Golang

Posted on 2021-11-30 by daniel

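Golang's built-in http standard library has always been my first-choice http library, mainly because its support for the details of the standard is thorough and its compatibility is excellent.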

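A few days ago I ran into a problem that I did not have time to deal with. I spent a good part of an evening tracking it down, and the fix turned out to be a one-line change. Still, for someone who had not read standard library code for a while, it is worth writing down.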

Problem

When making an http request with the standard library, if Accept-Encoding is set manually in the request headers, the response body looks garbled (it is in fact binary data).

Cause

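Tracking this down was fairly roundabout. I compared several requests in packet captures before confirming that the Accept-Encoding header was the cause. And this detail has been documented all along on the DisableCompression field of http.Transport: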

// DisableCompression, if true, prevents the Transport from
// requesting compression with an "Accept-Encoding: gzip"
// request header when the Request contains no existing
// Accept-Encoding value. If the Transport requests gzip on
// its own and gets a gzipped response, it's transparently
// decoded in the Response.Body. However, if the user
// explicitly requested gzip it is not automatically
// uncompressed.
DisableCompression bool

In other words:

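If DisableCompression is true, the Transport does not add an Accept-Encoding header of its own, so the request does not ask the server for compressed content. If DisableCompression is false (the default) and the request carries no Accept-Encoding, the Transport sends a request that allows the server to return gzip-compressed content and transparently decompresses the response at the transport layer before handing it to the application. This is easy to understand: it hides the details of reading compressed content and is very convenient to use. It also matches my mental model of the Golang http standard library, where many things are handled automatically. My problem lay in the second point below.
If the user sets Accept-Encoding manually on the request, the standard library assumes the application wants to handle the response content itself, so it returns the original, still-compressed content. That is the garbled data described in the problem above.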

That description is somewhat convoluted, so let's look directly at the relevant code in transport.go.

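When the request is sent, if DisableCompression is false and the application has not set Accept-Encoding (and the request is neither a Range request nor a HEAD request), the header is set automatically: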

// Ask for a compressed version if the caller didn't set their
// own value for Accept-Encoding. We only attempt to
// uncompress the gzip stream if we were the layer that
// requested it.
requestedGzip := false
if !pc.t.DisableCompression &&
	req.Header.Get("Accept-Encoding") == "" &&
	req.Header.Get("Range") == "" &&
	req.Method != "HEAD" {
	// Request gzip only, not deflate. Deflate is ambiguous and
	// not as universally supported anyway.
	// See: https://zlib.net/zlib_faq.html#faq39
	//
	// Note that we don't request this for HEAD requests,
	// due to a bug in nginx:
	//   https://trac.nginx.org/nginx/ticket/358
	//   https://golang.org/issue/5522
	//
	// We don't request gzip if the request is for a range, since
	// auto-decoding a portion of a gzipped document will just fail
	// anyway. See https://golang.org/issue/8923
	requestedGzip = true
	req.extraHeaders().Set("Accept-Encoding", "gzip")
}

When the response is read, the content is decompressed:

resp.Body = body
if rc.addedGzip && ascii.EqualFold(resp.Header.Get("Content-Encoding"), "gzip") {
	resp.Body = &gzipReader{body: body}
	resp.Header.Del("Content-Encoding")
	resp.Header.Del("Content-Length")
	resp.ContentLength = -1
	resp.Uncompressed = true
}
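The behavior can be reproduced in a few lines. The sketch below assumes a local httptest server that gzips its response whenever the request advertises gzip; newGzipServer, fetch and gunzip are helper names invented for this example. Without a manual Accept-Encoding the Transport decodes the body and sets resp.Uncompressed; with a manual header the caller receives raw gzip bytes and has to decode them.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const payload = "hello, gzip"

// newGzipServer returns a test server that gzips its response whenever the
// request advertises gzip support.
func newGzipServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, payload)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		zw := gzip.NewWriter(w)
		defer zw.Close()
		io.WriteString(zw, payload)
	}))
}

// fetch performs a GET and returns the body exactly as the Transport
// delivered it, plus resp.Uncompressed. When manual is true the caller,
// not the Transport, sets Accept-Encoding.
func fetch(url string, manual bool) ([]byte, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	if manual {
		req.Header.Set("Accept-Encoding", "gzip")
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return body, resp.Uncompressed, err
}

// gunzip decodes a gzip-compressed byte slice by hand.
func gunzip(data []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	srv := newGzipServer()
	defer srv.Close()

	auto, decoded, err := fetch(srv.URL, false)
	if err != nil {
		panic(err)
	}
	fmt.Printf("auto:   %q uncompressed=%v\n", auto, decoded)

	raw, decoded, err := fetch(srv.URL, true)
	if err != nil {
		panic(err)
	}
	plain, err := gunzip(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("manual: %d raw bytes, uncompressed=%v, decoded by hand: %q\n", len(raw), decoded, plain)
}
```

Either dropping the manual header or decoding the body explicitly, as gunzip does here, gets rid of the garbled output.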

Summary

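Many problems really do come down to the devil in the details; at root, I had not read the basic documentation carefully. Still, the evening spent hunting for the problem was enjoyable, because writing code with a properly configured GitHub Copilot made the whole process a pleasure. At first I would type a few keywords to steer Copilot toward a suggestion. Later I simply hit Enter and waited for Copilot to write the code for me, and its output matched what I would have written myself more than 80% of the time. The known weak spot so far is that inferred string contents are sometimes inaccurate (in fact, this is also where typos and other mistakes creep in when coding by hand), so review that part of the code carefully.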

行思錄 | Travel Coder

Arch, Coding, Life
