Exploring Tunnel-Based Cross-Cluster Communication in Kubernetes

2023-02-14 | Tags: cluster, deployment, service, host, tunnel

As Kubernetes adoption grows, many organizations run multiple clusters, whether for application isolation, high availability and disaster recovery, or operational management. As a result, some applications depend on microservices running in other clusters and need to reach a pod or service in another cluster from a pod in their own. To solve this cross-cluster service-call problem, we experimented with a tunnel-based approach; let's walk through it together.

Author: 鲍盈海 (Bao Yinghai), software development engineer at China Mobile Cloud Capability Center, focusing on the cloud-native space.

Environment requirements:

| Component  | Version         | Notes |
| ---------- | --------------- | ----- |
| kubernetes | 1.21.5 or later | Two clusters are required, and at least one node in one cluster must be able to reach a node in the other cluster. |
| docker     | 18.09.5         | Used to build the images |
| go         | 1.19            | Used to develop the tunnel proxies and the mock business service |

01

Single Tunnel, Single Service


Let's start with the simple scenario: a service in cluster A accesses a service in cluster B. The architecture is as follows:

Cluster A and cluster B are connected by a single tunnel. On the left side of the tunnel sits a Service that fronts the tunnel entrance; on the right side is a business service. From cluster A on the left, we access the business service in cluster B with curl against a node IP from the host (or by service domain name from inside a container). Let's try it out, deploying the pieces from right to left.

1. First, deploy demo-service. This is the demo from the official Go tutorial (https://go.dev/doc/tutorial/web-service-gin); we package it into an image and deploy it in cluster B with Kubernetes to act as the business service (a sketch of the build and deploy commands follows the YAML). The YAML for the Deployment and Service is as follows:

---
apiVersion: v1
kind: Namespace
metadata:
  name: tunnel-proxy
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-service
  namespace: tunnel-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-service
  template:
    metadata:
      labels:
        app: demo-service
    spec:
      containers:
      - name: demo-service
        image: nexus.cmss.com:8086/cnp/tunnel/demo-service:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-console
---
apiVersion: v1
kind: Service
metadata:
  name: tunnel-proxy
  namespace: tunnel-proxy
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 31080
  selector:
    app: demo-service
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
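For reference, building and deploying the demo might look like the sketch below. The image name and registry are taken from the YAML above; the Dockerfile for the demo itself is not shown in this article, so its path here is an assumption.

# build and push the demo image (the Dockerfile path is hypothetical)
docker build -f Dockerfile.demo-service -t nexus.cmss.com:8086/cnp/tunnel/demo-service:v1.0.0 .
docker push nexus.cmss.com:8086/cnp/tunnel/demo-service:v1.0.0

# apply the manifests above on cluster B and check the result
kubectl apply -f demo-service.yaml
kubectl -n tunnel-proxy get pods,svc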

2. Create the tunnel. We use the ssh command to create an SSH tunnel by running the following on cluster B:

ssh -NR *:8079:localhost:31080 root@[cluster A tunnel-entry node IP]

Here 8079 is the port that will listen on the cluster A side, and 31080 is the port listening on the cluster B side, i.e. the NodePort exposed by the demo-service Service; root@[cluster A tunnel-entry node IP] at the end is the machine in cluster A. Note that sshd on the cluster A machine must have gateway forwarding enabled: set GatewayPorts to yes in /etc/ssh/sshd_config and restart sshd.
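A minimal sketch of that change on the cluster A tunnel-entry machine (assumes a systemd-managed sshd; if the GatewayPorts line is absent, append "GatewayPorts yes" instead of editing in place):

# allow remote-forwarded ports to bind on all interfaces instead of only loopback
sed -i -E 's/^#?GatewayPorts.*/GatewayPorts yes/' /etc/ssh/sshd_config
grep GatewayPorts /etc/ssh/sshd_config   # should now read "GatewayPorts yes"
systemctl restart sshd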

3. We also create a tunnel-proxy Service in cluster A. This is a Service without a selector, so that services inside cluster A can conveniently reach the tunnel entrance. The full YAML is as follows:

---
apiVersion: v1
kind: Namespace
metadata:
  name: tunnel-proxy
---
apiVersion: v1
kind: Service
metadata:
  name: tunnel-proxy
  namespace: tunnel-proxy
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 8079
      targetPort: 8079
      nodePort: 31079
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
---
apiVersion: v1
kind: Endpoints
metadata:
  name: tunnel-proxy
  namespace: tunnel-proxy
subsets:
  - addresses:
      - ip: [cluster A tunnel-entry node IP]
    ports:
      - port: 8079
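With the Service and Endpoints in place, a quick sanity check from cluster A might look like this (a sketch; the /albums path comes from the demo application, and the port and DNS names come from the YAML above):

# from a cluster A host, via the NodePort of the selector-less Service
curl http://[cluster A node IP]:31079/albums

# from a pod in cluster A, via the in-cluster DNS name
curl http://tunnel-proxy.tunnel-proxy.svc.cluster.local:8079/albums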

At this point, cluster A can reach the cluster B service through the tunnel. However, this setup is not yet suitable for production: for security, performance, and cost reasons, multiple services usually share one tunnel between clusters, whereas in the scheme above a single service monopolizes a tunnel, and multiple demo-services would require multiple tunnels. So we designed a single-tunnel, multi-service scheme.


02

Single Tunnel, Multiple Services



The single-tunnel, multi-service design adds a proxy at each end of the tunnel. The proxy at the left (cluster A) end listens on multiple ports, one per target service in cluster B, and passes that port information along to the proxy at the right end; the right-end proxy then forwards each request to the corresponding service in cluster B.

We define a configuration file that maps the ports the left end listens on to the services behind the right end:

{  "data": [    {      "port": "8050",       "remoteIP": "http://demo-service-1.cnp-tunnel.svc.cluster.local:8080"    },        {      "port": "8051",      "remoteIP": "http://demo-service-2.cnp-tunnel.svc.cluster.local:8080"    }  ]}

This JSON file describes two mappings: requests arriving at port 8050 on the left end in cluster A are forwarded to http://demo-service-1.cnp-tunnel.svc.cluster.local:8080 in cluster B (the biz-1 service), and likewise requests on port 8051 go to biz-2. In the implementation, when the left end receives a request it adds a header named "X-Proxy-Condition" to the request, recording which port the request arrived on; when the right end reads this header, it knows which cluster B service to forward to.


03

Key Code


The configuration logic above is shared by the sending and receiving sides of the tunnel, so the code is split into three modules: common (reads the configuration file), receive (the proxy on the right end of the tunnel), and send (the proxy on the left end). The directory structure looks roughly like this:
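The original directory listing is not reproduced here; the layout below is reconstructed from the module paths, Dockerfiles, and build script in this article, so the top-level directory name is only an assumption:

http-proxy/
├── common/
│   ├── config.go
│   └── go.mod
├── receive/
│   ├── main.go
│   └── go.mod
├── send/
│   ├── main.go
│   └── go.mod
├── Dockerfile.receive
├── Dockerfile.send
└── build.sh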

The three Go source files and their go.mod files are as follows:

common/config.go

package common

import (
    "encoding/json"
    "log"
    "os"
)

// Port is the port the send side opens at startup; cluster A distinguishes
// target services by port. The matching RemoteIP is the address of the
// corresponding service in cluster B.
type Config struct {
    Port     string `json:"port"`
    RemoteIP string `json:"remoteIp"`
}

type ConfigHelper struct {
    Data []Config `json:"data"`
}

// loadJson reads the mapping from config.json in the working directory.
func (configHelper *ConfigHelper) loadJson() error {
    jsonFile, err := os.Open("config.json")
    if err != nil {
        log.Fatalln("Cannot open config file", err)
    }
    defer jsonFile.Close()

    decoder := json.NewDecoder(jsonFile)
    err = decoder.Decode(&configHelper)
    if err != nil {
        log.Fatalln("Cannot get configuration from file", err)
        return err
    }
    return nil
}

// GetConfigIns returns the cached configuration, loading it from file on first use.
func (configHelper *ConfigHelper) GetConfigIns() ([]Config, error) {
    if configHelper.Data == nil {
        err := configHelper.loadJson()
        if err != nil {
            return nil, err
        }
    }
    log.Println(configHelper)
    return configHelper.Data, nil
}

common/go.mod

module tunnel/http-proxy/common

go 1.19

receive/main.go

package main

import (
    "errors"
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "strings"

    "tunnel/http-proxy/common"
)

const PORT = "8080"

var configHelper = &common.ConfigHelper{}

type requestPayloadStruct struct {
    ProxyCondition string `json:"proxy_condition"`
}

// Get the port to listen on
func getListenAddress() string {
    return ":" + PORT
}

// Log the address the proxy will listen on
func logSetup() {
    log.Printf("Server will run on: %s\n", getListenAddress())
}

// Log the proxy condition and redirect url
func logRequestPayload(requestPayload requestPayloadStruct, proxyUrl string) {
    log.Printf("proxy_condition: %s, proxy_url: %s\n", requestPayload.ProxyCondition, proxyUrl)
}

// Get the target url for a given proxy condition
func getProxyUrl(proxyConditionRaw string) (string, error) {
    proxyCondition := strings.ToUpper(proxyConditionRaw)

    configIns, err := configHelper.GetConfigIns()
    if err != nil {
        log.Println("proxy config is nil", err)
        return "", errors.New("not match config")
    }
    for i := 0; i < len(configIns); i++ {
        if configIns[i].Port == proxyCondition {
            return configIns[i].RemoteIP, nil
        }
    }
    return "", errors.New("not match config")
}

// Serve a reverse proxy for a given url
func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {
    url, _ := url.Parse(target)
    proxy := httputil.NewSingleHostReverseProxy(url)
    proxy.ServeHTTP(res, req)
}

// Given a request, send it to the appropriate url
func handleRequestAndRedirect(res http.ResponseWriter, req *http.Request) {
    requestPayload := requestPayloadStruct{
        ProxyCondition: req.Header.Get("X-Proxy-Condition"),
    }

    url, err := getProxyUrl(requestPayload.ProxyCondition)
    if err != nil {
        // do not log.Fatalln here: a single bad header must not kill the proxy
        log.Println("proxy url error:", err)
        http.Error(res, "no matching proxy config", http.StatusBadGateway)
        return
    }

    logRequestPayload(requestPayload, url)
    serveReverseProxy(url, res, req)
}

func main() {
    // Log setup values
    logSetup()

    // start server
    http.HandleFunc("/", handleRequestAndRedirect)
    if err := http.ListenAndServe(getListenAddress(), nil); err != nil {
        panic(err)
    }
}

receive/go.mod

module tunnel/http-proxy/receive

go 1.19

replace tunnel/http-proxy/common => ../common

require tunnel/http-proxy/common v0.0.0-00010101000000-000000000000

send/main.go

package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"

    "tunnel/http-proxy/common"
)

const TUNNEL_ENTER = "http://tunnel-proxy.cnp-tunnel.svc.cluster.local:8079"

var configHelper = &common.ConfigHelper{}

// Get the address to listen on for a given port
func getListenAddress(port string) string {
    return ":" + port
}

// Log the listen address for one proxy instance
func logSetup(configIns common.Config) {
    log.Printf("Server will run on: %s\n", getListenAddress(configIns.Port))
}

// Log the proxy condition and redirect url
func logRequestPayload(condition string, proxyUrl string) {
    log.Printf("proxy_condition: %s, proxy_url: %s\n", condition, proxyUrl)
}

// Serve a reverse proxy for a given url
func serveReverseProxy(target string, res http.ResponseWriter, req *http.Request) {
    // parse the url
    url, _ := url.Parse(target)

    // create the reverse proxy
    proxy := httputil.NewSingleHostReverseProxy(url)

    // forward the request; ServeHTTP returns once the response has been written
    proxy.ServeHTTP(res, req)
}

// Tag the request with the local port it arrived on and send it into the tunnel
func handleRequestAndRedirect(res http.ResponseWriter, req *http.Request, condition string) {
    req.Header.Add("X-Proxy-Condition", condition)

    url := TUNNEL_ENTER

    logRequestPayload(condition, url)
    serveReverseProxy(url, res, req)
}

func main() {
    configIns, err := configHelper.GetConfigIns()
    if err != nil {
        log.Fatalln("proxy config is nil", err)
        return
    }

    // start one HTTP server per configured port
    for i := 0; i < len(configIns); i++ {
        // Log setup values
        logSetup(configIns[i])

        port := configIns[i].Port
        httpMux := http.NewServeMux()
        httpMux.HandleFunc("/", func(res http.ResponseWriter, req *http.Request) {
            handleRequestAndRedirect(res, req, port)
        })
        server := &http.Server{
            Addr:    getListenAddress(port),
            Handler: httpMux,
        }
        go server.ListenAndServe()
    }

    select {}
}

send/go.mod

module tunnel/http-proxy/send

go 1.19

replace tunnel/http-proxy/common => ../common

require tunnel/http-proxy/common v0.0.0-00010101000000-000000000000

The Dockerfiles for building tunnel-receive and tunnel-send are as follows:

Dockerfile.receive

FROM golang:1.19.5-alpine

WORKDIR /opt
ADD . /opt

# set the Go module proxy
RUN go env -w GO111MODULE=on
RUN go env -w GOPROXY=https://goproxy.io,direct

WORKDIR /opt/receive
RUN go build -o main ./main.go

EXPOSE 8080
CMD ["/opt/receive/main"]

Dockerfile.send

FROM golang:1.19.5-alpine

WORKDIR /opt
ADD . /opt

# set the Go module proxy
RUN go env -w GO111MODULE=on
RUN go env -w GOPROXY=https://goproxy.io,direct

WORKDIR /opt/send
RUN go build -o main ./main.go

EXPOSE 8080
CMD ["/opt/send/main"]

The image build script can look like the following:

#!/bin/bashecho "building send\n" docker build -f Dockerfile.send -t nexus.cmss.com:8086/cnp/tunnel/send:v1.0.0 .if [ $? -ne  ]; then  echo "build send failed\n"  exit 1fi docker push nexus.cmss.com:8086/cnp/tunnel/send:v1..0if [ $? -ne  ]; then  echo "push send failed\n"  exit 1fiecho "build send success\n"echo "building receive\n" docker build -f Dockerfile.receive -t nexus.cmss.com:8086/cnp/tunnel/receive:v1.0.0 .if [ $? -ne  ]; then  echo "build receive failed\n"  exit 1fi docker push nexus.cmss.com:8086/cnp/tunnel/receive:v1.0.0if [ $? -ne 0 ]; then  echo "push receive failed\n"  exit 1fiecho "build receive success\n"
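Assuming the script is saved as build.sh at the project root (the file name is an assumption), it can be run after authenticating to the registry:

docker login nexus.cmss.com:8086
bash build.sh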

In the end we have two images: nexus.cmss.com:8086/cnp/tunnel/receive:v1.0.0 and nexus.cmss.com:8086/cnp/tunnel/send:v1.0.0.


04

Deployment Walkthrough


Now let's deploy everything, again working from right to left:

1. We again use the demo from the official Go tutorial (https://go.dev/doc/tutorial/web-service-gin) as demo-service; the difference is that this time we create two of them. The YAML is as follows:

---
apiVersion: v1
kind: Namespace
metadata:
  name: cnp-tunnel
# biz-1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-service-1
  namespace: cnp-tunnel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-service-1
  template:
    metadata:
      labels:
        app: demo-service-1
    spec:
      containers:
      - name: demo-service-1
        image: nexus.cmss.com:8086/cnp/tunnel/demo-service:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-console
---
apiVersion: v1
kind: Service
metadata:
  name: demo-service-1
  namespace: cnp-tunnel
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 31050
  selector:
    app: demo-service-1
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
# biz-2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-service-2
  namespace: cnp-tunnel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-service-2
  template:
    metadata:
      labels:
        app: demo-service-2
    spec:
      containers:
      - name: demo-service-2
        image: nexus.cmss.com:8086/cnp/tunnel/demo-service:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http-console
---
apiVersion: v1
kind: Service
metadata:
  name: demo-service-2
  namespace: cnp-tunnel
spec:
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      nodePort: 31051
  selector:
    app: demo-service-2
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack

Next we need to create a little data so the two services can be told apart. The demo supports creating records; the commands are:

# biz-1
curl 'http://[cluster B node IP]:31050/albums' \
  -H 'content-type: application/json' \
  --data-raw '{"id":"4","title": "8050-add", "artist":"biz-1", "price": 100}' \
  --compressed \
  --insecure

# biz-2
curl 'http://[cluster B node IP]:31051/albums' \
  -H 'content-type: application/json' \
  --data-raw '{"id":"4","title": "8051-add", "artist":"biz-2", "price": 99}' \
  --compressed \
  --insecure

Then open http://[cluster B node IP]:31050/albums and http://[cluster B node IP]:31051/albums in a browser to check that the inserted data is there.

2. Deploy the tunnel-receive service, i.e. the receiving end of the tunnel. The namespace was already created in step 1. The YAML is as follows:

---
kind: ConfigMap
apiVersion: v1
metadata:
  name: tunnel-config
  namespace: cnp-tunnel
data:
  config.json: |
    {
      "data": [
        {
          "port": "8050",
          "remoteIP": "http://demo-service-1.cnp-tunnel.svc.cluster.local:8080"
        },
        {
          "port": "8051",
          "remoteIP": "http://demo-service-2.cnp-tunnel.svc.cluster.local:8080"
        }
      ]
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tunnel-receive
  namespace: cnp-tunnel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tunnel-receive
  template:
    metadata:
      labels:
        app: tunnel-receive
    spec:
      containers:
      - name: tunnel-receive
        image: nexus.cmss.com:8086/cnp/tunnel/receive:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: tunnel
        volumeMounts:
        - mountPath: /opt/receive/config.json
          name: tunnel-config
          subPath: config.json
      volumes:
      - name: tunnel-config
        configMap:
          name: tunnel-config
---
apiVersion: v1
kind: Service
metadata:
  name: tunnel-receive
  namespace: cnp-tunnel
spec:
  type: NodePort
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
    nodePort: 31080
  selector:
    app: tunnel-receive
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
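Once the pod is running, you can confirm that the ConfigMap landed where the binary expects it (a sketch; the path matches the volumeMounts above):

kubectl -n cnp-tunnel exec deploy/tunnel-receive -- cat /opt/receive/config.json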

After the service is deployed, run the following commands on cluster B to check that the receive service works; note that the X-Proxy-Condition header is set explicitly here.

curl 'http://[cluster B node IP]:31080/albums' \
  -H 'X-Proxy-Condition: 8050' \
  --compressed

curl 'http://[cluster B node IP]:31080/albums' \
  -H 'X-Proxy-Condition: 8051' \
  --compressed

3. Create the tunnel, the same as in the single-tunnel, single-service case, by running:

ssh -NR *:8079:localhost:31080 root@[cluster A tunnel-entry node IP]

Run the following on cluster A to check that the tunnel has been created successfully:

curl 'http://[cluster A tunnel-entry node IP]:8079/albums' \
  -H 'X-Proxy-Condition: 8050' \
  --compressed

curl 'http://[cluster A tunnel-entry node IP]:8079/albums' \
  -H 'X-Proxy-Condition: 8051' \
  --compressed

Likewise, we need a selector-less Service to front the left end of the tunnel. The YAML is as follows:

---
apiVersion: v1
kind: Service
metadata:
  name: tunnel-proxy
  namespace: cnp-tunnel
spec:
  # type: NodePort
  ports:
    - protocol: TCP
      port: 8079
      targetPort: 8079
      # nodePort: 31079
  # externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
---
apiVersion: v1
kind: Endpoints
metadata:
  name: tunnel-proxy
  namespace: cnp-tunnel
subsets:
  - addresses:
      - ip: 100.76.11.99   # IP of the tunnel-entry node in cluster A
    ports:
      - port: 8079
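Before deploying tunnel-send, you can check that pods in cluster A already reach the tunnel through this selector-less Service (a sketch; the test image and kubectl flags are illustrative):

kubectl -n cnp-tunnel run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -s -H 'X-Proxy-Condition: 8050' http://tunnel-proxy.cnp-tunnel.svc.cluster.local:8079/albums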

4. Deploy the tunnel-send service, i.e. the sending end of the tunnel. The YAML is as follows:

---
apiVersion: v1
kind: Namespace
metadata:
  name: cnp-tunnel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: tunnel-config
  namespace: cnp-tunnel
data:
  config.json: |
    {
      "data": [
        {
          "port": "8050",
          "remoteIP": "http://localhost:31050"
        },
        {
          "port": "8051",
          "remoteIP": "http://localhost:31051"
        }
      ]
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tunnel-send
  namespace: cnp-tunnel
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tunnel-send
  template:
    metadata:
      labels:
        app: tunnel-send
    spec:
      containers:
      - name: tunnel-send
        image: nexus.cmss.com:8086/cnp/tunnel/send:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8050
          name: proxy-server-1
        - containerPort: 8051
          name: proxy-server-2
        volumeMounts:
        - mountPath: /opt/send/config.json
          name: tunnel-config
          subPath: config.json
      volumes:
      - name: tunnel-config
        configMap:
          name: tunnel-config
---
apiVersion: v1
kind: Service
metadata:
  name: tunnel-send
  namespace: cnp-tunnel
spec:
  type: NodePort
  ports:
  - protocol: TCP
    port: 8050
    targetPort: 8050
    nodePort: 31050
    name: proxy-server-1
  - protocol: TCP
    port: 8051
    targetPort: 8051
    nodePort: 31051
    name: proxy-server-2
  selector:
    app: tunnel-send
  externalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack

With that, everything is deployed. Test it with the following commands:

curl http://[cluster A node IP]:31050/albums
curl http://[cluster A node IP]:31051/albums

Or open the same addresses in a browser; the responses should match the data created for biz-1 and biz-2 in step 1.


05

Closing Remarks


We have now implemented cross-cluster access over an SSH tunnel, but this is still only a demo. For production use, the communication path also needs to be made stable and reliable, for example by adding heartbeats to the tunnel and load-balancing across multiple tunnels; one simple hardening option is sketched below. There are also open-source projects in this space, such as Submariner (https://submariner.io/), that provide secure cross-cluster application access out of the box and are worth exploring further.
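As one example of the heartbeat idea, the bare ssh command can be wrapped with autossh plus SSH keepalive options so the tunnel is re-established automatically when it drops (a sketch; autossh must be installed on the cluster B machine and is not part of the setup described above):

# -M 0 disables autossh's extra monitor port and relies on SSH's own keepalives;
# ServerAliveInterval/CountMax detect a dead link, and ExitOnForwardFailure makes
# ssh exit (and autossh restart it) if the remote forward cannot be established.
autossh -M 0 \
  -o "ServerAliveInterval 30" \
  -o "ServerAliveCountMax 3" \
  -o "ExitOnForwardFailure yes" \
  -NR *:8079:localhost:31080 root@[cluster A tunnel-entry node IP]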

