Traefik is forwarding requests to the wrong host for a service

Question

I've hit a snag and hope some experts here can help.

We are trying to deploy a new service, but Traefik is returning 404. The root cause seems to be that it is holding on to a stale host IP for the service. We may have missed something silly...

Our setup is an ECS cluster fronted by Traefik as the ingress/reverse proxy. The new service is deployed with the following Docker labels:

"dockerLabels": {
    "traefik.enable": "true",
    "traefik.frontend.entryPoints": "http",
    "traefik.frontend.passHostHeader": "true",
    "traefik.frontend.priority": "6",
    "traefik.frontend.rule": "Headers: CS-Forwarded-Host,report-api.dev.some-domain.com"
}

The ECS cluster has 2 nodes with services deployed across them, and Traefik runs on both. When the service is updated, the Traefik logs show that it receives the updated configuration accordingly:

Configuration received from provider ecs:

"backend-service-reporting-public-api-development-python": {
    "servers": {
        "server-service-reporting-public-api-development-python-4405ef23c5c6": {
            "url": "http://172.30.3.147:57138",
            "weight": 1
        }
    },
    "loadBalancer": {
        "method": "wrr"
    }
},

...

"frontend-service-reporting-public-api-development-python": {
    "entryPoints": [
        "http"
    ],
    "backend": "backend-service-reporting-public-api-development-python",
    "routes": {
        "route-frontend-service-reporting-public-api-development-python": {
            "rule": "Headers: CS-Forwarded-Host,report-api.dev.some-domain.com"
        }
    },
    "passHostHeader": true,
    "priority": 6,
    "basicAuth": []
},

We verified this by logging on to the ECS instance and running curl http://172.30.3.147:57138/swagger/doc, which works fine.

However, when hitting the endpoint from outside we always get 404. The following Traefik log shows it forwarding requests to the wrong host IP/port, possibly a stale one:

vulcand/oxy/forward: completed ServeHttp on request" Request="
{
    "Method": "GET",
    "URL": {
        "Scheme": "http",
        "Opaque": "",
        "User": null,
        "Host": "172.30.3.147:49181",
        "Path": "",
        "RawPath": "",
        "ForceQuery": false,
        "RawQuery": "",
        "Fragment": ""
    },
    "Proto": "HTTP/1.0",
    "ProtoMajor": 1,
    "ProtoMinor": 0,
    "Header": {
        ...
        "Cs-Forwarded-Host": [
            "report-api.dev.some-domain.com"
        ],
        ...
    }
}"

So instead of forwarding to the latest URL for backend-service-reporting-public-api-development-python, 172.30.3.147:57138, it forwards traffic to 172.30.3.147:49181. Note that the IP is the same but the port is different. This obviously results in a 404. But why is Traefik not forwarding requests to the correct URL? I couldn't find a way to force Traefik to refresh its config, and as I understand it this should all be dynamic anyway. What am I missing here?
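
One way to narrow down a mismatch like this is to look up which configured backend actually owns the address seen in the forward log. The following is only a sketch: it assumes the provider configuration dump has been saved as JSON with a top-level "backends" key, and the second backend ("backend-other-service") and its server name are hypothetical stand-ins, since the full dump is not shown here.

```python
import json
from urllib.parse import urlsplit

# Trimmed-down stand-in for the "Configuration received from provider ecs" dump.
# Only the first backend and the two addresses come from the logs above;
# "backend-other-service" and its server name are hypothetical.
config_dump = json.loads("""
{
  "backends": {
    "backend-service-reporting-public-api-development-python": {
      "servers": {
        "server-service-reporting-public-api-development-python-4405ef23c5c6": {
          "url": "http://172.30.3.147:57138",
          "weight": 1
        }
      }
    },
    "backend-other-service": {
      "servers": {
        "server-other-service-1": {
          "url": "http://172.30.3.147:49181",
          "weight": 1
        }
      }
    }
  }
}
""")


def backends_for(host_port, config):
    """Return the names of all backends with a server at host_port."""
    owners = []
    for name, backend in config["backends"].items():
        for server in backend.get("servers", {}).values():
            # netloc is the "host:port" part of the server URL.
            if urlsplit(server["url"]).netloc == host_port:
                owners.append(name)
    return owners


# Address taken from the "Host" field of the forward log.
print(backends_for("172.30.3.147:49181", config_dump))
```

If the address resolves to a different backend than the one expected, the request was probably matched by another frontend's rule rather than sent to a stale server.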

The Traefik version is v1.6.6. I know it's ancient; an upgrade is on our radar and will hopefully happen soon.

Please help. Thanks in advance.

Answer 1

Score: 0

I finally solved this mystery and thought I'd share my findings here in case they help anyone.

Another service in the ECS cluster (call it service x) has this Traefik rule set:

"traefik.frontend.rule": "HeadersRegexp: CS-Forwarded-Host,api(-development)?.dev.some-domain.com;PathPrefix:/v1,/v2"

The HeadersRegexp is the culprit. Its wide-open matching criteria affected the reporting public API service, whose custom domain is report-api.dev.some-domain.com. Because /api(-development)?.dev.some-domain.com/g matches report-api.dev.some-domain.com, requests meant for the reporting public API service were hijacked by service x. Service x has no handler for those requests, hence the 404. The remedy is to restrict the regex to ^api(-development)?.dev.some-domain.com.
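
To illustrate the difference, here is a minimal Python sketch. It assumes the HeadersRegexp pattern is applied as an unanchored search, which is modeled here with re.search rather than Traefik's own Go code. The "strict" pattern (escaped dots plus an end anchor) and the "apixdev" test value are illustrative additions, not part of the original rule.

```python
import re

# Header value sent for the reporting service (from the question).
report_host = "report-api.dev.some-domain.com"

# Pattern from service x's rule, as originally written.
loose = re.compile(r"api(-development)?.dev.some-domain.com")

# Anchored at the start, as in the fix described above.
anchored = re.compile(r"^api(-development)?.dev.some-domain.com")

# Stricter variant (an assumption, not from the original fix):
# dots escaped so they only match a literal ".", and the end anchored.
strict = re.compile(r"^api(-development)?\.dev\.some-domain\.com$")


def matches(pattern, value):
    """Unanchored search: true if the pattern occurs anywhere in value."""
    return pattern.search(value) is not None


# The loose pattern finds "api.dev.some-domain.com" inside "report-api...".
print(matches(loose, report_host))     # True
# With the start anchor, the reporting host no longer matches.
print(matches(anchored, report_host))  # False
# Service x's own hosts still match the anchored pattern.
print(matches(anchored, "api.dev.some-domain.com"))              # True
print(matches(anchored, "api-development.dev.some-domain.com"))  # True
# An unescaped "." matches any character, so this slips through too.
print(matches(anchored, "apixdev.some-domain.com"))  # True
print(matches(strict, "apixdev.some-domain.com"))    # False
```

Anchoring the start is enough to stop the collision described here; escaping the dots is an optional tightening.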

Quite unfortunate, really. Hope this helps someone at some point.

huangapple
  • Posted on June 22, 2023, 19:29:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76531420.html