April 20, 2023, 07:03:00

ARGOCD ssh: handshake failed: read tcp 10.#.3.21:36808->20.#.#.#:22: read: connection reset by peer and failed to get git client for repo

Question

I created an Argo CD Application with two sources. It reports a successful sync status, but every few seconds it starts getting these errors:

ssh: handshake failed: read tcp 10.254.3.21:36808->20.41.6.26:22: read: connection reset by peer
and failed to get git client for repo

Any suggestions?

project: default
destination:
  server: 'https://kubernetes.default.svc'
  namespace: akv2k8s
syncPolicy:
  automated:
    prune: true
    selfHeal: true
sources:
  - repoURL: 'http://charts.spvapi.no'
    targetRevision: 2.3.2
    helm:
      valueFiles:
        - $values/charts/akv2k8s.yaml
    chart: akv2k8s
  - repoURL: 'git@ssh.##.azure.com:v3/####'
    targetRevision: helm_chart_test
    ref: values

I have already added a repo-cred secret with the SSH key, and it works fine when I use just one repo as the source.

Answer 1

Score: 0

It turns out the root cause is concurrency in the function LsRemote:

func (m *nativeGitClient) LsRemote(revision string) (res string, err error) {
	for attempt := 0; attempt < maxAttemptsCount; attempt++ {
		res, err = m.lsRemote(revision)
		if err == nil {
			return
		} else if apierrors.IsInternalError(err) || apierrors.IsTimeout(err) || apierrors.IsServerTimeout(err) ||
			apierrors.IsTooManyRequests(err) || utilnet.IsProbableEOF(err) || utilnet.IsConnectionReset(err) {
			// Formula: timeToWait = duration * factor^retry_number
			// Note that timeToWait should equal to duration for the first retry attempt.
			// When timeToWait is more than maxDuration retry should be performed at maxDuration.
			timeToWait := float64(retryDuration) * (math.Pow(float64(factor), float64(attempt)))
			if maxRetryDuration > 0 {
				timeToWait = math.Min(float64(maxRetryDuration), timeToWait)
			}
			time.Sleep(time.Duration(timeToWait))
		}
	}
	return
}

It seems that when two requests hit the repo concurrently, one of them fails. This behavior does not start immediately, though: a number of concurrent requests succeed first.

So this looks very much like deliberate throttling by Azure DevOps.

For now, the workaround is to raise maxAttemptsCount from its default of 1 to 50 by setting the ARGOCD_GIT_ATTEMPTS_COUNT environment variable.

I observed the retry count rise to 12 before a request finally succeeded. I still need to check whether this throttling can be controlled. If not, maybe this Argo CD code could be improved. For example, randomizing the pause between retries may yield better results.
