Connector timeouts using EKS

Hello,

In our company, we started using Twingate in our CI/CD to connect to our ArgoCD instance, but we’ve encountered a high rate of errors from the Twingate connectors - mostly timeouts.

We have a very basic EKS setup, with CoreDNS and the connectors installed using the Twingate Helm chart.
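
For context, the connectors were installed roughly like this (the repo URL and value names are from memory, so treat them as approximations; the network name comes from our oxla.twingate.com tenant and the tokens are the ones generated in the Control Panel):

helm repo add twingate https://twingate.github.io/helm-charts
helm repo update

# one Helm release per connector; we run a second release the same way
helm upgrade --install twingate-primary twingate/connector \
  --namespace twingate --create-namespace \
  --set connector.network=oxla \
  --set connector.accessToken=<ACCESS_TOKEN> \
  --set connector.refreshToken=<REFRESH_TOKEN>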

In the Network section of the Control Panel, the error is shown like this:

xx@xx requested <IP>
Relay patched connection
Connector Received request
Failed to connect to <IP>
<IP> could not be reached

Due to the 5-10% error rate, we removed it from our CI/CD pipelines.
Errors still occur even with a single user - I hit a timeout every ~10 tries.

On EKS we have 2 pods running the connectors (nothing suspicious in the logs, CPU usage is normal).
CoreDNS also doesn’t show any errors (CPU usage is normal).
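
For reference, this is roughly how I’ve been checking both sides (the namespace and deployment names below are what our Helm release generated, so they may differ for you; kubectl top needs metrics-server):

# connector logs (TWINGATE_LOG_LEVEL is set to 7 in the deployment below, so they are verbose)
kubectl -n twingate logs deploy/twingate-primary-connector --tail=200

# CoreDNS logs and CPU usage (k8s-app=kube-dns is the default CoreDNS label on EKS)
kubectl -n kube-system logs deploy/coredns --tail=200
kubectl -n kube-system top pods -l k8s-app=kube-dns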

The error occurs for:

  • internal domains
  • plain private IPs
  • AWS-generated domains

I’m wondering if there are any guidelines on how to set this up properly with EKS, or any known issues that are common with AWS.
Also, any ideas on where to look for the source of these errors would be very useful.
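
In case it helps with debugging, these are the kinds of checks I can run from inside the cluster for the three resource types above (the hostname, IP and images are just placeholders):

# DNS resolution for an internal / AWS-generated domain
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup argocd.internal.example.com

# TCP reachability for a plain private IP (netshoot ships nc, dig, curl, ...)
kubectl run net-test --rm -it --restart=Never --image=nicolaka/netshoot -- \
  nc -vz -w 5 10.0.12.34 443

If these succeed consistently while the Twingate path still times out, that points at the connector/relay side rather than plain VPC networking.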

K8s deployment:

spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: twingate-primary
      app.kubernetes.io/name: connector
  template:
    metadata:
      labels:
        app.kubernetes.io/instance: twingate-primary
        app.kubernetes.io/name: connector
    spec:
      containers:
        - env:
            - name: TWINGATE_LABEL_DEPLOYED_BY
              value: helm
            - name: TWINGATE_LABEL_HELM_CHART
              value: connector-0.1.23
            - name: TWINGATE_URL
              value: 'https://oxla.twingate.com'
            - name: TWINGATE_LOG_LEVEL
              value: '7'
          envFrom:
            - secretRef:
                name: twingate-primary-connector-credentials
                optional: false
          image: 'twingate/connector:1'
          imagePullPolicy: Always
          name: connector
          resources:
            requests:
              cpu: 50m
              memory: 200Mi
          securityContext:
            allowPrivilegeEscalation: false

EKS version: 1.27

Update: found that the problem is with resources exposed by the nginx ingress when they end up on the same node as the connector.
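
In case anyone else runs into this: one possible mitigation (just a sketch - the ingress-nginx label and namespace are assumptions about a default install, and the chart may expose affinity via values) is to keep the connector pods off the nodes that run the ingress controller, using pod anti-affinity on the connector Deployment:

spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: ingress-nginx  # assumed default ingress-nginx pod label
              namespaces:
                - ingress-nginx                          # assumed ingress controller namespace
              topologyKey: kubernetes.io/hostname        # never co-schedule on the same node

With required anti-affinity the connector pods simply won’t schedule if every node runs an ingress pod, so preferredDuringSchedulingIgnoredDuringExecution may be the safer choice on small clusters.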