Linux systemd connector going offline

Hi all

We’ve got a few connectors deployed in different parts of our network, but we’re having issues with a couple of them in particular.

They’re running in Ubuntu LXC containers on Proxmox VE and go offline at random, with no obvious cause, after working for 1-2 days at a time. Restarting the twingate-connector service brings them back straight away. A health check on the service just before the restart comes back OK.

We have other connectors running on Linux that haven’t had any issues. The affected ones are the only ones running in LXC on Proxmox, although I can’t see why that should make any difference.

Not sure where to start with troubleshooting this one. Has anyone else experienced this?

TIA

@lee.fishlock,

Are the containers themselves going offline, or just the service within the container? I’m running the same setup for my primary Connector and haven’t had any issues like that with it. Are you seeing anything in syslog whenever it goes down?

One thing you could try is increasing the log level by running twingate-connector --log-level=7 on that container while the Connector is active. Let it run until it stops, then check syslog to see what was going on at the time. That should give a lot more information about what is causing the service to stop if nothing useful shows up in systemctl status twingate-connector.
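If the logs turn out to be noisy at the higher log level, a small filter can help narrow things down. This is only a sketch: it assumes the service's output ends up in the systemd journal, and the keyword list is a guess rather than the Connector's actual log vocabulary. The function name filter_connector_log is made up for illustration.

```shell
#!/usr/bin/env bash
# Keep only log lines that hint at connection trouble.
# ASSUMPTION: the keywords below are guesses, not the Connector's real
# message wording; adjust them after looking at a few raw log lines.
filter_connector_log() {
    grep -iE 'error|warn|timeout|timed out|disconnect|reconnect|closed'
}

# Example use (assumes the unit logs to the systemd journal):
#   journalctl -u twingate-connector --since "2 hours ago" --no-pager | filter_connector_log
```

Running the journalctl line shortly after a Connector drops off should show whether anything was logged around the time it stopped checking in.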

Thanks for the response. Having dug a little more when it happens, it appears that neither the container nor the service is actually stopping. When it stops checking in to Twingate, I can still see the service in an active state, and the container itself can still resolve and contact the appropriate IPs/hostnames. It just stops reporting in. After a simple restart of the service it checks in right away.

One main difference I didn’t mention before is that these connectors rely exclusively on a 4G service for internet. That service isn’t going down, but I wonder if some sort of session timeout is affecting the communication channel, and a service restart forces a new connection to the TG servers.

It’s totally possible that’s what’s happening, though it might be hard to test conclusively. One option is a simple cron job that restarts the service on a routine basis. Even better would be a bash script that runs a simple test and restarts the service only if that test fails.
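The test-and-restart idea could be sketched roughly like this. It is a starting point under stated assumptions, not a tested fix: the state file path, the failure limit and the function names are all made up for illustration, and check_connector is a placeholder. Since the service stays active when the problem occurs, systemctl is-active on its own would not catch it, so the check needs replacing with something that really detects the Connector no longer reporting in.

```shell
#!/usr/bin/env bash
# Watchdog sketch: restart the connector after repeated failed checks.
# ASSUMPTIONS (not from the thread): state file path, failure limit and
# the health test itself are placeholders to adapt.

SERVICE="${SERVICE:-twingate-connector}"
FAIL_LIMIT="${FAIL_LIMIT:-3}"
STATE_FILE="${STATE_FILE:-/run/connector-watchdog.fails}"

# PLACEHOLDER test: only confirms the unit is active. Swap in a check
# that fails when the Connector has stopped reporting in.
check_connector() {
    systemctl is-active --quiet "$SERVICE"
}

restart_connector() {
    systemctl restart "$SERVICE"
}

# One pass: reset the counter on success, otherwise count the failure
# and restart once FAIL_LIMIT consecutive failures have been seen.
watchdog_once() {
    fails=0
    if [ -r "$STATE_FILE" ]; then
        fails=$(cat "$STATE_FILE")
    fi
    if check_connector; then
        echo 0 > "$STATE_FILE"
        return 0
    fi
    fails=$(( ${fails:-0} + 1 ))
    if [ "$fails" -ge "$FAIL_LIMIT" ]; then
        restart_connector
        fails=0
    fi
    echo "$fails" > "$STATE_FILE"
}

# Only act when called with --run, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    watchdog_once
fi
```

Saved as, say, /usr/local/bin/connector-watchdog.sh (a hypothetical path) and made executable, it could be run from root's crontab with a line such as: */5 * * * * /usr/local/bin/connector-watchdog.sh --run. Requiring a few consecutive failures before restarting avoids bouncing the service on a single blip, which seems worth having on a 4G link.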