Troubleshooting Load Balancer Issues - Fluid Topics

Category
How To
Audience
public

Load Balancer

The load balancer ensures high availability and service continuity, even if a server fails. In our case, only the European machines have duplicated instances.

HAProxy

Initially, we used a physical load balancer, but we migrated to a virtual solution with HAProxy, which has proven effective for managing virtual IPs.

HAProxy health can be checked at the following address:

http://haproxy01.infra.prod.vtr.antidot.net:8404/

AWS

With the arrival of GP, it became necessary to migrate the load balancer to AWS.

On AWS, the load balancer works with listeners and target groups. To check whether a front-end is active in the pool, we use the health check.
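As an illustration of the health check, the sketch below groups the targets of one target group by health state. It assumes the boto3 library and valid AWS credentials for the live query. The helper names and the instance IDs in the sample are hypothetical and only mirror the shape of a describe_target_health response.

```python
from collections import defaultdict


def summarize_target_health(response):
    """Group targets by health state from a describe_target_health response."""
    by_state = defaultdict(list)
    for desc in response.get("TargetHealthDescriptions", []):
        target = desc["Target"]
        state = desc["TargetHealth"]["State"]
        by_state[state].append(f'{target["Id"]}:{target.get("Port", "")}')
    return dict(by_state)


def fetch_target_health(target_group_arn):
    """Query AWS for the health of every target in one target group."""
    import boto3  # imported lazily so the summary helper works without boto3

    client = boto3.client("elbv2")
    return client.describe_target_health(TargetGroupArn=target_group_arn)


# Hand-written sample response; the instance IDs are made up for illustration.
sample = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "i-0aaa", "Port": 443},
         "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "i-0bbb", "Port": 443},
         "TargetHealth": {"State": "unhealthy"}},
    ]
}
print(summarize_target_health(sample))
```

Against a real target group, the response of fetch_target_health (called with that group's ARN) can be passed directly to summarize_target_health.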

Our load balancers and Amazon's forward requests to the same destinations and perform periodic health checks on the front-end servers.

During server maintenance (maintenance.bash), HAProxy updates the server's status, but the AWS load balancer does not apply this change automatically.

Troubleshooting advice

In case of issues, the standard procedure is to check HAProxy to confirm whether the machines are still in the pool.
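The pool check can also be scripted. HAProxy's statistics page can usually export its data as CSV (typically by appending ;csv to the stats URI). Whether that export is enabled on our instance is an assumption. The sketch below only parses such CSV text and lists the servers that are not UP. The function name and the sample rows are hypothetical.

```python
import csv
import io


def servers_not_up(stats_csv):
    """Return (proxy, server, status) for every server row whose status is not UP."""
    # The CSV header line starts with "# "; strip it to get clean field names.
    text = stats_csv.lstrip("# ")
    reader = csv.DictReader(io.StringIO(text))
    problems = []
    for row in reader:
        # FRONTEND and BACKEND rows are aggregates, not individual machines.
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue
        # Statuses such as "UP 1/3" still count as up; MAINT, DOWN, DRAIN do not.
        if not row["status"].startswith("UP"):
            problems.append((row["pxname"], row["svname"], row["status"]))
    return problems


# Reduced sample: a real export has many more columns, read here by name.
sample = (
    "# pxname,svname,status,\n"
    "fe_pool,FRONTEND,OPEN,\n"
    "fe_pool,front01,UP,\n"
    "fe_pool,front02,MAINT,\n"
    "fe_pool,BACKEND,UP,\n"
)
print(servers_not_up(sample))
```

An empty list means every machine listed in the export is still in the pool.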

A front-end pool consists of machines on the same load balancer that share a common purpose. To find the IPs of the load balancers, we can run the dig command on the hostname of the corresponding URL to see where it points.
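When dig is not available, the same lookup can be approximated with Python's standard library. This is a minimal sketch: resolve_ips is a hypothetical helper, and it goes through the system resolver, so unlike dig it shows neither record types nor TTLs.

```python
import socket


def resolve_ips(hostname):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Usage: pass the hostname taken from the URL being investigated, for example
# resolve_ips("docs.example.com")  (hypothetical hostname)
```

Comparing the returned addresses with the known HAProxy and AWS load balancer IPs shows which load balancer actually serves the URL.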

We monitor with Grafana, where an Apache dashboard broken down by FQDN lets us visualize the requests:

https://grafana.antidot.net/d/apache-fqdn/apache-with-fqdn

Additionally, we use a Web Application Firewall (WAF). More information on the WAF and its return codes can be found in the documentation:

https://scm.mrs.antidot.net/sre/sre-docs/-/wikis/infra/Amazon-Web-Services/WAF-return-codes