Systemd prevents pause_minority re-join by restarting RabbitMQ

I have a three-node cluster to test pause_minority. All nodes run RHEL 7.9, Erlang 23.3.4.5 and RabbitMQ 3.9.0. I use RPMs from https://github.com/rabbitmq (erlang-rpm and rabbitmq-server). The nodes are joined to the cluster manually, and all three (rabbit1, rabbit2, rabbit3) use the same rabbitmq.conf.

About a minute after I pull the network cable from rabbit2, the node detects that it is in the minority and stops its applications. 90 seconds after that, systemd decides that something is wrong with rabbitmq-server, kills it, and restarts it:

```
systemd: rabbitmq-server.service stop-sigterm timed out. Killing.
systemd: rabbitmq-server.service: main process exited, code=killed, status=9/KILL
systemd: Unit rabbitmq-server.service entered failed state.
systemd: rabbitmq-server.service failed.
systemd: rabbitmq-server.service holdoff time over, scheduling restart.
systemd: Stopped RabbitMQ broker.
systemd: Starting RabbitMQ broker...
```
After that, nothing else happens, even after I reconnect the cable. I would expect rabbit2 to re-join the cluster, but that seems to be sabotaged by systemd restarting RabbitMQ.

If I reconnect the cable before the 90 seconds are up, the node does re-join the cluster, but systemd mercilessly kills and restarts RabbitMQ anyway once the 90 seconds have passed.

Here is the timeline of what I did:
```
17:56:08 systemctl start rabbitmq-server.service
17:57:08 disconnect eth0
17:58:10 Node rabbit2 detects loss of connectivity
17:59:40 systemd reports: stop-sigterm timed out. Killing
18:01:41 reconnect eth0
```
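
To make the intervals in this timeline explicit, here is a minimal Python sketch that computes the gap between consecutive events. The timestamps are copied from the timeline; the names `events` and `seconds_between` are only illustrative. The 90-second gap between the node pausing and systemd killing it equals systemd's default stop timeout (`DefaultTimeoutStopSec`), but that this default is what applies to the rabbitmq-server unit here is an assumption on my part.

```python
from datetime import datetime

# Timestamps copied from the timeline above (wall-clock time on rabbit2).
events = [
    ("17:56:08", "systemctl start rabbitmq-server.service"),
    ("17:57:08", "disconnect eth0"),
    ("17:58:10", "rabbit2 detects loss of connectivity"),
    ("17:59:40", "systemd reports: stop-sigterm timed out. Killing"),
    ("18:01:41", "reconnect eth0"),
]


def seconds_between(events):
    """Return the gap in seconds between each pair of consecutive events."""
    times = [datetime.strptime(stamp, "%H:%M:%S") for stamp, _ in events]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


gaps = seconds_between(events)
for (_, label), gap in zip(events[1:], gaps):
    print(f"+{gap:>3}s  {label}")
```

In this run the cable was reconnected 121 seconds after systemd killed the process, so the node had already been restarted while still disconnected.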

Log files and rabbitmq.conf attached.
[rabbit@rabbit2.log](https://github.com/rabbitmq/rabbitmq-server/files/6932464/rabbit%40rabbit2.log)
[/var/log/messages](https://github.com/rabbitmq/rabbitmq-server/files/6932473/messages.log)
[rabbitmq.conf](https://github.com/rabbitmq/rabbitmq-server/files/6932479/rabbitmq.conf.txt)
