I have a DRBD setup, similar to the one in an old post, running between two Ubuntu servers hosting MySQL. Every few months, though, the pair goes into a split-brain situation where the secondary can’t see the primary and refuses to reconnect. Users are unaffected because the primary keeps working fine, but the HA is lost.
After trying a few different combinations of commands, this is what seems to work best for me and gives the quickest recovery. I’m only dealing with a 10GB device, so a full sync takes about 10 minutes. If you’re using DRBD for a much larger device, make sure you consider the sync time before doing this.
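As a rough way to think about sync time on a bigger device, the figure above (10GB in about 10 minutes) can be scaled linearly. This is only a back-of-the-envelope sketch: the function name and its arguments are made up for illustration, and it assumes the sync rate stays the same, which depends on your network, disks and configured sync rate.

```shell
# estimate_sync_minutes SIZE_GB [REF_GB REF_MIN]
# Hypothetical helper: scales an observed full-sync time linearly.
# Defaults to the 10GB-in-10-minutes figure from this post.
estimate_sync_minutes() {
    size_gb=$1
    ref_gb=${2:-10}     # size of the device the timing was observed on
    ref_min=${3:-10}    # minutes the full sync took on that device
    # Integer arithmetic is good enough for a rough estimate.
    echo $(( size_gb * ref_min / ref_gb ))
}
```

Calling `estimate_sync_minutes 500` gives the estimated number of minutes for a 500GB device at the same rate, so HA would be gone for hours rather than minutes.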
On the secondary node:
drbdadm secondary all
drbdadm disconnect all (its status goes to Secondary/Unknown)
drbdadm invalidate all (marks the secondary's copy as out of sync, so it will be overwritten from the primary)
drbdadm connect all
On the functioning primary node:
drbdadm connect all (a full sync now starts)
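The steps above could be wrapped in a small script, roughly like this. It is only a sketch: the `DRBDADM` and `DRBD_RES` variables and the two function names are made up for illustration and are not part of DRBD. The `drbdadm` subcommands are exactly the ones listed above.

```shell
#!/bin/sh
# Sketch of the recovery sequence from this post.
# DRBDADM and DRBD_RES are hypothetical knobs, not DRBD settings.
DRBDADM="${DRBDADM:-drbdadm}"   # set DRBDADM=echo for a dry run
DRBD_RES="${DRBD_RES:-all}"     # a resource name, or "all" as in the post

# Run on the SECONDARY node (its data will be discarded and resynced).
# Each step only runs if the previous one succeeded.
recover_secondary() {
    $DRBDADM secondary "$DRBD_RES" &&
    $DRBDADM disconnect "$DRBD_RES" &&
    $DRBDADM invalidate "$DRBD_RES" &&
    $DRBDADM connect "$DRBD_RES"
}

# Run on the functioning PRIMARY node once the secondary is done.
recover_primary() {
    $DRBDADM connect "$DRBD_RES"
}
```

Setting `DRBDADM=echo` before calling `recover_secondary` just prints the commands instead of running them, which is a cheap way to check the order before touching real data.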
Remember, it’s your data you’re dealing with, so make sure you understand what commands like this do before you run them.
Update – no sign of the root cause of the issue either. After a system update that included the drbd package, things seem to have settled down.