Network Security & Firewall Rules for MariaDB Galera Cluster
Galera’s synchronous, multi-master architecture removes the traditional primary-replica bottleneck but depends on a predictable network. Unlike asynchronous replication, where relay logs absorb latency, Galera needs bidirectional, low-jitter paths between every pair of nodes to maintain quorum, execute state transfers, and complete write-set certification. Misconfigured firewalls, asymmetric routing, or overly permissive CIDR blocks can cause node evictions, flow-control stalls, and network partitions that leave nodes outside the primary component. The foundational networking model is documented in MariaDB Galera Core Architecture & Fundamentals, which establishes how cluster communication relies on fixed port assignments and predictable packet delivery.
Port Architecture & Baseline Firewall Requirements
Galera uses four ports, each of which must be explicitly permitted between all participating nodes. Blocking or misrouting port 4567 disrupts the workflow described in Understanding Galera Synchronous Replication and quickly degrades write throughput; blocking the state-transfer ports prevents nodes from rejoining, which eventually fragments the cluster.
| Port | Protocol | Function | Direction | Security Scope |
|---|---|---|---|---|
| 3306 | TCP | Client SQL connections & administrative access | Bidirectional | Application subnets, bastion hosts |
| 4567 | TCP/UDP | Group Communication System (GCS) & replication traffic | Bidirectional | Cluster CIDR only |
| 4568 | TCP | Incremental State Transfer (IST) | Bidirectional | Cluster CIDR only |
| 4444 | TCP | State Snapshot Transfer (SST) & donor handoff | Bidirectional | Cluster CIDR only |
Platform teams must enforce strict source/destination IP scoping. Allowing 0.0.0.0/0 on ports 4567 or 4568 lets any host that can reach the node attempt to join the cluster or request a state transfer. Firewall rules must be applied symmetrically across all nodes, because a rule that permits traffic in only one direction causes one-sided packet drops that destabilize GCS membership. Note on port 4567: by default Galera carries group communication, replication, and flow control over TCP. UDP on 4567 is used only when multicast replication is enabled (gmcast.mcast_addr), so the UDP rule can be omitted in unicast deployments.
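To check the port table from a peer's point of view, a small TCP probe can be run from each node against every other node. The following is a minimal sketch, not part of Galera's tooling: the function names `probe_tcp` and `probe_node` and the result labels are illustrative assumptions. It distinguishes a refused connection (the packet reached the host, but nothing is listening or a rule rejected it) from a timeout (the packet was most likely dropped on the way).

```python
import socket

# TCP ports from the table above.
GALERA_TCP_PORTS = {
    3306: "client SQL",
    4567: "group communication / replication",
    4568: "IST",
    4444: "SST",
}


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connection attempt as open, refused, filtered, or error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # An RST came back: the host is reachable on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: typical of a firewall that drops packets.
        return "filtered"
    except OSError:
        # Unreachable network, name resolution failure, and similar.
        return "error"


def probe_node(host: str, ports=None, timeout: float = 2.0) -> dict:
    """Probe every port in `ports` (default: the Galera TCP ports) on one host."""
    ports = GALERA_TCP_PORTS if ports is None else ports
    return {port: probe_tcp(host, port, timeout) for port in ports}
```

Calling `probe_node("10.0.50.12")` from another cluster member returns a mapping of port to result. Ports 4568 and 4444 normally listen only while a state transfer is in progress, so "refused" on those ports is expected on an idle node, whereas "filtered" points at a firewall drop.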
Production Firewall Implementation (nftables)
Modern infrastructure favors nftables over legacy iptables for atomic rule updates, connection tracking integration, and native set handling. The following script checks the CIDR format, builds the ruleset, and loads it in a single nft -f transaction, so the rules are applied all at once or not at all.
#!/usr/bin/env bash
set -euo pipefail

# Parameter validation: checks the format only; nft itself rejects
# out-of-range octets or prefix lengths when the ruleset is loaded.
validate_cidr() {
    local cidr="$1"
    if [[ ! "$cidr" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$ ]]; then
        echo "ERROR: Invalid CIDR format: $cidr" >&2
        exit 1
    fi
}

CLUSTER_CIDR="${GALERA_CLUSTER_CIDR:-10.0.50.0/24}"
NODE_IP="${GALERA_NODE_IP:-}"

validate_cidr "$CLUSTER_CIDR"
if [[ -z "$NODE_IP" ]]; then
    echo "ERROR: GALERA_NODE_IP environment variable is required." >&2
    exit 1
fi

# Generate and apply the nftables ruleset atomically.
# Declaring and then deleting the table first means a re-run replaces the
# rules instead of appending duplicates to the existing chain.
cat <<EOF | nft -f -
table inet galera_firewall
delete table inet galera_firewall
table inet galera_firewall {
    set cluster_nodes {
        type ipv4_addr
        flags interval
        elements = { $CLUSTER_CIDR }
    }
    chain input {
        type filter hook input priority 0; policy drop;
        # Allow established/related connections
        ct state established,related accept
        # Allow loopback
        iif lo accept
        # Allow ICMPv4/ICMPv6 for path MTU discovery
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept
        # Galera cluster ports (UDP 4567 is needed only with multicast)
        ip saddr @cluster_nodes tcp dport { 3306, 4567, 4568, 4444 } accept
        ip saddr @cluster_nodes udp dport { 4567 } accept
        # Allow SSH from management subnet
        ip saddr 10.0.100.0/24 tcp dport 22 accept
    }
}
EOF

echo "nftables ruleset applied successfully on $NODE_IP."
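This configuration uses an nftables set for CIDR matching and a default-drop input policy. Because nft -f - loads the whole input as one transaction, either every rule is applied or none is, which avoids a window in which the node is only partly configured. Two caveats apply: the default-drop policy covers all inbound traffic on the host, not only the Galera ports, and port 3306 is opened only to the cluster CIDR, so the application and bastion subnets listed in the port table must be added before clients can connect. For comprehensive syntax reference and kernel integration details, consult the official nftables quick reference.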
Python Automation & Infrastructure-as-Code Integration
Platform teams frequently embed firewall provisioning into CI/CD pipelines or configuration management workflows. Python’s ipaddress module and subprocess execution provide deterministic validation and idempotent rule application. The following pattern demonstrates how to generate and validate nftables payloads programmatically:
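import ipaddress
import subprocess
import sys


def validate_cluster_network(cidr: str) -> ipaddress.IPv4Network:
    """Parse the cluster CIDR and reject prefixes outside /16 to /28."""
    try:
        network = ipaddress.IPv4Network(cidr, strict=False)
        if network.prefixlen < 16 or network.prefixlen > 28:
            raise ValueError("CIDR prefix must be between /16 and /28 for Galera clusters")
        return network
    except ValueError as e:
        sys.exit(f"Invalid cluster network: {e}")


def generate_nft_set(cidr: str) -> str:
    """Render the ruleset for the given cluster CIDR."""
    # Declaring and then deleting the table makes re-application idempotent.
    # The ICMP and SSH rules mirror the shell script; without them the
    # default-drop policy would also cut off management access.
    return f"""
table inet galera_firewall
delete table inet galera_firewall
table inet galera_firewall {{
    set cluster_nodes {{
        type ipv4_addr
        flags interval
        elements = {{ {cidr} }}
    }}
    chain input {{
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept
        ip saddr @cluster_nodes tcp dport {{ 3306, 4567, 4568, 4444 }} accept
        ip saddr @cluster_nodes udp dport {{ 4567 }} accept
        ip saddr 10.0.100.0/24 tcp dport 22 accept
    }}
}}
"""


if __name__ == "__main__":
    cluster_cidr = validate_cluster_network("10.0.50.0/24")
    payload = generate_nft_set(str(cluster_cidr))
    # nft reads the ruleset from stdin and applies it as one transaction.
    subprocess.run(["nft", "-f", "-"], input=payload.encode(), check=True)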
This approach ensures that infrastructure-as-code templates fail fast on malformed networks before touching the kernel netfilter stack. Python’s standard library handles CIDR parsing natively, eliminating regex edge cases. See the official Python ipaddress documentation for advanced subnet manipulation and validation patterns.
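The same module can also catch two inventory mistakes before any ruleset is rendered: a node address that falls outside the cluster CIDR, and a CIDR that is far too wide. The sketch below is illustrative only; the helper names `nodes_outside_cluster` and `is_overly_permissive` are assumptions, and the /16 floor simply reuses the lower bound from the validation function above.

```python
import ipaddress


def nodes_outside_cluster(cidr: str, node_ips: list) -> list:
    """Return the node addresses that are not contained in the cluster CIDR."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [ip for ip in node_ips if ipaddress.ip_address(ip) not in network]


def is_overly_permissive(cidr: str, min_prefix: int = 16) -> bool:
    """Flag networks wider than /min_prefix, such as 0.0.0.0/0."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen < min_prefix


if __name__ == "__main__":
    # Hypothetical inventory: the second address belongs to a different subnet.
    strays = nodes_outside_cluster("10.0.50.0/24", ["10.0.50.11", "10.0.51.12"])
    print("nodes outside cluster CIDR:", strays)
    print("0.0.0.0/0 too wide:", is_overly_permissive("0.0.0.0/0"))
```

A pipeline step could refuse to continue when either check reports a problem, which keeps a mistyped address from locking a node out of its own cluster.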
Operational Dependencies & Validation
Firewall rules alone do not guarantee cluster stability. Platform engineers must also verify symmetric routing, MTU consistency, and connection tracking limits. Asymmetric routing, where return traffic traverses a different firewall or interface, breaks stateful inspection and drops GCS packets. Validate connectivity with tcpdump on each of the four ports, and use nft monitor trace (which reports only packets that a rule has marked with meta nftrace set 1) to see which rule accepts or drops them. Additionally, size net.netfilter.nf_conntrack_max with headroom above peak client and inter-node connection counts: when the conntrack table is full, the kernel drops new connections, including those opened for a state transfer.
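Connection tracking headroom can be checked by comparing the kernel's current entry count with the configured maximum. The sketch below assumes the nf_conntrack module is loaded, so that nf_conntrack_count and nf_conntrack_max exist under /proc/sys/net/netfilter; the function names and the 80% alert threshold are illustrative assumptions, not recommended values.

```python
from pathlib import Path

CONNTRACK_DIR = Path("/proc/sys/net/netfilter")


def read_conntrack(proc_dir: Path = CONNTRACK_DIR) -> tuple:
    """Return (current entries, configured maximum) from the sysctl files."""
    count = int((proc_dir / "nf_conntrack_count").read_text().strip())
    maximum = int((proc_dir / "nf_conntrack_max").read_text().strip())
    return count, maximum


def conntrack_utilization(count: int, maximum: int) -> float:
    """Fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / maximum


def needs_headroom(count: int, maximum: int, threshold: float = 0.8) -> bool:
    """True when utilization is at or above the (assumed) alert threshold."""
    return conntrack_utilization(count, maximum) >= threshold
```

On a node, `needs_headroom(*read_conntrack())` gives a quick yes/no answer; sampling it during a load test or a planned SST shows whether the limit needs to be raised.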
The Write-Set Certification Process Explained depends on every node receiving write-sets in the same order over port 4567. If that traffic is throttled or dropped by an intermediate firewall or load balancer, peers are first suspected and then declared inactive (governed by evs.suspect_timeout and evs.inactive_timeout), the affected node is evicted, and a partition without quorum drops to a non-primary state and refuses writes. Monitor wsrep_local_state, wsrep_cluster_size, and wsrep_cluster_status to detect firewall-induced membership drift before it cascades into write stalls.
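The status variables can be turned into a simple health check. The sketch below assumes the values have already been fetched (for example with SHOW GLOBAL STATUS LIKE 'wsrep_%') and passed in as a dictionary of strings; the function name `membership_warnings` and the message wording are illustrative. In addition to wsrep_local_state and wsrep_cluster_size, it looks at wsrep_cluster_status, which reads Primary on a node that belongs to the quorate component.

```python
# wsrep_local_state value that corresponds to Synced.
WSREP_STATE_SYNCED = "4"


def membership_warnings(status: dict, expected_size: int) -> list:
    """Return human-readable warnings for signs of membership drift."""
    warnings = []

    size = int(status.get("wsrep_cluster_size", "0"))
    if size < expected_size:
        warnings.append(f"cluster size {size} is below expected {expected_size}")

    if status.get("wsrep_cluster_status") != "Primary":
        warnings.append("node is not part of the primary component")

    local_state = status.get("wsrep_local_state")
    if local_state != WSREP_STATE_SYNCED:
        warnings.append(f"local state is {local_state}, expected 4 (Synced)")

    return warnings


if __name__ == "__main__":
    # Hypothetical sample from a node that lost contact with one peer.
    sample = {
        "wsrep_cluster_size": "2",
        "wsrep_cluster_status": "Primary",
        "wsrep_local_state": "4",
    }
    for line in membership_warnings(sample, expected_size=3):
        print("WARNING:", line)
```

A shrinking cluster size while the local node still reports Synced and Primary is the typical early signature of a peer being cut off by a firewall change, so it is worth alerting on that condition alone.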
For environments requiring encrypted inter-node traffic, enabling TLS does not change the port assignments, so the firewall rules above remain valid. SST and IST traffic can be wrapped in TLS on the same ports, but certificate validation failures surface as connection resets or handshake errors rather than explicit firewall blocks, so check the MariaDB error log before changing firewall rules. Implementation details are covered in Setting Up Secure TLS for Galera Cluster Communication.
Conclusion
Predictable networking is a hard requirement for Galera’s synchronous replication model. Strict port scoping, atomic firewall deployment, and automated validation reduce the risk of network partitions, node evictions, and flow-control stalls. By embedding validated nftables rulesets into CI/CD pipelines, sizing kernel connection tracking, and monitoring cluster membership metrics, platform teams can maintain production-grade stability across multi-master topologies.