Initial Data Synchronization Methods in MariaDB Galera: SST, IST, and Automated Provisioning

This procedure builds on the node lifecycle model described in Galera Cluster Setup & Node Management, and solves one specific operational problem: how a fresh or lagging node obtains a byte-identical copy of the dataset before it is admitted as a writable member. When a joiner enters a MariaDB Galera group without a valid write-set history — or its last committed position has aged out of the donor’s cache — the provider must transfer state before replication can resume. That transfer, and how you tune it, dictates recovery-time objectives, donor I/O saturation, and whether a rolling upgrade completes in seconds or stalls the group for hours. This page is the operational reference for database administrators, DevOps engineers, and platform teams who provision, replace, and scale Galera nodes as a repeatable, validated workflow.

Concept: State Snapshot Transfer versus Incremental State Transfer

Galera has exactly two mechanisms for synchronizing a joiner, and the provider chooses between them automatically at join time. Incremental State Transfer (IST) replays only the write-sets the node missed while it was absent, streamed from a donor’s in-memory GCache ring buffer. State Snapshot Transfer (SST) is a full physical or logical copy of the entire dataset, used whenever an incremental catch-up is impossible. IST is measured in seconds and never blocks the donor; SST on a multi-terabyte dataset can saturate a network interface for minutes to hours and temporarily desyncs the donor from the write path.

The decision hinges on a single question: is the write-set that follows the joiner’s last committed sequence number (seqno) still present in the donor’s GCache ring buffer, whose capacity is set by gcache.size? If the write volume produced during the outage exceeded that capacity, the required range has wrapped and been overwritten, forcing a fallback to full SST. This is why GCache sizing is the highest-leverage tuning decision for painless node additions. The theory behind the ordered write-set stream that both transfers replay is covered in Understanding Galera Synchronous Replication.
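The decision can be modelled in a few lines. The sketch below is a simplified illustration, not the provider's actual logic: the function name and parameters are hypothetical, and it assumes you know the lowest seqno still held in the donor's GCache (Galera reports it in the wsrep_local_cached_downto status variable).

```python
def choose_transfer(joiner_uuid: str, joiner_seqno: int,
                    donor_uuid: str, donor_cached_downto: int) -> str:
    """Return "IST" or "SST" for a joiner, using a simplified model."""
    # A different state UUID means the joiner belongs to another history.
    if joiner_uuid != donor_uuid:
        return "SST"
    # seqno -1 marks an unknown position (empty datadir or unclean stop).
    if joiner_seqno < 0:
        return "SST"
    # IST starts at joiner_seqno + 1, so that write-set must still be cached.
    if joiner_seqno + 1 >= donor_cached_downto:
        return "IST"
    return "SST"


if __name__ == "__main__":
    print(choose_transfer("u1", 1500, "u1", 1000))  # IST
    print(choose_transfer("u1", 400, "u1", 1000))   # SST
```

The second call falls back to SST because write-set 401 has already been overwritten in the donor's cache.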

The provider picks the transfer automatically: if the joiner's last seqno still lives in the donor's GCache it takes the fast IST path; otherwise it falls back to a full SST. Both end at SYNCED.

A joining node moves through the states OPEN → PRIMARY → JOINER → JOINED → SYNCED. It becomes a viable donor or a fully caught-up writer only once it reaches SYNCED. During an SST the elected donor transitions to Donor/Desynced: it keeps receiving write-sets but can fall behind the group while it feeds the transfer (and with a blocking method such as rsync it cannot apply them until the copy finishes), which is why donor selection and flow control matter as much as the raw transfer speed. The clean-shutdown discipline that keeps a rejoin on the fast IST path, rather than forcing a full SST on every restart, is detailed in Graceful Node Join and Leave Procedures.

The three SST backends

Galera ships three SST methods, each with strict operational boundaries:

Method	Layer	Donor impact	Best fit
`mariabackup`	Physical (InnoDB pages)	Non-blocking; donor stays writable	Production default, any dataset size
`rsync`	Filesystem	Full donor write-block for the transfer	Small datasets, fast reprovisioning in a lab
`mysqldump`	Logical (SQL replay)	Global read lock; slow `INSERT` replay	Cross-version edge cases only

Production deployments standardize on mariabackup, which streams physical InnoDB tablespaces through mbstream without a full table lock, preserving page structures, undo logs, and redo sequences so post-transfer crash recovery stays fast. The rsync method operates at the filesystem level but blocks donor writes for the duration, making it unsuitable for high-throughput OLTP where write latency must remain sub-millisecond. The legacy mysqldump method forces a FLUSH TABLES WITH READ LOCK on the donor and burns CPU replaying logical inserts on the joiner. A method-by-method benchmark for large datasets, including parallelism and compression trade-offs, lives in Choosing the Right SST Method for Large Datasets.

Prerequisites & Environment Requirements

State transfer is unforgiving of environment drift because the joiner commits to overwriting its datadir with the donor’s copy. Validate every item below on the joiner and all candidate donors before triggering a join.

Software versions

MariaDB 10.6 LTS or later (11.4 LTS recommended for new builds), with the Galera 4 provider (libgalera_smm.so), normally installed from the galera-4 package as a dependency of the server.
mariabackup installed on every node. Most distributions ship it in a separate mariadb-backup package; it is the executable the mariabackup SST method invokes on both donor and joiner.
Matching major/minor versions across nodes. A version skew between donor and joiner can abort SST during the prepare/apply phase.
Streaming helpers mbstream (installed with mariabackup) and socat, plus a compressor (zstd or pigz) on both ends if you enable compressed transfer.

Network ports — bidirectional reachability is mandatory:

Port	Protocol	Purpose
3306	TCP	Client / SQL traffic
4567	TCP + UDP	Group communication (gcomm) and write-set replication
4568	TCP	Incremental State Transfer (IST)
4444	TCP	State Snapshot Transfer (SST)

If firewall rules leave 4444 or 4568 closed, the transfer stalls at the connection stage and the joiner eventually times out — locking these ports to known peers is covered in Network Security & Firewall Rules for Galera.
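A quick way to tell a filtered port from one that simply has no listener is to look at how the TCP connect fails. The helper below is a minimal sketch with a hypothetical name (probe_port). Keep in mind that the SST and IST ports normally only have a listener on the joiner while a transfer is in progress, so "refused" on 4444 or 4568 outside a transfer is expected, and that a firewall configured to reject rather than drop also shows up as "refused".

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port as "open", "refused", or "filtered"."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"        # something accepted the connection
    except ConnectionRefusedError:
        return "refused"         # host reachable, nothing listening
    except OSError:
        return "filtered"        # timed out or unreachable, often a firewall


if __name__ == "__main__":
    # 10.0.1.11 is the example donor address used elsewhere on this page.
    for port in (3306, 4567, 4568, 4444):
        print(port, probe_port("10.0.1.11", port))
```

A "filtered" result on 4567 between two running nodes is the one that most clearly points at a firewall rule.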

System settings

Free disk on the joiner of at least 1.5× the donor’s dataset size — mariabackup stages the stream before applying redo, so headroom beyond the raw data size is required. Capacity sizing for CPU, RAM, and disk lives in Galera Cluster Hardware Requirements.
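A dedicated SST account replicated across all nodes, granted RELOAD, PROCESS, LOCK TABLES, and BINLOG MONITOR (the MariaDB 10.5+ name for REPLICATION CLIENT, which remains accepted as an alias).
Swap disabled or vm.swappiness set close to 0, plus enough free memory for the transfer itself: a mariadbd killed by the OOM killer mid-transfer leaves the joiner with a partial datadir and no valid position in grastate.dat.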
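The 1.5× disk rule is easy to check mechanically. This is a minimal sketch with hypothetical function names; it assumes you already know the donor's dataset size in bytes (for example from a disk-usage report of the donor's datadir).

```python
import shutil


def required_free_bytes(dataset_bytes: int, factor: float = 1.5) -> int:
    """Free space the joiner should have before an SST, per the 1.5x rule."""
    return int(dataset_bytes * factor)


def joiner_has_headroom(dataset_bytes: int, free_bytes: int,
                        factor: float = 1.5) -> bool:
    """True when the joiner's free space covers dataset size times factor."""
    return free_bytes >= required_free_bytes(dataset_bytes, factor)


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Filesystem root is used so the example runs on any Unix-like host;
    # on a real joiner, point this at the datadir (e.g. /var/lib/mysql).
    free = shutil.disk_usage("/").free
    print(joiner_has_headroom(400 * GIB, free))
```

A 400 GiB dataset therefore needs at least 600 GiB free on the joiner.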

Step-by-Step: Provisioning a Joiner via State Transfer

The workflow is: converge configuration, create the transfer account on a donor, run a pre-flight validation gate, start the joiner, and confirm it reaches SYNCED before routing traffic.

Step 1 — Configure the SST backend on every node

State transfer is governed by a small set of wsrep directives that, apart from per-node addresses, must be consistent across the fleet. The configuration below selects mariabackup, sizes GCache large enough to favor IST on rejoin, and enables stream compression so wide-area transfers stay bandwidth-bound rather than disk-bound.

[mysqld]
# --- State transfer method & credentials ---
wsrep_sst_method=mariabackup
wsrep_sst_auth="sst_user:secure_password"

# --- GCache sizing decides IST vs SST on rejoin ---
wsrep_provider_options="gcache.size=8G; gcache.page_size=1G; ist.recv_addr=10.0.1.10:4568"

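# --- mariabackup streaming tuning ---
# The SST script reads these from [sst] and [mariabackup]; they are not
# mysqld variables. Assumes zstd is installed on both donor and joiner.
[sst]
compressor="zstd -T4"
decompressor="zstd -d"

[mariabackup]
parallel=4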

gcache.size sets the capacity of the buffer that decides the IST-versus-SST fallback: size it to exceed the write volume produced across your longest expected maintenance window. ist.recv_addr is the one per-node value here; set it on each node to that node’s own replication-interface address so IST is not routed over a management NIC on a multi-homed host. Full parameter semantics and loading precedence are documented in the wsrep.cnf Configuration Deep Dive; inject wsrep_sst_auth from a secrets manager rather than committing plaintext.
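The sizing rule reduces to simple arithmetic: replication write rate × longest outage × a safety margin. The sketch below assumes you sample a byte counter twice (for instance the sum of the wsrep_replicated_bytes and wsrep_received_bytes status variables); the function name and the 1.5 safety factor are illustrative choices, not values from the Galera documentation.

```python
import math


def estimate_gcache_bytes(bytes_start: int, bytes_end: int,
                          sample_seconds: float, window_seconds: float,
                          safety: float = 1.5) -> int:
    """Estimate a gcache.size that covers `window_seconds` of writes."""
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    # Average replication throughput over the sampling interval.
    rate = (bytes_end - bytes_start) / sample_seconds
    return math.ceil(rate * window_seconds * safety)


if __name__ == "__main__":
    # 1.2 GB replicated in 10 minutes, sized for a 1-hour maintenance window.
    size = estimate_gcache_bytes(0, 1_200_000_000, 600, 3600)
    print(f"{size / 1024 ** 3:.1f} GiB")
```

Sample during peak load rather than at an idle hour, since the cache has to absorb the busiest window you expect a node to miss.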

Step 2 — Create the transfer account on a live donor

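With the mariabackup method, the SST script on the donor uses the wsrep_sst_auth credentials to log in to its own local server and run the backup, which is why the account is scoped to localhost. Create the account on any SYNCED node so the grant replicates cluster-wide before the join begins.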

CREATE USER IF NOT EXISTS 'sst_user'@'localhost' IDENTIFIED BY 'secure_password';
GRANT RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT,
      BINLOG MONITOR ON *.* TO 'sst_user'@'localhost';
FLUSH PRIVILEGES;

Missing or under-privileged grants are the single most common cause of an immediate SST abort. mariabackup needs RELOAD to flush, LOCK TABLES and BINLOG MONITOR to read a consistent position, and PROCESS to inspect the server state.
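Because under-privileged grants abort the transfer immediately, it can help to check them up front. The sketch below parses rows as returned by SHOW GRANTS FOR 'sst_user'@'localhost' and reports what is missing. The helper name is hypothetical, the required set mirrors the grant above, and the alias table reflects that MariaDB 10.5+ reports REPLICATION CLIENT as BINLOG MONITOR.

```python
import re

REQUIRED = {"RELOAD", "PROCESS", "LOCK TABLES", "BINLOG MONITOR"}
ALIASES = {"REPLICATION CLIENT": "BINLOG MONITOR"}


def missing_privileges(grant_rows: list[str], required=REQUIRED) -> set[str]:
    """Return the required global privileges absent from SHOW GRANTS rows."""
    granted: set[str] = set()
    for row in grant_rows:
        # Only global (*.*) grants count for the SST account.
        match = re.match(r"GRANT (.+?) ON \*\.\* TO ", row)
        if not match:
            continue
        if match.group(1).strip().upper() == "ALL PRIVILEGES":
            return set()
        for priv in match.group(1).split(","):
            name = priv.strip().upper()
            granted.add(ALIASES.get(name, name))
    return set(required) - granted


if __name__ == "__main__":
    rows = ["GRANT RELOAD, PROCESS ON *.* TO `sst_user`@`localhost`"]
    print(sorted(missing_privileges(rows)))  # ['BINLOG MONITOR', 'LOCK TABLES']
```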

Step 3 — Run a pre-flight validation gate

Before starting the joiner, verify donor reachability, the configured method, joiner disk capacity, and that the prospective donor is not already throttled by flow control. The Python probe below uses PyMySQL, targets Python 3.9+, and emits structured JSON so a CI/CD pipeline can gate the provisioning step on its exit code.

#!/usr/bin/env python3
"""Galera SST pre-flight validator — gate a node join in CI/CD."""
import json
import os
import shutil
import sys

import pymysql
from pymysql.err import OperationalError, MySQLError


def preflight(donor_host: str, user: str, password: str,
              datadir: str = "/var/lib/mysql", min_free_gb: int = 50) -> dict:
    checks: list[dict] = []

    # 1. Joiner free disk (mariabackup needs ~1.5x dataset size of headroom)
    free_gb = shutil.disk_usage(datadir).free / (1024 ** 3)
    checks.append({"check": "joiner_free_gb",
                   "value": round(free_gb, 1),
                   "pass": free_gb > min_free_gb})

    # 2. mariabackup present on this host
    checks.append({"check": "mariabackup_installed",
                   "value": bool(shutil.which("mariabackup")),
                   "pass": shutil.which("mariabackup") is not None})

    # 3. Donor: supported method + not flow-control paused
    try:
        conn = pymysql.connect(host=donor_host, user=user, password=password,
                               database="mysql", connect_timeout=5)
        with conn.cursor() as cur:
            cur.execute("SHOW GLOBAL VARIABLES LIKE 'wsrep_sst_method'")
            method = cur.fetchone()[1]
            cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused'")
            paused = float(cur.fetchone()[1])
        conn.close()
    except OperationalError as exc:
        checks.append({"check": "donor_reachable", "value": str(exc), "pass": False})
        return {"status": "blocked", "details": checks}
    except MySQLError as exc:
        checks.append({"check": "donor_status_query", "value": str(exc), "pass": False})
        return {"status": "blocked", "details": checks}

    checks.append({"check": "donor_sst_method", "value": method,
                   "pass": method in ("mariabackup", "rsync")})
    checks.append({"check": "donor_flow_control_paused", "value": paused,
                   "pass": paused < 0.1})

    status = "ready" if all(c["pass"] for c in checks) else "blocked"
    return {"status": status, "details": checks}


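if __name__ == "__main__":
    # The probe logs in over the network, so it needs an account allowed from
    # this host; 'sst_user'@'localhost' from Step 2 cannot connect remotely.
    # PROBE_USER / PROBE_PASSWORD are illustrative names for that account.
    result = preflight(
        os.environ.get("DONOR_HOST", "10.0.1.11"),
        os.environ.get("PROBE_USER", "monitor"),
        os.environ.get("PROBE_PASSWORD", os.environ.get("SST_PASSWORD", "")),
    )
    print(json.dumps(result, indent=2))
    sys.exit(0 if result["status"] == "ready" else 1)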

A blocked result means the join would fail or degrade the group. A paused donor is already struggling to apply writes, and adding SST load on top of that risks a group-wide stall.

Step 4 — Start the joiner and let the provider choose the transfer

With the gate green, start MariaDB the normal way. Because wsrep_cluster_address in the config lists the full membership, the node discovers the live group and requests a state transfer; the provider serves it by IST and falls back to SST only if the joiner’s seqno has aged out of the donor’s GCache.

sudo systemctl start mariadb
# Watch the transfer negotiation in real time
journalctl -u mariadb -f | grep -E "WSREP:.*(IST|SST|State transfer|Donor)"

For a brand-new node with an empty datadir there is no history to replay, so a full SST is expected and correct. The distinction between joining an existing group and forming a new one is covered in Bootstrapping Your First Galera Cluster.

Step 5 — Pin the donor for predictable cross-AZ joins

Automatic donor selection can route a joiner to a high-latency peer in another availability zone, turning a fast transfer into a slow one that trips flow control. Pin a low-latency, same-AZ donor explicitly at join time:

[mysqld]
# Prefer db-node-02; fall back to any SYNCED node if it is unavailable
wsrep_sst_donor="db-node-02,"

The trailing comma is significant: it tells Galera to fall back to automatic selection if the named donor is not available, rather than refusing to join. Omitting it makes the joiner hard-fail when the pinned donor is down. The name must match the donor’s wsrep_node_name, not merely its hostname.
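The trailing-comma rule can be made explicit in a config linter. The following is a small sketch with a hypothetical helper name; it models only the semantics described above and does not query Galera.

```python
def parse_donor_list(value: str) -> tuple[list[str], bool]:
    """Split a wsrep_sst_donor value into (preferred names, fallback allowed)."""
    parts = [part.strip() for part in value.split(",")]
    # A trailing comma leaves an empty final element: automatic fallback is on.
    fallback = len(parts) > 1 and parts[-1] == ""
    names = [part for part in parts if part]
    # An empty value expresses no preference, so any donor may be chosen.
    if not names:
        fallback = True
    return names, fallback


if __name__ == "__main__":
    print(parse_donor_list("db-node-02,"))  # (['db-node-02'], True)
    print(parse_donor_list("db-node-02"))   # (['db-node-02'], False)
```

A provisioning pipeline could refuse any rendered config where the second element is False, since that node would fail to join whenever its pinned donor is down.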

Parameter Deep-Dive

These are the knobs that most directly determine whether a transfer succeeds and how much it costs the donor.

Parameter	Recommended value	Why it matters
`wsrep_sst_method`	`mariabackup`	Non-blocking physical SST; keeps the donor writable, unlike the blocking `rsync` default.
`gcache.size`	4G–16G (≥ peak write volume during maintenance)	Decides IST-versus-SST fallback. Undersized, every rejoin degrades to a full snapshot.
`gcs.fc_limit`	128–512	Flow-control queue depth. Too low, and a donor or joiner that is still draining its backlog after a transfer pauses all writers; too high, and the backlog grows with no back-pressure.
`gcs.fc_factor`	0.5–0.8	Resume ratio. Replication resumes once the recv queue drains to `fc_limit × fc_factor`.
`wsrep_sst_donor`	`"preferred-node,"`	Pins a low-latency donor with automatic fallback (trailing comma). Prevents cross-AZ transfers.
`socket.ssl`	`YES` (cross-datacenter)	Encrypts group communication and IST traffic. The SST stream is encrypted separately through the [sst] section. Mandatory when transfers cross a network boundary.

Misconfigured flow control is the subtle failure here. While a donor is in Donor/Desynced feeding an SST, its receive queue can grow; once it rejoins it has to drain that backlog, and if gcs.fc_limit is too low it emits flow-control pauses that stall writes across the whole group until it catches up. Provider-option syntax and the full EVS/GCS matrix are documented in the wsrep.cnf Configuration Deep Dive. For the authoritative backend list and streaming flags, MariaDB’s State Snapshot Transfer reference and wsrep_provider_options documentation are the canonical sources.
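The relationship between the two flow-control knobs is plain arithmetic, shown here as a small sketch. The function name is hypothetical, and the model ignores any adjustment the provider applies to the limit at runtime.

```python
def flow_control_thresholds(fc_limit: int, fc_factor: float) -> tuple[int, float]:
    """Return (pause_above, resume_at) receive-queue depths."""
    if fc_limit < 0:
        raise ValueError("fc_limit must be non-negative")
    if not 0.0 <= fc_factor <= 1.0:
        raise ValueError("fc_factor must be between 0 and 1")
    # Pause is requested when the queue exceeds fc_limit; replication
    # resumes once it drains to fc_limit * fc_factor.
    return fc_limit, fc_limit * fc_factor


if __name__ == "__main__":
    pause, resume = flow_control_thresholds(256, 0.5)
    print(pause, resume)  # 256 128.0
```

A factor close to 1.0 resumes replication sooner but makes the node more likely to pause again almost immediately; a lower factor pauses less often but for longer.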

Verification & Health Checks

A transfer is only successful when the joiner reports Synced inside a Primary component with the full cluster size. Confirm all four before routing traffic.

SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- expect Synced
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';       -- expect Primary
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';         -- expect N
SHOW GLOBAL STATUS LIKE 'wsrep_ready';                -- expect ON

During an active transfer, the joiner reports wsrep_local_state_comment = Joiner and the donor reports Donor/Desynced; both must return to Synced when it completes. A Python probe that a provisioning job can poll, with explicit handling for the transient wsrep conflict codes 1213 (deadlock / certification conflict) and 1205 (lock wait timeout):

import sys
import pymysql
from pymysql.err import OperationalError, MySQLError

RETRYABLE = {1205, 1213}  # lock wait timeout, deadlock / cert conflict


def transfer_complete(host: str, user: str, password: str, expected_size: int = 3) -> bool:
    """Return True only when the joiner is a Synced member of a full Primary Component."""
    try:
        conn = pymysql.connect(host=host, user=user, password=password,
                               database="mysql", connect_timeout=5)
    except OperationalError as exc:
        # A node still mid-SST refuses SQL connections — treat as "not ready yet".
        print(f"[wait] {host} not accepting queries: {exc}", file=sys.stderr)
        return False

    try:
        with conn.cursor() as cur:
            cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'")
            state = cur.fetchone()[1]
            # Synced alone is not enough; the docstring also promises Primary.
            cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'")
            component = cur.fetchone()[1]
            cur.execute("SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'")
            size = int(cur.fetchone()[1])
    except MySQLError as exc:
        code = exc.args[0]
        if code in RETRYABLE:
            print(f"[wait] transient wsrep code {code} on {host}", file=sys.stderr)
        else:
            print(f"[error] status query failed on {host}: {exc}", file=sys.stderr)
        return False
    finally:
        conn.close()

    if state != "Synced" or component != "Primary" or size < expected_size:
        print(f"[wait] {host}: state={state} component={component} size={size}",
              file=sys.stderr)
        return False
    return True


if __name__ == "__main__":
    ok = transfer_complete("10.0.1.10", "monitor", "secure_pass", expected_size=3)
    sys.exit(0 if ok else 1)

Ongoing polling, alert thresholds, and dashboards for these variables are expanded in Automated Node Health Monitoring.
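A provisioning job usually wraps the probe in a bounded polling loop so a stalled transfer fails the pipeline instead of hanging it. The helper below is a generic sketch (wait_until is a hypothetical name); the clock and sleep functions are injectable so the loop can be tested without waiting.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_s: float = 3600.0,
               interval_s: float = 10.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` until it returns True or `timeout_s` elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        # Give up once the deadline has passed; the caller decides what next.
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), timeout_s=5, interval_s=0.01))  # True
```

With the probe above, a call such as wait_until(lambda: transfer_complete("10.0.1.10", "monitor", "secure_pass")) would poll every 10 seconds for up to an hour; pick a timeout comfortably longer than your largest expected SST.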

Automation Integration

The pattern that scales is: render identical configuration to every node, create the transfer account once, run the pre-flight gate, then start joiners one at a time so no two SSTs hit the same donor concurrently. An Ansible sketch that serializes joins and keeps the transfer account idempotent:

- name: Render SST configuration to every node
  ansible.builtin.template:
    src: galera-sst.cnf.j2
    dest: /etc/mysql/mariadb.conf.d/60-sst.cnf
    mode: "0640"
  # never auto-restart here; joins are sequenced explicitly below

- name: Ensure the SST account exists on a live donor
  community.mysql.mysql_user:
    name: sst_user
    password: "{{ vault_sst_password }}"
    priv: "*.*:RELOAD,PROCESS,LOCK TABLES,REPLICATION CLIENT,BINLOG MONITOR"
    host: localhost
    state: present
  run_once: true
  delegate_to: "{{ groups['galera'][0] }}"

- name: Gate each join on the pre-flight validator
  ansible.builtin.command: >
    python3 /opt/galera/sst_preflight.py
  environment:
    DONOR_HOST: "{{ groups['galera'][0] }}"
    SST_PASSWORD: "{{ vault_sst_password }}"
  changed_when: false

- name: Start joiners one at a time so SSTs never overlap
  ansible.builtin.service:
    name: mariadb
    state: started
  throttle: 1
  when: inventory_hostname != groups['galera'][0]

The throttle: 1 directive is what prevents two joiners from selecting the same donor simultaneously and desyncing it twice at once. The same converge-then-sequence approach maps onto Terraform provisioners or a CI/CD job that promotes only after the verification probe returns success. Rendering the underlying config idempotently from inventory is detailed in Automating Node Provisioning with Ansible.

Troubleshooting

WSREP_SST: [ERROR] Possible timeout in receiving first data from donor in gtid stage

The joiner opened the SST socket but the donor never streamed. Almost always port 4444 is closed on the joiner or filtered between the nodes. Confirm 4444/tcp and 4568/tcp are reachable in both directions and that no stale mariabackup process is holding the port, then restart the joiner.

WSREP: Process completed with error: wsrep_sst_mariabackup ... Access denied for user 'sst_user'

The wsrep_sst_auth account is missing on the donor or under-privileged. Recreate it on a SYNCED node (Step 2) so it replicates, and verify RELOAD, PROCESS, LOCK TABLES, REPLICATION CLIENT, BINLOG MONITOR are all granted. A password mismatch between the donor’s account and the joiner’s wsrep_sst_auth triggers the same error.

Joiner falls back to full SST on every restart instead of fast IST

The node is leaving uncleanly (seqno: -1 in grastate.dat), typically from a SIGKILL or OOM rather than systemctl stop mariadb, or gcache.size is smaller than the write volume during the outage. A clean stop persists the last sequence number so the next start can use IST; enlarge gcache.size and always stop the service gracefully, as covered in Graceful Node Join and Leave Procedures.
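To check whether a stopped node recorded a usable position, you can read grastate.dat directly. The parser below is a minimal sketch with hypothetical function names, and the sample file uses a placeholder UUID. Two caveats: the file always shows seqno -1 while the server is running, so inspect it only after a stop, and a -1 does not by itself prove an SST will follow, because the service wrapper may still recover a position from InnoDB at startup.

```python
def parse_grastate(text: str) -> dict[str, str]:
    """Parse the key: value lines of a grastate.dat file."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def left_cleanly(text: str) -> bool:
    """True when the file records a usable seqno (anything but -1)."""
    try:
        return int(parse_grastate(text).get("seqno", "-1")) >= 0
    except ValueError:
        return False


SAMPLE = """# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 0
"""

if __name__ == "__main__":
    print(left_cleanly(SAMPLE))  # False
```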

SST aborts partway and the joiner is stuck in Joining

A transfer that fails mid-stream leaves a partial datadir and orphaned mariabackup artifacts. Stop the joiner, remove /var/lib/mysql/grastate.dat, clear the partial InnoDB and .sst temporary files, ensure adequate free disk, then restart to trigger a clean full SST. Decoding the exact log lines behind any of these symptoms is covered in Handling Galera Startup Errors & Logs.

Whole cluster pauses writes during a large SST

The donor’s receive queue backed up while it was feeding the transfer, and draining it pushed the queue past gcs.fc_limit, tripping flow control across the group. Pin a dedicated donor with wsrep_sst_donor so the busiest node is never selected, raise gcs.fc_limit toward 256+, and monitor wsrep_flow_control_paused during the transfer window.

Frequently Asked Questions

Why does a rejoining node keep doing a full SST instead of a fast IST?

IST replays only the write-sets a node missed while it was down, streamed from a donor’s GCache ring buffer. If gcache.size is smaller than the write volume produced during the outage, the required range ages out and the provider falls back to a full State Snapshot Transfer. Increase gcache.size to exceed peak write volume across your longest maintenance window, and always stop the node cleanly so its last seqno is persisted.

Which SST method should I use in production?

mariabackup for almost every deployment: it streams physical InnoDB tablespaces without blocking donor writes, so the donor stays available during the transfer. Reserve rsync for small datasets in a lab where a brief donor write-block is acceptable, and avoid mysqldump except for rare cross-version edge cases where a logical copy is unavoidable.

How do I stop a large SST from pausing the whole cluster?

Pin a low-latency, non-critical donor with wsrep_sst_donor="node," so the busiest node is never desynced, raise gcs.fc_limit to 256 or higher, and gate the join on a pre-flight check that a candidate donor’s wsrep_flow_control_paused is near zero before you start the transfer.

Related

Choosing the Right SST Method for Large Datasets — mariabackup vs rsync benchmarking and compression tuning
Bootstrapping Your First Galera Cluster — forming the first Primary Component before any node joins
Graceful Node Join and Leave Procedures — clean shutdown that keeps rejoins on the fast IST path
wsrep.cnf Configuration Deep Dive — full provider-option and flow-control parameter matrix
Automated Node Health Monitoring — polling the wsrep status variables that confirm a transfer completed
