docs: update after making consistent_cluster_management mandatory

We remove Raft documentation that is no longer relevant in 5.5.

One of the changes is removing a part of the "Enabling Raft" section in raft.rst. Since Raft is mandatory in 5.5, the only way to enable it in this version is by performing a rolling upgrade from 5.4, so this is the only case that needs to be well-documented. In particular, we remove information that also appears in the upgrade guides, such as verifying schema synchronization.

Similarly, we remove a sentence from the "Manual Recovery Procedure" section in handling-node-failures.rst because it mentions enabling Raft manually, which is impossible in 5.5.

The rest of the changes just remove information about checking or setting consistent_cluster_management, which has become unused.
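
Since the option is described above as unused, it may be handy to spot leftover ``consistent_cluster_management`` entries in existing ``scylla.yaml`` files. The following is only a minimal sketch, not part of this commit or of ScyllaDB: the helper name ``find_unused_option`` is hypothetical, and it assumes the option is written as a plain top-level ``key: value`` line.

```python
import re

# Hypothetical helper, not part of ScyllaDB: scan the text of a scylla.yaml
# file for lines that still set the option this commit describes as unused.
OPTION = "consistent_cluster_management"


def find_unused_option(yaml_text, option=OPTION):
    """Return (line_number, stripped_line) pairs where `option` is set.

    Commented-out lines are skipped because the pattern requires the
    option name to be the first non-whitespace token on the line.
    """
    pattern = re.compile(r"^\s*" + re.escape(option) + r"\s*:")
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if pattern.match(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumed sample content; a real file would be read from /etc/scylla/scylla.yaml.
    sample = (
        "cluster_name: demo\n"
        "# consistent_cluster_management: false\n"
        "consistent_cluster_management: true\n"
    )
    for number, line in find_unused_option(sample):
        print(f"line {number}: {line}")
```

This mirrors the ``grep consistent_cluster_management /etc/scylla/scylla.yaml`` check that the diff removes from the procedures, except that it ignores commented-out lines.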
@@ -35,50 +35,26 @@ of the DCs is down.
Enabling Raft
---------------

.. note::
In ScyllaDB 5.2 and ScyllaDB Enterprise 2023.1 Raft is Generally Available and can be safely used for consistent schema management.
It will get enabled by default when you upgrade your cluster to ScyllaDB 5.4 or 2024.1.
If needed, you can explicitly prevent it from getting enabled upon upgrade.

.. only:: opensource

See :doc:`the upgrade guide from 5.2 to 5.4 </upgrade/index>` for details.

ScyllaDB Open Source 5.2 and later, and ScyllaDB Enterprise 2023.1 and later come equipped with a procedure that can setup Raft-based consistent cluster management in an existing cluster. We refer to this as the **Raft upgrade procedure** (do not confuse with the :doc:`ScyllaDB version upgrade procedure </upgrade/index/>`).

.. warning::
Once enabled, Raft cannot be disabled on your cluster. The cluster nodes will fail to restart if you remove the Raft feature.
In ScyllaDB Open Source 5.5 and ScyllaDB Enterprise 2024.2 Raft is mandatory.

To enable Raft in an existing cluster, you need to enable the ``consistent_cluster_management`` option in the ``scylla.yaml`` file
for **each node** in the cluster:
When all the nodes in the cluster are upgraded to ScyllaDB Open Source 5.5 or ScyllaDB Enterprise 2024.2, the cluster will start the **Raft upgrade procedure**.

#. Ensure that the schema is synchronized in the cluster by executing :doc:`nodetool describecluster </operating-scylla/nodetool-commands/describecluster>` on each node and ensuring that the schema version is the same on all nodes.
#. Perform a :doc:`rolling restart </operating-scylla/procedures/config-change/rolling-restart/>`, updating the ``scylla.yaml`` file for **each node** in the cluster before restarting it to enable the ``consistent_cluster_management`` option:
.. only:: opensource

.. code-block:: yaml
See :doc:`the upgrade guide from 5.4 to 5.5 </upgrade/index>` for details.

consistent_cluster_management: true

When all the nodes in the cluster and updated and restarted, the cluster will start the **Raft upgrade procedure**.
**You must then verify** that the Raft upgrade procedure has finished successfully. Refer to the :ref:`next section <verify-raft-procedure>`.

Alternatively, you can enable the ``consistent_cluster_management`` option when you are:

* Performing a rolling upgrade from version 5.1 to 5.2 or version 2022.x to 2023.1 by updating ``scylla.yaml`` before restarting each node. The Raft upgrade procedure will start as soon as the last node was upgraded and restarted. As above, this requires :ref:`verifying <verify-raft-procedure>` that the procedure successfully finishes.
* Creating a new cluster. This does not use the Raft upgrade procedure; instead, Raft is functioning in the cluster and managing schema right from the start.

Until all nodes are restarted with ``consistent_cluster_management: true``, it is still possible to turn this option back off. Once enabled on every node, it must remain turned on (or the node will refuse to restart).
.. warning::
Once enabled, Raft cannot be disabled on your cluster.

.. _verify-raft-procedure:

Verifying that the Raft upgrade procedure finished successfully
========================================================================

The Raft upgrade procedure starts as soon as every node in the cluster restarts with ``consistent_cluster_management`` flag enabled in ``scylla.yaml``.

.. TODO: update the above sentence once 5.3 and later are released.

The procedure requires **full cluster availability** to correctly setup the Raft algorithm; after the setup finishes, Raft can proceed with only a majority of nodes, but this initial setup is an exception.
The Raft upgrade procedure requires **full cluster availability** to correctly setup the Raft algorithm; after the setup finishes, Raft can proceed with only a majority of nodes, but this initial setup is an exception.
An unlucky event, such as a hardware failure, may cause one of your nodes to fail. If this happens before the Raft upgrade procedure finishes, the procedure will get stuck and your intervention will be required.

To verify that the procedure finishes, look at the log of every Scylla node (using ``journalctl _COMM=scylla``). Search for the following patterns:

@@ -3,7 +3,6 @@
* endpoint_snitch - ``grep endpoint_snitch /etc/scylla/scylla.yaml``
* Scylla version - ``scylla --version``
* Authenticator - ``grep authenticator /etc/scylla/scylla.yaml``
* consistent_cluster_management - ``grep consistent_cluster_management /etc/scylla/scylla.yaml``

.. Note::

@@ -119,7 +119,6 @@ Add New DC
* **listen_address** - IP address that Scylla used to connect to the other Scylla nodes in the cluster.
* **endpoint_snitch** - Set the selected snitch.
* **rpc_address** - Address for client connections (Thrift, CQL).
* **consistent_cluster_management** - set to the same value as used by your existing nodes.

The parameters ``seeds``, ``cluster_name`` and ``endpoint_snitch`` need to match the existing cluster.

@@ -59,8 +59,6 @@ Procedure

* **seeds** - Specifies the IP address of an existing node in the cluster. The new node will use this IP to connect to the cluster and learn the cluster topology and state.

* **consistent_cluster_management** - set to the same value as used by your existing nodes.

.. note::

In earlier versions of ScyllaDB, seed nodes assisted in gossip. Starting with Scylla Open Source 4.3 and Scylla Enterprise 2021.1, the seed concept in gossip has been removed. If you are using an earlier version of ScyllaDB, you need to configure the seeds parameter in the following way:

@@ -70,7 +70,6 @@ the file can be found under ``/etc/scylla/``
- **listen_address** - IP address that the Scylla use to connect to other Scylla nodes in the cluster
- **endpoint_snitch** - Set the selected snitch
- **rpc_address** - Address for client connection (Thrift, CQLSH)
- **consistent_cluster_management** - ``true`` by default, can be set to ``false`` if you don't want to use Raft for consistent schema management in this cluster (will be mandatory in later versions). Check the :doc:`Raft in ScyllaDB document</architecture/raft/>` to learn more.

3. In the ``cassandra-rackdc.properties`` file, edit the rack and data center information.
The file can be found under ``/etc/scylla/``.

@@ -26,7 +26,6 @@ The file can be found under ``/etc/scylla/``
- **listen_address** - IP address that Scylla used to connect to other Scylla nodes in the cluster
- **endpoint_snitch** - Set the selected snitch
- **rpc_address** - Address for client connection (Thrift, CQL)
- **consistent_cluster_management** - ``true`` by default, can be set to ``false`` if you don't want to use Raft for consistent schema management in this cluster (will be mandatory in later versions). Check the :doc:`Raft in ScyllaDB document</architecture/raft/>` to learn more.

3. This step needs to be done **only** if you are using the **GossipingPropertyFileSnitch**. If not, skip this step.
In the ``cassandra-rackdc.properties`` file, edit the parameters listed below.

@@ -63,7 +63,6 @@ Perform the following steps for each node in the new cluster:
* **rpc_address** - Address for client connection (Thrift, CQL).
* **broadcast_address** - The IP address a node tells other nodes in the cluster to contact it by.
* **broadcast_rpc_address** - Default: unset. The RPC address to broadcast to drivers and other Scylla nodes. It cannot be set to 0.0.0.0. If left blank, it will be set to the value of ``rpc_address``. If ``rpc_address`` is set to 0.0.0.0, ``broadcast_rpc_address`` must be explicitly configured.
* **consistent_cluster_management** - ``true`` by default, can be set to ``false`` if you don't want to use Raft for consistent schema management in this cluster (will be mandatory in later versions). Check the :doc:`Raft in ScyllaDB document</architecture/raft/>` to learn more.

#. After you have installed and configured Scylla and edited ``scylla.yaml`` file on all the nodes, start the node specified with the ``seeds`` parameter. Then start the rest of the nodes in your cluster, one at a time, using
``sudo systemctl start scylla-server``.

@@ -29,7 +29,6 @@ Login to one of the nodes in the cluster with (UN) status, collect the following
* seeds - ``cat /etc/scylla/scylla.yaml | grep seeds:``
* endpoint_snitch - ``cat /etc/scylla/scylla.yaml | grep endpoint_snitch``
* Scylla version - ``scylla --version``
* consistent_cluster_management - ``grep consistent_cluster_management /etc/scylla/scylla.yaml``

Procedure
---------

@@ -72,8 +72,6 @@ Procedure

- **rpc_address** - Address for client connection (Thrift, CQL)

- **consistent_cluster_management** - set to the same value as used by your existing nodes.

#. Add the ``replace_node_first_boot`` parameter to the ``scylla.yaml`` config file on the new node. This line can be added to any place in the config file. After a successful node replacement, there is no need to remove it from the ``scylla.yaml`` file. (Note: The obsolete parameters "replace_address" and "replace_address_first_boot" are not supported and should not be used). The value of the ``replace_node_first_boot`` parameter should be the Host ID of the node to be replaced.

For example (using the Host ID of the failed node from above):

@@ -1,16 +1,6 @@
Handling Node Failures
------------------------

.. note::

This page applies to ScyllaDB clusters that use Raft to ensure consistency.
You can verify that Raft-based consistent management is enabled for your
cluster in the ``scylla.yaml`` file (enabled by default):
``consistent_cluster_management: true``

.. REMOVE IN FUTURE VERSIONS - Remove the above note when Raft is mandatory
and default for both new and existing clusters.

ScyllaDB relies on the Raft consensus algorithm, which requires at least a quorum
of nodes in a cluster to be available. If one or more nodes are down, but the quorum
is live, reads, writes, and schema updates proceed unaffected. When the node that
@@ -81,9 +71,7 @@ You can follow the manual recovery procedure when:

* The majority of nodes (for example, 2 out of 3) failed and are irrecoverable.
* :ref:`The Raft upgrade procedure <verify-raft-procedure>` got stuck because one
of the nodes failed in the middle of the procedure and is irrecoverable. This
may occur in existing clusters where Raft was manually enabled.
See :ref:`Enabling Raft <enabling-raft-existing-cluster>` for details.
of the nodes failed in the middle of the procedure and is irrecoverable.

.. warning::
