Clustering and Sharing SCALE Volumes with TrueCommand

TrueCommand 3.0 has not passed validation for Clustering and that feature is expected to be highly unstable in this release. With the current unmaintained state of the upstream Gluster project, consider this functionality deprecated. The clustering feature is scheduled for removal in a future TrueCommand revision.

Further, TrueNAS SCALE 24.04 has removed the deprecated gluster backend. Systems installed with SCALE 24.04 (Dragonfish) are unable to use this deprecated TrueCommand feature.

Introduction

TrueNAS SCALE SMB clustering combines the benefits of the self-healing OpenZFS file system with the open-source Gluster scalable network file system.

TrueNAS SCALE SMB clustering requires a minimum of three TrueNAS SCALE nodes (systems), but you can scale it to a substantially higher number of physical nodes. SMB or Gluster clustering also requires a properly configured Active Directory environment. Gluster stores data in volumes, and each volume can hold multiple SMB shares. Volumes are stored across bricks, the basic unit of storage in the Gluster File System, on the individual servers.

Requirements

Software

  • Between 3 and 20 TrueNAS SCALE systems, all running the same release (22.12.0 or later)
  • A TrueCommand instance (cloud or on-premises) running 2.3.0 or later
  • An Active Directory (AD) environment with domain service roles, DNS roles, and reverse lookup zones configured.
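The node-count and release requirements above can be checked before starting. The following is a minimal sketch, assuming release strings are purely numeric and dot-separated (for example 22.12.1); the function names are hypothetical and not part of TrueNAS or TrueCommand.

```python
from typing import List, Sequence, Tuple

MIN_NODES = 3
MAX_NODES = 20
MIN_RELEASE = (22, 12, 0)


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn a release string such as '22.12.1' into a tuple of integers."""
    return tuple(int(part) for part in release.split("."))


def check_software_requirements(releases: Sequence[str]) -> List[str]:
    """Return a list of problems; an empty list means every check passed.

    `releases` holds one release string per SCALE system planned for the cluster.
    """
    problems = []
    if not MIN_NODES <= len(releases) <= MAX_NODES:
        problems.append(f"need {MIN_NODES} to {MAX_NODES} systems, found {len(releases)}")
    if len(set(releases)) > 1:
        problems.append("systems are not all on the same release")
    for release in sorted(set(releases)):
        if parse_release(release) < MIN_RELEASE:
            problems.append(f"release {release} is older than 22.12.0")
    return problems


print(check_software_requirements(["22.12.1", "22.12.1", "22.12.1"]))
```

An empty list means the planned systems meet the count and release requirements. The check does not cover TrueCommand or Active Directory.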

Hardware

Each TrueNAS SCALE system must have two network interfaces:

  • One network interface for SMB, AD, and TrueCommand traffic (static IP/DHCP reservation recommended)
  • One network interface for the node-to-node cluster traffic using static IP addresses (private network recommended)

Each TrueNAS SCALE system must also have:

  • A third IP address for the cluster VIP outside of DHCP range for users to access clustered shares.
  • One or more preconfigured storage pools with appropriate performance and parity
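As a rough illustration of the addressing requirements above, the sketch below checks one node's plan with the Python standard-library ipaddress module. All addresses and the DHCP range are made-up example values, and the function name is hypothetical.

```python
import ipaddress
from typing import List


def check_node_network(primary: str, cluster: str, vip: str,
                       dhcp_start: str, dhcp_end: str) -> List[str]:
    """Return a list of problems found in one node's address plan.

    `primary` and `cluster` are interface addresses in CIDR form,
    for example '192.168.0.11/24'. The other arguments are plain addresses.
    """
    problems = []
    primary_if = ipaddress.ip_interface(primary)
    cluster_if = ipaddress.ip_interface(cluster)
    vip_addr = ipaddress.ip_address(vip)

    # Node-to-node cluster traffic belongs on a different subnet than
    # the SMB, AD, and TrueCommand traffic.
    if primary_if.network.overlaps(cluster_if.network):
        problems.append("cluster interface shares a subnet with the primary interface")

    # The cluster VIP must sit outside the DHCP range.
    if ipaddress.ip_address(dhcp_start) <= vip_addr <= ipaddress.ip_address(dhcp_end):
        problems.append("VIP is inside the DHCP range")

    return problems


print(check_node_network("192.168.0.11/24", "10.0.0.11/24",
                         "192.168.0.200", "192.168.0.50", "192.168.0.150"))
```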

Warnings and Restrictions

SMB clusters created in TrueNAS SCALE Bluefin are not available for cluster expansion. TrueNAS SCALE Cobia plans to implement a method to enable new volumes for SMB cluster expansion. TrueNAS SCALE Dragonfish removed the gluster backend and these systems cannot be used for SMB clusters.

Clustering is considered experimental and should not be used in a production environment or for handling critical data!

Clustering is a back-end feature in TrueNAS SCALE. You should only configure clustering using the TrueCommand web interface. Configuring or managing clusters within the TrueNAS SCALE UI or Shell can result in cluster failures and permanent data loss.

Using the clustering feature on a SCALE system adds some restrictions to that system:

  • Activating clustering disables individual SMB share creation on cluster member TrueNAS systems.
  • A system can belong to only one cluster at a time.
  • Removing or migrating systems from a cluster requires deleting the entire cluster.

Cluster nodes (systems) must be on the same release of SCALE.

Supported cluster types are replicated, distributed, distributed replicated, and dispersed. Distributed dispersed clustering is not currently supported.

Setting up the Environment

Configuring the cluster feature is a multi-step process that spans multiple systems.

Setup Guide

TrueNAS SCALE Systems

Follow this procedure for each TrueNAS SCALE system you want to connect to TrueCommand and use in the cluster.

  1. Log in to the SCALE UI and go to Storage. Ensure a storage pool is available for use in the cluster. If not, click Create Pool and make a new pool using any available disks.

  2. Go to Network and look at the Interfaces card.

    • Ensure two interfaces are available. Note which is the primary interface that allows SCALE UI access and access between SCALE systems, TrueCommand, and Active Directory environments. Having two interfaces allows connecting the SCALE systems to Active Directory and using TrueCommand to create and manage the cluster.

    • Ensure the second interface has a static IP address on a different network/subnet that connects all the SCALE systems. This interface securely handles all the data-sharing traffic between the clustered systems.

    • Ensure the network has an IP address available. TrueCommand creates a virtual IP address (VIP) that allows access to the cluster. All cluster nodes (systems) use the same VIP.

    TrueNAS automatically adds entries to AD DNS for CTDB public IP addresses. Administrators should verify the addresses before joining AD to prevent configuration errors.
  3. Go to Shares and look at the Windows (SMB) Shares section. Take steps to ensure that disabling any critical shares is not disruptive.

Repeat this procedure for each SCALE system you want to add to the cluster.

Microsoft Active Directory

  1. Verify that the Active Directory (AD) environment to pair with the cluster is available and administratively accessible on the same network as the TrueCommand and TrueNAS SCALE systems.

  2. Log in to the Windows Server system and open the Server Manager. Click Tools > DNS to open the DNS Manager.

  3. Expand Reverse Lookup Zones on the left side menu, then select the Active Directory-Integrated Primary zone to use for the cluster.

    If no zone exists, see the Microsoft guide for creating DNS Zones.

  4. Click Action > New Pointer (PTR…) and configure a New Resource Record. Enter the SCALE system IP address and host name, then click OK.

Repeat this process for each system intended for clustering. The new records appear inside the zone as they save.
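To double-check the pointer records, it can help to compute the reverse-lookup name that each PTR record should carry. This is a small sketch using the Python standard-library ipaddress module; the host names and addresses are invented for illustration.

```python
import ipaddress


def ptr_record_name(ip: str) -> str:
    """Return the reverse-lookup (PTR) name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


# Hypothetical SCALE systems intended for the cluster.
nodes = {
    "scale-node1": "192.168.0.11",
    "scale-node2": "192.168.0.12",
    "scale-node3": "192.168.0.13",
}

for host, ip in nodes.items():
    print(f"{ptr_record_name(ip)} -> {host}")
```

Each printed name should correspond to one record in the reverse lookup zone.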

TrueCommand Container

If not already complete:

  1. Deploy TrueCommand 2.3.0 or later in a Docker container. The system used for the TrueCommand container cannot be any of the TrueNAS SCALE systems intended for the cluster.

  2. Enter the TrueCommand IP address in a browser, and create the first user. Log in with these user credentials to see the Dashboard.

  3. Click New System and add the credentials for the first SCALE system. Use the SCALE root account password. When ready, click ADD AND CONTINUE and repeat the process for each SCALE system intended for the cluster. When complete, each SCALE system has a card on the TrueCommand Dashboard that displays system statistics.

We recommend you back up the SCALE system configuration before creating the cluster. Backups allow users to quickly restore the system configuration to the initial working state if something goes wrong.

In the TrueCommand Dashboard, click on the name of a connected system to open a detailed view of that system. Click Config Backups and CREATE BACKUP to store the SCALE configuration file with TrueCommand.

Creating the Cluster

When the SCALE, AD, and TrueCommand environments are ready, log into TrueCommand to configure the cluster of SCALE systems.

Click the Clusters icon in the upper left. Click CREATE CLUSTER to see the cluster creation options.

Figure 5: Network Options for Clustered Systems
  1. Enter a unique name for the cluster, and then select the systems to include from the dropdown list. A list of SCALE systems displays.

  2. Open the Network Address dropdown for each system and choose the static IP address from the previously configured subnet dedicated to cluster traffic.

  3. Click NEXT, verify the settings, then click CREATE.

TrueCommand might take a while to create the cluster.

Configuring the Cluster

After creating the cluster, TrueCommand opens another sidebar to configure it for AD connectivity and SMB sharing.

Assigning the Virtual IPs (VIPs)

For each system:

Figure 6: Configure Cluster SMB Network
  1. Choose the IP address related to the primary subnet (typically the IP address you use to connect the SCALE system to TrueCommand).

  2. Click NEXT.

Assigning the Associate VIPs

For each system:

Figure 7: Configure Associate VIPs
  1. Select the interfaces to associate with the VIPs. You should select the interface configured for the SCALE system IP address.

  2. Click NEXT.

Entering Active Directory Credentials

Enter the Active Directory user credentials for the cluster:

Figure 8: Configure Cluster Active Directory Connection
  1. Enter the Microsoft Active Directory credentials.

  2. Click NEXT.

Confirming the Configuration

SMB service does not start if the cluster systems (nodes) are incorrectly configured!
Figure 9: Review and confirm
  1. Verify the connection details are correct.

  2. Click CONFIRM to configure the cluster, or click BACK to adjust the settings.

Creating a cluster has no visible effect on each SCALE web interface. To verify the cluster is created and active, open the SCALE Shell and enter gluster peer status. The command returns the list of SCALE IP addresses and current connection status.
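If you save the output of gluster peer status as text, a short script can flag peers that are not connected. The sketch below assumes the output uses Hostname, Uuid, and State lines for each peer. The sample text is illustrative only, with made-up addresses and UUIDs, and the parser is not part of TrueNAS or Gluster.

```python
from typing import Dict, List


def parse_peer_status(output: str) -> List[Dict[str, object]]:
    """Collect the hostname and state of each peer block in the output."""
    peers = []
    current = None
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or unrecognized line
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            current = {"hostname": value, "state": "", "connected": False}
            peers.append(current)
        elif key == "State" and current is not None:
            current["state"] = value
            current["connected"] = value.endswith("(Connected)")
    return peers


# Illustrative sample only; addresses and UUIDs are made up.
SAMPLE_OUTPUT = """\
Number of Peers: 2

Hostname: 10.0.0.12
Uuid: 11111111-1111-1111-1111-111111111111
State: Peer in Cluster (Connected)

Hostname: 10.0.0.13
Uuid: 22222222-2222-2222-2222-222222222222
State: Peer in Cluster (Disconnected)
"""

peers = parse_peer_status(SAMPLE_OUTPUT)
disconnected = [peer["hostname"] for peer in peers if not peer["connected"]]
print(disconnected)
```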

Creating Cluster Volumes

  1. In the TrueCommand Clusters screen, find the cluster to use and click CREATE VOLUME.

  2. Enter a unique name for the volume and select a Type.

    Current Volume Types

    Replicated

    Replicated volumes are the most similar to ZFS mirrors. They have exact copies of all data on all bricks. Since TrueNAS SCALE SMB cluster implementation requires a minimum of three nodes, a replicated volume has three identical copies of all data.

    A replicated volume can survive multiple brick failures. The data stays accessible as long as a single brick is still accessible. Replicated volumes excel in data reliability and redundancy at the cost of lower overall storage capacity.

    Distributed Replicated

    Distributed replicated volumes distribute files across replicated sets of bricks. You set the replica count during the initial volume configuration.

    Distributed replicated volumes require a minimum of three replicas to avoid potential issues with split-brain. The number of bricks must be a multiple of the replica count. The minimum number of nodes for this volume type is six since each replica set requires three nodes.

    Distributed replicated volumes are best when you need highly-available data with redundancy protection, although they scale poorly.

    TrueCommand currently allows distributed replicated volumes with two replicas. This unintended behavior can lead to potential data loss due to split-brain situations. We are working to resolve this in TC-2626.

    Dispersed

    Dispersed volumes are most similar to RAIDZ. Data is striped across the bricks with parity added. You configure the number of redundant bricks during volume creation. The number of parity bricks determines the number of bricks the cluster can lose without impacting volume operation.

  3. After selecting an option in Type, go to Brick Size and enter a value based on the available storage in the clustered pools and your storage requirements.

  4. Review the pools for each SCALE system in the cluster. If any system does not show the desired pool for this cluster volume, select it from the Pools dropdown.

  5. Click NEXT.

  6. Review the settings for the new volume and click CREATE.

TrueCommand adds new cluster volumes to the individual cluster cards on the Clusters screen.

The web interface for the individual SCALE systems does not show any datasets created for cluster volumes. To verify the volume was created, go to the Shell and enter gluster volume info all.
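The volume type rules described above can be turned into a quick planning aid. The sketch below is a rough estimate only: the type labels and function names are hypothetical, brick size is in whatever unit you choose, and Gluster applies its own limits on redundancy that this code does not reproduce.

```python
def replica_sets(brick_count: int, replica_count: int) -> int:
    """Number of replica sets in a distributed replicated volume."""
    if replica_count < 3:
        raise ValueError("use at least three replicas to avoid split-brain")
    if brick_count % replica_count != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return brick_count // replica_count


def usable_capacity(volume_type: str, brick_count: int, brick_size: float,
                    replica_count: int = 3, redundancy: int = 1) -> float:
    """Estimate usable space in the same unit as brick_size."""
    if volume_type == "replicated":
        data_bricks = 1  # every brick holds an identical copy
    elif volume_type == "distributed":
        data_bricks = brick_count  # no redundancy between bricks
    elif volume_type == "distributed_replicated":
        data_bricks = replica_sets(brick_count, replica_count)
    elif volume_type == "dispersed":
        # Only guards against nonsensical values.
        if not 0 < redundancy < brick_count:
            raise ValueError("redundancy must be between 1 and brick_count - 1")
        data_bricks = brick_count - redundancy
    else:
        raise ValueError(f"unknown volume type: {volume_type}")
    return data_bricks * brick_size


print(usable_capacity("replicated", 3, 100))
print(usable_capacity("distributed_replicated", 6, 100))
print(usable_capacity("dispersed", 3, 100, redundancy=1))
```

With three bricks of 100 units each, a replicated volume offers about 100 units and a dispersed volume with one redundant brick offers about 200. A distributed replicated volume with two replicas is rejected, matching the split-brain warning above.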

Sharing the Cluster Volume

To share a cluster volume, go to the TrueCommand Clusters screen, find the cluster card, and click on the desired cluster volume. Click CREATE SHARE.

Figure 11: Add Cluster Share
  1. Enter a unique name for the share.

  2. Select the ACL type to apply to the share from the ACL dropdown list.

    Current Options
    • POSIX_OPEN - Grants read, write, and execute permissions for all users.
    • POSIX_RESTRICTED - Grants read, write, and execute to owner and group, but not others. The template might optionally include the special-purpose builtin_users and builtin_administrators groups and, in Active Directory environments, the domain_users and domain_admins groups.

  3. (Optional) Select Readonly to prevent users from changing the cluster volume contents.

  4. Click CONFIRM to create the SMB share and make it immediately active.

The SMB share appears in the SCALE Shares > SMB section for each system in the cluster. Managing the share from the SCALE UI is not recommended.
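As a loose analogy for the two ACL options described above, the sketch below shows the classic permission bits that grant comparable access on a directory. This is an assumption made for illustration: the actual templates are ACLs and can contain extra group entries that mode bits cannot express, and the mapping here is hypothetical.

```python
import stat

# Hypothetical mapping from the ACL option names to comparable mode bits.
TEMPLATE_MODE_ANALOGY = {
    "POSIX_OPEN": 0o777,        # read, write, execute for everyone
    "POSIX_RESTRICTED": 0o770,  # read, write, execute for owner and group only
}


def describe(template: str) -> str:
    """Render the comparable directory permissions in ls-style notation."""
    mode = stat.S_IFDIR | TEMPLATE_MODE_ANALOGY[template]
    return stat.filemode(mode)


for name in TEMPLATE_MODE_ANALOGY:
    print(name, describe(name))
```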

Connecting to the Shared Volume

There are several ways to access an SMB share, but this article demonstrates using Windows 10 File Explorer. From a Windows 10 system:

  1. Connect to the same network as the clustering environment and open File Explorer.

  2. In the Navigation bar, clear the contents and enter \\ followed by the IP address or host name of one of the clustered SCALE systems. Press Enter.

  3. Enter the user name and password for an Active Directory user account when prompted. Enter the Active Directory system name followed by the user account name (for example: AD01\sampuser).

  4. Browse to the cluster volume folder to view or modify files.
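The address and login formats used in the steps above can be sketched as simple string helpers. The host and share names below are invented for the example; AD01\sampuser is the sample login from step 3.

```python
def unc_path(host: str, share: str = "") -> str:
    """Build the \\\\host or \\\\host\\share string typed into File Explorer."""
    path = "\\\\" + host
    if share:
        path += "\\" + share
    return path


def domain_login(domain: str, user: str) -> str:
    """Build the DOMAIN\\user form expected at the credential prompt."""
    return f"{domain}\\{user}"


print(unc_path("192.168.0.11"))
print(unc_path("scale-node1", "clustershare"))
print(domain_login("AD01", "sampuser"))
```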

Replacing Cluster Nodes

A node is a single TrueNAS storage system in a cluster.

Cluster node replacement only works if you are using TrueCommand 2.3 or later and SCALE 22.12.0 or later.

New replacement nodes must have the same hardware as the old node you are replacing. The old node must also have a configuration backup that is safe and accessible.

The method you use to replace a cluster node differs depending on whether or not the node has access to the data on the brick.

The Node Has Access to Brick Data

If replacing a node that still has access to the data on the brick, you must first install the same SCALE version on the replacement system (node).

After installing SCALE on the new system, log into the SCALE web UI and go to System Settings > General. Click Manage Configuration, then select Upload Config. Select the configuration file from the node you are replacing and click Upload.

After applying the configuration, the system reboots and uses the same configuration as the node you are replacing. The new system automatically joins the cluster and heals damaged data before returning to a healthy state.

The Node Does Not Have Access to Brick Data

If the node you are replacing does not have access to the data on the brick, you must first install the same SCALE version on the replacement system (node).

After installing SCALE on the new system, access the SCALE web UI and go to Storage. Create a pool with the same name as the pool on the node you are replacing.

Go to System Settings > Shell and enter midclt call gluster.peer.initiate_as_replacement poolname clustervolumename

Where:

  • poolname is the name of the pool you created.
  • clustervolumename is the name of the cluster volume you are currently using.
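To avoid typing mistakes, you can assemble the command text ahead of time. The sketch below only builds and prints the command line, it does not run anything. The pool name tank and volume name vol1 are placeholders, and the helper function is hypothetical.

```python
import shlex
from typing import List


def replacement_command(pool_name: str, cluster_volume_name: str) -> List[str]:
    """Return the argument list for the replacement command shown above."""
    if not pool_name or not cluster_volume_name:
        raise ValueError("both the pool name and the cluster volume name are required")
    return [
        "midclt", "call", "gluster.peer.initiate_as_replacement",
        pool_name, cluster_volume_name,
    ]


# Placeholder names; substitute the pool and cluster volume you actually use.
command_line = shlex.join(replacement_command("tank", "vol1"))
print(command_line)
```

Copy the printed line into the SCALE Shell on the replacement node.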

After the command succeeds, go to System Settings > General. Click Manage Configuration, then select Upload Config. Select the configuration file from the node you are replacing and click Upload.

After applying the configuration, the system reboots and uses the same configuration as the node you are replacing. The new system automatically joins the cluster and heals damaged data before returning to a healthy state.

Clustered Backup Strategies

TrueNAS Enterprise

TrueNAS Enterprise customers can contact iX Support to discuss their clustered backup strategy options.

Contacting iX Support

Customers who purchase iXsystems hardware or who want additional support must have a support contract to use iXsystems Support Services. The TrueNAS Community forums provide free support for users without an iXsystems Support contract.

Contact Method    Contact Options
Web               https://support.ixsystems.com
Email             support@ixsystems.com
Telephone         Monday - Friday, 6:00AM to 6:00PM Pacific Standard Time:
                  US-only toll-free: 1-855-473-7449 option 2
                  Local and international: 1-408-943-4100 option 2
Telephone         After Hours (24x7 Gold Level Support only):
                  US-only toll-free: 1-855-499-5131
                  International: 1-408-878-3140 (international calling rates apply)