Replacing 2 disks at the same time seems to operate in series, not in parallel.

Evan Richardson
Explorer | Joined Dec 11, 2015 | Messages: 76
I've got an array of 24 disks: 4x 6-drive RAIDZ2 vdevs. I'm in the process of expanding the pool by replacing 2 drives at a time, one in each vdev (so as not to lose more than 2 drives of redundancy at once). The replacement is fairly fast, around 12 hours per drive (8 TB -> 14 TB replacements). However, even though I replace two disks at once, it seems TrueNAS only replaces one at a time:

Code:
                                                    capacity     operations     bandwidth
pool                                              alloc   free   read  write   read  write
------------------------------------------------  -----  -----  -----  -----  -----  -----
freenas-boot                                      25.0G  33.5G      0      0      0      0
  ada0p2                                          25.0G  33.5G      0      0      0      0
------------------------------------------------  -----  -----  -----  -----  -----  -----
sirius                                             109T  98.1T  2.43K    195   738M   145M
  raidz2-0                                        25.4T  18.1T  2.40K    195   731M   145M
    da20p2                                            -      -    494      0   146M      0
    da22p2                                            -      -    557      0   146M      0
    gptid/ceaabe76-db25-11e7-a7b6-0025905decac        -      -    537      0   146M      0
    gptid/cf4ec403-db25-11e7-a7b6-0025905decac        -      -    557      0   146M      0
    gptid/cffae5f2-db25-11e7-a7b6-0025905decac        -      -    311      0   147M      0
    replacing-5                                       -      -      0    194      0   144M
      gptid/d0ac6387-db25-11e7-a7b6-0025905decac      -      -      0      0      0      0
      da4p2                                           -      -      0    194      0   144M
  raidz2-1                                        23.6T  19.9T      7      0  1.96M      0
    da21p2                                            -      -      0      0      0      0
    gptid/d97a286d-3359-11eb-b62b-90e2ba266fca        -      -      1      0   503K      0
    gptid/d313bbc9-db25-11e7-a7b6-0025905decac        -      -      1      0   503K      0
    da23p2                                            -      -      1      0   503K      0
    gptid/d461b238-db25-11e7-a7b6-0025905decac        -      -      1      0   503K      0
    replacing-5                                       -      -      0      0      0      0
      gptid/d51a1be0-db25-11e7-a7b6-0025905decac      -      -      0      0      0      0
      da6p2                                           -      -      0      0      0      0
  raidz2-2                                        25.2T  18.3T      3      0  1006K      0
    gptid/1875a1fb-4bd6-11e8-bf99-0025905decac        -      -      0      0   251K      0
    gptid/19249b47-4bd6-11e8-bf99-0025905decac        -      -      0      0   251K      0
    gptid/19f4a663-4bd6-11e8-bf99-0025905decac        -      -      0      0   251K      0
    gptid/1aad2fe0-4bd6-11e8-bf99-0025905decac        -      -      0      0   251K      0
    gptid/1b7e61a6-4bd6-11e8-bf99-0025905decac        -      -      0      0      0      0
    gptid/1c4fd8b3-4bd6-11e8-bf99-0025905decac        -      -      0      0      0      0
  raidz2-5                                        34.7T  41.7T     15      0  3.93M      0
    gptid/0edc7301-591a-11ec-99c7-90e2ba266fca        -      -      3      0  1006K      0
    gptid/0fa198ca-591a-11ec-99c7-90e2ba266fca        -      -      3      0  1006K      0
    gptid/1083ec3e-591a-11ec-99c7-90e2ba266fca        -      -      3      0  1006K      0
    gptid/10cdfbc8-591a-11ec-99c7-90e2ba266fca        -      -      3      0  1006K      0
    gptid/10f313e6-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/115b2ec3-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
logs                                                  -      -      -      -      -      -
  gptid/85e90938-a24a-11eb-bc80-90e2ba266fca         2M   260G      0      0      0      0


You can see that one disk from raidz2-0 and one disk from raidz2-1 are being replaced, started within 5 minutes of each other, but only the disk in raidz2-0 shows any activity (the activity from raidz2-1 is probably just from light use of the pool during the resilver).

A few years ago I worked for a cloud storage company where we used Solaris and ZFS, and we were able to trigger multiple rebuilds at once, in parallel. Even using the CLI, that does not seem to be the case with TrueNAS (I'd expect to see both vdevs chugging away at replacing their disks). Is this the default behavior, or can it be changed?
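
(Side note for anyone checking the same thing: zpool status shows both replacing-5 entries plus the pool-wide resilver scan line with overall progress and an ETA, which complements the zpool iostat view above.)

Code:
zpool status -v sirius
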

Edit:

The command I ran was
Code:
python3 replace_disk.py sirius gptid/d0ac6387-db25-11e7-a7b6-0025905decac da4


due to the current TrueNAS 13 bug with doing disk replacement via the UI.
 

Evan Richardson
Explorer | Joined Dec 11, 2015 | Messages: 76
The first drive finished resilvering, and the second drive then proceeded to ramp up:
Code:
                                                    capacity     operations     bandwidth
pool                                              alloc   free   read  write   read  write
------------------------------------------------  -----  -----  -----  -----  -----  -----
freenas-boot                                      25.0G  33.5G      0      0      0      0
  ada0p2                                          25.0G  33.5G      0      0      0      0
------------------------------------------------  -----  -----  -----  -----  -----  -----
sirius                                             109T  98.0T  1.88K    304   571M   114M
  raidz2-0                                        25.4T  18.1T      8      0  35.7K      0
    da20p2                                            -      -      1      0  7.94K      0
    da22p2                                            -      -      1      0  7.94K      0
    gptid/ceaabe76-db25-11e7-a7b6-0025905decac        -      -      0      0  3.97K      0
    gptid/cf4ec403-db25-11e7-a7b6-0025905decac        -      -      0      0  3.97K      0
    gptid/cffae5f2-db25-11e7-a7b6-0025905decac        -      -      0      0  3.97K      0
    da4p2                                             -      -      1      0  7.94K      0
  raidz2-1                                        23.6T  19.9T  1.87K    298   571M   114M
    da21p2                                            -      -    467      0   114M      0
    gptid/d97a286d-3359-11eb-b62b-90e2ba266fca        -      -    462      0   114M      0
    gptid/d313bbc9-db25-11e7-a7b6-0025905decac        -      -    132      0   114M      0
    da23p2                                            -      -    451      0   114M      0
    gptid/d461b238-db25-11e7-a7b6-0025905decac        -      -    398      0   114M      0
    replacing-5                                       -      -      0    298      0   114M
      gptid/d51a1be0-db25-11e7-a7b6-0025905decac      -      -      0      0      0      0
      da6p2                                           -      -      0    298      0   114M
  raidz2-2                                        25.2T  18.3T      2      0  11.9K      0
    gptid/1875a1fb-4bd6-11e8-bf99-0025905decac        -      -      0      0  3.97K      0
    gptid/19249b47-4bd6-11e8-bf99-0025905decac        -      -      0      0  3.97K      0
    gptid/19f4a663-4bd6-11e8-bf99-0025905decac        -      -      0      0  3.97K      0
    gptid/1aad2fe0-4bd6-11e8-bf99-0025905decac        -      -      0      0      0      0
    gptid/1b7e61a6-4bd6-11e8-bf99-0025905decac        -      -      0      0      0      0
    gptid/1c4fd8b3-4bd6-11e8-bf99-0025905decac        -      -      0      0      0      0
  raidz2-5                                        34.7T  41.7T      0      0      0      0
    gptid/0edc7301-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/0fa198ca-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/1083ec3e-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/10cdfbc8-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/10f313e6-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
    gptid/115b2ec3-591a-11ec-99c7-90e2ba266fca        -      -      0      0      0      0
logs                                                  -      -      -      -      -      -
  gptid/85e90938-a24a-11eb-bc80-90e2ba266fca      2.69M   260G      0      5      0  23.8K
------------------------------------------------  -----  -----  -----  -----  -----  -----



This is not very efficient, since it takes roughly twice as long: at ~12 hours per drive, each pair of replacements takes ~24 hours in series instead of ~12 in parallel. I'd expect both vdevs to resilver at the same time.
 

Etorix
Wizard | Joined Dec 30, 2020 | Messages: 2,134
The command I ran was
Code:
python3 replace_disk.py sirius gptid/d0ac6387-db25-11e7-a7b6-0025905decac da4


due to the current TrueNAS 13 bug with doing disk replacement via the UI.
Even without the GUI, why call a python script rather than zpool replace by gptid?
 

Evan Richardson
Explorer | Joined Dec 11, 2015 | Messages: 76
Even without the GUI, why call a python script rather than zpool replace by gptid?
I don't understand the question. You can obviously use zpool replace directly, but:

1.) iXsystems has always recommended using the GUI for this kind of operation,
2.) TrueNAS's own release notes (https://www.truenas.com/docs/core/corereleasenotes/) note an issue with disk replacement and say to use the CLI method instead, specifically pointing at the python script (https://www.truenas.com/docs/core/corereleasenotes/#cli-disk-replacements), and
3.) since TrueNAS layers a bunch of stuff on top of ZFS/FreeBSD, I have no easy way of telling whether running raw CLI commands breaks something else (for example, the GUI may record state in a database that a bare zpool replace would not).
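
For context, the bare-ZFS route being suggested would look roughly like the line below (illustrative only, not what I ran; it also assumes the new disk has already been partitioned the way TrueNAS lays disks out, typically a small swap partition plus the data partition, which is part of what the middleware handles):

Code:
# raw ZFS replacement by gptid, bypassing the TrueNAS middleware entirely
zpool replace sirius gptid/d0ac6387-db25-11e7-a7b6-0025905decac /dev/da4p2
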

In general, if you post here saying "I ran x y z" instead of following the recommended procedures, you're less likely to get help, so following the TrueNAS release notes is probably the correct approach, and that's how you arrive at the python script, which is:

Code:
#!/usr/bin/env python3
import argparse
import sys

from middlewared.client import Client


def main():

    parser = argparse.ArgumentParser()
    parser.add_argument('pool')
    parser.add_argument('label')
    parser.add_argument('disk')
    parser.add_argument('passphrase', nargs='?')

    args = parser.parse_args()

    with Client() as c:
        assert '13.0-RELEASE' in c.call('system.version')
        pool = c.call('pool.query', [['name', '=', args.pool]])
        if not pool:
            print('Pool not found.')
            sys.exit(1)

        disk = list(filter(lambda x: x['devname'] == args.disk, c.call('disk.get_unused')))

        if not disk:
            print('Unused disk not found.')
            sys.exit(1)

        arg = {'label': args.label, 'disk': disk[0]['identifier']}

        if pool[0]['encrypt'] == 2:
            if not args.passphrase:
                print('Passphrse required.')
                sys.exit(1)
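            # NB: an observation on the script as quoted, not part of the original: the next line
            # assigns into the argparse Namespace ('args') rather than the request dict ('arg'),
            # so a passphrase-encrypted pool would hit a TypeError here.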
            args['passphrase'] = args.passphrase

        c.call('pool.replace', pool[0]['id'], arg, job=True)
        print('Replace initiated.')


if __name__ == '__main__':
    main()


Now, there is a separate issue with this: whether invoked from the GUI (in 12.x) or via the CLI, the script, while successfully starting the disk replacement, always ends up bombing out with an error anyway.
 

Etorix
Wizard | Joined Dec 30, 2020 | Messages: 2,134
Oh! Thanks for the heads up.
I will not even consider moving to TN13 before 13-U3 (or rather 13-U3.1…) and had not read the release notes yet, so using a script with a device name rather than a gptid looked wrong.

As the officially recommended procedure does not work as one would legitimately expect, I suggest filing a JIRA ticket.
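
(If you do file one, it's worth attaching the traceback from the failed script run; on a stock CORE install the middleware log is typically /var/log/middlewared.log, so something like this should capture the relevant tail:)

Code:
tail -n 100 /var/log/middlewared.log
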
 