I have been working on a number of enhancements to replication (against the 9.1 release code) with the aim of getting them incorporated upstream. Before I request a pull, however, I would appreciate community code review and testing, especially of error situations.
The following changes have been tested for several weeks and appear to be working as intended:
- Replication status is now shown on the Storage/ZFS Snapshots tab, distinguishing between replica, latest, new and in-progress transfers (part of ticket #778)
- Replication progress for in-flight transfers is shown as a % of the data to transfer
- If replication fails (for whatever reason), the replica snapshots are no longer ALL auto-expired on the replication server (if it is also performing snapshots); losing all of them would require a full resync (ticket #2115)
- Expired replica snapshots on the replica server (excluding the latest one) can be removed by running autosnap.py from cron. This is an alternative to keeping the primary and replica servers in sync (#388 appears to be a legacy ticket which can be closed)
- Also in testing, the ability to have multiple snapshot schedules on the same dataset is working as expected (e.g. hourly retained for 12 hours, daily for 5 days, etc.), ticket #1646. NB I think this was down to upstream changes, though I also stopped referencing a DB field which may become invalid in this use case
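To make the expiry rule above easier to review, here is a minimal standalone sketch of "remove expired replica snapshots, but never the latest one". It is not the code from autosnap.py: the `Snap` record, its fields and the `expired_replicas` name are all hypothetical, and it is written as plain modern Python rather than against the FreeNAS middleware.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Snap(NamedTuple):
    """Hypothetical snapshot record; not the FreeNAS data model."""
    name: str
    created: datetime
    keep_for: timedelta   # retention period of the schedule that made it
    is_replica: bool      # True if the snapshot carries 'replica' status


def expired_replicas(snaps: List[Snap], now: datetime) -> List[str]:
    """Return names of replica snapshots that may be destroyed.

    The most recent replica is always kept, even if its retention has
    passed, so an incremental send still has a common base.
    """
    replicas = sorted((s for s in snaps if s.is_replica),
                      key=lambda s: s.created)
    # Everything except the newest replica is a candidate for removal.
    candidates = replicas[:-1]
    return [s.name for s in candidates if s.created + s.keep_for <= now]
```

Keeping the newest replica regardless of age is the point of the rule: if replication stalls for longer than the retention period, the replica server still holds one snapshot that the primary can send an incremental against.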
Items outstanding:
- Amend the function which deletes a periodic ZFS replication so that it also removes the 'replica' status from snapshots on the replication server (essential)
- Refresh the ZFS snapshot screen every x secs to show updated replication progress (desirable)
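For anyone reviewing the progress display, the percentage shown for an in-flight transfer boils down to something like the sketch below. The function name and both parameters are hypothetical; how the branch actually obtains the bytes sent and the estimated total is not shown here.

```python
def progress_percent(bytes_sent: int, bytes_total: int) -> int:
    """Whole-number percentage of a transfer, clamped to 0..100.

    bytes_total is assumed to be an estimate, so the result is clamped
    in case more data is sent than was estimated.
    """
    if bytes_total <= 0:
        # No usable estimate yet: report 0 rather than divide by zero.
        return 0
    return max(0, min(100, bytes_sent * 100 // bytes_total))
```

Clamping matters because a size estimate can be slightly off, and a screen that refreshes every few seconds should not show 103%.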
Code is on the https://github.com/noprobs/freenas/tree/repl-progress branch. The following files have been changed (and should be copied to a test server):
gui/common/__init__.py
gui/freeadmin/api/resources.py
gui/middleware/notifier.py
gui/middleware/zfs.py
gui/templates/storage/snapshots.html
gui/tools/autosnap.py
gui/tools/autorepl.py
Feedback appreciated!
FYI, other development in progress/under consideration:
1) Allow the option to make a replication inactive (rather than deleting it)
2) Enable the primary and replication servers to keep snapshots for different time periods, e.g. keep for 1 day on the primary server and 5 days on the replica server
3) Improve control over replication bandwidth to allow e.g. 1Mbps at any time and 10Mbps overnight
4) Remove snapshots which have zero size (see fracai script)
To achieve the above I will need to create new fields in the SQLite DB. I have yet to investigate how to do this without messing up version upgrades - comments appreciated.
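As a starting point for discussion on item 3, time-dependent bandwidth limits could be modelled roughly as below. This is only a sketch under assumptions: the `Window` tuple, the `limit_for` name and the kbit/s unit are all hypothetical, and nothing here exists in the branch yet.

```python
from datetime import time
from typing import List, Tuple

# (start, end, limit in kbit/s); a window may wrap past midnight.
Window = Tuple[time, time, int]


def limit_for(now: time, windows: List[Window], default_kbps: int) -> int:
    """Return the bandwidth limit that applies at the given time of day."""
    for start, end, kbps in windows:
        if start <= end:
            inside = start <= now < end
        else:
            # Window wraps midnight, e.g. 22:00 to 06:00.
            inside = now >= start or now < end
        if inside:
            return kbps
    # No window matched: fall back to the "any time" limit.
    return default_kbps
```

With a single overnight window of 22:00 to 06:00 at 10000 kbit/s and a default of 1000 kbit/s, this gives the "1Mbps at any time and 10Mbps overnight" behaviour from the example in item 3.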