Performance drops overnight

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
I have TrueNAS running on an HP ProLiant MicroServer Gen8 with
2 x 3 TB WD Red drives (old ones) and 16 GB RAM.
Right after installing and starting the NAS I get 40-50 MB/s read and
write performance. While the system stays on, the speed drops to almost 0.

The system was installed on a Kingston SSD connected via a
DeLOCK 2-channel SATA 6Gb/s PCIe 3.0 controller (low profile, 600 MBps, model 90431).
Yesterday I disconnected the controller with the SSD and installed
the new TrueNAS 12.0-U8.1 on an internal USB stick connected to the motherboard.
Right after installation at 1 A.M. I had 40-50 MB/s.
By 10 A.M. this morning the performance had dropped to 3-4 MB/s.
After rebooting the NAS I had 30-40 MB/s again.
Can somebody help me and give a hint as to where I can find a solution
for this problem?

Thanks for helping.

Just checked again after an hour of uptime; it has already dropped to 10 MB/s.
 

LarsR

Guru
Joined
Oct 23, 2020
Messages
719
What model number are those WD Red drives?
It kind of looks like they are SMR drives... The first file transfers go to the ARC, and when the ARC is full, your drives choke.
After the reboot the ARC is empty and gets used again.
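You can read the model strings straight from the shell (a sketch; ada0/ada1 are just the typical device names on a MicroServer Gen8, adjust to yours):

# list all attached disks with vendor/model strings
camcontrol devlist

# identity block of a single disk: model, serial, rotation rate
smartctl -i /dev/ada0
smartctl -i /dev/ada1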
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
ada0: WDC WD30EFRX-68EUZN0
ada1: WDC WD30EFRX-68EUZN0

Should be a CMR drive:
[Attached image: WD-Red-NAS-CMR-SMR-pcgh.jpg]


root@truenas[~]# arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report Sat Apr 23 13:13:57 2022
FreeBSD 12.2-RELEASE-p14 zpl version 5
Machine: truenas.local (amd64) spa version 5000

ARC status: HEALTHY
Memory throttle count: 0

ARC size (current): 6.4 % 974.7 MiB
Target size (adaptive): 9.3 % 1.4 GiB
Min size (hard limit): 3.3 % 509.4 MiB
Max size (high water): 29:1 14.9 GiB
Most Frequently Used (MFU) cache size: 46.1 % 336.8 MiB
Most Recently Used (MRU) cache size: 53.9 % 393.1 MiB
Metadata cache size (hard limit): 75.0 % 11.2 GiB
Metadata cache size (current): 4.1 % 470.1 MiB
Dnode cache size (hard limit): 10.0 % 1.1 GiB
Dnode cache size (current): 11.4 % 130.3 MiB

ARC hash breakdown:
Elements max: 40.2k
Elements current: 94.3 % 37.9k
Collisions: 2.1k
Chain max: 2
Chains: 346

ARC misc:
Deleted: 56
Mutex misses: 0
Eviction skips: 5.4k

ARC total accesses (hits + misses): 10.1M
Cache hit ratio: 99.6 % 10.1M
Cache miss ratio: 0.4 % 38.2k
Actual hit ratio (MFU + MRU hits): 99.5 % 10.0M
Data demand efficiency: 96.8 % 332.8k
Data prefetch efficiency: 15.6 % 2.6k

Cache hits by cache type:
Most frequently used (MFU): 94.7 % 9.5M
Most recently used (MRU): 5.2 % 524.2k
Most frequently used (MFU) ghost: 0.0 % 0
Most recently used (MRU) ghost: 0.0 % 0
Anonymously used: 0.1 % 12.4k

Cache hits by data type:
Demand data: 3.2 % 322.0k
Demand prefetch data: < 0.1 % 405
Demand metadata: 96.7 % 9.7M
Demand prefetch metadata: 0.1 % 12.3k

Cache misses by data type:
Demand data: 28.3 % 10.8k
Demand prefetch data: 5.8 % 2.2k
Demand metadata: 48.4 % 18.5k
Demand prefetch metadata: 17.6 % 6.7k

DMU prefetch efficiency: 99.6k
Hit ratio: 13.0 % 13.0k
Miss ratio: 87.0 % 86.6k

L2ARC not detected, skipping section

Tunables:
abd_chunk_size 4096
abd_scatter_enabled 1
allow_redacted_dataset_mount 0
anon_data_esize 0
anon_metadata_esize 0
anon_size 262144
arc.average_blocksize 8192
arc.dnode_limit 0
arc.dnode_limit_percent 10
arc.dnode_reduce_percent 10
arc.evict_batch_limit 10
arc.eviction_pct 200
arc.grow_retry 0
arc.lotsfree_percent 10
arc.max 0
arc.meta_adjust_restarts 4096
arc.meta_limit 0
arc.meta_limit_percent 75
arc.meta_min 0
arc.meta_prune 10000
arc.meta_strategy 1
arc.min 0
arc.min_prefetch_ms 0
arc.min_prescient_prefetch_ms 0
arc.p_dampener_disable 1
arc.p_min_shift 0
arc.pc_percent 0
arc.prune_task_threads 1
arc.shrink_shift 0
arc.sys_free 0
arc_free_target 86693
arc_max 0
arc_min 0
arc_no_grow_shift 5
async_block_max_blocks 18446744073709551615
autoimport_disable 1
ccw_retry_interval 300
checksum_events_per_second 20
commit_timeout_pct 5
compressed_arc_enabled 1
condense.indirect_commit_entry_delay_ms 0
condense.indirect_obsolete_pct 25
condense.indirect_vdevs_enable 1
condense.max_obsolete_bytes 1073741824
condense.min_mapping_bytes 131072
condense_pct 200
crypt_sessions 0
dbgmsg_enable 1
dbgmsg_maxsize 4194304
dbuf.cache_shift 5
dbuf.metadata_cache_max_bytes 18446744073709551615
dbuf.metadata_cache_shift 6
dbuf_cache.hiwater_pct 10
dbuf_cache.lowater_pct 10
dbuf_cache.max_bytes 18446744073709551615
dbuf_state_index 0
ddt_data_is_special 1
deadman.checktime_ms 60000
deadman.enabled 1
deadman.failmode wait
deadman.synctime_ms 600000
deadman.ziotime_ms 300000
debug 0
debugflags 0
dedup.prefetch 0
default_bs 9
default_ibs 15
delay_min_dirty_percent 60
delay_scale 500000
dirty_data_max 1709219430
dirty_data_max_max 4273048576
dirty_data_max_max_percent 25
dirty_data_max_percent 10
dirty_data_sync_percent 20
disable_ivset_guid_check 0
dmu_object_alloc_chunk_shift 7
dmu_offset_next_sync 0
dmu_prefetch_max 134217728
dtl_sm_blksz 4096
flags 0
fletcher_4_impl [fastest] scalar superscalar superscalar4 sse2 ssse3
free_bpobj_enabled 1
free_leak_on_eio 0
free_min_time_ms 1000
history_output_max 1048576
immediate_write_sz 32768
initialize_chunk_size 1048576
initialize_value 16045690984833335022
keep_log_spacemaps_at_export 0
l2arc.feed_again 1
l2arc.feed_min_ms 200
l2arc.feed_secs 1
l2arc.headroom 2
l2arc.headroom_boost 200
l2arc.meta_percent 33
l2arc.mfuonly 0
l2arc.noprefetch 1
l2arc.norw 0
l2arc.rebuild_blocks_min_l2size 1073741824
l2arc.rebuild_enabled 0
l2arc.trim_ahead 0
l2arc.write_boost 8388608
l2arc.write_max 8388608
l2arc_feed_again 1
l2arc_feed_min_ms 200
l2arc_feed_secs 1
l2arc_headroom 2
l2arc_noprefetch 1
l2arc_norw 0
l2arc_write_boost 8388608
l2arc_write_max 8388608
l2c_only_size 0
livelist.condense.new_alloc 0
livelist.condense.sync_cancel 0
livelist.condense.sync_pause 0
livelist.condense.zthr_cancel 0
livelist.condense.zthr_pause 0
livelist.max_entries 500000
livelist.min_percent_shared 75
lua.max_instrlimit 100000000
lua.max_memlimit 104857600
max_async_dedup_frees 100000
max_auto_ashift 16
max_dataset_nesting 50
max_log_walking 5
max_logsm_summary_length 10
max_missing_tvds 0
max_missing_tvds_cachefile 2
max_missing_tvds_scan 0
max_nvlist_src_size 0
max_recordsize 1048576
metaslab.aliquot 524288
metaslab.bias_enabled 1
metaslab.debug_load 0
metaslab.debug_unload 0
metaslab.df_alloc_threshold 131072
metaslab.df_free_pct 4
metaslab.df_max_search 16777216
metaslab.df_use_largest_segment 0
metaslab.force_ganging 16777217
metaslab.fragmentation_factor_enabled 1
metaslab.fragmentation_threshold 70
metaslab.lba_weighting_enabled 1
metaslab.load_pct 50
metaslab.max_size_cache_sec 3600
metaslab.mem_limit 75
metaslab.preload_enabled 1
metaslab.preload_limit 10
metaslab.segment_weight_enabled 1
metaslab.sm_blksz_no_log 16384
metaslab.sm_blksz_with_log 131072
metaslab.switch_threshold 2
metaslab.unload_delay 32
metaslab.unload_delay_ms 600000
mfu_data_esize 196468736
mfu_ghost_data_esize 0
mfu_ghost_metadata_esize 0
mfu_ghost_size 0
mfu_metadata_esize 42200576
mfu_size 353112064
mg.fragmentation_threshold 95
mg.noalloc_threshold 0
min_auto_ashift 9
min_metaslabs_to_flush 1
mru_data_esize 294615040
mru_ghost_data_esize 0
mru_ghost_metadata_esize 0
mru_ghost_size 0
mru_metadata_esize 8054272
mru_size 412179968
multihost.fail_intervals 10
multihost.history 0
multihost.import_intervals 20
multihost.interval 1000
multilist_num_sublists 0
no_scrub_io 0
no_scrub_prefetch 0
nocacheflush 0
nopwrite_enabled 1
obsolete_min_time_ms 500
pd_bytes_max 52428800
per_txg_dirty_frees_percent 5
prefetch.array_rd_sz 1048576
prefetch.disable 0
prefetch.max_distance 8388608
prefetch.max_idistance 67108864
prefetch.max_streams 8
prefetch.min_sec_reap 2
read_history 0
read_history_hits 0
rebuild_max_segment 1048576
reconstruct.indirect_combinations_max 4096
recover 0
recv.queue_ff 20
recv.queue_length 16777216
recv.write_batch_size 1048576
reference_tracking_enable 0
removal_suspend_progress 0
remove_max_segment 16777216
resilver_disable_defer 0
resilver_min_time_ms 3000
scan_checkpoint_intval 7200
scan_fill_weight 3
scan_ignore_errors 0
scan_issue_strategy 0
scan_legacy 0
scan_max_ext_gap 2097152
scan_mem_lim_fact 20
scan_mem_lim_soft_fact 20
scan_strict_mem_lim 0
scan_suspend_progress 0
scan_vdev_limit 4194304
scrub_min_time_ms 1000
send.corrupt_data 0
send.no_prefetch_queue_ff 20
send.no_prefetch_queue_length 1048576
send.override_estimate_recordsize 0
send.queue_ff 20
send.queue_length 16777216
send.unmodified_spill_blocks 1
send_holes_without_birth_time 1
slow_io_events_per_second 20
spa.asize_inflation 24
spa.discard_memory_limit 16777216
spa.load_print_vdev_tree 0
spa.load_verify_data 1
spa.load_verify_metadata 1
spa.load_verify_shift 4
spa.slop_shift 5
space_map_ibs 14
special_class_metadata_reserve_pct 25
standard_sm_blksz 131072
super_owner 0
sync_pass_deferred_free 2
sync_pass_dont_compress 8
sync_pass_rewrite 2
sync_taskq_batch_pct 75
top_maxinflight 1000
traverse_indirect_prefetch_limit 32
trim.extent_bytes_max 134217728
trim.extent_bytes_min 32768
trim.metaslab_skip 0
trim.queue_limit 10
trim.txg_batch 32
txg.history 100
txg.timeout 5
unflushed_log_block_max 262144
unflushed_log_block_min 1000
unflushed_log_block_pct 400
unflushed_max_mem_amt 1073741824
unflushed_max_mem_ppm 1000
user_indirect_is_special 1
validate_skip 0
vdev.aggregate_trim 0
vdev.aggregation_limit 1048576
vdev.aggregation_limit_non_rotating 131072
vdev.async_read_max_active 3
vdev.async_read_min_active 1
vdev.async_write_active_max_dirty_percent 60
vdev.async_write_active_min_dirty_percent 30
vdev.async_write_max_active 5
vdev.async_write_min_active 1
vdev.bio_delete_disable 0
vdev.bio_flush_disable 0
vdev.cache_bshift 16
vdev.cache_max 16384
vdev.cache_size 0
vdev.def_queue_depth 32
vdev.default_ms_count 200
vdev.default_ms_shift 29
vdev.file.logical_ashift 9
vdev.file.physical_ashift 9
vdev.initializing_max_active 1
vdev.initializing_min_active 1
vdev.max_active 1000
vdev.max_auto_ashift 16
vdev.min_auto_ashift 9
vdev.min_ms_count 16
vdev.mirror.non_rotating_inc 0
vdev.mirror.non_rotating_seek_inc 1
vdev.mirror.rotating_inc 0
vdev.mirror.rotating_seek_inc 5
vdev.mirror.rotating_seek_offset 1048576
vdev.ms_count_limit 131072
vdev.nia_credit 5
vdev.nia_delay 5
vdev.queue_depth_pct 1000
vdev.read_gap_limit 32768
vdev.rebuild_max_active 3
vdev.rebuild_min_active 1
vdev.removal_ignore_errors 0
vdev.removal_max_active 2
vdev.removal_max_span 32768
vdev.removal_min_active 1
vdev.removal_suspend_progress 0
vdev.remove_max_segment 16777216
vdev.scrub_max_active 3
vdev.scrub_min_active 1
vdev.sync_read_max_active 10
vdev.sync_read_min_active 10
vdev.sync_write_max_active 10
vdev.sync_write_min_active 10
vdev.trim_max_active 2
vdev.trim_min_active 1
vdev.validate_skip 0
vdev.write_gap_limit 4096
version.acl 1
version.ioctl 15
version.module v2022012000-zfs_e870bd8cf
version.spa 5000
version.zpl 5
vnops.read_chunk_size 1048576
vol.mode 2
vol.recursive 0
vol.unmap_enabled 1
zap_iterate_prefetch 1
zevent.cols 80
zevent.console 0
zevent.len_max 512
zevent.retain_expire_secs 900
zevent.retain_max 2000
zfetch.max_distance 8388608
zfetch.max_idistance 67108864
zil.clean_taskq_maxalloc 1048576
zil.clean_taskq_minalloc 1024
zil.clean_taskq_nthr_pct 100
zil.maxblocksize 131072
zil.nocacheflush 0
zil.replay_disable 0
zil.slog_bulk 786432
zio.deadman_log_all 0
zio.dva_throttle_enabled 1
zio.exclude_metadata 0
zio.requeue_io_start_cut_in_line 1
zio.slow_io_ms 30000
zio.taskq_batch_pct 80
zio.taskq_batch_tpq 0
zio.use_uma 1

VDEV cache disabled, skipping section
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
Is there any chance to find out what's causing this slowdown?
Now I'm connected directly to the Netgear switch and get 115 MB/s.
But it goes down to a couple of MB/s.
Yesterday I had 115 MB/s on my laptop. I left the laptop open for
hours. After coming back I only had around 15 MB/s.
The web GUI is then also sluggish.
Now, again overnight, it dropped from 115 to 10 MB/s.
I just restarted TrueNAS and get 115 MB/s again.
I set up the users and SMB shares as shown here:
(was a link to a YouTube video from David McKone)
with a placeholder user and group for the home-folder dataset.

By the way, CPU load is between 1 and 15-20%.
10.6 GB RAM are free (only 0.9 GB used for the ZFS cache and 4.4 GB for services).
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Do you have SCRUB or SMART tests scheduled overnight?

Don't direct us to a YouTube video in order for us to know what you set up... it just seems like you're trying to get views on your channel, and it wastes our time finding the information needed to help you.
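You can check both from the shell (a sketch; replace "poolname" and the ada device names with your own):

# the "scan:" line shows whether a scrub is currently running or when the last one finished
zpool status poolname

# the SMART self-test log shows when long/short tests last ran on a disk
smartctl -l selftest /dev/ada0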
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
Okay, I set the users up with an SMB share as a home share, as shown by David McKone,
with a placeholder user and group for the home-folder dataset.
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
After disabling "Watch for Changes" in the Syncthing options, I had normal speed
for a couple of days, but this morning it was down again (10 MB/s). After rebooting TrueNAS
it works at 115 MB/s again.
Last night the scrub ran. Could that be the remaining problem?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Last night the scrub ran. Could that be the remaining problem?
Was the scrub finished? Maybe it was still running and contributing to the "slowness".

A scrub will stop on a reboot and doesn't continue after it.

You can also use zpool scrub -s poolname to stop a scrub in progress without a reboot.
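For example (assuming your pool is called "poolname"):

# the "scan:" line shows scrub progress or when the last scrub finished
zpool status poolname

# stop a scrub that is still in progress
zpool scrub -s poolname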
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
The scrub was finished.
I just tried it with a manual scrub:
before the scrub ~115 MB/s, during the scrub ~10 MB/s, after the scrub ~10 MB/s.
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
Can I find something in the reporting graphs?
I don't know what the ZFS graphs should look like.

I have now stopped all plugins to see if that helps.
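For reference, the same I/O pattern can also be watched live from the shell (a sketch; assumes the pool is named "tank"):

# per-vdev bandwidth and IOPS, refreshed every 5 seconds
zpool iostat -v tank 5

# per-disk busy percentage and latency (physical devices only)
gstat -p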
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
After 4 hours the speed is down from 115 MB/s to 10 MB/s, without any plugins or jails running.
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
It looks like this:
[Screenshot: truenas01.PNG]

and around every 5 seconds like this:
[Screenshot: truenas02.PNG]
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
It looks the same after a reboot, but it's 10-12 seconds instead of 5 seconds between the I/Os.
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
After 11 hours now, overnight, the performance is still up at 115 MB/s.
Really strange.
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
After 16 hours it's going down again -> 30 MB/s
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I did notice in your arc_summary:

Target size (adaptive): 9.3 % 1.4 GiB

That's a very small percentage of your RAM. I have a system here with 16 GB of RAM as well, which shows:

Target Size: (Adaptive) 77.18% 11.22 GiB

Do you perhaps have a memory leak? Take a screenshot of htop sorted by PERCENT_MEM (use F6 to select the sort field).
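You can also read the ARC counters and the biggest memory consumers directly from the shell (a sketch; the kstat names below are the standard OpenZFS sysctls on FreeBSD):

# current ARC size, adaptive target (c) and maximum (c_max), in bytes
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c kstat.zfs.misc.arcstats.c_max

# top 15 processes by resident memory, printed once (batch mode)
top -b -o res 15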
 

Nordlicht-13

Explorer
Joined
Apr 2, 2022
Messages
69
Here is my screenshot of htop sorted by PERCENT_MEM:
[Screenshot: truenas03.PNG]

How can I fix the memory leak?
 