References this thread:
https://forums.freenas.org/index.php?threads/limiting-the-scope-of-swap-pool.73117/#post-506799
I have been running minio for five days, and suddenly the server is running out of swap space. I thought it was because of the USB disks I had attached to this server, but removing them did not solve the issue. Rebooting does not help; the whole system all but freezes after just a few minutes.
I mentioned the USB disks there; they are USB-attached hard drives, not flash drives, and I have removed them.
There was no swap space on the FreeNAS boot drive, so that was OK.
I added a 64 GB SSD for system data.
(The system dataset used to be on one of the USB drives, but after seeing many USB-related errors on other systems, especially with USB3 disks, I wanted to get rid of USB drives altogether, apart from boot disks.) It is an annoyance that FreeBSD does not support USB3 well. Being limited to ca. 25 MB/s transfers for removable storage was OK in the 2000s, but USB3 has been around since 2008 and should be supported by now. Disk speeds of 2.5 inch portable drives are around 100 MB/s... That is one of the reasons why I ordered two Chelsio 10 Gbit cards. They should do even better, but that is not a portable solution. /end of one of these rants
(Yes, I know it's free. Free as in beer. But it's only free when your time is worthless. Mine is not. Starting to get annoyed.)
System:
FreeNAS 11.2 U1
Quad-core Xeon E3-1245 (should be OK, right?)
24 GB of RAM (should be plenty, right?)
8 disks in RAIDZ2 on a SAS/SATA controller
Of 58 TB of disk space, [only] 1.5 TB has been written.
[Oddly, in the legacy web interface, 38.9 TiB is shown as free, and in the modern Dashboard, 56.5 TiB.]
Now the system runs for only a few minutes before it runs out of RAM and swap space.
After running for a few minutes, top looks like this:
last pid: 5654; load averages: 1.11, 1.14, 1.09 up 0+00:42:05 18:53:31
61 processes: 1 running, 54 sleeping, 1 zombie, 5 waiting
CPU: 0.2% user, 0.0% nice, 12.8% system, 0.0% interrupt, 86.9% idle
Mem: 20G Active, 68K Inact, 2611M Wired, 98M Free
ARC: 1622M Total, 1366M MFU, 215M MRU, 96K Anon, 21M Header, 18M Other
893M Compressed, 1589M Uncompressed, 1.78:1 Ratio
Swap: 8192M Total, 8192M Used, 0K Free, 100% Inuse
PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME WCPU COMMAND
226 root 33 20 0 230M 105M 0K uwait 4 0:36 1.54% python3.6
3012 root 22 20 0 152M 24776K 0K umtxn 3 0:11 0.53% collectd
2786 minio 63 52 0 28458M 20820M 0K uwait 2 4:45 0.15% minio
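A rough sanity check on those numbers (top reports sizes in MiB): minio's resident set of 20820M is about 85% of the 24 GiB of RAM, and its virtual size approaches the machine's entire RAM + swap, so minio alone accounts for essentially all the memory pressure. A quick back-of-the-envelope in Python:

```python
# Figures copied from the top output above (top reports these sizes in MiB).
ram_mib = 24 * 1024          # 24 GiB of physical RAM
swap_mib = 8192              # 8 GiB swap, shown 100% in use
minio_res_mib = 20820        # minio resident set size (RES column)
minio_size_mib = 28458       # minio virtual size (SIZE column)

# Share of physical RAM held by minio's resident pages alone.
res_share = minio_res_mib / ram_mib
print(f"minio RES is {res_share:.0%} of RAM")  # ~85%

# minio's virtual size against everything the machine has (RAM + swap).
total_mib = ram_mib + swap_mib
size_share = minio_size_mib / total_mib
print(f"minio SIZE is {size_share:.0%} of RAM+swap")  # ~87%
```

With only ~2.6 GiB wired by the kernel/ARC on top of that, a process this size leaves the box nothing to work with.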
This looks a lot like this thread:
https://github.com/minio/minio/issues/6490
There minio was running on Windows, and the problem was solved by giving it access to a whole disk (drive letter). (WTF?)
I tried something like that and shared the whole data pool with minio, not just the miniodata directory in it. [I had created that directory ("volume", as the Legacy interface calls it) in the Legacy interface.]
Now it is doing this:
last pid: 17651; load averages: 0.22, 0.16, 0.14 up 0+02:36:19 20:47:45
45 processes: 1 running, 44 sleeping
CPU: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.5% idle
Mem: 1325M Active, 297M Inact, 13G Wired, 8780M Free
ARC: 8623M Total, 5918M MFU, 908M MRU, 352K Anon, 148M Header, 1646M Other
1665M Compressed, 6380M Uncompressed, 3.83:1 Ratio
Swap: 8192M Total, 407M Used, 7785M Free, 4% Inuse
PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME WCPU COMMAND
12704 root 1 21 0 1196M 1193M 0K zio->i 2 1:18 2.66% winacl
11722 root 1 20 0 7944K 3576K 0K CPU5 5 0:01 0.05% top
2641 root 1 20 0 29312K 5424K 0K select 2 0:02 0.03% nmbd
2776 nobody 1 20 0 6932K 2476K 0K select 6 0:00 0.03% mdnsd
So I guess it is changing ACLs. I'm not sure that was the right move.
Bug reported here: https://redmine.ixsystems.com/issues/71875
Edit: a couple of hours later, winacl is still running... what is it doing? All files are already owned by minio.
I also reported this issue upstream: https://github.com/minio/minio/issues/6490
If and when winacl finishes, I will test again, but I do not expect it to be the solution.
I guess I will have to revert to the former S3 setup (sharing the directory within the data pool with minio, not the whole pool), because I think the minio buckets will not be accessible after the directory structure has changed.
The next thing I will try is installing a 256 GB SSD and setting the default swap size (for new drives) to e.g. 200 GB, which will create a 200 GB swap partition on that disk. Not sure whether that will help.
(You see... I am impatient, and usually I will have found a solution before anyone else has thought about it, with all the gurus running around. Maybe simply throwing more RAM and swap space at it solves it. Still, it sounds ridiculous that serving some files should take tens of GB of RAM.)
See? I have created a nice few pages of prose for
- The newbies to agree with
- The "believers" to shake their heads at
- Some of you to laugh at
EDIT
winacl finished, but, as I feared, all buckets are now subdirectories of one big bucket. I reverted to sharing the directory.
Will see what that brings.
EDIT
The buckets are in place again.
I have now added a 200 GB swap partition. It shows up as 185 G.
Immediately after minio starts, it begins eating CPU and RAM.
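The missing ~15 G is almost certainly just decimal vs. binary units: the 200 GB setting counts in GB (10^9 bytes), while the tools report GiB (2^30 bytes). A quick check:

```python
# 200 GB (decimal, 10**9 bytes) expressed in GiB (binary, 2**30 bytes).
partition_bytes = 200 * 10**9
gib = partition_bytes / 2**30
print(f"{gib:.1f} GiB")  # ~186.3 GiB; minus partition alignment/overhead, ~185 G as shown
```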
last pid: 91747; load averages: 5.95, 5.47, 5.14 up 0+13:41:53 07:46:26
46 processes: 1 running, 45 sleeping
CPU: 28.4% user, 0.0% nice, 50.8% system, 0.0% interrupt, 20.7% idle
Mem: 10G Active, 350M Inact, 16M Laundry, 6986M Wired, 5636M Free
ARC: 3953M Total, 1913M MFU, 1789M MRU, 128K Anon, 99M Header, 108M Other
2947M Compressed, 7430M Uncompressed, 2.52:1 Ratio
Swap: 185G Total, 374M Used, 184G Free
PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME WCPU COMMAND
86081 minio 26 52 0 10584M 10590M 0K uwait 3 199:02 609.67% minio
16100 root 20 44 0 234M 171M 0K kqread 2 3:56 0.36% python3.6
5749 root 1 20 0 7944K 2960K 0K CPU3 3 0:26 0.11% top
4853 root 1 20 0 29312K 5348K 0K select 2 0:07 0.01% nmbd
(One of the buckets has 4 million files in it. I suspect minio just can't handle that...)
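If the object count is the problem, the arithmetic is at least plausible: a server that keeps per-object metadata in memory pays for it linearly in the number of objects. Purely as an illustration (the bytes-per-object figure below is a guess for the sake of the estimate, not a measured minio number):

```python
# Hypothetical back-of-the-envelope: memory cost of holding per-object
# metadata in RAM for a bucket with 4 million objects. The per-object
# size is an assumption for illustration only.
objects = 4_000_000
bytes_per_object = 1024  # assume ~1 KiB of metadata per object
total_gib = objects * bytes_per_object / 2**30
print(f"~{total_gib:.1f} GiB just for metadata")  # ~3.8 GiB at 1 KiB/object
```

At a few KiB per object the total quickly reaches the tens of GB seen in the top output above.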