uberthoth
Dabbler
- Joined: Mar 15, 2022
- Messages: 11
I have a MinIO app running in distributed mode with four nodes, each with 32 GB of RAM.
Two of the nodes failed with an out-of-memory condition. After rebooting the machines, they show logs like this:
```
2022-03-16 15:47:49.798681+00:00 Waiting for all other servers to be online to format the disks.
```
```
2022-03-16 15:48:45.789080+00:00 Waiting for the first server to format the disks.
```
The other two machines have logs like this:
```
2022-03-16 15:52:14.513028+00:00 2022-03-16T15:52:14.513028950Z
2022-03-16 15:52:14.513080+00:00 API: SYSTEM()
2022-03-16 15:52:14.513133+00:00 Time: 15:52:14 UTC 03/16/2022
2022-03-16 15:52:14.513185+00:00 Error: Marking https://1.example.local:9000/minio/storage/data/v42 temporary offline; caused by Post "https://1.example.local:9000/minio/...k-id=&file-path=format.json&volume=.minio.sys": lookup 1.example.local on 10.0.0.11:53: dial udp 10.0.0.11:53: i/o timeout (*fmt.wrapError)
2022-03-16 15:52:14.513239+00:00 6: internal/rest/client.go:149:rest.(*Client).Call()
2022-03-16 15:52:14.513292+00:00 5: cmd/storage-rest-client.go:152:cmd.(*storageRESTClient).call()
2022-03-16 15:52:14.513354+00:00 4: cmd/storage-rest-client.go:520:cmd.(*storageRESTClient).ReadAll()
2022-03-16 15:52:14.513430+00:00 3: cmd/format-erasure.go:406:cmd.loadFormatErasure()
2022-03-16 15:52:14.513504+00:00 2: cmd/format-erasure.go:326:cmd.loadFormatErasureAll.func1()
2022-03-16 15:52:14.513560+00:00 1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
```
```
2022-03-16 14:19:11.134801+00:00 2022-03-16T14:19:11.134801123Z
2022-03-16 14:19:11.135063+00:00 API: SYSTEM()
2022-03-16 14:19:11.135086+00:00 Time: 14:19:11 UTC 03/16/2022
2022-03-16 14:19:11.135104+00:00 DeploymentID: deadbeef
2022-03-16 14:19:11.135121+00:00 Error: Operation timed out (cmd.OperationTimedOut)
2022-03-16 14:19:11.135139+00:00 1: cmd/iam.go:339:cmd.(*IAMSys).watch()
```
Important to note: the two nodes that have not been rebooted continue to operate just fine; I can still pull files from them, etc.
How do I get the two rebooted nodes to rejoin the cluster without initializing?
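For what it's worth, the first stack trace shows `loadFormatErasure()` failing to read `format.json` from `.minio.sys` over the peer REST API because the DNS lookup against 10.0.0.11:53 times out. One thing that can be checked on the rebooted nodes is whether that erasure-set metadata is still present on each data drive — a sketch, assuming a single drive mounted at the hypothetical path `/data` (substitute your real drive paths):

```shell
# Check for MinIO's erasure-set metadata on each data drive.
# /data is a hypothetical mount point -- adjust to your configuration.
for drive in /data; do
  if [ -f "$drive/.minio.sys/format.json" ]; then
    echo "$drive: format.json present"
  else
    echo "$drive: format.json MISSING"
  fi
done
```

If the file is present on every drive, the rebooted nodes have not lost their metadata, which would point at the DNS timeouts in the second trace as the blocker rather than anything needing a re-format.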