JackAlltrades
Cadet
- Joined
- Oct 2, 2019
- Messages
- 9
Okay, where to begin... there are so many things on fire.
Basic background - I show up to solve problems because this company has suffered high turnover in system/network admins for the past few years. Basically no one documented anything useful, so no one really knew how anything worked, and stuff was just duct-taped and baling-wired on top of existing systems.
I am not a networking or storage genius. I'm figuring most of this out as I go along. I know VMware decently well and know my way around most tools ranging from software to hammer drills. (Hence "JackAlltrades".) Put it this way - I own my own hardhat and I'm not afraid of Linux.
We have over a dozen ESXi hosts that are (thanks to me) now centrally managed across two sites with a pair of linked vCenters with SSO and all the trimmings. We also have an aging Dell EqualLogic with about 17TB in RAID 6 and a dead drive. (yay) It sits on its own switch - a pair of stacked Dell Force10 S25s (stacking cables in the back) - on a 172.31.0.0/24 storage network. Everything runs over 1Gb/s Ethernet; the SAN and the hosts mostly run four cables each to this switch.
The new SAN is a Dell PowerEdge R510 with 128GB RAM and 12x 3TB 7200 disks in RAID-z2 running FreeNAS 11.2-U6. (Probably not the "best" config, but there's a reason for it based on my admittedly limited understanding of ZFS.)
Critical VMs run on local datastores in specific hosts with RAID'd SSD drives. Non-critical VMs run mostly on the old SAN or on HDD local datastores on older hosts.
The new SAN is intended to take over for the old one so that it can be safely "reconditioned" and put back into service as a backup datastore. (There is a frankly terrifying lack of backups right now.) It doesn't have to be fast, it needs to be big and reasonably resilient. Hence z2. This will be a production environment, but of non-critical systems that A: should be backed up to the old SAN and B: can be rebuilt from templates with unfortunate but acceptable downtime. We have other solutions in place for things that need fast disk.
The Problem:
I followed the guides I found online for connecting ESXi hosts to iSCSI shares on ZFS. There are a bunch, they're all basically the same, and I hit the same problem no matter what I do: the hosts do not find the zvol datastore when I scan.
I set one Ethernet interface to the 172.16.1.0/24 LAN for management and one to the 172.31.0.0/24 LAN for storage, and connected them to the correct switches. Since these ESXi hosts are already connected to another SAN on the same network, I should just need to add the FreeNAS storage interface's IP to the Dynamic Discovery list, right? I've added hosts to the existing old SAN before (set up the virtual switches, done the network port binding, etc.) and that worked. But none of them see the FreeNAS datastore.
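For reference, this is roughly the sequence I'm doing on the ESXi side, just via the CLI instead of the web client. The adapter name (vmhba64) and the FreeNAS storage IP (172.31.0.50) are placeholders, not my actual values:

```shell
# Add the FreeNAS portal to the software iSCSI adapter's Dynamic Discovery list
esxcli iscsi adapter discovery sendtarget add --adapter vmhba64 --address 172.31.0.50:3260

# Rescan all storage adapters so the host attempts discovery and looks for new devices
esxcli storage core adapter rescan --all

# List the devices the host can now see; the FreeNAS zvol should show up here
# as a FreeBSD iSCSI disk if discovery worked
esxcli storage core device list
```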
I tried moving it to a different network (172.16.1.0/24) with a matching Port Group in ESXi. Still nothing.
I tried connecting to a separate switch I found lying around and reset to be completely flat (a Cisco Catalyst 2960G). Nothing.
I plug my Surface into that switch and scan the network and there's FreeNAS, right where it should be, answering the port scan.
I'm at a loss and assume I'm making some stupid, simple mistake. Here are the settings:
17.57TiB zvol: "vmware-target"
Block iSCSI:
Target Global Configuration: left it alone, it set a Base Name, ISNS Servers blank, Pool Available Space Threshold blank.
Portals: 0.0.0.0:3260, Discovery Auth Method - None, Discovery Auth Group - None (just want to get it working for now, we can lock it down later.)
Initiators: ALL/ALL
Authorized Access: blank
Targets: vmware-target - Portal Group ID - 1, Initiator Group ID - 2 (the ALL/ALL one), Auth Method - None, Authentication Group number - None
Extents: name - vmware-extent, Extent type - Device, Device - FreePool/FreeNAS (17.5T), Serial - random, Logical block size - 512, Enable TPC - checked, LUN RPM - SSD, everything else blank.
Associated Targets: target - vmware-target, LUN - 0, Extent - vmware-extent
iSCSI Services are turned on and set to autostart.
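On the FreeNAS side, these are the sanity checks I know to run from a shell to confirm ctld (the iSCSI daemon) is actually serving the target; I make no claim this list is exhaustive:

```shell
# Confirm ctld is listening on TCP 3260
sockstat -4l | grep 3260

# List the iSCSI ports ctld has configured
ctladm portlist

# List the LUNs backing those ports; the zvol extent should appear as LUN 0
ctladm lunlist
```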
At this point I've pretty much hit a wall and I'm just trying random changes to settings. I need some guidance on what to try next.
Once the new SAN is up and we can migrate the VMs to it, I can finally start fixing the other giant fires in this server room (particularly the problem that there are no backups to speak of, nor anywhere to store them).