System compromised?

Status
Not open for further replies.

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I don't think this is quite the right place for this question, but it's the closest I can think of. If it belongs somewhere else, please let me know.

I woke up this morning to an apparently compromised FreeNAS box. The reporting graphs were showing 100% CPU usage, almost all of it (95%+) in "system" processes; the load average was over 200; and top showed a whole mess of 'sudo' processes, all owned by root. The console was scrolling the same message over and over, something to do with sudo. My IPMI web interface showed a sharp increase in power consumption starting about 2:00 am, so I suspect that's when whatever happened started.

Since the only outside access to anything on my machine is port 32400 forwarded from the Internet to my Plex jail, I stopped the Plex jail, with no immediate effect. The web GUI then became unresponsive, or at least slow enough that I wasn't willing to wait for it to respond, so I shut the system down from the console (option 14 on the menu, not a hard power off). On reboot, the console didn't immediately start scrolling messages, but the web UI wouldn't load any of the images, or in fact any of the alt text for the buttons in the top frame. I shut down the system again and left it shut down. I have not yet investigated whether there's any data loss or damage.

The system is running 9.10-20160426..., which was freshly installed onto an SSD, and then my configuration from my previous box (running a recent 9.3 build) was uploaded. The jails were manually installed using FreeBSD packages where available (CouchPotato and Urbackup were installed from source), and the VirtualBox jail was using the FreeNAS VirtualBox jail template.

I have a recursive snapshot task running on my main pool, though I hadn't gotten around to setting one up on the jails pool (but I just installed those yesterday anyway). I'm sure I could simply reinstall 9.10, upload my config, and possibly revert to last night's snapshot, and maybe rebuild my jails, but that wouldn't do anything toward figuring out what happened, which I'd like to do (if I don't know what happened, I don't have much of a chance of preventing it happening again).

Current status is that the server is shut down. When it was last up, the web GUI wouldn't load. I'm keeping it shut down until I have some plan of action to further investigate. I'm sure a debug file would be helpful, but I'm not sure how to generate one without web GUI access. How should I best proceed to figure out what happened?
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
What's your hardware, and what's the error message?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The logs should still be available on the .system dataset, and they'd probably go a long way towards building a picture of what's going on.
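Something like this should turn them up (dataset names differ per system, so treat this as a sketch):
Code:
# Locate the system dataset and its syslog child, then look at the logs it
# holds. On a running system /var/log is normally already mounted from it.
zfs list -r -o name,mountpoint | grep -i '\.system'
ls -l /var/log/auth.log* /var/log/messages*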
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Slight update: I did bring the system back up, but it's disconnected from the Internet. The error message I mentioned (which repeats several times per second) is from /var/log/auth.log, which is 452 MB in size; the rotated logs from previous days are each under 10 KB (though of course they're bzip'd).

Poking through some of my data doesn't show any .gpg file extensions, which would be a sign of a cryptolocker attack as I understand it. The system as running now doesn't show any unusual processes or unreasonable system load.
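(For the record, the sweep was nothing fancier than something like this; the pool name is just an example:)
Code:
# Look for anything with a .gpg extension on the data pool
# (pool/dataset name is an example).
find /mnt/tank -type f -name '*.gpg' | head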
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
http://doc.freenas.org/9.10/freenas_cli.html#freenas-debug

It's possible that if someone compromised a jail, they could use it as a springboard to attack IPMI, the freenas webgui / API, or other services on the NAS. Especially if you reused SSH keys. I'd look back through /var/log/auth.log for the beginning point at which you start to see weird activity, then try to correlate with other log entries.
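Roughly along these lines (adjust the grep pattern to whatever the repeating message actually is; this is just a sketch):
Code:
# Find the earliest occurrence of the repeating message, including in the
# rotated logs, then compare its timestamp against /var/log/messages,
# the nginx logs, and the jail logs.
bzcat /var/log/auth.log.*.bz2 | grep 'sudo' | head -5
grep -n 'sudo' /var/log/auth.log | head -5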

Might be a good time to switch to using a VPN. :D
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
It's amazing what TFM will tell you if you just read it... Thanks for the pointer, @anodos. I ran freenas-debug -v, and it reports that "All Files, Directories and Symlinks in the system were verified successfully". But now it's getting curiouser and curiouser.

I can get into the shell from the console without issue. I can ssh into the server as root. But when I try to su from root to an unprivileged user, I get
Code:
[root@freenas2] ~# su dan
su: /bin/csh: Permission denied

When I try to ssh in as that user, which was possible previously, I get
Code:
root@pve:~# ssh dan@192.168.2.2
dan@192.168.2.2's password:
Last login: Wed Apr 27 19:43:01 2016 from 192.168.1.198
Could not chdir to home directory /mnt/tank/dan: Permission denied
/bin/csh: Permission denied
Connection to 192.168.2.2 closed.

Permissions look correct:
Code:
[root@freenas2] ~# ll /bin/csh
-r-xr-xr-x  2 root  wheel  382080 Apr 27 09:45 /bin/csh*

drwx------  3 dan  staff  4 Dec 22  2014 dan/
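
Just to rule out something further up the path or a mount-level restriction, checking each directory component and the mount flags would look roughly like this (just a sketch):
Code:
# Show mode, owner, and file flags for every component of the paths involved,
# then check whether anything is mounted noexec.
ls -ldo / /bin /bin/csh /mnt /mnt/tank /mnt/tank/dan
mount | grep -i noexec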
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Interestingly, the error in auth.log corresponds exactly in time to this bit in /var/log/messages:
Code:
Apr 28 01:28:23 freenas2        (pass17:mps0:0:25:0): INQUIRY. CDB: 12 00 00 00 24 00 length 36 SMID 154 command timeout cm 0xfffffe000115ea20 ccb 0xfffff81a71d22000
Apr 28 01:28:23 freenas2        (noperiph:mps0:0:4294967295:0): SMID 1 Aborting command 0xfffffe000115ea20
Apr 28 01:28:23 freenas2 mps0: Sending reset from mpssas_send_abort for target ID 25
Apr 28 01:28:25 freenas2 mountd[3056]: export request succeeded from 192.168.1.3
Apr 28 01:28:25 freenas2 mps0: mpssas_prepare_remove: Sending reset for target ID 25
Apr 28 01:28:25 freenas2 da17 at mps0 bus 0 scbus0 target 25 lun 0
Apr 28 01:28:25 freenas2 da17: <ATA WL6000GSA6457 0F.3> s/n WOL240336070         detached
Apr 28 05:28:27 freenas2 devd: Executing '[ -e /tmp/.sync_disk_done ] && LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/python /usr/local/www/freenasUI/tools/sync_disks.py'
Apr 28 01:28:27 freenas2        (da17:mps0:0:25:0): WRITE(16). CDB: 8a 00 00 00 00 02 9e eb 54 00 00 00 01 00 00 00 length 131072 SMID 957 terminated ioc 804b scsi 0 state c xfer 0
Apr 28 01:28:27 freenas2 mps0: (da17:mps0:0:25:0): WRITE(16). CDB: 8a 00 00 00 00 02 9e eb 54 00 00 00 01 00 00 00 
Apr 28 01:28:27 freenas2 IOCStatus = 0x4b while resetting device 0x1b
Apr 28 01:28:27 freenas2 (da17:mps0:0:25:0): CAM status: Unconditionally Re-queue Request
Apr 28 01:28:27 freenas2 mps0: (da17:Unfreezing devq for target ID 25
Apr 28 01:28:27 freenas2 mps0:0:mps0: 25:Unfreezing devq for target ID 25
Apr 28 01:28:27 freenas2 0): Error 5, Periph was invalidated
Apr 28 01:28:27 freenas2 (da17:mps0:0:25:0): Periph destroyed


That was a newly-purchased drive that I was testing, and which I think is going back to the vendor. The drive wasn't part of a pool or otherwise mounted in any way; it was just undergoing badblocks testing. There's nothing else anywhere close in time in messages, other than NFS mounts succeeding. I can't imagine how this could relate to the problem I was seeing, but the "Periph destroyed" is at exactly the same time as the auth.log messages start, which is a little too coincidental to just ignore.

I'm seeing a number of "permission denied" like the following in nginx-error.log, but they're from when I tried to access the web GUI, nothing around the time the auth.log messages started:
Code:
2016/04/28 05:47:52 [error] 41424#0: *26043 open() "/usr/local/www/freenasUI/static/images/ui/menu/system.png" failed (13: Permission denied), client: 192.168.1.216, server: localhost, request: "GET /static/images/ui/menu/system.png HTTP/1.1", host: "freenas2.familybrown.org", referrer: "https://freenas2.familybrown.org/"


More to follow; I need to leave in a few minutes. It's looking like something's happened to hose permissions in some way, but as far as I can see so far, data isn't affected.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
You might also want to check logs on your plex jail. Since it's a jail you don't actually need to have it running to do this.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I've checked the logs in all the jails. Absolutely nothing of interest anywhere close to 0130 today.

There's clearly something strange going on with permissions, and I don't know what it is. Both the inability of the nonprivileged user to log in, and the issues with the web GUI, seem to stem from this, but the permissions on all the relevant files appear to be correct. What's disconcerting about this is that, if this is related to one of the jails, the apparent attacker managed to break out of the jail, which I didn't think they were supposed to be able to do.
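One more angle to check: NFSv4 ACL entries or FreeBSD file flags wouldn't show up in a plain ls -l, so something like this (a sketch, using the paths from earlier):
Code:
# ACL entries can deny access even when the mode bits look fine;
# file flags are worth a glance too.
ls -lo /bin/csh
getfacl /bin/csh /mnt/tank/dan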
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Perhaps it's just a combination of bad juju with a failing boot medium and a misbehaving application. :D

If your system was indeed compromised, that doesn't mean the attacker broke out of the jail (I'd rather use such a vulnerability to attack something of higher value than a home NAS). Theoretically, an attacker can easily leverage a compromised jail to attack the host system through more mundane methods. For instance, they can start running POST and GET requests from the jail against the FreeNAS webgui's API. If you reuse SSH keys (i.e., use the ssh client inside the jail to access other servers), an attacker can use them to access other systems on your home network. There was recently a user who ended up with his NAS encrypted by ransomware because a jail had an SSH key used to rsync data to the main NAS.

That said, it's probably a good idea to download your freenas config, install a fresh copy of FreeNAS, create a new plex jail, and possibly look into securing remote access through a VPN.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Bad juju and misbehaving application are possible, but (1) a scrub of the boot pool came up clean, and (2) the OS verification also came up clean, which together I'd think would rule out a failing boot device (there does appear to be a failing device, but there's no obvious way that it would be relevant to what's going on). The clean OS verification puzzles me a bit, as I'd think it'd catch whatever the permissions issue is, but nothing there. Fortunately I don't ssh from any of my jails, so there are no keys there to grab.

That said, it's probably a good idea to download your freenas config, install a fresh copy of FreeNAS, create a new plex jail, and possibly look into securing remote access through a VPN.
That's about where I'm at, though I'm not sure about the VPN in this case. As I understand it, port 32400 (or some other specified port) needs to be open to the Plex installation for the remote features to work (which I could live without, at least for a while), and I'm not sure how a local Plex client would deal with not being able to discover the server through plex.tv. Easy enough to test in the short term, though; I can just remove the port forward from my router.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Honestly, I have no experience with plex, so I can't comment on what will help in this situation. If your router allows more flexible firewall rules, you can filter the allowed ingress / egress IP addresses on your WAN interface for your Plex jail. For instance, in pfSense you can create an alias for allowed source IP addresses in your port-forward rules. This will only allow certain IP addresses to access your plex jail from the internet.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Even stranger: I just attempted to boot into the installer (ISO, mounted virtually via IPMI). This is the same ISO image I used to install yesterday, and installation ran without issue. The SHA256sum was just confirmed a few minutes ago. Here's the result now:
Screenshot from 2016-04-28 16-05-30.png

Time to burn it to a USB and boot from that, I guess.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
At this rate, by Tuesday, @anodos (next most-liked guy after me and @danb35) will suffer a mysterious failure on one of his servers that leads to more WTF events.

I propose we all pitch in and hire an exorcist. I know a guy.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Burned it to a USB stick; that didn't even show up as a bootable device. Downloaded the ISO and sha256 again, verified the sha256 (it was fine), and this time the installer booted and ran without apparent error. The new install is booting now, whereupon I'll upload the config (a copy taken right at midnight, so before whatever happened, happened) and see what happens.
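(For reference, verifying and writing the image comes down to roughly this on a Linux box; the ISO filename and the USB device node are placeholders, so double-check the device before running dd:)
Code:
# Compare against the published checksum, then write the image to the stick.
# /dev/sdX is a placeholder -- writing to the wrong device will destroy it.
sha256sum FreeNAS-9.10-RELEASE.iso
dd if=FreeNAS-9.10-RELEASE.iso of=/dev/sdX bs=1M && sync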

Edit: Who knows, we might need the exorcist yet.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
It might be worthwhile to check the IPMI event logs for any detected hardware faults. Maybe run memtest as well.
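Something like this dumps the event log from the OS side, assuming ipmitool is available (the same data is usually under Event Log in the BMC's web interface):
Code:
# Load the IPMI driver if it isn't already (kldload just complains if it is),
# then dump the system event log.
kldload ipmi
ipmitool sel list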
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Good idea on the IPMI logs; I hadn't thought of that. Nothing in them, though, except for chassis intrusion when I had the case open a few times. I'd run memtest for a few days last week when I got the server, with no errors found (it is ECC RAM, which should make this a non-issue, though I'm awaiting @cyberjock's analysis of the faulty ECC stick he received).

The system is up and running now with the restored config file. I'm allowing SSH access for root via a single key only (and that only until I can figure out how to let Proxmox run pve-zsync as someone other than root on the remote box). I'll keep an eye on what happens.
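A possible way to tighten that key further (not something I've done yet, and I haven't tested whether pve-zsync tolerates it) would be authorized_keys options, roughly:
Code:
# /root/.ssh/authorized_keys -- limit what the Proxmox key can be used for.
# The key material and comment are placeholders.
no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA...example... pve-zsync@pve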
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Hardware can do weird stuff.

This is mostly conjecture, but what you described doesn't fit with what I would expect from a network intrusion. I'd expect [1] your system to be really busy leading up to the compromise (as the attacker tries to force a buffer overflow or exploit some other vulnerability), followed by it [2] returning to a more-or-less normal state as he tries to move laterally on your network and compromise more systems, then (since you're a home user) [3] a lot of activity as he starts encrypting everything (because that's where the money is). Steps [1] and [2] are done in ways that minimally affect the filesystem to reduce the chances of detection, and obviously no one got to [3] since you still have your files. I think a hardware or software problem is more likely.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
That's at least somewhat encouraging. The system seems to be running fine for now. I think I'll turn off all the jails before bed tonight and make sure everything's fine in the morning without them (I'm not at all sure the problem had anything to do with them, but it at least eliminates a variable). I have badblocks running on the questionable drive again, hoping it will complete a full pass to get a true picture of the status of the drive.
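(The test itself is the usual destructive badblocks pass, roughly like this; da17 is the device name from the earlier log excerpt, so confirm it first, since -w overwrites the whole disk:)
Code:
# Destructive write-mode test with progress output on the suspect drive.
# The drive is not part of any pool, so overwriting it is fine here.
badblocks -b 4096 -ws /dev/da17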

One thing I did notice as I was going through the Plex jail was that there's a parameter in the Preferences.xml file called "disableRemoteSecurity", and by default with the FreeBSD packages, it's enabled. That gives anyone who wanders across your Plex server full access to your data. That's frankly an idiotic default setting, and I've now changed it.
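(Checking it from the host is a one-liner along these lines; the jail path is just an example of where the Plex data might live:)
Code:
# A value of "1" means remote security is disabled.
grep -o 'disableRemoteSecurity="[^"]*"' \
    '/mnt/jails/plex/usr/local/plexdata/Plex Media Server/Preferences.xml'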

Still unexplained:
  • Borked permissions on something in the OS (still don't know what), resulting in the inability of a nonprivileged user to log in; they don't show up despite both a boot pool scrub and a Verify Install coming back clean
  • The FreeNAS ISO didn't work, despite a matching sha256, either when mounted as a virtual device over IPMI or when burned to a USB stick; a fresh download worked fine.
 