Intermittently slow CIFS share using Lightroom

Status
Not open for further replies.

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
I'm a total noob but I'm very interested in seeing how this turns out as I am a heavy Lightroom user and am currently trying to decide between FreeNAS and Synology. (My current rig is a Win7 machine with internal RAID5 that I have outgrown.)

My background is primarily networks so I have some questions/suggestions in case you haven't already tried them:

You mention you're using a Cisco switch. (What model?) Have you been able to rule out the network? Is it just your PC and the NAS on the same switch?

Regarding Lightroom, have you tried testing this with a new empty catalog, just to see if it has anything to do with it? Some of my catalogs are very large and I have noticed inconsistent performance problems, especially on imports. I know you said you did a parallel copy outside of Lightroom and it was slow as well, but I'm assuming you mean from the same PC. If at all possible, I would try the parallel copy from another PC while you're experiencing the slowness, just to rule out your Win7 PC and/or Lightroom. I would also try opening Resource Monitor on your Win7 PC to monitor network, disk and CPU activity while you're doing your imports. This is a great tool to help troubleshoot bottlenecks and may help you see what Lightroom is doing behind the scenes. One last comment about LR. There's an option that autowrites a metadata sidecar file (XMP?) for each of your photos, as well as the usual metadata in the catalog. I think those get written out to the same folder as the RAW files, which can result in random performance problems according to Adobe. Just a thought in case that's what you're doing.

FreeNAS vs. Synology: I had similar options and chose FreeNAS because of its ZFS support (my main priority is data integrity).

Router: Cisco/Linksys EA6300. PC and NAS on the same switch.
I'm now testing a direct PC-to-NAS connection. So far so good, but because the issue is intermittent I need more time to confirm that it follows the router.
Yes, the parallel copy I mentioned was from the same PC running Lightroom. I haven't been able to test copying from a second PC yet but I'm planning to do that.

The issue happens with catalogs of different sizes, creation dates, etc.
XMP sidecars are disabled.

I've been monitoring through Win7 Resource Monitor from the beginning: nothing to report, apart from the slow network speed


I'll keep posting if I'm able to make any progress.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Do the following:
Download the dtrace script below and run it while replicating the issue that you are having.

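I believe the command you will want to run is "procsystime -a -n smbd" (-n selects running processes by name; without it the script treats "smbd" as a command to launch).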
Post the output here (enclosed in code tags).

http://www.brendangregg.com/DTrace/procsystime

Once you have done that, replicate the problem a second time with logging set to "debug" and post log.smbd here as well (or a link to where I can download it).
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
I've tried to execute the script, unsuccessfully.
Not sure if it's the script itself or something in my settings (most likely).
This was my very first time running a script.
I
copied the script from the suggested link and saved it as file "procsystime";
created user "testuser";
placed the script file in testuser's home directory; then
executed "/bin/sh /mnt/zpool_system/JailsDataset/testuser/procsystime"

this is the result
Code:
testuser@freenas:~ % /bin/sh /mnt/zpool_system/JailsDataset/testuser/procsystime
: not foundsystem/JailsDataset/testuser/procsystime:
: not foundsystem/JailsDataset/testuser/procsystime:
: not foundsystem/JailsDataset/testuser/procsystime:
: not foundsystem/JailsDataset/testuser/procsystime:
: not foundsystem/JailsDataset/testuser/procsystime:
: not foundsystem/JailsDataset/testuser/procsystime:
/mnt/zpool_system/JailsDataset/testuser/procsystime: 68: Syntax error: expecting "in"


I then created a very simple script, followed the procedure described above and the script executed as expected.

Suggestions ?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
chmod +x procsystime
Out of curiosity, what jails are you running? Does the problem (cifs stalling) manifest itself if all jails are turned off?
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
I had already checked and changed the permissions:
-rwxr-xr-x 1 testuser wheel 6496 Oct 23 00:24 procsystime*

I assume "rwx" for the user is what is required.
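
A side note on the permission question: as far as I know, the execute bit only matters when a file is run directly. Passing it to sh as an argument only needs read permission, so the "not found" errors above were unlikely to be a permissions problem. A minimal sketch with a throwaway script (the file name demo.sh and the temporary directory are made up for illustration):

```sh
# Hypothetical demo file in a scratch directory; not the real procsystime.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho ran\n' > "$workdir/demo.sh"
chmod 644 "$workdir/demo.sh"      # readable, but no execute bit

# Handing the file to sh works without the execute bit.
via_sh=$(sh "$workdir/demo.sh")
echo "via sh: $via_sh"

# Running the file directly is what requires chmod +x.
chmod +x "$workdir/demo.sh"
direct=$("$workdir/demo.sh")
echo "direct: $direct"
```

So "chmod +x" is harmless here, but it would only change the outcome if the script were started as ./procsystime.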

The only jail I've installed is feeipmi (just to play a bit with the settings of my Supermicro motherboard).
I'll disable it and check if this has any effect.
 

Pasquale61

Explorer
Joined
Oct 8, 2014
Messages
62
Have you ever tried running the "top" command in the shell, either in the console or through SSH? It may help with your troubleshooting by comparing what things look like normally to what they look like while you're experiencing the slowness. Just a thought...

The other thing I wanted to mention, since I think you said it only happens during imports: do you know if you are automatically applying develop presets based on ISO settings, camera model/serial number, etc.? I have had varying import performance results based on what I'm doing with each of the presets. I tend to forget I even made these settings until I'm importing something that triggers one of them.

Are you importing from USB? I just think that you probably need to have a second PC to rule out anything on the Win7 PC. If you think about it, the I/O bus is shared by your Ethernet adapter, USB ports, local hard drive controllers, etc. Even though resource monitor doesn't show anything, there may be some type of local I/O contention that only happens while you're importing, and only under certain circumstances.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Depending on how you moved the script to the server, it might be an issue of unwanted carriage returns. The way I put scripts on my server is to open vi in a putty session and paste the script. Oh, and try putting it in /mnt/[zpool] and not in a jailed home folder.

Checking for bottlenecks on your workstation is a good idea. Did you ever have this problem with the QNAP NAS? If you didn't, and if you can figure out how to access the smb.conf file on the QNAP, post it here.

Are you using LACP? Please also set samba logging to 'debug', recreate the problem, zip the log, post it somewhere, and link it here.
 

bestboy

Contributor
Joined
Jun 8, 2014
Messages
198
You might need to learn a thing or two, but if you could spare some time, try using NFS instead of CIFS (Samba).

FreeNAS would be happy to serve NFS, and since you appear to be the only user, you do not need to worry about permissions...

Windows 7 can use the Client for NFS distributed by Microsoft. You can also try the NFSv4.1 Client for Windows (binaries are available too) at http://www.citi.umich.edu/projects/nfsv4/windows/

I think getting Samba out of the picture is really a good way to find out if Lightroom or Samba is doing something wrong IMHO.
If you are not willing to test NFS, maybe you'd be open to test iSCSI?
It's not really a file share and thus only an option for a single-user setup. It's more like an external, virtual drive. Unlike NFS, iSCSI is well supported by Windows 7 and, more importantly, the "share" behaves just like a normal drive. You can format it with NTFS and it is totally transparent to the application. If Lightroom really has a problem with CIFS shares, then that problem is unlikely to also affect iSCSI "shares".

iSCSI @ Windows 7: http://www.windowsnetworking.com/articles-tutorials/windows-7/Connecting-Windows-7-iSCSI-SAN.html
iSCSI @ Windows 7 with FreeNAS: http://blog.pluralsight.com/freenas-8-iscsi-target-windows-7
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
Notice to fellow newbies: learning freenas can be painful and requires time.
So, if you are not ready to spend a couple of nights just to learn how to run scripts, you may want to look somewhere else ....

anodos was right: the issue with the script was the method I used to create the script file.
I copied the script text from the browser's page and pasted it in Notepad++.
Then I copied the file to the destination directory in freenas.
Unfortunately this added the unwanted carriage returns that caused the script to fail.
It's tricky, since the CRs are not visible in Notepad++ but they are in vi.
So you will also need to learn vi ....
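
For reference, a sketch of how the stray carriage returns could be detected and stripped from the shell, without pasting the script again. The file names below are hypothetical stand-ins for the real script; tr and grep are standard tools.

```sh
# Hypothetical demo: build a small script with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf '#!/bin/sh\r\necho hello\r\n' > "$workdir/demo_crlf.sh"

# Count the lines that contain a carriage return (non-zero means CRLF damage).
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$workdir/demo_crlf.sh")
echo "lines with CR before: $crlf_lines"

# Strip every carriage return and write a clean copy.
tr -d '\r' < "$workdir/demo_crlf.sh" > "$workdir/demo_lf.sh"
clean_lines=$(grep -c "$cr" "$workdir/demo_lf.sh" || true)
echo "lines with CR after: $clean_lines"

# The cleaned copy now runs under sh.
sh "$workdir/demo_lf.sh"
```

The same tr command should work on the real file; writing to a new name keeps the original around for comparison.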

Used the suggested method (putty session, launch vi, paste, save file) and it worked fine:

procsystime output:

Code:
# /bin/sh /mnt/zpool_system/users/testuser/test2 -a smbd
dtrace: 1532 dynamic variable drops with non-empty dirty list

Elapsed Times for command smbd,

  SYSCALL  TIME (ns)
  getsockopt  974
  thr_self  1000
  sigaction  1103
  setgid  1203
  lseek  1273
  setrlimit  1282
  rtprio_thread  1299
  mprotect  1300
  setuid  1398
  setregid  1498
  break  1926
  setreuid  2068
  sysarch  2912
  issetugid  3346
  umask  3365
  socket  3515
  dup2  4068
  getrlimit  4256
  sendto  5009
  fcntl  6411
  connect  6701
  readlink  8527
  getuid  8885
  getgid  9251
  getegid  9900
  getpid  10401
  stat  13939
  madvise  13957
  read  20095
  lstat  25037
  __sysctl  33782
  fstat  42433
  close  45351
  geteuid  56229
  munmap  57253
  sigprocmask  126383
  open  174987
  access  245540
  mmap  348709
  write  461073
  TOTAL:  1767639

CPU Times for command smbd,

  SYSCALL  TIME (ns)
  thr_self  172
  sigaction  378
  lseek  430
  issetugid  474
  getsockopt  479
  mprotect  574
  setgid  616
  rtprio_thread  636
  umask  655
  setrlimit  753
  setuid  775
  setregid  807
  sysarch  917
  break  1262
  getuid  1289
  setreuid  1401
  getgid  1415
  getpid  1596
  getegid  1653
  getrlimit  2559
  dup2  2595
  socket  2966
  fcntl  3277
  sendto  4423
  connect  6065
  readlink  7208
  madvise  8484
  geteuid  9348
  stat  12519
  read  15092
  fstat  18392
  close  21066
  lstat  21743
  sigprocmask  29207
  __sysctl  29482
  munmap  37625
  open  147809
  access  193772
  mmap  265287
  write  425148
  TOTAL:  1280349

Syscall Counts for command smbd,

  SYSCALL  COUNT
  break  1
  connect  1
  exit  1
  fork  1
  getsockopt  1
  lseek  1
  mprotect  1
  pipe  1
  rtprio_thread  1
  setreuid  1
  setrlimit  1
  setuid  1
  socket  1
  thr_self  1
  getgroups  2
  readlink  2
  sendto  2
  setgid  2
  sysarch  2
  sigaction  3
  setregid  4
  dup2  5
  getrlimit  5
  issetugid  5
  stat  7
  lstat  8
  umask  11
  madvise  12
  read  12
  __sysctl  13
  fcntl  17
  getuid  48
  getgid  50
  getpid  51
  getegid  53
  munmap  121
  close  132
  fstat  135
  open  136
  sigprocmask  253
  access  275
  write  279
  geteuid  329
  mmap  496
  TOTAL:  2484


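A rough way to read the tables above is to look at each syscall's share of the total. The sketch below does that for the four largest rows of the "Elapsed Times" table (values copied from the output; the variable names are made up). Note that the header reads "for command smbd" and the counts include one fork and one exit, which suggests (an inference, not something confirmed in this thread) that the script launched and traced a short-lived smbd command rather than the daemon serving the share; as far as I can tell, procsystime only matches running processes by name when given -n.

```sh
# Rows copied from the "Elapsed Times" table above (syscall, nanoseconds).
elapsed='write 461073
mmap 348709
access 245540
open 174987
TOTAL: 1767639'

# Print each syscall's share of the total elapsed time, then the total in ms.
report=$(printf '%s\n' "$elapsed" | awk '
  $1 == "TOTAL:" { total = $2; next }
  { name[NR] = $1; ns[NR] = $2; n = NR }
  END {
    for (i = 1; i <= n; i++)
      printf "%-8s %8d ns %5.1f%%\n", name[i], ns[i], 100 * ns[i] / total
    printf "total    %8.3f ms\n", total / 1e6
  }')
echo "$report"
```

The total comes out at under 2 ms, far too short to cover a stall that lasts seconds, so this particular capture probably says little about the slowdown itself.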

Tomorrow I'll try to reproduce the problem and post the log file.

The QNAP NAS is offline, so I cannot run any tests on it. However, I never saw the issue when using it. In addition, I ran tests with the photo files stored on the workstation, and there were no issues in that case either.
No LACP.

In the meantime I've been able to rule out:
the router/ network (the issue repeated with direct connection workstation - freenas), and
jails: with all jails disabled and then removed the issue keeps repeating

I've also tested parallel file transfers from freenas to a second win7 laptop: no slow down on the second laptop.
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
No, I've not run "top" yet. It might be something I'll try tomorrow night ....
But I run performance monitors both on the freenas and win7: CPU, memory, disks and network loads are very low.

No, I do not apply any develop preset.

No, no USB in the loop. In my workflow photo files are downloaded from the camera / CompactFlash card to the win7 workstation, copied to freenas and then imported into lightroom.
 

Pasquale61

Explorer
Joined
Oct 8, 2014
Messages
62
So you're using Windows to copy the files to FreeNAS first, and that part is OK? I thought you were using LR's import feature to copy the files from the CompactFlash card to FreeNAS (while at the same time adding them to the catalog). I may be wrong, but the way you're doing it may be inefficient, because when you run your import, LR has to read the files back from FreeNAS to import them into the catalog. If you used the import feature to copy the files, it would add them to the catalog while copying to the network drive. Just out of curiosity, where do your LR catalogs, previews and cache live? Here is a good article that talks about file handling and cache size settings to help with performance. It may be a little old, but most of it is still valid: http://digital-photography-school.c...-and-performance-without-additional-hardware/

I thought of another way to help you rule out something. If you still have that second Win7 PC, you could just create a share on it. Then point your LR workflow to it and see if you get any different results.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
When you run the script, catch the total time elapsed as well. I.e.
Code:
time sh procsystime -a -n smbd
 

moon

Dabbler
Joined
Jul 17, 2014
Messages
32
Moon - Just curious if you ever figured this out.

Hi Pasquale.

No, unfortunately not.
Since I was unable to isolate the source of the problem I decided to give up troubleshooting and go back to just using the system.
I'm now waiting for 9.3 to see if that brings any benefit. If not, I'll test the NFS share.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Moon, it's possible that the slowdown is being caused by how Samba is storing DOS attributes. FreeNAS stores these in a different and less efficient way than is possible on your old QNAP box. You can try setting the following parameters under "Services" -> "CIFS" -> "Auxiliary Parameters":

Code:
store dos attributes = no
ea support = no
map hidden = no
map system = no
map readonly = no
map archive = no


You should be able to safely enable and disable these parameters without affecting your data. It's a pity we can't reliably cause the problem to manifest itself.
 

zambanini

Patron
Joined
Sep 11, 2013
Messages
479
Next time you see that issue, take a look at how many open connections your samba daemon has.

smbstatus -L
 

Pasquale61

Explorer
Joined
Oct 8, 2014
Messages
62
Alright, thanks for the update. I just built my system and am still in the process of testing and moving everything over before moving my Lightroom workflow there. FYI, regarding your NFS comment: after a lot of research and trial and error, I finally got the built-in NFS client in Windows 7 Ultimate working. I can tell you that as far as performance goes, out of the box it's considerably slower than CIFS for me so far (i.e., for large files, 100+ MB/s with CIFS and 30-40 MB/s with NFS, sometimes less). I'll save the details for another thread if I want to pursue NFS further, but here's my setup in case you're interested in comparing:

FreeNAS: 9.2.1.8 64bit

MB/CPU: ASRock C2750D4I Intel Avoton Octa-Core 2.40Ghz
Drives: 5 HGST Deskstar NAS 4TB H3IKNAS40003272SN (RAIDZ1, no SLOG)
Boot Drive: SanDisk Cruzer 4GB USB
RAM: 32GB ECC (Crucial CT2KIT102472BD160B x 2)
Case: DS380B
PS: ST45SF-G
Network: Single Gigabit on both FreeNAS and Win7 PC, same Netgear switch
 