File Scanning and Performance

When you install Simple File List, the plugin creates a new directory within your WordPress uploads directory, intentionally separate from the WordPress Media Library. The default location is:

wp-content/uploads/simple-file-list/

The Pro version allows you to change this location, as long as it remains relative to your WordPress home folder. No matter where you put this directory, the scanning will be the same.

Files may be added, removed, or changed outside of the plugin, using FTP or other means. Re-scanning makes sure the file list array stored in the database matches what is actually on your disk.
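The idea behind a re-scan can be sketched as a comparison between the stored list and the disk. The Python below is an illustration only and is not the plugin's own PHP code; the function names and the assumption that the stored list is a set of relative paths are hypothetical.

```python
import os

def scan_disk(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalize separators so paths compare the same on any OS.
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found

def reconcile_file_list(stored, root):
    """Compare the stored list with the disk and report the differences."""
    on_disk = scan_disk(root)
    stored = set(stored)
    added = sorted(on_disk - stored)      # on disk, missing from the list
    removed = sorted(stored - on_disk)    # in the list, gone from disk
    return added, removed
```

A file uploaded by FTP would show up in `added`, and a file deleted by FTP would show up in `removed`. Either result means the stored list needs updating.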

Manual Re-Scanning

If you rarely add files outside of the plugin, you do not need regular scanning and the rest of this article does not apply. Set the Disk Re-Scan setting to “Scan Only Manually”.

When you do need a scan, click the “Re-Scan Files” button on the Admin File List. This forces a full re-scan of the disk. If your list is large, this may take a while.

TIP – If you are having timeout issues when scanning, turn off thumbnail generation. It is much more resource intensive than simply listing the files on the disk.

If you do add files outside of the plugin, you have several options for adding them to your list automatically.

Scanning Each Time

If you add or remove files via FTP or using another method outside of the plugin, and you need the changes to appear immediately, set the Disk Re-Scan setting to “Scan Each Time”. This will force Simple File List Pro to scan the disk for changes upon each page load. If your file list is large, and your web server is limited, this can cause page loads to slow down.

Scanning on an Interval

If you add or remove files via FTP or using another method outside of the plugin, and you DO NOT need the changes to appear immediately, set the Disk Re-Scan setting to “Scan Each Day” or “Scan Each Hour”. With one of these settings, re-scanning occurs no more often than the chosen interval.

It is important to understand that the intervals are not tied to a clock. WordPress relies on website traffic to trigger actions. This means that if you use the “Scan Each Hour” setting, but your website does not see new visitors each hour, the job will not begin until the next visitor arrives, even if that is hours later.

To Background Scan or Not

Normally, Simple File List Pro uses a WordPress transient in the database to know whether it is time to re-scan. This transient has an expiration time. If it has expired, the disk is re-scanned before the file list page loads.

Unfortunately, this means that one unlucky visitor who arrives at your file list after the transient has expired must wait for the re-scan to finish before the page will load. That is one unlucky person per day or per hour, depending on the setting.
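The expiry check described above can be sketched in a few lines. This Python is an illustration under stated assumptions, not the plugin's code: the `Transient` class is a hypothetical stand-in for a stored value with an expiration time, and `load_file_list_page` is a made-up name for the page load.

```python
import time

class Transient:
    """Hypothetical stand-in for a stored value with an expiration time."""
    def __init__(self):
        self.expires_at = 0.0  # already expired, so the first load scans

    def is_expired(self, now):
        return now >= self.expires_at

    def renew(self, now, interval_seconds):
        self.expires_at = now + interval_seconds

def load_file_list_page(transient, interval_seconds, rescan, now=None):
    """Re-scan first if the transient has expired, then serve the page.

    Returns True when this page load had to wait for a scan.
    """
    now = time.time() if now is None else now
    if transient.is_expired(now):
        rescan()                              # the visitor waits here
        transient.renew(now, interval_seconds)
        return True
    return False
```

Nothing in this sketch runs between page loads. With an hourly interval and visits five hours apart, the scan simply happens on the later visit, which matches the point that the intervals are not tied to a clock.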

Checking the Scan in Background setting will instead use WordPress’ WP CRON system to run the re-scanning jobs. The advantage to this is that your front-end file list visitors will not have to wait for the scanning to complete, as this is handled as a background task. This works well if your site has frequent visits.

However, if you have low traffic, the visitor who triggers the background re-scan may not see the disk changes, because the list they are served was built before the scan ran. The changes appear on a subsequent file list load.
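A simplified model of this trade-off follows. It is a Python sketch rather than the plugin's code, and it assumes the background job runs only after the triggering page has been served; the class and method names are hypothetical.

```python
class FileListSite:
    """Toy model of a file list page with background scanning enabled."""
    def __init__(self, stored_list):
        self.stored_list = list(stored_list)
        self.pending_job = None

    def page_load(self, scan_due, scan_disk):
        """Serve the stored list immediately; queue the scan if it is due."""
        served = list(self.stored_list)   # what this visitor sees
        if scan_due:
            self.pending_job = scan_disk  # scheduled, not yet run
        return served

    def run_background_jobs(self):
        """Stands in for the cron task that runs apart from the page load."""
        if self.pending_job is not None:
            self.stored_list = list(self.pending_job())
            self.pending_job = None
```

In this model the first visitor after the interval gets a fast page but the old list, and only later visitors see the files found by the scan. On a busy site the gap is short; on a quiet site it can last until the next visit.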

Some web servers have trouble with the WP CRON system, so this setting is off by default.

Experiment with the different settings and see what works best for your website. Look at the time/memory statistics on the Admin List footer to see how fast your list is loading.

TIP – Install the WP Crontrol plugin so you can see what is going on with the WordPress CRON system.

How Scanning Works

Scanning can be a resource intensive process, both in terms of time and memory. As your file list grows you will eventually run into certain limitations of your web server.

As the disk is scanned, the file paths are collected in an array in memory. This array is then processed item-by-item to make a second array, which is what is stored in the database. This second array is much larger because it stores data associated with each file, such as dates, ownership, nice names and descriptions. Sorting creates additional arrays. All of these can lead to running into memory limits.
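To show why the second array is so much larger than the first, here is a rough Python sketch. It is not the plugin's code: the record fields are guesses based on the description above, and JSON is used as a stand-in for whatever serialization the database storage actually uses.

```python
import json
import os

def build_records(paths, root):
    """Turn a flat list of relative paths into per-file records."""
    records = {}
    for rel in paths:
        stat = os.stat(os.path.join(root, rel))
        records[rel] = {
            "size": stat.st_size,
            "modified": int(stat.st_mtime),
            "owner": "",                         # hypothetical field
            "nice_name": os.path.basename(rel),  # hypothetical field
            "description": "",                   # hypothetical field
        }
    return records

def serialized_size(value):
    """Bytes needed to store value as JSON text."""
    return len(json.dumps(value).encode("utf-8"))
```

Comparing `serialized_size(paths)` with `serialized_size(build_records(paths, root))` shows the growth per file. The stored size depends on the text in each record, not on how large the files themselves are.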

The maximum size of the file array that can be stored in the WordPress database is 2 GB. It’s important to understand that this is not related to the file sizes, but to the text associated with each element in the data array. So 2 GB could store data for a very large number of files.

PHP has limitations defined on data passed back and forth that are much less than this 2 GB value. The amount of data your server can fetch from the database, and how fast it can do it, are your real limitations.

Thumbnail Generation

File scans are generally very fast, but checking for and generating thumbnails for applicable files takes time, especially for PDF files. This process happens right after the disk is scanned for changes. Each file in the array is checked to see if it uses thumbnail generation and if the thumbnail is present or not. If the process gets stuck on a too-large or problematic file, the default icon will be used instead. If your list is taking a long time to re-scan, turn thumbnail generation off.
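The thumbnail pass can be pictured as a loop over the file array. The Python below is a sketch under assumptions and not the plugin's implementation: the set of thumbnail-capable extensions, the `DEFAULT_ICON` name, and the `generate` callback are all hypothetical.

```python
import os

THUMBNAIL_TYPES = {".jpg", ".jpeg", ".png", ".gif", ".pdf"}  # assumed set
DEFAULT_ICON = "default-icon.png"                             # hypothetical name

def thumbnail_pass(paths, existing_thumbs, generate):
    """Return a mapping of file path to the thumbnail that will be shown.

    Files whose type does not use thumbnail generation are skipped.
    """
    result = {}
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in THUMBNAIL_TYPES:
            continue                               # not applicable
        if path in existing_thumbs:
            result[path] = existing_thumbs[path]   # already present
            continue
        try:
            result[path] = generate(path)          # the slow step
        except Exception:
            result[path] = DEFAULT_ICON            # problematic file
    return result
```

Only files without an existing thumbnail reach the slow `generate` step, so a scan with many new images or PDFs costs far more than a scan where the thumbnails already exist.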

Summary

In summary, if you do not add files to your list from outside of the plugin, use the “Scan Only Manually” setting. Otherwise, experiment with the scanning options and see what works best for your site.
