[BBD] Cache anomalies: strange data, timings, effects

Started by MrPete, April 10, 2025, 09:43:58 PM

MrPete

Hiya,

Running current version of iMatch. The first part of this report is more or less what I would expect. Down below is not what I expected. ;)

The setup
  • My sweetie has 330k+ images in her iMatch database
  • She was switching between a folder with 42 images, and one with ~1000
  • Complaint to me: "Pete, it's been getting slower and slower all morning! Every time I change folders, it takes several minutes to reload?! And it usually puts up a 'Processing...' message."

Initial diagnosis

  • I took a look at the log. Ohhhh... iMatch is carefully removing hundreds of photos from the cache, keeping it at around 90GB. I bet there's a setting for that...
  • Yep, the cache size was set to 100GB max, with caching of JPGs allowed. The cache is on super-fast 1TB NVMe storage. I changed the limit to 500GB for good measure and also turned off Purge.
  • I then went through each of her top level folders, and took the time to display the preview of each. Took a while, no surprise.
  • Now iMatch loads image previews VERY fast. Even a top folder with 165k images only takes a few seconds. :)
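
For anyone curious why a full cache can feel so slow, here is a minimal sketch of a size-bounded cache. It assumes least-recently-used eviction, which is only a guess for illustration; how iMatch actually purges its cache is not stated in this thread, and the class and function names are made up.

```python
from collections import OrderedDict


class SizeBoundedCache:
    """Toy cache that evicts least-recently-used entries over a byte budget.

    Hypothetical illustration only; not how iMatch is known to work.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> size in bytes, oldest first
        self.evictions = 0

    def access(self, key, size):
        """Return True on a hit, False if the entry had to be (re)created."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        self.entries[key] = size
        self.used_bytes += size
        # Purge the oldest entries until we are back under the budget.
        while self.used_bytes > self.max_bytes and len(self.entries) > 1:
            _, old_size = self.entries.popitem(last=False)
            self.used_bytes -= old_size
            self.evictions += 1
        return False


def simulate(max_bytes, n_files, file_size, passes):
    """Walk the same n_files in order, `passes` times; count misses and purges."""
    cache = SizeBoundedCache(max_bytes)
    misses = 0
    for _ in range(passes):
        for i in range(n_files):
            if not cache.access(i, file_size):
                misses += 1
    return misses, cache.evictions


if __name__ == "__main__":
    print(simulate(max_bytes=100, n_files=15, file_size=10, passes=3))
    print(simulate(max_bytes=500, n_files=15, file_size=10, passes=3))
```

In this toy model, a budget that holds 10 entries against a working set of 15 misses on every access of every pass and keeps purging, while a budget large enough for the whole working set misses only on the first pass and never purges.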

So far, so good!

Now it gets weird

  • On the preferences Cache tab, there's a current cache image count and size. After doing all of the above, that count has not changed. ~67k images in ~90GB
  • Looking at Windows Explorer, the available drive space has not changed by even 1GB
  • Looked at RAM memory use. Nothing has changed.
  • Looked at the log. Yep, it is not purging images anymore. Not since the change, not during adding well over 200k images to the cache.
  • Restarted iMatch. It's still speedy. The cache count still has not changed.

I'm reporting this as a bug, but in reality I'm just puzzled!

HOW can the system easily and quickly display any subset of 330k images now, with a cache that hasn't grown at all, and the cache size indicator says the cached image count has not changed?

And if Windows is caching this... any idea how or where? I haven't found anything else growing!

Feels like the cache is making use of 4th dimensional space LOL  ;D

Mario


Quote: Complaint to me: "Pete, it's been getting slower and slower all morning! Every time I change folders, it takes several minutes to reload?! And it usually puts up a "Processing..." message."
If I understand you correctly, you are switching between folders in the Media & Folders View.
And that takes a noticeable amount of time?

Even for folders with 10K Files, it takes IMatch maybe a second or two to load everything.
All the stuff comes from the database and is loaded on-demand when you scroll.

The File Window uses thumbnails stored in the database, not cache images.

The cache is used for things like Viewer, Quick View Panel, Slide Show.
Which explains why the cache does not grow - because your operations don't require any cache images at all.

If IMatch takes more than a second to load a folder with just 1000 files, something is badly wrong.
Where is your database stored? On the local SSD or a NAS box / server?

If it's on a NAS/server, the slowness is explained because your NAS/network is slow.
And it is faster the second time because the Windows file system cache has the data in memory already and no NAS/server access is needed.

Consider the usual suspects: Virus checker scanning the database all the time, virus checker interfering with network performance, NAS is slow (a database on a NAS can be 1000 times slower than a database on a local SSD).

MrPete

Thanks for the insights, Mario!

No NAS. 
Database is on NVMe (3GB/sec) storage. Files are mostly on RAID1 enterprise HDDs.

I dug in further... I think I may be understanding a little better. Not sure if this is helpful for you or others:
  • She's using viewer quite a bit. And looking at RAW (NEF) files. I suspect that fully explains why the cache filled up. (With 500GB cache, I don't think that's an issue anymore ;) )

I suspect the timing of delays was an unintended artifact of her workflow. In simplified form:
  • Go to drive/folder A, select a bunch of images, use viewer to choose "best", leave viewer, drag the chosen image to DxO Favorite for editing and JPG export
  • Export JPG to drive/folder B
  • Go to drive/folder B in iMatch. Use Google image search to identify the plant (in this case) or whatever
  • In iMatch, rename JPG and NEF based on what was learned


Her observed symptom: iMatch was slow to show files after switching drive/folder in the file viewer.
Log file says: image cache was being cleared for quite a while after DxO finished (or maybe after launch? Not debug log so can't tell.)
Papa Pete thinks: there's also an understandable delay creating the cached versions of NEF images before Viewer can work. That may well be the primary delay she is experiencing.

Mario, does that make some sense?

If so, then:
a) Growing the cache ought to eliminate any purge delays due to a full image cache
b) It may pay to pre-cache a lot of NEF images for her

Potential lesson learned: configure plenty of image cache before ingesting lots of images into iMatch. It will save on later reprocessing ;)

Mario

You did not mention using the Viewer in the first post.

The Viewer of course uses the cache.
For a NEF file that is not yet cached, creating the cache image may take 3 to 15 seconds, depending on the NEF variant and whether WIC or LibRaw is used.
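
To put that range in perspective, here is a back-of-the-envelope estimate for pre-caching a batch of NEF files. It is a rough sketch that assumes the files are processed one after another at 3 to 15 seconds each; the function name and the file count are made up for illustration, and real throughput depends on how many files are processed in parallel.

```python
def precache_time_range(n_files, min_s=3.0, max_s=15.0):
    """Return (best, worst) total time in hours for caching n_files.

    Assumes strictly sequential processing at min_s..max_s seconds per
    file, the range quoted in this thread. Hypothetical helper.
    """
    return n_files * min_s / 3600, n_files * max_s / 3600


if __name__ == "__main__":
    best, worst = precache_time_range(10_000)  # hypothetical batch size
    print(f"{best:.1f} h to {worst:.1f} h")
```

Under those assumptions, 10,000 uncached NEF files would need somewhere between roughly 8 and 42 hours, so a big pre-cache run is best left for overnight or a weekend.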

Which virus checker do you use?
Did you make an exception for the database folder (!) and IMatch?

Also, to provide a minimum of information:

- switch IMatch to debug logging (Help menu > Support)
- do whatever you do that's slow
- Copy the log file via Help > Support to a file on disk, ZIP the file and attach

This will show us what takes how long.


MrPete

Will do.

In the meantime, I goofed... and am wondering how best to work around this:

I started a build of the cache for a bunch of NEF images. It is not going quickly at all, and it is greatly slowing foreground work.

Question: is there a safe way to terminate a Ctrl+Shift+F5 "(re)cache images only" task?

Stopping and restarting iMatch did not do it.

Mario


MrPete

Quote from: Mario on April 11, 2025, 06:12:30 PM
Clear Processing Queues
:)

The explanation there is a bit scary: it says "all files will be marked as up-to-date and removed from the update queue."
What if the general summary above were to say "This command allows you to clear any of the five background processing queues for files and metadata." Then just number the five actions (eg "-> 1. Clear file queue" ... "-> 2. Clear meta..." etc.)

In any case, it worked great, and I appreciate the help file saying the database must be closed and reopened. That isn't mentioned in the dialogue box. (Could the help be a little outdated? The five queues are not all documented yet in the help. ;))

Mario

This is a last-resort command that is intentionally somewhat hidden, sparsely documented, and described in frightening terms.

As usual, if you find unclear or missing content in the help system, use the "Feedback" link available at the bottom of each help page to let me know. Help update requests go into a special queue and I process this queue every couple of weeks or before a new help release.

monochrome

Also, try "compact and optimize". If the query can be satisfied with a continuous read from the DB it will always be faster.

Mario

Quote from: monochrome on April 12, 2025, 10:04:47 PM
Also, try "compact and optimize". If the query can be satisfied with a continuous read from the DB it will always be faster.
Good advice. That being said, with today's M.2 NVMe SSDs reaching several GB/sec (or faster), optimizing a database has become less important. If the DB is on a spinning disk, or a NAS, optimization may help to streamline things. Somewhat.

I work on a 5-year-old workstation-grade PC and on an 18-month-old laptop, and performance for my 500,000 and one million managed assets test databases is very good. I keep several test databases on external USB 3/4 disks for portability and performance is excellent.