Author Topic: File relations - Splitting a complex rule to more simple ones OK?  (Read 381 times)

bekesizl

  • Jr. Member
  • *
  • Posts: 84
I am revising my versioning rules and was thinking about restructuring them.
My question: would splitting one complex rule into multiple simpler ones be likely to cause a performance issue?

It would be easier to maintain rules like the simplified ones.

Original rule (buddy file)
\.(cr3)$
/^_*//
^(_*{name})[+\-_]*[0-9|a-z]*\.(jpg|jpeg|dng|tif|tiff|cr3.dop|on1|dng.dop)$


replaced by the following four rules:

Simplification - Rule 1 - DxO sidecar
\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng)$
/^_*//
{name}{ext}\.dop


Simplification - Rule 2 - ON1 sidecar
\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng)$
/^_*//
{name}\.on1


Simplification - Rule 3 - RAW buddies/versions
\.(cr3|cr2|nef)$
/^_*//
^(_*{name})[+\-_]*[0-9|a-z]*\.(jpg|jpeg|dng|tif|tiff|afphoto)$


Simplification - Rule 4 - JPG buddies/versions (for my older/smartphone photos)
\.(jpg|jpeg)$
/^_*//
^(_*{name})[+\-_]*[0-9|a-z]*\.(jpg|jpeg|dng|tif|tiff|heic|afphoto)$
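
For readers less used to these rules, here is a minimal Python sketch of how the replacement and link expressions above appear to fit together. It is not IMatch's implementation. The helper `build_link_regex` is hypothetical, and it assumes that `{name}` is the master file name without extension, that `{ext}` includes the leading dot (which the `{name}{ext}\.dop` rule suggests), and that matching ignores case. The link expressions are anchored with `^` and `$` here to keep the result unambiguous.

```python
import re

def build_link_regex(master_file, replace_pattern, link_template):
    """Hypothetical helper: expand a link expression for one master file."""
    name, dot, ext = master_file.rpartition(".")
    # Replacement expression /^_*//: strip leading underscores from the name.
    name = re.sub(replace_pattern, "", name)
    # Assumption: {ext} includes the leading dot, as "{name}{ext}\.dop" suggests.
    pattern = link_template.replace("{name}", re.escape(name))
    pattern = pattern.replace("{ext}", re.escape(dot + ext))
    # Assumption: matching ignores case.
    return re.compile(pattern, re.IGNORECASE)

master = "__IMG_1234.cr3"  # made-up file name
dxo = build_link_regex(master, r"^_*", r"^{name}{ext}\.dop$")
on1 = build_link_regex(master, r"^_*", r"^{name}\.on1$")

candidates = ["IMG_1234.cr3.dop", "IMG_1234.on1", "IMG_1235.cr3.dop"]
dxo_hits = [f for f in candidates if dxo.match(f)]
on1_hits = [f for f in candidates if on1.match(f)]
print(dxo_hits, on1_hits)
```

Under these assumptions, each sidecar rule picks up only the sidecar that belongs to the master, which is what makes the split rules easy to read.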


Mario

  • IMatch Developer
  • Administrator
  • *****
  • Posts: 27956
« Reply #1 on: January 26, 2021, 02:06:52 PM »
The regular expression engine in IMatch is very fast.
Splitting into multiple rules would most likely cause performance degradation, because IMatch would have to apply the regular expression to 3 times as many file names, once for each rule you create.
Unless you have other reasons to apply different rules to different types of versions, one rule is best.
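
The cost argument can be sketched in a few lines of Python. This is only a model of the reasoning above, assuming every rule's master expression is tested against every file in scope. The function name and file names are made up, and IMatch's real engine may shortcut some of this work.

```python
import re

def count_master_checks(file_names, master_expressions):
    """Count regex applications when every rule is tried on every file."""
    checks = 0
    masters = 0
    for expression in master_expressions:
        rx = re.compile(expression, re.IGNORECASE)
        for file_name in file_names:
            checks += 1  # one regex application per (rule, file) pair
            if rx.search(file_name):
                masters += 1
    return checks, masters

files = ["a.cr3", "a.jpg", "b.jpg", "b.on1"]  # made-up file names
one_rule = [r"\.(cr3)$"]
four_rules = [
    r"\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng)$",
    r"\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng)$",
    r"\.(cr3|cr2|nef)$",
    r"\.(jpg|jpeg)$",
]
print(count_master_checks(files, one_rule))
print(count_master_checks(files, four_rules))
```

In this model the number of checks grows linearly with the number of rules, and broader master expressions also produce more masters whose link expressions then have to be evaluated.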

bekesizl

« Reply #2 on: January 26, 2021, 02:16:14 PM »
Thank you!
I was hoping that would not be the case, but I will have to combine those rules into fewer, more complex ones.

Mario

« Reply #3 on: January 26, 2021, 03:26:56 PM »
Rules like \.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng)$ as master expressions are quite complex (IMatch must apply this to all files in the "changed" scope every time relations need updating, which happens very often, e.g. when files are added or updated).

Depending on the database size, this means hundreds of thousands of checks. You can see the runtime in the log file by searching for CIMRelationManager::UpdateRelations. It shows how many versions and masters were processed, how many files were checked and the execution time.
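
If searching the log by hand gets tedious, a small Python sketch like the following can pull out the relevant lines. Only the marker string comes from the post above. The log path in the usage comment is a placeholder, and the layout of the log lines is not assumed, so the lines are returned as-is.

```python
from pathlib import Path

MARKER = "CIMRelationManager::UpdateRelations"

def relation_update_lines(log_text):
    """Return every log line that mentions the relation update marker."""
    return [line.strip() for line in log_text.splitlines() if MARKER in line]

def read_log(path):
    """Read a log file as text, replacing undecodable bytes."""
    return Path(path).read_text(errors="replace")

# Usage (the path is hypothetical, point it at your own log file):
# for line in relation_update_lines(read_log("path/to/imatch_log.txt")):
#     print(line)
```

Comparing these lines before and after a rule change gives the run times to judge the change by.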

thrinn

  • Hero Member
  • ***
  • Posts: 921
« Reply #4 on: January 26, 2021, 06:16:09 PM »
Maybe set up both variants in parallel (few more complex rules vs. more different but easier rules) but make sure to always deactivate one set. Then you can test both approaches, checking against the logged run time as Mario said. I assume that deactivated rules do not have any performance impact.

I use different rules myself and did not experience any performance issues. But my database is small (< 30.000 files), and my computer quite new, so your mileage may vary.

Just as a side note: I find it difficult to "read" RegExp without trying, but wouldn't your simplification rule 4 make a JPG file a version of itself?
Thorsten
Win 10 / 64, IMatch 2018, IMA

Mario

« Reply #5 on: January 26, 2021, 07:06:41 PM »
Quote
rule 4 make a JPG file a version of itself?

Looks like it.
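
This is easy to check outside IMatch. The Python sketch below expands the link expression of simplification rule 4 for a made-up master name, `IMG_1234`, and tests it against the master's own file name. It assumes `{name}` expands to the file name without extension and that the syntax behaves the same in Python's `re` module as in IMatch.

```python
import re

name = "IMG_1234"  # made-up master name, i.e. the master file is IMG_1234.jpg
rule4_link = re.compile(
    r"^(_*" + re.escape(name) + r")[+\-_]*[0-9|a-z]*"
    r"\.(jpg|jpeg|dng|tif|tiff|heic|afphoto)$"
)

# Both optional parts, [+\-_]* and [0-9|a-z]*, may match nothing at all,
# so the master's own file name satisfies the link expression.
matches_itself = rule4_link.match("IMG_1234.jpg") is not None
matches_version = rule4_link.match("IMG_1234_v2.jpg") is not None
# Inside a character class, | is a literal pipe, not an alternation.
matches_pipe = rule4_link.match("IMG_1234|.jpg") is not None
print(matches_itself, matches_version, matches_pipe)
```

All three tests succeed, so the expression does accept the master itself. The third test also shows that `[0-9|a-z]` accepts a literal `|`, and `[0-9a-z]` is probably what was meant.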

bekesizl

« Reply #6 on: January 27, 2021, 07:58:53 AM »
Thank you Thorsten, testing multiple rule sets for performance is a good idea.

Regarding rule 4: it was only a quick, untested illustration of a different rule structure.
A JPG version of a JPG is OK in itself, for example when editing a JPG in a lossless editor and exporting it with a string appended to the file name.
But I should probably change this rule so that the version name must differ from the original (some string appended). Then again, these rules operate in the "Master folder", so the file system takes care of this anyway.

Carlo Didier

  • Super Hero
  • ****
  • Posts: 1657
« Reply #7 on: January 28, 2021, 08:11:40 AM »
If you don't notice a performance degradation, I would go for separate rules.
Those would be simpler to understand, debug and maintain.

bekesizl

« Reply #8 on: January 31, 2021, 05:53:47 PM »
I ended up with the following rules, in the following order.

It took some time to rebuild all existing relations in my database (over 70.000 files), but the processing time when adding a few new files is alright.

DxO Sidecar (buddy)
\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng|nef|rw2|raf|srw|arw)$
/^_*//
^{name}{ext}\.dop$


ON1 sidecar (buddy)
\.(cr3|cr2|nef|tif|tiff|jpg|jpeg|dng|nef|rw2|raf|srw|arw)$
/^_*//
^{name}\.on1$


Mylio XMP files (buddy) - Workaround for application compatibility
\.(tif|tiff|jpg|jpeg|dng)$
/^_*//
^{name}\.xmp$


JPG (buddy+version)
\.(jpg|jpeg)$
/^_*//
^{name}[+\-_]+.*\.(jpg|jpeg|dng|tif|tiff|afphoto|psd|heic)$


RAW (buddy+version)
\.(cr3|cr2|nef|rw2|raf|srw|arw)$
/^_*//
^(_*{name})[+\-_]*[0-9|a-z]*\.(jpg|jpeg|dng|tif|tiff|afphoto|psd|heic)$
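
A short Python sketch of how the last two link expressions behave, using the made-up master name `IMG_1234` and assuming `{name}` expands to the file name without extension. It runs in Python's `re` module, not in IMatch, so treat it as an approximation.

```python
import re

name = re.escape("IMG_1234")  # made-up master name
extensions = r"\.(jpg|jpeg|dng|tif|tiff|afphoto|psd|heic)$"

# JPG rule: [+\-_]+ demands at least one separator after the name.
jpg_link = re.compile(r"^" + name + r"[+\-_]+.*" + extensions)
# RAW rule: separator and suffix are both optional.
raw_link = re.compile(r"^(_*" + name + r")[+\-_]*[0-9|a-z]*" + extensions)

jpg_self = jpg_link.match("IMG_1234.jpg") is not None
jpg_edit = jpg_link.match("IMG_1234_edit.jpg") is not None
raw_same_name = raw_link.match("IMG_1234.jpg") is not None
print(jpg_self, jpg_edit, raw_same_name)
```

The JPG rule no longer links a file to itself, because a separator is now required. The RAW rule still accepts a JPG with the same base name, which is fine there since the master has a different extension.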

Mario

« Reply #9 on: January 31, 2021, 06:17:29 PM »
If this is what you really need...

Be careful with metadata produced by Mylio. The XMP records I have seen are pretty basic and only contain the small subset of XMP fields Mylio knows about.