Help wanted on filtering when one or more values in a set are missing

Started by dcb, June 16, 2025, 10:46:52 AM


dcb

Inspiration hit me today that I can use filters to help my workflow instead of a series of categories. It will save me time and reduce the number of intermediate categories I have.

Everything is working perfectly. I can filter for missing copyright, no set event (missing from category), no GPS, no people and not marked final (i.e. workflow is done for this image). I even have it successfully filtering out versions and buddy files which otherwise confuse the issue.

I want to be able to filter on IPTC location fields and show files where one or more fields are missing. If ANY of Country Code Shown, Country Shown, State/Province Shown, City Shown or Sublocation Shown is missing, I want to see the file. Some files won't have a City or Sublocation and that's fine. For my workflow I at least need to know which ones they are so I can determine whether it's an error or not.

Aside: I am using Autofill so missing metadata will disappear, and I have a dynamic category for my locations. I also have a large number of images to check.

The Value filter isn't working for me. I have one set up for each variable. Images only show if all are blank. As soon as one is filled in, the filter hides the image.



I can't crack it. Maybe filters here are not the answer. The other alternative is to have a Dynamic Category for each of Country, City, etc, filtered for missing values and then use the filter on Categories to select images in any of them.

Any help appreciated.
Have you backed up your photos today?

Mario


Quote: The Value filter isn't working for me. I have one set up for each variable. Images only show if all are blank. As soon as one is filled in, the filter hides the image.
Filters are combined via AND, not via OR, so using multiple Value filters to express "no city" OR "no location", ... cannot work.
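
A minimal sketch of the difference, in plain Python rather than anything IMatch runs. The field names and sample records are made up for illustration; the point is only that "every field is empty" (AND) and "at least one field is empty" (OR) select different files:

```python
# Hypothetical location fields; these are illustrative keys, not IMatch tag names.
FIELDS = ["country_code", "country", "state", "city", "location"]

def is_missing(record, field):
    """True if the field is absent or blank."""
    return not (record.get(field) or "").strip()

def stacked_filters(record):
    """One 'is empty' filter per field, combined with AND."""
    return all(is_missing(record, f) for f in FIELDS)

def any_missing(record):
    """The wanted behaviour: at least one field is empty (OR)."""
    return any(is_missing(record, f) for f in FIELDS)

empty = {}
partly_filled = {"country_code": "AUS", "country": "Australia", "state": "Victoria"}
complete = {"country_code": "AUS", "country": "Australia", "state": "Victoria",
            "city": "Bendigo", "location": "Rosalind Park"}

for name, rec in [("empty", empty), ("partly_filled", partly_filled), ("complete", complete)]:
    print(name, "AND:", stacked_filters(rec), "OR:", any_missing(rec))
```

The partly filled record is the case described in the question: the AND combination hides it as soon as one field has a value, while the OR combination keeps showing it.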


Quote: If ANY of Country Code Shown, Country Shown, State/Province Shown, City Shown or Sublocation Shown are missing, I want to see the file.
In other words, you want IMatch to check the contents of these tags:

+ Country Code Shown
+ Country Shown
+ State/Province Shown
+ City Shown
+ Sublocation Shown

and if at least one is empty, the file should be returned by your filter.

Several possible approaches come to mind:

1. A data-driven category for each of these tags, with "Other".
Then in the filter panel use a category filter and tick the "Other" category for each of the filter categories and use the "Or" operator.

2. Formula-based categories similar to the standard "No Description" or "No Title" categories in the "IMatch Workflow" categories set, e.g. "No City", "No Location". Then tick each of these categories in the Category Filter.

3. A variable-based filter with multiple chained "hasvalue" functions, one for each tag. The variable outputs e.g. "A" if a tag has a value, like:

{File.MD.countrycode|hasvalue:A}{File.MD.country|hasvalue:A}{File.MD.XMP::iptcExt\LocationShownProvinceState\LocationShownProvinceState|hasvalue:A}

Then, with one hasvalue term per tag (five in total), you can filter for all files with a value <> AAAAA.
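
The idea behind option 3 can be sketched in plain Python. This is not IMatch variable syntax; the dictionary keys are hypothetical, and the sketch assumes that a chained term contributes nothing when its tag is empty:

```python
# Hypothetical keys standing in for the five location tags.
TAGS = ["country_code", "country", "state", "city", "location"]

def signature(record, marker="A"):
    """Emit one marker per tag that has a value (assumed behaviour of the chained terms)."""
    return "".join(marker for tag in TAGS if (record.get(tag) or "").strip())

def is_incomplete(record):
    """A file is incomplete when its signature is anything other than five markers."""
    return signature(record) != "A" * len(TAGS)

partial = {"country_code": "AUS", "country": "Australia", "state": "Victoria"}
full = {"country_code": "AUS", "country": "Australia", "state": "Victoria",
        "city": "Bendigo", "location": "Rosalind Park"}

print(signature(partial), is_incomplete(partial))
print(signature(full), is_incomplete(full))
```

Because every filled tag produces the same marker, the signature only says how many tags are filled, not which ones. That is enough for an "anything missing?" test.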

Performance is OK (1) to not so good (3) depending on your database size.



dcb

Thanks Mario. As always you describe the problem better than I manage to. 

Option 1 would be my preferred option as well. Getting late here in Australia so I'll try that tomorrow after work.

Mario

Very good. Give it a try and give us some results (database size, performance etc.). I'm always happy to get more data.

dcb

Hi Mario,

I started with option 1, the creation of dynamic categories for a filter on Other. It proved unworkable very quickly. There is too much delay between setting the metadata fields (City, State etc) and the category updating for the filter to update. Instead I've gone with the brute force approach.

Almost every image will have a Location/City/State/Country/Country Code. For those which don't have a Location, almost all of those will have City/State/Country/Country Code. And everything will have State/Country/Country Code. This is fast.

For now I've created 5 filters, each identical apart from the field. As I walk down the list, using Autofill wherever possible, I'll see fewer and fewer "errors". By State I should have none. For example, when processing a folder of photos taken in my city of Bendigo, every one will have Bendigo/Victoria/Australia/AUS and then I'll just have to assign locations.

My filters in order are:

5.1 Missing Location
5.2 Missing City
5.3 Missing State/Province
5.4 Missing Country
5.5 Missing Country Code

I will put a more detailed description up on my website at the weekend (and post here) that describes all the filters I've created. I apply a "Final" label to those images I've added all the metadata to and I'm combining that with the filters above. Let's say for whatever reason a photo has none of the above values. I can tag it Final and it will then disappear from the list as processed. And for good measure I'm filtering out buddy and version files each time as well.



Tveloso

I have a Stored Filter called Incomplete Location Data which I think might do what you need.

It's based on both the Other element of a Tag-based Data Driven Category (what Mario has described, and you have already done), and on a Formula Category (to cover "partially populated Location Data"), combined with OR.

The first is a 5-Level Data Driven Category:

    1. Country
    2. Country Code
    3. State/Province
    4. City
    5. Location

...and the Other Element at Level-4 is the only one selected.

So that returns files where Country, State/Province, and City are all blank.

    (Having levels for both Country Code and Country is a bit redundant, but I include both to ensure that they're consistent)

The second Category uses this formula:
("@MetadataTag[countrycode,hasvalue]" OR
"@MetadataTag[country,hasvalue]" OR
"@MetadataTag[state,hasvalue]" OR
"@MetadataTag[city,hasvalue]" OR
"@MetadataTag[location,hasvalue]") AND
("@MetadataTag[countrycode,novalue]" OR
"@MetadataTag[country,novalue]" OR
"@MetadataTag[state,novalue]" OR
"@MetadataTag[city,novalue]")
This covers files where I have partially populated any of the various Location Levels, but it allows the Location Tag to be blank. 

So my requirements are that if a File has the Location Tag filled, all tags "above that" (City, State/Province, and Country) must also be filled...and every file must have data at least down to City (so a blank Location does not get called out by the Filter, as long as those other Tags have a value).
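
For anyone who finds the formula easier to read as code, here is a rough Python equivalent of its logic. It is only a sketch: the dictionary keys reuse the short tag names from the formula, and has_value is a hypothetical helper, not an IMatch function.

```python
ALL_LEVELS = ("countrycode", "country", "state", "city", "location")
REQUIRED_LEVELS = ("countrycode", "country", "state", "city")  # Location may stay blank

def has_value(record, tag):
    """True if the tag is present and not blank."""
    return bool((record.get(tag) or "").strip())

def location_exception(record):
    """Some location data is present AND at least one required level is missing."""
    any_present = any(has_value(record, t) for t in ALL_LEVELS)
    required_missing = any(not has_value(record, t) for t in REQUIRED_LEVELS)
    return any_present and required_missing

down_to_city = {"countrycode": "AUS", "country": "Australia",
                "state": "Victoria", "city": "Bendigo"}
location_only = {"location": "Rosalind Park"}
nothing = {}

print(location_exception(down_to_city))   # complete down to City, blank Location allowed
print(location_exception(location_only))  # Location filled but levels above it are missing
print(location_exception(nothing))        # no data at all: left to the Other category
```

Note that a file with no location data at all does not match this condition, which is why the stored filter also includes the Other element of the data-driven category.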

Quote from: dcb on June 17, 2025, 11:18:14 AM
I started with option 1, the creation of dynamic categories for a filter on Other. It proved unworkable very quickly. There is too much delay between setting the metadata fields (City, State etc) and the category updating for the filter to update
I sometimes also see a slight delay, with files that no longer violate the Filter, being removed from the scope.  But most of the time, it's immediate.

So as soon as I populate "the required Location Tags" (whether that be via AutoFill, Reverse Geocoding, or Copy&Paste in the MD Panel), the files usually immediately disappear from the File Window when the Incomplete Location Data Filter is active.  When there is the occasional delay, it can be many seconds before that update takes place (presumably due to the Categories not updating right away, as you said), but that doesn't cause me any issues.

When I reach the Location Phase of my workflow, I activate the Incomplete Location Data Filter (and many files immediately disappear from the FileWindow, since they were given proper location data at ingest - thanks to IMatch Locations), then send that subset of files to a Result Window.

Since the Filter Panel is automatically paused for a Result Window, it doesn't matter that there may be the occasional delay in updating the categories (since I'm not using the filter now anyway).  If the delay happens, I can still see that, by virtue of the Thumbnail Color Bar not updating immediately (since I have color coding active for the location Category, and when a file gets moved from the "Other" category, into a "proper location category", it contributes a green segment to the color bar), but it's of no consequence if that doesn't happen immediately.
--Tony

dcb

Quote from: Tveloso on June 17, 2025, 06:45:07 PM
I have a Stored Filter called Incomplete Location Data which I think might do what you need.

Sounds great Tony. I'll give it a look. At worst, it forms part of my general, every now and then "QA to check what's wandered off the path" series of checks that apply across the whole database. 

dcb

Quote from: dcb on June 17, 2025, 11:18:14 AM
I will put a more detailed description up on my website at the weekend (and post here) that describes all the filters I've created. I apply a "Final" label to those images I've added all the metadata to and I'm combining that with the filters above. Let's say for whatever reason a photo has none of the above values. I can tag it Final and it will then disappear from the list as processed. And for good measure I'm filtering out buddy and version files each time as well.
This has been my morning, but it's done.

- Filtering the Working Folder: A smarter start to image cataloguing

I've also updated my general workflow at Mediabank.

dcb

Quote from: Tveloso on June 17, 2025, 06:45:07 PM
The first is a 5-Level Data Driven Category:

    1. Country
    2. Country Code
    3. State/Province
    4. City
    5. Location

...and the Other Element at Level-4 is the only one selected:
What do you mean by the Other element at Level-4? That's City. Are you selecting every City's Other element in the filter?

Quote: (Having levels for both Country Code and Country is a bit redundant, but I include both to ensure that they're consistent)


In my travels I read that if both Country Code and Country are present, the Country Code wins. Sure, in most cases redundant but I'm using it for photos from England, Scotland and Wales. Country Code = GBR (correct), Country = England or Scotland or Wales (incorrect) but matches the way I want to find images.

Quote: I sometimes also see a slight delay, with files that no longer violate the Filter, being removed from the scope.  But most of the time, it's immediate.

So as soon as I populate "the required Location Tags" (whether that be via AutoFill, Reverse Geocoding, or Copy&Paste in the MD Panel), the files usually immediately disappear from the File Window when the Incomplete Location Data Filter is active.  When there is the occasional delay, it can be many seconds before that update takes place (presumably due to the Categories not updating right away, as you said), but that doesn't cause me any issues.


Updates are always too slow when you're in testing mode. Your category formula is fast enough for me today to be workable. 

I'll have to look into location matching on import. That will save me some time.


Quote: When I reach the Location Phase of my workflow, I activate the Incomplete Location Data Filter (and many files immediately disappear from the FileWindow, since they were given proper location data at ingest - thanks to IMatch Locations), then send that subset of files to a Result Window.

Since the Filter Panel is automatically paused for a Result Window, it doesn't matter that there may be the occasional delay in updating the categories (since I'm not using the filter now anyway).  If the delay happens, I can still see that, by virtue of the Thumbnail Color Bar not updating immediately (since I have color coding active for the location Category, and when a file gets moved from the "Other" category, into a "proper location category", it contributes a green segment to the color bar), but it's of no consequence if that doesn't happen immediately.
Again, stunned at all the hidden features in IMatch and I've been using it since 2003! Never knew I could send a subset of files to a result window.

bekesizl

I also recently started to use a Categories approach similar to this Working folder / Workflow categories. After some thinking I revised those categories and formulated them differently.

My question to Mario is whether it is more efficient this way:
1) Create one formula category for "No Country". As I understand, this evaluates the complete database anyway.
2) If I want to use this information elsewhere, create a formula category, e.g. "Working Folder" AND "No Country"
In my opinion this is only a "set operation" and it should be more efficient than having multiple categories that each need to evaluate the metadata formula every time.

sinus





That is why I have read the help-files of IMatch several times to learn such stuff.  ;) 
Using a subset of files and sending it to a result window is a very powerful thing. 
I use it very often. 

What I also use quite often are 10 categories which I have created and named "bookmark-1", "bookmark-2" and so on.
There I can throw some images in and even after closing and reopening IMatch, they are still there. Very good for working with events or files for editing and so on. When I'm finished, I simply unassign the files from the bookmark and this number is then free again for other files.  :)
Best wishes from Switzerland! :-)
Markus

Mario

Quote from: bekesizl on June 20, 2025, 07:59:46 AM
I also recently started to use a Categories approach similar to this Working folder / Workflow categories. After some thinking I revised those categories and formulated them differently.

My question to Mario is whether it is more efficient this way:
1) Create one formula category for "No Country". As I understand, this evaluates the complete database anyway.
2) If I want to use this information elsewhere, create a formula category, e.g. "Working Folder" AND "No Country"
In my opinion this is only a "set operation" and it should be more efficient than having multiple categories that each need to evaluate the metadata formula every time.
It's always faster to have only one @Metadata category than several. Each @Metadata formula evaluates the tag values for the entire database (same as a data-driven category level).
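
The reasoning can be pictured with a small Python sketch. It says nothing about how IMatch is implemented internally; the file names, metadata and the no_country helper are all made up. The expensive part is the scan over every file's metadata, and once that result exists, combining it with another category is a plain set intersection:

```python
def no_country(files):
    """Expensive step: look at every file's metadata once."""
    return {name for name, md in files.items()
            if not (md.get("country") or "").strip()}

# Hypothetical database contents.
files = {
    "a.jpg": {"country": "Australia"},
    "b.jpg": {"country": ""},
    "c.jpg": {},
}
working_folder = {"a.jpg", "b.jpg"}  # files assigned to a "Working Folder" category

no_country_set = no_country(files)                      # evaluated once
working_no_country = working_folder & no_country_set    # cheap set operation

print(sorted(no_country_set))
print(sorted(working_no_country))
```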

Tveloso

Quote from: dcb on June 20, 2025, 06:09:40 AM
Quote: The first is a 5-Level Data Driven Category:

    1. Country
    2. Country Code
    3. State/Province
    4. City
    5. Location

...and the Other Element at Level-4 is the only one selected:
What do you mean by the Other element at Level-4? That's City. Are you selecting every City's Other element in the filter?
Sorry, that was quite unclear.  I should have added "after drilling down through the Other Category at each of the higher levels".

So it's actually files in Category Other|Other|Other|Other that this part of the filter returns.  Files in that category have no value in all of Country, Country Code, State/Province, and City.
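
A small Python sketch of how such a path comes about, assuming each level falls back to "Other" when its tag is empty. The level keys are hypothetical and this is not how IMatch builds categories internally; it only illustrates the naming:

```python
# The five levels of the data-driven category, in order (hypothetical keys).
LEVELS = ["country", "country_code", "state", "city", "location"]

def category_path(record, depth=len(LEVELS)):
    """Build the category path, using 'Other' for every level without a value."""
    parts = [(record.get(level) or "").strip() or "Other" for level in LEVELS[:depth]]
    return "|".join(parts)

no_data = {}
bendigo = {"country": "Australia", "country_code": "AUS",
           "state": "Victoria", "city": "Bendigo"}

print(category_path(no_data, depth=4))
print(category_path(bendigo))
```

A file with nothing in the first four tags lands in Other|Other|Other|Other, while a file that is complete down to City but has no Location ends in a single trailing Other.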

I suppose I could have added that condition to the Location Exceptions category, and made the Incomplete Location Data filter based on just that one Category, but I thought since IMatch was already tracking the "no data at all" condition, the Location Exceptions category should handle only the condition where a file had some location data, but it was not "correct/complete"...and it might be useful to keep those two conditions separate.

Quote from: dcb on June 20, 2025, 06:09:40 AM
Quote: (Having levels for both Country Code and Country is a bit redundant, but I include both to ensure that they're consistent)

In my travels I read that if both Country Code and Country are present, the Country Code wins. Sure, in most cases redundant but I'm using it for photos from England, Scotland and Wales. Country Code = GBR (correct), Country = England or Scotland or Wales (incorrect) but matches the way I want to find images.
This probably also speaks to my ignorance of the structure of other Countries.

I actually added the Country Code to the Location Category, because some time ago I had the bright idea to use the two-character ISO Country Codes in IMatch, and actually changed the value in all files that had one (USA became US, GBR became UK, etc).  But since IMatch prefers the three-character Codes, after a while, I wound up with a mix of both.  So I thought an easy way to unify them again (and return to using the three-character codes), would be to add Country Code as a level in the Location Category.

Quote from: dcb on June 20, 2025, 06:09:40 AM
Again, stunned at all the hidden features in IMatch and I've been using it since 2003!
Yes indeed.  It has been more than once that I've said aloud: "that's so cool" after learning something new about IMatch.
--Tony

Mario

From time to time I ponder if adding the option "Automatic Update" to formula-based categories would be useful for a substantial share of the user base.

Categories which combine multiple (even many) @MetadataTag formulas to do whatever kind of analysis are helpful, but maybe are needed only occasionally. Refreshing them only when needed could be a performance boost. Not sure if it is worth the effort.

dcb

Quote from: Mario on June 20, 2025, 03:01:20 PM
From time to time I ponder if adding the option "Automatic Update" to formula-based categories would be useful for a substantial share of the user base.

Categories which combine multiple (even many) @MetadataTag formulas to do whatever kind of analysis are helpful, but maybe are needed only occasionally. Refreshing them only when needed could be a performance boost. Not sure if it is worth the effort.

I have about 40,000 files running on a motherboard M.2 SSD. That's more than quick enough but even so I've never had a problem with category refresh times. The delay I had earlier was between a change and when the update occurred. I trust your design decisions to prioritise other background functions for optimisation.

Once you add a toggle for the automatic update state of all categories, we'll want it for some, then we'll want it for groups, and it will also create a headache for you when new users don't understand why an update isn't happening (we've all done something like that).

I use more categories than the average user would and I don't think it's worth it. 

dcb

Quote from: Tveloso on June 20, 2025, 02:41:29 PM
Sorry, that was quite unclear.  I should have added "after drilling down through the Other Category at each of the higher levels":
   
No problems. I do that all the time. I understand what you're getting at now. It's working technically, but not quite yet set for my workflow but I know where to begin.

The case I'm missing is "No location data at all", whereas yours is only showing images where location (or other field) is present, but another above location is missing. Now I think I'm being unclear  :) I could get this from other/other/other/other/other but the whole point here is to avoid jumping around the interface to check things. A second category isn't going to be too much of an overhead I don't think.

Edit: Fix for me is to include the lowest level Other location. That then shows images which have no location information at all, or via your category, those which have a location but are missing other info.