Natural Language Queries

Started by javiavid, October 15, 2025, 09:29:03 AM


javiavid

I just tried the new option. I have tried the three recommended models, and I have also tried activating and deactivating tags and using variables...

I ran it on 45 photos that had already been through the AutoTagger process, so they have descriptions and keywords.

The problem is that it does not filter the images.
I'm probably doing something wrong or missing some step.

To be more specific about the error: I ask it to search for pool images and it shows me any image at all, a keyboard, a mountain... There are only 2 pool photos in the selection, and the search shows me 38 photos.

Mario

Did you Rebuild the embeddings after every change?

I understand you created embeddings for 45 images, from description and keywords.
When you search, you get results (you can only get results from images with embeddings), but the results are not what you expect.

IMatch will search the 45 images and open a result window showing the first 200 matches (in your case, maximum 45 images).

Is the result set sort profile set to "Default" so it sorts by similarity?

It is normal to get nonsense results ("stuff") because IMatch does not cut off the results at a minimum similarity threshold. It just returns the first 200 images whose embeddings best match the embedding created from the query.
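(For illustration only - this is not IMatch's actual code, just a minimal Python sketch of the ranking behavior described above: every image with an embedding is scored against the query embedding and the best 200 are returned, with no minimum-similarity cutoff.)

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_images(query_emb: np.ndarray, image_embs: dict[str, np.ndarray], limit: int = 200):
    # Score every image; nothing is filtered out, only ranked.
    scored = [(name, cosine_similarity(query_emb, emb)) for name, emb in image_embs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)  # most similar first
    return scored[:limit]  # with only 45 images, all 45 come back
```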


javiavid

Yes, after each change I used the rebuild option. I also tried the remove embeddings and delete embeddings options.

I have tried several languages and more or fewer photos, but the result is totally random.

I have sorted the images by Default and tried other options, but I cannot get the images that have the word pool in their keywords and description to appear in the top 20.
The first ones I get are a tree, a piano...

Is there something I am doing wrong? Can you think of anything to try?

Mario

Then this is what you get.
IMatch creates embeddings using the model you configured. It creates an embedding from your search query.
If you have only 45 images, it will always return all images as matches, sorted by the similarity computed from the embeddings.
The similarity is shown in the result window when you use the Result Window layout. Which similarities do you see? Different than zero, sorted correctly?

javiavid

Maybe I misunderstood the feature.
I have several photos with keywords and a description containing the word pool. If I do a standard search for pool, those photos appear correctly.

If I use the new feature with the words pool or woman in pool, many photos appear, but the first photos are not the pool photos. They are in the middle of the list.

Is this normal?

I don't mind that many photos appear, but the pool ones should come first.

Mario

Then the AI embeddings created for the first photos shown in the result window match your query "better".

This is a "fuzzy" search. IMatch just compares the embedding produced for your query with the embeddings created for your images, and then lists the matches sorted by similarity, best matches first.

Usually you don't search for a single word either. If you want to find all images with the keyword "pool", just use the search bar in the File Window. If you want to find all files with keywords or descriptions describing "pool-like" things, you use the AI-enabled search.

In my test database, for example, I have product photos of handbags, duffel bags, rucksacks, and photos of people wearing bags of all sorts etc. These terms also reflect in the keywords / descriptions.

When I only search for bag, I get photos containing bags or similar, in no particular order.
When I search for "person wearing a bag", the product photos move down in the search result and the photos showing people wearing bags, sacks, rucksacks, duffel bags move up.

When I search for "person wearing a bag outside", the product photos move down in the results, and photos of persons walking on streets or in the woods carrying a bag or rucksack move up. 

If you know what you are looking for (images with the keyword pool), use the File Window search bar. If you don't know the exact word, or the words used to describe a "concept" or "motif" or "thing" vary, the AI-enabled search can be very helpful.
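(Again just a hedged sketch to illustrate the difference, not how IMatch implements either search: a keyword search filters the file set, while an embedding-based search ranks all of it.)

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Photo:
    name: str
    keywords: set[str]
    embedding: np.ndarray

def keyword_search(photos: list[Photo], word: str) -> list[Photo]:
    # Exact match: only photos actually tagged with the word are returned.
    return [p for p in photos if word.lower() in {k.lower() for k in p.keywords}]

def semantic_search(photos: list[Photo], query_emb: np.ndarray) -> list[Photo]:
    # Fuzzy match: every photo is returned, ordered by similarity to the query.
    def sim(p: Photo) -> float:
        return float(np.dot(p.embedding, query_emb) /
                     (np.linalg.norm(p.embedding) * np.linalg.norm(query_emb)))
    return sorted(photos, key=sim, reverse=True)
```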

I've had good success just using {File.MD.description} for the query, to reveal all images with descriptions "similar" to that of the focused file. Same for {File.MD.hierarchicalkeywords}.
With only 45 files, you won't really see a benefit. I've worked with databases containing 50,000 or 100,000 images taken over 25 years and described by hand and later by AI.

javiavid

Thanks for the answers, Mario. I will use the standard search and try again in the future.

monochrome

Quote from: javiavid on October 15, 2025, 09:29:03 AM
The problem is that it does not filter the images.
It's not supposed to filter them. It picks the 200 semantically nearest images, and if you only have 45 it will pick all of them, always.

If you also don't sort by similarity (there's a dropdown with sort order), you will always get all your images, sorted by file name.

Try creating embeddings for 10,000+ images, then search and sort by similarity. It'll be a much better experience.

Stenis

Thanks for your efforts, Mario. I shall install the new version today. I hadn't seen info about the new release before now, since I have had plenty to sort out with the malfunctioning AI features in DXO Photolab 9 (really severe problems with Nvidia GPU drivers).

Looks like quite a few interesting new things in this package. NLQ is one of them. I'm also looking forward to testing the released version of GPT 5, since my tests with the prerelease version were not entirely encouraging. I managed to get it working, but it was really much slower than GPT 4.1.

Mario

Quote from: Stenis on October 18, 2025, 12:21:16 PM
I'm also looking forward to testing the released version of GPT 5, since my tests with the prerelease version were not entirely encouraging. I managed to get it working, but it was really much slower than GPT 4.1.

Keep in mind that "newer" does not mean "better" in all disciplines. Currently all tech companies are zooming in on "thinking", agentic workflows and MCP. Not necessarily improvements for visual capabilities or image-to-text.
New models must always be carefully evaluated and tested for a given usage scenario, performance and cost.
Image-to-text seems to be plateauing already.

I guess most users are perfectly happy with GPT 4.1 mini or even locally running models like Gemma 3 12B/4B or the new Qwen 2.5 7B, which AutoTagger also supports.


Stenis

The problem, Mario, is that I have never gotten version 5 from OpenAI to work at all with this new version. It is completely dead. I thought it might have been the rate limits, but later I started to wonder whether it is not using the right endpoint of the OpenAI API for version 5.

... but you are absolutely right. Since Gemma was even part of the AI support for the "Flexible Natural Language Query", I tried it again by chance in AutoTagger and was astonished by how well it solved even problems I know I had with it earlier with my animal pictures. It nailed it all - quickly too - so now I'm thinking of using it instead of OpenAI, which has had some small but important issues.

But why is version 5 dead??

I even have problems with the "Natural Language Queries":

"Error while processing the query ... " that with Gemma.


Mario

#11
I have no issues running GPT 5 and GPT 5 Mini.
But the response times are really bad: 17 seconds for GPT 5 and 20 seconds for GPT 5 Mini (at 13:20+02:00 on October 19).

Maybe this is caused by the "thinking" approach of these "reasoning" models, which always adds a lot of runtime and cost, and which is one of the major differences from the 4.1 model generation.

I think there is a proprietary setting in the OpenAI API to reduce the reasoning effort, and other arguments to control the reasoning. But setting this all to minimal reduces GPT 5 basically to GPT 4, I think?! The reasoning aspect and the ability to call tools and MCP are the main selling point of the GPT 5 series, and I don't think that you will get better results than with GPT 4.1 anyway.
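(Hedged sketch only - the exact parameter names and accepted values must be checked against the current OpenAI documentation; the model id and effort value below are assumptions.)

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The Responses API is documented to accept a "reasoning" object whose
# "effort" can be lowered to cut thinking time (and cost).
response = client.responses.create(
    model="gpt-5-mini",               # assumed model id
    reasoning={"effort": "minimal"},  # assumed value; "low"/"medium"/"high" also exist
    input="Suggest five keywords for a photo of a woman in a swimming pool.",
)
print(response.output_text)
```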

I've tried some of the reasoning-related parameters, but it always produces error messages. The OpenAI documentation has become a copy & paste minefield full of non-working examples, unclear documentation of parameters, unclear statements about which model supports which parameters, which defaults are used for omitted parameters, etc.
Probably helps the bottom line - requests which are denied because of faulty syntax still cost money.

Stenis

I got it to work with the file you sent me, with poor performance though. With this release nothing seems to happen.

I'm bringing this up because we never know when OpenAI will decide to discontinue the GPT 4.x API, like they did with the chat models - a decision they then had to partly walk back.

Mario

If OpenAI returns an error, runs into a timeout or something else fails, IMatch will add entries to the log file.

Stenis

One thought, Mario: say you fall under a bus or get totally fed up with developing; then, after a while, several models now supported will get discontinued.

Then it might be a good idea to secure local models running on, for example, Ollama, I guess.
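(For illustration, a hedged sketch of what calling a locally running model through Ollama's HTTP API can look like; the endpoint follows Ollama's documented /api/chat route, and the model tag is an assumption that must already be pulled locally.)

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",  # default local Ollama endpoint
    json={
        "model": "gemma3:12b",  # assumed model tag; pull it locally first
        "messages": [{"role": "user",
                      "content": "Suggest five keywords for a photo of a woman in a pool."}],
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```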

Mario

Quote:
One thought, Mario: say you fall under a bus or get totally fed up with developing; then, after a while, several models now supported will get discontinued.
This is true for all 3rd party services, from reverse geocoding to the Map Panel to AutoTagger.

The same will happen if there are breaking changes and you did not upgrade IMatch to a version which supports the latest APIs. While IMatch keeps running as before, access to external services may cease to work if the services become incompatible with your old IMatch version. Just the nature of things.

Quote:
secure local models running on, for example, Ollama, I guess.
Same problem! Newer models require changes to Ollama / LM Studio which may require changes in IMatch to support newer APIs published by Ollama / LM Studio.

Newer versions of Ollama / LM Studio may drop support for older models at some point in time.

Everything in AI is moving at breakneck speed. Stabilizing APIs, long-term backwards compatibility, and LTRs (long-term releases), common for many other technologies, are definitely not on the list of AI companies at the moment. Or for the near future.

Stenis

I'm pretty convinced every commercial AI API will get discontinued sooner or later, and if there is no one creating interfaces for them, the logical end will eventually be that there are no newer versions to replace them with.

Still, I'm not all that worried, since there will always - as long as XMP metadata still rules the photo metadata scene - be a possibility to migrate to another XMP-compatible DAM where they still maintain their API interfaces.

If the AI industry won't take care of the compatibility issues in the long run in a proper manner, it will cease to be an industry and will just revert to being a proprietary universe with a lot of separate systems without any interoperability. Very much like the camera industry looks today.

Mario


Quote:
Still, I'm not all that worried, since there will always - as long as XMP metadata still rules the photo metadata scene - be a possibility to migrate to another XMP-compatible DAM where they still maintain their API interfaces.

That's why I care so much about standards. Creating rich and standard-compliant metadata is one of the key aspects of IMatch's metadata management. 

How much metadata the target system can import depends on its capabilities. Pro-grade DAM systems are usually good; with end-user systems you need to check.

Stenis

IMatch is exemplary, but the RAW converters are not, and not even Photo Mechanic: they are all still stuck in IPTC and have not had the skills or strength to take on a rewrite of that. Instead, they are converting or forking data to and from XMP. Sooner or later they will have to do that job. This is an important reason why I decided to migrate to IMatch, and that is a decision I haven't regretted.