Share Examples?

Started by Mario, March 10, 2025, 11:09:12 AM


Mario

It would be great to see some examples of results you produced with AutoTagger, including

  • a small version of the image
  • the AI used
  • your prompt
  • your results (keywords, description, landmarks, traits)

Most people, me included, learn best from examples, and if you are satisfied with your AutoTagger results, let others know how you did it.
Note: Please do not upload images of people; if you do, blur their faces before uploading.

Jingo

Great idea!

Here is mine: 
[attached image]

OpenAI Prompts:

Description prompt:

[[-c-]] Describe this image in a style to make it easily searchable. Use simple English, common words, factual language, and simple sentences.
Avoid describing anything not directly observable from the image.
{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Keyword prompt:

[[-c-]] Return ten to fifteen keywords describing this image.
{File.MD.hierarchicalkeywords|hasvalue:This image has the following keywords: {File.MD.hierarchicalkeywords}.}


Results:

AI Description:
This image shows a white-throated sparrow on the ground among leaves. The bird has brown and gray feathers, with a distinctive striped head. It is in a natural outdoor setting, surrounded by fallen leaves. The sparrow appears to be foraging for food.

AI Keywords:
AI|American Sparrow; AI|Brown Feathers; AI|Foraging; AI|Ground; AI|Songbird; AI|Striped Head; Animals|Bird; Animals|Bird|White-throated Sparrow; colorful; Habitat; Objects|Leaves; outdoors; Places|Nature; small bird; wildlife

I haven't yet set up 1:1 keyword mappings for some of the returned hierarchical keywords, but will be doing that next.
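
As an aside, the "|" in those returned keywords separates the levels of a hierarchical keyword path. Just to illustrate the structure (a minimal, hypothetical Python sketch; not the actual IMatch mapping feature):

# Flatten 'Animals|Bird|White-throated Sparrow' style paths into flat keywords.
# Hypothetical helper for illustration only; in IMatch this is done via keyword mappings.
def flatten_keywords(hierarchical: str) -> list[str]:
    flat = set()
    for path in hierarchical.split(";"):
        for level in path.split("|"):
            flat.add(level.strip())
    return sorted(flat)

print(flatten_keywords("Animals|Bird; Animals|Bird|White-throated Sparrow"))
# -> ['Animals', 'Bird', 'White-throated Sparrow']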

Stenis

Thanks for your input, Jingo, and good to have a dedicated topic for prompt examples, Mario. I'll contribute a little later.

One thing I didn't understand is this:

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Was there no data in these "Location" elements of yours?


Mario

Quote from: Stenis on March 10, 2025, 03:08:00 PM
One thing I didn't understand is this:

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

Was there no data in these "Location" elements of yours?
This variable checks if the city tag has a value. If yes, it outputs something like "This image was taken in London, United Kingdom". If there is no city value, the variable outputs nothing.

Not all files in a database may have location data when the prompt runs. The hasvalue check ensures that the phrase is only added to the prompt when there is data.
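
For example, assuming a file where city is "London" and country is "United Kingdom", the variable

{File.MD.city|hasvalue:This image was taken in {File.MD.city},}{File.MD.country|hasvalue: {File.MD.country}.}

expands to "This image was taken in London, United Kingdom." For a file without city and country values, it expands to an empty string, and nothing is added to the prompt.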

Stenis

Thanks for your reply, Mario.

Below you can see the basic "static configuration" I have landed on for now, and a few F7 prompt examples.



I now use all three static prompts, "Description", "Keyword" and "Landmark", and never change that data.
Landmarks (if any are found) are merged into the "Description" data with a variable.
I always use flat keywords.
I even think the "Landmark" prompt works surprisingly well :-)

One other important condition for my workflow is that I still use PhotoMechanic to batch-update about 20 data elements in IMatch.
For me it is much faster than doing it in IMatch.
AutoTagger updates the "Description" and "Keyword" elements with AI.
Compared to my earlier process, this is far faster and more efficient, because I don't have to write descriptions manually. It is as if a stone has fallen from my shoulders, and work has become so much easier.

I just have to read the ready-made texts now and sometimes adjust a few things, but most of the time it is more than good enough as it is.
I'm also very happy with the job OpenAI is doing in the background.
What a change this has brought to my processes!
It feels like getting a whole new photo archiving life!

Fantastic job you have done, Mario!
I was afraid the system would feel like a beta and not be flexible enough, but it is already very well engineered and solid.
Frankly, I did not really expect that, and I am glad to be proven wrong.



My Description prompt (static)

[[-c-]]
Always prioritize geographic data, the year, and people's names.
Always write text inside quotes but without the quote signs.
Describe this image in a style to make it easily searchable.
Use simple English, common words, factual language, and simple sentences.
Avoid describing anything not directly observable from the image.
Write the geographic data and year at the beginning of the text, together with names.



My Keyword prompt (static) (assumption: flat, non-hierarchical keywords only)

[[-c-]]
Keywords never in plural form.
Return five to seven keywords describing this image.
Use simple English, common words, factual language.
Max one word in each keyword.
No geographic data or time info as keywords.
No words with only capital letters.


My Landmark prompt (static) (assumption: merge with existing target (Description) values)

[[-c-]] Return the names of known landmarks and tourist spots in this image to {File.MD.description}
If you cannot detect any landmarks or tourist spots, return ''.
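
To make the merge concrete: at run time, {File.MD.description} expands to the file's current Description text. Assuming (hypothetically) a Description that already reads "Eze Sur Mer Provence France 2012 - By the sea", the AI would receive a prompt like:

Return the names of known landmarks and tourist spots in this image to Eze Sur Mer Provence France 2012 - By the sea
If you cannot detect any landmarks or tourist spots, return ''.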



The F7 prompt (non-static)

Template

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: "Place Country Year" - "motif"
In foreground "Name"


Examples


1.

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: Essaouira Morocco 2025 -
"Essaouira Fishing Harbor"

2.

Keywords never in plural form
Always write text inside quotes but without the quote signs.

Mandatory text: Eze Sur Mer Provence France 2012 -
"By the sea"
Ingrid Sändh in the foreground



Mario

To attach files, use the Attach command available below the full post editor. If you use Quick Reply instead, click "Preview" once to see all the controls for adding attachments.
Copying and pasting images from somewhere does not work.

Stenis

Using OpenAI GPT-4o instead of GPT-4o-mini with AutoTagger is a completely different thing!

Earlier I complained about my pretty poor results using the OpenAI mini model or Gemma 3 4B for the task of figuring out which species were in the pictures. They both struggled to decide which antelope or bird was in a picture. OpenAI got a bit better after I tweaked the prompts and added the names of the animals, and then it often even managed to fetch the Latin names, but at that point I thought the whole idea of speeding up these workflows was jeopardised.

Then I saw that I also had the option of using GPT-4o, tried it, and my jaw dropped. It added the species name, the Latin name, and the family the animal belongs to in almost 100% of cases, and that held for mammals, birds and reptiles alike.

[attached image]

These were my prompt texts:

Mandatory: Write the species and its name in Latin, plus the family of species it belongs to, into Description.
Write the species and family name into Keywords.


These were the results:

Description: A group of Damaliscus lunatus, commonly known as topi, graze in the savanna. The topi, part of the Bovidae family, is characterized by its reddish-brown coat and distinctive dark markings. The scene features scattered trees and open grassland, typical of the African savanna.

Keywords: Antelope, Grass, Savanna, Topi, Tree, Wildlife


[attached image]


Here is another example, with the same prompt as the first example above:
This picture of the really small dik-dik antelope was one that neither Gemma 3 4B nor GPT-4o-mini managed to figure out.

This is absolutely better than I expected. So, when I tag all my safari pictures, I will definitely use the larger OpenAI model. I have quite a few, since I have been to East and South Africa eight times in total (the first trip as early as 1972), and I also have safari pictures from both Sri Lanka and India to take care of. So OpenAI GPT-4o will help a lot. If it solves these problems as effectively as I have seen now, I don't really care about the cost, because what I gain is time and effort that I don't really have at my age. From what I have seen, it also hallucinates less than the other two smaller models.

I really struggled to get a handle on all of this before; no matter how effective PhotoMechanic has been, even in my hands, two data fields/elements always remained: Descriptions and Keywords. On the other hand, a picture library just gets better and more useful the more metadata you manage to add to the pictures; that is the positive and motivating side of it.

There is a new OpenAI model coming soon (version 4.5, I think) that might be even better. It seems there is a real race now. Will we get access to that one too in the future?

Stenis

[attached image]

The larger model seems to cost about 16 times more to use.

Stenis

I wonder if a new trend is building where software companies try to squeeze us a lot more than before. Look at the example below.

On top of all the other sharpening models in Topaz AI, Topaz has now created a Super Focus function that processes our pictures in the cloud, and using it seems to cost a lot.

Looking at the list below: using Super Focus V2 on one of my 33 MP pictures costs 14 points per picture, and their recommended purchase is 400 points for US$39.99. At that rate, 400 points cover 28 pictures, which works out to about US$1.43 per picture. Luckily, I tested it and the results on my pictures are better with the non-cloud models, but the trend is clear, I think: these companies will keep trying to charge us for processing time in the cloud.

Compared to that, AI companies like OpenAI not only give us a number of options to choose between, but also offer options that are very reasonably priced.


You cannot view this attachment.

Mario

Your point?

"Better" models cost more money. You rent "compute" and when the model is more expensive to compute, you pay more.

The new beta Google Gemini 2.5 model costs $10 per million output tokens, while Gemini 2.0 Flash (Lite) costs $0.30, making the 2.5 model about 33 times more expensive.
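
In per-image terms (a rough sketch; the 100 output tokens per image is only an assumption, real responses vary):

# Rough per-image output cost, using the prices above.
# tokens_per_image is an assumption for illustration only.
tokens_per_image = 100
cost_gemini_25 = tokens_per_image * 10.00 / 1_000_000  # $0.00100 per image
cost_gemini_20 = tokens_per_image * 0.30 / 1_000_000   # $0.00003 per image
print(f"2.5: ${cost_gemini_25:.5f}  2.0 Flash Lite: ${cost_gemini_20:.5f}")

Even the expensive model is cheap per image in absolute terms; the 33x factor starts to matter when you process tens of thousands of files.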

If the free Gemma 3 model does not do what you need, AutoTagger offers OpenAI (two models, cheaper and more expensive), Mistral (two models, cheaper and more expensive) and soon Google Gemini (cheaper and more expensive models).
Thankfully you have IMatch, which gives you many options and the freedom of choice.

Stenis

Yes, you have done a wonderful job, Mario, giving IMatch all that flexibility. It is really good that we have the possibility to "step up" when necessary, and my examples show very clearly, at least to me, that it sometimes is necessary, at least if you value efficiency and productivity over the absolute cheapest solution.

On the other hand, historically it has rarely been the most technically advanced software or processes that "win". Mostly it is "the good enough" that takes home the "victory" and the money.

For me, the small example with the African animals makes it very clear that a 16 times higher cost will be worth it in that particular case, for several reasons, and maybe for others as well. Taking on many thousands of safari pictures with the more expensive model might be what it takes to get that job done at all. With the cheaper alternatives it might be so cumbersome that it would never be done or finished.

With the OpenAI mini model I might need to spend much more time adjusting my prompts and rely on a trial-and-error workflow, instead of just letting GPT-4o do the whole job with one single prompt. There is a Swedish expression, "dumsnålhet" (I'm sure you have one in German too), which means embracing a combination of stinginess and stupidity because you are obsessed with the money you pay for the service.

It could have been the same with the Topaz example, but the similarities end there today, because that cloud service just isn't good enough yet, at least for me. In the tests I have done it still doesn't deliver; it mostly produces terrible artifacts. But say you had a "once in a lifetime" shot, and it did deliver and managed to turn a technically mediocre image into something close to a fantastic shot for, in that case, 70 cents? I could have been the market for that, because I do have some historical images that I know are unique, taken at the Saur Revolution when it all started in Afghanistan, leading to a period of war lasting over 40 years.

[attached image]

Stenis

Sorry, there was no more space for pictures above.

[attached image]

It was around noon on one of the last days of April 1978 when I took that picture. Thirty minutes later I had taken my pictures and was sitting in the Khyber Restaurant, which lies on the same gigantic roundabout, waiting for my lunch. First they started to shoot with Kalashnikovs, and after maybe ten minutes they fired the first cannon shot from one of the T-55 tanks parked just 10-15 meters from the restaurant. It was parked with its rear against the restaurant entrance, and the shock wave was enormous. If it hadn't been for the decimeter-wide aluminum blinds in the windows, we might all have been blinded by the glass that flew into the restaurant. The rest is history.

I have quite a few color slides that might need something more powerful than the local sharpening models in Topaz, and I know that a lot of other older photographers have the same problem.

Stenis

I will test how well Mistral performs too. You never know what the EU will run into with the US, and how that will affect the American AI companies.

I would rather buy a German DAM like IMatch than PhotoMechanic, especially when it is superior to PM anyway, the way I use them. I ditched Lightroom many years ago because its image quality was inferior to DxO Optics Pro / DxO PhotoLab, and nothing else can compete with the Danish Capture One when it comes to tethering. ... and I would rather use a European AI system, if it performs decently, than an American one, or a Chinese one like DeepSeek, if that had been a real option.

So what is your opinion of Mistral, Mario?

Mario

Quote from: Stenis on April 08, 2025, 07:02:36 PM
So what is your opinion of Mistral, Mario?
Give it a try. I have written what there is to know in the help: Mistral AI

Stenis

I will, thanks! ... even if OpenAI GPT-4o really impressed me when it comes to identifying animals and writing useful descriptions with a minimum of prompting effort.


Stenis

I processed a lot of different animals yesterday with OpenAI GPT-4o, the bigger model, and it only missed once, with a serval cat half hidden in the tall savannah grass that GPT-4o mistook for a cheetah.

I have just moved this from the ad hoc prompt (F7) into the Description prompt:

Write the animal species name and its name in Latin, plus the family name of the species it belongs to, into Description.
Write the animal species and family names into Keywords.

With that I almost never need to do anything other than press Run.
It is very, very effective.

Stenis

I will soon start a big effort to take on a lot of animal pictures that I have neglected for a long time, because there has been so much to face. For that reason I have felt I really have to be satisfied with my prompting before I start, because I don't want to have to redo it once more in the future. These are pictures taken between 1972 and 2016 in quite a few national parks in both Africa and India. So I discussed this in a few sessions with ChatGPT 4.1 and got the advice to add the section below to the Description prompt, so I did.

If there is an animal in the image, write the animal's common English name, its family, and its scientific (Latin) name, using this format:

Animal: [Common name]
Family: [Family name]
Scientific name: [Latin name]

If there is no animal in the image, do not write anything. Return nothing.


I also added a new section to the Keywords prompt, which I developed using part of the instructions above and adding a few other things too.

If there is an animal, flower or plant in the image, write the animal's, flower's or plant's ["Common name"]

If it is a Mammal write Animal and Mammal
If it is a Bird write Animal and Bird
If it is a Reptile write Animal and Reptile
If it is a Fish write Animal and Fish
If it is an Insect write Animal and Insect
If it is a Flower write Flower
If it is a Tree write Tree
If it is a Plant write Plant
If it is a Fruit write Fruit

If there is no animal, flower or plant in the image, do not write anything.
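
Just to illustrate the mapping logic the prompt above encodes (a hypothetical Python sketch, for illustration only; AutoTagger applies this via the prompt text, not via code):

# Category-to-keyword mapping, mirroring the prompt section above.
CATEGORY_KEYWORDS = {
    "Mammal":  ["Animal", "Mammal"],
    "Bird":    ["Animal", "Bird"],
    "Reptile": ["Animal", "Reptile"],
    "Fish":    ["Animal", "Fish"],
    "Insect":  ["Animal", "Insect"],
    "Flower":  ["Flower"],
    "Tree":    ["Tree"],
    "Plant":   ["Plant"],
    "Fruit":   ["Fruit"],
}

def keywords_for(category: str, common_name: str) -> list[str]:
    # Return the flat keyword list for a detected subject.
    return sorted(set([common_name] + CATEGORY_KEYWORDS.get(category, [])))

print(keywords_for("Mammal", "Plains Zebra"))
# -> ['Animal', 'Mammal', 'Plains Zebra']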


This setup, aimed especially at nature photography, has proven to work very well and leaves very little for me to add when I use it with the GPT-4.1 model. I know it costs more than the mini model, but I think it is well-spent money, since it is more reliable than the mini models. Be aware, though, that this setup is only meant to be used with a flat keyword model. As you can see in the instructions above, I also add that it IS an animal and what type of animal or plant it is, etcetera.

Here are a few examples (without pictures) of different types of animals:

Lizard: (Reptile example)

Serengeti Tanzania 2012 - A striking lizard rests on a rock. The reptile exhibits vibrant purple and blue scales, characteristic of the Chamaeleon species. This reptile showcases a beautiful color pattern. Common name: Agama lizard Family: Agamidae Scientific name: Agama agama

Agama Lizard, Animal, Reptile


Zebra: (Mammal example)

Serengeti Tanzania 2012 - Young plains zebra stands near water. Black and white stripe on body. Light reflects from water. Earth tone background. Typical Serengeti view. Common name: plains zebra Family: Equidae Scientific name: Equus quagga

Animal, Mammal, Plains Zebra, Stripe, Water



Verreaux's Eagle: (Bird example)

Serengeti Tanzania 2012 - A large dark eagle sits on a rock with green grass in the background. The bird faces away, showing detailed feathers. Good light and natural setting. Useful for wildlife and bird search. Common name: Verreaux's eagle Family: Accipitridae Scientific name: Aquila verreauxii

Animal, Bird, Grass, Rock, Verreaux's Eagle


Frangipani flower: (Flower example)

Zanzibar Tanzania 2012 - White frangipani flower with yellow center and dark green leaves. Close-up photo of the flower in daylight. Common name: Frangipani Family: Apocynaceae Scientific name: Plumeria alba

Flower, Frangipani, Green, Leaf, Nature, Petal, White



In order to get a mandatory initial text, I also always include a text like the one below in the ad hoc prompt window. In recent months I have found that, in my case, the OpenAI GPT-4.1 API has proven the most reliable when it comes to following formal instructions on, for example, text formatting. I have found that Google's models have had more problems with that, and that is the reason I now stick with OpenAI.

Mandatory text: Serengeti Tanzania 2012 -


Among more conservative photographers and museum workers involved in digitizing and metadata, I have often met the attitude that their work is so special, creative and demanding that it cannot be rationalized or streamlined at all with automated workflows or AI. I have to say that half a year ago, before I migrated to IMatch 2025 and started to use the latest API models we got a couple of months ago, I was prepared to agree with that.

I think just these small examples show that AutoTagger, with proper prompting, can do an awful lot of smart and effective things for our workflows. Of course there will be occasions when the AI misses or our prompting leaves something to be desired, but we are now very close to zero metadata administration. I must say there are very few texts I correct manually these days.

AutoTagger can't always match texts that I really put some manual effort into, BUT if I am honest, I can see that my PhotoMechanic Descriptions often tend to be shorter and less descriptive, because writing them manually is so time consuming. With AutoTagger, the generated texts and Keywords are generally better written, and the quality is more consistent.

It is also very interesting that the new "reasoning" AI models really can help us write better prompt instructions. This is something that empowers us in our dialogue with IMatch. It is a paradigm shift we are witnessing and experiencing now, and it is very exciting.

jch2103

Thanks; that's very useful information. I'm glad Mario set up this discussion thread. 
John

Stenis

#18
Me too, and the photools community is a very good and responsive resource which has helped me a lot.

I want to stress one thing: I use GPT-4.1 almost exclusively for nature pictures, where I am interested in getting help with identifying species of all sorts. So for safari pictures with animals I use nothing but the GPT-4.1 API. For everything else I prefer GPT-4.1-mini, which absolutely is "good enough" at a fraction of what it costs to use the bigger model.

There is another upside too: the smaller model is both faster and less prone to return unwanted texts like "there is no animal or plant in the picture" or "there are no texts or signs in the picture". So they are both indispensable, but for different reasons. One good thing is that we just need to write one set of prompts for all OpenAI 4.1 API versions.


Mario

Thanks for sharing.

The next release of IMatch brings support for Google Gemini 2.5 Flash Lite, which has just been released.
I'm not sure if it has advantages over the 2.0 model. Google claims it's faster.

I pretty much think that, for the purposes of AI in IMatch (descriptions, headlines, traits and keywords), the AIs are as good as it gets. The focus of AI development now seems to be reasoning models and agentic workflows, which offer no benefit for IMatch use cases.

I'm waiting for Google's Gemma 3n model to become available for Ollama and LM Studio (the version with vision support, I mean). This model is supposed to need less memory than Gemma 3, which could mean that even IMatch users with graphics cards with as little as 4 GB VRAM can work with it. When it becomes available and works well enough, I will add support for it.

For the most part, I only use Gemma 3 12B running locally. It produces more than good enough descriptions, headlines and hierarchical keywords for my needs.

Stenis

Agreed, Mario. I think what we have now in OpenAI 4.1 and also in Gemini Flash has proven totally sufficient for almost all needs we might have. As I wrote, there are situations where I have to use the bigger GPT-4.1 model, when my demands are higher than 4.1-mini can meet, but for more than 95% of my needs the mini model is just fine.

Gemini Flash is really fast already :-)

Maybe we won't need that much more speed and refinement, but the reasoning focus the developers have had lately might make their APIs write even better texts, and sometimes a source like ChatGPT or Google's Gemini might help us write and refine our prompts.

I am very satisfied now with my workflows and have stopped testing different API solutions for now, in order to get some real work done. Finally, I'm really making progress with my picture library metadata without killing myself. IMatch is a real relief :-)

It is very good and impressive, though, that you continue to give us both the latest commercial and free API options. That is a huge selling point for IMatch compared to most other picture library / DAM solutions, which many times just try to lock their users into all sorts of proprietary, maze-like solutions.

Stenis

The cost of using even a commercial service like the OpenAI GPT-4.1 API, and especially GPT-4.1-mini, is also very reasonable. I prefer OpenAI's payment model, where it is possible to have total control of the costs through their system of paying in advance, instead of letting them charge you as you go. We know from real examples that Google has charged people fantastic sums after they activated deep research models just by mistake. At least before, there were no warnings concerning the costs.

It is also very nice to offer the totally free Google Gemma 3n model for people with basic GPU processing power; with a model like that it is impossible to make cost "mistakes" like the ones possible with Google Gemini.


Jingo

I've been using Gemini 2.0 Flash Lite and it is quite successful in returning results for my bird and wildlife/flower photography. It often confuses a few birds that are really close to each other (e.g. Eastern Phoebe vs. Pewee), but I too have a hard time differentiating these... If the AI can't figure it out after two tries, I just figure it out myself, provide a keyword tag, and then let the AI use that keyword to get me detailed descriptions and further keywords.

Overall, it has about a 90% positive return rate on identifying birds, which is really great... especially for the more common things, which saves me time. It even correctly identified a juvenile Bald Eagle which I thought might have been a Golden Eagle... and nailed this female Baltimore Oriole.

A positive use of AI for sure!!

[attached images]

Mario

Quote from: Jingo
I've been using the Gemini 2.0 Flash Lite and
Have you tried the big 2.0 or even the 2.5 model? They are more costly, but may produce better results in these cases.
As Stenis points out, sometimes the faster and cheaper models don't cut it. Keeping a copy of your AutoTagger settings with the bigger model selected can be really helpful in such fringe cases.