Images with Metadata conflicts

Started by Grasbeak, May 17, 2025, 08:24:44 AM


Grasbeak

I am creating an IMatch 2025 database of ~100,000 images that I previously managed in IMatch 3. I am using IPTC Categories and Keywords to create my IMatch 2025 categories. I used a variety of IPTC fields for my image information in IMatch 3, and of course there was EXIF metadata (including GIS). Those tags hold data that is important to me. I have not yet used XML tags, so any XML tags associated with my images could be deleted without data loss.

My conversion to IMatch 2025 seems to have gone quite well (I am still assessing), but for several weeks I have been stuck on one problem: 27,888 images have metadata conflicts and are continually queued for metadata write-back. The conflicts vary between images, but all of them appear to be between EXIF or IPTC data and XML (including embedded XML) data. The most common conflict appears to involve the creation date.

I have run metadata write-back many times hoping that would resolve the issues.  

I tried to use the remove XML metadata preset for the ExifTool Command Processor under the IMatch Tools, but I am not sure that it actually deleted the XML data (which I assume would resolve the conflict).

On my last attempt, I accidentally deleted the remove XML data preset from the ExifTool Command Processor presets on my installation: I hit the "delete" button, which deletes the preset, instead of the "run" button.

I have a few questions:

1) I do not use virus protection beyond that provided by Windows 11.  I have not disabled any virus protection for Windows 11.  Would Windows 11 have virus protection I would need to disable for these functions?

2)  Would I be able to recover or copy the Remove XML Metadata preset from somewhere to restore it on my installation?  

3) When running the remove XML metadata preset, do I need to select the option to run once per file?  I selected the option to run once.

Any other suggestions on how to resolve these metadata conflicts are appreciated.




Metadata Analyst summaries from two images are below:

Metadata Analyst Results. Version 2025.3.2. 5/17/2025 1:11:40 AM
File analyzed: C:\Temp until new HD\9 Fall 2024\DS71818.JPG
Errors: 0
Warnings: 9

Warning: [System] File has unwritten metadata (pending write-back). The metadata loaded from the image and the data in the database may not match.
Warning: [EXIF] Offset TimeOriginal missing. 'Date Subject Created' may have no time zone offset.
Warning: [Legacy IPTC] Created timestamp missing.
Warning: [Legacy IPTC] Character Set Encoding: unspecified.
Warning: [XMP] Embedded XMP record (photools.com IMatch 25.3.0.2 (Windows)) and XMP sidecar file (photools.com IMatch 25.3.0.2 (Windows)) found.
Warning: [XMP] [ExifIFD]:DateTimeOriginal and [XMP-photoshop]:DateCreated (embedded) mismatch.
Warning: [XMP] [ExifIFD]:DateTimeOriginal and [XMP-photoshop]:DateCreated (sidecar) mismatch.
Warning: [XMP] [ExifIFD]:UserComment and [XMP-dc]:Description (embedded) mismatch.
Warning: [XMP] [ExifIFD]:UserComment and [XMP-dc]:Description (sidecar) mismatch.


Metadata Analyst Results. Version 2025.3.2. 5/17/2025 1:15:17 AM
File analyzed: C:\Temp until new HD\9 Fall 2024\DS71820.CR2
Errors: 4
Warnings: 7

Warning: [System] File has unwritten metadata (pending write-back). The metadata loaded from the image and the data in the database may not match.
Warning: [Metadata] Warnings: 'IPTCDigest is not current. XMP may be out of sync'
Warning: [EXIF] Offset TimeOriginal missing. 'Date Subject Created' may have no time zone offset.
Warning: [Legacy IPTC] Created timestamp missing.
Warning: [Legacy IPTC] Character Set Encoding: unspecified.
Warning: [XMP] [ExifIFD]:DateTimeOriginal and [XMP-photoshop]:DateCreated (sidecar) mismatch.
Warning: [XMP] [ExifIFD]:UserComment and [XMP-dc]:Description (sidecar) mismatch.
Error: [Keywords] Flattened hierarchical XMP (sidecar) keywords don't match IPTC keywords.
Error: [Keywords] Flattened hierarchical XMP (sidecar) keywords don't match XMP keywords.
Error: [Keywords] IPTC keywords contain extra spaces at beginning or end.
Error: [Keywords] XMP (sidecar) keywords contain extra spaces at beginning or end.

Mario

For the JPEG:

Warning: [XMP] Embedded XMP record (photools.com IMatch 25.3.0.2 (Windows)) and XMP sidecar file (photools.com IMatch 25.3.0.2 (Windows)) found.

This means the JPG has an embedded XMP record and there is also an external XMP file with the same name as the JPG in the same folder. IMatch has to merge the XMP data during metadata import. IMatch always embeds XMP in JPG files (and updates the embedded XMP record), as per standard, but will not update the external XMP sidecar file.
This combination may cause synchronization issues. I suggest you delete the XMP sidecar file that has the same name as the JPG, or move it to another folder, and then write back again.
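
For a large number of files, locating those sidecars by hand is tedious. The following is a minimal Python sketch, not an IMatch feature, that lists XMP sidecars sharing a base name with a JPEG in one folder and optionally moves them to a backup folder. It assumes the sidecar is named like the image with an .xmp extension (for example DS71818.xmp next to DS71818.JPG); the function names and the backup folder are hypothetical. Sidecars that belong to RAW files such as CR2 are left alone.

```python
from pathlib import Path
import shutil

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def find_jpeg_sidecars(folder: Path) -> list[Path]:
    """Return .xmp files in `folder` whose base name matches a JPEG there.

    Assumption: the sidecar is named <image base name>.xmp.
    """
    files = [p for p in folder.iterdir() if p.is_file()]
    jpeg_stems = {p.stem.lower() for p in files if p.suffix.lower() in JPEG_SUFFIXES}
    return sorted(
        p for p in files
        if p.suffix.lower() == ".xmp" and p.stem.lower() in jpeg_stems
    )


def move_jpeg_sidecars(folder: Path, backup: Path, dry_run: bool = True) -> list[Path]:
    """List the JPEG sidecars and, unless dry_run is set, move them to `backup`."""
    sidecars = find_jpeg_sidecars(folder)
    if dry_run:
        return sidecars
    backup.mkdir(parents=True, exist_ok=True)
    for sidecar in sidecars:
        target = backup / sidecar.name
        if target.exists():
            # Never overwrite an earlier backup silently.
            raise FileExistsError(f"backup already contains {target.name}")
        shutil.move(str(sidecar), str(target))
    return sidecars
```

Calling move_jpeg_sidecars(folder, backup) with the default dry_run=True only reports what would be moved, which is a sensible first pass before touching 27,888 files. Moving rather than deleting keeps the sidecars available in case something in them turns out to be needed.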

Also, the JPG (and, for some reason, the CR2) has a legacy IPTC record, which is very often the cause of synchronization problems, as described in Metadata Problems and Pitfalls. Depending on the keywords in IPTC, your thesaurus, and the options you have enabled under Edit menu > Preferences > Metadata, IMatch may be unable to merge flat keywords in XMP and legacy IPTC.

If you don't rely on the legacy IPTC record for client or agency use, remove it with the corresponding "Delete legacy IPTC data" preset in the ExifTool Command Processor.

Quote: I accidentally deleted the remove XML data preset from the

Here is the Delete XMP metadata preset:

# im-warn
-overwrite_original_in_place
-xmp=
-charset
filename=UTF8
{Files}

Create a new preset and copy & paste from above into it.
You can run it once, for all selected files. The reasons for running a preset once for each file are outlined in the ECP help.
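
To see what the preset amounts to outside the ExifTool Command Processor, here is a hedged Python sketch that only builds the equivalent exiftool argument list. It assumes that {Files} expands to the selected file names and that the "# im-warn" line is an IMatch-specific directive with no meaning for exiftool itself; the function name is hypothetical. The sketch does not run anything, because changing files behind IMatch's back is not what is recommended above.

```python
def build_delete_xmp_command(files, exiftool="exiftool"):
    """Build an exiftool argument list mirroring the preset's options.

    Assumption: one invocation covers all files, like running the preset once.
    """
    files = [str(f) for f in files]
    if not files:
        raise ValueError("no files given")
    return [
        exiftool,
        "-overwrite_original_in_place",  # rewrite the file in place, no *_original copy
        "-xmp=",                         # remove the XMP block
        "-charset", "filename=UTF8",     # treat file names as UTF-8
        *files,
    ]
```

Printing the result for two files shows a single command line with both file names at the end, which is what running the preset once for all selected files corresponds to.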

Grasbeak

Thanks Mario.

All my data is currently in IPTC, so I will retain that until I am confident that all the data has been properly transitioned to Categories or XML.

I didn't realize RAW XML was handled differently than JPG XML.

I greatly appreciate the timely guidance.

Mario

XMP (not XML) is the term to use. That's the metadata standard IMatch uses primarily. Legacy IPTC metadata is automatically imported and converted into the modern XMP IPTC and XMPExt namespaces standardized by the IPTC committee over the past 20 years.
You can see the XMP data in, for example, the Default Metadata Panel layout, the IPTC layout, and the IPTC Location layout.

There is really no reason to keep legacy IPTC around, unless your clients or agencies still require it for some reason.
Especially when it causes issues, as in your case, the simplest solution is to get rid of it.

Quote: I didn't realize RAW XML was handled differently than JPG XML.

I recommend reading Metadata for Beginners in the IMatch help to learn more about the metadata standards and storage schemes, and Metadata Problems and Pitfalls to learn about typical metadata problems and the role legacy IPTC plays.

Stenis

Are you really sure about: "There is really no reason to keep legacy IPTC around, unless your clients or agencies still require it for some reason. Especially when it causes issues, as in your case, the simplest solution is to get rid of it."

I'm not convinced about that at all. Isn't it all about the interoperability demands you are living under, that is, all the software you use that depends on XMP/IPTC metadata? Even though they have had 20 years or so to migrate, all four of the other XMP-aware tools I use display IPTC in their user interfaces.

PhotoMechanic uses IPTC internally and forks XMP, and so does DxO PhotoLab. Even Capture One uses IPTC in its metadata interface. All of them are capable of maintaining XMP, BUT none of them flags or uses XMP in its user interface; they use IPTC. Despite that, I have never experienced a conflict, since I strictly use only one metadata source and maintenance tool at a time: PhotoMechanic or IMatch. As long as one does not allow bidirectional workflows, there doesn't seem to be any problem really.


Mario

Whatever works for you. If your legacy IPTC data is not giving you trouble, keep it. Otherwise, fix it or lose it.
PM has a very specific audience (photojournalists). That's why I mentioned clients and agencies.

Stenis

I know, but is that the reason?
To me it looks like they don't have the strength to modernize their platform.
I guess sooner or later they will have to rewrite the whole application with XMP as the base instead of IPTC.
... but that is not likely to happen with venture capitalists owning the company now, is it?