Metadata conciliation again

Started by jmsantos, May 15, 2019, 10:30:21 AM


jmsantos

I keep testing IMatch to solve metadata problems. This post follows up on another one (Metadata discrepancies IM-LR) in which Mario said:

Quote from: Mario on April 10, 2019, 10:18:43 AM
If your image collection has older images produced by older Adobe products or other software, you will need a software like IMatch to point out the problems and allow you to fix them.
Getting the metadata straight and standard-conform is usually one of the biggest challenges when migrating to a true DAM software. As long as you stick to the Adobe world you might never notice any problems...

The challenge I have is a collection of about 500,000 photographs from an institutional archive. Metadata has been written by different people with different levels of knowledge, mainly with Adobe Lightroom and other software such as XnView. There are photographs in different formats, but the main problem is related to RAW (especially NEF and RAF) and its XMP sidecar.

From that earlier thread I learned that IMatch follows the recommendations of the Metadata Working Group (MWG) and that Adobe does what it wants. The point is that I am faced with an archive in which the RAW files have different metadata for the same tags in the image and in the sidecar. The MWG guidance indicates that only four properties are shared between Exif, IPTC and XMP: Copyright, Description, Creator and Date/Time. Problems arise when there are discrepancies in those data, as in my case: for example, a RAF file whose XMP sidecar contains a copyright value that does not match the EXIF data. Mario then suggested ticking the option "Favor XMP sidecar file", because I need to override the IMatch default, "Data in image is more important than data in sidecar file".
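
To make such discrepancies visible before choosing an import option, a comparison along the following lines may help. This is only a minimal sketch: it assumes the embedded and sidecar tags have already been extracted into plain dictionaries (for example with ExifTool), it uses ExifTool-style names for the EXIF and XMP counterparts of the four properties, and `MWG_PAIRS` and `find_discrepancies` are hypothetical names, not part of IMatch or Lightroom.

```python
# (embedded EXIF tag, XMP tag) for the four properties mentioned above.
# The keys are ExifTool-style "Group:Tag" names, used here for illustration.
MWG_PAIRS = {
    "Copyright": ("EXIF:Copyright", "XMP-dc:Rights"),
    "Description": ("EXIF:ImageDescription", "XMP-dc:Description"),
    "Creator": ("EXIF:Artist", "XMP-dc:Creator"),
    "Date/Time": ("EXIF:DateTimeOriginal", "XMP-photoshop:DateCreated"),
}


def find_discrepancies(embedded, sidecar):
    """Return {property: (embedded_value, sidecar_value)} for conflicting values.

    Only properties present in BOTH places are reported. Values are compared
    as plain strings, so real date values would first need to be normalized
    (EXIF and XMP use different date formats).
    """
    conflicts = {}
    for prop, (exif_tag, xmp_tag) in MWG_PAIRS.items():
        in_image = embedded.get(exif_tag)
        in_sidecar = sidecar.get(xmp_tag)
        if in_image is None or in_sidecar is None:
            continue  # nothing to compare
        if str(in_image).strip() != str(in_sidecar).strip():
            conflicts[prop] = (in_image, in_sidecar)
    return conflicts
```

Files for which the result is non-empty are the ones where the choice between "image wins" and "sidecar wins" actually changes what ends up in the database.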

However, with this option enabled, IMatch no longer reads data that is in the image, such as GPS data embedded by the camera. Why? GPS data are important data. Isn't there an option for IMatch to read all data wherever it is and synchronize it as it should be according to the MWG guidance? LR reads everything, whether it is embedded or not, even if it doesn't do the job of reconciling all the metadata properly afterwards.

Mario

IMatch always reads the data in the image and the data in the XMP.
The "Favor" option controls which set of data overrides the other. I don't have the details available anymore, but that is how I remember it.

If your image has embedded GPS data, IMatch will import it during ingest and load it into the GPS record of the XMP file.
The XMP record in the sidecar can override (if you set favor) the data from the image - otherwise the data in the image overrides the data read from the XMP sidecar.

If you use the defaults, IMatch will write the complete XMP record afterwards, and also sync the data in the image.

The problem is that you start with out-of-sync data and you, as I understand it, want to pick individual tags from either the image or the XMP?

As always: It helps a lot when you provide an example image and sidecar so we can actually see the data and don't need to guess.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

jmsantos

Thank you for your reply, Mario.

Here is a file with embedded GPS data and Copyright and other data in XMP sidecar:
https://www.dropbox.com/s/lluatpbtna1fr90/20190510_131403_4698.zip?dl=0

This is an example from my own collection, similar to others I will find in the institutional archive that I will have to manage. Depending on the option that I activate in IMatch (default or "Favor XMP sidecar"), I can see either the GPS data or the author and copyright data. With IMatch I can't see both at the same time, but with LR I can.

In response to my other post you said:

Quote from: Mario on April 10, 2019, 10:18:43 AM
If your image collection has older images produced by older Adobe products or other software, you will need a software like IMatch to point out the problems and allow you to fix them.
Getting the metadata straight and standard-conform is usually one of the biggest challenges when migrating to a true DAM software.

Well, I need to migrate to a true DAM and that's why I'm testing IMatch, but I can't get the metadata to be straight and standard-conform. What workflow should I follow in IMatch to resolve this, or what options should I enable? In the archive I will find many problems like this.

ubacher

I think IMatch, because of its flexibility, is the right tool for you - but it will take you some time to learn
its capabilities in order to know how to go about fixing your metadata.
And keep in mind that metadata is a messy business!

I envisage something like this (for what I understand is your problem):
1. Filter out all files which contain metadata in X.
2. Bookmark these files.
3. Filter the bookmarked files to see if there is also metadata in Y.
4. Maybe make a manual selection at this point.
5. Use a metadata template to copy the wanted metadata from X to Y (or the reverse).
6. Use the exiftool processor to delete the metadata in the unwanted location.
7. Write out all changed files.

Mario

This is quite a sad example of metadata management.
The XMP sidecar file contains only a few tags:


[XMP-dc]        Title                           : 20190510_131403_4698
[XMP-dc]        Rights                          : © 2019 José Manuel Santos Madrid
[XMP-dc]        Creator                         : José Manuel Santos Madrid
[XMP-photoshop] Authors Position                : Fotógrafo
[XMP-xmpRights] Marked                          : True


This is a minimal XMP record, missing lots of tags.
And none of the tags written to the XMP record have been updated in the corresponding EXIF/IPTC/GPS records in the image itself.

When you use the default settings in IMatch, IMatch produces a new rich XMP record from the data embedded in the file. This XMP record then contains all EXIF fields, the GPS data and everything ExifTool can extract from the file and map them to XMP by applying the MWG rules. But this ignores the partial data in the sidecar file.

If you tell IMatch to favor the existing XMP sidecar file, IMatch will import metadata from the file (EXIF, GPS, legacy IPTC) but not perform any EXIF/IPTC/GPS mapping on its own if an XMP sidecar file already exists (which is the case here). This means that the XMP data imported by IMatch contains only what is in your sidecar file, which is very little.

This is how IMatch handles this situation right now.
By default it merges the XMP data it has created from the original file with the XMP data in the sidecar (if one exists).
If you set "favor sidecar" it uses the sidecar XMP data and does not create its own XMP record.
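
The two behaviours described above can be pictured with a small sketch. It is a simplification under stated assumptions, not IMatch's actual implementation: tags are modelled as plain dictionaries, `build_xmp` is a hypothetical name, and in the default mode a value derived from the image is assumed to win whenever both sources carry the same tag.

```python
def build_xmp(mapped_from_image, sidecar, favor_sidecar=False):
    """Model the two import modes described in the post.

    mapped_from_image: XMP tags created from the embedded EXIF/GPS/IPTC data.
    sidecar: XMP tags found in the sidecar file (empty dict or None if absent).
    """
    if favor_sidecar and sidecar:
        # "Favor sidecar": use the sidecar as-is, create no own XMP record.
        return dict(sidecar)
    # Default: merge both, with data from the image taking precedence
    # on conflict (assumption for this sketch).
    merged = dict(sidecar or {})
    merged.update(mapped_from_image)
    return merged


# Example values are made up for illustration.
mapped = {"XMP-exif:GPSLatitude": "40.4", "XMP-dc:Rights": "Camera default"}
sidecar = {"XMP-dc:Rights": "(c) 2019 Example", "XMP-dc:Creator": "Example"}

default_result = build_xmp(mapped, sidecar)
favor_result = build_xmp(mapped, sidecar, favor_sidecar=True)
```

In this model `default_result` keeps the GPS data but carries the copyright from the image, while `favor_result` carries the sidecar copyright but no GPS data, which matches the either/or situation reported earlier in the thread.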

What we would need to handle your use case is a feature that a) produces a rich XMP record from the existing data in the file, applying all MWG mappings to map EXIF/GPS/IPTC from the file into the XMP record, and b) if an XMP sidecar file exists, merges its contents selectively with the rich XMP data produced by IMatch. Such a feature currently does not exist.
I have some ideas in that direction for IMatch 2020, though. Maybe even earlier.
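
A selective merge of that kind could look roughly like the sketch below. It is purely illustrative and says nothing about how a future IMatch version would do it: tags are plain dictionaries, and `merge_selectively` and its `prefer_sidecar` parameter are hypothetical names.

```python
def merge_selectively(rich_xmp, sidecar, prefer_sidecar):
    """Start from the rich XMP record built from the image, then pull in
    sidecar values for tags that are missing or explicitly preferred.

    prefer_sidecar: set of tag names for which the sidecar value should
    replace the value derived from the image.
    """
    merged = dict(rich_xmp)
    for tag, value in sidecar.items():
        if tag not in merged or tag in prefer_sidecar:
            merged[tag] = value
    return merged


# Example values are made up for illustration.
rich = {"XMP-exif:GPSLatitude": "40.4", "XMP-dc:Rights": "Camera default"}
side = {"XMP-dc:Rights": "(c) 2019 Example", "XMP-dc:Creator": "Example"}

result = merge_selectively(rich, side, prefer_sidecar={"XMP-dc:Rights"})
```

Here `result` keeps the GPS data from the image, takes the copyright from the sidecar, and adds the creator that only the sidecar contains. The hard part in practice is deciding which tags belong in `prefer_sidecar` for a given archive.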

This requires a lot of testing, because there are many fringe cases. Metadata handling has become more and more complex over time, and many options have been added to handle fringe cases and user-specific workflows. Camera vendors have started to add partial XMP records to RAW files, etc. I need to give this a re-think for IMatch 2020.

jmsantos

Thanks for your answers, Mario and ubacher.

I'll be following IMatch's evolution in this matter.

Mario

Since you mentioned Lr shows all the data you want, you can just let it write out the XMP record. Then use the default settings in IMatch and you are good.

jmsantos

In my tests, making Lightroom save the metadata (Ctrl + S) only gives good results if I choose the IMatch setting "Favor XMP sidecar".

Mario

Yes. But the XMP record produced by Lr should contain all your data, combined with the EXIF, GPS and legacy IPTC data Lr has copied from the original image into XMP. I might be wrong about that assumption.