Removal of unused XMP files.

Started by ubacher, July 08, 2014, 08:25:33 PM


ubacher

If no remaining file uses an XMP file, that XMP file should be erased.

Example: I start with a NEF file. When I convert it with ACR, an XMP file is generated (it holds the conversion settings for Photoshop).
The resulting JPG file does not use the XMP file. If I now delete the NEF file, the XMP file remains but is never used again.
The same holds true if I afterwards convert the NEF to DNG and then delete the NEF file.

Maybe deleting unused XMP files would best be added to the compact and optimize routine.

Previously mentioned in Topic 2753.

Richard

If you create a file ABC.nef the sidecar will be ABC.xmp. If you next make a file ABC.jpg the XMP file belongs to both ABC.nef and ABC.jpg. Only if both ABC.nef and ABC.jpg are deleted can IMatch know that ABC.xmp is unused.
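
The rule Richard describes can be sketched in a few lines of Python. This is only an illustration of the "same base name" idea, not how IMatch itself decides; the function name `sidecar_is_unused` is made up for this example, and it assumes the sidecar and its image files sit in the same folder.

```python
from pathlib import Path


def sidecar_is_unused(xmp_path):
    """Return True if no other file in the same folder shares the sidecar's base name."""
    xmp_path = Path(xmp_path)
    stem = xmp_path.stem.lower()  # "ABC" for ABC.xmp, compared case-insensitively
    for sibling in xmp_path.parent.iterdir():
        if sibling == xmp_path or not sibling.is_file():
            continue
        if sibling.stem.lower() == stem:
            return False  # e.g. ABC.nef or ABC.jpg still claims ABC.xmp
    return True
```

With ABC.nef, ABC.jpg and ABC.xmp in one folder, the function returns False until both ABC.nef and ABC.jpg are gone.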

Erik

This issue seems to be becoming common (or at least it has been posted about a couple of times).

Richard covered why the problem occurs. As a side note, this kind of XMP conflict, where two files share a name but have different extensions, can be dangerous for a user in cases like this one.

As mentioned in the original post, you end up with an XMP file that is only tied to a JPG, but JPGs rarely have XMP files, unless a user specifically sets it that way. There are many other conditions where XMP files could conflict if you were purposely only working with sidecars (such as orientation, develop settings, etc).

This is one of the reasons that I chose to make version files have unique names from the master files. 

Instead of having:

File1.raw
File1.jpg
File1.xmp

I use:

File1.raw
File1-v1.jpg
File1.xmp

The result is that when I delete or move the Raw file, the XMP is now only tied to the master file.  If the version were to have its own sidecar file

File1-v1.xmp

Then that file is unique to the version.  The added advantage is that I can easily track and manage multiple versions, but the primary reason I had started adding a suffix to the original file name for versions was to avoid conflicting xmp records, which was an issue for me back in IM3.6. 


--Anyway, I can see possible value in the request, but I can also see it getting quite complicated in cases where both similarly named files legitimately need the XMP, and where you have to keep close track of the Metadata Settings to know when to delete an XMP file or not.  Ultimately, the issue can be remedied by a user with a bit of planning on their file naming convention.

Ferdinand

My view is that this is a good reason not to store RAW and non-RAW in the same folder.

ubacher

Quote: My view is that this is a good reason not to store RAW and non-RAW in the same folder.

That is a way around not having the problem in the first place. Just as is Erik's method.

But I am not starting from scratch!

ANY COMPETENT DOS BATCH PROGRAMMERS OUT THERE to help? I think the orphan files could be deleted with a
batch script. "For each XMP file: if there is no matching RAW file delete it"



herman

Quote from: ubacher on July 09, 2014, 11:44:38 AM
I think the orphan files could be deleted with a
batch script. "For each XMP file: if there is no matching RAW file delete it"

I am not at my main machine, so I have not tried every step, but I think the tools you need are already there.
- in edit / preferences / file formats define .XMP as a format the database has to index
- rescan the database to bring the .XMP files in
- use the script "Find files with missing versions or buddy", declare .XMP as master and your raw format as buddy in the script
- when all done disable or delete the .XMP format from the file formats

As I said, I have not fully tested this yet, but I would be surprised if it did not show the orphaned XMP files.

[edit]
If the script does not do the trick, an alternative may be:
- define a file version relation between XMP and your raw formats
- rebuild the relations
- bring all the XMP files into view
- the XMP files not having a version icon must be the orphaned files

Please let us know if any of these does what you want
[/edit]
Enjoy!

Herman.

Erik

Quote from: ubacher on July 09, 2014, 11:44:38 AM
Quote: My view is that this is a good reason not to store RAW and non-RAW in the same folder.

That is a way around not having the problem in the first place. Just as is Erik's method.

But I am not starting from scratch!

ANY COMPETENT DOS BATCH PROGRAMMERS OUT THERE to help? I think the orphan files could be deleted with a
batch script. "For each XMP file: if there is no matching RAW file delete it"

Herman's method will work for you quickly.

Renaming or moving files is something you could still do after you clean things up (to prevent future occurrences).

I do realize that what I (and Ferdinand) suggested does you little good after the fact, but I ended up in a situation like yours myself: I fixed it, then changed my file naming routine to keep it from happening again.  I could have gone a route like Ferdinand's, too, but that is just a matter of preference and needs.  Even after the fact, you might find it relatively easy to move or rename your version files with the renamer.

ubacher

I just gave herman's suggestion a try. Seems to work.
(Just on a test db. Not sure how long this will take on a full (200000 images) db.)

Thanks for now. I will report when done.

herman

Quote from: ubacher on July 09, 2014, 07:15:32 PM
Not sure how long this will take on a full (200000 images) db.)

It will take a while. All new files (even when they are "only" XMP sidecars) will go through metadata ingestion and all the other magic IMatch performs on new files.
But it is a one-time effort, just to clean up the "mess" from the past.
Once you have your buddy file relations properly set up, IMatch should take care of these events in the future.
Enjoy!

Herman.

ubacher

Quote: When you have your buddy file relations properly set-up IMatch should take care of these events in the future.

Not sure what you are referring to. I will remove the XMP file definition (see below) and the file relations definition once I am done.

I removed the orphan XMP files from one db OK using this method of defining XMP as a file type and setting up file relations.

But when I tried (once done) to remove the XMP file definition, I found that IM won't let me.
I renamed/edited it (changed the extension to zzz), but the entry still cannot be deleted.
See the error message attached.

[attachment deleted by admin]

herman

That is because the database now contains a number of XMP files.
You cannot delete a file definition as long as the database contains files of that definition.
You first have to select all XMP files and remove them from the database (not from the hard disk, just from the database!).
Only once all files of that type are removed from the database can you delete the definition.
Enjoy!

Herman.

Ferdinand

I would tackle this problem (of unwanted XMP files) with a script that does the whole thing.  I wrote one of these for 3.6 some time ago when I realised that I should switch to embedded XMP for non-RAW.  It searched for such situations and copied all the unwanted XMP files to a location outside the database. 
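
Ferdinand's 3.6 script is not shown in the thread, so the following is only a rough Python sketch of the same idea: move (rather than delete) every XMP file that has no same-named RAW file to a holding folder outside the database folders. The name `move_unwanted_sidecars`, the `RAW_EXTS` set and both folder parameters are assumptions for illustration, and `holding_dir` is assumed to lie outside `root`.

```python
import shutil
from pathlib import Path

RAW_EXTS = {".nef", ".rw2"}  # assumption: adjust to your own RAW formats


def move_unwanted_sidecars(root, holding_dir):
    """Move .xmp files with no same-named RAW file to holding_dir, keeping sub-folders."""
    root, holding_dir = Path(root), Path(holding_dir)
    moved = []
    # Collect the full listing first so that moving files does not disturb the walk.
    for xmp in sorted(root.rglob("*")):
        if not xmp.is_file() or xmp.suffix.lower() != ".xmp":
            continue
        has_raw = any(
            p.is_file()
            and p.stem.lower() == xmp.stem.lower()
            and p.suffix.lower() in RAW_EXTS
            for p in xmp.parent.iterdir()
        )
        if has_raw:
            continue  # the sidecar still belongs to a RAW file, leave it alone
        target = holding_dir / xmp.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(xmp), str(target))
        moved.append(target)
    return moved
```

Because the files are only moved, anything taken by mistake can be copied back from the holding folder.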

sinus

Quote from: Erik on July 09, 2014, 05:28:35 AM

Instead of having:

File1.raw
File1.jpg
File1.xmp

I use:

File1.raw
File1-v1.jpg
File1.xmp

Erik, that is a good solution. I use this anyway, to distinguish versions better.
Best wishes from Switzerland! :-)
Markus

Erik

Quote from: sinus on July 10, 2014, 03:31:37 PM

Erik, that is a good solution. I use this anyway, to distinguish versions better.

Right, that was a secondary motivation... I even go further and replace the v with different letters to indicate a version type, but it ultimately makes it easy to keep all versions and their metadata separate.

sinus

Quote from: Erik on July 10, 2014, 05:53:46 PM
Quote from: sinus on July 10, 2014, 03:31:37 PM

Erik, that is a good solution. I use this anyway, to distinguish versions better.

I even go further and replace the v with different letters to indicate a version type, but it ultimately makes it easy to keep all versions and their metadata separate.

me too!  :D
Best wishes from Switzerland! :-)
Markus

cytochrome

The original code is not mine (I am no cmd guru at all). I found it on a French forum some years ago; the idea there was to delete the NEF when the JPG was no longer present (why he wanted to do this ???)

A bit awkward, but it seems to work: create a command file, for example test_xmp.cmd, with this line inside:

FOR /F "delims=." %%i IN ('DIR *.XMP /b') DO IF NOT EXIST %%i.RW2 del %%i.XMP

Then launch test_xmp from the Windows CMD prompt.

I tested it on a folder with RW2 files and XMP sidecars, because my NEF files have no XMP sidecars but some RW2 files do. Replace RW2 with NEF, and of course test on a copy of the folder.
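
The batch line splits each name at the first dot and does not quote it, so it may not cope with file names that contain spaces or extra dots. As a hedged alternative (Python swapped in for batch, not part of the original code), here is a small sketch that lists the orphans first and deletes only when asked. The function names and the `dry_run` parameter are made up for this example; `raw_ext` defaults to RW2 to match the test above.

```python
from pathlib import Path


def find_orphan_sidecars(folder, raw_ext=".RW2"):
    """List .xmp files in folder that have no RAW file with the same base name."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    raw_stems = {p.stem.lower() for p in files if p.suffix.lower() == raw_ext.lower()}
    return sorted(
        p for p in files
        if p.suffix.lower() == ".xmp" and p.stem.lower() not in raw_stems
    )


def delete_orphan_sidecars(folder, raw_ext=".RW2", dry_run=True):
    """Delete orphaned sidecars; with dry_run=True (the default) only report them."""
    orphans = find_orphan_sidecars(folder, raw_ext)
    for xmp in orphans:
        if dry_run:
            print(f"would delete {xmp}")
        else:
            xmp.unlink()
    return orphans
```

Run `delete_orphan_sidecars(folder, ".NEF")` first to see what would go, then repeat with `dry_run=False` on a copy of the folder once the list looks right.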

Like many here I use version names derived but different from the original raw (D7K009876_ASP, D7K009876_VNX, D7K009876_DXO, etc.) so no problem. It is the easiest way by far

Francis

ubacher

Since IMatch now deletes unused XMP files, I think this request can be considered fulfilled.