Why are there always thousands of pending writeback files? Even after writing..

Started by CauTrindade, November 19, 2020, 02:29:51 AM


CauTrindade

I don't know exactly how the metadata write-back and the subsequent "reading metadata" process work, but:

It seems that the "Pending Meta-Data writeback" files never end. Is that OK? I wonder whether it is because my files are all on OneDrive and many of them are constantly being updated by "date sync" processes between OneDrive and my computer, and between my computer and the IMatch database. Or does it have nothing to do with that? My IMatch database itself is outside OneDrive.

Just now, for example, I started a write-back when there were 10,000 files pending. It finished after 90 minutes with all files processed and started "reading Metadata". After a long time, that finished with the error window at the bottom of the screen pointing to the log file. Right after that, there were another 7,000 files to write back.

I've attached my log file.

Mario, I know you are probably super busy, so no rush here.

Thanks!

Mario

See Metadata Write-back for details about how IMatch writes back metadata and all features involved.

Of course the write-back finishes at some point: when IMatch has written all pending files after you triggered the write-back.
If the pen icon comes back after writing back a file, the typical reason is metadata that is out-of-sync (created by other applications, and usually keywords). Often IMatch can fix things automatically with a second write-back. If not, you need to check the metadata in the file and figure out what the problem is.

Point the mouse cursor at the pen icon in the File Window. This shows you which metadata tags need to be written. Keywords?

Run the Metadata Analyst on one of these files to check the metadata for problems.
Use the "Copy Results" button to copy the results into the clipboard and paste them into your reply.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

Stevef48

I too have this problem. Sorry, Mario, I know that we have discussed this before.
IMatch told me that 20k files were pending writeback, even though I have not opened the program since the last metadata writeback.
The Pending Writeback collection shows 156 files. Where are the rest? The number has now reduced to 155, the % processed went from 9% to 14% and the time remaining to around 2 hours.
I looked to see what metadata needed to be written for a couple of files. As you suspected it was keywords and subject. The photos were taken nearly 13 years ago and I have not changed the keywords recently.
How do I make iMatch show me the whole Pending Writeback collection?

Thanks in advance,
Stay Safe
Steve

Mario

Where did IMatch tell you that there are 20K files to write back?
The diagnosis reports 21,826 files to write-back, so this seems to be OK.

I see that IMatch tries many times to update the pending write-back collection, but fails almost every time because IMatch is super-busy importing files.
Since importing a file will invalidate all collections (and most categories) IMatch may postpone the updates until the import is finished - especially if it runs into database transaction timeouts while trying to update a collection.
This may cause the number of files displayed in the collection to be different than the actual write-back file count. Temporarily. It will update as soon as the background activity in your database ceases and leaves some CPU cycles for other things.

At the end of your log file, IMatch was still busy ingesting files and metadata from D:\Photos\2006\2006-05-May\.
It requires almost 5 seconds to import metadata for 5 files. This is a bit on the slow end...

What kind of computer is this?
Your database does not seem to perform all too well. So many locks while ingesting files, slowing everything down...
Is your C: disk a normal disk or a SSD?
Did you set the folder (!) containing your database as an exception in your virus checker? A virus checker constantly scanning the database will ruin performance.

Maybe try to reduce system load by reducing the number of parallel threads IMatch uses to ingest files - see Process Control

Stevef48

Quote from: Mario
Where did IMatch tell you that there are 20K files to write back?

When I selected Commands | Metadata Writeback | All Pending Files.

Quote from: Mario
The diagnosis reports 21,826 files to write-back, so this seems to be OK.
I see that IMatch tries many times to update the pending write-back collection, but fails almost every time because IMatch is super-busy importing files.

I don't understand why. The last import of 2 new images seemed to finish correctly.

Quote from: Mario
Since importing a file will invalidate all collections (and most categories) IMatch may postpone the updates until the import is finished - especially if it runs into database transaction timeouts while trying to update a collection.
This may cause the number of files displayed in the collection to be different than the actual write-back file count. Temporarily. It will update as soon as the background activity in your database ceases and leaves some CPU cycles for other things.
At the end of your log file, IMatch was still busy ingesting files and metadata from D:\Photos\2006\2006-05-May\.

??? Again why is it trying to import from this folder? All images earlier than 2021 should already have been imported when I rescanned the D:\Photos folder last month.

Quote from: Mario
It requires almost 5 seconds to import metadata for 5 files. This is a bit on the slow end...

We discussed this last year. It seems as though one of my cameras, or a piece of software that I no longer use, had introduced some errors into files.

Quote from: Mario
What kind of computer is this?

Dell G3 15. Intel i7 8th gen. 16 GB RAM. 512 GB SSD and 2 TB SSD.

Quote from: Mario
Your database does not seem to perform all too well. So many locks while ingesting files, slowing everything down...

Could this be caused by McAfee Internet Security?

Quote from: Mario
Is your C: disk a normal disk or a SSD?

See above. Database and files are on different SSDs.

Quote from: Mario
Did you set the folder (!) containing your database as an exception in your virus checker? A virus checker constantly scanning the database will ruin performance.

I have now excluded the database and other files in that folder.

Quote from: Mario
Maybe try to reduce system load by reducing the number of parallel threads IMatch uses to ingest files - see Process Control

Stevef48

Writeback finished, but Collections\Pending Metadata Writeback still shows 6727 pending

Mario

Please do not comment in red. Thank you. The ADMIN uses this color for specific purposes.

Quote
Again why is it trying to import from this folder? All images earlier than 2021 should already have been imported when I rescanned the D:\Photos folder last month

IMatch rescans a folder when the "last modified on disk" timestamp of the folder has changed. Or when Windows sends a "folder modified" message.
IMatch then scans the folder and compares the last modified on disk timestamp of each file with the corresponding data record in the database. If the last modified on disk of a file differs, IMatch rescans the file.
IMatch also finds new files during this step.
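
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not IMatch's actual code: `find_files_to_rescan` and the `recorded_mtimes` dictionary (standing in for the database records) are hypothetical names invented for this example.

```python
from pathlib import Path


def find_files_to_rescan(folder, recorded_mtimes):
    """Return (modified, new) file names found in folder.

    recorded_mtimes is a hypothetical stand-in for the database: it maps a
    file name to the last-modified timestamp (in nanoseconds) that was
    stored the last time the file was indexed.
    """
    modified, new = [], []
    for entry in sorted(Path(folder).iterdir(), key=lambda p: p.name):
        if not entry.is_file():
            continue
        on_disk = entry.stat().st_mtime_ns  # "last modified on disk"
        recorded = recorded_mtimes.get(entry.name)
        if recorded is None:
            new.append(entry.name)       # no record yet: a new file
        elif recorded != on_disk:
            modified.append(entry.name)  # timestamps differ: rescan needed
    return modified, new
```

Under this model, any process that changes a file's modified timestamp makes the file show up in the `modified` list on the next scan, even if its content is unchanged.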

I cannot know why the modified on disk timestamp of the files changes on your system.
Maybe some background service or other application is touching the files, changing their timestamp?
This timestamp only changes when a file is modified by a process on disk. Windows then updates the timestamp in the file system when the file is closed.
You can compare what Windows Explorer reports in the file properties with the {File.Modified} variable in VarToy, or show the corresponding attribute in a File Window layout. IMatch updates this file attribute after indexing a file, from the timestamp reported by Windows.

What is puzzling is that I see no AddorUpdateFile entries in the log. This means that IMatch did not scan any folder. It was just processing entries still in the background processing queue.
How many pending entries does IMatch show in the Info & Activity panel?
Maybe it did not finish what it was doing the last time you closed it, and now it still has a number of files to process?


Mario

Quote from: Stevef48 on January 25, 2021, 04:36:37 PM
Writeback finished, but Collections\Pending Metadata Writeback still shows 6727 pending

IMatch may pause write-back temporarily to do other things. What does the Info & Activity panel report? Are there still files pending for ingest?

jch2103

@stevef48: It's quite possible your old files have a mix of XMP and IPTC data, which causes synchronization problems, including multiple write-backs. Try running the Metadata Analyst on one or more of these old files to shed more light on the situation.

Also make sure you've excluded IMatch from your antivirus program.
John

Mario

Out of sync legacy IPTC/XMP data may require a second write-back to synch everything, that is correct.
(Legacy IPTC data should be removed from files which also have XMP data, IMHO. Legacy IPTC has so many limitations and was declared legacy 20 years ago for good reasons).

But it does not cause IMatch to rescan files not modified for a long time - this must be caused by something else.

mastodon

IPTC data may cause this problem, especially if you use accented characters such as á, é, ő, č, š etc.

Mario

Yeah. IPTC and character sets...

There is a way to tell which character set to use, but most applications did not bother.
Which is why non-ASCII IPTC data may show up differently (cryptic!) on different computers or code pages / language settings.
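
A small Python sketch shows the effect in general terms: the same bytes, read with a different character set than the one used to write them, come out as different characters. The helper name `read_with_wrong_charset` is made up for this illustration, and the example works on plain strings rather than on real IPTC records.

```python
def read_with_wrong_charset(text, written_as, read_as):
    """Encode text with one character set, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# Central European text stored as Windows-1250, read on a Windows-1252 system
print(read_with_wrong_charset("őč", "cp1250", "cp1252"))  # prints õè

# UTF-8 bytes misread as Windows-1252
print(read_with_wrong_charset("é", "utf-8", "cp1252"))    # prints Ã©
```

Nothing in the bytes themselves says which reading is the right one, which is why the declared character set matters.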

And it gets worse when you have to map between XMP and legacy IPTC. Which is why I recommend getting rid of the legacy IPTC in your files once you have XMP embedded.
I did, many years ago.

XMP solved all that almost 20 years ago.
And also solved the many, many problems and shortcomings of EXIF metadata.

Modern DSLR cameras are basically smart phones with much better optics.
Yet even these cameras still use the 30 year old EXIF metadata, intermingled with proprietary maker notes and stuff to record technical data very useful for the camera owner.
And they often also write an incomplete, non-standard XMP record as well. Usually only containing a hard-coded "rating=none" element.

It would all be much easier if they would just agree to abandon legacy IPTC and EXIF and just use XMP. XMP has been around for almost 20 years now, they should have heard about it by now...

CauTrindade

Mario, you mentioned that other apps might be modifying the timestamp. I believe it's OneDrive. I don't have clear evidence, but the only app that constantly accesses the files and could change the timestamp is OneDrive.

Mario

You can see the "last modified" timestamp in the File Properties tab in Windows Explorer.
This timestamp never changes unless an application changes the file or changes the timestamp for some reason.
There are three timestamps per file: created, modified and accessed. IMatch only cares for the modified timestamp. Same for folders.
This timestamp is recorded into the database when IMatch ingests/reloads a file. And this is how IMatch knows if a file has been modified - the timestamps differ.
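
To check these timestamps outside of Windows Explorer, a short Python sketch along the following lines could be used. It relies only on the standard `os.stat` call; the function name `file_timestamps` is invented for this example. Note the assumption about the "created" value, which depends on the platform and Python version.

```python
import os
from datetime import datetime, timezone


def file_timestamps(path):
    """Return the created, modified and accessed timestamps of a file (UTC)."""
    st = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    # Assumption: st_birthtime is only available on some platforms and
    # Python versions. The fallback st_ctime is the creation time on
    # Windows, but the time of the last metadata change on Linux.
    created = getattr(st, "st_birthtime", st.st_ctime)
    return {
        "created": to_utc(created),
        "modified": to_utc(st.st_mtime),  # the value that matters here
        "accessed": to_utc(st.st_atime),
    }
```

Calling the function for the same file before and after a sync run, and comparing the `modified` values, shows whether something touched the file in between.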

I use OneDrive but I don't see unusual timestamps. I have files created back in 2015 and they still show the correct "created" timestamp.
Maybe you upload into the cloud and OneDrive downloads files on access, thus creating the file anew after downloading it from the cloud, or something like that. I don't know too much about OneDrive syncing.

Carlo Didier

From experience, I know that if a file is copied in Windows, it gets a new "created" timestamp (a copy is a new file), but retains the "last modified" timestamp. This leads to the unintuitive situation where the "created" date of a file is more recent than the "modified" date.
So, if something (like OneDrive) uses the standard Windows functions to copy files, the "modified" date should not change.

Stevef48

I don't use OneDrive to store most of my photos.
When I searched for files changed on 24th January 2012 in my Photos folder, 23,279 files had their date modified updated between 16:04 and 16:48.
iMatch was 'writing metadata' at that time as shown by the debug log attached to my earlier post.
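
A search like this can also be scripted, which makes it easier to repeat after each write-back and to line the results up with the log. The following is a minimal sketch using only the Python standard library; the function name `files_modified_between` is made up for this example, and the time window is interpreted in the computer's local time.

```python
import os
from datetime import datetime


def files_modified_between(root, start, end):
    """Return paths under root whose last-modified time lies in [start, end].

    start and end are naive datetime objects in local time.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if start <= modified <= end:
                hits.append(path)
    return sorted(hits)
```

Passing the photo folder and the start and end of the suspected time window returns the affected files, and `len()` of the result gives the count.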

Mario

IMatch writes metadata only when you explicitly trigger it. Or if you have enabled the automatic write-back under Edit > Preferences > Background Processing.

CauTrindade

Hi Mario,

I'm still struggling with pending write-backs! No matter what I do, the same files come back again and again and again. If I click on the write-back icon of a particular file, it shows that it is writing, and the file comes back a minute later.

I've copied them to an external drive, deleted them all, run an image repair on the copies on the external disk, copied them back to the original location, rescanned them with forced updates, and written back. They all seemed to go away. A few minutes later they slowly started coming back, and after a few more minutes all of them were back.


I've run the Metadata Analyst and attached the report. It seems that most issues (not all) are with XMC::DC/Subject (I couldn't even find this particular tag in Tag Manager/Search) and IPTC::ApplicationRecords/Keywords.


Mario

Please use the GREEN COPY RESULTS button at the very top of the MD Analyst to copy the results.
Dumping all the output and asking me to wade through 60,000 characters of text will require a lot of time.
Seeing just the errors and problems reported via the GREEN BUTTON will speed things up considerably.

If a file still remains pending after a write-back has been repeated once, the typical cause is out-of-sync keywords (legacy IPTC / flat keywords in XMP).
That's usually caused by some other software not synchronizing keywords properly (which software did you use?), and/or by a combination of your keyword flatten settings under Edit > Preferences > Metadata (show us a screenshot if they are not the standard IMatch settings) and your thesaurus. Show us.
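
To illustrate what "out-of-sync keywords" means, here is a minimal Python sketch that compares two keyword lists. It does not read any files and is not how IMatch works internally: the function name `keyword_mismatch` is hypothetical, and the two lists are assumed to have been obtained elsewhere, for example from the legacy IPTC keywords and the flat XMP keywords shown for a file.

```python
def keyword_mismatch(iptc_keywords, xmp_keywords):
    """Report keywords that appear in only one of the two records."""
    iptc, xmp = set(iptc_keywords), set(xmp_keywords)
    return {
        "only_in_iptc": sorted(iptc - xmp),
        "only_in_xmp": sorted(xmp - iptc),
    }


# Hypothetical example values: one keyword was added to XMP only
print(keyword_mismatch(["Beach", "Family"], ["Beach", "Family", "Summer"]))
# prints {'only_in_iptc': [], 'only_in_xmp': ['Summer']}
```

If both lists are empty in the result, the two records agree; any entry in either list is a candidate reason for a file to stay pending.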