Metadata Templates Leading To Out Of Memory

Started by Darius1968, June 02, 2016, 11:22:38 AM

Darius1968

I've confirmed that I've successfully set up a metadata template that lets me copy the contents of {File.Created} into the Tag, Microsoft.MXP\DateAcquired.  I've tested and it does indeed work!  (I've tested it, using up to 10 files at a time.) 
I'm running into memory issues trying to use this template on all my jpegs, which amounts to about 290,000 files.  This is on a system with 8GB RAM.  How many files am I able to run at a time? 

Mario

I don't think that running a Metadata Template on 300,000 files in a single batch is a good idea.
300,000 files can surely be considered an enterprise-level number of assets. Many stock photo agencies have far fewer images in their archives.

This is such an exceptional task that I don't test it, and I won't.

I don't know whether you have background write-back on or not, but writing 300,000 files and re-loading the metadata will cause a lot of stress.
If you have many data-driven categories etc. this will also add to the general load.

Since you did not include a log file, I cannot even say how much memory IMatch used, whether the Windows memory pool was totally fragmented, how much memory was available for IMatch at that time, etc.

If you really need to write a proprietary Microsoft XMP tag into all files, split the operation into 30 batches of 10,000 files or so.
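
As a rough illustration of the batching advice above, here is a minimal Python sketch that splits a file list into chunks of 10,000. It only shows the chunking arithmetic; it is not an IMatch feature or API, and the `batches` helper and the generated file names are hypothetical. In IMatch itself, each batch would be selected and processed by hand.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Made-up file names standing in for the roughly 290,000 JPEGs from the first post.
files = [f"img_{i:06d}.jpg" for i in range(290_000)]

batch_list = list(batches(files, 10_000))
# Number of batches, size of the first batch, size of the last batch.
print(len(batch_list), len(batch_list[0]), len(batch_list[-1]))
```

With 290,000 files this gives 29 batches of 10,000; a round 300,000 would give the 30 batches mentioned above.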
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

artem

Unfortunately, I have struggled with this weird behavior for several days without any luck. I changed my whole category-based process to hierarchical categories because of the handy import/export capabilities. But now I just needed to sort all my files by hierarchical categories and got the same "Out of memory" problem. It corrupts the database immediately, and I can't even open it afterwards. Only restoring from a backup helps. I turned off all background processing, but without luck. IMatch crashes so badly that it can't even create a dump file. All I see is the "IMatch has encountered a problem..." window and, after a few seconds, the standard Windows "Windows is trying to recover the program" dialog, and then nothing happens.

So I can't attach a dump file, because IMatch crashes completely. I am very disappointed by this behavior. I worked with categories for many years but switched to hierarchical categories because they are much more flexible. I changed my workflow, and now I see that my whole process is broken because IMatch crashes. Unfortunately, I never saw a warning anywhere like "use categories if you have more than 10000 files in your collection".

My main system is Windows 10 with 8 GB RAM, and I am trying to open 88000 files. I tested in VMware with Windows 7 installed, with the same result. After many happy years of using IMatch I am stumped and don't know what to do.

jch2103

I'd recommend that you post a log file (debug logging mode), which may help identify the problem.
John

artem

John,

Thanks for the recommendation. I have found the log file: http://pastebin.com/raw/8RfKR1tK ("Недостаточно памяти" in the last lines means "Out of memory").

After switching to debug logging mode and reproducing the crash, I got this file: http://pastebin.com/raw/VSWED1rA

In the Windows Task Manager I see that IMatch consumes 3413.9 MB. I didn't start Photoshop, which was mentioned as a possible cause here: https://www.photools.com/community/index.php?topic=4190.0

sinus

I had this problem quite often.
And Photoshop and/or other programs were always involved (open).

Nowadays I get this message (out of memory) about once every 3 months. And Photoshop is always involved.
Usually, working with Photoshop and IMatch together is not a problem here.

But sometimes I get this message. Then I close Photoshop and must kill IMatch with the Task Manager (otherwise it cannot be closed).
After reopening IMatch, all is OK again.
I have never lost any data.

Hence it is no longer a problem for me. And yes, nowadays I am working with 340'000 images, quite a lot, and I very seldom have a problem. And IMatch is still quick enough for me.
Best wishes from Switzerland! :-)
Markus

artem

Markus,

I haven't run into this bug with any IMatch functionality except keyword categories. But when I try to do anything with those metadata-based categories, IMatch crashes. I tried after a Windows restart (no Photoshop etc.), but that didn't help.

artem

Quote from: sinus on December 08, 2016, 09:18:19 AM
I close then Photoshop

As you can see, IMatch consumes all the memory available to 32-bit apps, so closing other programs won't help in my case. I don't know what the problem is; I am trying to sort just 88000 images.

Mario

Your database has 200,000 files and a whopping 30,000 (!) categories.
This means you are maxing out IMatch. Why do you need 30,000 categories?
If you can reduce the number of categories, you will not see any out of memory problems again.
-- Mario

artem

My database includes "author - item" relations. I have 10000 authors in English, another 10000 are translations of the names, plus some additional categories. I can't simply delete some authors from the database and keep others, so I can't reduce the number of categories. MS Access or any other DB system handles this situation simply and quickly, so I couldn't even imagine IMatch having problems with it. What do you recommend I do? I can't work with a database that gets damaged so easily; I will have to change something in how I organize my images.

artem

One more point about the number of categories. There are a lot of image recognition services now, so I planned to auto-recognize some files and import the tags as hierarchical categories. Say the average number of tags per image is 5. So for just 10000 images I would get 50000 categories. For 100000 images it is 500 000 categories (ok, without duplicates maybe 100 000), don't you plan to support such a workflow in IMatch?

Mario

I have databases with 300,000 files and 27,000 categories. Their memory use peaks at only about 1.8 GB.
This must be something else with your database.
Can you upload it somewhere so I can check it out?
-- Mario

Mario

Quote:
For 100000 images it is 500 000 categories (ok, without duplicates maybe 100 000), don't you plan to support such a workflow in IMatch?

If you need software that handles 500,000 categories and hundreds of thousands of files, you may need to look beyond the $100 software range. Maybe ask Canto, FotoWare, Widen, Extensis, AssetBank and similar companies for a server farm / software solution that can handle your requirements.
-- Mario

artem

Meanwhile I tested many other products, but none of them comes close to IMatch, so I decided to stay with IMatch. The most obvious solution was to move the hierarchical categories to some other flat metadata tag, so I did that. Now I have a completely new database with 88700 images and 0 hierarchical categories. I then created a new sort preset and applied it to all files. The first run consumed 1.8 GB of memory, the second run 2.2 GB. So I won't even be able to sort my collection by a single metadata tag if I have about 150000 images or more.

Here is screenshot: https://i.imgur.com/WXGD0qr.png

And here is IMatch log file: http://pastebin.com/raw/JnAFaXgf

I don't understand what I am doing wrong and why I get so high memory consumption.

artem

Will I get any speed improvement if I import the metadata into attributes and sort/filter by attributes? Or will it be the same database sorting and filtering?

Mario

Your sort preset is named "Musuem". Which metadata tag do you use for sorting? What does it contain?

I tried to max this out.
I used a database with 360,000 files. This is my largest test database.
It needs about 1.25 GB of RAM to load.

I created a sort profile for hierarchical Keywords and another one for the LTID (which is always filled).

When applying this sort profile to a file window containing 200,000 (!) files, IMatch used 3.5 GB of RAM. It works. But this is pretty much stressing things out.
Loading the data required to display 200,000 files, the thumbnails, the data for the sort profile, all the category, rating and label data etc. requires a massive amount of memory.

I used the Default layout (with all icons, rating & label) and the other sample layouts.

That said, it's pretty unusual to display 200,000 files in a file window and to sort them by very long metadata items.
I'm sure I could improve the memory consumption for such extreme cases, but as I said, this is so rare...

-- Mario

artem

I used XMP Photoshop\Instructions metatag. Just selected @All category and applied new sort profile (by photoshop\instructions metatag).

It contains information like this: "Categories|Museums|Los Angeles County Museum of Art", "Categories|Museums|National Gallery of Art", etc.

Mario

Thank you for providing the sample database. It was paramount in figuring out what the problem is.

Your database has 80,000 files.
After loading all files into a file window (using @All), IMatch consumes only about 500 MB RAM. Pretty good.

I now created a sort profile using the XMP::photoshop:description tag and used it to sort the file window. After a short while, the file window was sorted.
But IMatch consumed over 2 GB (!) of memory. Yikes.

Several things come together in this case:

1. You sort 80,000 files.
2. You sort by an arbitrary metadata tag (you copied long hierarchical keywords into the instruction tag).
3. Your database contains a lot of metadata for each file, including long headlines, descriptions, Job-Ids etc.
4. IMatch is trying to be clever and caches metadata in memory. This avoids loading it from disk all the time.
5. IMatch is a bit too clever in this case and caches the entire metadata record of each file.
  While this is good for the Metadata Panel, the File Window and Keyword Panel, it's not so smart when only one of the tags is needed for sorting.

With a small change, a bit of up-front intelligence before the sort starts, IMatch now uses barely more than 20 MB for the sort, instead of 2 GB.
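
To illustrate the difference described above, here is a minimal Python sketch, not IMatch's actual code. It contrasts caching each file's complete metadata record with loading only the one tag needed as the sort key. The `DATABASE` dictionary, the tag names and the record contents are made up for the example, and the count of cached characters is only a crude stand-in for memory use.

```python
from typing import Dict, List, Tuple


def make_record(i: int) -> Dict[str, str]:
    """Build a made-up metadata record with one short tag and several long ones."""
    return {
        "instructions": f"Categories|Museums|Museum {i % 50:02d}",
        "headline": "A long headline " * 20,
        "description": "A long description " * 100,
        "job_id": f"JOB-{i}",
    }


# Hypothetical stand-in for the database: file id -> full metadata record.
DATABASE: Dict[int, Dict[str, str]] = {i: make_record(i) for i in range(2_000)}


def sort_caching_full_records(ids: List[int], tag: str) -> Tuple[List[int], int]:
    """Cache every tag of every file, then sort by a single tag."""
    cache = {i: dict(DATABASE[i]) for i in ids}
    order = sorted(ids, key=lambda i: cache[i][tag])
    cached_chars = sum(len(v) for rec in cache.values() for v in rec.values())
    return order, cached_chars


def sort_loading_only_key(ids: List[int], tag: str) -> Tuple[List[int], int]:
    """Load only the tag used as the sort key, then sort."""
    keys = {i: DATABASE[i][tag] for i in ids}
    order = sorted(ids, key=keys.__getitem__)
    cached_chars = sum(len(v) for v in keys.values())
    return order, cached_chars


all_ids = list(DATABASE)
order_full, chars_full = sort_caching_full_records(all_ids, "instructions")
order_key, chars_key = sort_loading_only_key(all_ids, "instructions")

print(order_full == order_key)           # both approaches yield the same order
print(chars_full, chars_key)             # characters held in memory by each approach
```

Both functions return the same order; only the amount of data held during the sort differs, and the gap grows with the length of the tags that are not needed for sorting.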

This is one of the cases where I did not anticipate a certain usage pattern or a potential combination of factors. Impossible to think about everything up-front, sorry.

I will include this update in the 5.7.8 release of IMatch. But that will be next year.


-- Mario

artem

That's fantastic, thank you a lot!


sinus

Quote from: artem on December 15, 2016, 04:33:56 PM
That's fantastic, thank you a lot!

Yes. And what I especially appreciate in such cases: Mario explains things clearly and in enough depth that we (I) can understand why something happened.
Great.
From other software companies you hear

- nothing
- or "Problem could not be found"
- or "Problem solved"
...

That's it. Only a few programmers do what Mario does. Great.
Best wishes from Switzerland! :-)
Markus