Set file filter during import.

Started by dkyndt, August 02, 2022, 12:04:56 PM

dkyndt

Hi,

I had an IMatch crash during the import of folders.
I have some folders that contain pictures, PDFs, Word documents, etc., which I cannot separate from each other.
https://www.photools.com/community/index.php?topic=11398.msg81418#msg81418
The only way to solve this crash was to rename the folder, remove it from the import queue, and remove it from the database.

Would it be possible to add a file format filter that is applied before folders are imported?
This way IMatch could select the files to import instead of importing everything.
This would also make the database less bulky.

Thank you.

Mario

#1
You can control which file formats to import using the File Formats settings.

NOTE: IMatch is designed to manage all files in the folders you index in IMatch.
Keeping managed and unmanaged files in the same folders can and will lead to problems.
For example, you may delete a folder in IMatch without seeing all the files it contains. IMatch will warn you in this case, but it is better to avoid the situation.
There are also potential issues with buddy file management and versioning etc.
I don't recommend this kind of workflow. Move the files you don't want to manage in your DAM into folders not indexed by your DAM.

If you have files which crash IMatch (unlikely), ExifTool, the WIC codec you have installed, LibRaw, or one of the other 3rd party libraries IMatch uses to ingest files, these files are most likely badly damaged, corrupted, or totally non-standard.
PDF files are processed by a very robust 3rd party component. The only occasions where a PDF file crashed IMatch were when IMatch was using the Adobe Acrobat shell thumbnail handler in Windows instead, which is very "fragile".
If the PDF is non-standard and rejected, IMatch may fall back to the Windows shell handler, which may crash and cause IMatch to crash as a result.
Same for Office documents.

You did not include the IMatch log file (see log file) from a crashed session. This leaves only guesswork, unfortunately.
The log file will include the names of the last ingested files near the end, and also whether (and which) error messages or warnings were encountered while processing the files.
This would tell us which files crash IMatch or a 3rd party component and which files you'll have to repair (usually by re-saving or re-packing in case of video files).
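
To illustrate what to look for near the end of the log, here is a minimal Python sketch that scans the last lines of a log file for file paths and suspicious entries. It is not an IMatch tool. The path heuristic, the text encoding, and the "W>" / "E>" level markers for warnings and errors are assumptions (the log excerpts in this thread only show "I>" and "M>" markers), so adjust them to what your log actually contains.

```python
import re
import sys
from collections import deque

# Heuristic: a Windows path ending in a file extension, e.g. C:\dir\file name.pdf.
# Paths followed by "(" are skipped because they look like source-code
# locations such as ...\IMCacheManager.cpp(1017).
PATH_RE = re.compile(r'[A-Za-z]:\\[^\r\n:*?"<>|]+?\.[A-Za-z0-9]{2,5}\b(?!\()')
# Assumption: warnings and errors carry a "W>" or "E>" level marker.
LEVEL_RE = re.compile(r"\s[WE]>\s")


def summarize_log_tail(lines, tail=200):
    """Return (file paths, suspect lines) found in the last `tail` lines."""
    files, problems = [], []
    for line in deque(lines, maxlen=tail):
        match = PATH_RE.search(line)
        if match and match.group(0) not in files:
            files.append(match.group(0))
        if LEVEL_RE.search(line):
            problems.append(line.rstrip())
    return files, problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python logtail.py <path to the log file of the crashed session>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        files, problems = summarize_log_tail(fh)
    print("Last files seen:", *files[-5:], sep="\n  ")
    print("Suspect lines:", *problems, sep="\n  ")
```

The last path printed is usually the file that was being processed when the log stopped.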
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

dkyndt

Hi Mario,

The debug log from the session where IMatch crashed cannot be found anymore, as it has been overwritten by a newer one.

I do understand your opinion about keeping these file formats separate; however, I cannot do this, as it is a shared folder used by others.

I have done it with the file format preferences; however, disabling all unwanted file types this way is a cumbersome process.
I still think it would be a good feature to be able to choose the files you want to import.
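
Before disabling formats one by one, it can help to see which extensions a shared folder actually contains. The following is a minimal Python sketch for that inventory. It is not an IMatch feature, and the folder path in it is a hypothetical placeholder.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files per lower-cased extension below `root` (recursively)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path; replace it with the shared folder you index.
    for ext, n in count_extensions(r"E:\shared\folder").most_common():
        print(f"{ext:10} {n}")
```

The resulting list shows which formats occur at all, so only those need to be checked in the File Formats settings.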

Mario

As I said, mixing managed and unmanaged assets in the same folder is a bad idea, IMHO.
Consider that even in shared folders, you can usually split managed and unmanaged folders/assets quite easily.

dkyndt

#4
Hi Mario,

I am seeing the same issue here again.
This time I have attached the log file.

If you want, I can also send you the PDF file that breaks IMatch.
The file can be opened by any normal PDF viewer.

When looking in the Windows Event Viewer, I see the following:

Faulting application name: IMatch2021x64.exe, version: 21.18.0.4, time stamp: 0x63550f55
Faulting module name: PolyImagePro64.dll, version: 1.0.5.0, time stamp: 0x59511fe2
Exception code: 0xc0000409
Fault offset: 0x0000000000216b38
Faulting process id: 0x104c
Faulting application start time: 0x01d985d59e07db09
Faulting application path: C:\Program Files\photools.com\imatch6\IMatch2021x64.exe
Faulting module path: C:\Program Files\photools.com\imatch6\PolyImagePro64.dll
Report Id: e071a069-0321-4bf6-9ddc-853520589080
Faulting package full name:
Faulting package-relative application ID:



Fault bucket , type 0
Event Name: BEX64
Response: Not available
Cab Id: 0

Problem signature:
P1: IMatch2021x64.exe
P2: 21.18.0.4
P3: 63550f55
P4: PolyImagePro64.dll
P5: 1.0.5.0
P6: 59511fe2
P7: 0000000000216b38
P8: c0000409
P9: 0000000000000002
P10:

Attached files:

These files may be available here:
\\?\C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_IMatch2021x64.ex_932367fc9e50d5e669a235f8ad95f430ec8899c_02fc5fba_697a8da9-463f-4381-b9be-d30238ac5ff4

Analysis symbol:
Rechecking for solution: 0
Report Id: e071a069-0321-4bf6-9ddc-853520589080
Report Status: 4
Hashed bucket:
Cab Guid: 0


Fault bucket 0, type 5
Event Name: BEX64
Response: Not available
Cab Id: 0

Problem signature:
P1: IMatch2021x64.exe
P2: 21.18.0.4
P3: 63550f55
P4: PolyImagePro64.dll
P5: 1.0.5.0
P6: 59511fe2
P7: 0000000000216b38
P8: c0000409
P9: 0000000000000002
P10:

Attached files:
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WERFC96.tmp.WERInternalMetadata.xml

These files may be available here:
\\?\C:\ProgramData\Microsoft\Windows\WER\ReportArchive\AppCrash_IMatch2021x64.ex_932367fc9e50d5e669a235f8ad95f430ec8899c_02fc5fba_697a8da9-463f-4381-b9be-d30238ac5ff4

Analysis symbol:
Rechecking for solution: 0
Report Id: e071a069-0321-4bf6-9ddc-853520589080
Report Status: 268435456
Hashed bucket: 2fcf3cd46e9e1c8631ed9e656a73e533
Cab Guid: 0
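
For readers who want to decode such an event, here is a minimal Python sketch that pulls the key fields out of the pasted text. The exception code 0xc0000409 is the Windows NTSTATUS value STATUS_STACK_BUFFER_OVERRUN, which Windows also uses for "fail fast" terminations; that is general Windows knowledge, not a diagnosis of this crash. The lookup table is deliberately tiny, and treating the "time stamp" values as build times (seconds since 1970, UTC) is an assumption that usually, but not always, holds.

```python
import re
from datetime import datetime, timezone

# A few well-known NTSTATUS values; extend as needed.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def parse_crash_event(text):
    """Extract application, module and exception code from Event Viewer text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in ("Faulting application name", "Faulting module name"):
            prefix = "app" if "application" in key else "module"
            name, _, rest = value.partition(",")
            info[prefix] = name.strip()
            stamp = re.search(r"time stamp:\s*0x([0-9a-fA-F]+)", rest)
            if stamp:
                # Assumption: the stamp is the link time as a Unix timestamp.
                info[prefix + "_built"] = datetime.fromtimestamp(
                    int(stamp.group(1), 16), tz=timezone.utc)
        elif key == "Exception code":
            code = int(value, 16)
            info["code"] = code
            info["code_name"] = NTSTATUS_NAMES.get(code, "unknown")
    return info
```

Calling `parse_crash_event()` on the event text returns a dictionary with the faulting application, the faulting module, and the decoded exception code.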





Mario

Please attach the PDF that causes this. I can then analyze it and maybe see where the problem is coming from.
PolyImagePro is not involved in PDF processing. IMatch uses a 3rd party external PDF processor to extract previews from PDF files, and if that fails, it relies on whatever PDF thumbnail handler is available in Windows.
I have seen the Adobe Acrobat PDF handler consuming 4 GB of RAM and then crashing on a PDF file produced by Adobe Photoshop. That's one of my test cases.

Please re-save the PDF with a suitable modern PDF generator to remove any potential problems in the PDF file. This usually solves these kinds of obscure problems.

When IMatch crashes, it produces a DUMP file as described here: The Debug Dump File
Please upload the dump file to your cloud space and send me a link.

If you don't see this message, the crash happens "outside" of IMatch, in a Windows component or 3rd party tool. This would make it much harder to analyze the cause for this problem.




Mario

Thanks for providing the sample PDF files.
I've added them to a new and one of my old test databases without any problem.
No crash, no error message. Preview and cache image created, metadata extracted fine.

When you load them into the original database (or force a reload by selecting them in a File Window and pressing Shift+Ctrl+F5 > Force Update), does IMatch produce a DUMP file? Windows may take a minute or two, depending on how much memory is in use.

The log file you have attached stops when IMatch is creating a cache image for the "20130006 weighing note.pdf" file.
This involves loading the PDF with an external executable, rendering the first page and storing it as a JPG image in the IMatch cache folder.

In the session recorded in the log, IMatch did that several times successfully before the log file ends with this file.
No warnings or errors are logged. The log just ends (which means, IMatch has been terminated or crashed). When IMatch crashes, it produces a DUMP file. When IMatch is terminated by Windows for some reason, it cannot produce a DUMP file.

Typical reasons for termination are:
a) an external component used by IMatch has badly crashed (WIC codec, driver, ...)
b) a virus checker kicked in and terminated IMatch because it considered it to be a malware or
c) the PC gave in under stress and caused random problems (notebooks sometimes do this).

If you could add the same PDF files to a new database, maybe this is some kind of "stress" issue?
How many PDF files did you add to the new database? Only the problematic ones or all on the E:\.... drive?
What kind of computer are you using? A desktop PC, workstation or notebook?

dkyndt

IMatch does not create a dump file; it just closes without a warning, popup, or error.

I added the files to a new database and loaded only these files. This was successful.
I will try to add the whole folder with the PDFs that cause the crash. I'll keep you informed about the outcome.

I'm running a notebook.
Here are the specs.

Memory: 32GB
CPU: Intel Xeon E-2286M CPU@ 2.4GHz(Cores 8 ) ( Logical processors 16)
Operating System: Microsoft Windows 10 business

The notebook is currently running at around 95°C.
I will also see if cooling it down will help.


Mario

You might also try to reduce the load/stress for the notebook by reducing the number of parallel threads used for ingesting files in IMatch. Some systems produce random errors while being run for prolonged time under heavy load - which is what IMatch generates while processing files.

See Process Control (Advanced Setting) for details.
For example, set the number of threads for file import to 4 or 6.
This will reduce the load (and heat) considerably when IMatch is indexing files. It then processes only 4 or 6 files in parallel.

dkyndt

I'll do that; however, it is strange that IMatch always closes on this particular file.
My original database file is currently around 10 GB. Could that have something to do with this?

sinus

Are we talking about the two PDFs named 20130006 weighing note.pdf and 20130006 weighing note_reprinted.pdf?
I downloaded the two PDFs and imported them into my DB (31 GB) without problems.
Best wishes from Switzerland! :-)
Markus

Mario

Quote from: dkyndt on May 14, 2023, 02:02:31 PM
I'll do that, however, it is strange that Imatch is always closing on this particular file.
My original database file is currently around 10GB. can it be something to do with this?

This is interesting but does not rule out that the virus checker kicks in. Which virus checker do you use?


No DUMP file, so "external" cause for the crash.
Processing PDF files uses an external executable, which may trigger a virus checker...
The same file always crashes one database, but loads fine in another database.
All very strange and inconclusive.

What happens if you copy the PDF file from your E: drive to a folder on your C: drive and add it from there to your database? Same crash?

dkyndt


When I move the file to another location, Imatch also stops on the same file.
05.15 12:47:17+    0 [248CC] 10  I>                  EUQH::Load(1) of C:\Users\dkyndt\OneDrive - C-power\Drawings\Pictures\20130006 weighing note.pdf with 2484 x 3509 (O: 2484 x 3509) in 1579ms
05.15 12:47:17+  109 [248CC] 10  M>                    > 19 CIMCacheManager::CreateCacheImage  'V:\develop\IMatch5\src\IMEngine\IMCacheManager.cpp(1017)

The Windows events are identical.

When I try to import the reprinted version, Imatch also closes.
05.15 12:56:54+    0 [B670] 10  I>                  EUQH::Load(1) of C:\Users\dkyndt\OneDrive - C-power\Drawings\Pictures\20130006 weighing note_reprinted.pdf with 2479 x 3508 (O: 2479 x 3508) in 1454ms
05.15 12:56:54+  109 [B670] 10  M>                    > 19 CIMCacheManager::CreateCacheImage  'V:\develop\IMatch5\src\IMEngine\IMCacheManager.cpp(1017)'

Is there anything I can do for you to find more about this?

I cannot find a lot of information about this PolyImagePro64.dll online.

Mario

#14
That's just one of the 3rd party components IMatch uses. In your case, IMatch uses PIP (PolyImagePro) to load the PNG preview extracted from the PDF. IMatch uses a 3rd party executable to extract a preview from the PDF and lets it store the preview in PNG format in the TEMP folder on your system.

Since neither I nor sinus have any problems with these PDF files on our systems, this is very likely something that happens only on your PC. You did not answer my question about the virus checker you use. I think this is the most likely culprit. I tried on 2 PCs and a fresh Windows installation in a VM. No issues processing your PDF file or the other ~ 200 I have in my sample/test library.

Did you check the protocol of your virus checker for related messages?
Did you check the Windows Event log for messages related to IMatch or your virus checker?
Did you configure IMatch as an exception (see IMPORTANT: Virus Checkers), just to see if this makes any difference?
Since there is no DUMP and the log just stops because IMatch is terminated, the Windows event log, virus checker protocol etc. are the only sources for information.


If IMatch is not creating a DUMP file, it is terminated forcefully from the "outside". And this typically means a virus checker or Windows itself.

Maybe open the PDF in your PDF reader and then print it to a new PDF. If there is something in this particular PDF that somehow upsets your system, re-creating it may solve this.

dkyndt

Quote from: Mario on May 15, 2023, 02:53:31 PM
You did not answer my question about the virus checker you use. I think this is the most likely culprit. I tried on 2 PCs and a fresh Windows installation in a VM. No issues processing your PDF file or the other ~ 200 I have in my sample/test library.

Did you check the protocol of your virus checker for related messages?
The Windows Event log for messages related to IMatch or your virus checker?
Did you configure IMatch as an exception (See IMPORTANT: Virus Checkers) just to see if this makes any difference.
Since there is no DUMP and the log just stops because IMatch is terminated, the Windows event log, virus checker protocol etc. are the only sources for information.

I'm sorry, I forgot to mention the virus checker.

The only virus/malware checker currently installed is Windows Defender.

Quote
Maybe open the PDF in your PDF reader and then print it to a new PDF. If there is something in this particular PDF that somehow upsets your system, re-creating it may solve this.
This is what the reprinted version is.
I opened it via a pdf viewer and printed it via the doPDF printer.

I'll dig further into this matter.

Mario

Windows Defender usually does not cause any issues.
To rule it out, configure IMatch (or, better, the IMatch folder in Program Files\photools.com\IMatch*) as an exclusion temporarily to see if this changes anything.
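
A temporary exclusion can also be set from a script. The sketch below builds the PowerShell call using the Defender cmdlets `Add-MpPreference` and `Remove-MpPreference` with their `-ExclusionPath` parameter. It only prints the command unless `dry_run=False` is passed, and actually running it requires an elevated (administrator) session. The folder in the example is taken from the event log earlier in this thread and is an assumption for any other installation.

```python
import subprocess


def defender_exclusion_command(folder, remove=False):
    """Build the PowerShell call that adds or removes a Defender path exclusion."""
    cmdlet = "Remove-MpPreference" if remove else "Add-MpPreference"
    # Inside a single-quoted PowerShell string, a quote is escaped by doubling it.
    quoted = "'" + folder.replace("'", "''") + "'"
    return ["powershell", "-NoProfile", "-Command",
            f"{cmdlet} -ExclusionPath {quoted}"]


def run(command, dry_run=True):
    """Print the command (default) or execute it on Windows."""
    if dry_run:
        print(" ".join(command))
        return None
    # Needs administrator rights; raises CalledProcessError on failure.
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    # Assumed install folder; check the path on your own system.
    run(defender_exclusion_command(r"C:\Program Files\photools.com\imatch6"))
```

Calling `defender_exclusion_command(folder, remove=True)` builds the matching command to remove the exclusion again once the test is done.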

dkyndt

Quote from: Mario on May 15, 2023, 04:15:04 PM
Windows Defender usually does not cause any issues.
To rule it out, configure IMatch (or, better, the IMatch folder in Program Files\photools.com\IMatch*) as an exclusion temporarily to see if this changes anything.
Hi Mario, 

I have excluded the IMatch folder; however, this changes nothing. The software still stops when this file is encountered.
I will try to open the database on another computer and see what happens.

I'll keep you informed.

dkyndt

Hi Mario, 

I did some deeper searching with Process Monitor from Sysinternals.
I have attached the log file. I only kept the last 10 seconds, because otherwise it would be 7 million lines of log.

OneDrive

I will also check the file tonight in more detail.

Mario

So, what did Process Monitor find out?
Were there any errors logged? They appear in red.
Please understand that I cannot analyze process monitor log files for you. I don't have the time.

mopperle

#20
@dkyndt next time please put the file into a zip file, easier to download.

I took the liberty of loading it and set the filters
"Process Name IS IMatch..."
"Result IS NOT Success"
and got 293 events, but no result named "ERROR". There is no "ERROR" anywhere in the complete log either.
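
The same two filters can be applied to a Process Monitor log that was saved as CSV. This is a minimal Python sketch that assumes the export has "Process Name" and "Result" columns and that successful operations are reported as "SUCCESS". Check the header row of your own export before relying on it.

```python
import csv
from collections import Counter


def non_success_events(rows, process_prefix="IMatch"):
    """Yield rows of the given process whose Result is not SUCCESS."""
    for row in rows:
        if (row.get("Process Name", "").startswith(process_prefix)
                and row.get("Result", "") != "SUCCESS"):
            yield row


def summarize(csv_file, process_prefix="IMatch"):
    """Count the non-success results per result name."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Result"]
                   for row in non_success_events(reader, process_prefix))


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        # "utf-8-sig" tolerates a byte order mark at the start of the file.
        with open(sys.argv[1], newline="", encoding="utf-8-sig") as fh:
            for result, n in summarize(fh).most_common():
                print(f"{n:8} {result}")
```

Many non-success results are ordinary lookups that found nothing, so a count alone does not indicate a fault.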

Mario

500,000 rows of log entries.
No errors or "denied" entries that I can find.
If you hide registry-related entries, there is only the typical noise produced by IMatch.

Sinus had no problems loading the PDF files.
I had no problems loading the PDF files on 2 PCs and a Windows 11 tablet.
If mopperle can also index the two PDF files without a problem, it's one of those "one computer only" issues and digging into that could take an unlimited amount of time. Looking at my two week "My metadata isn't working right" backlog, this is as far as I can go, sorry.

mopperle

As only one pdf was available on the OneDrive folder, I downloaded it and saved it with Acrobat Pro under another name.
I could import both files without any problem, but when I opened the original file in the Viewer, I got "Loading..." and then IMatch was terminated. The error message in the Windows Event Viewer was the same as the one the OP got.
On a 2nd attempt I could open it in the Viewer, but the file was not displayed completely (see screenshot). I imported various other PDFs and had no problems in the Viewer. So IMHO something seems to be wrong with this specific PDF. The problem also did not appear when loading the original file saved with Acrobat Pro under another name.

Mario

#24
Quote
i got "Loading..." and then Imatch was terminated.
Did IMatch produce a DUMP file?
Did Windows report a faulting app / DLL?
The image in the Viewer looks like IMatch was terminated in the middle of creating the cache image. The cache image can be loaded, but it is incomplete. You can re-create it by forcing an update of the file via Shift+Ctrl+F5.

I can view both files (from the original OneDrive content) without problems.
Maybe you can give the OP the converted (and working) PDF so we can close this?


Also the current 20130006 weighing note.pdf file.

The only thing I could think of was that maybe the newer version of the external PDF toolkit used by IMatch 2023 handles the file better. But I could switch back to the old version and re-index and view all PDF files without a problem.

Quote
So IMHO something seems to be wrong with this specific pdf. And the problem did also not appear when loading the original file saved With Acrobat Pro under another name
This is why I asked the OP to save the file with Acrobat Reader or print it to a new file.

mopperle

Quote
Did IMatch produce a DUMP file?
No

Quote
Did Windows report a faulting app / DLL?
Same as in post #4

I have attached the file saved with Acrobat Pro.

Mario

Thanks. But that's misleading. The PIP DLL is loaded in the IMatch address space, and when it crashes, IMatch's crash handler will be called. I guess that the external PDF helper executable crashes when it extracts a page from the PDF, and Windows then terminates IMatch "from the outside".

thrinn

I also downloaded the one PDF available (20130006 weighing note.pdf) and imported it into one of my test databases. No problems at all. File is displayed correctly in Quick View as well as Viewer.
Thorsten
Win 10 / 64, IMatch 2018, IMA