Another way to convert the IM3-database to IM5?!

Started by sinus, October 12, 2014, 05:06:02 PM

sinus

Hi Everybody

Joerg (joel23) gave me this idea (thanks, Joerg!).
I have a large db in IM3, about 200'000 images. I converted this db successfully once, but it took about 22 hours.

Later I wanted to convert my db again (for various reasons). I tried twice more, but each time IMatch reported a problem after 15 hours or so.

So I thought: why not go another way, maybe faster, or at least with less waiting on the converter, because I can do most of the conversion myself, step by step.

Here is the "idea":

1) In IM3, I push every important property field into the IPTC custom fields (with a script). I tested this: no problem to do and no problem to import into IM5, thanks to Mario (he added these custom fields recently).
I will no longer use any of the property fields in IM5 (I will use attributes there, but with new fields and new entries).

2) Now all images in IM3 have all their information inside the image (I have no sidecars in IM3, everything is in the file). What is missing are the categories.
For this I clean up the categories first; for example, I can delete all the IPTC keywords there, because IM5 will import the keywords inside the file and create categories for them.

3) In IM3, I export the category schema INCLUDING the file links.

4) I switch to IM5 and import all folders and files, exactly the same as in IM3. I can do this step by step; I do not have to do it all at once if I do not want to.

5) When the import of all files is done, I should have exactly the same number of files as in IM3. Now I can import the category file from IM3 into IM5. Here I can choose how files are matched: "map the file by a checksum" or "filename and path".
Because the path is the same as in IM3, I can choose "filename and path" - though I do not know whether the checksum would be better (faster)?
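
To make the difference between the two matching modes concrete, here is a minimal Python sketch. It is not how IMatch implements the import (that is not described in this thread); the function names, the use of SHA-256 and the idea of an exported list of relative paths are all assumptions for illustration. It only shows the general trade-off: matching by path is a cheap lookup but needs the same folder layout, while matching by checksum has to read every file but still finds files that were moved or renamed.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the full file contents; every byte has to be read from disk."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def map_by_path(exported_paths, new_root: Path) -> dict:
    """Match exported entries by relative path (assumes an unchanged folder layout)."""
    return {rel: new_root / rel for rel in exported_paths if (new_root / rel).is_file()}


def map_by_checksum(exported_checksums: dict, new_root: Path) -> dict:
    """Match exported entries by content, so moved or renamed files are still found."""
    on_disk = {file_checksum(p): p for p in new_root.rglob("*") if p.is_file()}
    return {rel: on_disk[sha] for rel, sha in exported_checksums.items() if sha in on_disk}
```

In a sketch like this the checksum route is the slower one, because each file is read completely. A real application may already have checksums stored in its database, in which case the speed difference could be much smaller.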

When this is complete, I should end up with my new "converted" database from IM3.
The advantages (from my point of view): first, it would hopefully be quicker (not 20 hours).
Second, I can do most of it step by step (the importing).
And I have the feeling that IM5 has imported and created all files fresh. The quality is possibly the same, but my feeling is better ;)

I have tried this, but only with 3000 images, not with all 200'000 images, and it worked fine.

What do you think? Is there an error in my thinking, did I forget something?

(BTW: once all the information from IM3 is imported into IM5, I can easily move or delete it in IM5, for example move the entries in the custom fields to other fields or to attributes and so on.)
Best wishes from Switzerland! :-)
Markus

Richard

Quote: Is there an error in my thinking, did I forget something?
I hope not, because the idea of working in portions appeals to me. I can unload a ton of bricks if I do it brick by brick, but doing it all at once is impossible. Say you work with 3,000 files at a time, but a file in a batch causes problems. You can then select half of the 3,000 and try again. If it works, the problem is in the other 1,500 and you can half-split that, eventually finding all files that created problems. With a 200,000 file conversion, it would be hell to find out why it failed.

Mario

You can always

1. Add all your files to an IMatch 5 database.
2. Export categories in IMatch 3 and import them into IMatch 5
3. Export properties in IMatch 3 and import them into IMatch 5

If all the metadata is written (in the file and nothing pending in the IMatch 3 database), IMatch 5 will pick up all your metadata.

See also http://www.photools.com/3152/migrating-xmp-rating-label-imatch-3-imatch-5-processing-pending-updates/

What happens after 15 hours?
15 hours is a lot of time, even for 200,000 files...
Does IMatch crash?
Stall?
Did you file a bug report for this?
Thousands of databases have been converted since the beginning, and only a few real problems have been reported...

-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

sinus

Quote from: Richard on October 12, 2014, 05:54:17 PM
Quote: Is there an error in my thinking, did I forget something?
I hope not, because the idea of working in portions appeals to me. I can unload a ton of bricks if I do it brick by brick, but doing it all at once is impossible. Say you work with 3,000 files at a time, but a file in a batch causes problems. You can then select half of the 3,000 and try again. If it works, the problem is in the other 1,500 and you can half-split that, eventually finding all files that created problems. With a 200,000 file conversion, it would be hell to find out why it failed.

You are right, Richard!
Mario also wrote here, when I reported the failed conversion, that some images could be the source of the trouble.
But if these images do not show up in the conversion log, then it really is hell to find them. And even if IMatch did find these images, after 1 hour or 10 hours or so ... I would have to start all over again, and that takes time.

If I can import the files folder by folder (for example), I have much more control over the process, you are completely right. The only thing I am not sure about (but I think it works) is whether, after importing all 200'000 images, the import of the categories including the file mapping will work.

sinus

Quote from: Mario on October 12, 2014, 06:53:57 PM

What happens after 15 hours?
15 hours is a lot of time, even for 200,000 files...
Does IMatch crash?
Stall?
Did you file a bug report for this?
Thousands of databases have been converted since the beginning, and only a few real problems have been reported...

Thanks, Mario, I will try that (import into IM5 and then export/import the cats with links).
I guess it does not really matter whether I choose "map with checksum" or "map with filename and path".

Because my files are in the same place, I will choose mapping with filename and path.

About the 15 hours: no, I did not report this.
To be honest, first I thought you have enough to do without looking at my log, and second I thought I would first delete some unwanted images, clean up my categories and check the metadata.

I have done this now and I am optimistic that it will work this time.
Maybe I will wait for the next build  ;D then I will do the work!  :D

Ferdinand

It is hard to find one or two files in your database that prevent conversion, or even importing into a new database.  I recall various threads about it during the beta test and I had a couple as well. 

To find them you have to import them into a new database in sections (like one year at a time) until you find the section with the problem file(s), then import that section in smaller sections and so on.  I haven't found any other way.  I couldn't tell from the logs.

When you do find the files then send them to Mario so that he can see what causes the problem and perhaps prevent it in the future, or get Phil Harvey to prevent ExifTool gagging on those files.

sinus

Thanks, Ferdinand, for your advice, I will do so.



Richard

Quote: To find them you have to import them into a new database in sections (like one year at a time) until you find the section with the problem file(s), then import that section in smaller sections and so on.

When troubleshooting electrical circuits I use what is called "half splitting". It works well for finding damaged image files too. If a batch* of files will not import, try again with half the files. If that fails, try again with the other half of the files. Unless both halves contain a damaged file, one half should import. Now take half of what remains and try again. Eventually you will get down to two files. One will import and one will not import.

* what you select as a "batch" will depend on how your files are grouped. Maybe all files in one folder.
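
The half-splitting procedure can be written down as a short recursive search. The Python sketch below is only an illustration: `imports_ok` is a hypothetical check that you would have to supply yourself (for example, "add this batch to a test database and see whether it goes through"), and the sketch assumes that a failure is caused by individual files, not by a combination of files.

```python
from typing import Callable, List, Sequence


def find_bad_files(
    files: Sequence[str],
    imports_ok: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the files that make a batch fail, found by repeated halving.

    imports_ok is a hypothetical check: True when the given batch imports cleanly.
    """
    if not files or imports_ok(files):
        return []  # whole batch is clean, nothing to search here
    if len(files) == 1:
        return [files[0]]  # narrowed down to a single problem file
    middle = len(files) // 2
    # The batch failed, so search both halves; clean halves return immediately.
    return find_bad_files(files[:middle], imports_ok) + find_bad_files(files[middle:], imports_ok)
```

Unlike stopping at the first bad file, this version checks both halves, so it also copes with the case where both halves contain a damaged file. The number of test imports grows roughly with log2 of the batch size for each bad file, which is why halving beats testing files one by one.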

sinus

Quote from: Richard on October 13, 2014, 03:40:06 PM
Quote: To find them you have to import them into a new database in sections (like one year at a time) until you find the section with the problem file(s), then import that section in smaller sections and so on.

When troubleshooting electrical circuits I use what is called "half splitting". It works well for finding damaged image files too. If a batch* of files will not import, try again with half the files. If that fails, try again with the other half of the files. Unless both halves contain a damaged file, one half should import. Now take half of what remains and try again. Eventually you will get down to two files. One will import and one will not import.

* what you select as a "batch" will depend on how your files are grouped. Maybe all files in one folder.

Thanks, Richard, good advice. I will remember it when I do "the work to import".

sinus

Yep, I could not wait for the new build  ;D I am now importing the files directly from the folders.
I go folder by folder, so I have very good control over the process.

I can see whether the umlauts are correct and whether the keywords/hierarchical keywords/@categories are correct, or rather, as I want them.
I can also check the behaviour of my individual window thumbnail layouts (which is way better than in IM3), and I can see the quality of the thumbs and so on.

Of course we could have all this with the converter too, but this way I have even more control and ...  8) it is more fun!

BTW, it is very good that I can add a subfolder and later the root folder, and IM5 makes sure that all folders end up in the correct order - very impressive.

I hope all will go fine, and if so, I will then import the categories with file-path mapping.
I do not need to convert the properties, because I pushed all relevant fields into the metadata of the files.


sinus

Well, it is done!  :D

Phew, I have imported all images into IMatch 5, simply by adding the (same) folders as in IM3.
Beforehand, I had pushed all relevant properties into the custom fields of IPTC.

This import took a while of course, but the good thing (compared to the converter) was that I had everything in "my hands", because I could add folder after folder or a bunch of folders.

After the import I checked against IM3: the numbers were almost the same, apart from a few differences. These differences all came from formats unknown to IM5.
I added these (gp3 and, I do not know the name, the format of Adobe Premiere Pro CS6). After this the number in each folder was exactly the same, phew ... a glass of wine  :)
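
A check like this can also be partly scripted outside IMatch. The following Python sketch is only an illustration under assumptions: the per-folder counts from the two databases are assumed to be available as plain dictionaries (for example typed in by hand), and `known` is a hypothetical set of extensions the new database indexes. It shows which folders differ and which file extensions on disk could explain the difference.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder: Path) -> Counter:
    """Count the files directly inside one folder, grouped by lower-case extension."""
    return Counter(p.suffix.lower() for p in folder.iterdir() if p.is_file())


def unknown_extensions(folder: Path, known: set) -> Counter:
    """Counts for extensions not in `known` (hypothetical list of indexed formats)."""
    counts = count_by_extension(folder)
    return Counter({ext: n for ext, n in counts.items() if ext not in known})


def folders_that_differ(old_counts: dict, new_counts: dict) -> dict:
    """Compare per-folder file counts; keep only folders where the numbers differ."""
    folders = set(old_counts) | set(new_counts)
    return {
        f: (old_counts.get(f, 0), new_counts.get(f, 0))
        for f in sorted(folders)
        if old_counts.get(f, 0) != new_counts.get(f, 0)
    }
```

For a folder reported by `folders_that_differ`, calling `unknown_extensions` on it lists the file types that are on disk but not in the known set, which are the likely candidates for a missing file format.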

Then I removed some categories in IM3 that are no longer important.
After this I exported all categories including the file mapping.

Then I imported these categories into IM5, with file mapping by "file and path", and it seems to have worked fine.

I then ran the diagnosis and the compacting in IM5, and both show "no error".
The only files I have excluded for now are my music files, because I can always import them later.

To be honest, now that I am actually working with IM5, I have only these fears:
- that IM5 has too many white screens and crashes (I had some in the past)
- that IM5 loses collections (I have lost quite a lot of collections in the past)
- that IM5 is too slow (but scrolling is very fast!)
- that I cannot do everything I could in IM3
- I have decided to go the XMP sidecar route for my raw files (NEF). I hope it works fine.

But I am optimistic; if I were not, I would not have converted.
And say I have trouble with some images ... well, I can always go back to IM3, import them and work there.

What is different from IM3 at first glance?
- a much better overview of the images (see attachment)
- much more and clearer information about all metadata
- I can do and see a LOT more than in IM3
- I can see much more quickly when files were taken, thanks to the timeline
- with the collections I have MUCH more possibilities
- with the apps as well, of course
- the attributes are (mostly) more convenient than the old properties and offer more possibilities

What could be better for me (roughly, in my opinion)?
- especially the table of the attributes: for example, I would like to have them from top to bottom, not only horizontally (but because Mario has enough to do, I do not want to add a feature request; after all, they work well), plus some other things with attributes
- my open feature request for proxies ( https://www.photools.com/community/index.php?topic=1614.msg19094#msg19094 ), but I can and will cope with this too

... more I will see during working!  ;D

Thanks, Mario, for a very good piece of software! (thumbs up)

Oops, almost forgot, here are the results of the Eurovision Song Contest, from Zurich, Switzerland:  ;)
-------------

After importing the cats from IM3:
2014-10-16
IMatch database diagnosis logfile created: 16.10.2014 12:24:06

Application Info:
    Version:                      5.2.0.6
    Filename:                     C:\Program Files (x86)\photools.com\IMatch5\IMatch5.exe

General Database Info:
    Database file name:           C:\sinus_db_IM5\SINUS-IM5.imd5
    Database file size on disk:   9.34 GB
    Number of folders:            583
    Number of files:              168'934
    Number of categories:         7'131

Checking files:
    Performing sort array maintenance:
Completed.
Completed.

Checking file history:
      Entries:  366'424
Completed.

Checking time line:
      Entries:  5'530
Completed.

Checking Metabase:
Completed.

Checking Cache:
    88'494 files in cache folder.
Completed.

Checking Annotation Objects:
      Containers:  0
Completed.

Checking Visual Query Data:
      Data items:  660'264
Completed.

Checking Favorites:
Completed.
    Clearing oid cache.

Analyzing database:
Optimizing Database, rebuilding optimial index structures and query plans:
Completed.

WriteTest:
Completed.

Results:
    Errors:    0
    Warnings:  0

IMatch database diagnosis logfile closed: 16.10.2014 13:49:09 (01:25.03)

[attachment deleted by admin]

Mario

In case you did not already, run a Compact & Optimize to speed up your new database.

sinus

Quote from: Mario on October 16, 2014, 03:05:37 PM
In case you did not already, run a Compact & Optimize to speed up your new database.

Thanks, Mario, yes, I did, for sure: first the diagnosis, then the second tool, Compact & Optimize.
It takes some time, but especially in the beginning I will do it 3-4 times a week, and every time before I do it I make a backup.

At the moment IM5 runs very well! It is fun to work with!


Mario

Very good, have fun.

When you experience bad performance or ghosting (white screen), keep the log file and remember what you did.
Ghosting happens when IMatch has to do too many things at once, e.g. too many data-driven categories, ExifTool running in the background reading and writing files while the user causes actions in the user interface which also demand a lot of database resources, ... etc.

Finding such hot spots and solving them is part of the ongoing maintenance, so I need feedback from users on that.

Version 5.2.8 includes changes which already lessen the stress in many areas, especially for users with many file relation rules. The filter panel is more responsive when you work with one or more value filters etc.
