The big challenge ... moving my pictures into IM5 ..

Started by Gerd, August 05, 2013, 12:51:49 PM


Gerd

Dear Mario,

at the moment I'm busy moving all my 155,000 pictures into IM5 ... (of course I have a complete backup ...)

The pictures are located (not optimal, but movable!) on an external 1 TB hard disk, connected via USB 2.0. The location is the same as for IM 3.6; I am only trying to use IM5 in parallel now (either IM 3.6 or IM5.x).

I have been busy with this since yesterday 13:00; the whole story will follow later.

What I want to know now: the system is very busy, with all 4 CPU threads reading in the metadata. IM5 currently tells me 11% read in, 18h left.

What happens if I close IM5? Are the 11% already read in lost? For these 11%, IM5 has been busy today from 9:00 until now (12:42).

The tool ProLasso shows me 4 processes used by ExifTool, so I think it is working optimally ...

Overall the system is very busy; mouse movements happen in steps. All thumbnails are shown very unsharp and contain these blue double update arrows. If I activate a category, a big white window is shown for a relatively long time (ca. 10-20 sec.) before IM5 comes back to life!

For the moment I will wait until the reading of the metadata is finished before I can say more.

Regards
Gerd

Short calculation in between:

For 10% of 144,000 pictures = 14,400 pictures, 3h were used. That is 1.33 pictures per sec. ... is that a 'normal speed' if there are 4 ExifTool processes busy in parallel?
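
For anyone who wants to redo this calculation with their own numbers, here is a minimal sketch. The function names are made up for illustration, and the remaining-time figure assumes the average rate stays constant, which it probably will not during a long ingest.

```python
def pictures_per_second(total_pictures: int, fraction_done: float, hours_elapsed: float) -> float:
    """Average ingest rate over the elapsed time."""
    processed = total_pictures * fraction_done
    return processed / (hours_elapsed * 3600)


def hours_remaining(total_pictures: int, fraction_done: float, rate_per_second: float) -> float:
    """Time left if the average rate stays constant (an assumption)."""
    left = total_pictures * (1.0 - fraction_done)
    return left / rate_per_second / 3600


# Numbers from the post: 10% of 144,000 pictures in 3 hours.
rate = pictures_per_second(144_000, 0.10, 3.0)
print(f"{rate:.2f} pictures/s")
print(f"{hours_remaining(144_000, 0.10, rate):.0f} h remaining at that rate")
```

At a constant rate, the remaining 90% would need nine times as long as the first 10%, so the estimate is very sensitive to how the first hours went.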
_______
Regards
Gerd

dcb

IM5 works like IM3.6 in that the 11% already read in should stay there.

However, this is BETA software so anything goes. Working both databases on your live data is risky and not recommended unless you have good backups. Can you copy the 155,000 images to another 1TB drive and TEST using that?
Have you backed up your photos today?

Ferdinand

Isn't there some way that you can put them on an external USB3 or eSATA disk?  USB2 is really going to be slow!

Gerd

Dear DCB,

yes, I have all my pictures backed up on an extra 2 TB hard disk.

Do you think there is a risk with IM5 that it can destroy my pictures so that they are no longer readable?

Maybe it is better to use the backup, because that data gets overwritten/added to from my original source anyway?

@Ferdinand: no way to use a connection other than USB 2.0 (at the moment).

Regards
Gerd

Mario

Quote: Do you think there is a risk with IM5 that it can destroy my pictures so that they are no longer readable?

Yes. I go on and on about that in the Beta Tester Guide. Like "Never use your original images" or "Your IMatch 5 database may be required to be deleted during the Beta". And all that. Beta software is not for production use and not for original images. It is just for testing purposes.

You can close IMatch at any time. It will finish whatever it is doing and then continue right there when you start it the next time. Adding 150,000 images into a software not yet tuned for performance is maybe not a good idea. A good stress test, but slooow.

If your system is so busy that Windows cannot keep the mouse running, you should limit the threads used for background processing under Edit > Preferences > Application > Process Control. Try 2 and 2, for example.

When IMatch 5 is ingesting this number of files, you should not try to do other work with it. In IMatch 3 I disabled the user interface while IMatch was ingesting files. For IMatch 5 the aim was to keep the UI responsive. That works while IMatch is just ingesting a few dozen files or a day's worth of new photos, but not when it is processing 150,000 files at full gallop. This will leave very little CPU and especially disk resources for the user interface, which is why IMatch feels sluggish. There are limits to what can be done. More CPU is good, but when ingesting files the database and especially the hard disk will be the limiting factors. And when the background update locks the database because it has to move megabytes of data into it every few seconds, the user interface will have to wait until it can read data. This may even cause IMatch to become unresponsive for 10 seconds or more.

Quote: All thumbnails are shown very unsharp and contain these blue double update arrows.

That's normal. While IMatch is ingesting files it will try to show you very low-res previews (extracted from the 160 pixel thumbnail in the image file if available) when you browse folders. This is a comfort feature for you. These rough thumbnails will be replaced by the real thing as soon as IMatch has processed the file. You can tell by the "updating..." icon (double arrows) you see on each file still in the processing queue.

Tip: Run Database > Tools > Compact once in a while (IMatch will prompt you to do so via a popup in the status bar). This will speed up the database considerably, especially during the slow phase where IMatch is ingesting metadata via ExifTool and producing the XMP data records by merging in IPTC, EXIF, GPS and whatever data you have configured for the merge. This is a very disk-intensive process and the database will have to store tens of millions of records.


-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

Gerd

Dear Mario,

thanks for the explanation ....

The backup I have is not a backup in the usual sense, where you normally get special backup files on an extra hard disk.
My backup is a physical copy of all my files. I use software that looks for physical differences between storage a and storage b, and I have set the options so that it only copies from a to b. I have had too many experiences where backup software could not restore. This way I can also use Explorer to copy items back. I use the same procedure for the directories with the IM database files.
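
As a rough illustration of such a one-way copy (not the software actually used here, which is not named), the idea can be sketched in a few lines. The function name `mirror_one_way` is hypothetical; the sketch compares file contents and only ever writes to the target, never to the source, and it does not delete files that have disappeared from the source.

```python
import filecmp
import shutil
from pathlib import Path


def mirror_one_way(source: Path, target: Path) -> list[Path]:
    """Copy new or changed files from source to target; never touch source.

    Returns the list of target paths that were written.
    """
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = target / src_file.relative_to(source)
        # shallow=False compares file contents, not just size and timestamp
        if dst_file.exists() and filecmp.cmp(src_file, dst_file, shallow=False):
            continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        copied.append(dst_file)
    return copied
```

Because the result is a plain folder tree, files can be copied back with any file manager, which is the point made above about not depending on a restore function.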

For my pictures I now organize it so that I 'back up' my new pictures from my notebook first to storage a and then to storage b, instead of notebook -> storage a -> storage b.

My way of working is that I first have the pictures on my notebook: 1 photo year, approximately 15,000 pictures. There I only process the pictures with various software. Once they are ready, I save them (depending on importance) as JPG or TIFF. From time to time I copy them to storage a, where they are then processed (categorized) by IM5 (or IM 3.6).

... and what is the difference between a beta and a final release? Look, I now use Lightroom 4.4 and, in parallel, 5.0; both are final releases, but in LR 5.0 there are a lot of faults, so there is now a new release candidate, but also with faults ...

How will you guarantee that your IM5 final release (when it is ready) does not destroy pictures if there are 200,000 in the database, or 250,000, or 300,000? Or does it happen only if there are 333,333 pictures in the database? The most important part is reading and writing data .... the GUI with all the windows is only there so that we users can do it more easily for more pictures. Do you think you may change the basic routines for reading and writing in this phase? I think it is only a matter of what to read and what to write.

My experience with your software over all the years is that it is very trustworthy by the time you make a release public.

These are only my 2 cents ... sorry for the long story ...

Regards
Gerd

Richard

Hi Gerd,

Many years ago NASA paid millions to have Launch Control software written. Then they paid hundreds of thousands to have that software checked for bugs because lives were at stake. Bugs were found and eliminated. Yet the final version still had bugs. My point is that it is nearly impossible to "guarantee" any complex software.

Our lives do not depend on IMatch. Only our image files. To ensure that nothing damages our image files, we make backups and never allow any software to modify those backups. Only copies are used when files will be modified. By doing so the worst that can happen is that a messed up copy will have to be replaced with a fresh copy of our original file.

Mario works hard to ensure that, when any version of IMatch is released to the public, IMatch is stable. This is why IMatch 5 will spend months in Beta testing. Even after all of that, nobody can guarantee that all bugs have been found and fixed.


Gerd

Hi Richard,

that's what I think too. My life does not depend on software ...  ;)

What do you think will happen with all our software, docs and pics and whatever else is stored somewhere in and around our computers, if we die someday?  ... delete all .... !  ;D

Regards
Gerd

Mario

Quote: ... and what is the difference between a beta and a final release? Look, I now use Lightroom 4.4 and, in parallel, 5.0; both are final releases, but in LR 5.0 there are a lot of faults, so there is now a new release candidate, but also with faults ...

I know that Adobe had to ship emergency updates because of data corruption only a few days after their Beta Test ended. My impression is that Adobe does not run Beta tests to do real hard testing on their products with many users. They use it as a teaser, for the press, their "fanboys" and to make users shell out the money for the update. A typical IMatch user has paid once for IMatch over the past 5 years, and three to five times for LR. Go figure.

Quote: ... and what is the difference between a beta and a final release?

For me a Beta is an as yet unfinished product which requires some additional work and lots of testing. Which is expected to have bugs. Crashes. Which is not fit to be used on production systems or for your real files.

For me, the release version is the version which is considered by the majority of the Beta users as fit for purpose and production. Which may have some bugs (no software is ever free of bugs) but which does not risk your database or images or nerves.

Quote: How will you guarantee that your IM5 final release (when it is ready) does not destroy pictures,

Can't. This is why all users have to sign off on the license agreement, which states "No guarantee. Use at your own risk". You have to sign something similar with any software. IMatch uses code written by others. IMatch uses code built into Windows. Windows uses code written by hardware vendors. All that is used when IMatch processes one of your images. All that code can have faults: Windows, the file system software, the device driver which writes to the disk, the firmware code in the disk which writes the bits etc... It is impossible to guarantee that all that works.

Since IMatch uses ExifTool to update your files, and ExifTool is used by millions of users and big sites like Flickr, the risk of damaging a file is near nil, but it exists.

IMatch 3 has a good name for being solid and reliable. I want the same for IMatch 5. But IMatch 5 is a really big update and I run this Beta as long as needed to hammer out as many bugs and glitches as possible.

I don't need to answer to shareholders who demand revenues or marketing people who require me to ship my software in time for a specific trade show or holiday season. I can ship IMatch 5 when it is ready. I manage my own files with it, after all  ;)

Gerd

Dear Mario,

that's why I continue with IM!  :)

Regards
Gerd

Richard

Quote: What do you think will happen with all our software, docs and pics and whatever else is stored somewhere in and around our computers, if we die someday?
My main effort and the reason why I have IMatch is genealogy and family history. Many of my docs and pics have been used by dozens of others along with much of my family's history. Long after I die and my name is forgotten, some of my work will live on. As to what is on my computer, CDs, DVDs, backup drives, etc., my hope is that I have eliminated most of it before I die. Anyone getting deep into what is on my computer would know I am crazy and I don't want to leave proof.  :D

Gerd

"Long after I die and my name is forgotten, some of my work will live on ....."

Amen!  ;D

Regards
Gerd

Gerd

Dear all,

short info in between:

Now (today, 8 Aug., 17:00) IM5 has finished "Reading-in Metadata"  ... I thought I could now start working with IM5, but ... surprise, surprise ... the program is now 'Adding and Updating Files' ... it is currently at 1%; it started with an estimate of 8h, but now shows 20h ....  I have to wait ... will there be more surprising data/file updates?

Regards
Gerd

BenAW

Have a look at your drive setup. Perhaps something slow there?
My 19,000+ images take about 1.5 hours to be completely ingested.
Do you have Preferences > Cache set to Default?
Perhaps "On Demand" is a better choice for you.

Richard

I agree that Cache "On Demand" would be a better choice for Gerd, who is ingesting 155,000 files from a HDD connected via a USB 2 port. I am sure that USB is slowing the import, but I have had big problems with speed myself. Yesterday I had metadata to write back for 3,848 images. I was going to be away for two hours so I started the write back. It projected that it would take 2:14 minutes. I forgot that the time given was only for the first phase. When I came back 2:30 later, it was in the second phase and the projected time was 14:37. There is a third phase as well.  :( I have decided that I will forgo any future metadata write backs on this computer.

Gerd

Thanks for the tip, I have changed it to "On Demand". Let's see how it proceeds now ...

Regards
Gerd

P.S. It is now 1 h later and IM5 has done 13%, so this little setting speeds things up tremendously! Before it was 2% per hour ...
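
To put those two rates side by side, here is a small sketch. It assumes progress stays linear and that the 13% was reached within the one hour after the change, starting from about 1%; both are guesses, not something IM5 reports.

```python
def hours_to_finish(percent_done: float, percent_per_hour: float) -> float:
    """Hours left at a constant rate of progress (an assumption)."""
    return (100.0 - percent_done) / percent_per_hour


before = hours_to_finish(1.0, 2.0)    # about 2% per hour with the default cache setting
after = hours_to_finish(13.0, 13.0)   # 13% after the first hour with "On Demand"
print(f"before: {before:.1f} h left, after: {after:.1f} h left")
print(f"speed-up: {13.0 / 2.0:.1f}x")
```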

Gerd

Dear Mario,

IM5 is back to life! Diagnosis and optimization are finished: no errors found.

Now I need some info on how to continue, because in the category view I see a lot of multiples of the same categories.
The number of pictures (all = 144,108) may be OK, but why does the sub-category IMatch Sample ... Make and Model ... Canon show only 15?
Minolta and Fuji seem to be OK.

What do you propose as next steps?

Regards
Gerd

P.S. what is the difference between "unassigned" and "uncategorized"? (see screenshot)

[attachment deleted by admin]

ovrevid

The data-driven categories need to be refreshed (indicated by the blue dots on the categories). You will find this in the Help under Working with categories.

Edit: Refreshing is explained at Data-driven Categories
br
Vidar
-- Vidar

Gerd

Hi Vidar,

... and what about the multiple "normal" categories?

Regards
Gerd

ovrevid

I have no idea, sorry. Try refreshing the categories first.
-- Vidar

Mario

Quote: P.S. what is the difference between "unassigned" and "uncategorized"? (see screenshot)

Needless to say, you can just look that up in the IMatch help. If in doubt, use the help. It's pretty good. And at the same time you can learn about data-driven categories, what the blue refresh icon means, and the category formulas used for these categories. Cool stuff you can surely use to let IMatch do a lot of work for you.

And, if you just click on one of these categories and then read the description in the property window you will immediately know what these categories are for... for example:



When setting up these sample categories for you, I've actually spent time adding a (hopefully) understandable description. And by looking at the formula used for this category @Unassigned you also know what to look up in the help for more information. There are many useful formulas like this, and I'm sure you will be able to use them to organize your collection faster, better and with less work for you.

[attachment deleted by admin]

Mario

Quote: because in the category view I see a lot of multiples of the same categories.

This is impossible. Each category must be unique under its parent. There cannot be more than one category with the same name under the same parent. Can you give us more details?

Gerd

Dear Mario,

the screenshot shows how it is ...

I started on 4th August with reading in the pictures and importing the categories from IM36 (with checksum), using the same file I used in my test database, where it worked without problems. I got the message that 0 categories were assigned, so I tried it with "name and path", because the location of the files is the same for IM36 and for IM5: again, 0 categories assigned. Then I tried it a third time, again with 'checksum': again 0 assignments. Maybe new categories were created every time.
After that, I think, I ran an optimize, and then suddenly I saw pictures assigned to categories. I have saved the screenshots from the category import and made a copy of the log file. I attach only the screenshots here, because the other two are more than 2 MB even when zipped (I'll send them directly), so you can have a look; maybe it helps ...


[attachment deleted by admin]

Gerd

Dear Mario,

I have exported a small category "My Familie" as an XML file and imported it into Excel to analyse it.

Here, too, I can see the same category three times, e.g. "Mum and Dad" with different cat-oids, see screenshots.
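
The same check can be done without Excel. The sketch below is only an illustration: the rows are invented, and the column layout (category oid, parent oid, name) is an assumption about what a flattened export might look like, not the actual IMatch XML schema.

```python
from collections import defaultdict

# Hypothetical rows after flattening an export:
# (category oid, parent oid, category name). Real exports will differ.
rows = [
    (101, 10, "Mum and Dad"),
    (102, 10, "Children"),
    (245, 10, "Mum and Dad"),
    (388, 10, "Mum and Dad"),
    (400, 11, "Children"),  # same name under a different parent: allowed
]


def find_duplicates(rows):
    """Group oids by (parent, name) and keep groups with more than one oid."""
    groups = defaultdict(list)
    for oid, parent, name in rows:
        groups[(parent, name)].append(oid)
    return {key: oids for key, oids in groups.items() if len(oids) > 1}


for (parent, name), oids in find_duplicates(rows).items():
    print(f"{name!r} appears {len(oids)} times under parent {parent}: oids {oids}")
```

Grouping by parent as well as name matters, because the same name under two different parents is not a duplicate.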

Regards
Gerd

P.S.  ... a temporary workaround: in the "Categories" view, in the "Category Filter" window below, I activated the option "Hide empty categories" ....

[attachment deleted by admin]

kiwilink

Gerd:

I was reading the thread about speed so I tested my entire database just for a test.  I was really surprised how fast Imatch5 was.  Here are some statistics from migrating my entire database.

It took 29 minutes to load my entire database (83GB, 33,724 images, 675 folders).
It took another 21 minutes to update the indexes.

I hope this helps.

Mike

P.S.  Here is my configuration

Intel I-7 3770 Processor
32GB of Memory
Z77 Extreme 4 Motherboard
64 bit Windows 7
IMATCH5 Database files only stored on a Samsung SSD 840 Disk
Actual images stored on a Seagate 500GB Internal Harddrive

Mario

Quote from: Gerd on August 10, 2013, 11:00:11 PM
Dear Mario,

I have exported a small category "My Familie" as an XML file and imported it into Excel to analyse it.

Here, too, I can see the same category three times, e.g. "Mum and Dad" with different cat-oids, see screenshots.


You have attached screen shots from IMatch importing categories. This does not help in this case.

If you import categories from IMatch 3 into IMatch 5 and IMatch 5 duplicates the category names, this is a bug.  We need to find out how this happens.
Since this thread is so long already and covers so many topics, I have lost track. Always better: One problem per thread, this makes it easier to discuss.

To analyze this I need the IMCS file you have created in IMatch 3.
And information how you imported the file, e.g. did you import the same file multiple times?
I have never seen duplicate category names, so either they are already in the IMCS file exported from IMatch 3, or they are created during the import.

Gerd

Dear Mario,

in the end I deleted all my imported categories and did the import again some days ago, with the result that not all pics are assigned to their categories; for this I have opened a new post.

Regards
Gerd

Mario

But your last post was about duplicate category names, which is a different problem.
Do you have the IMCS file and the steps you did in order to produce these duplicate category names?

Gerd

Hi Mario,

the IMCS file from IM36 is 3.3 MB zipped. I'll send it by email.

The steps I took:

1. I added the 144,000 pictures
2. I exported the IM36 category export file (standard format with file links)
3. I imported this IMCS file with the option checksum; result: 0 categories added, 0 files assigned
4. Then I decided to import again, but now with the option Name and Path, because the location of the pics is the same as for IM36: 0 categories added, 277,720 files assigned
5. Then I imported the IM36 IMCS file again with the option checksum, because I thought I had done something wrong or forgotten something, and now I saw the categories, but some of them 3 times.
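
To make the difference between the two import options in steps 3 and 4 concrete, here is a rough sketch of the two matching strategies. Everything in it is an assumption for illustration: the entry layout (`path`, `checksum`, `category`), the function names, and the use of SHA-256. How IMatch actually computes and compares its checksums is not described in this thread.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path) -> str:
    """SHA-256 of the file contents, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_by_path(entries, files):
    """Assign a category to every file whose full path appears in the export."""
    by_path = {entry["path"]: entry["category"] for entry in entries}
    return {f: by_path[str(f)] for f in files if str(f) in by_path}


def match_by_checksum(entries, files):
    """Assign a category to every file whose content hash appears in the export."""
    by_sum = {entry["checksum"]: entry["category"] for entry in entries}
    matches = {}
    for f in files:
        category = by_sum.get(file_checksum(f))
        if category is not None:
            matches[f] = category
    return matches
```

Matching by path only works while the files stay where they were at export time; matching by checksum survives a move but finds nothing if the two sides compute the checksum differently or the file contents changed in between.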

Regards
Gerd

Mario

Thanks for sending the category schema file.

I've imported it multiple times, in multiple databases. Closed the databases. Re-opened them. Imported the file again.
I even changed the check sums and file names in your file so that IMatch found some files to import by check-sum and by name.
In none of my tests did IMatch produce duplicate categories  :-\

You wrote that you first tried to import the file with the check-sum mode and no files were assigned. The second import using the path worked. This would indicate a problem with importing by check sum (there are several other reports about this and I'll be looking into this shortly). You then imported again, by check-sum. You then saw the duplicate categories. You did not mention how many imported categories IMatch reported in your third attempt. What did IMatch show?

Gerd

Hi Mario,

after the third try, all categories had been imported ... I was so surprised and happy that they were now in that I forgot to make a screenshot ....

I have attached some screenshots. If you sort them by date, you will get a small overview of what has happened.

The last two (Cat_130811_19h47m ...) were made after I deleted all my imported categories and cleaned up the ones in "IMatch Sample ..." down to one category.

Regards
Gerd

[attachment deleted by admin]