Searching all Metadata

Started by sinus, November 24, 2017, 10:32:49 AM


sinus

I want to search the whole DB, across all metadata fields, for different words, like:

Meat
fish
founded
tabletop
dishes

and so on.

What is the easiest way to do this without losing my nerves?    :-X

I have over 240'000 files, which does not make it any easier.

So I go to the DB node to search over all files.
Then I try to use the Search Bar.
I want to search for "tabletop", for example.

Phew, I have to be quick. One wrong letter and I have to correct it quickly, because otherwise IMatch begins to search.
Once it is in search mode, which lasts for, say, 2 minutes, I think, "oh, this word is wrong, I have a better one to search for" ... but I can do nothing but wait. I cannot interrupt the search.

If I search for one word, OK, this is not that bad, but if I have a lot of different searches, it is annoying and my nerves start to flutter.

Hence I switch to the filter search.
Stay cool, I get a coffee, stay cool, relax.
Metadata Search with contains "tabletop".

I read the help, but it does not really help; after some searches my nerves are gone.  :-\
I do not know: is IM searching now or not? Though Mario has described everything, I make a mess of it; I am not able to do a proper filter search.
Then IM just sits there, but it did not search. I look at the pause button and the color of the bar, and check whether I ticked the correct checkboxes in the Filter panel.

OK, another search. Relax, it was surely my fault.
Ahhh, again an error of mine, the wrong word. I want to interrupt, but it is not possible, or I do not know how.

Heck, all I want is to search the whole DB in all metadata fields.
I am not able to do so.

I wrote once or twice in earlier posts that I have trouble searching with the filter. Too complicated, not intuitive FOR ME!
Since there have been only a few posts about problems like mine, I think the problem is simply me alone!  :-[

My search friend is a simple thing like the new File Finder app.

Easy to understand, simple.
It searches all files, regardless of where I am; this alone is a big plus, really.
For file name searches it is perfect. Really perfect.

Therefore my question:

Would it make sense to create a feature request for the File Finder to also search in all metadata fields?
I am sure this would be much easier, at least for me.  ;D

(In the Visual Basic script days I had a script, written by myself, which was quite similar to the File Finder.
Enter a word, hit search, and IM searched for the word in all metadata fields.
It was perfect, but I have realised in the last months that I am not really able to deal with the JavaScript language.
If I were, I would have written such a script long ago.)









Best wishes from Switzerland! :-)
Markus

Arthur

#1
There is a filter panel and an in-place search in the file window, so if you make a feature request, why not improve these? Who needs 3 places for search?

Normally every user-triggered, long-running operation should be cancellable. This includes printing, searching, and writing metadata. In our company's software we, too, forget often enough to display a progress bar with a cancel button. But this is a must from a usability point of view once the search space becomes large.

It is not that hard to signal a cancellation to a running thread so that it returns after the current image is processed; it just has to be done. One precondition is that the search really runs in the background, so that the UI does not block and can respond to a user click on the Cancel button.
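
To illustrate the cancellation idea, here is a minimal Python sketch. It is not how IMatch works internally; the function `search_files`, the toy `files` list and the field names are made up for this example. A worker thread checks a shared `threading.Event` before each file, so a Cancel button only has to call `cancel.set()`.

```python
import threading

def search_files(files, term, cancel_event, results):
    """Scan files one by one; stop before the next file once cancellation is requested.

    Returns True if the scan completed, False if it was cancelled.
    """
    needle = term.lower()
    for name, fields in files:
        if cancel_event.is_set():
            return False  # cancelled: leave the partial results as they are
        if any(needle in value.lower() for value in fields.values()):
            results.append(name)
    return True

# Hypothetical toy data: (file name, metadata fields)
files = [
    ("a.jpg", {"title": "Tabletop with dishes"}),
    ("b.jpg", {"title": "Lake Thun"}),
    ("c.jpg", {"keywords": "tabletop; meat"}),
]

cancel = threading.Event()
results = []
worker = threading.Thread(target=search_files, args=(files, "tabletop", cancel, results))
worker.start()
# A Cancel button handler in the UI thread would call cancel.set() here.
worker.join()
print(results)
```

Because the flag is only checked between files, the worker always finishes the file it is currently processing, which matches the "returns after the current image is processed" behaviour described above.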

sinus

Quote from: Arthur on November 24, 2017, 12:21:00 PM
There is a filter panel and an in-place search in the file window, so if you make a feature request, why not improve these? Who needs 3 places for search?

Thanks, Arthur

Well, we NOW have 3 places to search.  ;D
The problem is me, I think.

The best for me is the File Finder app. But it has no metadata search.

The other two searches, the filter and the Search Bar, seem to work fine, because there are only very few postings from users about them.
But I personally keep having trouble with them. This morning I really lost my nerves, because I was not able to search for several different words (one search per word).
If I am searching for one word, then it is OK; then I use the Search Bar.

Hmmmm, today is Black Friday; maybe in my case this means my Friday is really black.  8) ;D :-\

Arthur

I don't know exactly, but I would expect that maybe 5% of all users have a 200k database. And of those, only a few search the whole metadata. I do not know the size of IMatch's user base, but I think there are not that many users with those conditions.

I am a private consumer who manages family photos. My database has 1400 files now and I never search for anything other than category names (albums) and family member names (keywords).
Maybe that's why there is not so much response to these problems.

What is obvious is that if something has to be done, it is not a bug fix. This is an architectural issue, which might be solved in a major release, like reducing the memory needed to manage large category trees.

Arthur

But the best search is the one that works, so maybe someone with scripting experience could extend the "File Finder" app.

What seems to be needed is an "Include Metadata" checkbox in the UI and an inner loop which iterates over the metadata fields of each enumerated file. I would expect an API for this to exist already.
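
The "checkbox plus inner loop" idea could look roughly like the following Python sketch. It works on plain in-memory dictionaries, not on the real IMatch database or its scripting API; `find_files`, `include_metadata` and the sample records are hypothetical names chosen for illustration.

```python
def find_files(files, term, include_metadata=False):
    """Return names of files whose name (and optionally any metadata field) contains term."""
    needle = term.lower()
    hits = []
    for file in files:
        if needle in file["name"].lower():
            hits.append(file["name"])
            continue
        if include_metadata:
            # Inner loop: check every metadata field of this file.
            for tag, value in file["metadata"].items():
                if needle in str(value).lower():
                    hits.append(file["name"])
                    break  # one matching field is enough
    return hits

# Hypothetical sample records
files = [
    {"name": "paris_2017_001.nef", "metadata": {"title": "Eiffel Tower"}},
    {"name": "img_0042.nef", "metadata": {"description": "Brasserie in Paris", "city": "Paris"}},
    {"name": "img_0043.nef", "metadata": {"keywords": "fish; tabletop"}},
]

print(find_files(files, "paris"))
print(find_files(files, "paris", include_metadata=True))
```

With the flag off, only the file name is checked; with it on, the second file is found as well because of its description and city fields.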

sinus

Thanks, Arthur, for your thoughts.
I think you are correct in everything you write.

Let's see ... my nerves have relaxed a bit in the meantime.  :D ;D

Mario

#6
Markus, you wrote a lot of text and this may have hidden the actual problem description...
Can you please try to be less verbose? I will ask if I need more details.

Quote
I want to search the whole DB, across all metadata fields, for different words, like:
Meat
fish
founded
tabletop
dishes
and so on.
What is the easiest way to do this without losing my nerves?

Do you want to find files which contain any of these words, or all?
The outcome of Meat OR fish is different than Meat AND fish or Meat fish, obviously.
Do you need to search each word individually or in combination?
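
The difference between "any of these words" (OR) and "all of these words" (AND) can be shown with a small Python sketch. This is only an illustration on made-up descriptions; the helper `matches` is hypothetical and has nothing to do with the IMatch search syntax.

```python
def matches(text, words, mode="any"):
    """Check whether text contains any (OR) or all (AND) of the given words."""
    text = text.lower()
    found = [word.lower() in text for word in words]
    return all(found) if mode == "all" else any(found)

# Hypothetical descriptions
descriptions = {
    "a.jpg": "Meat and fish on a tabletop",
    "b.jpg": "Fish market",
    "c.jpg": "Empty dishes",
}
words = ["meat", "fish"]

any_hits = [name for name, text in descriptions.items() if matches(text, words, "any")]
all_hits = [name for name, text in descriptions.items() if matches(text, words, "all")]
print(any_hits)  # files with Meat OR fish
print(all_hits)  # files with Meat AND fish
```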

Please note that the (now) default simple search mode does not support multi-word searches or Boolean operators.
As I recall, you were one of the users who wanted an easier and less powerful search bar. If you want to use multi-word searches or AND/OR you have to enable the Advanced mode first.

Quote
Phew, I have to be quick. One wrong letter and I have to correct it quickly, because otherwise IMatch begins to search.

It starts searching about 1.5 seconds after you have typed the last character.
If you have to type complex search patterns and this gives you trouble, why not type them into Notepad and then use copy/paste?
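
The "start searching 1.5 seconds after the last key press" behaviour is commonly called debouncing. Here is a minimal Python sketch of the idea; it is not the IMatch implementation. The class `DebouncedSearch` and its methods are invented for this example, and time is passed in explicitly so the behaviour is easy to follow.

```python
class DebouncedSearch:
    """Hold back a search until the user has stopped typing for `delay` seconds."""

    def __init__(self, delay=1.5):
        self.delay = delay
        self.pending = None        # latest text typed, not yet searched
        self.last_key_time = None  # time of the most recent key press

    def on_key(self, text, now):
        # Every key press replaces the pending text and restarts the waiting period.
        self.pending = text
        self.last_key_time = now

    def poll(self, now):
        """Return the text to search once the idle time has passed, else None."""
        if self.pending is not None and now - self.last_key_time >= self.delay:
            text, self.pending = self.pending, None
            return text
        return None

box = DebouncedSearch(delay=1.5)
box.on_key("tabl", now=0.0)
print(box.poll(now=1.0))   # still typing: nothing to search yet
box.on_key("tabletop", now=1.2)
print(box.poll(now=2.0))   # only 0.8 s since the last key: still waiting
print(box.poll(now=3.0))   # 1.8 s idle: the search for "tabletop" starts
```

Pasting a finished pattern from Notepad counts as a single change, which is why the copy/paste tip sidesteps the timing problem.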

Again, the search bar has been designed to quickly look up files in the file window: single-word searches or something like Family AND Spain. And it works great.
With 200,000 files in the file window to search this may be a bit of a nuisance.

In a later post you wrote:

Quote
Heck, all I want is to search the whole DB in all metadata fields.
I am not able to do so.

You can. But since I know that you are using NEF files a lot and considering your 200,000 files database this means that IMatch has to search about

~ 200,000 x ~ 400 = about 80 million (!) data records

for each of your searches. 80 million records need to be searched, filtered and the result prepared. This is not something you do in a second and not something you do very often. If you have to search your entire database for files containing the word fish anywhere in any of the ~ 400 metadata fields, there may be room for improvement in your workflow.

IMatch does not maintain a search index for all metadata fields for all files. This would require lots of memory and would slow down IMatch a lot, because many IMatch features update metadata and thus 'invalidate' the index, requiring IMatch to rebuild it. And this takes a long time for a 200,000-file database. And there are users out there with 500,000 files or more.

I've made a quick test here. With my 430,000-file test database I switched to advanced search and search everywhere.
Then I searched for Paris because I know that my database contains several thousand files which have Paris either as a keyword or in the title/description.

The search takes 13 seconds and searches a whopping 173 million records.
When I switch to frequently used tags (title, description, keywords, headline, ...) the search takes about 5 seconds - for 430,000 files searched. Not bad.
Not much difference for either test when I search for a term consisting of the phrase "Paris by Night"  or Paris AND Brasserie (with Advanced Search on to allow for the AND).
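
For readers who want to check the numbers, here is the arithmetic as a short Python snippet. It only uses the rough figures quoted in this post (about 400 metadata records per file, 173 million records in 13 seconds), so the results are estimates and nothing more.

```python
# Estimate for the 200,000-file database, assuming about 400 metadata records per file
files = 200_000
records_per_file = 400
records = files * records_per_file
print(f"{records:,} records to search")

# Figures from the test on the 430,000-file database
test_files = 430_000
test_records = 173_000_000
test_seconds = 13

per_file = test_records / test_files   # average records per file in the test
rate = test_records / test_seconds     # records searched per second
print(f"about {per_file:.0f} records per file")
print(f"about {rate / 1e6:.1f} million records per second")
```

The test database averages roughly 400 records per file as well, which is consistent with the estimate of about 80 million records for 200,000 files.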

When I use the Metadata Search in the Filter panel (mode: contains, where: Everywhere, pattern: Paris) IMatch takes about as long as with the file window search bar. This is to be expected because both use the same search routines internally.

The Filter panel indicates that it is running by changing the Funnel icon in the toolbar. Usually you don't notice this because it's so fast, but for 200,000 files you will see it switch, and you can stop the filter panel with it at any time.


The File Finder app does only one thing: It finds file names very fast. For many cases it can use a special index IMWS maintains for file names. This is doable because file names don't change as often as metadata. And if you have 200,000 files, it only needs to search 200,000 strings, and does not have to load 80 million records from the hard disk first. A lot quicker.

So, tips:

Make sure you use the right search mode for the File Window Search Bar.
Use "Search Everywhere" only when really needed. The search is 3 to 4 times faster if you search only in often used tags and not in every EXIF tag or maker note.
If my answer does not cover your problem, tell me exactly what your search pattern is, and which settings you used (screen shot, quickest).


During the IMatch 5 beta I toyed with a full-text index maintained by the database. This index covered all frequently used metadata tags and was blazing fast (sub-second search performance).
It had two major disadvantages, though:

1. It supported only prefix searches. It finds "bartender", "bar" and "bar stock" when you search for bar, but does not find "bartender" when you search for "tender".
Beta testers did not like this at all.

2. The index of course has to be updated every time a metadata tag changes (for tags which it contains). And this was a big drag on performance, because most of what IMatch users are doing is changing metadata, which means that the index is out of sync most of the time. This is not like Google, which visits each web site once a week or month, updates its index and then keeps that index unchanged for a week or month...

I'm still looking for ways to implement such a thing, but so far I was not successful. And, most users do not search 200,000 files all the time.
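
The prefix-only limitation can be illustrated with a small Python sketch. This is not the index that was tried in the IMatch 5 beta; it merely assumes a sorted word list as a stand-in for such an index, and `prefix_search` and `substring_search` are names made up for the example.

```python
import bisect

def prefix_search(sorted_words, prefix):
    """Fast lookup in a sorted list: jump to the first candidate, then read matches in order."""
    start = bisect.bisect_left(sorted_words, prefix)
    hits = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no further word can match
        hits.append(word)
    return hits

def substring_search(words, part):
    """Slow but complete: every entry has to be inspected."""
    return [word for word in words if part in word]

index = sorted(["bartender", "bar", "tabletop", "bar stock"])

print(prefix_search(index, "bar"))       # finds bar, bar stock, bartender
print(prefix_search(index, "tender"))    # finds nothing: "tender" is not a prefix
print(substring_search(index, "tender")) # finds bartender, at the cost of a full scan
```

The sorted structure is what makes the prefix lookup fast, and it is also why a match in the middle of a word cannot be found without scanning everything.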

-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com

sinus

Quote from: Mario on November 24, 2017, 02:56:56 PM
Markus, you wrote a lot of text and this may have hidden the actual problem description...
Can you please try to be less verbose? I will ask if I need more details.

Quote
I want to search the whole DB, across all metadata fields, for different words, like:
Meat
fish
founded
tabletop
dishes
and so on.
What is the easiest way to do this without losing my nerves?

Do you want to find files which contain any of these words, or all?
The outcome of Meat OR fish is different than Meat AND fish or Meat fish, obviously.
Do you need to search each word individually or in combination?

No combination.
No OR, no AND.
For each word one search.

Quote from: Mario on November 24, 2017, 02:56:56 PM
Please note that the (now) default simple search mode does not support multi-word searches or Boolean operators.
As I recall, you were one of the users who wanted an easier and less powerful search bar. If you want to use multi-word searches or AND/OR you have to enable the Advanced mode first.

Yes, I know.
I only want to search for ONE word in ALL metadata fields over the whole DB.
No multi-word searches.
No AND/OR.

Quote from: Mario on November 24, 2017, 02:56:56 PM
In a later post you wrote:

Quote
Heck, all I want is to search the whole DB in all metadata fields.
I am not able to do so.

You can. But since I know that you are using NEF files a lot and considering your 200,000 files database this means that IMatch has to search about

~ 200,000 x ~ 400 = about 80 million (!) data records

for each of your searches. 80 million records need to be searched, filtered and the result prepared. This is not something you do in a second and not something you do very often. If you have to search your entire database for files containing the word fish anywhere in any of the ~ 400 metadata fields, there may be room for improvement in your workflow.

IMatch does not maintain a search index for all metadata fields for all files. This would require lots of memory and would slow down IMatch a lot because many IMatch features update metadata and thus 'invalidate' the index - requiring IMatch to rebuild it. And this will take long, for a 200,000 files database. And there are users out there with 500,000 files  or more.

I've made a quick test here. With my 430,000-file test database I switched to advanced search and search everywhere.
Then I searched for Paris because I know that my database contains several thousand files which have Paris either as a keyword or in the title/description.

The search takes 13 seconds and searches a whopping 173 million records.

I did the same now, with 250'000 files:
searching for the word Spiez over the whole DB, with advanced search and search everywhere.
Found 82 files, time 62 seconds.


Quote from: Mario on November 24, 2017, 02:56:56 PM
When I switch to frequently used tags (title, description, keywords, headline, ...) the search takes about 5 seconds - for 430,000 files searched. Not bad.

I did the same here too: 23 seconds.

Quote from: Mario on November 24, 2017, 02:56:56 PM
When I use the Metadata Search in the Filter panel (mode: contains, where: Everywhere, pattern: Paris) IMatch takes about as long as with the file window search bar. This is to be expected because both use the same search routines internally.

The Filter panel indicates that it is running by changing the Funnel icon in the toolbar. Usually you don't notice this because it's so fast, but for 200,000 files you will see it switch, and you can stop the filter panel with it at any time.

OK, I will check this another time. I tried just now, but I have trouble seeing whether IM is searching, pausing or whatever.
I will read the help again and try the filter again.

Quote from: Mario on November 24, 2017, 02:56:56 PM
The File Finder app does only one thing: It finds file names very fast. For many cases it can use a special index IMWS maintains for file names. This is doable because file names don't change as often as metadata. And if you have 200,000 files, it only needs to search 200,000 strings, and does not have to load 80 million records from the hard disk first. A lot quicker.

True, the File Finder app is incredibly fast.
AND convenient.
I do not have to select the Database folder or the @All category first, and this alone is worth its weight in gold!
Not having to switch to the Database folder or the @All category is (for me) an important advantage,
even if I had to wait 5 times longer (as an example).


Quote from: Mario on November 24, 2017, 02:56:56 PM
So, tips:

Make sure you use the right search mode for the File Window Search Bar.
Use "Search Everywhere" only when really needed. The search is 3 to 4 times faster if you search only in often used tags and not in every EXIF tag or maker note.
If my answer does not cover your problem, tell me exactly what your search pattern is, and which settings you used (screen shot, quickest).

The problem is not really the speed. Whether I wait 10 seconds or 30 seconds does not really matter.
But it matters to me (maybe not to other users) that I first have to go to the Database folder or the @All category.
And I have to do this with both the Filter panel and the Search Bar.

That is why the File Finder app is so very convenient.
I can open it, type a word, push Search, and that is all!
It searches the entire database, no matter where I am. And it opens a result window.
Hence I can very quickly run a second and a third search. Everything is "stored" in the result window.

If the File Finder app allowed me to search for one word (e.g. Paris) everywhere, this would be great.
If such a search took 70 seconds ... it would not really matter.
Of course the quicker the better, but far more important is a convenient search.

I love the File Finder app, really cool.


Quote from: Mario on November 24, 2017, 02:56:56 PM
During the IMatch 5 beta I toyed with a full-text index maintained by the database. This index covered all frequently used metadata tags and was blazing fast (sub-second search performance).
It had two major disadvantages, though:

1. It supported only prefix searches. It finds "bartender", "bar" and "bar stock" when you search for bar, but does not find "bartender" when you search for "tender".
Beta testers did not like this at all.

2. The index of course has to be updated every time a metadata tag changes (for tags which it contains). And this was a big drag on performance, because most of what IMatch users are doing is changing metadata, which means that the index is out of sync most of the time. This is not like Google, which visits each web site once a week or month, updates its index and then keeps that index unchanged for a week or month...

I'm still looking for ways to implement such a thing, but so far I was not successful. And, most users do not search 200,000 files all the time.

I remember.
Yes, I agree, searching for tender should also find "bartender"; hence the way it is now is very good.

Hm, in the end the most irritating (and for me inconvenient) thing is that I first have to go to a place where all files are shown (Database or @All).
And this is also irritating for some new users, as I remember from some postings.

But it is what it is.
Maybe one day someone will enhance the File Finder app with such a search over all metadata fields in the whole DB.

So, let this post be, Mario!
I guess I had a bad morning: nothing worked when searching, the filter least of all, and in the end I went through the keywords (directly in the categories) and found my results that way after all.
It must simply be down to Friday, the black one.  ;D

When I have more time again, I will first read the help carefully once more.
Then I will look at the filter and the Search Bar again.

And then try everything once more.
If I still have problems then, I will post again, but shorter.  ;D
Thanks!




Mario

#8
62 seconds for only 200,000 files is really lame. My PC is 4 times faster. And it is almost 3 years old. Samsung SSD, i7, 32GB RAM.

Quote
But it matters to me (maybe not to other users) that I first have to go to the Database folder or the @All category.
And I have to do this with both the Filter panel and the Search Bar.

This is how you control where to search. It may be that you are only ever searching the entire database, but other users want to search only in the current folder, some categories, a timeline node or in a collection. The scope feature allows you to control where to search and also gives IMatch a place where to show the search results: In the File Window.

An "I'm currently looking at this folder but I want the file window search bar to search the entire database" mode would probably cause a lot of confusion. Where to show the results? Open a result window? If not, what about stacks and version sets? They may break or change because the file window suddenly shows files not from the current scope...

Quote
I do not have to select the Database folder or the @All category first, and this alone is worth its weight in gold!
Not having to switch to the Database folder or the @All category is (for me) an important advantage,
even if I had to wait 5 times longer (as an example).

Why don't you click on the Pause button in the File Window first?
This makes a lot of sense if you plan to load 200,000 files into your scope.
You can then just click on @All or the Database node and enter your search pattern. Without waiting for the File Window to load your 200,000 files.

Quote
Hm, in the end the most irritating (and for me inconvenient) thing is that I first have to go to a place where all files are shown (Database or @All).
And this is also irritating for some new users, as I remember from some postings.

Not really. Do you really want IMatch to always search the entire database? What if you want to search only in a category, a folder or in March to June 2017? Or just all files with a rating better than 3?
The "I search what's in the file window" is actually both 'learnable' and flexible.

sinus

Quote from: Mario on November 24, 2017, 04:39:14 PM
Quote
Hm, in the end the most irritating (and for me inconvenient) thing is that I first have to go to a place where all files are shown (Database or @All).
And this is also irritating for some new users, as I remember from some postings.

Not really. Do you really want IMatch to always search the entire database? What if you want to search only in a category, a folder or in March to June 2017? Or just all files with a rating better than 3?
The "I search what's in the file window" is actually both 'learnable' and flexible.

Maybe I am the only one, but yes, usually (90% or so) I search the whole DB.

But of course, if I know it was in the last month or two, then I go directly into the folder. And because a folder has only about 3000 files, most of them stacked, I only have to look at 10-20 files (top of stack); hence I do not even have to search, only to look.

If I want to search in a category, then I use the Search Bar, because this is the most convenient.
The same if I really have to search in a folder: then I also use the Search Bar.

And of course I often use the File Finder app: great, cool, quick.
And since I have a short description in the file name, I very often find the files I am looking for.


Searches for Paris, monkey, halloween or chicken I would simply do with the File Finder ... and very fast!
Thanks for this app, really!

I never use the filter search.


Mario

#10
Your search terms look like something that typically goes into keywords.
It seems that you could add some more keywords to your files. This saves a lot of searching and you can immediately find your files under @Keywords.

I just tried your workflow again with my 400,000 files database.

1. Enable Pause in the File Window.
2. Click on the Database node.
3. Enter Paris in the File Window Search Bar.

The File Window updates in less than 10 seconds, showing me about 1000 hits, from 400,000 files in the database.
I used the Advanced Mode, but frequent tags only.
I can live with that performance.

Quote
Searches for Paris, monkey, halloween or chicken I would simply do with the File Finder ... and very fast!
Thanks for this app, really!

Great. Your problem is solved then? Add more keywords to file names?
Or sit down, clone the File Finder App and make it a "search my entire database everywhere" app... ;)

Carlo Didier

Just for comparison:
Search bar: part of a filename
@All category (~90000 files)
3941 results found after 11,8s
Hardware: 7 year old Core i5-2300 @ 2.80GHz, 16GB RAM, OS and 4.5GB db on Samsung SSD
OS: Windows 10 Home (1703) 64bit

Very acceptable IMHO

Mario

7 years? That's a lot of mileage!
I need to swap my keyboard every year (unreadable) and my hardware is ground to the bones after 3  ::) ;D

The SSD is of course what speeds things up. A lot of disk I/O during search, and that's what SSDs excel at. Best invention since sliced bread.

sinus

Quote from: Mario on November 24, 2017, 06:15:47 PM
Great. Your problem is solved then? Add more keywords to file names?

Yes, it is solved.
Adding more keywords is possible, but not doable for me; no time.
If I get some images from someone, then some important "keywords" are in the description, or in the headline, or in the city, and so on...

Some people do not write the info into the proper fields.  ::) Hence I have to search over all fields in the whole DB. Mostly, but not always.

Quote from: Mario on November 24, 2017, 06:15:47 PM
Or sit down, clone the File Finder App and make it a "search my entire database everywhere" app... ;)

Ha! Yes, that would be the best thing!
Uh, but since I have tried a lot with JavaScript, I now know my limits  ::) :-[ :-\

I think I could do it, but it would take me a full week.
If I ever have the time, I will do it.  :)

Mario

IMWS offers the endpoint /search/metadata which does all the work.
This would be the start of your very own search engine:


sinus

Thanks, Mario

As soon as I have time (a lot of time) I will try this.