RAG vs. CAG: Solving Knowledge Gaps in AI Models youtube video @IBMTechnology 

Started by Lincoln, June 23, 2025, 01:20:12 PM


Lincoln

I was just watching the above YouTube video and started experimenting with RAG in LM Studio to improve Gemma 3's knowledge of IMatch. Since you are limited to 5 pages, the knowledge from the offline Help file seems to be helping, but I was wondering whether it is possible to get the help file as just one file? We can use 128,000 context tokens now, so all the information available on IMatch might fit and be accessible to the LLM. Everything can be done locally, with no need to fine-tune the LLM. Any thoughts or suggestions on getting AI to do the work?
Scraping the forum would be a goldmine of information but that's above my pay grade. I've attached some samples: first the pages used, then the prompt information I provided, then the output.

Mario

Quote: Scraping the forum would be a goldmine of information but that's above my pay grade.
I believe OpenAI, Gemini, Anthropic and Copilot have ingested the IMatch help and this community already. You can ask them questions about IMatch. Probably even DeepSeek by now.

Quote: The offline Help file seems to be helping
I've played with that months ago, with mixed results. I've posted about that here somewhere.

Sometimes I now get support emails from people complaining that a feature or workflow an AI has dreamed up doesn't work. One person even used the term "f' you", because the command the AI suggested to him does not exist in IMatch... ;D

Please keep in mind that I own the copyright on the IMatch help system and I don't allow feeding it into AI. Local AI with RAG is OK, but explicitly training corporate AI on texts and images I hold the copyright for is not.
Too late now, I guess. The big AI companies have already vacuumed the web. SIGH  ;)

Providing the offline help system is a courtesy. I do this extra work for maybe a handful of users who need it twice a year. If you need this, you can surely find a script or tool that can concatenate multiple text documents.
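For what it's worth, concatenating the offline help topics is a few lines of Python. This is just a sketch under the assumption that the help is a folder of .htm and .txt files; the directory and function names here are made up, not part of any IMatch tooling:

```python
from pathlib import Path

def concatenate_help(help_dir: str, out_file: str) -> int:
    """Concatenate all .htm/.txt files in help_dir into one UTF-8 file.

    Each topic is preceded by a separator naming the source file, so the
    LLM (or you) can still tell which topic a passage came from.
    Returns the number of files merged.
    """
    paths = sorted(Path(help_dir).glob("*.htm")) + sorted(Path(help_dir).glob("*.txt"))
    with open(out_file, "w", encoding="utf-8") as out:
        for p in paths:
            out.write(f"\n\n===== {p.name} =====\n\n")
            out.write(p.read_text(encoding="utf-8", errors="replace"))
    return len(paths)
```

You would point `help_dir` at wherever the offline help was extracted and feed the resulting single file to LM Studio.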

Note that the topics combined amount to about 3.5 MB of UTF-8 text, which is far too much for a context window of only 128,000 tokens, and I don't know how much the RAG implementation in LM Studio will help with that.

Better to feed e.g. only the variables help topic file into RAG when you have questions about variables: better context, fewer tokens, better use of the context window. Doing RAG right can be challenging.

You can feed HTML documents into e.g. Copilot or OpenAI and then ask questions about them (assuming the AI has not already inhaled the text). Including context for the AI (e.g. "use the content on https://www.photools.com/help/imatch/var_basics.htm when answering my question") also often helps, for AIs that can access the web.