LM Studio Image Inputs

Started by Lincoln, September 14, 2025, 03:00:12 AM

Previous topic - Next topic

Lincoln

Just found this info in the Quick Docs in LM Studio 0.03.25. As a user of large 100 MB TIFF files, this note about passing large files to LM Studio caught my interest. Can you confirm whether IMatch is using this method?

From LM Studio Quick Docs - top right hand of Developer Page - Image input

The API supports requests containing images when a vision-enabled model (e.g. LLaVA) is loaded. Images are passed in using the messages array in a request to the /v1/chat/completions endpoint.

Important: The API does not recognize filepaths, so images must be encoded in base64. See pro tip for this below.
Note that large images will fail here due to their base64 encodings surpassing curl's length limit. To process large images, please query the LLM using another method such as the TypeScript SDK.

Multiple images
Multiple images can be passed to the LLM by simply specifying more image-url objects in the content array.

Pro tip for converting images to base64
Suppose your local image exists in the path /path/to/image. You can use the base64 utility packaged with GNU coreutils like so:

  {
    "type": "image_url",
    "image_url": { "url": "data:image/png;base64,'"$( base64 -w 0 /path/to/image )"'" }
  }
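The quoted docs above can be illustrated in code. Here is a minimal Python sketch (not IMatch's code; model name "llava" is an assumption) that builds the messages array for /v1/chat/completions, encoding each image as a base64 data URL. Multiple images simply become multiple image_url parts in the content array:

```python
import base64


def image_part(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one image_url content part with a base64 data URL,
    the programmatic equivalent of the shell pro tip above."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{b64}"}}


def build_request(prompt: str, images: list[bytes], model: str = "llava") -> dict:
    """Assemble the request body for POST /v1/chat/completions.
    Each extra image adds one more image_url part to the content array."""
    content = [{"type": "text", "text": prompt}]
    content += [image_part(img) for img in images]
    return {"model": model,
            "messages": [{"role": "user", "content": content}]}
```

Because the whole body is built in memory and posted over HTTP, no shell command-line length limit is involved.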

Mario

I think I don't understand your question. 

All AI APIs take Base64-encoded images: Ollama, LM Studio, OpenAI, Gemini, Mistral.
Each AI has different recommendations for image size, which IMatch AutoTagger honors. The size of the images provided also has an impact on the cost for cloud-based AIs (larger images can cost up to four times more to process).

IMatch will never send a 100 MB TIFF file to an AI, if this is your question.

They don't support TIFF files anyway, only JPG or PNG, provided either as Base64-encoded images or uploaded in advance to the cloud storage the AI company offers (at extra cost) and then referenced in prompts. That is useful in corporate contexts, but not for individual IMatch users.

The default image size IMatch uses for LM Studio is 512x512 pixels, for example. IMatch produces these images from the cache image, or generates them on the fly via temporary cache images, or from the original file in the case of JPG.

AI models don't produce better results just because you provide larger images (sometimes the result is even the opposite). Many AI engines scale down images that are too large anyway, or break them into smaller chunks for pipelining.

The default image size AutoTagger uses is usually sufficient and is based on the recommendations of the AI provider.
If you run prompts that do OCR or that analyze fine details or small objects in images, setting a larger image size in AutoTagger can be helpful (but this also costs more than the default image size for cloud-based AIs).
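The scaling described above (fit within a maximum size, never upscale, preserve aspect ratio) can be sketched in a few lines of Python. This is an illustration of the general technique, not IMatch's actual implementation:

```python
def fit_within(width: int, height: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Scale (width, height) down to fit within (max_w, max_h),
    preserving the aspect ratio; never upscale a smaller image."""
    scale = min(max_w / width, max_h / height, 1.0)
    return (round(width * scale), round(height * scale))


# A 4000x3000 original fits a 2048x2048 limit as 2048x1536;
# a 400x300 image is already within a 512x512 limit and stays unchanged.
```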

Lincoln

"Note that large images will fail here due to their base64 encodings surpassing curl's length limit. To process large images, please query the LLM using another method such as the TypeScript SDK."

Sorry, my question is: can I change the query to use the TypeScript SDK instead of base64? I only noticed this information in the latest update from LM Studio. Have you read the Quick Docs?
Also, what does the image size in the ai-services-202501.json file refer to if IMatch only uses 512?
"id": 800,
      "name": {
        "en": "LM Studio",
        "de": "LM Studio"
      },
      "description": {
        "en": "LM Studio Local AI Host."
      },
      "minImageSize": {
        "width": 512,
        "height": 512
      },
      "maxImageSize": {
        "width": 2048,
        "height": 2048
      },
      "apiUrl": "http://127.0.0.1:1234",
      "allowEditAPIUrl": true,
      "apiKey": false,
      "infoUrl": "https://lmstudio.ai/",
      "runsLocal": true,
      "billing": [
        "free"
      ],

Mario

#3
What would you do with a TypeScript SDK? Which problem are you trying to solve?
Do you want to send a 100 MB image file to LM Studio to break it?

What does TypeScript even do in the context of AutoTagger? IMatch uses AI connectors written in C++ and optimized for parallel processing. TypeScript has no place here. You can write your own app in TypeScript and use the LM Studio API or OpenAI API.

Quote: "Also what does the image size in the ai-services-202501.json file refer to if IMatch only uses 512?"
Please refer to the AutoTagger documentation.
512 pixels is the default size. If you want to send larger images to LM Studio for whatever purpose, select a larger image size in AutoTagger. For LM Studio, IMatch stops at 2K resolution, because more resolution does not produce better results in most cases. Depending on the model, LM Studio and Ollama may even scale down the delivered image before applying the model.

Why would you need to send even larger images?

Besides, it is no problem to send a 2K image via Base64.
Does AutoTagger fail on your computer when you set the image size to the largest setting?
I've just tried and it works fine with LM Studio at maximum image size.

Even if your original image size is 10K, AutoTagger will never send an image that large to the AI.
Which problem are you encountering and trying to solve?

thrinn

Quote from: Lincoln on September 14, 2025, 12:33:32 PM: "Note that large images will fail here due to their base64 encodings surpassing curl's length limit. To process large images, please query the LLM using another method such as the TypeScript SDK."
@Lincoln: My understanding of this note is: for large images, curl (which is a command-line utility) is not the best way to pass the image data to LM Studio. Instead, one should call the API directly, but still pass base64-encoded data. It's about how to call the API, not about a different encoding of the image data.
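To make this concrete: here is a hedged Python sketch of calling the API directly from a program, still with base64 image data, so no shell argument-length limit applies. The endpoint URL matches the apiUrl from the JSON above; the model name "llava" is an assumption. The function only builds the request; it is sent separately:

```python
import base64
import json
import urllib.request


def make_vision_request(image_bytes: bytes, prompt: str,
                        url: str = "http://127.0.0.1:1234/v1/chat/completions"
                        ) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a base64-encoded image.
    Calling the API from code avoids the command-line length limits that
    affect curl when the base64 string is passed as a shell argument."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    body = {
        "model": "llava",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a vision model loaded in LM Studio, the request would be sent with urllib.request.urlopen(req). The LM Studio TypeScript SDK does the same thing under the hood: a direct API call, base64 data and all.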
And I doubt very much that IMatch is using curl.
Thorsten
Win 10 / 64, IMatch 2018, IMA

Mario

#5
IMatch is not using curl.
I still don't understand which problem the OP is trying to solve.