LM Studio Setup
LM Studio provides a desktop app for running large language models locally, with a built-in server API.
Installation
Download the desktop app from lmstudio.ai for your platform (Windows, macOS, Linux).
Starting the Local Server
- Open LM Studio
- Go to the Local Server tab (sidebar icon)
- Select a loaded model
- Click Start Server
By default the server runs on http://127.0.0.1:1234 with an OpenAI-compatible API.
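Since the server speaks the OpenAI chat-completions protocol, you can talk to it with nothing but the Python standard library. A minimal sketch, assuming the default port and endpoint path (`/v1/chat/completions`); the model id and prompt are placeholders:

```python
import json
from urllib.request import Request, urlopen

# Assumption: LM Studio's default host/port and OpenAI-compatible route
URL = "http://127.0.0.1:1234/v1/chat/completions"

payload = {
    "model": "local-model",  # LM Studio routes requests to the loaded model
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "temperature": 0.7,
}

def build_request(url: str, body: dict) -> Request:
    """Build a POST request with a JSON body for the local server."""
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(URL, payload)
# With the server running, urlopen(req) returns the completion JSON:
# with urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The official `openai` Python client also works against this endpoint by pointing its `base_url` at the local server.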
Recommended Settings
- Enable "Allow CORS" and "Network Access" so other tools on your machine can reach the server
- Adjust the context window — larger contexts use more memory and slow generation, so start small and increase until you find the sweet spot between speed and quality
- GPU offload layers: set to Max to push as much work to your GPU as possible
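Once the server is up, a quick way to confirm it is reachable and see which model is loaded is to hit the `/v1/models` endpoint. A hedged sketch, again assuming the default port; `parse_model_ids` is a hypothetical helper that pulls ids out of the OpenAI-style response:

```python
import json
from urllib.request import urlopen

# Assumption: server started with default settings (port 1234)
MODELS_URL = "http://127.0.0.1:1234/v1/models"

def parse_model_ids(body: dict) -> list[str]:
    """Extract model ids from an OpenAI-style models response."""
    return [m["id"] for m in body.get("data", [])]

def list_loaded_models(url: str = MODELS_URL) -> list[str]:
    """Query the local server and return the ids of available models."""
    with urlopen(url, timeout=5) as resp:
        return parse_model_ids(json.load(resp))

# list_loaded_models() -> e.g. a list containing the id of your loaded model
```

If the call raises a connection error, the server isn't running or "Network Access"/port settings differ from the defaults.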
Finding & Loading Models
I'm currently running Qwen3.6-27b on an RTX 4090.
If you are using an M-series Mac, you will want the MLX variant. LM Studio shows a badge on each model indicating whether GPU offloading is fully or partially supported.