Ollama Setup
Ollama provides an easy way to run large language models locally.
Installation
Windows Installer (Recommended)
Download the installer from ollama.ai/download and run it.
Recommended Settings
- Enable "Expose Ollama to the network" only if you are on a network you trust.
- Experiment with context window sizes: start small and increase gradually until you find the sweet spot between speed and model quality.
Useful Commands
ollama run <model> # Download (if needed) and run a model interactively
ollama list # List downloaded models
ollama rm <model> # Remove a model
ollama serve # Start API server
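The API server listens on localhost:11434 by default. As a quick sketch of talking to it from Python (assuming the server is running and that a model — llama3 is used here purely as an example — has already been pulled):

```python
import json
import urllib.request

def build_generate_request(model, prompt):
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON response instead of a stream.
    return {"model": model, "prompt": prompt, "stream": False}

if __name__ == "__main__":
    # Assumes `ollama serve` is running and the named model is available.
    payload = build_generate_request("llama3", "Why is the sky blue?")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```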
Pull Models
Start by visiting the Ollama model catalog.
Each model's page includes instructions for pulling it.
Hugging Face models
You can also download models from Hugging Face. Sign up for an account and enter your hardware details; Hugging Face can then indicate which models your system is able to run.
I'm currently running https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF?show_file_info=Qwen3.6-35B-A3B-UD-IQ4_XS.gguf
To get these running in Ollama, you need to import the GGUF file using a Modelfile.
You can get the Modelfile from https://github.com/jpmanson/llm_templates
Update the model file's FROM line to point to the model you downloaded.
e.g.
FROM ../Qwen3.6-35B-A3B-UD-IQ4_XS.gguf
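A minimal Modelfile for a downloaded GGUF might look like the sketch below (the PARAMETER line and its value are illustrative assumptions, not required — tune the context size per the advice above):

```
# Path to the downloaded GGUF file (relative to this Modelfile)
FROM ../Qwen3.6-35B-A3B-UD-IQ4_XS.gguf

# Optional: set a default context window (illustrative value)
PARAMETER num_ctx 8192
```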
Then run the ollama create command.
e.g.
ollama create Qwen3.6-35B-A3B-UD:IQ4_XS -f .\llm-templates\Modelfile-qwen3
Once you are happy with the model you've created, you can delete the original downloaded GGUF file; Ollama stores its own copy, so the original is no longer needed.
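To confirm the new model is registered (the API equivalent of ollama list), you can query the server's /api/tags endpoint. A small sketch, assuming the server is running on the default port:

```python
import json
import urllib.request

def model_names(tags_json):
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in tags_json.get("models", [])]

if __name__ == "__main__":
    # Assumes `ollama serve` is running on localhost:11434.
    with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
        print(model_names(json.loads(resp.read())))
```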