Should have spent more time…you’re right.
According to some articles, you can self-host smaller-parameter LLMs and/or quantized versions, e.g. the 7B models. Recommendations were 16GB+, and some people even pulled it off with less.
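Rough sketch of what that looks like once you've pulled a quantized 7B into Ollama (the model tag and prompt are just placeholders, assuming Ollama's local API on its default port):

```python
# Minimal sketch: query a locally hosted, quantized 7B model through
# Ollama's HTTP API (assumes something like `ollama pull mistral:7b`
# was run and the server is listening on the default localhost:11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral:7b",  # placeholder tag; swap in whichever quantized 7B you pulled
        "prompt": "Summarize why quantization shrinks memory needs.",
        "stream": False,        # return one JSON blob instead of a token stream
    },
    timeout=120,
)
print(resp.json()["response"])
```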
Ollama has been great for self-hosting, but also check out vLLM, as it's the new shiny self-hosting toy.
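If you go the vLLM route, the Python side is roughly this (the model name and quantization flag are assumptions; pick whatever quantized checkpoint fits your VRAM):

```python
# Minimal sketch of offline generation with vLLM's Python API
# (model name is an assumption; any quantized 7B that fits in VRAM works).
from vllm import LLM, SamplingParams

llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Why would I self-host a 7B model?"], params)
for out in outputs:
    print(out.outputs[0].text)
```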