I’ve been looking into self-hosting LLMs or stable diffusion models using something like LocalAI and / or Ollama and LibreChat.

Some questions to get a nice discussion going:

  • Any of you have experience with this?
  • What are your motivations?
  • What are you using in terms of hardware?
  • Considerations regarding energy efficiency and associated costs?
  • What about renting a GPU? Privacy implications?
  • bobburger@fedia.io
    link
    fedilink
    arrow-up
    2
    ·
    6 months ago

    Llamafile is a great way to get use an LLM locally. Inference is incredibly fast on my ARM macbook and rtx 4060ti, its okay on my Intel laptop running Ubuntu.