I don’t really want companies or anyone else deciding what I’m allowed to see or learn. Are there any AI assistants out there that won’t say “sorry, I can’t talk to you about that” if I mention something modern companies don’t want us to see?

  • Cease@mander.xyz · 11 hours ago

    Actually not 100% true: you can offload a portion of the model into RAM to save VRAM, so you can skip the crazy-expensive GPU and still run a decent model; it just takes a bit longer. I personally can wait a minute for a detailed answer instead of needing it in 5 seconds, but of course YMMV.
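
    To make that concrete, this is what llama.cpp calls layer offloading: you pick how many transformer layers live in VRAM and the rest run on the CPU out of system RAM. A minimal sketch against llama.cpp's C API (exact function names have shifted between versions, so treat this as illustrative, not canonical):

    ```c
    #include "llama.h"
    #include <stdio.h>

    int main(void) {
        llama_backend_init();

        struct llama_model_params params = llama_model_default_params();
        // Keep 20 transformer layers in VRAM; the rest stay in system
        // RAM and run on the CPU. Lower this number to fit a bigger
        // model on a smaller GPU, at the cost of tokens/sec.
        params.n_gpu_layers = 20;

        struct llama_model *model =
            llama_load_model_from_file("model.gguf", params);
        if (!model) {
            fprintf(stderr, "failed to load model\n");
            return 1;
        }

        /* ...create a context, tokenize, and generate as usual... */

        llama_free_model(model);
        llama_backend_free();
        return 0;
    }
    ```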

    • SpicyTaint@lemmy.world · 6 minutes ago

      Is there a general term for the setting that offloads the model into RAM? I’d love to be able to load larger models.

      I thought CUDA was supposed to just treat VRAM and regular RAM as one resource, but that doesn’t seem to be correct.
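
      CUDA can do something like that, but only as an opt-in feature called Unified (managed) Memory: cudaMallocManaged hands back one pointer valid on both CPU and GPU, and the driver migrates pages between system RAM and VRAM on demand, so on supported platforms an allocation can exceed physical VRAM. Most inference runtimes allocate with plain cudaMalloc instead, which is why a model normally has to fit in VRAM unless the runtime offloads layers explicitly. A minimal sketch (compile with nvcc):

      ```c
      #include <cuda_runtime.h>
      #include <stdio.h>

      int main(void) {
          const size_t n = 1 << 20;
          float *buf = NULL;

          // One pointer, one address space: pages migrate between
          // system RAM and VRAM on demand, so this allocation may
          // exceed physical VRAM (with a performance penalty).
          if (cudaMallocManaged((void **)&buf, n * sizeof(float),
                                cudaMemAttachGlobal) != cudaSuccess) {
              fprintf(stderr, "allocation failed\n");
              return 1;
          }

          for (size_t i = 0; i < n; ++i)
              buf[i] = 1.0f;  // touched from the CPU...

          // ...and the same pointer could be passed to a GPU kernel here.

          cudaFree(buf);
          return 0;
      }
      ```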