• Mearcfara@lemmy.ml
    link
    fedilink
    English
    arrow-up
    25
    ·
    1 day ago

    I just wish we could invest the time/money/resources into compressing AI and making it smaller and more efficient. I’d so much rather have a somewhat capable AI that can be run locally and offline, to outsource menial tasks to like alphabetizing spreadsheets and so basic image modification, than to have to upgrade my hardware constantly or use cloud based SaaS and/or have newer models that are more accurate in their predictions.

    Of course that assumes a lot of things, like the intent to help people and not make money. Maybe someone in the Linux-sphere will make something.

    • nforminvasion@lemmy.world
      link
      fedilink
      English
      arrow-up
      3
      ·
      7 hours ago

      Look into Bonsai Ternary models. They’re “1.5” bit models that have to be trained that way (so no taking a full model and quantizing it down) but they are so efficient and they can run on CPU only, though it’s a bit alpha at the moment. Really cool company and projects.

      You have to create a specific environment for them though, using Bonsai’s GGUF version which enables them to run properly. So unfortunately, no use in LM Studio yet.

    • HubertManne@piefed.social
      link
      fedilink
      English
      arrow-up
      4
      ·
      21 hours ago

      I would like to see one integrated into a gnu os like linux where its only capability is to understand the os and guide you through it. No generation and no expertise outside the os exosystem. Maybe allow for it to be given the privelege to search the web. I would have it have capability to use other ais to perform other tasks so modules or whatnot could be added to give it more capability as a general computer butler type. Basically an os that acted like a start trek computer.

      • dil@lemmy.zip
        link
        fedilink
        English
        arrow-up
        2
        ·
        2 hours ago

        I want clippy but actually useful with all software, just giving tips when needed, ai can be useful sometimes, idk like im bad at math always have been, I need to sort some curves by index recentlly and it helped with the math logic a lot, otherwise I was using a repeat node and it was a lot slower than the way it showed me. Downside ofc was the ai way isn’t fully accurate or implementable as they say, has to be modified, it makes up nodes that don’t exist, but there are similar ones.

        • MangoCats@feddit.it
          link
          fedilink
          English
          arrow-up
          1
          ·
          1 hour ago

          Increasingly, people ask me questions, send me screen shots, I copy-paste that into gpt, gpt’s answers are helpful and correct… they have access to the same (free to use) gpt themselves…

          • dil@lemmy.zip
            link
            fedilink
            English
            arrow-up
            1
            ·
            37 minutes ago

            People ask humans because they want to interact with humans, just say you don’t know and they’ll ask ai themselves, unnecessary middlemannimg for ego boost is weird

          • dil@lemmy.zip
            link
            fedilink
            English
            arrow-up
            1
            ·
            38 minutes ago

            Please don’t compare how I use AI to how you do, I hope I never ask you a question and trust you like I would a human

      • SilentKnightOwl@slrpnk.net
        link
        fedilink
        English
        arrow-up
        2
        ·
        7 hours ago

        Using Pi agent with qwen 3.6 35b a3b running with llamacpp on my GPU feels a lot like that. I have a script that watches my downloads folder and keeps it organized, and it used to just get the file extensions and move things based on type, now with a local llm in the loop, it moves things based on what it is, and what it is for. If I download a PDF file from work, it automatically reads the first page, figures out what its about, and moves it to my work documents.

        “Whichever port that docker container is on, make it this one”

        • MangoCats@feddit.it
          link
          fedilink
          English
          arrow-up
          2
          ·
          1 hour ago

          They’re really good at digging for stuff, like: this app is reporting the git hash it was built from - somewhere in the log files - go read that and show me which branch that hash appears on (hash is 8 commits back in some branch…) Yeah, I could do that myself, but why would I if I don’t have to?

    • ZephyrXero@lemmy.world
      link
      fedilink
      English
      arrow-up
      10
      ·
      1 day ago

      There are efforts there. The new Deepseek 4 compresses a lot of its knowledge using something they call engrams. But it’s unfortunately still too big for a consumer GPU.

      Gemma 4 is small enough to run on your cellphone.

      If your GPU has at least 8GB there are a lot of options for self hosting your own local models

    • petersr@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      1 day ago

      If I understand correctly, if we actually said “this model is great, let’s put a pin in it”, then it could be turned into a dedicated chip that would be much more efficient and perhaps even something that could get embedded in consumer hardware - but then you are just stuck with that model instead of “the next shiny new model” that they keep making.

      • Mearcfara@lemmy.ml
        link
        fedilink
        English
        arrow-up
        2
        ·
        2 hours ago

        This sent me for a loop.

        I don’t mind older stuff- my car is from the late '10s, and was a few years old when I got it, but blew my mind compared to my last car from the mid '00s. It has a back up camera! And even though my car is now nearing 10 years old, my experience hasn’t changed. I’m still driving on mostly the same roads using the same method. And, when I have to get a new car, I’m sure I’ll marvel at remote start or whatever.

        But what’s a bummer is the idea that someone else can decide that the hardware is no longer adequate- that “you must have the newest experience”. I simply don’t want that. Yes, it’s annoying that my phone has to be plugged in to access carplay, while new cars have it over bluetooth, but I didn’t even know it was that way until I got a rental recently.

        So for AI, i’m okay with some shortcomings, because I can get to know the software and work with it, and if the shortcoming is a show stopper, then I can seek to upgrade or just not do what I was trying to do with my older gen AI.

        But alas, the number must go up so the shareholders can rub their stocks or whatever

    • BrightCandle@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      1 day ago

      I feel like there is a future of more targeted AI. At the moment something that does spreadsheets has to carry knowledge of programming and chemistry and lots of languages and this seems very heavy for what ultimately we need. A programming language focussed AT dedicated to Rust or Go or Java could potentially be quite a bit smaller especially if they focussed on algorithm snippet and auto complete smarts. There is definitely a market for smaller more targeted uses than these all encompassing chat bots where the goal is to move the state of the art on for existing algorithms.