• boonhet@sopuli.xyz
    English
    arrow-up
    11
    arrow-down
    4
    ·
    9 hours ago

    Uh, that’s not how this works. They’re not open source, they’re open weight. Well, the smaller distillations are. The big ones are still closed. And it takes a bunch of compute to train them, but they’ve learned to be thriftier since they don’t have access to nearly as much parallel compute as the American companies right now. The models also tend to trail in performance. It takes a lot less to compete for third place than for first.

    • AbsolutePain@lemmy.world
      English
      arrow-up
      4
      ·
      7 hours ago

      Why is this being downvoted? It’s factually true.

      I’d love actual open source training somehow. But at the moment I don’t think an asynchronous training mechanism that would enable this exists, given that running the flagship models on even a small batch of data requires massive compute power.

    • YoureHotCupCake@lemmy.world
      English
      arrow-up
      6
      arrow-down
      5
      ·
      9 hours ago

      DeepSeek is one of the most popular Chinese AI models, if not the most popular. It is open source and requires a small amount of computing compared to others. It’s used in numerous Chinese car brands, smartphones, and even government services throughout China.

      China isn’t competing for third; they are leading the world in AI development and have already integrated it in many areas.

      • Hugucinogens@lemmy.blahaj.zone
        English
        arrow-up
        8
        arrow-down
        1
        ·
        edit-2
        7 hours ago

        Again, at the very least, no, they’re not open source. Open-weights means anyone can download, use, and tinker with them a bit, but there is no access to their code, training data, or process.

        It’s just as limiting as closed-source software with some modding allowed, but not as limiting as an online, API-only model, as many of the most powerful modern models are.

        There are no heroes in the Global Powers’ race. The USA is a comically cartoonish villain in real life, yes, but the biggest Chinese data centres for all that training are still built in poor areas (Inner Mongolia, with all the bullshit China has apparently inflicted on the people there), and they are still fucking over those who live there.

        It’s abuse all the way down.

        • YoureHotCupCake@lemmy.world
          English
          arrow-up
          2
          ·
          2 hours ago

          You mean all of this code that is clearly on their GitHub: https://github.com/deepseek-ai? They release both their model weights and the source code for their AI. You can literally take what they have provided to create your own LLM if you’d like, and get a good understanding of their AI. Sure, you can’t see the training data, but that would be like putting the entirety of the internet in a GitHub repo, which just isn’t feasible. You can still contribute your own training data to a local setup of DeepSeek and shape it the way you want.

        • AlteredEgo@lemmy.ml
          English
          arrow-up
          2
          ·
          4 hours ago

          Haven’t they clearly documented how they did it and what they used so that anyone can replicate it? Anyone with the compute power, which of course few have. But universities could do it.

          So how is it not open source in this specific domain of problems? What would an LLM need to do to be open source then? Duplicate the whole training dataset in a big zip file for you to download?

          From what I understand, you could even replicate DeepSeek by replacing the “cold start” with the latest DeepSeek instead.

          • metermatic26@lemmy.world
            English
            arrow-up
            1
            ·
            4 hours ago

            Why even have this discussion? Self-learning algorithms appeared more than ten years ago. AI is being used very effectively in countless areas.

            The idea that there is some sort of prize waiting for whoever gets the most computing power is highly dubious.