US Needs Trillions To Stay Ahead of China in AI Race — Blackrock CEO Points to Pensions and Retirement Savings

stumu415@lemmy.zip · 2 months ago

boonhet@sopuli.xyz · 2 months ago

Uh that’s not how this works. They’re not open source, they’re open weight. Well, the smaller distillations are. The big ones are still closed. And it takes a bunch of compute to train them, but they’ve learned to be thriftier since they don’t have access to nearly as much parallel compute as the American companies right now. The models also tend to trail in performance. Takes a lot less to compete for third place than first.

AbsolutePain@lemmy.world · 2 months ago

Why is this being downvoted? It’s factually true.

I’d love actual open source training somehow. But at the moment I don’t think an asynchronous training mechanism that would enable this exists, given that running the flagship models on even a small batch of data requires massive compute power.

YoureHotCupCake@lemmy.world · 2 months ago

DeepSeek is one of, if not the, most popular Chinese AI models. It is open source and requires a small amount of compute compared to others. It’s used in numerous Chinese car brands, smartphones, and even government services throughout China.

China isn’t competing for third place. They are leading the world in AI development and have already integrated it in many areas.

Hugucinogens@lemmy.blahaj.zone · edit-2 2 months ago

Again, at the very least, no, they’re not open source. Open-weights means anyone can download, use, and tinker with them a bit, but there is no access to their code, training data, or process.

It’s just as limiting as closed source software with some modding allowed, but not as limiting as an online-api-only model, as many of the most powerful modern models are.

There are no heroes in the Global Powers’ race. The USA is a comically cartoonish villain in real life, yes, but the biggest Chinese data centres for all that training are still built in poor areas (Inner Mongolia and the bullshit that China has apparently inflicted on them), and are still fucking over those who live there.

It’s abuse all the way down.

YoureHotCupCake@lemmy.world · 2 months ago

You mean all of this code that is clearly on their github: https://github.com/deepseek-ai? They release both their model weights as well as the source code for their AI. You can literally take what they have provided to create your own LLM if you would like to and get a good understanding of their AI. Sure you can’t see the training data but that would be like putting the entirety of the internet in a github repo and just isn’t feasible, but you can contribute your own training data to a local setup of deepseek and shape it in a way you want to.

boonhet@sopuli.xyz · 2 months ago

The training data is as important as the source code here to replicate the end result. The weights are more like a binary distribution. You can run the model and you can technically edit it just like you can technically edit a binary file.

They also only release some libraries and tools for running the model if you have a set of weights (which they do graciously provide), but they do NOT release the source code for their training pipeline itself. That’s up to you to reverse engineer from the whitepapers. Right now even if you had the exact training data and the compute available, you could not train your own Deepseek V3.2, let alone V4.

xep@discuss.online · 2 months ago

If people on Lemmy can’t understand this I have no hope for the average person.

humanspiral@lemmy.ca · 2 months ago

> training data is as important as the source code here to replicate the end result

This is the nature of this flame war. Perfect replication of the end result, which is extremely opaque in how it works, is not nearly as important as having the weights, which you can post-train for any domain-specific or general improvement with any other dataset. That is also how the authors themselves would improve or change the weights further.
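To make the post-training point concrete, here is a toy sketch in plain Python. It is not DeepSeek code: a two-parameter linear model stands in for an LLM, and `released_weights`, `domain_data`, and `post_train` are all made-up names and values for illustration. The only thing it shows is that you can keep running gradient descent from published weights on your own dataset without ever seeing the original training data.

```python
# Toy illustration only: a tiny linear model stands in for an LLM.
# The "released" weights and the dataset are hypothetical values.

def predict(weights, x):
    w, b = weights
    return w * x + b

def mean_squared_error(weights, data):
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)

def post_train(weights, data, lr=0.05, steps=500):
    """Continue gradient descent from already-released weights on new data."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

# Published weights: we have these, but not the data that produced them.
released_weights = (2.0, 0.0)

# Our own domain-specific dataset (generated from y = 3x + 1).
domain_data = [(0.0, 1.0), (1.0, 4.0), (2.0, 7.0), (3.0, 10.0)]

before = mean_squared_error(released_weights, domain_data)
tuned_weights = post_train(released_weights, domain_data)
after = mean_squared_error(tuned_weights, domain_data)

print(f"error before: {before:.4f}, after: {after:.8f}")
```

What this does not do is reproduce how `released_weights` were obtained in the first place, which is the part the other side of the argument cares about.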

chloroken@lemmy.ml · 2 months ago

You’re talking like you know what you’re talking about, but you clearly are guessing. Knock it off. Don’t mask conjecture as fact.

AlteredEgo@lemmy.ml · 2 months ago

Haven’t they clearly documented how they did it and what they used so that anyone can replicate it? Anyone with the compute power, which of course few have. But universities could do it.

So how is it not open source in this specific domain of problems? What would an LLM need to do to be open source then? Duplicate the whole training dataset in a big zipfile for you to download?

From what I understand you could even replicate deepseek by replacing the “cold start” with latest deepseek instead.

boonhet@sopuli.xyz · 2 months ago

> Haven’t they clearly documented how they did it and what they used so that anyone can replicate it?

They don’t put up the actual code for their training pipeline though. It’s more of an “if you have enough engineers, you can do this too” whitepaper, because they wouldn’t want any rando training their own model.

Right now, even if you had the exact training set (which is a CRUCIAL part of an LLM, and you can NOT replicate the model without it), you couldn’t rebuild the thing exactly; you’d need to do a whole lot of extra work.

> So how is it not open source in this specific domain of problems?

You could call all proprietary software open source then. The UI and user manual describe what it does, so you can do your own engineering to duplicate the functionality.

sp3ctr4l@lemmy.dbzer0.com · 2 months ago

So it’s significantly closer to being ‘open’, in that qualified organizations can poke around with some of its innards and probably achieve something useful, in comparison to fully proprietary models where that is either legally impossible or extremely, absurdly expensive.

It’s sorta like …

… a compiled game that requires you to fully reverse engineer a lot of it to be able to mod it at all, while its license also states that doing that is illegal,

… a compiled game that is highly moddable via tools/APIs, and/or has a significant portion that is publicly well documented or simply source-available,

… and then a truly totally open source, libre game.

Yeah, not totally open source.

But functionally and practically closer to it.

I can mod the shit out of Half Life 2 or New Vegas or CyberPunk or Kenshi, but not so much with … I dunno, basically any live service game.

I still need those original core compiled exes for those games, but basically, many things I can fuck with relatively easily… and maybe if I really go nuts I can figure out how to hack or shim or hijack the exe to make NVSE or RedScript or ReKenshi or whatnot.

As compared to trying to mod HellDivers or Fortnite… near instant ban, most likely.

But also, if you don’t know much about how to make a mod, well you’re basically SoL in that department just the same for any kind of game, really.

metermatic26@lemmy.world · 2 months ago

Why even have this discussion? Self-learning algorithms appeared more than ten years ago. AI is being used very effectively in countless areas.

The idea that there is some sort of prize waiting for whoever gets the most computing power is highly dubious.

sp3ctr4l@lemmy.dbzer0.com · 2 months ago

Yet that idea is the entire basis of the US economy, at the moment.
