• galaxy_nova@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    edit-2
    6 hours ago

    Huh does that actually work?

    Edit: I realize it probably should given my understanding of tokenization but if it’s training data couldn’t it easily be replaced with like a regex or something?

    • Drusenija@aussie.zone
      link
      fedilink
      English
      arrow-up
      1
      ·
      3 hours ago

      It probably could if everyone did it the same way. But I suspect that isn’t what’s happening, so while our brains pattern recognition the message reasonably easily regardless of the substitution, doing that at scale with regex would be a lot more difficult.