• TheTechnician27@lemmy.world
    English
    arrow-up
    53
    ·
    edit-2
    2 days ago

    If anyone has specific questions about this, let me know, and I can probably answer them. Hopefully I can be to Lemmy and Wikimedia what Unidan was to Reddit and ecology before he crashed out over jackdaws and got exposed for vote fraud.

    • xinayder@infosec.pub
      English
      arrow-up
      4
      ·
      1 day ago

      How do I get started on contributing to new articles (written by a human) for my language? I always wanted to help out but never found an easy way to do so.

      • TheTechnician27@lemmy.world
        English
        arrow-up
        5
        ·
        1 day ago

        I’m going to write this from the perspective of the English Wikipedia, but most specifics should have some analog in other Wikipedias. By “contribute to new articles”, do you mean create new articles, contribute to articles which are new that you come across, or contribute to articles which you haven’t before (thus “new to you”)? Asking because the first one has a very different – much more complicated – answer from the other two.

        • xinayder@infosec.pub
          English
          arrow-up
          2
          ·
          1 day ago

          Both. How do I get started creating a new article, and how do I contribute to them, or other articles?

          • TheTechnician27@lemmy.world
            English
            arrow-up
            6
            ·
            edit-2
            10 hours ago

            The short answer is that I really, really suggest you try other things before trying to create your first article. This isn’t just me; every experienced editor will tell you that creating a new article is one of the hardest things any editor can do, let alone a newer one. It’s why the task center lists it as being appropriate for “advanced editors”. Finding an existing article which interests you and then polishing and expanding it is almost always more rewarding, more useful, easier, and less stressful than creating an article from scratch. And if creating articles sounds appealing, expanding existing stub articles is great experience for that.

            The long answer is “you can”, but it’s really hard:

            • New editors are subject to Articles for Creation, or AfC, when creating an article. The article sits in a draft state until the editor flags it for review. The backlog is very long, and while reviewers can go in any order they want, they usually prioritize the oldest articles out of fairness and because most AfC submissions are about equal in urgency and time consumption. “Months” is the expected waiting time.
            • If you’re not using the English Wikipedia, you can try translating over a well-established article from English. There’s no rule that says sources have to be in the language of the Wikipedia they’re on, although it’s still considered a big plus if sources are in the same language. You’d have to keep in mind that the target language may have standards not followed on the English Wikipedia.
            • Wikipedia’s notability guidelines are predicated on you understanding other policies and guidelines like “reliable sources” and “independent sources”. They’re also intentionally fuzzy so people don’t play lawyer and follow the exact letter without considering the spirit of the guideline.
            • The English Wikipedia currently has over 7 million articles. There are still a lot of missing articles (mostly in taxonomy, where notability is almost guaranteed), but you really need to know where to look.
            • When choosing an article subject, it’s extremely important to avoid a conflict of interest (COI).
            • Assuming you have a subject you think meets criteria, now you have to go out and find reliable, independent sources with substantial coverage of the subject to confirm your hypothesis.
            • Now you need to start the article, and you need to do this in a manner which:
              • Is verifiable (all claims are cited)
              • Is not original research (i.e. nothing you say can be based on “because I know it”)
              • Is reliable (all citations are to reliable sources)
              • Is neutral (you’ve minimized bias as much as you can, let the sources speak for themselves, and made sure your source selection isn’t biased)
              • Is stylistically correct (there’s a manual of style, but just use your best judgment, and small mistakes can be copy-edited out by people familiar with style guidelines)
            • If the article is nominated for deletion, you have to keep your cool and argue based solely on guidelines (not on perceived importance of the subject) that the article should be kept.
            • New articles are almost always given more scrutiny than articles which have been around; this isn’t a cultural problem as much as it is a heuristic one.
            • An article deleted feels much more personal than edits reverted (despite the fact that subject notability is 100% out of your control).

            Some of these apply to normal editing too, but working within an article others have worked on and might be willing to help with is vastly easier than building one from scratch. If you want specific help in picking out, say, an article to try editing and are on the English Wikipedia, I have no problem acting like bowling bumpers if you’re afraid your edits won’t meet standards.

    • ℍ𝕂-𝟞𝟝@sopuli.xyz
      English
      arrow-up
      8
      ·
      2 days ago

      Is there a danger that unscrupulous actors will try and build out a Wikipedia edit history with this and try to mass skew articles with propaganda using their “trusted” accounts?

      Or what might be the goal here? Is it just stupid and bored people?

      • TheTechnician27@lemmy.world
        English
        arrow-up
        22
        ·
        edit-2
        2 days ago

        So Wikipedia has three methods for deleting an article:

        • Proposed deletion (PROD): An editor tags an article explaining why they think it should be uncontroversially deleted. After seven days, an administrator will take a look and decide if they agree. An article can only be proposed for deletion once, the tag can be removed by anyone passing by who disagrees with it, and an article deleted via PROD can be recreated at any time.
        • Articles for deletion (AfD): A discussion is held to delete an article. Pretty much always, this is about the subject’s notability. After the discussion (a week by default), a closer (almost always an administrator, especially for contentious discussions) will evaluate the merits of the arguments made and see if a consensus has been reached to e.g. delete, keep, redirect, or merge. Articles deleted via discussion cannot be recreated until they’ve satisfied the concerns of said discussion, else they can be summarily re-deleted.
        • Speedy deletion: An article is so fundamentally flawed that it should be summarily deleted at best or needs to be deleted as soon as possible at worst. The nominating editor will choose one or more of the criteria for speedy deletion (CSD), and an administrator will delete the article if they agree. Like a PROD, articles deleted this way can be recreated at any time.

        This new criterion has nothing to do with preempting the kind of trust building you described. An editor who creates one of these articles will not be treated any differently than they would have been without this criterion. It’s there so editors don’t have to deal with the bullshit asymmetry principle and comb through everything to make sure it’s verifiable. Sometimes editors will make these LLM-generated articles because they think they’re helping but don’t know how to do it themselves, sometimes it’s for some bizarre agenda (e.g. there’s a sockpuppet editor who’s been occasionally popping up trying to push articles generated by an LLM about the Afghan–Mughal Wars), but whatever the reason, it does nothing but waste other editors’ time, and the content can effectively be considered unverified. All this criterion does is expedite the process of purging their bullshit.

        I’d argue meticulously building trust to push an agenda isn’t a prevalent problem on Wikipedia, but that’s a very different discussion.

        • ℍ𝕂-𝟞𝟝@sopuli.xyz
          English
          arrow-up
          5
          ·
          2 days ago

          Thank you for your answer, I really feel happy that Wikipedia is safe then. Stuff happening nowadays makes me always think of the worst.

          Do you think your problem is similar to open-source developers fighting AI pull requests? There it was theorised that some people try to train their models by making them submit code changes and abuse the maintainers’ time and effort to get training data.

          Is it possible that this is an effort to steal work from Wikipedia editors to get you to train their AI models?

          • TheTechnician27@lemmy.world
            English
            arrow-up
            8
            ·
            2 days ago

            “Is it possible that this is an effort to steal work from Wikipedia editors to get you to train their AI models?”

            I can’t definitively say “no”, but I’ve seen no evidence of this at all.

    • baltakatei@sopuli.xyz
      English
      arrow-up
      7
      ·
      2 days ago

      How frequently are images generated/modified by diffusion models uploaded to Wikimedia Commons? I can wrap my head around evaluating cited sources for notability, but I don’t know where to start determining the repute of photographs. So many images Wikipedia articles use are taken by seemingly random people not associated with any organization.

      • TheTechnician27@lemmy.world
        English
        arrow-up
        8
        ·
        2 days ago

        So far, I haven’t seen all that many, and the ones I have seen are very obvious, like a very glossy crab at the beach wearing a Santa Claus hat. I definitely have yet to see one that’s undisclosed, let alone actively disguising itself. I also have yet to see someone try using an AI-generated image on Wikipedia. Disclosing generative AI usage is made trivial in the upload process by an obvious checkbox, so the only reason to skip it is to straight-up lie.

        I can’t say how much of an issue this will be in the future or what good steps would be for finding and eliminating it should it become one.

        • jungle@lemmy.world
          English
          arrow-up
          1
          ·
          2 days ago

          How would you know if an image is AI generated? That was easy to do in the past, but have you seen what they are capable of now?