• LainOfTheWired@lemy.lol
    English
    arrow-up
    9
    ·
    11 months ago

    On a different note, how do these big companies train AIs to detect CSAM without using a bunch of illegal CSAM to train them?

    • FaceDeer@kbin.social
      arrow-up
      17
      ·
      11 months ago

      It’s perverse how the laws are so ultra-strict that you can break them by making an attempt to comply with them. The article describes how, at several points, the researchers had to “outsource” part of their work to people in less-strict jurisdictions. And LAION itself is based in Germany, which adds yet another jurisdiction to the situation.

      CSAM always turns into a ridiculous minefield. So many different jurisdictions and different definitions, and everyone is ultra adamant about theirs being the one that must be enforced globally.

    • fishos@lemmy.world
      English
      arrow-up
      6
      arrow-down
      1
      ·
      edit-2
      11 months ago

      I’ve heard there are specific data sets you can download that have the training data, but not the images themselves. Someone else already ran the images through a training model and you’re just grabbing the processed data and plugging it into your model. I’m sure I’m missing some nuance and haven’t looked into it myself, but I’ve seen that given as the answer when someone asked before.
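To make the "processed data, not the images" idea concrete, here is a minimal sketch of training on precomputed feature vectors alone. Everything in it is an assumption for illustration: the two-number vectors, the labels "a" and "b", and the nearest-centroid method are placeholders, not how any real dataset or provider works.

```python
from math import dist

def centroid(vectors):
    """Average a list of equal-length feature vectors column by column."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(features, labels):
    """Build one centroid per label from precomputed feature vectors."""
    by_label = {}
    for vec, label in zip(features, labels):
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(model, vec):
    """Return the label whose centroid is closest to the given vector."""
    return min(model, key=lambda label: dist(model[label], vec))

# Hypothetical precomputed features; the original files are never needed here.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = ["a", "a", "b", "b"]

model = train(features, labels)
print(predict(model, [0.85, 0.15]))
```

The point is only that the training step sees numbers derived from the files, never the files themselves.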

      • piecat@lemmy.world
        English
        arrow-up
        3
        ·
        11 months ago

        IIRC from a previous thread, different law enforcement agencies will release hashes or similar so the image can be detected without distributing the original.
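A rough sketch of what hash-list matching could look like, assuming a plain SHA-256 digest and a made-up hash list built from harmless placeholder bytes. The function names and the list are hypothetical. An exact cryptographic hash only matches byte-identical files, so this is the simplest possible version of the idea rather than what any agency actually ships.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_flagged(data: bytes, known_hashes: set[str]) -> bool:
    """Check whether the file's digest appears in the distributed hash list."""
    return sha256_hex(data) in known_hashes

# Hypothetical hash list: only digests are shared, never the original files.
known_hashes = {sha256_hex(b"placeholder-flagged-file")}

print(is_flagged(b"placeholder-flagged-file", known_hashes))
print(is_flagged(b"placeholder-flagged-file!", known_hashes))
```

The second call differs by one byte and no longer matches, which is why exact hashing alone misses re-encoded or resized copies.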