• DeathByBigSad@sh.itjust.works
      16 hours ago

      I thought about it again for a bit and checked how often each English letter appears as the first letter of a word. There’s a way to split the alphabet into two groups of 13 letters so that a word starting with a letter from one group reads as a “0” bit and a word starting with a letter from the other group reads as a “1” bit, with the first-letter frequencies in each group adding up to roughly 50%. That makes frequency analysis a bit harder, because 0s and 1s then show up about equally often in ordinary-looking text.
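A rough sketch of how such a split could be found. The frequency table is an approximation I'm assuming for illustration (swap in numbers measured from your own corpus), and `split_balanced` is a made-up helper name. It uses a simple greedy pass, which gets close to 50/50 but isn't guaranteed to find the best possible split.

```python
# Approximate first-letter frequencies in English, in percent.
# ASSUMPTION: rough illustrative values, not measured data.
FIRST_LETTER_FREQ = {
    "t": 16.0, "a": 11.7, "o": 7.6, "i": 7.3, "s": 6.7, "w": 5.5,
    "c": 5.2, "b": 4.4, "p": 4.3, "h": 4.2, "f": 4.0, "m": 3.8,
    "d": 3.2, "r": 2.8, "e": 2.8, "l": 2.4, "n": 2.3, "g": 1.6,
    "u": 1.2, "k": 0.86, "v": 0.82, "y": 0.76, "j": 0.51, "q": 0.22,
    "x": 0.045, "z": 0.045,
}


def split_balanced(freq, group_size=13):
    """Greedily split letters into two equal-sized groups with similar totals."""
    groups = ([], [])
    totals = [0.0, 0.0]
    # Place the most frequent letters first, always into the lighter group.
    for letter in sorted(freq, key=freq.get, reverse=True):
        idx = 0 if totals[0] <= totals[1] else 1
        if len(groups[idx]) >= group_size:
            idx = 1 - idx  # lighter group is full, use the other one
        groups[idx].append(letter)
        totals[idx] += freq[letter]
    return groups, totals


groups, totals = split_balanced(FIRST_LETTER_FREQ)
print("0-bit letters:", "".join(sorted(groups[0])), round(totals[0], 2))
print("1-bit letters:", "".join(sorted(groups[1])), round(totals[1], 2))
```

Which group means “0” and which means “1” is arbitrary, as long as both sides agree on it.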

      Then for coded text:

      You’d use 6 bits per character. So every 6 words you write in the cover text (the steganographic message) give back 6 bits, aka 1 character (6 words to hide 1 character is a lot of work, I know). 6 bits can represent 64 different values. 26 of them are enough for one a-z character set, and the remaining 38 can be used to make frequency analysis harder. For example, the binary value for “27” could stand for a second “e”, since “e” is the most common letter in English, so now “05” and “27” both represent e. Do the same with the other frequently appearing letters, which are “Etaoin Shrdlu” (I know it off the top of my head since I was obsessed with ciphers as a kid and looked up cryptanalysis on Wikipedia a few years ago).
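Here's roughly what reading the hidden text back out would look like: one bit per word, six bits per character. The letter grouping in `ZERO_LETTERS` is a hypothetical example and the function names are made up. It uses the plain a=1 ... z=26 numbering (so e is 5, like the “05” above) and ignores the extra codes for now.

```python
import string

# HYPOTHETICAL grouping: words starting with these letters mean a 0 bit,
# words starting with any other letter mean a 1 bit.
ZERO_LETTERS = set("tiwbhmrlgkvjz")


def words_to_bits(text):
    """Turn each word into one bit based on its first letter."""
    bits = []
    for word in text.split():
        # Skip leading punctuation such as quotes or brackets.
        first = next((c for c in word.lower() if c in string.ascii_lowercase), None)
        if first is not None:
            bits.append(0 if first in ZERO_LETTERS else 1)
    return bits


def bits_to_values(bits, width=6):
    """Pack bits into numbers, `width` bits at a time (leftover bits are dropped)."""
    values = []
    for i in range(0, len(bits) - width + 1, width):
        value = 0
        for bit in bits[i:i + width]:
            value = (value << 1) | bit
        values.append(value)
    return values


def values_to_text(values):
    """Map 1..26 to a..z; anything else is shown as '?' in this sketch."""
    return "".join(chr(ord("a") + v - 1) if 1 <= v <= 26 else "?" for v in values)


cover = "The big house always looks old"
print(values_to_text(bits_to_values(words_to_bits(cover))))  # prints: e
```

The six words start with t, b, h, a, l, o, which read as 0 0 0 1 0 1, i.e. 5, i.e. “e”.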

      Of course, these are just examples; don’t actually make your “e” = “05”. The assignments could be e=61, e=53, e=23, t=2, t=4, etc. Basically, one letter has multiple representations, and the more frequent a letter is, the more codes it should get, until the codes come out roughly evenly distributed.
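A sketch of how the 64 codes could be handed out, assuming approximate overall English letter frequencies (the table is illustrative, and `build_code_table`, `encode` and `decode` are made-up names). Every letter gets one code, then each spare code goes to whichever letter currently carries the most frequency per code. The fixed seed is only there to make the example repeatable; a real table would have to be random and kept secret between the two people.

```python
import random

# Approximate overall letter frequencies in English, in percent.
# ASSUMPTION: rough illustrative values, not measured data.
LETTER_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 0.98, "k": 0.77, "j": 0.15, "x": 0.15,
    "q": 0.095, "z": 0.074,
}


def build_code_table(freq, n_codes=64, seed=0):
    """Give every letter at least one code, and frequent letters several."""
    counts = {letter: 1 for letter in freq}
    for _ in range(n_codes - len(freq)):
        # The letter whose codes would each appear most often gets another one.
        neediest = max(freq, key=lambda l: freq[l] / counts[l])
        counts[neediest] += 1
    # Shuffle the code numbers so they don't follow alphabetical order.
    codes = list(range(n_codes))
    random.Random(seed).shuffle(codes)
    table, pos = {}, 0
    for letter, n in counts.items():
        table[letter] = codes[pos:pos + n]
        pos += n
    return table


def encode(text, table, rng=random):
    """Pick one of the letter's codes at random; non-letters are skipped."""
    return [rng.choice(table[ch]) for ch in text.lower() if ch in table]


def decode(values, table):
    reverse = {code: letter for letter, codes in table.items() for code in codes}
    return "".join(reverse[v] for v in values)


table = build_code_table(LETTER_FREQ)
hidden = encode("meet me here", table)
print(hidden)
print(decode(hidden, table))
```

Each number in `hidden` is then written out as 6 bits, so as 6 words in the cover text.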

      (I’m not an expert tho, in a post-chat-control world, maybe you can email some cryptography/steganography expert and ask for better advice.)