-
Notifications
You must be signed in to change notification settings - Fork 0
/
PROJECT_IDEA.txt
31 lines (23 loc) · 2.04 KB
/
PROJECT_IDEA.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
so this project started, because I saw a post on Reddit about an MI5 level code.
https://www.reddit.com/r/codes/comments/1bshvw/an_uncrackable_code_given_to_me_by_a_wwii_veteran/
it got me to thinking, exactly how to hide an english message in a string of numbers.
this method is to use a natural language processor to detect the percentage possibility that
a given string of numbers, can be turned into words that make a sentence.
first a test sentence is provided to check the encode/decode procedure.
if the test sentence can be decoded in this manner, then in all likelyhood the
given message can as well.
parameters may be adjusted later, like the calculation of the word value, which currently calculates each word
in the given dictionary by giving each letter a numerical value from 1 to 26 and adding the sum of he words letters.
this calculation should be possible to shift, in case the code's designer used different numerical values for different letters.
there are very many permutations of this. it would be a shame to miss that. but for the test, we will use A-Z equals 1-26.
if it fails, build a database table of all the possible combinations of 1-26 vs A-Z and run those permutations against the
permutions of the words permutations. this will not be simply done on a POC machine. the permutations list of words for
the test message in the reddit post was 10 GB.
the resulting values are matched with the values in the given message. each possible match is recorded for each 'word' value
in the message. Next a calculation is done to see how many possible permutations there are to make each
word value for each word be in a 'message format' with all the other words for which the word values matched.
this permutation number is then saved, and the 'message formats' are saved in the database.
these messages are then processed by the NLP and given a percentage value for the likelihood that at least
one sentence is contained in the message.
These messages are then checked manually, to determine the effectiveness of the test.
acnicholls