October 02, 2020
Have you ever wanted to hash/randomize a primary key or other private information, but also wanted to be able to refer to it with a memorable phrase? Probably not, but with my new package available on CRAN, keyToEnglish, now you can!
The package primarily revolves around one function, keyToEnglish(), that does the following:
character type if needed

Here's an example:
# install with `install.packages("keyToEnglish")`
library(keyToEnglish)

email_addresses = c(
  'anika_harmonica@themangoblues.com',
  'billy_silly@someclowncollege.com',
  'carys_ferrous@steel-rarebit.org',
  'diarra_tiara@coffeequeens.net',
  'eri_merry@joifish.org'
)

print(keyToEnglish(email_addresses))
## [1] "GeneralPurityTunnelSpellingFeeding" "PrintInfusionAdverseEngraveCentral" "IndependentEffectiveFlavorConsistWall"
## [4] "HearPigmentCouncilTeacherPressing" "UnitComplexionElderConstitutionFellowship"
Alternatively, you can provide a list of word lists, and the output will string together one word from each list, in the order the lists appear. Note that for best results, the least common multiple of the sizes of all of the lists should be relatively small. I usually make all of my list sizes powers of 2 in order to accomplish this.
# hash to a memorable sentence
# equivalent to
# print(hash_to_sentence(email_addresses))
print(keyToEnglish(email_addresses, word_list=wml_long_sentence))
## [1] "EruditeMoltenPetalResurrectsEmbossedLingonberries" "HelplessWideChicoryBifurcatesDitsyNecks"
## [3] "CapriciousKitchPartnerGluesShinyGrime" "EnchantedGlassRockstarChainsLaqueredGauntlets"
## [5] "HauntedRainbowShinerObliteratesOrangeCounts"
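The advice about list sizes can be checked before hashing anything. The following is a minimal base-R sketch, not part of keyToEnglish: the helper names gcd2(), lcm2(), and lcm_of_sizes() and the example sizes are made up for illustration. It computes the least common multiple of a set of list sizes, which stays equal to the largest size when every size is a power of 2 and grows quickly otherwise.

```r
# Hypothetical helpers (not from keyToEnglish): greatest common divisor
# and least common multiple of two positive integers.
gcd2 <- function(a, b) if (b == 0) a else gcd2(b, a %% b)
lcm2 <- function(a, b) a / gcd2(a, b) * b

# Fold lcm2() over a vector of list sizes.
lcm_of_sizes <- function(sizes) Reduce(lcm2, sizes)

# Illustrative sizes only
power_of_two_sizes <- c(8, 256, 2048)
mixed_sizes <- c(7, 250, 2000)

lcm_pow2 <- lcm_of_sizes(power_of_two_sizes)  # equals the largest size
lcm_mixed <- lcm_of_sizes(mixed_sizes)        # larger than any single size

print(c(lcm_pow2, lcm_mixed))
```

For a list of word lists x, lengths(x) gives the vector of sizes to pass in.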
You can also define your own word lists:
custom_word_lists = list(
  sizes=c('infintesimal','miniscule','tiny','small','average','big','huge','astronomical'),
  colors=c('red','blue','green','yellow','orange','purple','pink','brown'),
  nouns=c('monkey','parrot','kitty','newt','fish','buffalo','wasp','octopus'),
  of='of',
  nouns2=c('doom','love','chaos','happiness','anger','sadness','swoleness','alacrity')
)

keyToEnglish(
  email_addresses,
  word_list=custom_word_lists
)
## [1] "AstronomicalYellowOctopusOfChaos" "BigBlueKittyOfChaos" "MinisculeBlueKittyOfSwoleness"
## [4] "InfintesimalYellowMonkeyOfHappiness" "MinisculeOrangeBuffaloOfSadness"
Of course, this only has 4,096 unique combinations (8 × 8 × 8 × 1 × 8). If you want to estimate how many keys you can generate before the probability of a collision reaches a given threshold (1% in the call below), you can use the function uniqueness_max_size(), which approximates this number:
print(uniqueness_max_size(4096, 0.01))
## [1] 9.073718
Surprisingly, it is only 9. Using the wml_long_sentence multi-wordlist, the value is a bit higher:
print(uniqueness_max_size(wml_long_sentence, 0.01))
## [1] 19028965
That is about 19 million, which is more than enough for most applications. As a general rule, the number of keys you can generate before a collision becomes likely is proportional to the square root of the number of possible combinations.
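The square-root relationship can be sketched with the standard birthday-problem approximation. This is an illustration in base R, not the package's source: the function name approx_max_keys() is hypothetical, and the formula sqrt(-2 * N * log(1 - p)) is the textbook approximation, which lands close to the values printed above.

```r
# Hypothetical helper (not part of keyToEnglish): approximate number of keys
# that can be drawn from n_combinations equally likely phrases before the
# probability of at least one collision reaches p_collision.
approx_max_keys <- function(n_combinations, p_collision) {
  sqrt(-2 * n_combinations * log(1 - p_collision))
}

# Sizes of the five custom word lists defined earlier (8, 8, 8, 1, 8)
custom_sizes <- c(8, 8, 8, 1, 8)
n_custom <- prod(custom_sizes)  # total number of combinations

print(n_custom)
print(approx_max_keys(n_custom, 0.01))  # roughly 9

# Quadrupling the number of combinations only doubles the estimate
growth_ratio <- approx_max_keys(4 * n_custom, 0.01) / approx_max_keys(n_custom, 0.01)
print(growth_ratio)
```

This is why adding one more small word list helps less than it might seem: the number of combinations has to grow fourfold to double the number of safe keys.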
In case you just want random strings, you can also run generate_random_sentences(). Note that this uses the openssl package to generate random numbers, but if you want to use set.seed(), or just run it faster, you can use the fast parameter.
print(generate_random_sentences(5))
## [1] "Hellish black secessionist illuminates moist chevaliers." "Harmonious nylon pus bifurcates maroon diamonds."
## [3] "Mysterious galvanized atom manufactures pink rices." "Calculating glossy gauge inverts oak demons."
## [5] "Nutty drenched lime condemns pyrite dirks."
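To make the random sentences reproducible, the fast parameter mentioned above can be combined with set.seed(). The snippet below is a sketch that assumes fast accepts a logical value (fast = TRUE), which the text above does not spell out. It is guarded so that it only runs if keyToEnglish is installed.

```r
# Assumption: `fast = TRUE` switches generate_random_sentences() to R's own
# random number generator, so that set.seed() controls the output.
if (requireNamespace("keyToEnglish", quietly = TRUE)) {
  library(keyToEnglish)

  set.seed(2020)
  first_run <- generate_random_sentences(3, fast = TRUE)

  set.seed(2020)
  second_run <- generate_random_sentences(3, fast = TRUE)

  # If the assumption holds, both runs produce the same sentences
  print(identical(first_run, second_run))
}
```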
Several word lists/word multi-lists are included with this package in order to make it easier to run code out-of-the-box:
wl_common - A list of 5,000 common English words.
wl_nouns_concrete - A list of 2,048 concrete (non-abstract) nouns. These are generally countable in some contexts.
wl_nouns_concrete_plural - A list of 2,048 plural, concrete (non-abstract) nouns. These are generally countable in some contexts.
wl_adjectives_visual - A list of 256 adjectives that can be used to visually describe something, often what it is made of.
wl_adjectives_nonorigin - A list of 256 adjectives that do not describe something's origin, and can usually be used before the visual adjectives above.
wl_verbs_transitive_infinitive - A list of 256 transitive verbs in infinitive form (without "to").
wl_verbs_transitive_present - A list of 256 transitive verbs in present tense.
wl_verbs_transitive_gerund - A list of 256 transitive verbs in gerund form (ends with "ing").
wml_long_sentence - A word multi-list that can be used to form somewhat long sentences using nouns, adjectives, and a verb.
wml_cutephysics - A small word multi-list that combines written numbers, physics nouns, some adjectives, and cute things.
wml_animals - A small word multi-list that has sizes, colors, animals, and an emotion or attribute.

Also, I hashed the name "Anika", and this is a phrase it came up with:
Cthonian Granite Halcyon Pressurizes Bronze Scammers
I thought it would be funny to try (poorly) drawing it, so here's the result:
"https://github.com/mcandocia/keyToEnglish"