|
| randomCode() Function for PHP |
|
This page was created to explain the randomCode() function and its testing.
It's a very simple function, but all the other versions I found online had major security vulernabilities that effected
the randomness of the output and defeated the strength of the PRNG (Pseudo-Random Number Generator). The function is in the
next table, and can be used free of charge. Give me props in your code if you'd like, or maybe let me know if you use it.
I'd be interested to know where it's getting used (even if it's a really small site in an obscure part of the world).
|
| THE CODE |
////
// Returns a random code of the specified length, containing characters that are
// equally likely to be any of the digits, uppercase letters, or lowercase letters.
//
// The default length of 10 provides 839299365868340224 (62^10) possible codes.
//
// NOTE: Do not call wt_srand(). It is handled automatically in PHP 4.2.0 and above
// and any additional calls are likely to DECREASE the randomness.
////
function randomCode($length=10){
$retVal = "";
while(strlen($retVal) < $length){
$nextChar = mt_rand(0, 61); // 10 digits + 26 uppercase + 26 lowercase = 62 chars
if(($nextChar >=10) && ($nextChar < 36)){ // uppercase letters
$nextChar -= 10; // bases the number at 0 instead of 10
$nextChar = chr($nextChar + 65); // ord('A') == 65
} else if($nextChar >= 36){ // lowercase letters
$nextChar -= 36; // bases the number at 0 instead of 36
$nextChar = chr($nextChar + 97); // ord('a') == 97
} else { // 0-9
$nextChar = chr($nextChar + 48); // ord('0') == 48
}
$retVal .= $nextChar;
}
return $retVal;
}
|
| Security Info |
Entropy:
I used the BigInt Calculator (which I wrote) to figure out the number of different strings that are possible to be
generated for various different lengths of the string. This is done by taking the number of possible characters
(10 digits + 26 lowercase letters + 26 uppercase letters = 62 total characters) and raising it to the power of the
length. For example, a 10 characters string would have (62^10) = 839299365868340224 possible values.
The results were as follows:
| Code Length | Number of Possible Codes |
| 10 digits -> | 839299365868340224 |
| 16 digits -> | 47672401706823533450263330816 |
| 32 digits -> | 2272657884496751345355241563627544170162852933518655225856 |
| 64 digits -> | 5164973859965249179065154939717494269947658426266553960878
244596268481614842987330263563657801857314603738370932736 |
| 128 digits -> | 26676954974124325436517176620537113043668346978033613998
04790699818347038724308897085327540811872028543853314278
26190504551262386997554673813297617979958085084847966735
01174584260670195378195281787455665322747798721925430636
445696 |
| 256 digits -> | 71165992669145658882019868898151328323771921416752427294
09800073407378500715055503674260501908537449489553399876
62427844810850852717191846883823768674280839119270574786
53577446062864038475783726741893203934707811490161526734
43196909752774289297379160316238090285455972385241499835
32303848529517503894555603085813572927495336324076794731
57679404444406282325554480278791264675699612296265480939
55191301349236115406393842370801975411812607723819179616
83956924416 |
That gets pretty rediculous pretty quickly. A brute-force attack trying all of the combinations would fail invariably for any significantly
long string unless you have a quantum computer laying around.
Why is this implementation more secure than the common implementations found online? Because each character has an
equal chance of being the next character in any given string.
Characters:
The flawed implementations will tend to first generate a random number to tell if they should create
a number, lowercase letter, or uppercase letter... then generate a seperate number for the actual character.
This is bad because it takes away some of the entropy. Any given number is now more than twice as likely to appear than
any of the letters. The reason for this is that one out of every three characters will be a number even though
only 10 out of 61 possible characters are digits.
Seeding:
Often a function will seed the random number generator every time it is envoked. This is bad because now the random number
generator's effects are completely unimportant. The first weak link is that the seed to the random number generator is all that has
to be calculated now. If you are seeding by microseconds, then you only have 86,400,000,000 values or so to check (that's the number
of microseconds in a day). That's fairly strong, but much less than it can be because that number doesn't increase as the length
of your random string increases. The default seeding for the Mersenne Twister (the PRNG used... hence the "mt_") is much safer.
In addition, by calling the function, seeding it by time... the values are going to have very similar and possibly repeated seeds.
If the function is called repeatedly, then the same second/microsecond/whatever-you-seed-with may be the same. Generating a string
will most often take way more than a microsecond, but this is a general principle to adopt.
An even better reason: as of PHP 4.2.0, PHP handles the calling of wt_srand() itself, and in a way that is probably much more secure
than you will handle it unless you put a TON of thought into it. It's best just to stick with the way we are sure is fairly safe.
//Don't use this anymore!
//mt_srand((double)microtime()*1000000); // Seeding is automatic as of PHP 4.2.0
|
| Testing |
To show myself that I got the implementation correct, I ran several tests. The most important of which may be the Frequency Analysis.
The Frequency Analysis shows the number of each character that was generated in a very large test sample. Ideally, they should be very
similar. I didn't have the time/inclination to find the standard deviation / variance of the data. If you calculate it, please let
me know the results. I've included an example output below that shows that the results are, in fact, fairly well distributed.
This is the code that I used in testing (which I formated using my AIM Code Formatter):
set_time_limit(0); //turns off the limit
for($cnt=0; $cnt<=61; $cnt++){
$nextChar = $cnt;
if(($nextChar >=10) && ($nextChar < 36)){ // uppercase letters
$nextChar -= 10; // bases the number at 0 instead of 10
$nextChar = chr($nextChar + 65); // ord('A') == 65
} else if($nextChar >= 36){ // lowercase letters
$nextChar -= 36; // bases the number at 0 instead of 36
$nextChar = chr($nextChar + 97); // ord('a') == 97
} else { // 0-9
$nextChar = chr($nextChar + 48); // ord('0') == 48
}
print $nextChar;
}
$frequency = array();
$allValues = "0123456789";
$allValues.= "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$allValues.= "abcdefghijklmnopqrstuvwxyz";
for($cnt=0; $cnt<=count($allValues); $cnt++){
// initialize all frequencies to 0
$frequency[substr($allValues,$cnt,1)] = 0;
}
for($cnt=0; $cnt<9000000; $cnt++){
$frequency[randomCode(1)]++;
}
$minFreq = "";
$maxFreq = "";
foreach ($frequency as $char => $freq) {
if(($minFreq == "") || ($freq < $minFreq)){
$minFreq = $freq;
}
if(($maxFreq == "") || ($freq > $maxFreq)){
$maxFreq = $freq;
}
}
print "Minimum value: $minFreq<br>n";
print "Maximum value: $maxFreq<br>n";
print "Size of Range: ".($maxFreq-$minFreq)."<br>n";
print "<pre>n";
ksort($frequency);
print_r($frequency);
print "</pre>n";
|
| Example Output of Testing Code |
Minimum value: 144356
Maximum value: 145987
Size of Range: 1631
Array
(
[A] => 145789
[B] => 145167
[C] => 144443
[D] => 145091
[E] => 144356
[F] => 144707
[G] => 145987
[0] => 144959
[H] => 145633
[I] => 145124
[J] => 145780
[K] => 145140
[L] => 145652
[M] => 144889
[N] => 145397
[O] => 145273
[P] => 144972
[Q] => 145513
[R] => 145167
[S] => 144677
[T] => 145333
[U] => 145091
[V] => 145164
[W] => 144876
[X] => 145608
[Y] => 144663
[Z] => 145356
[a] => 144958
[b] => 145615
[c] => 145272
[d] => 145093
[e] => 145820
[f] => 145206
[g] => 145108
[h] => 144467
[i] => 145770
[j] => 145411
[k] => 144703
[l] => 145425
[m] => 145215
[n] => 144775
[o] => 145526
[p] => 145263
[q] => 145248
[r] => 145241
[s] => 145399
[t] => 145273
[u] => 144740
[v] => 145438
[w] => 144369
[x] => 144863
[y] => 145134
[z] => 145222
[1] => 144653
[2] => 145632
[3] => 145348
[4] => 145215
[5] => 144876
[6] => 145095
[7] => 144912
[8] => 144896
[9] => 145012
)
|
|