Measuring Password Entropy with Python
In the previous posts, we counted password spaces and designed generators whose output sizes can be computed exactly. Now we turn that into a practical measurement: how do we convert a password space into bits of entropy, and how do we compute that in Python?
Part 3 of 5: This post bridges the series from theory to calculation. Once a password space is defined, entropy gives us a standard way to measure it.
\(H = \log_2(N)\)
Entropy tells us how much uncertainty there is in the generator’s output.
Python makes entropy calculation simple once the password space is known.
Password entropy starts with password space
The security of a password generator depends on the total number of possible outputs it can produce.
If a generator has \(N\) possible outputs, then its entropy is:
\[ H = \log_2(N) \]
This converts the size of the password space into bits of entropy.
Each additional bit doubles the number of guesses needed to brute-force the space.
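A quick sanity check of that doubling claim, using illustrative sizes (40 vs. 41 bits):

```python
import math

# Adding one bit of entropy doubles the search space.
forty_bits = 2 ** 40
forty_one_bits = 2 ** 41

print(math.log2(forty_bits))         # 40.0
print(forty_one_bits // forty_bits)  # 2: one extra bit, twice the guesses
```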
Ten random lowercase letters
Suppose a password consists of 10 lowercase letters.
The alphabet size is:
\[ |A| = 26 \]
The total number of possible passwords is:
\[ 26^{10} \]
Entropy becomes:
\[ H = \log_2(26^{10}) \]
Using logarithm rules:
\[ H = 10\log_2(26) \]
Since:
\[ \log_2(26) \approx 4.7 \]
we get:
\[ H \approx 47 \text{ bits} \]
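The same calculation can be done directly in Python; the variable names here are just for illustration:

```python
import math

# Ten random lowercase letters: H = 10 * log2(26)
alphabet_size = 26
length = 10
entropy_bits = length * math.log2(alphabet_size)
print(round(entropy_bits, 2))  # 47.0
```

The unrounded value is about 47.0044 bits; the hand calculation above rounds \(\log_2 26\) to 4.7.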
Three-word passphrases
Now consider a generator that creates passwords like:
word-word-word
If the dictionary contains \(W\) words, then the total number of passwords is:
\[ W^3 \]
Entropy becomes:
\[ H = 3\log_2(W) \]
If we choose from a 2048-word list:
\[ 2048 = 2^{11} \]
Then entropy becomes:
\[ H = 3 \times 11 = 33 \]
This example is useful because the entropy can be read directly off the structure: with a power-of-two word list, each word contributes a whole number of bits.
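Checking the passphrase arithmetic in Python:

```python
import math

# Three independent words from a 2048-word list: H = 3 * log2(2048)
dictionary_size = 2048  # 2 ** 11
num_words = 3
entropy_bits = num_words * math.log2(dictionary_size)
print(entropy_bits)  # 33.0
```

Because 2048 is an exact power of two, the result is exactly 33 bits, with no rounding involved.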
Computing entropy directly in Python
Once we know the size of the password space, Python can compute the entropy with a single logarithm.
```python
import math

def entropy(password_space):
    # Entropy in bits for a uniformly chosen output from this space.
    return math.log2(password_space)
```
Example:

```python
space = 26 ** 10
print(entropy(space))
```

Output: approximately 47.0044 bits. (The hand calculation above rounded \(\log_2 26\) to 4.7, which is why it gave exactly 47.)
Entropy for more complicated formats
For structured generators, we compute the password space by multiplying the number of choices at each step.
If a generator makes decisions like:
- Choose a word from a dictionary
- Choose a symbol position
- Choose a number position
then the total space becomes:
\[ N = a \times b \times c \]
Entropy becomes:
\[ H = \log_2(a \times b \times c) \]
Using logarithm rules:
\[ H = \log_2(a) + \log_2(b) + \log_2(c) \]
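A small sketch of that identity, where the counts `a`, `b`, and `c` are illustrative assumptions rather than values from any particular generator:

```python
import math

# Illustrative choice counts for a hypothetical structured generator.
a = 7776  # e.g. words in a Diceware-style list
b = 32    # e.g. possible symbols
c = 100   # e.g. possible two-digit numbers

total_space = a * b * c

# The two formulations agree: log2 of the product equals the sum of logs.
print(math.log2(total_space))
print(math.log2(a) + math.log2(b) + math.log2(c))
```

Multiplying the counts and summing the per-choice bits are interchangeable, which is what makes each design decision a countable factor.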
This is why structured generators are still analyzable: every design choice becomes another countable factor.
Developers need measurable security, not vague rules
Thinking in terms of entropy is better than relying on arbitrary statements like:
- “Passwords must include uppercase letters”
- “Passwords must include symbols”
A stronger statement is:
This generator produces 69 bits of entropy.
That sentence has a precise mathematical meaning. It tells us something measurable about the search space.
Frequently Asked Questions
These are the practical questions that usually come up when turning password-space counts into entropy measurements with Python.
Why use entropy instead of just password length?
Length matters, but it does not tell the full story by itself. Entropy measures how many possible outputs a generator can produce, which makes it a better unit for comparison.
What does one extra bit of entropy mean?
It doubles the number of possibilities in the search space. That is why even small increases in entropy can matter.
Why is math.log2() the right Python tool here?
Because entropy is measured in bits, and bits are naturally expressed with base-2 logarithms. Python’s math.log2() maps directly to that definition.
Why does a three-word passphrase from a 2048-word list have 33 bits of entropy?
Because each word contributes 11 bits when the list size is 2048, and three independent choices give \(3 \times 11 = 33\).
Can structured generators still be measured cleanly?
Yes. As long as the choices at each step can be counted, you can multiply those choices to get the password space and then apply \(\log_2\).
What is the main practical takeaway for developers?
Stop relying only on vague composition rules. If you can define the output space, you can measure the generator in a way that is clearer and easier to compare.
Conclusion
Entropy gives us a clean way to compare password generators.
Once the output space is known, Python makes the calculation trivial. That means password design can move from intuition to measurement.
The next step is implementation: can we build a real generator in Python whose code matches the mathematical model we have been analyzing?
Previous: Designing Password Generators with Exact Entropy
Next in the series: Building a Password Generator in Python with Provable Entropy
Raell Dottin