Passwords: Why the Common Wisdom is Just Plain Wrong
A good password ought to be both:
- easy to remember
- hard to guess.
It’s astonishing that so many people use passwords that are either:
- easy to remember and ridiculously easy to guess (“123456”), or
- difficult to remember and moderately easy to guess (“Alice456!”), or
- impossible to remember and difficult to guess (“r8s@i2#ry2o76$”).
We’re not talking about people guessing passwords here; we’re talking about computers guessing passwords. A person may try four or five passwords and give up. A computer will try millions.
Experts claim that you should try to create passwords that:
- use special characters (e.g. ‘%’, ‘#’, ‘$’), upper case characters, numerals
- don’t include dictionary words or names
- include at least 8 characters
- don’t include repeated characters or sequences
- use an obfuscated pass-phrase like “mYd0Gh@sfleA5”
- even better, use deliberate misspellings - “mYdawgh@sFleE5”
- make use of an automatic password generator
To understand why most of this advice is short-sighted, we need to understand:
- what makes a password hard to guess
- what makes a password easy to remember.
HARD TO GUESS
The strength of a password relates to the number of guesses that one would need to try before happening on the correct password. Automated password-guessing software might attempt:
- the most common passwords (e.g. “letmein”, “password”, “abc123”),
- the most common password schemes:
  - a dictionary word (“cheetah”),
  - a word with special characters (“ch33t@h”),
  - a name followed by a number (“mary26”),
- a brute search of all character sequences, going from shorter to longer sequences.
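The guessing order in the list above can be sketched roughly as follows. This is only an illustration: the tiny word lists, the substitution table, and the function names are hypothetical stand-ins, and real guessing software uses far larger lists and many more schemes.

```python
from itertools import product
from string import ascii_lowercase

# Hypothetical, tiny stand-ins for the much larger lists real software would use.
COMMON_PASSWORDS = ["letmein", "password", "abc123"]
DICTIONARY = ["cheetah", "mary"]
SUBSTITUTIONS = str.maketrans({"e": "3", "a": "@"})


def candidate_guesses(max_brute_length=3):
    """Yield guesses cheapest-first, in the order of the list above."""
    # 1. The most common passwords.
    yield from COMMON_PASSWORDS
    # 2. Common schemes: plain word, substituted word, word plus a number.
    for word in DICTIONARY:
        yield word
        yield word.translate(SUBSTITUTIONS)
        for number in range(100):
            yield f"{word}{number}"
    # 3. Brute search of lower-case sequences, shorter before longer.
    for length in range(1, max_brute_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def guesses_needed(password, max_brute_length=3):
    """Return the 1-based position of password in the guess order, or None."""
    for position, guess in enumerate(candidate_guesses(max_brute_length), start=1):
        if guess == password:
            return position
    return None


print(guesses_needed("ch33t@h"))  # found almost immediately, despite the symbols
print(guesses_needed("mary26"))
```

With these toy lists, both “ch33t@h” and “mary26” fall within the first couple of hundred guesses, long before the brute search even begins.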
A password’s strength is commonly called its entropy and is measured in bits. Entropy is usually thought of as the amount of information in a system. For example, hard drive capacity is measured in bytes (groups of 8 bits) and describes how much information that drive can store. Password strength is also measured in bits. If a password is sure to be found within 4 guesses, then that password is said to have an entropy of 2 bits. This is because a number between 1 and 4, inclusive, can be singled out by using 2 bits of information (e.g. is it less than 3? is it even?). An n-bit password would require an exhaustive search of 2^n possibilities. This is because each bit of information can be used, ideally, to divide the realm of possibilities in half.
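As a small illustration of the idea that each bit halves the realm of possibilities, the sketch below counts the yes/no questions needed to pin down a secret number. The function names are made up for this example, and it assumes every possibility is equally likely.

```python
import math


def entropy_bits(possibilities):
    """Bits needed to single out one item among equally likely possibilities."""
    return math.log2(possibilities)


def find_secret(secret, low=1, high=4):
    """Locate secret in [low, high] by halving the range; count the questions."""
    questions = 0
    while low < high:
        middle = (low + high) // 2
        questions += 1  # one yes/no question: "is it greater than middle?"
        if secret > middle:
            low = middle + 1
        else:
            high = middle
    return low, questions


print(entropy_bits(4))             # 2.0 bits for 4 possibilities
print(find_secret(3))              # (3, 2): found with 2 questions
print(find_secret(700, 1, 1024))   # (700, 10): 1024 possibilities, 10 questions
```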
Now, each additional lower-case letter in a password multiplies the range of possibilities by 26 (i.e. the number of lower-case letters). We say that it adds 4.7 bits since 2^4.7 is about 26. Similarly, a letter which may be upper or lower case multiplies the range by 52 and adds 5.7 bits. Adding digits and special characters to the set gives a little over 6 bits per character. The big caveat is that getting the full 4.7, 5.7, or 6 bits requires that the character be drawn apparently at random from the set. For example, adding the very predictable ‘!’ to the end of the password will not add a full 6 bits, since password hacking algorithms are likely to guess sequences that end in ‘!’ first (i.e. before those that end in ‘{’).
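The per-character figures can be checked with a few lines of Python. The alphabet groupings below are my own choice for illustration (Python’s `string` constants, which cover 94 printable ASCII characters in total), and the numbers only hold when each character is picked at random.

```python
import math
import string

# Character sets of increasing size, built from Python's standard constants.
ALPHABETS = {
    "lower-case letters": string.ascii_lowercase,
    "upper- and lower-case letters": string.ascii_letters,
    "letters and digits": string.ascii_letters + string.digits,
    "letters, digits and punctuation": (
        string.ascii_letters + string.digits + string.punctuation
    ),
}


def bits_per_character(alphabet):
    """Bits added by one character drawn at random from alphabet."""
    return math.log2(len(set(alphabet)))


for name, alphabet in ALPHABETS.items():
    size = len(set(alphabet))
    print(f"{name}: {size} characters, {bits_per_character(alphabet):.2f} bits each")
```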
We can easily see that longer passwords are to be preferred over shorter ones. Using a combination of upper and lower case letters (as opposed to just lower case) gives you only one additional bit per character. By contrast, adding just one additional letter adds almost 5 bits. In fact, simply doubling the length of a password from eight to sixteen characters (even if the last eight are restricted to lower-case characters) makes the resulting password 26^8 = 208,827,064,576 times more difficult to guess. That means that even with a high-performance computer set-up that can crack an 8-character password in a few hours (and these do exist), that same set-up would still take many millions of years to crack such a 16-character password.
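The arithmetic behind this comparison can be worked through as below. The 3-hour cracking time is an assumed stand-in for “a few hours”, and the function names are invented for this sketch.

```python
import math

LOWER = 26  # lower-case letters only
MIXED = 52  # upper- and lower-case letters


def total_bits(alphabet_size, length):
    """Entropy of `length` characters drawn at random from the alphabet."""
    return length * math.log2(alphabet_size)


def extra_difficulty(alphabet_size, extra_length):
    """Factor by which appending random characters multiplies the search space."""
    return alphabet_size ** extra_length


def years_to_crack(hours_for_base, factor):
    """Scale a known cracking time (in hours) by a factor; result in years."""
    return hours_for_base * factor / (24 * 365)


# Mixing case in an 8-character password gains about 8 bits in total...
print(total_bits(MIXED, 8) - total_bits(LOWER, 8))
# ...while two extra lower-case letters already gain about 9.4 bits.
print(total_bits(LOWER, 10) - total_bits(LOWER, 8))

factor = extra_difficulty(LOWER, 8)
print(factor)                     # 208827064576
print(years_to_crack(3, factor))  # roughly 71.5 million years (3 hours assumed)
```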
EASY TO REMEMBER
Memory is funny. Sometimes, things that people think will be easy to remember turn out not to be – in particular, that password you came up with a year ago and haven’t used since. Sometimes, people’s memory surprises them and they recall something that they didn’t think they’d be able to – like an old phone number or an odd jingle. Memory experts teach that things with more mental connections to other items in your head are more easily remembered. Isolated pieces of data are least likely to be remembered well. Items with more emotional attachment, good or bad, are also likely to be better remembered.
In general, words are easier to remember than numbers. Numbers are easier to remember than symbols. The word “plateau” is one out of 30,000 words from a medium-sized English word list. This represents a strength of about 15 bits. A number with equivalent strength would need to be 4 or 5 digits long (e.g. 6851). Which is easier to remember, “plateau” or 6851? Using an alphabet of 32 different symbols, we’d need a sequence of symbols of length 3, let’s say “#:$”. Which is easier to remember, “#:$” or 6851? What if we need a password with 60 bits? Which is easiest to remember:
- plateau+circus+walnut+yoga
- 6851090106796094
- #:$$@**^@!:%
This is the gist of the famous “correct+horse+battery+staple” xkcd cartoon. The word sequence is obviously easier to remember, but seems less random, and might therefore seem easier to guess. This is mistaken. Provided the words are chosen at random, it is, in fact, just as difficult to guess as “#:$$@**^@!:%” (if not harder). The big difference is that words mesh with how our brains work much better than numbers and symbols.
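For the curious, the strengths of the three examples above can be estimated as follows. This is a rough sketch that assumes every word, digit, or symbol is chosen independently and at random; the function name is made up for the example.

```python
import math


def strength_bits(count, choices):
    """Bits in `count` items, each drawn at random from `choices` options."""
    return count * math.log2(choices)


# Four words from a 30,000-word list.
print(round(strength_bits(4, 30_000), 1))  # 59.5
# Sixteen decimal digits.
print(round(strength_bits(16, 10), 1))     # 53.2
# Twelve symbols from an alphabet of 32.
print(round(strength_bits(12, 32), 1))     # 60.0
```

The four words and the twelve symbols both land at about 60 bits, while the sixteen digits actually fall a little short; about eighteen digits would be needed to match.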
THEN WHAT?
Let’s revisit the common advice with these insights in hand:
- use special characters (e.g. ‘%’, ‘#’, ‘$’), upper case characters, numerals
IF you can get away without them, avoid special and numeric characters. They’re harder to remember, harder to type, more prone to mistakes, and don’t buy you a whole lot of extra password strength.
- don’t include dictionary words or names
Don’t use a single word or name; this is indeed vulnerable to a dictionary attack. Instead, concatenate and/or truncate several unrelated words together to make a super password.
- include at least 8 characters
Given dramatically increasing computing speeds, it’s better to use at least 12 characters.
- don’t include repeated characters or sequences
Even randomly generated passwords will repeat characters sometimes; just don’t be silly and use something like “zzzzzzzz”. It’s still better to use “broom” than to use “brume” because you won’t necessarily remember how you adulterated the spelling. Don’t be afraid of repeated letters.
- use an obfuscated pass-phrase like “mYd0Gh@sfleA5”
In my opinion, using a common phrase is a bad idea. There are many more dictionary words than there are common phrases. It won’t be long before hackers start using dictionaries of common phrases. Not-so-common phrases (e.g. “my-boat-has-barnacles”) are much harder to remember. Pass-phrases tend to be longer, often exceeding length restrictions and incurring more time to type and more chance for mistakes. Replacing letters with “substitute” numbers and symbols won’t buy you a lot. Hackers are on to this trick. Remembering precisely how you abbreviated and obfuscated the text is likely to cause you more grief than benefit.
- even better, use deliberate misspellings - “mYdawgh@sFleE5”
See all of the drawbacks above, compounded to make a truly horrid password that’s not a whole lot stronger than the original.
- make use of an automatic password generator
You can be practically guaranteed of a strong password – given a suitable length. You can also be practically guaranteed to forget it unless you write it down and save it in a safe place.
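The suggestion above to concatenate several unrelated words can be sketched in a few lines. This is not the scheme from the coming white paper, just a minimal illustration: the eight-word list is a hypothetical placeholder (a real list should hold thousands of words), and the function names are my own. It uses Python’s `secrets` module so that the words are picked at random rather than by a person.

```python
import math
import secrets

# Hypothetical word list for illustration only; far too short for real use.
WORDS = ["plateau", "circus", "walnut", "yoga",
         "broom", "barnacle", "cheetah", "jingle"]


def make_passphrase(word_count=4, words=WORDS, separator="+"):
    """Join word_count words picked independently and uniformly at random."""
    return separator.join(secrets.choice(words) for _ in range(word_count))


def passphrase_bits(word_count, list_size):
    """Strength in bits of a passphrase built from a list of list_size words."""
    return word_count * math.log2(list_size)


print(make_passphrase())                       # e.g. four words joined by '+'
print(passphrase_bits(4, len(WORDS)))          # only 12.0 bits with 8 words
print(round(passphrase_bits(4, 30_000), 1))    # 59.5 bits with 30,000 words
```

The strength comes from the size of the word list and the number of words, not from the words themselves, which is why the toy list here gives only 12 bits.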
WHITE PAPER IN PROGRESS...
Stay tuned for my coming white paper: “A Practical Scheme for Strong Password Generation” which incorporates many of these ideas to derive a method for anybody to easily come up with strong, valid, and memorable passwords.
A Good Antivirus
The number one question that people ask me is which Antivirus software I recommend. Since 2009, I have been happy to recommend NOD32 from ESET. I have used it myself for several years and have no complaints. I have also sold it to many clients over the years and all but a few have chosen to renew when the license expires.
NOD32 has protected my computer and has saved it from malicious web sites and emails quite a few times. On a few occasions, the only way to clean a client’s infected computer (after running a slew of other scans) has been to run NOD32 on it. It also routinely spots infected files when recovering data from clients’ computers.
The software does not seem to slow down computers and simply takes care of business (e.g. updates, removing threats, etc.) without any user intervention needed. I’ve heard that it was written in assembly code in the interest of speed and efficiency. I’ve never heard that about ANY other modern software, much less antivirus software. It receives 4.5 stars out of 5 on CNET based on user reviews. If NOD32 doesn’t float your boat, then I also recommend Norton (especially for businesses with several computers) and Kaspersky. For free antivirus, I like Avast.
Keep in mind that your mileage may vary and that no antivirus is 100% secure. Keep in mind also that a user doing risky things is almost sure to compromise a computer with even the best antivirus software. Note that only one antivirus program should be run on any one machine. In this case, more is not better.