[ale] unsalted hashes of 6 million linkedin passwords published on the internet

Thu Jun 7 22:56:23 EDT 2012

Hey,

On Thu, 2012-06-07 at 15:47 -0400, Stephen Haywood wrote:
> > Unsalted and unseeded.  If the hashing had been seeded, the brute
> > forcing would be impossible without the private seed.

> I understand what you mean by unsalted but explain unseeded in terms
> of a SHA1 hash. My understanding is the file contained about 6.5
> million unique password hashes, of which about 3.5 million were
> cracked before the list was made public. Last I heard about 1.5
> million had been cracked and analyzed by Stefan Venken (@StefanVenken).
> I believe the folks at KoreLogic have cracked over 3 million of them.

I was expecting this question.  It's not at all an uncommon question.

Salting and seeding are, on the surface, two very similar things.
Mathematically, they appear to be the same thing: adding information to
something to be hashed to make brute force or guessing more difficult.
It's easy to mix them up or assume they are one thing doing the same
job.  In fact, however, it is how they are used that makes them
different.  Each can be used separately or they can be used together.
They provide two different functions with two different purposes to
address two different attack vectors and surfaces.

A salt is a, preferably large, random nonce added to a blob (passphrase)
to be hashed that changes with each and every hash.  Each time you hash
a new password, you select a new nonce or salt.  So, if you hash the
same password four times, you will get four different hashes.  Because
the hash depends on the unique salt, the salt is generally associated
with the hash very tightly, so, for instance, the shadow file contains
both the hash information and the associated salt in the formatted hash
field.  You need this.  In order to verify a hash, you have to know the
salt and, if it's different for each hash, each hash has to have an
associated salt you can reference.  Consequently, if a hash database is
compromised, the haul almost invariably includes the salts as well,
because they are typically stored in the same place and/or in the same
manner.
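
To make the salt paragraph concrete, here is a minimal Python sketch.  It
assumes SHA-256 and a 16 byte salt purely for illustration; the function
names are hypothetical and this is not how crypt(3) or the shadow file
formats the field.

```python
import hashlib
import os

def hash_with_salt(password, salt=None):
    """Return (salt, digest). A fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)  # 16 bytes is an arbitrary choice for this sketch
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

def verify(password, salt, digest):
    # Verification has to reuse the stored salt, which is why the salt
    # lives right next to the hash.
    return hash_with_salt(password, salt)[1] == digest

# Hash the same password four times: four salts, four different hashes.
records = [hash_with_salt("hunter2") for _ in range(4)]
```

Each of the four records verifies against "hunter2", but only because
the matching salt was kept alongside its hash.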

So, what does a salt provide us with?  It pretty well prevents "rainbow
tables" (tables of precomputed hashes).  Back in the days of DES hashes
in the password files, the hashes only had 12 bits of salt, so each
password only had 4096 possible hashes, and the salt was there in the
hash field.  With huge drives, it became practical to precompute the
hashes, under every salt, for large sets of candidate passwords (DES
crypt only looks at the first 8 characters), creating a table that only
needed to be searched for a match.  With the md5 and sha crypt formats,
the salt is much larger (up to 48 bits for md5crypt and up to 96 bits
for the sha256 and sha512 formats), making precomputed tables virtually
impossible even for short passwords.  Salts also hide the fact that
different entities / accounts have chosen identical passwords, or that
someone has repeated a password (minor compared to the rainbow table
issues).  You'll note that, outside of the rainbow tables, I have not
said they make brute forcing harder.  Brute forcing of INDIVIDUAL
hashes is NOT made more difficult by salting.  A weak password will
fall just as fast and the computational requirements are pretty much
the same.  It's just that, after busting one, you have to repeat the
same effort for all the others in a group, even if they have the same
password.  So it prevents some optimization in brute forcing large
tables of passwords (like the Linkedin dump).
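
A small Python sketch of that difference, under stated assumptions: the
four word list and the sample "dumps" are made up for illustration, and
a plain dictionary of precomputed hashes stands in for a real rainbow
table.

```python
import hashlib
import os

wordlist = ["123456", "password", "linkedin", "letmein"]  # made-up sample

def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()

# Precomputed once, reusable against every unsalted dump.
precomputed = {sha1_hex(w.encode()): w for w in wordlist}

# Unsalted: identical passwords give identical hashes, and one table
# lookup per entry is all it takes.
unsalted_dump = [sha1_hex(b"linkedin"), sha1_hex(b"linkedin"), sha1_hex(b"letmein")]
cracked_by_lookup = [precomputed.get(h) for h in unsalted_dump]

# Salted: the precomputed table no longer matches anything.
salted_dump = []
for pw in ("linkedin", "linkedin", "letmein"):
    salt = os.urandom(16)
    salted_dump.append((salt, sha1_hex(salt + pw.encode())))
lookup_hits = [precomputed.get(h) for _, h in salted_dump]

def crack_salted(salt, target):
    # The per-guess cost is unchanged, but the loop must be rerun for
    # every entry because each one has its own salt.
    for w in wordlist:
        if sha1_hex(salt + w.encode()) == target:
            return w
    return None

cracked_by_guessing = [crack_salted(s, h) for s, h in salted_dump]
```

The weak passwords still fall in the salted case; what is lost is the
shortcut of reusing one table, or one success, across the whole dump.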

Now seeding is slightly different.  It looks the same, additional data
introduced into the hashing, but it's done in a different way.  The seed
is a nonce that is selected once and is applied to all the hashes and is
known only to the system.  This turns the hashing algorithm into a keyed
hashing algorithm with the seed being the "secret key".  It means that a
hash generated on one system with one seed is meaningless on another
system with another, different, seed.  Because a seed is common to all
the hashes on a system or in an application, it doesn't need to be
stored with the hashes (like in a database) and can be secured by
entirely different mechanisms.
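
As a rough sketch of a seed acting as a secret key, the Python below
uses HMAC from the standard library.  HMAC is a swapped-in, standard
keyed-hash construction, not necessarily the exact scheme described
here, and the two seed values are hypothetical placeholders (a real
seed would be random and kept outside the hash database).

```python
import hashlib
import hmac

def seeded_hash(seed, password):
    # The seed is the secret key; the password is the message.
    return hmac.new(seed, password.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical seeds for two separate systems.
seed_a = b"seed-known-only-to-system-A"
seed_b = b"seed-known-only-to-system-B"

hash_on_a = seeded_hash(seed_a, "hunter2")
hash_on_b = seeded_hash(seed_b, "hunter2")
```

The same password yields unrelated hashes on the two systems, so a hash
lifted from one is meaningless on the other.  Within one system, with
no salt, the same password still yields the same hash every time.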

So, what does a seed buy us?  If the entire system is compromised, not
much.  If they can compromise the seed, it's no better than not having a
seed but no worse either.  That's your worst case.  But, if they
compromise just the hashes (say out of a database through SQL injection
- very common) they still cannot brute force even the simplest of
passwords unless they can determine that bloody seed (and you make it
bloody difficult).  It's a different issue that it's addressing.  A
seed, without a salt, doesn't address the issues of password collisions
(multiple accounts with the same password) or password reuse, because
those would still result in the same hash.  But it prevents password
hash transport between systems (could be a plus, could be a minus) and
it makes brute forcing an individual hash a bugbear because you're
missing a vital, cryptographically hard piece of the puzzle.  A salt,
because it's associated with an individual hash and generally exposed,
can't do this.
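
The two cases can be sketched in Python as follows.  This assumes HMAC
as the keyed hash and a three word list, both chosen for illustration
only; "attacker-guess" is a hypothetical stand-in for whatever key an
attacker without the seed might try.

```python
import hashlib
import hmac
import os

wordlist = ["123456", "password", "letmein"]  # made-up sample
seed = os.urandom(32)  # secret, stored apart from the hash database

def seeded_hash(key, password):
    return hmac.new(key, password.encode("utf-8"), hashlib.sha256).hexdigest()

# What leaks through something like SQL injection: the hash, not the seed.
stolen = seeded_hash(seed, "password")

def dictionary_attack(target, key):
    for w in wordlist:
        if hmac.compare_digest(seeded_hash(key, w), target):
            return w
    return None

# Hashes only: even the weakest password does not fall.
without_seed = dictionary_attack(stolen, b"attacker-guess")

# Whole system compromised (seed included): back to an ordinary
# dictionary attack, no better and no worse than an unseeded hash.
with_seed = dictionary_attack(stolen, seed)
```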

In theory, I believe that these two could be combined into one nonce
but, generally, because they are determined differently and managed
differently, I think that in all practical terms they are two different
things with two different paradigms.  In fact, to optimize your hashing
algorithm, I would choose a seed nonce, pre-digest it with the hashing
algorithm and save the intermediate state.  Then start each hash from
that saved state, feeding in first your salt and then the data to hash.
That way, a seed costs you next to no extra computing time in your
hashing algorithms.
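
The pre-digest trick might look like this in Python, where hashlib
objects can be cloned with copy().  The sizes and names are assumptions
for the sketch: a 64 byte seed is used because that is exactly one
SHA-256 input block, so the compression work for the seed really is
done once, up front.

```python
import hashlib
import os

seed = os.urandom(64)  # one full SHA-256 block (assumed size)

# Digest the seed once, e.g. at startup, and keep the intermediate state.
seed_state = hashlib.sha256(seed)

def hash_password(salt, password):
    h = seed_state.copy()               # resume from the pre-digested seed
    h.update(salt)                      # first the salt...
    h.update(password.encode("utf-8"))  # ...then the data to hash
    return h.hexdigest()

salt = os.urandom(16)
fast = hash_password(salt, "hunter2")

# Same result as hashing seed + salt + password from scratch.
slow = hashlib.sha256(seed + salt + b"hunter2").hexdigest()
```

Objects returned by hmac.new() offer the same copy() method, should a
standard keyed hash be preferred over a plain seed prefix.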

Obviously, if all you are doing is hashing to verify files, seeding and
salting are useless since you want others to be able to verify the
results.  If all you want is one SINGLE system to verify something,
seeding and salting can prevent a variety of attacks.  If you want
multiple systems to verify something while still hiding duplicates
(both reused passwords and identical passwords between accounts), then
a salt is what you want; a seed would have to be copied to every one of
those systems.

Personally, if I were designing a hash system for storing passwords for
a site, I would choose the best algorithm I could (sha256 or sha512
would be nice) and use both a strong seed and strong salting.  Seeds and
salts with more entropy than the output of the hashing algorithm are of
little benefit but the cost in computing is minuscule.  1024 bit blobs
cost little for this purpose.
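
One possible shape for such a system, as a hedged Python sketch only:
the record layout ("salt$digest" in hex), the function names and the
choice of HMAC-SHA512 with a 16 byte salt are all assumptions made for
illustration, and SEED would really be loaded from somewhere outside
the database instead of being generated at import time.

```python
import hashlib
import hmac
import os

SEED = os.urandom(64)  # hypothetical: stands in for a seed kept outside the DB

def make_record(password):
    salt = os.urandom(16)  # fresh salt for every stored hash
    digest = hmac.new(SEED, salt + password.encode("utf-8"), hashlib.sha512).digest()
    return salt.hex() + "$" + digest.hex()  # salt is stored with the hash

def check_record(record, password):
    salt_hex, digest_hex = record.split("$")
    salt = bytes.fromhex(salt_hex)
    expected = hmac.new(SEED, salt + password.encode("utf-8"), hashlib.sha512).digest()
    return hmac.compare_digest(expected, bytes.fromhex(digest_hex))

alice = make_record("hunter2")
bob = make_record("hunter2")  # same password, different record
```

The salt keeps the two records from matching each other, and the seed
keeps either record from being guessed at without a piece that is not
in the database.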

The algorithms and concepts are the same; it's all in how you apply
them.

> -- 
> Stephen Haywood
> Information Security Consultant
> CISSP, GPEN, OSCP
> T: @averagesecguy
> W: averagesecurityguy.info

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!