console.blog(
"Huuu durrr" you say, "that's the point of a login."
I disagree!
The point of a login is to authenticate a person using the application. Sending their password to the backend is not necessarily part of that process.
There are essentially four vulnerable parts of an authentication system:
Of these, we can do pretty well in protecting against #2 and #3. We can't do much against #1 without drastically changing the typical sign-in flow. We also can't do much about most of the human factors that are out there, but we can protect our users against their password being stolen from us and stuffed into other websites, if they re-use passwords.
Of course, ideally, our database would never be stolen and our users would never have their networks compromised (leading to their credentials being stolen). "Ideally" is the wrong way to build an authentication system, though, so we're going to look at a way to deal with the non-ideal.
Granted, we will be using the most common method of sign-in (username plus password) and the most common method of storage (username plus hash). This is intentional.
We can get significant security gains by changing some of the magic between the user entering their information and the API authenticating them without making significant architectural changes.
Of course, you can make significant architectural changes for even better security, but that's a topic for another time.
"But how?"
Thanks for asking.
HMAC stands for Hash-based Message Authentication Code.
The basic premise is this: If I (the API) know a secret key that the user can continually regenerate using a secret passphrase, then we can both hash the same message and come up with the same result.
If we both come up with the same result without sharing the secret key (after the first time, more on that later), then I (the API) can theoretically assume that the user is the same person who originally created the account, because they've successfully generated the same secret key.
Once more, this time with (paid) actors.
Alice created an account on our website.
Our API - Bianca - has the secret key from when Alice created her account: EGG.
Alice knows the secret passphrase that will recreate the secret key: PAN.
When Alice visits the login form, she enters her username and her secret passphrase: PAN.
In her browser, our application uses a hash to recreate EGG. It then uses EGG to hash some data, like her username, into a third secret: SCRAMBLE.
The browser then sends Bianca the two pieces of data: the username and SCRAMBLE.
Bianca knows the same secret key already, so she does the same computation: EGG plus Alice's username results in SCRAMBLE. Since Bianca has come up with the same result as Alice asserted, Bianca assumes that Alice is who she says she is, and logs Alice in.
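To make the story concrete, here is a minimal sketch of the same exchange in code. The function names (deriveSecretKey, authenticationCode, demo) and the impostor passphrase POT are made up for illustration, and SHA-512 is assumed as the hash on both sides.

```javascript
// Web Crypto is a global in browsers and recent Node; older Node exposes it as webcrypto.
const { subtle } = globalThis.crypto ?? require("node:crypto").webcrypto;
const encoder = new TextEncoder();

// Convert an ArrayBuffer to a lowercase hex string, one byte at a time.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer), (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Alice's browser: rebuild the secret key (EGG) from the passphrase (PAN).
async function deriveSecretKey(passphrase) {
  return toHex(await subtle.digest("SHA-512", encoder.encode(passphrase)));
}

// Either side: use the secret key to compute the authentication code (SCRAMBLE).
async function authenticationCode(secretKey, message) {
  const key = await subtle.importKey(
    "raw",
    encoder.encode(secretKey),
    { name: "HMAC", hash: "SHA-512" },
    false,
    ["sign"]
  );
  return toHex(await subtle.sign("HMAC", key, encoder.encode(message)));
}

// Hypothetical walk-through: Bianca stored the key at signup, Alice regenerates it at login.
async function demo() {
  const storedKey = await deriveSecretKey("PAN");
  const fromAlice = await authenticationCode(await deriveSecretKey("PAN"), "alice");
  const fromBianca = await authenticationCode(storedKey, "alice");
  const fromImpostor = await authenticationCode(await deriveSecretKey("POT"), "alice");
  return {
    matches: fromAlice === fromBianca,
    impostorMatches: fromImpostor === fromBianca
  };
}
```

Only the username and the authentication code would cross the network in this sketch; the passphrase stays inside deriveSecretKey.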
For clarity (and the more visual among us), here's a sequence diagram that might help.
So the four parts of HMAC:
SCRAMBLE is the code both Alice and Bianca generate. (numbers 4 & 7 in the diagram)
Note that only Alice knows the secret passphrase PAN. It never leaves her browser.
This is how we protect Alice from having her password stolen by man-in-the-middle attacks.
Even if she's on a compromised network, Alice's password never leaves her browser.
If we wanted the additional security of a salt, we could change our signup and login flows to first submit the username, which would return the salt, and then the signup or login would proceed normally but using the salt to enhance the entropy of whatever passphrase is used.
We want to leave the login flow alone as much as possible, though, so we'll use Alice's username as her salt.
It's not a perfect salt, but it's better than nothing.
If Alice's password is "password" (shame on you, Alice), and her username is alice1987, at least our combined secret passphrase will be alice1987password (or passwordalice1987) instead of just "password". It's an improvement, even if minimal.
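A small sketch of what the username-as-salt idea buys us: two users with the same weak password end up with different secrets. The helper name secretFor and the second user bob1990 are invented for this example, and SHA-512 is assumed as the hash.

```javascript
// Web Crypto is a global in browsers and recent Node; older Node exposes it as webcrypto.
const { subtle } = globalThis.crypto ?? require("node:crypto").webcrypto;

// Hash username + passphrase into a hex secret (the username acts as the salt).
async function secretFor(username, passphrase) {
  const bytes = new TextEncoder().encode(`${username}${passphrase}`);
  const digest = new Uint8Array(await subtle.digest("SHA-512", bytes));
  return Array.from(digest, (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Same password, different usernames: the stored secrets no longer collide.
async function compareSecrets() {
  const alice = await secretFor("alice1987", "password");
  const bob = await secretFor("bob1990", "password");
  return { alice, bob, differ: alice !== bob };
}
```

One thing to keep in mind with plain concatenation: "ab" + "c" and "a" + "bc" produce the same input, so a separator character between the two parts may be worth considering.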
We have one more thing we can do to protect Alice.
If someone is snooping on Alice's network while she's logging in - maybe from unsecured coffee shop wifi - they could grab her login request (remember, it's alice1987 and the authentication code SCRAMBLE).
Once someone has that info they could do anything with it. They could try to reverse the hash algorithm to figure out her secret passphrase, or they could re-use the same request to make Bianca think that Alice is trying to log in again.
The only thing we can do about the former is regularly audit our hash algorithms to make sure we're using best-in-class options so that figuring out the secret phrase will take longer than it's worth.
We can - however - protect against the latter attack, called a replay attack.
To protect against replay attacks, all we have to do is add a timestamp to the message! When Bianca checks the authentication request, she'll also check that the timestamp isn't too old - maybe a maximum of 30 seconds ago. If the request is too old, we'll consider that a replay attack and just deny the login.
The beauty of message authentication is that as long as our front end sends all of the information and it includes that information in the HMAC (SCRAMBLE), the API (Bianca) can always validate the information.
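The freshness check itself can be tiny. Here is a rough sketch, assuming timestamps are compared as milliseconds since the epoch. The 30-second limit comes from the text above; the function name isFresh and the small allowance for a client clock that runs ahead are assumptions of this example.

```javascript
// Maximum age of a login request, per the 30-second example above.
const MAX_AGE_MS = 30000;
// Allowance for a client clock slightly ahead of the server (assumed value).
const MAX_SKEW_MS = 5000;

// Returns true when the request timestamp is recent enough to accept.
function isFresh(timestampMs, nowMs = Date.now()) {
  if (!Number.isFinite(timestampMs)) {
    return false; // unparseable timestamps are rejected outright
  }
  const age = nowMs - timestampMs;
  return age <= MAX_AGE_MS && age >= -MAX_SKEW_MS;
}
```

Because the timestamp is part of the signed message, an attacker can't simply swap in a newer one without also regenerating the authentication code.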
The amazing thing about this is it just uses off-the-shelf tools baked into virtually every programming language.
Even web browsers provide (some of) these tools by default as far back as IE11!
Sadly, one crucial piece - the TextEncoder - is not supported in Internet Explorer at all, so if you need to support IE, some polyfilling is necessary.
Fortunately, everything here is polyfillable!
To get us started, we'll build a couple of helper functions to do basic cryptographic work.
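// Convert an ArrayBuffer to a hexadecimal string.
// Reads four bytes at a time, so the buffer length must be a multiple of 4
// (true for every SHA digest).
function hex( buffer ){
    var hexCodes = [];
    var view = new DataView( buffer );

    for( let i = 0; i < view.byteLength; i += 4 ){
        // Left-pad each 32-bit chunk to eight hex characters
        hexCodes.push( `00000000${view.getUint32( i ).toString( 16 )}`.slice( -8 ) );
    }

    return hexCodes.join( "" );
}

// Hash a string and resolve to the digest as a hexadecimal string.
export function hash( str, algo = "SHA-512" ){
    // TextEncoder always encodes as UTF-8
    var buffer = new TextEncoder().encode( str );

    return crypto
        .subtle
        .digest( algo, buffer )
        .then( hex );
}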
The first function, hex, converts a buffer of binary data to a hexadecimal string. The second, hash, takes a string and a hashing algorithm and creates a hash.
Note that SubtleCrypto only supports a small subset of the possible algorithms for digesting strings.
For our use-case, however, SHA-512 is great, so we'll default to that.
Note that hex is synchronous, but hash returns a Promise that eventually resolves to the hexadecimal value.
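Since hash hands back a Promise, callers have to wait for it. The sketch below shows one way to consume it, along with a byte-wise alternative to hex that doesn't care about buffer length. The names toHex and hashHex are made up so they don't clash with the helpers above.

```javascript
// Web Crypto is a global in browsers and recent Node; older Node exposes it as webcrypto.
const { subtle } = globalThis.crypto ?? require("node:crypto").webcrypto;

// Byte-wise variant of hex: works for buffers of any length.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer), (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Same idea as hash, written with async/await.
async function hashHex(str, algo = "SHA-512") {
  const digest = await subtle.digest(algo, new TextEncoder().encode(str));
  return toHex(digest);
}

// The result is a Promise, so callers await it or chain .then().
hashHex("alice1987password").then((secret) => {
  console.log(secret.length); // 128 hex characters for SHA-512
});
```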
Next up, we need a way to generate a special key that will be used to create the authentication code.
function makeKey( secret, algo = "SHA-512", usages = [ "sign" ] ){
    var buffer = new TextEncoder( "utf-8" ).encode( secret );

    return crypto
        .subtle
        .importKey(
            "raw",
            buffer,
            {
                "name": "HMAC",
                "hash": { "name": algo }
            },
            false,
            usages
        );
}
Here, makeKey takes Alice's secret (the hash derived from her passphrase), some algorithm (SHA-512 again, in our case), and a list of ways the key is allowed to be used.
For our intent, we only need to be able to sign messages with this key, so our usages can just stay [ "sign" ].
What we get back is a Promise that eventually resolves to a CryptoKey.
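The usages list is enforced by the browser, which is easier to see in code. This sketch imports the same secret twice, once for signing and once for verifying, and checks a signature with subtle.verify. The helper names (importHmacKey, signAndVerify, signOnlyKeyCanVerify) are invented for this example.

```javascript
// Web Crypto is a global in browsers and recent Node; older Node exposes it as webcrypto.
const { subtle } = globalThis.crypto ?? require("node:crypto").webcrypto;
const encoder = new TextEncoder();

// Import a secret as a non-extractable HMAC key limited to the given usages.
function importHmacKey(secret, usages) {
  return subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: { name: "SHA-512" } },
    false,
    usages
  );
}

// Sign with a "sign" key, then check the signature with a "verify" key.
async function signAndVerify(secret, message, tamperedMessage) {
  const signingKey = await importHmacKey(secret, ["sign"]);
  const verifyingKey = await importHmacKey(secret, ["verify"]);
  const signature = await subtle.sign("HMAC", signingKey, encoder.encode(message));
  return {
    original: await subtle.verify("HMAC", verifyingKey, signature, encoder.encode(message)),
    tampered: await subtle.verify("HMAC", verifyingKey, signature, encoder.encode(tamperedMessage))
  };
}

// A key imported with only "sign" should be refused when used to verify.
async function signOnlyKeyCanVerify(secret, message) {
  const signingKey = await importHmacKey(secret, ["sign"]);
  const signature = await subtle.sign("HMAC", signingKey, encoder.encode(message));
  try {
    await subtle.verify("HMAC", signingKey, signature, encoder.encode(message));
    return true;
  } catch (error) {
    return false; // the key lacks the "verify" usage
  }
}
```

For the login flow in this post, the front end only ever signs, so [ "sign" ] is all it needs.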
Finally we need a way to generate signed HMAC messages.
export async function hmac( secret, message, algo = "SHA-512" ){
    var buffer = new TextEncoder( "utf-8" ).encode( message );
    var key = await makeKey( secret, algo );

    return crypto
        .subtle
        .sign(
            "HMAC",
            key,
            buffer
        )
        .then( hex );
}
Here, hmac takes Alice's secret (the hash derived from her passphrase) and our message that we want to authenticate (plus our hash algorithm).
What we get back is a Promise that eventually resolves to a long hexadecimal string.
That hex string is the authentication code!
When our API (Bianca) and our front end agree on the mechanism for creating messages to be authenticated, as long as the result from hmac on both sides agrees, the message has been authenticated!
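"Agreeing on the mechanism" includes agreeing on how strings become bytes. As a sanity check, this sketch computes the same code two ways in Node: once through Web Crypto (the browser-style path) and once through Node's createHmac. The function names and the sample message are assumptions of this example.

```javascript
const { createHmac, webcrypto } = require("node:crypto");
const encoder = new TextEncoder();

// Convert an ArrayBuffer to a lowercase hex string.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer), (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Browser-style path: Web Crypto with a UTF-8 encoded secret and message.
async function browserStyleHmac(secret, message) {
  const key = await webcrypto.subtle.importKey(
    "raw",
    encoder.encode(secret),
    { name: "HMAC", hash: "SHA-512" },
    false,
    ["sign"]
  );
  return toHex(await webcrypto.subtle.sign("HMAC", key, encoder.encode(message)));
}

// Server-style path: Node's createHmac, which also treats strings as UTF-8.
function nodeStyleHmac(secret, message) {
  return createHmac("sha512", secret).update(message).digest("hex");
}

// Resolves to true when both paths produce the same authentication code.
async function pathsAgree(secret, message) {
  return (await browserStyleHmac(secret, message)) === nodeStyleHmac(secret, message);
}
```

If the two sides ever disagree, the usual suspects are a different string encoding, a different hash algorithm, or a message assembled in a different order.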
So let's say all that code above is off in a file called Crypto.js.
Here's a file with two functions in it that create a new account for a user, and log a user in.
import { hash, hmac } from "./Crypto.js";

async function createAccount( username, passphrase ){
    // The username doubles as the salt for the passphrase
    var secret = await hash( `${username}${passphrase}` );

    return fetch( "/signup", {
        "method": "POST",
        // Tell the API the body is JSON so it gets parsed
        "headers": { "Content-Type": "application/json" },
        "body": JSON.stringify( {
            secret,
            username
        } )
    } );
}

async function login( username, passphrase ){
    var now = new Date();
    var secret = await hash( `${username}${passphrase}` );
    // The message is the username plus the timestamp in milliseconds
    var authenticationCode = await hmac( secret, `${username}${now.valueOf()}` );

    return fetch( "/login", {
        "method": "POST",
        "headers": { "Content-Type": "application/json" },
        "body": JSON.stringify( {
            "hmac": authenticationCode,
            "timestamp": now.toISOString(),
            username
        } )
    } );
}
Here's a sequence diagram for signing up.
Good question.
It does the exact same work you've always done, just using HMAC now.
In the case of account creation, nothing has changed.
It just so happens the secret we receive from the user is a bit longer (and more random) than usual.
Store it safely in your authentications database table associated to the username.
For logins, your API will be doing a bit more work.
Here's a bit of rough pseudo-code (unlikely to run, never tested) for a Node login back end.
import { createHmac, timingSafeEqual } from "crypto";

async function login( request, response ){
    var now = Date.now();
    var oldest = now - 30000;
    var { hmac, timestamp, username } = request.body;
    // Date.parse reads the ISO string produced by toISOString() on the front end
    var sentAt = Date.parse( timestamp );

    // Anything older than 30 seconds LOOKS LIKE A REPLAY ATTACK, so deny it
    if( Number.isNaN( sentAt ) || sentAt <= oldest ){
        return false;
    }

    // db is a Database abstraction layer here, use your favorite!
    var secretKey = await db.Authentications.getSecretKeyForUser( username );
    if( !secretKey ){
        return false;
    }

    var authenticator = createHmac( "sha512", secretKey );
    // Note the parity here with the front end, we are hashing the same message
    authenticator.update( `${username}${sentAt}` );

    var expected = authenticator.digest();
    var received = Buffer.from( String( hmac ), "hex" );

    // If the one we created matches the one from the front end, we're authenticated.
    // Compare in constant time; timingSafeEqual throws if the lengths differ.
    return received.length === expected.length && timingSafeEqual( received, expected );
}
How Is This Better?
Thanks for asking.
This is better than the standard authentication process for a number of reasons:
Better protection against password reuse.
You can't protect your user from other websites stealing their password and stuffing it into your website (other than disallowing known-breached passwords, a topic for another time), but you can add a layer of protection.
By only storing a hashed value based on Alice's secret passphrase, you reduce the potential of an attacker being able to steal your data and use it elsewhere.
This is effectively the exact same policy as never storing plaintext passwords, except we're doing the hashing on Alice's computer, which means...
Better protection against man-in-the-middle attacks.
Granted, you should be running your entire website over SSL/TLS with HSTS, but just in case you're not, or just in case someone manages to perform an SSL stripping or downgrade attack (far harder to pull off when HSTS is in place), HMAC further protects the all-important passphrase.
Since the only thing exiting Alice's web browser under most circumstances is the Authentication Code, it's a lot (years, maybe) of work to bruteforce that hash and extract the secret key.
Then, that key is only useful on your website, so it's even more work to bruteforce the key and extract the original passphrase.
Generally speaking, this kind of effort is not worth it to the average website attacker.
If you run a website with government, banking, or sensitive personal secrets, I assume this is all old news to you and you're probably doing something even better... right?
Better protection against replay attacks.
This is pretty straightforward.
Because we're mixing a timestamp into the message to be authenticated, an attacker has only one option to be able to pretend they're Alice after intercepting her login request: bruteforce the hash to extract the secret key, then update the timestamp and regenerate the message authentication code.
Again, this is generally more effort than it's worth.
Of course not.
Without a doubt, this is better than the standard login form that sends a username and password to an API.
However, it does still have some drawbacks.
bill1968 and password billsmith1968.
Yeah, you should do it.
It takes maybe an hour to set this stuff up, and the security gains are worth far, far more than an hour of your time.
This post could not be what it is without help from the following folks: