The Berlekamp-Massey algorithm, (or as abbreviated, the BM algorithm) solves the following problem:
Given a sequence of length , find a shortest LFSR, such that it produces the sequence in the first symbols.
For instance, assuming a binary sequence, (the 8 denotes the length of the sequence) is produced by an LFSR with connection polynomial over . This the shortest length of an LFSR that reproduces that sequence. We should note here that we totally ignore what it produces after the 8th symbol – it is not important – we only care about the first 8 symbols.
The main idea of the BM algorithm is that we “make a guess” about what LFSR we are having. We then simulate the LFSR up to symbol and see if symbol is correct – if it is not, we alter the LFSR connection polynomial so that it produces a correct symbol. A very important remark is that our correction cannot cause previous symbols to be incorrect, because then we would have to start all over again. This makes the BM algorithm a so-called exact greedy algorithm (it produces an optimal solution that, however, needs not to be unique unless the length of the solution is less than or equal to half the length of the sequence).
In essence, we can summarize the algorithm into a single for-loop:
- Simulate LFSR up to symbol . Denote the sequence .
- correct the LFSR so that it agrees.
Let us focus on 1.2. The false case of the if-statement is pretty obvious – if the LFSR works, do not fix it.
First of all, note the following: each symbol in a sequence produced by an LFSR must fulfill a linear relation with some other symbols in the sequence. This linear relation is given by the connection polynomial.
Consider the LFSR given by , from the example above. Each symbol must fulfill the equation , except for the symbols that is the starting state of the LFSR. From this, we can define the discrepancy function , where for all .
Assume that we have a determined the LFSR that produces the correct sequence up to symbol . This means we have a series for discrepancies, namely . Clearly, the last one needs to eliminated. How do we do this?
We could, theoretically, change the discrepancy function (and, thus, the connection polynomial correspondingly) to , where if and otherwise. How do we construct such a function (that also is of lowest possible degree)?
Assume that we have made this correction before. This means, that at some point, we had an index , where the discrepancy function series was .
Let’s say that we add this series onto , but we delay it steps. This means that . Here, we have found a function that agree with the properties of .
Here’s a catch: we might have had several discrepancies and therefore several equations for the discrepancy. Which equation should we use? First note that we need to delay the function steps if we use the discrepancy at index . Also note that delay is equal to complexity, which we, in turn, want to reduce. Therefore, we want to pick as large as possible, as this, by induction assumption will be minimal.
Note that by changing discrepancy function, we change the connection polynomial. Thus,
And this is the correction step of the BM algorithm.
Note: For the BM algorithm to work on arbitrary field , instead of adding the previous discrepancy symbol , we have to match it with the current discrepancy
The following fixes the problem:
Thus, the correction goes .
Related: Correlation attacks. See for instance this paper.