Strassen's Matrix Multiplication Explained

Strassen's algorithm is an efficient algorithm for matrix multiplication. It was developed by Volker Strassen in 1969 and provides a faster alternative to the standard matrix multiplication algorithm, especially for large matrices. It is a classic example of the Divide & Conquer paradigm.

Algorithm

Strassen's algorithm works by recursively dividing the input matrices into smaller submatrices and performing a series of seven multiplications on these submatrices, along with some additions and subtractions. This reduces the number of multiplications compared to the standard algorithm, which leads to a lower time complexity.

Let A and B be two n \\times n matrices that we want to multiply to get the result matrix C \= A \\times B. If n is not a power of 2, the matrices are padded with rows and columns of zeros.The algorithm can be summarized as follows:

  1. Divide: Divide the matrices A and B into four n/2 \\times n/2 submatrices each: A \= \\begin\{bmatrix\} A\_\{11\} & A\_\{12\} \\\\ A\_\{21\} & A\_\{22\} \\end\{bmatrix\}, \\quad B \= \\begin\{bmatrix\} B\_\{11\} & B\_\{12\} \\\\ B\_\{21\} & B\_\{22\} \\end\{bmatrix\}
  2. Compute Seven Products: Compute the following seven matrix products: $ \begin{aligned} P_1 &= (A_{11} + A_{22}) \times (B_{11} + B_{22}) \\ P_2 &= (A_{21} + A_{22}) \times B_{11} \\ P_3 &= A_{11} \times (B_{12} - B_{22}) \\ P_4 &= A_{22} \times (B_{21} - B_{11}) \\ P_5 &= (A_{11} + A_{12}) \times B_{22} \\ P_6 &= (A_{21} - A_{11}) \times (B_{11} + B_{12}) \\ P_7 &= (A_{12} - A_{22}) \times (B_{21} + B_{22}) \end{aligned} $
  3. Combine: Compute the submatrices of the result matrix C: $ \begin{aligned} C_{11} &= P_1 + P_4 - P_5 + P_7 \\ C_{12} &= P_3 + P_5 \\ C_{21} &= P_2 + P_4 \\ C_{22} &= P_1 - P_2 + P_3 + P_6 \end{aligned} $
  4. Recurse: Recursively apply the algorithm to the n/2 \\times n/2 matrices until the matrices are small enough to be multiplied using the standard method.

Mathematics

The key idea behind Strassen's algorithm is to reduce the number of multiplications. The standard matrix multiplication algorithm requires eight multiplications of n/2 \\times n/2 matrices. Strassen's algorithm reduces this to seven, at the cost of some extra additions and subtractions.

Let T\(n\) be the time complexity of multiplying two n \\times n matrices using Strassen's algorithm. The recurrence relation for T\(n\) is:

T\(n\) \= 7T\(n/2\) \+ O\(n^2\)

where 7T\(n/2\) represents the seven recursive calls to multiply n/2 \\times n/2 matrices, and O\(n^2\) represents the time taken for the additions and subtractions.

Using the Master Theorem, we can solve this recurrence relation:

In the Master Theorem: a \= 7, b \= 2, and f\(n\) \= O\(n^2\).

We compare f\(n\) \= n^2 with n^\{\\log\_b a\} \= n^\{\\log\_2 7\} \\approx n^\{2\.807\}.

Since f\(n\) \= O\(n^2\) \= O\(n^\{\\log\_2 7 \- \\epsilon\}\) for some \\epsilon \> 0 (e.g., \\epsilon \\approx 0\.807), we fall into case 1 of the Master Theorem.

Therefore, T\(n\) \= \\Theta\(n^\{\\log\_2 7\}\) \\approx \\Theta\(n^\{2\.807\}\).

This is asymptotically faster than the O\(n^3\) complexity of the standard matrix multiplication algorithm.

Time and Memory Complexity

  • Time Complexity: The time complexity of Strassen's algorithm is \\Theta\(n^\{\\log\_2 7\}\) \\approx \\Theta\(n^\{2\.807\}\). This is asymptotically faster than the O\(n^3\) time complexity of the standard matrix multiplication algorithm.
  • Memory Complexity: The memory complexity of Strassen's algorithm is determined by the depth of the recursion and the temporary storage used for intermediate matrices. The depth of the recursion is O\(\\log n\), and at each level, we store a constant number of matrices of size \(n/2\)^2. Therefore, the memory complexity is O\(n^2 \\log n\).

Easy Problems

Problem 1: Matrix Addition (Conceptual)

Problem Statement (Conceptual): While not Strassen's, matrix addition is a fundamental operation. Given two n \\times n matrices, add them. This highlights the basic matrix operations needed within Strassen's.

Input Example:


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

Output Example:


C = [[6, 8], [10, 12]]

C++ Code:


#include <iostream>
#include <vector>

std::vector<std::vector<int>> matrixAdd(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
    return C;
}

int main() {
    std::vector<std::vector<int>> A = {{1, 2}, {3, 4}};
    std::vector<std::vector<int>> B = {{5, 6}, {7, 8}};
    std::vector<std::vector<int>> C = matrixAdd(A, B);
    for (const auto& row : C) {
        for (int val : row) {
            std::cout << val << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}

Problem 2: Matrix Subtraction (Conceptual)

Problem Statement (Conceptual): Given two n \\times n matrices, subtract them. This is another basic matrix operation used within Strassen's.

Input Example:


A = [[5, 6], [7, 8]]
B = [[1, 2], [3, 4]]

Output Example:


C = [[4, 4], [4, 4]]

C++ Code:


#include <iostream>
#include <vector>

std::vector<std::vector<int>> matrixSubtract(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            C[i][j] = A[i][j] - B[i][j];
        }
    }
    return C;
}

int main() {
    std::vector<std::vector<int>> A = {{5, 6}, {7, 8}};
    std::vector<std::vector<int>> B = {{1, 2}, {3, 4}};
    std::vector<std::vector<int>> C = matrixSubtract(A, B);
    for (const auto& row : C) {
        for (int val : row) {
            std::cout << val << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}

Intermediate Problems

Problem 3: Standard Matrix Multiplication (Divide and Conquer)

Problem Statement: Implement standard matrix multiplication using a divide-and-conquer approach. This will serve as a baseline for comparison with Strassen's. While still O\(n^3\), it demonstrates the divide-and-conquer strategy.

Input:


A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[16, 15, 14, 13], [12, 11, 10, 9], [8, 7, 6, 5], [4, 3, 2, 1]]

Output:


C = [[80, 70, 60, 50], [248, 214, 180, 146], [416, 358, 300, 242], [584, 502, 420, 338]]

C++ Code:


#include <iostream>
#include <vector>

// Function to split a matrix into four submatrices
void splitMatrix(const std::vector<std::vector<int>>& A, std::vector<std::vector<int>>& A11, std::vector<std::vector<int>>& A12,
                 std::vector<std::vector<int>>& A21, std::vector<std::vector<int>>& A22) {
    int n = A.size();
    int half = n / 2;

    for (int i = 0; i < half; ++i) {
        for (int j = 0; j < half; ++j) {
            A11[i][j] = A[i][j];
            A12[i][j] = A[i][j + half];
            A21[i][j] = A[i + half][j];
            A22[i][j] = A[i + half][j + half];
        }
    }
}

// Function to combine four submatrices into a single matrix
void combineMatrix(const std::vector<std::vector<int>>& C11, const std::vector<std::vector<int>>& C12,
                   const std::vector<std::vector<int>>& C21, const std::vector<std::vector<int>>& C22,
                   std::vector<std::vector<int>>& C) {
    int n = C11.size();
    int half = n;

    for (int i = 0; i < half; ++i) {
        for (int j = 0; j < half; ++j) {
            C[i][j] = C11[i][j];
            C[i][j + half] = C12[i][j];
            C[i + half][j] = C21[i][j];
            C[i + half][j + half] = C22[i][j];
        }
    }
}

// Function to perform standard matrix multiplication using divide and conquer
std::vector<std::vector<int>> matrixMultiplyDivideAndConquer(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n, 0));

    if (n == 1) {
        C[0][0] = A[0][0] * B[0][0];
        return C;
    }

    int half = n / 2;

    // Divide matrices A and B into submatrices
    std::vector<std::vector<int>> A11(half, std::vector<int>(half));
    std::vector<std::vector<int>> A12(half, std::vector<int>(half));
    std::vector<std::vector<int>> A21(half, std::vector<int>(half));
    std::vector<std::vector<int>> A22(half, std::vector<int>(half));
    std::vector<std::vector<int>> B11(half, std::vector<int>(half));
    std::vector<std::vector<int>> B12(half, std::vector<int>(half));
    std::vector<std::vector<int>> B21(half, std::vector<int>(half));
    std::vector<std::vector<int>> B22(half, std::vector<int>(half));

    splitMatrix(A, A11, A12, A21, A22);
    splitMatrix(B, B11, B12, B21, B22);

    // Recursively compute the eight products
    std::vector<std::vector<int>> C11 = matrixMultiplyDivideAndConquer(A11, B11) + matrixMultiplyDivideAndConquer(A12, B21);
    std::vector<std::vector<int>> C12 = matrixMultiplyDivideAndConquer(A11, B12) + matrixMultiplyDivideAndConquer(A12, B22);
    std::vector<std::vector<int>> C21 = matrixMultiplyDivideAndConquer(A21, B11) + matrixMultiplyDivideAndConquer(A22, B21);
    std::vector<std::vector<int>> C22 = matrixMultiplyDivideAndConquer(A21, B12) + matrixMultiplyDivideAndConquer(A22, B22);

    // Combine the results
    combineMatrix(C11, C12, C21, C22, C);
    return C;
}

// Overload the + operator for matrix addition
std::vector<std::vector<int>> operator+(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
    return C;
}

int main() {
    std::vector<std::vector<int>> A = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}, {13, 14, 15, 16}};
    std::vector<std::vector<int>> B = {{16, 15, 14, 13}, {12, 11, 10, 9}, {8, 7, 6, 5}, {4, 3, 2, 1}};
    std::vector<std::vector<int>> C = matrixMultiplyDivideAndConquer(A, B);

    for (const auto& row : C) {
        for (int val : row) {
            std::cout << val << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}

Problem 4: Strassen's Matrix Multiplication (2x2 Matrices)

Problem Statement: Implement Strassen's matrix multiplication algorithm for 2x2 matrices. This simplifies the recursion to a single level, making it easier to understand the core logic.

Input:


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

Output:


C = [[19, 22], [43, 50]]

C++ Code:


#include <iostream>
#include <vector>

// Function to perform matrix addition
std::vector<std::vector<int>> matrixAdd(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            C[i][j] = A[i][j] + B[i][j];
        }
    }
    return C;
}

// Function to perform matrix subtraction
std::vector<std::vector<int>> matrixSubtract(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    int n = A.size();
    std::vector<std::vector<int>> C(n, std::vector<int>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            C[i][j] = A[i][j] - B[i][j];
        }
    }
    return C;
}

// Function to perform Strassen's matrix multiplication for 2x2 matrices
std::vector<std::vector<int>> strassen2x2(const std::vector<std::vector<int>>& A, const std::vector<std::vector<int>>& B) {
    std::vector<std::vector<int>> C(2, std::vector<int>(2));

    int a11 = A[0][0]; int a12 = A[0][1];
    int a21 = A[1][0]; int a22 = A[1][1];
    int b11 = B[0][0]; int b12 = B[0][1];
    int b21 = B[1][0]; int b22 = B[1][1];

    int p1 = (a11 + a22) * (b11 + b22);
    int p2 = (a21 + a22) * b11;
    int p3 = a11 * (b12 - b22);
    int p4 = a22 * (b21 - b11);
    int p5 = (a11 + a12) * b22;
    int p6 = (a21 - a11) * (b11 + b12);
    int p7 = (a12 - a22) * (b21 + b22);

    C[0][0] = p1 + p4 - p5 + p7;
    C[0][1] = p3 + p5;
    C[1][0] = p2 + p4;
    C[1][1] = p1 - p2 + p3 + p6;

    return C;
}

int main() {
    std::vector<std::vector<int>> A = {{1, 2}, {3, 4}};
    std::vector<std::vector<int>> B = {{5, 6}, {7, 8}};
    std::vector<std::vector<int>> C = strassen2x2(A, B);

    for (const auto& row : C) {
        for (int val : row) {
            std::cout << val << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}

Hard Problem

Problem 5: Strassen's Matrix Multiplication (General Case)

Problem Statement: Implement Strassen's matrix multiplication algorithm for general n \\times n matrices, where n is a power of 2. Optimize your implementation for performance.

Input


n
A[n][n] (the first n x n matrix)
B[n][n] (the second n x n matrix)

where $n$ is a power of 2 (e.g., 2, 4, 8, 16, ...).

Output


C[n][n] (the product of A and B)

The resulting $n \times n$ matrix $C = A \times B$, with elements $C_{ij} = \sum_{k=0}^{n-1} A_{ik} B_{kj}$.

Input Example


2
1 2
3 4
5 6
7 8

Output Example


19 22
43 50

C++ Code Solution


#include <iostream>
#include <vector>
#include <cmath>

std::vector<std::vector<int>> multiply(const std::vector<std::vector<int>&>& a, const std::vector<std::vector<int>&>& b) {
    int n = a.size();
    std::vector<std::vector<int>> result(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            for (int k = 0; k < n; ++k) {
                result[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return result;
}

std::vector<std::vector<int>> add(const std::vector<std::vector<int>&>& a, const std::vector<std::vector<int>&>& b) {
    int n = a.size();
    std::vector<std::vector<int>> result(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            result[i][j] = a[i][j] + b[i][j];
        }
    }
    return result;
}

std::vector<std::vector<int>> subtract(const std::vector<std::vector<int>&>& a, const std::vector<std::vector<int>&>& b) {
    int n = a.size();
    std::vector<std::vector<int>> result(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            result[i][j] = a[i][j] - b[i][j];
        }
    }
    return result;
}

std::vector<std::vector<int>> getSubmatrix(const std::vector<std::vector<int>&>& matrix, int row_start, int col_start, int size) {
    std::vector<std::vector<int>> sub(size, std::vector<int>(size));
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            sub[i][j] = matrix[row_start + i][col_start + j];
        }
    }
    return sub;
}

std::vector<std::vector<int>> combineSubmatrices(const std::vector<std::vector<int>&>& c11, const std::vector<std::vector<int>&>& c12,
                                              const std::vector<std::vector<int>&>& c21, const std::vector<std::vector<int>&>& c22) {
    int size = c11.size();
    int n = 2 * size;
    std::vector<std::vector<int>> result(n, std::vector<int>(n));
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            result[i][j] = c11[i][j];
            result[i][j + size] = c12[i][j];
            result[i + size][j] = c21[i][j];
            result[i + size][j + size] = c22[i][j];
        }
    }
    return result;
}

std::vector<std::vector<int>> strassenMultiply(const std::vector<std::vector<int>&>& a, const std::vector<std::vector<int>&>& b) {
    int n = a.size();
    if (n <= 64) { // Base case for small matrices, use standard multiplication
        return multiply(a, b);
    }

    int half = n / 2;

    std::vector<std::vector<int>> a11 = getSubmatrix(a, 0, 0, half);
    std::vector<std::vector<int>> a12 = getSubmatrix(a, 0, half, half);
    std::vector<std::vector<int>> a21 = getSubmatrix(a, half, 0, half);
    std::vector<std::vector<int>> a22 = getSubmatrix(a, half, half, half);

    std::vector<std::vector<int>> b11 = getSubmatrix(b, 0, 0, half);
    std::vector<std::vector<int>> b12 = getSubmatrix(b, 0, half, half);
    std::vector<std::vector<int>> b21 = getSubmatrix(b, half, 0, half);
    std::vector<std::vector<int>> b22 = getSubmatrix(b, half, half, half);

    std::vector<std::vector<int>> p1 = strassenMultiply(add(a11, a22), add(b11, b22));
    std::vector<std::vector<int>> p2 = strassenMultiply(add(a21, a22), b11);
    std::vector<std::vector<int>> p3 = strassenMultiply(a11, subtract(b12, b22));
    std::vector<std::vector<int>> p4 = strassenMultiply(a22, subtract(b21, b11));
    std::vector<std::vector<int>> p5 = strassenMultiply(add(a11, a12), b22);
    std::vector<std::vector<int>> p6 = strassenMultiply(subtract(a21, a11), add(b11, b12));
    std::vector<std::vector<int>> p7 = strassenMultiply(subtract(a12, a22), add(b21, b22));

    std::vector<std::vector<int>> c11 = subtract(add(add(p1, p4), p7), p5);
    std::vector<std::vector<int>> c12 = add(p3, p5);
    std::vector<std::vector<int>> c21 = add(p2, p4);
    std::vector<std::vector<int>> c22 = subtract(add(add(p1, p3), p6), p2);

    return combineSubmatrices(c11, c12, c21, c22);
}

int main() {
    int n;
    std::cin >> n;
    std::vector<std::vector<int>> a(n, std::vector<int>(n));
    std::vector<std::vector<int>> b(n, std::vector<int>(n));

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            std::cin >> a[i][j];
        }
    }

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            std::cin >> b[i][j];
        }
    }

    std::vector<std::vector<int>> c = strassenMultiply(a, b);

    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            std::cout << c[i][j] << (j == n - 1 ? "" : " ");
        }
        std::cout << std::endl;
    }

    return 0;
}