在N×N二进制矩阵中查找仅包含零的最大矩形


75

给定一个NxN二进制矩阵(仅包含0或1),我们如何才能找到包含全0的最大矩形?

例:

      I
    0 0 0 0 1 0
    0 0 1 0 0 1
II->0 0 0 0 0 0
    1 0 0 0 0 0
    0 0 0 0 0 1 <--IV
    0 0 1 0 0 0
            IV 

对于上面的示例,它是一个6×6二进制矩阵。在这种情况下,返回值将为Cell 1:(2,1)和Cell 2:(4,4)。生成的子矩阵可以是正方形或矩形。返回值也可以是所有0的最大子矩阵的大小,在此示例中为3×4。


1
请考虑将接受的答案更改为JF Sebastian的答案,该答案现已正确并且具有最佳的复杂性。
j_random_hacker 2011年

1
请检查非常类似(我会说一式两份)问题:stackoverflow.com/questions/7770945/...stackoverflow.com/a/7353193/684229。解决方法是O(n)
TMS 2012年

我正在尝试对任何方向的矩形执行相同的操作。看到问题:stackoverflow.com/questions/22604043/…–
克里斯·梅斯

@TMS实际上是相反的方式。这些问题是这一问题的重复。
tommy.carstensen

Answers:


45

这是一个基于@j_random_hacker在评论中建议的“直方图中的最大矩形”问题的解决方案:

[算法]通过迭代从上到下的行来工作,对于解决该问题的每一行,其中“直方图”中的“条”由从当前行开始的所有零连续向上的零尾组成(一列的高度为0 (如果当前行中有1)。

输入矩阵mat可以是任意可迭代的,例如文件或网络流。一次只需要一行。

#!/usr/bin/env python
from collections import namedtuple
from operator import mul

Info = namedtuple('Info', 'start height')

def max_size(mat, value=0):
    """Find height, width of the largest rectangle containing all `value`'s."""
    it = iter(mat)
    hist = [(el==value) for el in next(it, [])]
    max_size = max_rectangle_size(hist)
    for row in it:
        hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
        max_size = max(max_size, max_rectangle_size(hist), key=area)
    return max_size

def max_rectangle_size(histogram):
    """Find height, width of the largest rectangle that fits entirely under
    the histogram.
    """
    stack = []
    top = lambda: stack[-1]
    max_size = (0, 0) # height, width of the largest rectangle
    pos = 0 # current position in the histogram
    for pos, height in enumerate(histogram):
        start = pos # position where rectangle starts
        while True:
            if not stack or height > top().height:
                stack.append(Info(start, height)) # push
            elif stack and height < top().height:
                max_size = max(max_size, (top().height, (pos - top().start)),
                               key=area)
                start, _ = stack.pop()
                continue
            break # height == top().height goes here

    pos += 1
    for start, height in stack:
        max_size = max(max_size, (height, (pos - start)), key=area)    
    return max_size

def area(size):
    return reduce(mul, size)

解为O(N),其中N是矩阵中元素的数量。它需要O(ncols)额外的内存,这里ncols是矩阵中的列数。

最新版本的测试位于https://gist.github.com/776423


2
很好的尝试,但是失败了max_size([[0,0,0,0,1,1,1], [0,0,0,0,0,0,0], [0,0,0,1,1,1,1], [0,0,1,1,1,1,1]] + [[1,0,1,1,1,1,1]] * 3),当左上角有一个3x3的正方形时,返回(2,4)。
j_random_hacker 2011年

3
基本的问题是,像在这里所做的那样,仅跟踪相邻点的(几个)最大面积矩形并不总是足够的。我知道是正确的唯一O(N)算法是通过从上到下迭代遍历行来解决此问题的每一行:stackoverflow.com/questions/4311694/…,其中“直方图”中的“条”包括从当前行开始的所有零的连续向上轨迹(如果当前行的值为1,则列的高度为0)。
j_random_hacker 2011年

6
@j_random_hacker:我已经更新了答案,以使用基于“直方图”的算法。
jfs

4
看起来不错,但是,我实际上是在寻找最大的矩形(例如,返回坐标)。该算法将可靠地返回该区域,但是一旦我知道了,一个人怎么会发现3列x 2行矩形的位置,其左上角为[3,5]?
JBWhitmore 2014年

1
从哪里获得绑定列信息?(矩形的左列或右列?)。我们可以从中获取宽度和高度max_rectangle_size,并从for row in it:迭代中获取最下面一行,但是我找不到边界列信息。
manatttta

30

请查看“直方图”下的“最大化矩形区域”,然后继续阅读下面的解决方案。

Traverse the matrix once and store the following;

For x=1 to N and y=1 to N    
F[x][y] = 1 + F[x][y-1] if A[x][y] is 0 , else 0

Then for each row for x=N to 1 
We have F[x] -> array with heights of the histograms with base at x.
Use O(N) algorithm to find the largest area of rectangle in this histogram = H[x]

From all areas computed, report the largest.

时间复杂度为O(N * N)= O(N²)(对于NxN二进制矩阵)

例:

Initial array    F[x][y] array
 0 0 0 0 1 0     1 1 1 1 0 1
 0 0 1 0 0 1     2 2 0 2 1 0
 0 0 0 0 0 0     3 3 1 3 2 1
 1 0 0 0 0 0     0 4 2 4 3 2
 0 0 0 0 0 1     1 5 3 5 4 0
 0 0 1 0 0 0     2 6 0 6 5 1

 For x = N to 1
 H[6] = 2 6 0 6 5 1 -> 10 (5*2)
 H[5] = 1 5 3 5 4 0 -> 12 (3*4)
 H[4] = 0 4 2 4 3 2 -> 10 (2*5)
 H[3] = 3 3 1 3 2 1 -> 6 (3*2)
 H[2] = 2 2 0 2 1 0 -> 4 (2*2)
 H[1] = 1 1 1 1 0 1 -> 4 (1*4)

 The largest area is thus H[5] = 12

例子很好的解释
Peter

1
您确定这是O(N * N)吗?整个矩阵有两遍,但我的印象是这是O(N)。
克里斯·梅斯

很好的解释.. :)我也希望您能解释“最大化直方图下方的矩形区域”。:D
tumbudu

1
更清楚地说。解为O(N * N),其中N是行/列中的项目数,因为问题表明输入的大小为NxN。如果N是输入项的总数,则为O(N)
user2469515 16-4-17

12

这是一个Python3解决方案,它除了返回最大矩形的面积之外,还返回位置:

#!/usr/bin/env python3

import numpy

s = '''0 0 0 0 1 0
0 0 1 0 0 1
0 0 0 0 0 0
1 0 0 0 0 0
0 0 0 0 0 1
0 0 1 0 0 0'''

nrows = 6
ncols = 6
skip = 1
area_max = (0, [])

a = numpy.fromstring(s, dtype=int, sep=' ').reshape(nrows, ncols)
w = numpy.zeros(dtype=int, shape=a.shape)
h = numpy.zeros(dtype=int, shape=a.shape)
for r in range(nrows):
    for c in range(ncols):
        if a[r][c] == skip:
            continue
        if r == 0:
            h[r][c] = 1
        else:
            h[r][c] = h[r-1][c]+1
        if c == 0:
            w[r][c] = 1
        else:
            w[r][c] = w[r][c-1]+1
        minw = w[r][c]
        for dh in range(h[r][c]):
            minw = min(minw, w[r-dh][c])
            area = (dh+1)*minw
            if area > area_max[0]:
                area_max = (area, [(r-dh, c-minw+1, r, c)])

print('area', area_max[0])
for t in area_max[1]:
    print('Cell 1:({}, {}) and Cell 2:({}, {})'.format(*t))

输出:

area 12
Cell 1:(2, 1) and Cell 2:(4, 4)

很棒!我从中制作出了一个Fortran版本,并将其编译为可在Python中使用,因为这样遍历Python中的大型数组非常缓慢。
杰森

4

这是将JF Sebastians方法转换为C#:

private Vector2 MaxRectSize(int[] histogram) {
        Vector2 maxSize = Vector2.zero;
        int maxArea = 0;
        Stack<Vector2> stack = new Stack<Vector2>();

        int x = 0;
        for (x = 0; x < histogram.Length; x++) {
            int start = x;
            int height = histogram[x];
            while (true) {
                if (stack.Count == 0 || height > stack.Peek().y) {
                    stack.Push(new Vector2(start, height));

                } else if(height < stack.Peek().y) {
                    int tempArea = (int)(stack.Peek().y * (x - stack.Peek().x));
                    if(tempArea > maxArea) {
                        maxSize = new Vector2(stack.Peek().y, (x - stack.Peek().x));
                        maxArea = tempArea;
                    }

                    Vector2 popped = stack.Pop();
                    start = (int)popped.x;
                    continue;
                }

                break;
            }
        }

        foreach (Vector2 data in stack) {
            int tempArea = (int)(data.y * (x - data.x));
            if(tempArea > maxArea) {
                maxSize = new Vector2(data.y, (x - data.x));
                maxArea = tempArea;
            }
        }

        return maxSize;
    }

    public Vector2 GetMaximumFreeSpace() {
        // STEP 1:
        // build a seed histogram using the first row of grid points
        // example: [true, true, false, true] = [1,1,0,1]
        int[] hist = new int[gridSizeY];
        for (int y = 0; y < gridSizeY; y++) {
            if(!invalidPoints[0, y]) {
                hist[y] = 1;
            }
        }

        // STEP 2:
        // get a starting max area from the seed histogram we created above.
        // using the example from above, this value would be [1, 1], as the only valid area is a single point.
        // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3.
        // Note that at this step, the heigh fo the found rectangle will always be 1 because we are operating on
        // a single row of data.
        Vector2 maxSize = MaxRectSize(hist);
        int maxArea = (int)(maxSize.x * maxSize.y);

        // STEP 3:
        // build histograms for each additional row, re-testing for new possible max rectangluar areas
        for (int x = 1; x < gridSizeX; x++) {
            // build a new histogram for this row. the values of this row are
            // 0 if the current grid point is occupied; otherwise, it is 1 + the value
            // of the previously found historgram value for the previous position. 
            // What this does is effectly keep track of the height of continous avilable spaces.
            // EXAMPLE:
            //      Given the following grid data (where 1 means occupied, and 0 means free; for clairty):
            //          INPUT:        OUTPUT:
            //      1.) [0,0,1,0]   = [1,1,0,1]
            //      2.) [0,0,1,0]   = [2,2,0,2]
            //      3.) [1,1,0,1]   = [0,0,1,0]
            //
            //  As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous
            //  free space.
            for (int y = 0; y < gridSizeY; y++) {                
                if(!invalidPoints[x, y]) {
                    hist[y] = 1 + hist[y];
                } else {
                    hist[y] = 0;
                }
            }

            // find the maximum size of the current histogram. If it happens to be larger
            // that the currently recorded max size, then it is the new max size.
            Vector2 maxSizeTemp = MaxRectSize(hist);
            int tempArea = (int)(maxSizeTemp.x * maxSizeTemp.y);
            if (tempArea > maxArea) {
                maxSize = maxSizeTemp;
                maxArea = tempArea;
            }
        }

        // at this point, we know the max size
        return maxSize;            
    }

有关此注意事项:

  1. 该版本适用于Unity API。通过使用KeyValuePair替换Vector2的实例,可以轻松地使它更通用。Vector2仅用于方便的方式来存储两个值。
  2. invalidPoints []是布尔数组,其中true表示网格点处于“使用中”,false表示不是。

3

具有空间复杂度O(列)[也可以修改为O(行)]和时间复杂度O(行*列)的解决方案

public int maximalRectangle(char[][] matrix) {
    int m = matrix.length;
    if (m == 0)
        return 0;
    int n = matrix[0].length;
    int maxArea = 0;
    int[] aux = new int[n];
    for (int i = 0; i < n; i++) {
        aux[i] = 0;
    }
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            aux[j] = matrix[i][j] - '0' + aux[j];
            maxArea = Math.max(maxArea, maxAreaHist(aux));
        }
    }
    return maxArea;
}

public int maxAreaHist(int[] heights) {
    int n = heights.length;
    Stack<Integer> stack = new Stack<Integer>();
    stack.push(0);
    int maxRect = heights[0];
    int top = 0;
    int leftSideArea = 0;
    int rightSideArea = heights[0];
    for (int i = 1; i < n; i++) {
        if (stack.isEmpty() || heights[i] >= heights[stack.peek()]) {
            stack.push(i);
        } else {
            while (!stack.isEmpty() && heights[stack.peek()] > heights[i]) {
                top = stack.pop();
                rightSideArea = heights[top] * (i - top);
                leftSideArea = 0;
                if (!stack.isEmpty()) {
                    leftSideArea = heights[top] * (top - stack.peek() - 1);
                } else {
                    leftSideArea = heights[top] * top;
                }
                maxRect = Math.max(maxRect, leftSideArea + rightSideArea);
            }
            stack.push(i);
        }
    }
    while (!stack.isEmpty()) {
        top = stack.pop();
        rightSideArea = heights[top] * (n - top);
        leftSideArea = 0;
        if (!stack.isEmpty()) {
            leftSideArea = heights[top] * (top - stack.peek() - 1);
        } else {
            leftSideArea = heights[top] * top;
        }
        maxRect = Math.max(maxRect, leftSideArea + rightSideArea);
    }
    return maxRect;
}

但是当我在LeetCode上尝试时,我得到了Time Limite超出了预期。有没有更简单的解决方案?


简单易懂..谢谢!
Swadhikar '16

2

我提出了O(nxn)方法。

首先,您可以列出所有最大的空白矩形。空表示仅覆盖0。最大的空白矩形不能在不覆盖(至少)一个1的方向上延伸。

可以在www.ulg.ac.be/telecom/rectangles上找到介绍O(nxn)算法以创建此类列表的论文以及源代码(未优化)。无需存储列表,每次在算法找到一个矩形时调用一次回调函数就足够了,并且仅存储最大的一个(如果需要,可以选择另一个条件)。

请注意,存在一个证明(见论文),最大的空白矩形的数量由图像的像素数量(在这种情况下为nxn)限制。

因此,选择最佳矩形可以在O(nxn)中完成,整个方法也是O(nxn)。

实际上,这种方法非常快,并且用于实时视频流分析。


0

这是jfs解决方案的一个版本,它还提供了最大矩形的位置:

from collections import namedtuple
from operator import mul

Info = namedtuple('Info', 'start height')

def max_rect(mat, value=0):
    """returns (height, width, left_column, bottom_row) of the largest rectangle 
    containing all `value`'s.

    Example:
    [[0, 0, 0, 0, 0, 0, 0, 0, 3, 2],
     [0, 4, 0, 2, 4, 0, 0, 1, 0, 0],
     [1, 0, 1, 0, 0, 0, 3, 0, 0, 4],
     [0, 0, 0, 0, 4, 2, 0, 0, 0, 0],
     [0, 0, 0, 2, 0, 0, 0, 0, 0, 0],
     [4, 3, 0, 0, 1, 2, 0, 0, 0, 0],
     [3, 0, 0, 0, 2, 0, 0, 0, 0, 4],
     [0, 0, 0, 1, 0, 3, 2, 4, 3, 2],
     [0, 3, 0, 0, 0, 2, 0, 1, 0, 0]]
     gives: (3, 4, 6, 5)
    """
    it = iter(mat)
    hist = [(el==value) for el in next(it, [])]
    max_rect = max_rectangle_size(hist) + (0,)
    for irow,row in enumerate(it):
        hist = [(1+h) if el == value else 0 for h, el in zip(hist, row)]
        max_rect = max(max_rect, max_rectangle_size(hist) + (irow+1,), key=area)
        # irow+1, because we already used one row for initializing max_rect
    return max_rect

def max_rectangle_size(histogram):
    stack = []
    top = lambda: stack[-1]
    max_size = (0, 0, 0) # height, width and start position of the largest rectangle
    pos = 0 # current position in the histogram
    for pos, height in enumerate(histogram):
        start = pos # position where rectangle starts
        while True:
            if not stack or height > top().height:
                stack.append(Info(start, height)) # push
            elif stack and height < top().height:
                max_size = max(max_size, (top().height, (pos - top().start), top().start), key=area)
                start, _ = stack.pop()
                continue
            break # height == top().height goes here

    pos += 1
    for start, height in stack:
        max_size = max(max_size, (height, (pos - start), start), key=area)

    return max_size

def area(size):
    return size[0] * size[1]

0

为了完整起见,这里是输出矩形坐标的C#版本。它基于dmarra的答案,但没有任何其他依赖关系。只有布尔函数GetPixel(int x,int y),当在坐标x,y处设置像素时返回true。

    public struct INTRECT
    {
        public int Left, Right, Top, Bottom;

        public INTRECT(int aLeft, int aTop, int aRight, int aBottom)
        {
            Left = aLeft;
            Top = aTop;
            Right = aRight;
            Bottom = aBottom;
        }

        public int Width { get { return (Right - Left + 1); } }

        public int Height { get { return (Bottom - Top + 1); } }

        public bool IsEmpty { get { return Left == 0 && Right == 0 && Top == 0 && Bottom == 0; } }

        public static bool operator ==(INTRECT lhs, INTRECT rhs)
        {
            return lhs.Left == rhs.Left && lhs.Top == rhs.Top && lhs.Right == rhs.Right && lhs.Bottom == rhs.Bottom;
        }

        public static bool operator !=(INTRECT lhs, INTRECT rhs)
        {
            return !(lhs == rhs);
        }

        public override bool Equals(Object obj)
        {
            return obj is INTRECT && this == (INTRECT)obj;
        }

        public bool Equals(INTRECT obj)
        {
            return this == obj;
        }

        public override int GetHashCode()
        {
            return Left.GetHashCode() ^ Right.GetHashCode() ^ Top.GetHashCode() ^ Bottom.GetHashCode();
        }
    }

    public INTRECT GetMaximumFreeRectangle()
    {
        int XEnd = 0;
        int YStart = 0;
        int MaxRectTop = 0;
        INTRECT MaxRect = new INTRECT();
        // STEP 1:
        // build a seed histogram using the first row of grid points
        // example: [true, true, false, true] = [1,1,0,1]
        int[] hist = new int[Height];
        for (int y = 0; y < Height; y++)
        {
            if (!GetPixel(0, y))
            {
                hist[y] = 1;
            }
        }

        // STEP 2:
        // get a starting max area from the seed histogram we created above.
        // using the example from above, this value would be [1, 1], as the only valid area is a single point.
        // another example for [0,0,0,1,0,0] would be [1, 3], because the largest area of contiguous free space is 3.
        // Note that at this step, the heigh fo the found rectangle will always be 1 because we are operating on
        // a single row of data.
        Tuple<int, int> maxSize = MaxRectSize(hist, out YStart);
        int maxArea = (int)(maxSize.Item1 * maxSize.Item2);
        MaxRectTop = YStart;
        // STEP 3:
        // build histograms for each additional row, re-testing for new possible max rectangluar areas
        for (int x = 1; x < Width; x++)
        {
            // build a new histogram for this row. the values of this row are
            // 0 if the current grid point is occupied; otherwise, it is 1 + the value
            // of the previously found historgram value for the previous position. 
            // What this does is effectly keep track of the height of continous avilable spaces.
            // EXAMPLE:
            //      Given the following grid data (where 1 means occupied, and 0 means free; for clairty):
            //          INPUT:        OUTPUT:
            //      1.) [0,0,1,0]   = [1,1,0,1]
            //      2.) [0,0,1,0]   = [2,2,0,2]
            //      3.) [1,1,0,1]   = [0,0,1,0]
            //
            //  As such, you'll notice position 1,0 (row 1, column 0) is 2, because this is the height of contiguous
            //  free space.
            for (int y = 0; y < Height; y++)
            {
                if (!GetPixel(x, y))
                {
                    hist[y]++;
                }
                else
                {
                    hist[y] = 0;
                }
            }

            // find the maximum size of the current histogram. If it happens to be larger
            // that the currently recorded max size, then it is the new max size.
            Tuple<int, int> maxSizeTemp = MaxRectSize(hist, out YStart);
            int tempArea = (int)(maxSizeTemp.Item1 * maxSizeTemp.Item2);
            if (tempArea > maxArea)
            {
                maxSize = maxSizeTemp;
                maxArea = tempArea;
                MaxRectTop = YStart;
                XEnd = x;
            }
        }
        MaxRect.Left = XEnd - maxSize.Item1 + 1;
        MaxRect.Top = MaxRectTop;
        MaxRect.Right = XEnd;
        MaxRect.Bottom = MaxRectTop + maxSize.Item2 - 1;

        // at this point, we know the max size
        return MaxRect;
    }

    private Tuple<int, int> MaxRectSize(int[] histogram, out int YStart)
    {
        Tuple<int, int> maxSize = new Tuple<int, int>(0, 0);
        int maxArea = 0;
        Stack<Tuple<int, int>> stack = new Stack<Tuple<int, int>>();
        int x = 0;
        YStart = 0;
        for (x = 0; x < histogram.Length; x++)
        {
            int start = x;
            int height = histogram[x];
            while (true)
            {
                if (stack.Count == 0 || height > stack.Peek().Item2)
                {
                    stack.Push(new Tuple<int, int>(start, height));
                }
                else if (height < stack.Peek().Item2)
                {
                    int tempArea = (int)(stack.Peek().Item2 * (x - stack.Peek().Item1));
                    if (tempArea > maxArea)
                    {
                        YStart = stack.Peek().Item1;
                        maxSize = new Tuple<int, int>(stack.Peek().Item2, (x - stack.Peek().Item1));
                        maxArea = tempArea;
                    }
                    Tuple<int, int> popped = stack.Pop();
                    start = (int)popped.Item1;
                    continue;
                }
                break;
            }
        }

        foreach (Tuple<int, int> data in stack)
        {
            int tempArea = (int)(data.Item2 * (x - data.Item1));
            if (tempArea > maxArea)
            {
                YStart = data.Item1;
                maxSize = new Tuple<int, int>(data.Item2, (x - data.Item1));
                maxArea = tempArea;
            }
        }

        return maxSize;
    }
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.