Re2d类,Python 2
更新:添加了“ 9.对齐”问题。
我的方法是使用Python re模块进行搜索和匹配。Re2d类准备要处理的文本,执行re函数,并格式化结果以输出。
请注意,这不是一种全新的语言-它是标准的正则表达式语言,投影到2维中,并添加了用于额外2D模式的标志。
该类具有以下用法:
re2dobject = Re2d(<horizontal pattern>, [<vertical pattern>], [<flags>])
两种模式都是标准的线性文本RE模式。如果未提供垂直样式,则该班级也将使用水平样式进行垂直匹配。这些标志是带有2D扩展名的标准RE标志。
测试中
1. Finding chessboards
Chessboard pattern at (2, 1, 4, 3)
print '\n1. Finding chessboards'
reob1 = Re2d('#(_#)+_?|_(#_)+#?')
found = reob1.search('~______~\n~##_#_#~\n~#_#_##~\n~##_#_#~\n~______~')
print 'Chessboard pattern at', found
assert not reob1.search('#_##\n_#_#\n__#_\n#_#_\n#_#_')
搜索方法找到了一个棋盘图案并返回一个四元组位置。元组具有x,y
匹配的第一个字符的位置和匹配
width, height
的区域的位置。仅给出一种模式,因此它将用于水平和垂直匹配。
2. Verifying chessboards
Is chess? True
print '\n2. Verifying chessboards'
reob2 = Re2d('^#(_#)*_?|_(#_)*#?$')
print 'Is chess?', reob2.match('_#_#_#_#\n#_#_#_#_\n_#_#_#_#')
assert not reob2.match('_#_#_#__\n__#_#_#_\n_#_#_#__')
棋盘使用match方法验证,该方法返回布尔值。注意,必须使用^
和以及$
开始和结束字符来匹配整个
文本。
3. Rectangle of digits
Found: [(0, 1, 5, 3), (1, 1, 4, 3), (2, 1, 3, 3), (3, 1, 2, 3), (0, 2, 5, 2), (1, 2, 4, 2), (2, 2, 3, 2), (3, 2, 2, 2), (6, 3, 2, 2)]
Not found: None
print '\n3. Rectangle of digits'
reob3 = Re2d(r'\d\d+', flags=MULTIFIND)
print 'Found:', reob3.search('hbrewvgr\n18774gwe\n84502vgv\n19844f22\ncrfegc77')
print 'Not found:', reob3.search('uv88wn000\nvgr88vg0w\nv888wrvg7\nvvg88wv77')
现在,我们使用该MULTIFIND
标志返回2位以上数字块的所有可能匹配项。该方法找到9个可能的匹配项。请注意,它们可以重叠。
4. Word search (orthogonal only)
Words: [(0, 0, 4, 1), (0, 3, 4, 1), (3, 3, -4, -1), (3, 2, -4, -1), (3, 0, -4, -1)] [(0, 0, 1, 4), (3, 0, 1, 4), (3, 3, -1, -4), (0, 3, -1, -4)]
Words: ['SNUG', 'WOLF', 'FLOW', 'LORE', 'GUNS'] ['S\nT\nE\nW', 'G\nO\nL\nF', 'F\nL\nO\nG', 'W\nE\nT\nS']
No words: [] []
print '\n4. Word search (orthogonal only)'
words = 'GOLF|GUNS|WOLF|FLOW|LORE|WETS|STEW|FLOG|SNUG'
flags = HORFLIP | VERFLIP | MULTIFIND
reob4a, reob4b = Re2d(words, '.', flags), Re2d('.', words, flags)
matching = 'SNUG\nTEQO\nEROL\nWOLF'
nomatch = 'ABCD\nEFGH\nIJKL\nMNOP'
print 'Words:', reob4a.search(matching), reob4b.search(matching)
print 'Words:', reob4a.findall(matching), reob4b.findall(matching)
print 'No words:', reob4a.findall(nomatch), reob4b.findall(nomatch)
此测试显示了垂直和水平翻转的使用。这允许匹配的单词被颠倒。不支持对角词。该
MULTIFIND
标志允许在所有4个方向上进行多个重叠匹配。findall方法使用搜索来找到匹配的框,然后提取匹配的文本块。请注意,搜索是如何使用负宽度和/或高度进行反向匹配的。垂直方向上的单词具有换行符-这与2D字符块的概念一致。
7. Calvins portals
Portals found: [(3, 1, 5, 6)]
Portal not found None
print '\n7. Calvins portals'
reob7 = Re2d(r'X\.{2,22}X|.X{2,22}.', r'X\.{3,22}X|.X{3,22}.', MULTIFIND)
yes = '....X......\n.XXXXXX.XX.\n...X...X...\n.X.X...XXX.\n...X...X.X.\n.XXX...X.X.\nX..XXXXX.X.'
no = 'XX..XXXX\nXX..X..X\nXX..X..X\n..X.X..X\n.X..X.XX'
print 'Portals found:', reob7.search(yes)
print 'Portal not found', reob7.search(no)
此搜索需要为每个维度使用单独的模式,因为每个维度的最小尺寸都不同。
9. Alignment
Found: ['#.,##', '##'] ['#\n.\n,\n.\n#', '#\n,\n.\n#']
Found: [(3, 4, 5, 1), (6, 4, 2, 1)] [(7, 0, 1, 5), (3, 1, 1, 4)]
Not found: None None
print '\n9. Alignment'
reob9a = Re2d(r'#.*#', r'.', MULTIFIND)
reob9b = Re2d(r'.', r'#.*#', MULTIFIND)
matching = '.,.,.,.#.,\n,.,#,.,.,.\n.,.,.,.,.,\n,.,.,.,.,.\n.,.#.,##.,\n,.,.,.,.,.'
nomatch = '.,.#.,.,\n,.,.,.#.\n.,#,.,.,\n,.,.,.,#\n.#.,.,.,\n,.,.#.,.\n#,.,.,.,\n,.,.,#,.'
print 'Found:', reob9a.findall(matching), reob9b.findall(matching)
print 'Found:', reob9a.search(matching), reob9b.search(matching)
print 'Not found:', reob9a.search(nomatch), reob9b.search(nomatch)
这组2个搜索找到2个垂直匹配和2个水平匹配,但找不到嵌入的#.,#
字符串。
10. Collinear Points (orthogonal only)
Found: [(0, 1, 7, 1)] [(3, 1, 1, 4)]
Not found: None None
print '\n10. Collinear Points (orthogonal only)'
matching = '........\n#..#..#.\n...#....\n#.......\n...#....'
nomatch = '.#..#\n#..#.\n#....\n..#.#'
reob10h = Re2d(r'#.*#.*#', '.')
reob10v = Re2d('.', r'#.*#.*#')
flags = MULTIFIND
print 'Found:', reob10h.search(matching, flags), reob10v.search(matching, flags)
print 'Not found:', reob10h.search(nomatch, flags), reob10v.search(nomatch, flags)
在这里,我们使用2个搜索来查找两个方向上的匹配项。它能够找到多个正交匹配,但是这种方法不支持对角匹配。
12. Avoid qQ
Found: (2, 2, 4, 4)
Not found: None
print '\n12. Avoid qQ'
reob12 = Re2d('[^qQ]{4,4}')
print 'Found:', reob12.search('bhtklkwt\nqlwQklqw\nvtvlwktv\nkQtwkvkl\nvtwlkvQk\nvnvevwvx')
print 'Not found:', reob12.search('zxvcmn\nxcvncn\nmnQxcv\nxcvmnx\nazvmne')
此搜索找到第一个匹配项。
13. Diamond Mining
.X.
X.X
.X.
.X.
X.X
.X.
..X..
./.\.
X...X
.\./.
\.X..
..X..
./.\.
X...X
.\./.
..X..
.XX.\
//.\.
X...X
.\./.
..X..
...X...
../.\..
./.X.\.
X.X.X.X
.\.X.//
..\./X.
.X.X..\
Diamonds: [(2, 2, 3, 3), (0, 6, 3, 3)] [(8, 0, 5, 5), (10, 2, 5, 5), (5, 3, 5, 5)] [(0, 0, 7, 7)]
Not found: None None None
print '\n13. Diamond Mining'
reob13a = Re2d(r'.X.|X.X', flags=MULTIFIND)
reob13b = Re2d(r'..X..|./.\\.|X...X|.\\./.', flags=MULTIFIND)
reob13c = Re2d(r'...X...|../.\\..|./...\\.|X.....X|.\\.../.|..\\./..', flags=MULTIFIND)
match = '''
...X......X....
../.\..../.\...
./.X.\..X...X..
X.X.X.XX.\./.\.
.\.X.//.\.X...X
..\./X...X.\./.
.X.X..\./...X..
X.X....X.......
.X.............
'''.strip().replace(' ', '')
nomatch = '''
.X......./....
.\....X.......
...X.\.\...X..
..X.\...\.X.\.
...X.X...X.\.X
../X\...\...X.
.X...\.\..X...
..\./.X....X..
...X..../.....
'''.strip().replace(' ', '')
for diamond in reob13a.findall(match)+reob13b.findall(match)+reob13c.findall(match):
print diamond+'\n'
print 'Diamonds:', reob13a.search(match), reob13b.search(match), reob13c.search(match)
print 'Not found:', reob13a.search(nomatch), reob13b.search(nomatch), reob13c.search(nomatch)
钻石问题更加困难。三种尺寸需要三个搜索对象。它可以在测试集中找到六颗钻石,但无法缩放为可变大小的钻石。这只是钻石问题的部分解决方案。
Python 2代码
import sys
import re
DEBUG = re.DEBUG
IGNORECASE = re.IGNORECASE
LOCALE = re.LOCALE
UNICODE = re.UNICODE
VERBOSE = re.VERBOSE
MULTIFIND = 1<<11
ROTATED = 1<<12 # not implemented
HORFLIP = 1<<13
VERFLIP = 1<<14
WRAPAROUND = 1<<15 # not implemented
class Re2d(object):
def __init__(self, horpattern, verpattern=None, flags=0):
self.horpattern = horpattern
self.verpattern = verpattern if verpattern != None else horpattern
self.flags = flags
def checkblock(self, block, flags):
'Return a position if block matches H and V patterns'
length = []
for y in range(len(block)):
match = re.match(self.horpattern, block[y], flags)
if match:
length.append(len(match.group(0)))
else:
break
if not length:
return None
width = min(length)
height = len(length)
length = []
for x in range(width):
column = ''.join(row[x] for row in block[:height])
match = re.match(self.verpattern, column, flags)
if match:
matchlen = len(match.group(0))
length.append(matchlen)
else:
break
if not length:
return None
height = min(length)
width = len(length)
# if smaller, verify with RECURSIVE checkblock call:
if height != len(block) or width != len(block[0]):
newblock = [row[:width] for row in block[:height]]
newsize = self.checkblock(newblock, flags)
return newsize
return width, height
def mkviews(self, text, flags):
'Return views of text block from flip/rotate flags, inc inverse f()'
# TODO add ROTATED to generate more views
width = len(text[0])
height = len(text)
views = [(text, lambda x,y,w,h: (x,y,w,h))]
if flags & HORFLIP and flags & VERFLIP:
flip2text = [row[::-1] for row in text[::-1]]
flip2func = lambda x,y,w,h: (width-1-x, height-1-y, -w, -h)
views.append( (flip2text, flip2func) )
elif flags & HORFLIP:
hortext = [row[::-1] for row in text]
horfunc = lambda x,y,w,h: (width-1-x, y, -w, h)
views.append( (hortext, horfunc) )
elif flags & VERFLIP:
vertext = text[::-1]
verfunc = lambda x,y,w,h: (x, height-1-y, w, -h)
views.append( (vertext, verfunc) )
return views
def searchview(self, textview, flags=0):
'Return matching textview positions or None'
result = []
for y in range(len(textview)):
testtext = textview[y:]
for x in range(len(testtext[0])):
size = self.checkblock([row[x:] for row in testtext], flags)
if size:
found = (x, y, size[0], size[1])
if flags & MULTIFIND:
result.append(found)
else:
return found
return result if result else None
def search(self, text, flags=0):
'Return matching text positions or None'
flags = self.flags | flags
text = text.split('\n') if type(text) == str else text
result = []
for textview, invview in self.mkviews(text, flags):
found = self.searchview(textview, flags)
if found:
if flags & MULTIFIND:
result.extend(invview(*f) for f in found)
else:
return invview(*found)
return result if result else None
def findall(self, text, flags=0):
'Return matching text blocks or None'
flags = self.flags | flags
strmode = (type(text) == str)
text = text.split('\n') if type(text) == str else text
result = []
positions = self.search(text, flags)
if not positions:
return [] if flags & MULTIFIND else None
if not flags & MULTIFIND:
positions = [positions]
for x0,y0,w,h in positions:
if y0+h >= 0:
lines = text[y0 : y0+h : cmp(h,0)]
else:
lines = text[y0 : : cmp(h,0)]
if x0+w >= 0:
block = [row[x0 : x0+w : cmp(w,0)] for row in lines]
else:
block = [row[x0 : : cmp(w,0)] for row in lines]
result.append(block)
if strmode:
result = ['\n'.join(rows) for rows in result]
if flags & MULTIFIND:
return result
else:
return result[0]
def match(self, text, flags=0):
'Return True if whole text matches the patterns'
flags = self.flags | flags
text = text.split('\n') if type(text) == str else text
for textview, invview in self.mkviews(text, flags):
size = self.checkblock(textview, flags)
if size:
return True
return False