Extract a substring from a string in Ruby using a regular expression

130

How can I extract a substring from within a string in Ruby?

Example:

String1 = "<name> <substring>"

I want to extract substring from String1 (i.e. everything between the last occurrence of < and >).

— Madhusudhan

134

String1.scan(/<([^>]*)>/).last.first

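scan creates an array which, for each <item> in String1, contains the text between the < and the > in a one-element array (because when used with a regex containing capture groups, scan returns an array containing the captures for each match). last gives you the last of those arrays, and first then gives you the string inside it.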
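
To make the intermediate values concrete, here is a small sketch of those steps using the sample string from the question. The local variable names are only for illustration.

```ruby
string1 = "<name> <substring>"

# With a capture group, scan returns one array of captures per match.
matches = string1.scan(/<([^>]*)>/)
# matches is [["name"], ["substring"]]

last_match = matches.last    # ["substring"]
result = last_match.first    # "substring"
```

If the string contains no <...> pair at all, scan returns an empty array, last returns nil, and the call to first raises NoMethodError, so a nil check may be needed for untrusted input.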

— sepp2k

319

"<name> <substring>"[/.*<([^>]*)/,1]
=> "substring"

There is no need to use scan if we need only one result.
There is no need to use Python's match when we have Ruby's String[regexp,#].

See: http://ruby-doc.org/core/String.html#method-i-5B-5D

Note: str[regexp, capture] → new_str or nil
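
A short sketch of how the two-argument form behaves, including the nil case from the note above. The second sample string is made up for illustration.

```ruby
str = "<name> <substring>"

# The greedy .* consumes as much as possible, so the capture group
# starts right after the last "<" in the string.
last_tag = str[/.*<([^>]*)/, 1]      # "substring"

# Index 0 returns the whole match instead of the capture group.
whole = str[/.*<([^>]*)/, 0]         # "<name> <substring"

# When nothing matches, the result is nil rather than an exception.
missing = "no brackets here"[/.*<([^>]*)/, 1]   # nil
```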

— Nakilon

37

There is no need to discredit other perfectly valid (and, dare I say, more readable) solutions.

— coreyward

41

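@coreyward, if it's better, please argue the case. For example, sepp2k's solution is more flexible, which is why I wrote "if we need only one result" in my answer. And match()[] is slower, because it is two method calls instead of one.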

— Nakilon

4

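This is the fastest of all the methods presented, but even the slowest one takes only 4.5 microseconds on my machine. I don't care to speculate about why this method is faster. In performance, speculation is useless. Only measurement counts.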
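
In the spirit of measuring rather than speculating, a minimal sketch with Ruby's standard Benchmark module could compare the three approaches from this thread. The iteration count and report labels are arbitrary choices for illustration, and timings will vary by machine and Ruby version.

```ruby
require "benchmark"

str = "<name> <substring>"
n = 100_000  # iteration count chosen arbitrarily for this sketch

# Each report runs one of the approaches from this thread n times.
Benchmark.bm(12) do |x|
  x.report("scan")       { n.times { str.scan(/<([^>]*)>/).last.first } }
  x.report("str[re, 1]") { n.times { str[/.*<([^>]*)/, 1] } }
  x.report("match[1]")   { n.times { str.match(/<([^>]+)>\Z/)[1] } }
end
```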

— Wayne Conrad

8

I find this solution simpler and more to the point (since I'm new to Ruby). Thanks.

— Ryan H.

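@Nakilon Readability can outweigh tiny differences in performance when considering the overall success of a product and a team, so coreyward made a valid comment. That said, I think string[regex] can be just as readable in this scenario, so that is what I would personally use.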

— Nick

24

You can do this pretty easily with a regular expression...

Allowing for spaces around the word (but not keeping them):

str.match(/< ?([^>]+?) ?>\Z/)[1]

Or without allowing the spaces:

str.match(/<([^>]+)>\Z/)[1]

— coreyward

1

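I'm not sure the last <> actually has to be the last thing in the string. If, for example, the string foo <bar> baz is allowed (and should give the result bar), this will not work.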
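
A quick sketch of that concern, using the foo <bar> baz string from this comment. The variable names are hypothetical.

```ruby
str = "foo <bar> baz"

# \Z anchors the match at the end of the string, so nothing matches here.
anchored = str.match(/<([^>]+)>\Z/)              # nil

# Calling [1] on nil would raise NoMethodError, so guard against it.
anchored_capture = anchored ? anchored[1] : nil  # nil

# Without the anchor, the pattern finds the bracketed text anywhere.
unanchored = str.match(/<([^>]+)>/)[1]           # "bar"
```

Note that the unanchored pattern returns the first bracketed item rather than the last, so it only helps when the string contains a single <...> pair.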

— sepp2k, 2010

I just went by the sample string he provided.

— coreyward

10

Here is a slightly more flexible approach using the match method. With it, you can extract more than one string:

s = "<ants> <pants>"
matchdata = s.match(/<([^>]*)> <([^>]*)>/)

# Use 'captures' to get an array of the captures
matchdata.captures   # ["ants","pants"]

# Or use raw indices
matchdata[0]   # whole regex match: "<ants> <pants>"
matchdata[1]   # first capture: "ants"
matchdata[2]   # second capture: "pants"

— Grant Birchmeier

3

A simpler scan would be:

String1.scan(/<(\S+)>/).last
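
Two details of this variant are worth checking before relying on it; the sketch below uses the question's sample string plus one made-up string without whitespace between the tags.

```ruby
string1 = "<name> <substring>"

# last returns the final array of captures, not the string itself.
last_captures = string1.scan(/<(\S+)>/).last      # ["substring"]
last_string = last_captures.first                 # "substring"

# \S+ also matches ">" and "<", so adjacent tags with no whitespace
# between them are captured as a single item.
adjacent = "<name><substring>".scan(/<(\S+)>/)    # [["name><substring"]]
```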

— 海军蓝