Remove everything after a given character in a string


15

I have a dataset like the one below. I want to remove all characters after the character ©. How can I do that in R?

data_clean_phrase <- c("Copyright © The Society of Geomagnetism and Earth", 
"© 2013 Chinese National Committee ")

data_clean_df <- as.data.frame(data_clean_phrase)

After a specific character, or after a specific index?
Dawny33

After a specific character: ©
Hamideh 2015

Then it looks like the existing answer solves your problem :)
Dawny33

Answers:


19

For example:

 rs<-c("copyright @ The Society of mo","I want you to meet me @ the coffeshop")
 s<-gsub("@.*","",rs)
 s
 [1] "copyright "             "I want you to meet me "

Or, if you want to keep the @ character:

 s<-gsub("(@).*","\\1",rs)
 s
 [1] "copyright @"             "I want you to meet me @"
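
The same pattern should carry over to the data in the question with © in place of @. This is a minimal sketch, assuming the `data_clean_phrase` column from the question; the `cleaned` column name and the `trimws` step are illustrative additions, not part of the approach above:

```r
# Data from the question
data_clean_phrase <- c("Copyright © The Society of Geomagnetism and Earth",
                       "© 2013 Chinese National Committee ")
data_clean_df <- data.frame(data_clean_phrase, stringsAsFactors = FALSE)

# Drop © and everything after it
cleaned <- sub("©.*", "", data_clean_df$data_clean_phrase)

# Optional extra step (an assumption, not in the answer): trim leftover whitespace
data_clean_df$cleaned <- trimws(cleaned)
data_clean_df$cleaned
```

Without `trimws`, the first element keeps the space that preceded ©, just as in the @ example.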

Edit: if you want to remove everything from the last @ onward, follow the same approach with a suitable regular expression. For example:

rs<-c("copyright @ The Society of mo located @ my house","I want you to meet me @ the coffeshop")
s<-gsub("(.*)@.*","\\1",rs)
s
[1] "copyright @ The Society of mo located " "I want you to meet me "

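For the patterns used here, sub and gsub give the same result: each pattern ends in .* and consumes the rest of the string, so it can match at most once.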
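
As a quick check of the last-occurrence pattern with © itself, here is a small sketch using a made-up input string; the variable names `x` and `last_cut` are only for illustration:

```r
x <- c("©aaa©bbb")

# The greedy (.*) runs up to the last ©, so only that © and what follows are dropped
last_cut <- sub("(.*)©.*", "\\1", x)
last_cut

# The pattern consumes the whole string, so gsub finds no second match
same_result <- identical(last_cut, gsub("(.*)©.*", "\\1", x))
same_result
```

A string that contains no © at all is returned unchanged, since the pattern never matches.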


Thanks. And what if I want to do this at the last © in the text? Consider: c("©aaa©bbb") -> c("©aaa")
Hamideh 2015

@HamidehIraj You can use a regular expression to do that.
Dawny33

1
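You're welcome. Once you get used to regular expressions, you'll find that removing everything from the last @ character is just as easy. I've edited the answer to cover that case as well.
MASL 2015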