Using the pipe operator %>% with replacement functions like colnames()<-


73

How can I use the pipe operator to pipe into replacement functions like colnames()<- ?

Here is what I'm trying to do:

library(dplyr)
averages_df <- 
   group_by(mtcars, cyl) %>%
   summarise(mean(disp), mean(hp))
colnames(averages_df) <- c("cyl", "disp_mean", "hp_mean")
averages_df

# Source: local data frame [3 x 3]
# 
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

But ideally it would look something like this:

averages_df <- 
  group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  add_colnames(c("cyl", "disp_mean", "hp_mean"))

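Is there a way to do this without writing a special-purpose function each time?

The answers to this question are a start, but not exactly what I'm asking: Chaining arithmetic operators in dplyr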


1
You could just name the inputs to summarise: group_by(mtcars, cyl) %>% summarise(disp_mean=mean(disp), hp_mean=mean(hp)), though I can't see how using colnames is that much of a drag. Does everything have to be done in dplyr?
thelatemail 2015

3
I believe there's a rename() function in dplyr. Or do what @thelatemail said.
Rich Scriven

8
Or just use setNames: group_by(mtcars, cyl) %>% summarise(mean(disp), mean(hp)) %>% setNames(., c("cyl", "disp_mean", "hp_mean"))
David Arenburg

@DavidArenburg - now why didn't I think of that, given that I was pointing this out only 2 minutes ago?
thelatemail 2015

@thelatemail I was writing "names<-"(., ..., and then I told myself "wait a minute"...
David Arenburg

Answers:


103

You could use `colnames<-` or setNames (thanks to @David Arenburg):

group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  `colnames<-`(c("cyl", "disp_mean", "hp_mean"))
  # or
  # `names<-`(c("cyl", "disp_mean", "hp_mean"))
  # setNames(., c("cyl", "disp_mean", "hp_mean")) 

#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

Or pick an alias (set_colnames) from magrittr:

library(magrittr)
group_by(mtcars, cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  set_colnames(c("cyl", "disp_mean", "hp_mean"))

dplyr::rename may be more convenient if you are only (re)naming a few out of many columns. It requires writing both the old and the new names; see @Richard Scriven's answer.
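To illustrate the point about renaming only a few of many columns, here is a minimal sketch using rename() on the full mtcars data frame. The new names (cylinders, horsepower) are made up for the example; the other nine columns are left untouched.

```r
library(dplyr)

# Rename two of mtcars' eleven columns; rename() takes new_name = old_name
# pairs and leaves every column that is not mentioned as it is.
renamed <- mtcars %>%
  rename(cylinders = cyl, horsepower = hp)

names(renamed)
```

With set_colnames or setNames you would have to spell out all eleven names to get the same result.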


Beautiful. I assume the `foo<-`() syntax will work for any such "replacement" function.
Alex Coppock 2015
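As a quick check of that assumption, here is a small sketch with two other base R replacement functions, `levels<-` and `dim<-`. The inputs are made up for illustration. When called by its backticked name, a replacement function takes the object as its first argument and the new value as its second, which is why it slots into a pipeline.

```r
library(magrittr)

# `levels<-` called as an ordinary function: object first, new value second.
f <- factor(c("a", "b", "a")) %>%
  `levels<-`(c("x", "y"))
levels(f)

# `dim<-` works the same way: reshape a vector into a 2 x 3 matrix.
m <- 1:6 %>%
  `dim<-`(c(2L, 3L))
dim(m)
```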

22

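In dplyr, there are a couple of different ways to rename columns.

One is to use the rename() function. In this example you need to wrap the names created by summarise() in backticks, since they are expressions.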

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    rename(disp_mean = `mean(disp)`, hp_mean = `mean(hp)`)
#   cyl disp_mean   hp_mean
# 1   4  105.1364  82.63636
# 2   6  183.3143 122.28571
# 3   8  353.1000 209.21429

You could also use select(). This is a bit easier because we can use column numbers, which avoids messing around with backticks.

group_by(mtcars, cyl) %>%
    summarise(mean(disp), mean(hp)) %>%
    select(1, disp_mean = 2, hp_mean = 3)

But for this example, the best way is to do what @thelatemail mentioned in the comments: go back one step and name the columns in summarise().

group_by(mtcars, cyl) %>%
    summarise(disp_mean = mean(disp), hp_mean = mean(hp))

11

We can add a suffix to the summarised variables by using the .funs argument of dplyr's summarise_at, as in the code below.

library(dplyr)

# summarise_at with dplyr
mtcars %>% 
  group_by(cyl) %>%
  summarise_at(
    .vars = c("disp", "hp"),  # this argument was called .cols in early dplyr versions
    .funs = c(mean="mean")
  )
# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429
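
In dplyr 1.0.0 and later, summarise_at is superseded by across(). The same suffixing can be sketched as follows, assuming a dplyr version that provides across(); the .names glue pattern here is just one way to build the disp_mean and hp_mean names.

```r
library(dplyr)

# across() applies mean to the selected columns; the .names pattern
# appends "_mean" to each original column name.
averages_df <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(disp, hp), mean, .names = "{.col}_mean"))

names(averages_df)
```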

We can also set the column names in several ways.

# set_names with magrittr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  magrittr::set_names(c("cyl", "disp_mean", "hp_mean"))

# set_names with purrr
mtcars %>% 
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  purrr::set_names(c("cyl", "disp_mean", "hp_mean"))

# setNames with stats
mtcars %>%
  group_by(cyl) %>%
  summarise(mean(disp), mean(hp)) %>%
  stats::setNames(c("cyl", "disp_mean", "hp_mean"))

# A tibble: 3 × 3
# cyl disp_mean   hp_mean
# <dbl>     <dbl>     <dbl>
# 1     4  105.1364  82.63636
# 2     6  183.3143 122.28571
# 3     8  353.1000 209.21429