从data.frames的嵌套列表中提取同名


10

我有一个嵌套的data.frames列表,获取所有data.frames列名称的最简单方法是什么?

例:

d = data.frame(a = 1:3, b = 1:3, c = 1:3)

l = list(a = d, list(b = d, c = d))

结果:

$a
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c
[1] "a" "b" "c"

Answers:


7

已经有几个答案。但是,让我离开另一种方法。我用过rapply2()rawr包。

devtools::install_github('raredd/rawr')
library(rawr)
library(purrr)

rapply2(l = l, FUN = colnames) %>% 
flatten

$a
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c
[1] "a" "b" "c"

5

这是基本的R解决方案。

您可以定义一个自定义函数来展平嵌套列表(可以处理任何深度的嵌套列表,例如,两个以上的级别),即

flatten <- function(x){  
  islist <- sapply(x, class) %in% "list"
  r <- c(x[!islist], unlist(x[islist],recursive = F))
  if(!sum(islist))return(r)
  flatten(r)
}

然后使用下面的代码来获得名称

out <- Map(colnames,flatten(l))

这样

> out
$a
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c
[1] "a" "b" "c"

具有更深层嵌套列表的示例

l <- list(a = d, list(b = d, list(c = list(e = list(f= list(g = d))))))
> l
$a
  a b c
1 1 1 1
2 2 2 2
3 3 3 3

[[2]]
[[2]]$b
  a b c
1 1 1 1
2 2 2 2
3 3 3 3

[[2]][[2]]
[[2]][[2]]$c
[[2]][[2]]$c$e
[[2]][[2]]$c$e$f
[[2]][[2]]$c$e$f$g
  a b c
1 1 1 1
2 2 2 2
3 3 3 3

你会得到

> out
$a
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c.e.f.g
[1] "a" "b" "c"

4

这是尝试将其尽可能矢量化的方法,

i1 <- names(unlist(l, TRUE, TRUE))
#[1] "a.a1" "a.a2" "a.a3" "a.b1" "a.b2" "a.b3" "a.c1" "a.c2" "a.c3" "b.a1" "b.a2" "b.a3" "b.b1" "b.b2" "b.b3" "b.c1" "b.c2" "b.c3" "c.a1" "c.a2" "c.a3" "c.b1" "c.b2" "c.b3" "c.c1" "c.c2" "c.c3"
i2 <- names(split(i1, gsub('\\d+', '', i1)))
#[1] "a.a" "a.b" "a.c" "b.a" "b.b" "b.c" "c.a" "c.b" "c.c"

现在,我们可以分割i2点之前的所有内容,这将使

split(i2, sub('\\..*', '', i2))

#    $a
#    [1] "a.a" "a.b" "a.c"

#    $b
#    [1] "b.a" "b.b" "b.c"

#    $c
#    [1] "c.a" "c.b" "c.c"

为了彻底清洁它们,我们需要遍历并应用一个简单的正则表达式,

 lapply(split(i2, sub('\\..*', '', i2)), function(i)sub('.*\\.', '', i))

这使,

$a
[1] "a" "b" "c"

$b
[1] "a" "b" "c"

$c
[1] "a" "b" "c"

守则紧凑

i1 <- names(unlist(l, TRUE, TRUE))
i2 <- names(split(i1, gsub('\\d+', '', i1)))
final_res <- lapply(split(i2, sub('\\..*', '', i2)), function(i)sub('.*\\.', '', i))

3

尝试这个

d = data.frame(a = 1:3, b = 1:3, c = 1:3)

l = list(a = d, list(b = d, c = d))

foo <- function(x, f){
    if (is.data.frame(x)) return(f(x))
    lapply(x, foo, f = f)
}

foo(l, names)

这里的关键是data.frames实际上是特殊的列表,因此要测试的内容很重要。

小小的解释:这里需要做的是递归,因为对于每个元素,您可能会查看一个数据框,因此您想确定是应用names还是更深入地进行递归并foo再次调用。


问题是foo(l,names)也返回嵌套列表
user680111

我不。不确定,您做了什么不同的事情。
乔治

您可以unlist()在末尾添加,但是我不确定这是否是您想要的。
乔治

2

首先创建l1,这是一个只有名字的嵌套列表

l1 <- lapply(l, function(x) if(is.data.frame(x)){
  list(colnames(x)) #necessary to list it for the unlist() step afterwards
}else{
  lapply(x, colnames)
})

然后取消列出l1

unlist(l1, recursive=F)

2

这是使用purrr函数map_depthvec_depth

library(purrr)

return_names <- function(x) {
   if(inherits(x, "list"))
     return(map_depth(x, vec_depth(x) - 2, names))
    else return(names(x))
}

map(l, return_names)

#$a
#[1] "a" "b" "c"

#[[2]]
#[[2]]$b
#[1] "a" "b" "c"

#[[2]]$c
#[1] "a" "b" "c"
By using our site, you acknowledge that you have read and understand our Cookie Policy and Privacy Policy.
Licensed under cc by-sa 3.0 with attribution required.