Warning message: In `...`: invalid factor level, NA generated

134

I don't understand why I got this warning message.

> fixed <- data.frame("Type" = character(3), "Amount" = numeric(3))
> fixed[1, ] <- c("lunch", 100)
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "lunch") :
  invalid factor level, NA generated
> fixed
  Type Amount
1 <NA>    100
2           0
3           0

r warnings r-faq

— 嗯

216

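The warning appears because your "Type" variable was created as a factor and "lunch" is not one of its defined levels. Pass the stringsAsFactors = FALSE flag when you create the data frame to force "Type" to be a character column.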

> fixed <- data.frame("Type" = character(3), "Amount" = numeric(3))
> fixed[1, ] <- c("lunch", 100)   # triggers the warning from the question
> str(fixed)
'data.frame':   3 obs. of  2 variables:
 $ Type  : Factor w/ 1 level "": NA 1 1
 $ Amount: chr  "100" "0" "0"
> 
> fixed <- data.frame("Type" = character(3), "Amount" = numeric(3), stringsAsFactors = FALSE)
> fixed[1, ] <- c("lunch", 100)
> str(fixed)
'data.frame':   3 obs. of  2 variables:
 $ Type  : chr  "lunch" "" ""
 $ Amount: chr  "100" "0" "0"
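
Note that "Amount" ends up as a character column in both outputs above. That is because c("lunch", 100) builds a character vector, so 100 is coerced to "100" before the assignment happens. A minimal sketch of one way around this, assuming you want "Amount" to stay numeric, is to assign a list instead of a vector:

```r
# Assumes stringsAsFactors = FALSE, as in the answer above.
fixed <- data.frame(Type = character(3), Amount = numeric(3),
                    stringsAsFactors = FALSE)

# c() coerces mixed inputs to a common type, here character.
row_as_vector <- c("lunch", 100)
class(row_as_vector)

# A list keeps each element's own type, so each column receives
# a value of the matching type and Amount stays numeric.
fixed[1, ] <- list("lunch", 100)
str(fixed)
```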

— David

1

@David Why does R convert it to a factor?

— KannarKK

1

Because that is the default setting in the data.frame() function (and it is the default because it is what most users want the vast majority of the time).
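
One caveat for readers on newer installations: since R 4.0.0 the default is stringsAsFactors = FALSE, so the code from the question no longer produces the warning there. The sketch below sets the flag to TRUE explicitly to reproduce the old behaviour on any version and captures the warning text rather than printing it. The variable name msg is only illustrative.

```r
# stringsAsFactors = TRUE is set explicitly so the result does not
# depend on the R version's default.
fixed <- data.frame(Type = character(3), Amount = numeric(3),
                    stringsAsFactors = TRUE)

class(fixed$Type)    # the column is a factor
levels(fixed$Type)   # its only level is the empty string

# Try to assign a value that is not an existing level and capture
# the resulting warning message instead of letting it print.
msg <- tryCatch(
  {
    fixed[1, ] <- c("lunch", 100)
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
msg
```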

— David

46

If you are reading directly from a CSV file, do it like this:

myDataFrame <- read.csv("path/to/file.csv", header = TRUE, stringsAsFactors = FALSE)
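
To try this without a file on disk, here is a small sketch that feeds inline text to read.csv() through its text argument. The column names and values are made up for illustration.

```r
# Hypothetical CSV content, supplied inline instead of from a file.
csv_text <- "Type,Amount\nlunch,100\ndinner,250"

df <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
str(df)   # Type is read as a character column, not a factor

# Because Type is character, a row with a previously unseen string
# can be appended without an "invalid factor level" warning.
df[nrow(df) + 1, ] <- list("snack", 30)
df
```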

— Chirag

stringAsFactors raises an error: unused argument (stringAsFactors = FALSE)

— Coliban

1

stringsAsFactors: "strings" needs to be plural (@Coliban)

— campeterson

24

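Here is a flexible approach that can be used in all cases, in particular:

to affect only one column, or
when the data frame was obtained by applying previous operations (i.e. you are not opening a file or creating a new data frame right there).

First, un-factorize the strings with the as.character function, then re-factorize them with the as.factor (or simply factor) function: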

fixed <- data.frame("Type" = character(3), "Amount" = numeric(3))

# Un-factorize (for numeric values stored as a factor, use
#               as.numeric(as.character(x)); as.numeric(x) alone
#               returns the integer level codes)
fixed$Type <- as.character(fixed$Type)
fixed[1, ] <- c("lunch", 100)

# Re-factorize with the as.factor function, or simply factor(fixed$Type)
fixed$Type <- as.factor(fixed$Type)
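
A word of caution if the same un-factorize step is applied to numbers stored as a factor: calling as.numeric() directly on a factor returns its internal integer codes, not the values shown in the labels. A short sketch with made-up values:

```r
# Hypothetical factor whose labels look like numbers.
f <- factor(c("10", "20", "10"))

codes  <- as.numeric(f)                 # internal level codes
values <- as.numeric(as.character(f))   # the numbers in the labels

codes    # 1 2 1
values   # 10 20 10
```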

— toto_tico

6

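The simplest way to fix this is to add a new level to the factor column. Use the levels function to see which levels the factor already has, then add the new one.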

    > levels(data$Fireplace.Qu)
    [1] "Ex" "Fa" "Gd" "Po" "TA"
    > levels(data$Fireplace.Qu) = c("Ex", "Fa", "Gd", "Po", "TA", "None")
    > levels(data$Fireplace.Qu)
    [1] "Ex"   "Fa"   "Gd"   "Po"   "TA"   "None"
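
Retyping every level by hand is easy to get wrong, since assigning to levels() relabels the existing levels by position. As a sketch of a slightly safer variant, you can append the new level to whatever is already there and then assign it. The vector quality below is hypothetical stand-in data, not the Fireplace.Qu column itself.

```r
# Hypothetical factor with a missing value to be recoded as "None".
quality <- factor(c("Ex", "TA", NA, "Gd"))

# Append the new level to the existing ones instead of retyping them.
levels(quality) <- c(levels(quality), "None")

# "None" is now a valid level, so this assignment does not warn.
quality[is.na(quality)] <- "None"
quality
```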

— Eddie Miller

0

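I had a similar problem with data retrieved from an .xlsx file. Unfortunately, I could not find a suitable answer here. I handled it on my own with dplyr as shown below, which might help others: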

#install.packages("xlsx")
library(xlsx)
extracted_df <- read.xlsx("test.xlsx", sheetName='Sheet1', stringsAsFactors=FALSE)
# Replace all NAs in a data frame with "G" character
extracted_df[is.na(extracted_df)] <- "G"

However, I could not handle it with the readxl package, which has no parameter similar to stringsAsFactors. For this reason, I moved to the xlsx package.

— ozturkib