Data frame as factor

I would like to change the class of some columns of my data.frame object (mydf) from character to factor.

I don't want to do this while reading the text file with the read.table() function.

Any help would be appreciated.

zx8754


asked Feb 12, 2012 at 18:17


Hi, welcome to the world of R.

mtcars       # look at this built-in data set
str(mtcars)  # shows the class of each variable (all numeric)

# one approach is to index with the $ operator and use as.factor()
mtcars$am <- as.factor(mtcars$am)
# another approach: index by column name
mtcars[, 'cyl'] <- as.factor(mtcars[, 'cyl'])
str(mtcars)  # now look at the classes again

The same pattern works for other classes, using as.character(), as.Date(), as.integer() and so on.
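As a rough sketch of that pattern for a few other classes (the choice of columns and the copy named mtc are only for illustration, not part of the original answer):

```r
# Work on a copy so the built-in data set stays untouched
mtc <- mtcars

mtc$gear <- as.integer(mtc$gear)     # numeric -> integer
mtc$carb <- as.character(mtc$carb)   # numeric -> character
mtc$cyl  <- as.factor(mtc$cyl)       # numeric -> factor

# Going from a factor back to numbers: convert to character first,
# otherwise you get the internal level codes rather than the labels
cyl_values <- as.numeric(as.character(mtc$cyl))

classes <- sapply(mtc[c("gear", "carb", "cyl")], class)
print(classes)
```

The as.numeric(as.character(...)) step is worth remembering, since calling as.numeric() directly on a factor returns the level codes.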

Since you're new to R I'd suggest you have a look at these two websites:

R reference manuals: http://cran.r-project.org/manuals.html

R Reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf

answered Feb 12, 2012 at 18:28

Tyler Rinker

# To do it for all columns
df[] <- lapply(df, factor)  # the "[]" keeps the data frame structure

# To do it for only the columns named in a vector 'col_names'
col_names <- names(df)  # replace with the subset of names you want
df[col_names] <- lapply(df[col_names], factor)

Explanation: all data frames are lists, and the result of [ used with a multi-valued argument is likewise a list, so looping over the columns is a job for lapply. The call returns a list of factors, which the assignment method `[<-.data.frame` then puts back into the data frame df.
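To see why the empty brackets matter, here is a minimal sketch on made-up data (the data frame and its column names are hypothetical):

```r
# Hypothetical sample data, used only for this illustration
df <- data.frame(x = c("a", "b", "a"),
                 y = c("u", "v", "u"),
                 stringsAsFactors = FALSE)

as_list <- lapply(df, factor)  # a plain list of factors
class(as_list)                 # "list"

df[] <- lapply(df, factor)     # fills the existing columns in place
class(df)                      # "data.frame"
sapply(df, class)              # both columns are now "factor"
```

Writing df <- lapply(df, factor) without the brackets would replace the data frame with that plain list.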

Another strategy would be to convert only those columns where the number of unique items is below some criterion, say fewer than the base-10 log of the number of rows:

cols.to.factor <- sapply( df, function(col) length(unique(col)) < log10(length(col)) )
df[ cols.to.factor] <- lapply(df[ cols.to.factor] , factor)
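A small worked example of that criterion, on hypothetical data with 1000 rows (so the threshold is log10(1000) = 3):

```r
# Hypothetical data: one low-cardinality column, one high-cardinality column
n <- 1000
df <- data.frame(
  group = rep(c("a", "b"), length.out = n),  # 2 unique values
  id    = as.character(seq_len(n)),          # 1000 unique values
  stringsAsFactors = FALSE
)

cols.to.factor <- sapply(df, function(col) length(unique(col)) < log10(length(col)))
print(cols.to.factor)  # group TRUE, id FALSE

df[cols.to.factor] <- lapply(df[cols.to.factor], factor)
str(df)
```

Note that this threshold is quite strict: with 100 rows it is 2, so only constant columns would qualify. You would probably pick a criterion that suits your own data.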

answered Feb 12, 2012 at 20:35

IRTFM

You could use dplyr::mutate_if() to convert all character columns to factors, or dplyr::mutate_at() to convert selected named columns:

library(dplyr)

# all character columns to factor:
df <- mutate_if(df, is.character, as.factor)

# select character columns 'char1', 'char2', etc. to factor:
df <- mutate_at(df, vars(char1, char2), as.factor)

answered Apr 10, 2018 at 12:06

sbha

If you want to change all character variables in your data.frame to factors after you've already loaded your data, you can do it like this, to a data.frame called dat:

character_vars <- sapply(dat, is.character)
# index with dat[cols] rather than dat[, cols] so a single column is not dropped to a vector
dat[character_vars] <- lapply(dat[character_vars], as.factor)

This creates a vector identifying which columns are of class character, then applies as.factor to those columns.

Sample data:

dat <- data.frame(var1 = c("a", "b"),
                  var2 = c("hi", "low"),
                  var3 = c(0, 0.1),
                  stringsAsFactors = FALSE
                  )
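One thing to watch with this kind of approach is the case where only a single column is character. A minimal sketch, using a hypothetical data frame one_char, of how the two indexing styles differ:

```r
# Hypothetical data with exactly one character column
one_char <- data.frame(var1 = c("a", "b"),
                       var3 = c(0, 0.1),
                       stringsAsFactors = FALSE)

is_char <- sapply(one_char, is.character)

class(one_char[, is_char])  # "character": the single column is dropped to a vector
class(one_char[is_char])    # "data.frame": list-style indexing keeps the structure

# List-style indexing therefore behaves the same for one column or many
one_char[is_char] <- lapply(one_char[is_char], as.factor)
str(one_char)
```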

answered Jan 7, 2016 at 21:59

Sam Firke

Another short way is the assignment pipe (%<>%) from the magrittr package. The line below converts the character column mycolumn to a factor and assigns the result back to it.

library(magrittr)

mydf$mycolumn %<>% factor

answered Jun 24, 2016 at 8:12

chriad

I've been doing it with a for loop. In this case only character variables are converted to factor:

# seq_len() is safe even if the data frame has zero columns
for (i in seq_len(ncol(data))) {
    if (is.character(data[[i]])) {
        data[[i]] <- factor(data[[i]])
    }
}

answered Jun 1, 2017 at 23:47

Edu Marín

Unless you need to identify the columns automatically, I found this to be the simplest solution:

df$name <- as.factor(df$name)

This makes column name in dataframe df a factor.

answered Dec 25, 2020 at 20:16

Christian Lindig

With dplyr 1.0.0 or later you can use across():

library(dplyr)

df <- mtcars
# Turn one column into a factor
df <- df %>% mutate(cyl = factor(cyl))

# Turn columns into factors based on their type
# (mtcars has no character columns, so this changes nothing here)
df <- df %>% mutate(across(where(is.character), factor))

# Select columns by position
df <- df %>% mutate(across(c(2, 4), factor))

# Select specific columns by name
df <- df %>% mutate(across(c(cyl, am), factor))

answered Jul 14, 2021 at 10:32

Ronak Shah