This function removes columns of a data frame that provide no information: columns whose values are identical across all rows and, when unique = FALSE, columns whose values are distinct across all rows. Useful for cleaning metadata files.

clean(tb, unique = TRUE, keep = NULL)

Arguments

tb

the data.frame or tibble

unique

whether to keep columns whose values are distinct across all rows (default TRUE)

keep

a vector of columns (for example, column names) to keep even if they would otherwise be removed (default NULL)
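
The filtering described above can be sketched in base R. This is only an illustrative re-implementation under stated assumptions, not the package's source: the name clean_sketch is hypothetical, keep is assumed to hold column names, and kept columns are assumed to be moved to the end, as the last example's output suggests.

```r
# Hypothetical sketch of the column filter; NOT the package's implementation.
clean_sketch <- function(tb, unique = TRUE, keep = NULL) {
  # Count the distinct values in each column (NA counts as a value here).
  n_distinct <- vapply(tb, function(col) length(base::unique(col)), integer(1))

  # Constant columns carry no information and are always candidates for removal.
  drop <- n_distinct <= 1

  # With unique = FALSE, columns that differ on every row are dropped as well.
  if (!unique) {
    drop <- drop | (n_distinct == nrow(tb))
  }

  # Assumption: columns named in `keep` are retained and placed last.
  in_keep <- names(tb) %in% keep
  retained <- names(tb)[!drop & !in_keep]
  forced <- names(tb)[in_keep]

  # List-style indexing returns a data frame for both data.frame and tibble.
  tb[c(retained, forced)]
}
```

Applied to the metaData frame from the Examples, clean_sketch(metaData, unique = FALSE, keep = "ID1") would drop SRA and ID2 and return the group and ID1 columns.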

Examples

metaData <- data.frame(
  SRA = "SRA17CJQ1",
  ID1 = sample(letters, 12, replace = FALSE),
  group = c(rep("group1", 4), rep("group2", 4), rep("group3", 4))
)
metaData$ID2 <- toupper(metaData$ID1)
metaData
#>          SRA ID1  group ID2
#> 1  SRA17CJQ1   r group1   R
#> 2  SRA17CJQ1   z group1   Z
#> 3  SRA17CJQ1   y group1   Y
#> 4  SRA17CJQ1   e group1   E
#> 5  SRA17CJQ1   l group2   L
#> 6  SRA17CJQ1   j group2   J
#> 7  SRA17CJQ1   q group2   Q
#> 8  SRA17CJQ1   h group2   H
#> 9  SRA17CJQ1   i group3   I
#> 10 SRA17CJQ1   t group3   T
#> 11 SRA17CJQ1   o group3   O
#> 12 SRA17CJQ1   s group3   S
clean(metaData)
#>    ID1  group ID2
#> 1    r group1   R
#> 2    z group1   Z
#> 3    y group1   Y
#> 4    e group1   E
#> 5    l group2   L
#> 6    j group2   J
#> 7    q group2   Q
#> 8    h group2   H
#> 9    i group3   I
#> 10   t group3   T
#> 11   o group3   O
#> 12   s group3   S
clean(metaData, unique = FALSE, keep = "ID1")
#>     group ID1
#> 1  group1   r
#> 2  group1   z
#> 3  group1   y
#> 4  group1   e
#> 5  group2   l
#> 6  group2   j
#> 7  group2   q
#> 8  group2   h
#> 9  group3   i
#> 10 group3   t
#> 11 group3   o
#> 12 group3   s