I recently started working with R, and not by choice (given the choice, I always go for Python). Still, learning a new tool and a new language syntax for data science has turned out to be no bad thing.
Without further ado, here are several important things about R:
Installing and Updating Packages
# install single package
install.packages('tidyverse')
# install multiple packages
install.packages(c("sf", "ggmap"))
# install specific version
devtools::install_version(
  "ggmap",
  version = "3.5.2"
)
# check installed packages
installed.packages()
# update, without prompts for permission/clarification
update.packages(ask = FALSE)
Shortcut: install only the packages that are not already installed.
# day to day R packages
list.of.packages <- c(
  "rpart",
  "rpart.plot",
  "caTools",
  "MASS",
  "caret",
  "Metrics",
  "tree"
)
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
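The same shortcut can be wrapped in a pair of small functions so it is reusable across scripts. This is only a minimal sketch: `missing_packages` and `install_missing` are hypothetical names I made up for illustration, not functions from any package.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  # row names of installed.packages() are the installed package names
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Hypothetical helper: install whatever is missing and
# invisibly return the names that had to be installed
install_missing <- function(pkgs) {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) install.packages(todo)
  invisible(todo)
}

# Base packages are always present, so nothing is reported as missing here
missing_packages(c("stats", "utils"))
```

Calling `install_missing(list.of.packages)` at the top of a script would then do the same job as the two lines above.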
Custom Functions
Don't Repeat Yourself (DRY) is a general principle in software development, so I decided to create some custom functions to help me in R:
#> Load custom functions to reuse in the code
#>> azt::function -> Wrap rpart.plot
rplot <- function(model, main = "Tree") {
  # call through the namespace so rpart.plot does not have to be attached first
  rpart.plot::rpart.plot(
    model,
    type = 4,
    extra = 101,
    roundint = FALSE,
    nn = TRUE,
    main = main
  )
}
#>> azt::function -> Count leaf nodes of a given model
nleaf <- function(model) {
  sum(model$frame$var == "<leaf>")
}
#>> azt::function -> Return the best CP value for given model
bestcp <- function(model) {
  model$cptable[which.min(model$cptable[, "xerror"]), "CP"]
}
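To show how the helpers fit together, here is a rough sketch that grows a tree, picks the CP with the lowest cross-validated error, and prunes with it. The dataset (`iris`), the seed, and the control settings are my own assumptions for illustration. The two helpers are repeated so the snippet runs on its own, and `rplot` is left out so that only `rpart` is required.

```r
library(rpart)  # recommended package, shipped with standard R installations

# Helpers repeated from above so this sketch is self-contained
nleaf <- function(model) sum(model$frame$var == "<leaf>")
bestcp <- function(model) {
  model$cptable[which.min(model$cptable[, "xerror"]), "CP"]
}

set.seed(42)  # cross-validation folds are random, so fix the seed

# Grow a deliberately large tree (assumed settings: cp = 0, minsplit = 2)
fit <- rpart(
  Species ~ .,
  data = iris,
  method = "class",
  control = rpart.control(cp = 0, minsplit = 2)
)

# Prune back to the CP value with the lowest cross-validated error
pruned <- prune(fit, cp = bestcp(fit))

cat("leaves before pruning:", nleaf(fit), "\n")
cat("leaves after pruning: ", nleaf(pruned), "\n")
```

Note that `bestcp` relies on the `xerror` column, which is only present when the model was fitted with cross-validation enabled (the `rpart` default).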