Data Science with R
2020-11-16
Chapter 1 Quick Start: Basic Commands
This chapter walks through some basic commands of the language.
R has a few characteristics that make it better suited to data analysis than general-purpose languages such as C or Java:
- Interpreted
- Fault tolerant
- Collection-oriented (vectors)
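To make the collection-oriented point concrete, here is a minimal sketch: arithmetic and comparisons apply to every element of a vector at once, with no explicit loop. The variable names and values are made up for this illustration.

```r
# Hypothetical example values, chosen only for illustration
prices <- c(10, 20, 30)

# Arithmetic is applied element by element, no loop needed
doubled <- prices * 2
doubled

# Comparisons are vectorized too and return a logical vector
expensive <- prices > 15
expensive
```

The same pattern holds for most base R functions, which is why the examples in this chapter rarely need loops.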
1.1 Basics: variables and assignment
# This is a comment :-)
# help
help(plot)
## starting httpd help server ... done
# variables and assignment: both "=" and "<-" assign; "<-" is the conventional form
# number
x = 10
y <- 1
class(x)
## [1] "numeric"
class(y)
## [1] "numeric"
# printing: typing a name prints its value; print() does the same explicitly
x
## [1] 10
print(y)
## [1] 1
# vectors
x = c(1,2,3)
y <- seq(1,10,0.5)
z = c(1:10)
nomes <- c('ana','cris','julia')
x
## [1] 1 2 3
y
## [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0
## [16] 8.5 9.0 9.5 10.0
z
## [1] 1 2 3 4 5 6 7 8 9 10
nomes
## [1] "ana" "cris" "julia"
# NA and Inf
x = c(1,2,3,NA,5,6,1/0,7,8,9,10)
x
## [1] 1 2 3 NA 5 6 Inf 7 8 9 10
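Missing (`NA`) and infinite (`Inf`) values can be detected before any computation. The sketch below uses the base functions `is.na()` and `is.finite()`; the vector and the name `n_missing` are illustrative only.

```r
# Illustrative vector with one missing and one infinite value
x <- c(1, 2, NA, 1/0)

# TRUE only where the value is missing
missing_flags <- is.na(x)
missing_flags

# TRUE only for ordinary numbers: FALSE for NA and for Inf
finite_flags <- is.finite(x)
finite_flags

# Logical values count as 0/1, so sum() counts the missing entries
n_missing <- sum(is.na(x))
n_missing
```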
1.2 Basic statistics
# mean, min, max, sum
x = c(1:3)
mean(x)
## [1] 2
min(x)
## [1] 1
max(x)
## [1] 3
sum(x)
## [1] 6
# a missing value (NA) propagates through mean()
x = c(1,NA,3)
mean(x)
## [1] NA
# na.rm = TRUE drops NA values before computing
mean(x, na.rm = TRUE)
## [1] 2
x = c(1,1/0,3)
x
## [1] 1 Inf 3
mean(x)
## [1] Inf
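Note that `na.rm = TRUE` removes only `NA`, so an infinite value still dominates the mean. One possible way around this, sketched here under the assumption that infinite values should simply be discarded, is to filter with `is.finite()` first. The name `finite_mean` is introduced for this example.

```r
x <- c(1, NA, 1/0, 3)

# na.rm drops the NA but keeps Inf, so the result is still infinite
still_inf <- mean(x, na.rm = TRUE)
still_inf

# Keeping only finite values removes both NA and Inf before averaging
finite_mean <- mean(x[is.finite(x)])
finite_mean
```

Whether discarding infinite values is appropriate depends on what they represent in the data.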
# quartiles
x = c(1,2,3,4,5)
quantile(x)
##   0%  25%  50%  75% 100%
##    1    2    3    4    5
y = c(1,2,3,4)
quantile(y)
##   0%  25%  50%  75% 100%
## 1.00 1.75 2.50 3.25 4.00
z = c(0:10)
z
## [1] 0 1 2 3 4 5 6 7 8 9 10
quantile(z)
##   0%  25%  50%  75% 100%
##  0.0  2.5  5.0  7.5 10.0
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##       1       2       3       3       4       5
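`quantile()` is not limited to quartiles: its `probs` argument accepts any probabilities between 0 and 1. The short sketch below also uses `median()` and `IQR()` from base R; the variable names are illustrative.

```r
z <- c(0:10)

# 10th and 90th percentiles instead of the default quartiles
tails <- quantile(z, probs = c(0.1, 0.9))
tails

# The median equals the 50% quantile
med <- median(z)
med

# Interquartile range: 75% quantile minus 25% quantile
spread <- IQR(z)
spread
```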
1.3 Basic plots
x = c(1:10)
hist(x)
barplot(x)
x = c(1,2,3,1,1,2,0,1,1)
hist(x)
x = rnorm(100) # 100 random draws from a standard normal distribution
boxplot(x)
hist(x)
x = rnorm(100)
y = rnorm(100)
barplot(x) # one bar per value; a second positional argument would set bar widths, not add a series
boxplot(x,y)
plot(x,y) # scatter plot!
x = c('a','b','c','a','d','a','b')
hist(table(x))
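For categorical data, `table()` counts how often each value occurs, and `hist()` applied to that table shows the distribution of the counts themselves. If the goal is one bar per category, `barplot()` on the table is a common alternative. This is a minimal sketch, and the name `counts` is introduced here.

```r
x <- c('a','b','c','a','d','a','b')

# Frequency of each distinct value
counts <- table(x)
counts

# One bar per category, bar height equal to its frequency
barplot(counts)
```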
1.4 Basic data set input and data frames
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
names(mtcars)
## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
## [11] "carb"
nrow(mtcars)
## [1] 32
x = mtcars$mpg
hist(x)
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   10.40   15.43   19.20   20.09   22.80   33.90
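Beyond extracting one column with `$`, a data frame can be filtered by a logical condition on its rows. The sketch below uses the built-in `mtcars` data; the names `four_cyl` and `by_cyl` are introduced for illustration.

```r
# Rows where the car has 4 cylinders (all columns are kept)
four_cyl <- mtcars[mtcars$cyl == 4, ]
nrow(four_cyl)

# Compare fuel economy of that subset with the full data set
mean(four_cyl$mpg)
mean(mtcars$mpg)

# Mean mpg for each cylinder count
by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
by_cyl
```

The comma in `mtcars[rows, ]` matters: the expression before it selects rows, and leaving the part after it empty keeps every column.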
# Exercise: explore the following data sets
# ... http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html
names(quakes)
## [1] "lat" "long" "depth" "mag" "stations"
names(cars)
## [1] "speed" "dist"
names(iris)
## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
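As one possible starting point for the exercise, the same few calls can be applied to any of these data sets. The `describe()` helper below is hypothetical, written only for this sketch, and is not part of base R.

```r
# Hypothetical helper for this sketch: basic shape and column names
describe <- function(df) {
  list(rows = nrow(df), cols = ncol(df), columns = names(df))
}

cars_info <- describe(cars)
cars_info

# str() shows the type of every column plus a few sample values
str(iris)

# summary() works on a whole data frame, one block per column
summary(quakes)
```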