Copyright © 2023 - Azat's Blog

## Time Series Modeling

*Getting Apple time series data and obtaining returns*

```
# Getting Apple time series data and obtaining returns
install.packages("quantmod")
library(quantmod)
getSymbols("AAPL", src="yahoo")
apple.return <- periodReturn(AAPL, period="daily")
plot(apple.return, main="Apple Returns", ylab="Returns", xlab="Time")
```


`periodReturn` is a function in the `quantmod` package. It computes daily, weekly, monthly, quarterly, or yearly returns.

*Let's check autocorrelation in the data*
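The call that produced the ACF plot discussed below isn't shown in the post; here is a self-contained sketch of it, with white-noise returns simulated via `rnorm` standing in for `apple.return` (which would require the Yahoo download above):

```r
# ACF of daily returns; simulated white noise stands in for apple.return here
set.seed(42)
ret <- rnorm(250, mean = 0, sd = 0.02)   # ~one trading year of fake daily returns
res <- acf(ret, main = "Daily Returns Autocorrelation")
# lag 0 is always exactly 1; the other lags should hover near zero
```

On the real data, `acf(apple.return)` produces the plot described next.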

This is a plot showing the autocorrelation of Apple’s daily returns over a certain time period. Autocorrelation measures the relationship between a time series and a lagged version of itself.

From the graph, the following observations can be made:

- At lag 0 (the y-axis intercept), the autocorrelation is 1, which is expected as any series is perfectly correlated with itself.
- As the lag increases (moving right on the x-axis), the autocorrelation values drop, hovering around the zero line.
- The blue dashed lines represent confidence intervals. When autocorrelation values are within these lines, it’s often considered that the autocorrelation is not statistically significant.
- Most of the autocorrelation values for the lags are within the confidence intervals, suggesting that there isn’t significant autocorrelation in the daily returns for most lags.

In simple terms, the plot indicates that the returns of Apple on one day don’t provide much information or have a strong correlation with its returns on subsequent days within the observed period.

**There is no evidence for one day’s return being a predictor for another day.**

## Auto-correlation Function

For a stationary time series {X_t} (such as returns) the auto-covariance of lag h is defined as

\gamma_X(h) = \text{Cov}(X_t, X_{t+h})

The auto-correlation of lag h is

\rho_X(h) = \frac{\gamma_X(h)}{\sqrt{\text{Var}(X_t)\text{Var}(X_{t+h})}}

Two important concepts in time series analysis appear here: **auto-covariance** and **auto-correlation**.

**Stationary Time Series**:

- A time series {X_t} is said to be stationary if its statistical properties (such as its **mean and variance**) do not change over time. This is a crucial assumption in many time series analyses because most statistical methods rely on the underlying data being stationary.

**Auto-covariance**:

- For a stationary time series {X_t}, the auto-covariance of lag h is given by:

\gamma_X(h) = \text{Cov}(X_t, X_{t+h})

- This equation calculates **how much two time points in the series** (separated by a lag of h) **vary together compared to their mean values**. Essentially, it gauges the linear relationship between values of the series at different points in time.
- Note that when h = 0, the auto-covariance is simply the variance of the time series.

**Auto-correlation**:

- The auto-correlation of lag h is given by:

\rho_X(h) = \frac{\gamma_X(h)}{\sqrt{\text{Var}(X_t)\text{Var}(X_{t+h})}}

- **Auto-correlation quantifies the degree to which a data point in a time series is linearly related to past (or future) data points.** It's a normalized version of the auto-covariance, ensuring values lie between -1 and 1.
- The denominator in this formula normalizes the measure: it divides the auto-covariance by the product of the standard deviations of the two time points being compared.
- An auto-correlation value close to 1 or -1 indicates a strong linear relationship between the time points, while a value close to 0 indicates a weak or no linear relationship.
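Since the series is stationary, Var(X_t) = Var(X_{t+h}) = \gamma_X(0), so the formula above simplifies to:

\rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)}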

Together, auto-covariance and auto-correlation provide valuable insights into the patterns and relationships within a time series, helping analysts decide on the type of model to use for forecasting and understanding underlying patterns in the data.

## Sample Auto-correlation Function

The auto-correlation function is a useful indicator of interdependency in a data set. Its estimate based on observed data x_1, \ldots, x_T is:

\hat{\rho}(h) = \frac{\sum_{t=1}^{T-h} (x_t - \bar{x})(x_{t+h} - \bar{x})}{\sum_{t=1}^T (x_t - \bar{x})^2}
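As a sanity check, this estimator can be computed directly and compared with R's built-in `acf` (simulated data; the helper name `sample_acf` is mine, not from the post):

```r
set.seed(1)
x <- rnorm(100)
xbar <- mean(x)
n <- length(x)

# sample autocorrelation at lag h, computed directly from the formula
sample_acf <- function(h) {
  num <- sum((x[1:(n - h)] - xbar) * (x[(h + 1):n] - xbar))
  den <- sum((x - xbar)^2)
  num / den
}

manual  <- sapply(1:10, sample_acf)
builtin <- acf(x, lag.max = 10, plot = FALSE)$acf[-1]  # drop lag 0
all.equal(as.numeric(builtin), manual)
```

The two agree because `acf` uses exactly this estimator (same denominator for every lag).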

## Example for Generated Data

- The autocorrelation function may give useful information.
- For instance, let's have a look at another data set (which I have generated using R).

ARIMA stands for AutoRegressive Integrated Moving Average. It’s a popular time series forecasting method. Let’s break down its components:

- **AutoRegressive (AR)**: This component refers to the use of past values in the time series data to predict the future values. The term "autoregressive" means the regression of the variable against itself. An AR(p) model would use the last "p" values to predict the next value. *Example*: In an AR(1) model, the value at time `t` is predicted based on the value at time `t-1`.
- **Integrated (I)**: This represents the differencing of the time series data to make it stationary. Stationarity is an important concept in time series forecasting: a time series is stationary if its statistical properties (like mean and variance) do not change over time. Non-stationary data are typically transformed to become stationary, usually through differencing. The "I" in ARIMA represents the number of differences needed. *Example*: If your data has a trend (like increasing over time), taking the first difference (subtracting the previous value from the current value) can help to remove the trend and make the data stationary.
- **Moving Average (MA)**: This is not to be confused with the "moving average" method used to smooth time series data. In the ARIMA context, a moving average model uses the past white noise or error terms to predict future values. An MA(q) model would use the last "q" white noise terms. *Example*: In an MA(1) model, the value at time `t` is predicted based on the white noise/error term at time `t-1`.

In summary, an ARIMA model combines these three components:

- **AR**: Captures the relationship between an observation and a number of lagged observations.
- **I**: Removes trends in the data to make it stationary.
- **MA**: Models the error term as a linear combination of error terms occurring contemporaneously and at various times in the past.

When choosing an ARIMA model, the goal is to find the values of `p` (AR order), `d` (differencing order), and `q` (MA order) that provide the best fit to the time series data. This is usually done through a combination of visual inspection of autocorrelation and partial autocorrelation plots, as well as optimization techniques to minimize forecasting error.

ARIMA models are effective for many types of time series data, but they do assume a linear relationship and might not be as effective for data with non-linear patterns. In those cases, other models or techniques might be more appropriate.

```
model1 <- arima.sim(n = 500, list(order = c(0, 0, 1), ma = 0.8))
plot(model1, main = "Generated Data ARIMA(ma=0.8)", ylab = "Returns", xlab = "Time")
```


The code provided simulates a time series using the `arima.sim` function in R, which generates a series from a specified ARIMA model.

Let's break down the code:

`model1 <- arima.sim(n = 500, list(order = c(0, 0, 1), ma = 0.8))`

- `model1`: The variable where the simulated time series will be stored.
- `arima.sim()`: The function used to simulate the time series from an ARIMA model.
- `n = 500`: Specifies that you want to generate a time series of 500 data points.
- `list(order = c(0, 0, 1), ma = 0.8)`: Defines the parameters of the ARIMA model you want to simulate from.

- `order = c(0, 0, 1)`: Specifies the order of the ARIMA model.
  - The first number, `0`, indicates AR(0): there are no autoregressive terms.
  - The second number, `0`, indicates that the series is not differenced (I(0)).
  - The third number, `1`, indicates MA(1): there is one moving average term.
- `ma = 0.8`: Specifies the coefficient of the MA(1) term. The moving average equation for this model looks like:

X_t = Z_t + 0.8Z_{t-1}

where (X_t) is the value of the time series at time t and (Z_t) and (Z_{t-1}) are white noise error terms at times t and t-1, respectively.

In summary, this code simulates a time series of length 500 from an MA(1) model with a moving average coefficient of 0.8. The resulting time series is stored in the `model1` variable.

```
acf(model1, main = "Generated Data ARIMA(ma=0.8) autocorrelation", ylab = "ACF", xlab = "Lag")
```


## Another Example of Generated Data

```
model2 <- arima.sim(n = 500, list(order = c(1, 0, 0), ar = 0.75))
plot(model2, main = "Generated Data ARIMA(ar=0.75)", ylab = "Returns", xlab = "Time")
acf(model2, main = "Generated Data ARIMA(ar=0.75) autocorrelation", ylab = "ACF", xlab = "Lag")
```


The code above is another example of simulating a time series using the `arima.sim` function in R, this time from a different ARIMA model.

Here's the breakdown of the code:

`model2 <- arima.sim(n = 500, list(order = c(1, 0, 0), ar = 0.75))`

- `model2`: The variable where the simulated time series will be stored.
- `arima.sim()`: The function used to simulate the time series from an ARIMA model.
- `n = 500`: Specifies that you want to generate a time series of 500 data points.
- `list(order = c(1, 0, 0), ar = 0.75)`: Defines the parameters of the ARIMA model you want to simulate from.

- `order = c(1, 0, 0)`: Specifies the order of the ARIMA model.
  - The first number, `1`, indicates AR(1): there is one autoregressive term.
  - The second number, `0`, indicates that the series is not differenced (I(0)).
  - The third number, `0`, indicates MA(0): there are no moving average terms.
- `ar = 0.75`: Specifies the coefficient of the AR(1) term. The autoregressive equation for this model looks like:

X_t = 0.75X_{t-1} + Z_t

where (X_t) is the value of the time series at time t, (X_{t-1}) is the value of the time series at time t-1, and (Z_t) is a white noise error term at time t.

In summary, this code simulates a time series of length 500 from an AR(1) model with an autoregressive coefficient of 0.75. The resulting time series is stored in the `model2` variable.

## ARIMA

(1 - \sum_{i=1}^{p} \phi_i L^i) (1 - L)^d X_t = (1 + \sum_{j=1}^{q} \theta_j L^j) Z_t

Where:

- X_t is the observed time series value at time t .
- L is the lag operator. L^k X_t = X_{t-k} .
- d is the order of differencing. This means we difference the series d times to make it stationary.
- \phi_i are the parameters of the AR(p) component. p is the order of the autoregressive (AR) component. This represents the number of lags of the dependent variable that are used as predictors.
- \theta_j are the parameters of the MA(q) component. q is the order of the moving average (MA) component. This represents the number of lagged forecast errors that are used as predictors.
- Z_t is the white noise error term at time t .

Breaking it down:

- (1 - \sum_{i=1}^{p} \phi_i L^i) is the AR(p) component.
- (1 - L)^d represents differencing of order d .
- (1 + \sum_{j=1}^{q} \theta_j L^j) is the MA(q) component.

The ARIMA model is a generalization of simpler models, and by choosing appropriate values for p , d , and q , you can represent AR, MA, ARMA, and pure differencing models.
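For instance, with small values of p, d, and q the operator formula above reduces to the familiar special cases:

\text{ARIMA}(1,0,0): \quad X_t = \phi_1 X_{t-1} + Z_t

\text{ARIMA}(0,0,1): \quad X_t = Z_t + \theta_1 Z_{t-1}

\text{ARIMA}(0,1,0): \quad X_t - X_{t-1} = Z_t \quad \text{(a random walk)}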

**ARIMA in Simple Terms**

Imagine you’re trying to predict the weather for tomorrow. You have a basic idea because you know how the weather was today, yesterday, and the days before. You also know if today’s weather was an unusual surprise compared to yesterday. With these two pieces of information, you can make a pretty decent guess about tomorrow.

**ARIMA Components**

- **AR (AutoRegressive)**: This is like saying, "If it was sunny for the past three days, it's likely to be sunny tomorrow." It's based on the idea that what happened in the recent past can help predict what will happen next.
- **I (Integrated)**: Sometimes the weather doesn't change drastically from one day to the next, but over a period (like a week), it can change a lot. So instead of looking at daily temperatures, you look at how much the temperature changes day by day. This can help you see a bigger picture or trend.
- **MA (Moving Average)**: This is about surprises or unexpected changes. If today was unusually rainy compared to the past few days, the MA part helps to factor in that surprise when predicting tomorrow's weather.

**Bringing It All Together**

Using ARIMA is like saying:

- “I think tomorrow will be sunny because the past few days have been sunny (AR).”
- “It’s been getting slightly colder over the past week (I).”
- “But today was surprisingly warm, much warmer than I thought it would be based on the past few days (MA).”

Combining all this information gives you a well-rounded prediction for tomorrow’s weather.

**The ARIMA Formula Simplified**

Imagine you’re trying to predict how much money you’ll have in your piggy bank at the end of the week. You put in some coins every day, but the amount varies. You want a formula that helps you predict the total.

**AR (AutoRegressive) Part**

This is like saying, “Based on the last few days, if I added 5 coins yesterday, 4 coins two days ago, and 3 coins three days ago, I might add around 4 coins today.”

Formula in plain words:

`Today's coins = (a bit of yesterday's) + (a bit of day before's) + ...`


**I (Integrated) Part**

Sometimes you don’t directly use the number of coins, but the difference in coins from one day to the next. So instead of saying “I added 5 coins today”, you might say “I added 2 more coins today than yesterday.”

Formula in plain words:

`Difference in coins today = Today's coins - Yesterday's coins`


**MA (Moving Average) Part**

Now, sometimes, you add a few extra coins because you found some in the couch. This isn’t based on the previous days; it’s like a little surprise bonus. The MA part tries to capture these surprises.

Formula in plain words:

`Today's coins = (a bit of the surprise from yesterday) + (a bit of the surprise from the day before) + ...`

**Bringing It All Together**

The ARIMA formula combines all these parts. It looks at:

- The coins added based on the last few days (AR)
- The differences in coins over days to catch overall trends (I)
- And the surprising bonuses (like finding coins in the couch) that might affect the next day (MA)

Formula in very simple words:

`Today's prediction = (Bit of the past days) + (Trend over time) + (Recent surprises)`

This is the ARIMA formula broken down into simple, everyday terms. The real formula involves some intricate math, but this is the essence of it. It’s like a tool for making educated guesses based on past actions, trends, and surprises.

## Using ARIMA modeling on an example

```
install.packages("forecast")
library(forecast)
# Remember we already have 'quantmod'. Also install 'jsonlite'.
install.packages("jsonlite")
library(jsonlite)
# We can use a function called 'auto.arima'
# to automatically find the optimal parameters for an ARIMA model.
data.EURUSD <- getFX("EUR/USD", auto.assign = FALSE)
```


## Definition of the Process

A time series {X_t} is said to be an ARIMA(p,d,q) model if after taking d differences, the differenced data can be represented as ARMA(p,q).

Differencing:

\begin{align*} \Delta X_t &= X_t - X_{t-1} \\ \Delta^2 X_t &= \Delta X_t - \Delta X_{t-1} \\ &\vdots \\ \Delta^d X_t &= \Delta^{d-1} X_t - \Delta^{d-1} X_{t-1} \end{align*}
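In base R these differences correspond to the `diff` function; a small sketch:

```r
x  <- c(1, 4, 9, 16, 25)          # a simple series with a quadratic trend
d1 <- diff(x)                      # first differences, Delta X_t
d2 <- diff(x, differences = 2)     # second differences, Delta^2 X_t
d1  # 3 5 7 9
d2  # 2 2 2 -- the quadratic trend becomes constant after two differences
```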

## auto.arima

```
model.EURUSD <- auto.arima(data.EURUSD, max.p = 1, max.q = 1)
model.EURUSD
```


`auto.arima` is a function that automatically selects the best ARIMA model for a given time series, based on the AIC (Akaike Information Criterion). It does this by fitting multiple ARIMA models and selecting the one with the lowest AIC.
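For reference, the AIC balances goodness of fit against model complexity, where k is the number of estimated parameters and \hat{L} the maximized likelihood; lower AIC is better:

\text{AIC} = 2k - 2\ln(\hat{L})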

**Parameters**:

- `max.p = 1` limits the maximum AR (Autoregressive) order to 1.
- `max.q = 1` limits the maximum MA (Moving Average) order to 1.

So, in summary, this code automatically selects the best ARIMA model for the `data.EURUSD` time series with the constraint that neither the AR order nor the MA order exceeds 1. The resulting model (with its parameters) is stored in the `model.EURUSD` object.

```
Series: data.EURUSD
ARIMA(0,1,1)
Coefficients:
ma1
0.4017
s.e. 0.0658
sigma^2 = 6.834e-06: log likelihood = 806.47
AIC=-1608.93 AICc=-1608.87 BIC=-1602.57
```

The output `model.EURUSD` describes the ARIMA model that was selected as the best fit for the `data.EURUSD` time series, based on the criteria you provided (`max.p = 1` and `max.q = 1`).

Here’s a breakdown of the result:

- **ARIMA(0,1,1)**: The order of the selected ARIMA model.
  - **0**: The AR (Autoregressive) order. A value of 0 means there is no autoregressive term in the model.
  - **1**: The differencing order. A value of 1 means the time series has been differenced once to make it stationary.
  - **1**: The MA (Moving Average) order. A value of 1 means there is one moving average term.
- **Coefficients**:
  - **ma1**: The coefficient for the first (and only, in this case) moving average term.
  - **0.4017**: The estimated value of the MA(1) coefficient. It indicates the weight of the moving average term.
  - **s.e. 0.0658**: The standard error associated with the estimate of the MA(1) coefficient. It gives an indication of the precision of the estimate.

In summary, the `auto.arima` function has determined that an ARIMA(0,1,1) model with an MA(1) coefficient of 0.4017 is the best fit for the `data.EURUSD` time series under the given constraints.

“Drift” in time series analysis refers to a consistent linear trend in the data. It’s a term that represents the idea that a series, while being random, can still have a general direction.

Imagine you observe a stock price over time. Even if there are random fluctuations day-to-day, there might be an underlying trend where the stock price increases over the years. This slow and consistent upward trend would be the drift.

Mathematically, drift can be thought of as a constant term in a time series model. If Y_t represents the value of the series at time t, a model with drift could be expressed as:

Y_t = c + Y_{t-1} + \epsilon_t

Where:

- c is a constant, representing the drift.
- Y_{t-1} is the value of the series at the previous time step.
- \epsilon_t is the random error term at time t.

In the context of ARIMA models, if you see a “drift” term, it means that the model has detected (or been specified to include) a consistent linear trend in the data. This drift is separate from any seasonal or cyclical patterns that might also be present in the data.
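A minimal base-R sketch of a random walk with drift (simulated data; the drift of an ARIMA(0,1,0)-with-drift fit is estimated by the mean of the first differences):

```r
set.seed(7)
c_drift   <- 0.2                    # the drift constant c
eps       <- rnorm(300)             # white noise epsilon_t
y         <- cumsum(c_drift + eps)  # Y_t = c + Y_{t-1} + epsilon_t
drift_hat <- mean(diff(y))          # estimate of c; should be close to 0.2
```

Even though each step is random, the estimated drift recovers the consistent upward trend.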

## Forecasts

```
# forecast
for.EURUSD <- forecast(model.EURUSD)
for.EURUSD
# plot
plot(for.EURUSD, main = "Forecasting EUR/USD", ylab = "Returns", xlab = "Time")
```


**for.EURUSD <- forecast(model.EURUSD)**: This line uses the `forecast()` function from the forecast package in R. The function is applied to `model.EURUSD`, the ARIMA model that was fitted to the `data.EURUSD` series. The result of the forecast (predicted future values) is stored in the variable `for.EURUSD`.

## The `forecast()` Function

The `forecast()` function is one of the core functions of the `forecast` package in R, which offers methods and tools for analyzing and forecasting time series data. Here's a deeper dive into how it works:

**Usage:**

`forecast(object, h = 10, level = c(80, 95), fan = FALSE, ...)`


**Arguments:**

- **object**: An object of class "time series", or a model object for which forecasting is possible (for example, an ARIMA model).
- **h**: The number of periods to forecast. Default is 10.
- **level**: Confidence levels for prediction intervals. Default is c(80, 95), giving the 80% and 95% prediction intervals.
- **fan**: If `TRUE`, level is set to seq(51, 99, by = 3). This is suitable for fan plots.
- **…**: Other arguments depending on the method.

### How does it forecast?

The `forecast()`

function works differently based on the type of the model object passed to it. Here’s how it works for an ARIMA model:

**For ARIMA models**: The function forecasts using the ARIMA model equations. For a non-seasonal ARIMA(p, d, q) model:

- **AR(p)**: Uses the last 'p' observed values and the autoregressive coefficients.
- **I(d)**: Applies differencing 'd' times to make the series stationary.
- **MA(q)**: Uses the last 'q' forecast errors and the moving average coefficients.

**Prediction Intervals**: The `forecast()` function also provides prediction intervals for the forecasts. These intervals give a range in which the actual future value is likely to fall with a specified probability. For instance, an 80% prediction interval means there's an 80% chance the future value will fall within that range.

### Underlying Models:

The forecasting technique used by the `forecast()` function depends on the type of model object passed to it:

- For **ARIMA models**, it uses the ARIMA equations as mentioned.
- For **Exponential Smoothing State Space Models (ETS)**, it uses the ETS equations.
- For other models, it uses the corresponding model equations.

### Conclusion:

The `forecast()` function is versatile and can handle various types of time series models. The method of forecasting depends on the specific model passed to the function. For ARIMA, the forecast is generated using the AR, I, and MA components of the model.

*Using ARIMA modeling on an example (Disney)*

```
# Using ARIMA modeling on an example (Disney)
disney <- getSymbols("DIS", src = "yahoo", from = "2022-01-01", auto.assign = FALSE)
disney.close <- disney[, 4]
model.disney <- auto.arima(disney.close$DIS.Close)
model.disney
```


```
> model.disney
Series: disney.close$DIS.Close
ARIMA(0,1,1) with drift
Coefficients:
ma1 drift
0.0852 -0.1632
s.e. 0.0477 0.1129
sigma^2 = 4.909: log likelihood = -997.72
AIC=2001.43 AICc=2001.49 BIC=2013.77
```


`plot(disney.close$DIS.Close, main = "Disney Close Price", ylab = "Price", xlab = "Time")`


```
for.disney <- forecast(model.disney)
for.disney
plot(for.disney, main = "Forecasting Disney Close Price", ylab = "Price", xlab = "Time")
```


```
btc <- getSymbols("BTC-USD", src = "yahoo", auto.assign = FALSE)
btc.close <- btc[, 4]                      # close prices
btc.close <- btc.close[!is.na(btc.close)]  # drop missing values
sum(is.na(btc.close))
plot(btc.close, main = "Bitcoin Close Price", ylab = "Price", xlab = "Time")
model.btc <- auto.arima(btc.close)
forecast.btc <- forecast(model.btc)
plot(forecast.btc, main = "Forecasting Bitcoin Close Price", ylab = "Price", xlab = "Time")
```


```
> model.btc
Series: btc.close
ARIMA(0,1,0)
sigma^2 = 605350: log likelihood = -26811.39
AIC=53624.78 AICc=53624.78 BIC=53630.89
```

```
> forecast.btc
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
3322 29495.55 28498.45 30492.65 27970.62 31020.49
3323 29495.55 28085.44 30905.67 27338.97 31652.14
3324 29495.55 27768.52 31222.58 26854.29 32136.82
3325 29495.55 27501.35 31489.76 26445.68 32545.42
3326 29495.55 27265.97 31725.14 26085.69 32905.41
3327 29495.55 27053.16 31937.94 25760.24 33230.87
3328 29495.55 26857.47 32133.64 25460.95 33530.15
3329 29495.55 26675.32 32315.78 25182.38 33808.72
3330 29495.55 26504.25 32486.86 24920.75 34070.36
3331 29495.55 26342.44 32648.66 24673.28 34317.82
```


```
btc2 <- getSymbols("BTC-USD", src = "yahoo",from="2022-09-01", auto.assign = FALSE)
btc2.close <- btc2[, 4]
btc2.close <- btc2.close[!is.na(btc2.close)]
sum(is.na(btc2.close))
plot(btc2.close, main = "Bitcoin2 Close Price", ylab = "Price", xlab = "Time")
model.btc2 <- auto.arima(btc2.close)
forecast.btc2 <- forecast(model.btc2)
plot(forecast.btc2, main = "Forecasting Bitcoin2 Close Price", ylab = "Price", xlab = "Time")
```


## Back to Earth… What actually works

*analysis of 4 stocks in DAX*

```
ALV <- read.csv("ALV.csv", sep = ",", header = T)
head(ALV)
```


```
> head(ALV)
        Date  Open  High   Low Close  Volume AdjClose
1 2009-12-30 87.85 88.19 87.15 87.15  707600    82.91
2 2009-12-29 88.61 88.70 87.47 87.87 1109700    83.60
3 2009-12-28 88.00 88.61 87.61 88.36  807500    84.06
4 2009-12-23 87.79 88.10 86.68 87.58 1178800    83.32
5 2009-12-22 86.38 87.77 86.10 87.29 1675500    83.05
6 2009-12-21 85.00 86.47 84.26 86.28 1711700    82.08
```

```
alvAC <- ALV$AdjClose[1:252]
BMW <- read.csv("BMW.csv", sep=",", header=T)
bmwAC <- BMW$AdjClose[1:252]
CBK <- read.csv("CBK.csv", sep=",", header=T)
cbkAC <- CBK$AdjClose[1:252]
TKA <- read.csv("TKA.csv", sep=",", header=T)
tkaAC <- TKA$AdjClose[1:252]
date <- ALV$Date[1:252]
date <- as.Date(date)
dax <- data.frame(date, alvAC, bmwAC, cbkAC, tkaAC)
head(dax)
```


```
> head(dax)
date alvAC bmwAC cbkAC tkaAC
1 2009-12-30 82.91 31.56 5.89 26.09
2 2009-12-29 83.60 31.81 5.93 26.50
3 2009-12-28 84.06 31.56 5.94 26.19
4 2009-12-23 83.32 31.88 5.96 25.98
5 2009-12-22 83.05 31.92 5.90 25.61
6 2009-12-21 82.08 31.57 5.97 25.60
```


`plot(dax$date, dax$alvAC, type = "l", main = "ALV.DE", xlab = "dates", ylab = "adj. close")`


```
alvR <- 1:252; bmwR <- 1:252; cbkR <- 1:252; tkaR <- 1:252
# prices are stored newest-first, so the return on day i is price_i / price_{i+1} - 1;
# at i = 252 there is no day 253, so the last element of each vector becomes NA
for (i in 1:252){ alvR[i] <- (alvAC[i] / alvAC[i+1]) - 1 }
for (i in 1:252){ bmwR[i] <- (bmwAC[i] / bmwAC[i+1]) - 1 }
for (i in 1:252){ cbkR[i] <- (cbkAC[i] / cbkAC[i+1]) - 1 }
for (i in 1:252){ tkaR[i] <- (tkaAC[i] / tkaAC[i+1]) - 1 }
daxR <- data.frame(dax$date, alvR, bmwR, cbkR, tkaR)
daxRlog <- log(daxR[2:5] +1)
plot(dax$date,daxR$alvR, type="l",xlab="dates",ylab="returns")
lines(dax$date,daxRlog$alvR, type="l",col= "red")
```

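The four loops above can be replaced by one vectorized helper (a sketch; `returns_newest_first` is a hypothetical name, and prices are assumed newest-first as in the CSVs):

```r
# returns from a price series ordered newest-first;
# returns_newest_first is a hypothetical helper, not from the post
returns_newest_first <- function(p) {
  r <- p[-length(p)] / p[-1] - 1    # p[i] / p[i+1] - 1 for i = 1..n-1
  c(r, NA)                          # keep length n; last entry NA as in the loops
}

p     <- c(104, 102, 100)           # toy prices, newest first
r     <- returns_newest_first(p)    # 104/102 - 1, 102/100 - 1, NA
log_r <- log(r + 1)                 # log returns, as in daxRlog
```

On the real data, `alvR <- returns_newest_first(alvAC)` gives the same vector as the loop.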

## Obtaining summary statistics

```
> summary(daxRlog)
alvR bmwR cbkR tkaR
Min. :-0.1040287 Min. :-0.0810461 Min. :-0.1854445 Min. :-0.0849687
1st Qu.:-0.0132207 1st Qu.:-0.0151538 1st Qu.:-0.0260998 1st Qu.:-0.0156567
Median : 0.0012461 Median :-0.0003887 Median :-0.0017109 Median : 0.0004372
Mean : 0.0008991 Mean : 0.0015571 Mean :-0.0003246 Mean : 0.0016948
3rd Qu.: 0.0177220 3rd Qu.: 0.0190364 3rd Qu.: 0.0206481 3rd Qu.: 0.0205402
Max. : 0.1171325 Max. : 0.1383962 Max. : 0.1717314 Max. : 0.1506061
NA's :1 NA's :1 NA's :1 NA's :1
```


```
# boxplot for DAX data
boxplot(daxRlog, main = "DAX returns", ylab = "returns")
```


```
# Checking Normality of Returns
# histogram
hist(daxRlog$alvR, main = "ALV.DE returns histogram", xlab = "returns")
```


```
# From histogram to density function
alv <- na.omit(daxR$alvR); DS <- density(alv)
yl <- c(min(DS$y), max(DS$y)) #set y limits
hist(alv, probability = T, xlab = "ALV returns", main = NULL, ylim = yl)
lines(DS)
a <- seq(min(alv), max(alv), 0.001)
lines(a, dnorm(a, mean(alv), sd(alv)), col = "red")
```
