SOME ASPECTS OF APPLICATION OF VECM ANALYSIS FOR MODELING CAUSAL RELATIONSHIPS BETWEEN SPOT AND FUTURES PRICES

The article addresses the application of the econometric concept of cointegration and of vector error correction models (VECM) to the study of the relationship between futures prices and spot prices. The author attempts to identify the factors that determine how this methodology should be used for spot and futures prices. Causal modeling of the prices of futures contracts and their underlying instruments involves a number of problems that result from the specific nature of this dependency. These problems concern both the proper preparation of the data and the adaptation of the methods to the nature of the investigated phenomena. The article also points out possible interpretations of the results of VECM analysis in the context of the theory of spot and futures price linkages.


Introduction
In recent decades derivatives markets have grown tremendously, as reflected on the one hand by the huge variety of instruments offered on these markets and on the other hand by the increasing volume traded. Derivatives make it possible to manage the risk of changes in prices (of goods, interest rates, exchange rates, etc.) and are therefore widely used in business practice, especially in manufacturing and trade. At the same time, they are highly leveraged products, so the instruments themselves carry high risk.
Futures contracts, which are offered on regulated markets, mainly commodity, currency and stock exchanges, are today a major category of derivatives. The existence of linkages between the futures price and the price of the underlying instrument (spot or cash) appears obvious, because it follows from the construction of the contract as a derivative of another instrument. The nature and extent of these relationships, however, have been widely studied for decades because of the practical importance of the issue. The first significant works on this problem date back to the 1930s. Keynes [1930] and Hicks [1939] developed the concept of normal backwardation, which refers to the relationship between futures prices and expected cash prices. From the 1990s onward, research on causal relationships between cash and futures markets became the predominant trend. These studies were made possible by the development of a methodology based on vector autoregression models, introduced by Sims [1980]. These methods, which include cointegration analysis, vector autoregression models (VAR) and their transformation, error correction models (VECM), can serve as a starting point for analyses that allow inference about causality (in the Granger sense) between spot and futures prices.
The purpose of this article is to consider the specific features of applying econometric methods, including VECM modeling, to the analysis of the causal relationship (in the Granger sense) between futures prices and the prices of their underlying instruments. These reflections are presented in the context of the nature of futures prices and the underlying spot prices. The article also characterizes the features that distinguish this relationship and that require a specific approach to VECM modeling in this area. Based on a review of previous empirical studies, alternative solutions for the practical application of this method are analyzed.

Long-run relationships investigation in economic phenomena
The econometric concept of cointegration refers to relationships between non-stationary processes, of which the observed time series are realizations. Non-stationarity is understood here as the absence of weak (covariance) stationarity, i.e. a failure of the conditions that the mean, variance and covariance of the process be finite and constant over time. Non-stationary variables are cointegrated if there is a long-term relationship between them that is a process with a lower degree of integration [Charemza, Deadman, 1997]. An example of cointegrated series is the pair formed by futures quotations and quotations of the underlying instrument (see Figure 1). This is a case of CI(1, 1) cointegration.
According to the definition of Engle and Granger [1987], two processes $x_t$ and $y_t$ are cointegrated of order $d, b$, i.e. $x_t, y_t \sim CI(d, b)$, where $d \geq b > 0$, if: 1. both series are integrated of the same order $d$, 2. there is a linear combination of these variables, $\beta_1 x_t + \beta_2 y_t$, which is integrated of order $d - b$, where $\beta_1, \beta_2$ are the elements of the cointegrating vector $[\beta_1 \; \beta_2]$.
Two types of cointegration tests are most common in the literature: the Granger procedure and the Johansen procedure. The Johansen test is newer than the Granger procedure and is considered more appropriate because it provides more efficient estimators and can also be carried out when the residuals are non-normal and heteroscedastic. Moreover, it does not depend on the ordering of the variables in the regression equation [Kavussanos, Nomikos, 2003]. The Johansen procedure is based on the trace test and the maximum eigenvalue test, which are built on an error correction model specified as follows [Kusideł, 2000]:

$$\Delta y_t = \Psi_0 D_t + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + \varepsilon_t \tag{1}$$

where: $\Delta y_t$ is the vector of first differences of the current values of the analyzed processes for $m$ dependent variables, $y_t = [y_{1t}, y_{2t}, \ldots, y_{mt}]$; $D_t$ is the vector of deterministic components of the equations, such as an intercept, a time variable and dummy variables, including seasonal variables; $\Psi_0$ is the matrix of parameters on the variables in $D_t$; $\Pi$ is the product of the matrix of cointegrating vectors and the adjustment matrix; $\Gamma_i$ are the matrices of parameters on the lagged differences; $p$ is the maximum lag of the endogenous variables; $\varepsilon_t \sim N(0, \Sigma)$, where $\Sigma$ is the covariance matrix of the random component. Both the trace test and the maximum eigenvalue test examine the rank of the matrix $\Pi$. Johansen [1988] showed that this rank equals the number of independent cointegrating vectors. In the case of two variables, if the test results show that $\mathrm{rank}(\Pi) = 0$, then there is no cointegration relationship and the appropriate model for describing the causal relationship (in the Granger sense) between the two variables is a VAR in first differences. If $\mathrm{rank}(\Pi) = 1$, there is exactly one cointegrating vector, which is a prerequisite for estimating a VECM, and if $\mathrm{rank}(\Pi) = 2$, it can be assumed that the variables in the vector $y_t$ are stationary and model (1) is a VAR for the variables in levels.
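The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original study: the function names `select_rank` and `choose_model` are hypothetical, the trace statistics are made-up numbers, and the critical values 15.49 and 3.84 are illustrative stand-ins (actual values depend on the deterministic specification and should be taken from tables or from econometric software). The rank is selected with the usual sequential procedure: test rank <= 0, then rank <= 1, and stop at the first hypothesis that is not rejected.

```python
def select_rank(trace_stats, critical_values):
    """Sequential trace test: return the first r for which H0: rank <= r
    is NOT rejected (statistic does not exceed the critical value).
    If every hypothesis is rejected, the rank is full."""
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat <= crit:
            return r
    return len(trace_stats)


def choose_model(rank, n_vars=2):
    """Map the cointegration rank to the model suggested in the text."""
    if not 0 <= rank <= n_vars:
        raise ValueError("rank must lie between 0 and the number of variables")
    if rank == 0:
        return "VAR in first differences"   # no cointegration
    if rank == n_vars:
        return "VAR in levels"              # variables treated as stationary
    return "VECM"                           # reduced rank: cointegration


# Hypothetical trace statistics for H0: rank <= 0 and H0: rank <= 1
trace_stats = [25.3, 2.1]
critical_values = [15.49, 3.84]   # illustrative 5% values, an assumption

rank = select_rank(trace_stats, critical_values)
print(rank, choose_model(rank))
```

With these made-up inputs the first hypothesis is rejected and the second is not, so the sketch selects one cointegrating vector and points to the VECM.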
Cointegration analysis, preceded by an analysis of the stationarity of the time series, therefore makes it possible to choose the appropriate model (VAR or VECM) for testing Granger causality. According to the definition of Granger causality, the variable $X_t$ is a cause of the variable $Y_t$ if future values of $Y_t$ can be predicted better using the entire available information set than using that set with the information on $X_t$ excluded [Osińska, 2008]. The Granger representation theorem states that if the variables are cointegrated, then they have a representation in the form of an error correction model. The relationship between such variables can be interpreted in terms of causality insofar as this is justified, for example by economic theory.
The presence of a single cointegrating vector indicates that the error correction model is the better model for analyzing causal relationships in the studied phenomena. It makes it possible to distinguish between long-run and short-run dependence. The ability of two variables to remain in long-run equilibrium is evaluated on the basis of the significance of the coefficient on the error correction term in a given VECM equation. One can then identify the variable through which the correction of deviations from the long-term equilibrium takes place. On the basis of the VECM it is also possible to conduct a Granger causality test, which allows statistical inference about causality in the short run. The test procedure involves comparing the estimated error correction model with a restricted VECM in which zero restrictions are imposed on the coefficients of the variable whose causal role in the equation is examined. The Granger causality test procedure for the VECM is presented, for example, in [Osińska, 2008].
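One common way to carry out the comparison of the unrestricted and the restricted model is an F-type test on residual sums of squares. The sketch below assumes equation-by-equation least squares estimation and is not necessarily the variant presented in [Osińska, 2008]; the symbols $RSS_R$, $RSS_U$, $q$, $T$ and $k$ are introduced here only for illustration.

```latex
% H_0: all q coefficients on the lagged changes of the tested variable
%      are zero in the given VECM equation (no short-run Granger causality).
% RSS_U : residual sum of squares of the unrestricted equation
% RSS_R : residual sum of squares of the equation with the zero restrictions
% T     : number of observations, k : parameters in the unrestricted equation
\[
  F = \frac{(RSS_R - RSS_U)/q}{RSS_U/(T - k)}
  \;\overset{H_0}{\sim}\; F(q,\, T - k) \quad \text{(approximately)}
\]
```

A large value of $F$ means that dropping the lagged changes of the tested variable worsens the fit significantly, which is read as evidence of short-run Granger causality running from that variable.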

Relationship between spot and futures prices and causality
Linkages between cash and futures prices arise from the nature of the derivative. They are also reflected in theoretical models of contract pricing. The best-known formula for the valuation of futures is the cost-of-carry model, introduced in the early 1980s by Cornell and French [1983]. This model was developed for the valuation of forward contracts. Under non-stochastic interest rates, however, futures and forward prices are assumed to be the same (for instruments with the same parameters), so the cost-of-carry formula is also used for pricing futures contracts traded on regulated exchanges. The formulas used to calculate the fair value of a futures contract in the cost-of-carry model vary with the category of the underlying instrument (commodities, currencies, interest rates, equities), since each type of underlying instrument has different costs of storage. For example, the cost-of-carry model for stock and index futures contracts, which are the most popular among investors, takes the form:

$$FV_t = S_t e^{(r - q)T} \tag{2}$$

where: $FV_t$ is the theoretical futures contract price at the moment $t$, $S_t$ is the spot price at the moment $t$, $r$ is the risk-free rate, $q$ is the dividend yield (the ratio of dividends per share to the market price of shares), and $T = n/365$ is the time to maturity of the contract ($n$ is the number of days to maturity).
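As a small numerical illustration of the cost-of-carry valuation of an index futures contract, the sketch below assumes continuous compounding of the net carry rate $r - q$; this compounding convention, the function name `fair_value` and the input figures are assumptions made for the example.

```python
import math


def fair_value(spot, r, q, days_to_maturity):
    """Theoretical futures price under the cost-of-carry model,
    assuming continuous compounding of the net carry rate (r - q)."""
    T = days_to_maturity / 365          # time to maturity in years
    return spot * math.exp((r - q) * T)


# Hypothetical inputs: index at 2000 points, 5% risk-free rate,
# 2% dividend yield, 73 days to maturity
fv = fair_value(spot=2000.0, r=0.05, q=0.02, days_to_maturity=73)
print(round(fv, 2))
```

When the risk-free rate equals the dividend yield, or when the contract is at maturity, the function returns the spot price, which matches the intuition that the futures price converges to the spot price as the cost of carry vanishes.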
The cost-of-carry model rests on the assumption that the current futures price equals the price that would be paid for the underlying instrument now plus the cost of storing it until a certain moment in the future. This moment is determined by the needs of the investor involved in the contract and is defined by the maturity of the contract. The connection between the futures price and the cash price described by the cost-of-carry model is also often presented in a slightly different way [Stoll, Whaley, 1990]:

$$R_{S,t} = (r - q) + R_{F,t} \tag{3}$$

where: $R_{S,t}$ is the rate of return of the underlying instrument and $R_{F,t}$ is the rate of return of the futures contract.
There is also another concept of the relation between futures and spot prices, different from the cost-of-carry model. According to that concept, the price of a futures contract at a given moment is equal to the sum of the price of the underlying good $S_t$, the expected risk premium $E_t[P(T,t)]$ and the expected change in the spot price $E_t[S_T - S_t]$ [Fama, French, 1987]:

$$F_t = S_t + E_t[P(T,t)] + E_t[S_T - S_t] \tag{4}$$

The implication of both the cost-of-carry model and the expected risk premium model is the existence of a stable long-term relationship between spot and futures prices [Asche, Guttormsen, 2002]. Moreover, if the conditions underlying the cost-of-carry model are met, i.e. no transaction costs, no short sale restrictions, no information asymmetry, etc., then equation (3) implies that changes in cash prices and futures prices should be simultaneous, with neither rate of return lagging the other [Lafuente, Novales, 2003]. In most markets, however, Granger-causal relationships between spot and futures prices are observed. Green and Joujon [2000] showed that bi-directional causality (i.e. when spot price changes cause changes in futures prices and vice versa), as well as one-directional causality, does not contradict the formation of prices on the basis of the cost-of-carry model.
A number of studies carried out on exchanges around the world have been devoted to Granger causality between the prices of futures and of their underlying instruments. A detailed review of the results of most studies conducted in the 1980s and 1990s was provided by Sutcliffe [2006]. The findings of these studies can be generalized as follows: causality more often runs from futures to spot, so the futures market more frequently leads the market of the underlying instrument. The reverse situation is much rarer, as is two-way causality. Another regularity revealed by the studies is that the causal relationship in prices depends on the degree of market development. In less developed markets, spot and futures prices are usually less synchronized than in more mature markets, so one of the markets clearly follows the other. This could mean that the more efficient the markets, which are generally the mature ones, the weaker the leading role of one of the markets in price discovery (or the role may not exist at all). The above considerations apply when both markets are sufficiently liquid, because low market liquidity slows the incorporation of new information into the prices of listed securities. However, other factors can also delay this process, such as limitations of the trading systems operating on a given exchange, the level of transaction costs, price limits, etc.

Specific features of VECM modeling in case of spot and futures prices relationship
In particular, when the relationship between two variables such as spot and futures prices is modeled, the VECM equations can be written as follows:

$$\Delta s_t = a_{S,0} + \sum_{i=1}^{p} a_{S,i} \Delta s_{t-i} + \sum_{i=1}^{p} b_{S,i} \Delta f_{t-i} + \alpha_S ECT_{t-1} + \sum_{i} c_{S,i} k_{i,t} + \varepsilon_{S,t} \tag{5}$$

$$\Delta f_t = a_{F,0} + \sum_{i=1}^{p} a_{F,i} \Delta s_{t-i} + \sum_{i=1}^{p} b_{F,i} \Delta f_{t-i} + \alpha_F ECT_{t-1} + \sum_{i} c_{F,i} k_{i,t} + \varepsilon_{F,t} \tag{6}$$

where: $\Delta f_t$ is the logarithmic rate of return of the futures contract, $\Delta s_t$ is the logarithmic rate of return of the underlying instrument, $a_{S,0}, a_{F,0}$ are intercepts, $a_{S,i}, b_{S,i}, a_{F,i}, b_{F,i}$ are short-run coefficients, $ECT_{t-1}$ is the error correction term, $\alpha_S, \alpha_F$ are long-run coefficients, $k_{i,t}$ are deterministic variables, $c_{S,i}, c_{F,i}$ are the coefficients on the deterministic variables, $p$ is the maximum lag of the variables $\Delta f_t$ and $\Delta s_t$, and $\varepsilon_{F,t}, \varepsilon_{S,t}$ are random components (Gaussian white noise).
Modeling Granger causality between the prices of futures contracts and their underlying instruments involves a number of problems that result from the specific nature of this relationship. These problems concern both the proper preparation of the data and the adjustment of the methodology so that it corresponds to the nature of the phenomena examined.

TABLE 1. Examples of causal analyses of spot and futures prices at different data frequencies
Data frequency | Examples of empirical research
daily | [Bohl et al., 2011], [Ozen et al., 2009], [Nieto et al., 1998], [Chen, Zheng, 2008], [Green, Joujon, 2000]
1-hour | [Gwilym, Buckle, 2001]
15-minute | [Gosh, 1993], [Hodgson et al., 2006], [Cheung, Ng, 1999]
5-minute | [Stoll, Whaley, 1990], [Chiang, Fong, 2001], [Frino, West, 1999], [Abhyankar, 1998]
1-minute | [Dwyer et al., 1996], [Kawaller et al., 1988], [Pizzi et al., 1998]
tick-by-tick | [Chu et al., 1999], [Fung, Jiang, 1999]
Source: own research

The first significant problem in causal modeling is the choice of the frequency of the analyzed transaction data. There are many possibilities, from data of the highest frequency (tick-by-tick), through intraday observations at regular time intervals (e.g. 5, 15, 30 or 60 minutes), to closing prices (see Table 1). Analyses carried out on closing prices avoid the problem of non-synchronous spot and futures transaction prices: usually there is no need to discard observations that do not coincide in time, a step that could bias the results of causal modeling. This problem does occur with intraday data, but in the era of high-frequency investing, analyses of trading data at a frequency higher than daily seem to have more practical value. They reveal causal relationships that manifest themselves over very short time intervals. In addition, the analysis of high-frequency data, also in terms of causal relationships between the prices of different instruments, contributes to the study of market microstructure, defined as the set of features and mechanisms of a particular market that determine how prices are formed and under what conditions and at what time transactions occur [Doman, 2011]. It should be noted, however, that in the analysis of high-frequency data, especially in intraday studies of financial market phenomena, hybrid models are often used. In addition to the error correction mechanism, they include structures that model the irregular variability typical of financial time series. Such models can take various forms, e.g. VECM-DCC-GARCH [Bohl et al., 2011], VECM-TGARCH [Floros, 2009], VECM-SV [Pajor, 2006].
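The non-synchronicity problem mentioned above can be illustrated with a short data preparation sketch. It assumes tick data given as (timestamp in seconds, price) pairs, samples the last traded price in each 5-minute bar, and keeps only the bars in which both instruments traded. The function names and the sample ticks are hypothetical, and this is only one of several possible synchronization schemes.

```python
def last_price_per_bar(ticks, bar_seconds=300):
    """Map bar index -> last traded price observed within that bar.

    ticks: iterable of (timestamp_in_seconds, price) pairs, in any order.
    """
    bars = {}
    for ts, price in sorted(ticks):
        bars[ts // bar_seconds] = price   # later ticks overwrite earlier ones
    return bars


def synchronized_bars(spot_ticks, futures_ticks, bar_seconds=300):
    """Return [(bar_index, spot_price, futures_price)] for the bars in which
    BOTH instruments traded; bars with trades in one market only are dropped."""
    spot = last_price_per_bar(spot_ticks, bar_seconds)
    fut = last_price_per_bar(futures_ticks, bar_seconds)
    common = sorted(spot.keys() & fut.keys())
    return [(b, spot[b], fut[b]) for b in common]


# Hypothetical ticks: the spot instrument does not trade in the second bar
spot_ticks = [(10, 100.0), (290, 100.5), (620, 101.0)]
futures_ticks = [(5, 100.8), (310, 101.1), (650, 101.6), (700, 101.4)]

bars = synchronized_bars(spot_ticks, futures_ticks)
print(bars)   # [(0, 100.5, 100.8), (2, 101.0, 101.4)]
```

Dropping the bar without a spot trade avoids pairing a fresh futures price with a stale spot price, at the cost of losing observations; carrying the last price forward instead would keep the sample size but could create spurious lead-lag patterns.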
Another problem that emerges in cointegration analysis and causality modeling is the choice of deterministic variables in the VECM equations. In VECM models (and in VAR models in general), the matrix of deterministic variables may include seasonal variables. However, for futures and spot prices, which tend to be cointegrated, they do not seem necessary: if both series share the same linear trend and seasonality, there is no need to include seasonal variables among the deterministic variables [Gorecki, 2010]. Deterministic variables are, however, often used to represent discontinuities in the data set and to account for structural breaks in the series, especially in the case of intraday data [Green, Joujon, 2000; Kavussanos, Nomikos, 2003]. Such a variable might be the number of days between consecutive sessions, which captures the overnight, weekend or holiday break. An additional dummy variable can also mark the moment of rollover between contract series, because the futures price time series under study are usually spliced together from many individual series.
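The two deterministic variables described above can be constructed as follows. This is a minimal sketch for daily data: the function names, the session dates and the contract codes ("H24", "M24") are hypothetical, and the gap is measured in calendar days since the previous session, which is one possible convention.

```python
from datetime import date


def session_gap_days(session_dates):
    """Calendar days elapsed since the previous session (0 for the first one).
    A value of 1 is an ordinary overnight break, 3 a regular weekend."""
    if not session_dates:
        return []
    gaps = [0]
    for prev, cur in zip(session_dates, session_dates[1:]):
        gaps.append((cur - prev).days)
    return gaps


def rollover_dummy(contract_codes):
    """1 for the first observation taken from a new contract series, else 0."""
    if not contract_codes:
        return []
    flags = [0]
    for prev, cur in zip(contract_codes, contract_codes[1:]):
        flags.append(1 if cur != prev else 0)
    return flags


# Hypothetical sessions: Thursday, Friday, then Monday and Tuesday,
# with the spliced series switching contracts after the weekend
dates = [date(2024, 3, 14), date(2024, 3, 15), date(2024, 3, 18), date(2024, 3, 19)]
codes = ["H24", "H24", "M24", "M24"]

print(session_gap_days(dates))   # [0, 1, 3, 1]
print(rollover_dummy(codes))     # [0, 0, 1, 0]
```

Both series can then enter the VECM equations as deterministic regressors alongside the intercept.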
In cointegration analysis it is acceptable to adopt the form of the cointegrating vector a priori [Charemza, Deadman, 1997; Majsterek, 2005]. This assumption is also possible for the long-run dependence between futures prices and cash prices. The natural representation of the cointegrating vector is then the futures basis $b_t$. The basis is the primary indicator of the relationship between spot and futures prices at a given moment. The effectiveness of hedging strategies depends on its value and stability. The basis is expressed as the difference between the price of the underlying asset ($S_t$) and the price of the futures contract ($F_t$):

$$b_t = S_t - F_t \tag{7}$$

Appropriate statistical tests usually show that the time series of spot and futures prices are non-stationary, i.e. they are realizations of I(1) processes. Thus, they are cointegrated if there is a stationary linear combination of them. This condition corresponds in a natural and intuitive way to the concept of the basis. Alexander [1999] and Green and Joujon [2000] pointed out, however, that the basis representing the cointegration relationship is expressed in a slightly modified form, as the difference between the logarithms of spot and futures prices:

$$b_t = s_t - f_t \tag{8}$$

where $s_t = \ln S_t$ and $f_t = \ln F_t$.
The theoretical foundation for adopting the basis as a cointegrating vector was presented by Brenner and Kroner [1995], and it was verified empirically by Bohl et al. [2011].
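To show how the log basis can serve as a pre-specified error correction term, the sketch below simulates a pair of cointegrated log prices and estimates the adjustment coefficients of both equations by ordinary least squares. It is a deliberately stripped-down illustration: the VECM contains no lagged differences and no deterministic variables other than the intercept, the data are simulated with assumed parameters (the spot price corrects 20% of the deviation per period, the futures price does not react), and all function names are hypothetical.

```python
import math
import random


def ols_slope_intercept(x, y):
    """Least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def simulate_log_prices(n, alpha_s, alpha_f, sigma=0.01, seed=0):
    """Simulate log spot (s) and log futures (f) prices whose changes
    react to the lagged log basis s - f with speeds alpha_s and alpha_f."""
    rng = random.Random(seed)
    s, f = [math.log(100.0)], [math.log(100.0)]
    for _ in range(n):
        ect = s[-1] - f[-1]                      # lagged log basis
        s.append(s[-1] + alpha_s * ect + rng.gauss(0.0, sigma))
        f.append(f[-1] + alpha_f * ect + rng.gauss(0.0, sigma))
    return s, f


def estimate_adjustment(s, f):
    """Regress each log return on the lagged log basis (plus intercept)
    and return the estimated adjustment coefficients (alpha_s, alpha_f)."""
    ect_lag = [si - fi for si, fi in zip(s[:-1], f[:-1])]
    ds = [b - a for a, b in zip(s, s[1:])]
    df = [b - a for a, b in zip(f, f[1:])]
    alpha_s_hat, _ = ols_slope_intercept(ect_lag, ds)
    alpha_f_hat, _ = ols_slope_intercept(ect_lag, df)
    return alpha_s_hat, alpha_f_hat


s, f = simulate_log_prices(5000, alpha_s=-0.2, alpha_f=0.0)
alpha_s_hat, alpha_f_hat = estimate_adjustment(s, f)
print(alpha_s_hat, alpha_f_hat)
```

In this setup the estimate for the spot equation should come out close to -0.2 and the one for the futures equation close to zero, i.e. the spot price does the correcting while the futures price behaves like a random walk. With real data one would add lagged returns and deterministic variables and test the significance of the coefficients.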

Interpretation of the results of VECM modeling
The presence of causal relationships between spot and futures prices can be considered in relation to the Efficient Market Hypothesis (EMH). The concept of an informationally efficient market was introduced by Fama [1965] in the 1960s. According to the EMH, in an efficient market all information is already reflected in prices, so it is not possible to predict future price movements or to sustain a long-term rate of return higher than the market benchmark. The ability to obtain better forecasts of a variable using past values of other variables contradicts the conditions of an informationally efficient market. Therefore, the concept of Granger causality is also used to verify the EMH on cash and futures markets. The use of VAR and VECM models to verify the efficient market hypothesis is described, for example, in [Maddala, 2006]. An example of the application of this methodology to the analysis of market efficiency was provided by Nieto et al. [1998].
However, the literature offers different views on how cointegration between the prices of exchange-listed instruments bears on market efficiency. Kuhl [2007] argued that the presence of cointegration contradicts the weak form of efficiency. On the other hand, Sweeney [2003] demonstrated that the presence of cointegration is unrelated to market efficiency, but only under certain conditions. Hakkio and Rush [1989] presented arguments that cointegration determines the existence of market efficiency. Similar conclusions were presented by Mall et al. [2011], who found the existence of cointegration between the index futures market and the underlying market to be closely related to informational efficiency.
The results of research on cointegration and on long-run and short-run causality between futures and cash prices may also serve as a reference for considering the price discovery process. Price discovery consists in the price on one market revealing information about the future price on the other market. Based on previous studies, two concepts of price discovery can be distinguished. The first is related to the theory of expectations, i.e. it refers to the assumption that the futures price is an estimate of the future value of the underlying instrument. The term "future" relates here to the time of delivery (physical or cash settlement) of the underlying asset on the expiry date of the contract. This understanding of the price discovery function of the futures market corresponds to the idea of OTC markets, where non-standardized contracts, i.e. forward contracts, are traded. The second concept reflects the change in the perception of price discovery in recent years. Price discovery is seen as an opportunity to predict the behavior of one price in the near future on the basis of the price from another market. In this sense, the way this function is realized is closely tied to market microstructure. The specific subject of study here is the way new information spreads across the related spot and futures markets. Because the goods offered in both markets are mutually substitutable, it is natural that such information affects the prices of both the derivative and the underlying instrument. Price discovery is performed by the market in which new information is reflected in the price more quickly. In this approach it is not assumed in advance that the futures market plays the price discovery role. On the basis of the VECM it is possible to estimate the extent to which one market leads the other. The Common Factor Weight (CFW) measure, developed on the basis of the studies of Schwarz and Szakmary [1994] and Gonzalo and Granger [1995], can be expressed as [see Bohl et al., 2011; Rittler, 2009]:

$$CFW_S = \frac{|\alpha_F|}{|\alpha_S| + |\alpha_F|} \tag{9}$$

and

$$CFW_F = \frac{|\alpha_S|}{|\alpha_S| + |\alpha_F|} \tag{10}$$

where $CFW_S$ and $CFW_F$ denote the relative price discovery contributions of the spot and futures markets, and $\alpha_S$, $\alpha_F$ are the parameter estimates in equations (5) and (6). When $CFW_S = 1$ (or $CFW_F = 1$), the whole price discovery process takes place in the spot market (or the futures market, respectively). Equations (9) and (10) are universal and apply to any normalization of the cointegrating vector, since they use the absolute values of the parameters $\alpha_S$, $\alpha_F$. Given the way the indicators in formulas (9) and (10) are calculated, price discovery takes place in the market that corrects deviations from the long-term equilibrium between spot and futures prices more slowly. It may be present in both markets, to an equal or varying degree. The process of price discovery is associated with the existence of long-term dependence, but VECM systems also make it possible to identify causal relationships that occur in the short term. Problems with the economic interpretation of the results appear, however, when the results indicate bi-directional causality between spot and futures prices. On theoretical grounds it is difficult to explain a mechanism by which cash market prices affect futures prices and vice versa. It seems that such a case can be regarded as a reason to undertake analyses using transaction data of higher frequency, which make it possible to distinguish cause from effect.
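The calculation of the price discovery shares can be sketched as follows. The sketch assumes that each market's weight is the absolute adjustment coefficient of the other market divided by the sum of both absolute coefficients, which is consistent with the statement that the slower-correcting market performs price discovery; the function name and the coefficient values are hypothetical.

```python
def common_factor_weights(alpha_s, alpha_f):
    """Return (CFW_S, CFW_F): the price discovery shares of the spot and
    futures markets computed from the VECM adjustment coefficients.
    The market that adjusts less receives the larger weight."""
    denom = abs(alpha_s) + abs(alpha_f)
    if denom == 0:
        raise ValueError("both adjustment coefficients are zero: no error correction")
    return abs(alpha_f) / denom, abs(alpha_s) / denom


# Hypothetical estimates: the spot price corrects three times faster
# than the futures price
cfw_s, cfw_f = common_factor_weights(alpha_s=-0.15, alpha_f=0.05)
print(cfw_s, cfw_f)
```

With these made-up coefficients roughly one quarter of price discovery is attributed to the spot market and three quarters to the futures market. The two weights always sum to one, and the signs of the coefficients do not matter because only absolute values enter the calculation.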

Conclusions
The econometric concepts of cointegration and Granger causality are widely used in studies of economic phenomena. They have been applied to the analysis of price dependencies between the markets for shares, currencies, commodities and natural resources and the associated derivatives markets. Their application to the study of the relationship between cash and futures prices, however, requires an individualized approach that takes into account the specific nature of both markets and the links between them. Particular attention should be paid to the proper preparation of data for analysis and to the characteristics of the examined phenomena, with special regard to the basis as the primary indicator of the linkage between spot and futures prices.
The results of cointegration analysis and VECM modeling can be applied both to the discussion of the informational efficiency of exchange markets and to the consideration of the price discovery function. As shown, however, the interpretation of the results of VECM analysis in this context is not clear-cut, as there are different views on these issues.