Forecasting with Bayesian Techniques (MP)

A 72-slide presentation.

Slides and text of this presentation


Slide 3





Introduction: Two Perspectives in Econometrics
Let θ be a vector of parameters to be estimated using data
For example, if y_t ~ i.i.d. N(μ, σ²), then θ = [μ, σ²] is to be estimated from a sample {y_t}
Classical perspective:
there is an unknown true value for θ
we obtain a point estimator as a function of the data: θ̂ = θ̂(y)
Bayesian perspective:
θ is an unknown random variable, for which we have initial uncertain beliefs: the prior probability distribution
we describe (changing) beliefs about θ in terms of a probability distribution (not as a point estimator!)

Slide 4





Outline
Why a Bayesian Approach to VARs?
Brief Introduction to Bayesian Econometrics
Analytical Examples
Estimating a distribution mean
Linear Regression
Analytical priors and posteriors for BVARs
Prior selection in applications (incl. DSGE-VARs)

Slide 5





Why a Bayesian Approach to VAR? 
Dimensionality problem with VARs:
	y contains n variables, p lags in the VAR
The number of parameters in c and A is n(1+np), and the number of parameters in Σ is n(n+1)/2
Assume n=4, p=4: then we are estimating 78 parameters; with n=8, p=4, we have 300 parameters
A tension: better in-sample fit, worse forecasting performance
Sims (Econometrica, 1980) acknowledged the problem:
	“Even with a small system like those here, forecasting, especially over relatively long horizons, would probably benefit substantially from use of Bayesian methods or other mean-square-error shrinking devices…”
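The parameter count above is simple arithmetic; a quick sketch to verify it (the function name is ours, nothing beyond the formula on the slide is assumed):

```python
def var_param_count(n: int, p: int) -> int:
    """Free parameters of an n-variable VAR(p): intercepts c and lag
    matrices A give n*(1 + n*p) coefficients, and the error covariance
    Sigma has n*(n+1)/2 distinct elements."""
    return n * (1 + n * p) + n * (n + 1) // 2

print(var_param_count(4, 4))  # 78
print(var_param_count(8, 4))  # 300
```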

Slide 6





Why a Bayesian Approach to VAR? (2)
Usually, only a fraction of estimated coefficients are statistically significant
parsimonious modeling should be favored 
What could we do?
Estimate a VAR with classical methods and use standard tests to exclude variables (e.g., reduce the number of lags)
Use a Bayesian approach to VAR, which allows for:
interaction between variables
a flexible specification of the likelihood of such interaction

Slide 7





Combining information: prior and posterior
Bayesian coefficient estimates combine information in the prior with evidence from the data
Bayesian estimation captures changes in beliefs about model parameters
Prior: initial beliefs (e.g., before we saw data)
Posterior: new beliefs = evidence from data + initial beliefs

Slide 8





Shrinkage
There are many approaches to reducing over-parameterization in VARs
A common idea is shrinkage
Incorporating prior information is a way of introducing shrinkage
The prior information can be reduced to a few parameters, i.e. hyperparameters

Slide 9





Forecasting Performance of BVAR vs. alternatives
Source: Litterman, 1986

Slide 10





Introduction to Bayesian Econometrics: Objects of Interest
Objects of interest:
Prior distribution: p(θ)
Likelihood function: p(y|θ), the likelihood of the data at a given value of θ
Joint distribution (of unknown parameters and observables/data): p(y, θ) = p(y|θ) p(θ)
Marginal likelihood: p(y) = ∫ p(y|θ) p(θ) dθ
Posterior distribution: p(θ|y) = p(y|θ) p(θ) / p(y)
	i.e. what we learn about the parameters from (1) the prior and (2) observing the data

Slide 11





Bayesian Econometrics: Objects of Interest (2)
The marginal likelihood p(y)…
…is independent of the parameters of the model
Therefore, we can write the posterior as proportional to prior times likelihood: p(θ|y) ∝ p(y|θ) p(θ)
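Because the posterior is proportional to prior times likelihood, it can be traced out on a grid without ever computing p(y). A minimal numpy sketch for the normal-mean setting used later in these slides (the sample, prior, and grid bounds are illustrative choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(3.0, 1.0, size=20)        # data; sigma = 1 treated as known

mu = np.linspace(-2, 6, 801)             # grid over the unknown mean
log_prior = -0.5 * (mu - 1.0) ** 2       # mu ~ N(1, 1), up to a constant
log_lik = -0.5 * ((y[:, None] - mu) ** 2).sum(axis=0)

log_post = log_prior + log_lik           # log posterior, up to a constant
post = np.exp(log_post - log_post.max())
post /= post.sum() * (mu[1] - mu[0])     # normalize to a density on the grid

print(mu[post.argmax()])                 # posterior mode, between prior mean and ybar
```

Note that the normalizing constant p(y) is only recovered implicitly, at the very last step.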

Slide 12





Bayesian Econometrics: maximizing criterion
For practical purposes, it is useful to focus on the criterion: C(θ) = log p(y|θ) + log p(θ)
Traditionally, priors that let us obtain analytical expressions for the posterior would be needed
Today, with increased computer power, we can use any prior and likelihood distribution, as long as we can evaluate them numerically
Then we can use Markov Chain Monte-Carlo (MCMC) methods to simulate the posterior distribution (not covered in this lecture)

Slide 13





Bayesian Econometrics : maximizing criterion (2)
Maximizing C() gives the Bayes mode. In some cases (i.e. Normal distributions) this is also the mean and the median
The criterion can be generalized to:
λ controls relative importance of prior information vs. data

Slide 14





Analytical Examples
Let’s work on some analytical examples:
Sample mean
Linear regression model

Slide 15





Estimating a Sample Mean
Let y_t ~ i.i.d. N(μ, σ²); then the data density function is:
p(y|μ) = (2πσ²)^(−T/2) exp( −(1/(2σ²)) Σ_t (y_t − μ)² )
	where y = {y_1, …, y_T}
For now: assume the variance σ² is known (certain)
Assume the prior distribution of the mean μ is normal, μ ~ N(m, σ²/ν):
p(μ) = (2πσ²/ν)^(−1/2) exp( −(ν/(2σ²)) (μ − m)² )
	where the key parameters of the prior distribution are m and ν

Slide 16





Estimating a Sample Mean
The posterior of μ, p(μ|y) ∝ p(y|μ) p(μ),
…has the following analytical form: μ|y ~ N(m*, σ²/(ν + T))
	with m* = (ν m + T ȳ)/(ν + T)
	So, we "mix" the prior m and the sample average ȳ (the data)

Note:
The posterior distribution of μ is again normal
Diffuse prior: ν → 0 (the prior is not informative; everything is in the data)
Tight prior: ν → ∞ (the data are not important; the prior is rather informative)
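The mixing formula can be sketched in a few lines; the helper name and the sample below are ours, and the diffuse/tight limits are approximated with extreme ν values rather than true limits:

```python
import numpy as np

def posterior_mean(m, nu, y):
    """Conjugate update for a normal mean with known variance:
    m* = (nu*m + T*ybar) / (nu + T); posterior variance is sigma^2/(nu+T)."""
    T = len(y)
    return (nu * m + T * np.mean(y)) / (nu + T)

y = np.array([2.8, 3.1, 3.4, 2.9, 3.2])     # illustrative sample, mean 3.08
print(posterior_mean(1.0, 1e-6, y))          # diffuse prior: ~ sample mean
print(posterior_mean(1.0, 1e6, y))           # tight prior: ~ prior mean 1.0
```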

Slide 17





Estimating a Sample Mean: Example
Assume the true distribution is normal: y_t ~ N(3, 1)
So, μ = 3 is known to… God
A researcher (one of us) does not know μ
for him/her it is a normally distributed random variable μ ~ N(m, 1/ν)
The researcher initially believes that m = 1 and ν = 1, so his/her prior is μ ~ N(1, 1)

Slide 18





Posterior with prior N(1,1)
Compute the posterior distribution as sample size increases
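A small simulation of this experiment (the seed and sample sizes are our choice): draw data from N(3, 1), start from the prior N(1, 1), and watch the posterior mean approach 3 while the posterior standard deviation σ/√(ν+T) shrinks:

```python
import numpy as np

rng = np.random.default_rng(1)
m, nu, sigma2 = 1.0, 1.0, 1.0            # prior mu ~ N(1, 1), sigma^2 known
data = rng.normal(3.0, 1.0, size=200)    # true mean 3, known only to... God

for T in (5, 20, 200):
    y = data[:T]
    m_star = (nu * m + T * y.mean()) / (nu + T)   # posterior mean
    sd_star = np.sqrt(sigma2 / (nu + T))          # posterior std. dev.
    print(T, round(m_star, 3), round(sd_star, 3))
```

As T grows, the prior's weight ν/(ν+T) vanishes and the posterior concentrates around the truth.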

Slide 19





Posterior with Prior N(1,1/50)
Then, we look at a more informative (tight) prior and set ν = 50 (higher precision)

Slide 20





Examples: Regression Model I
Linear regression model: y_t = x_t′β + u_t
	where u_t ~ i.i.d. N(0, σ²)
Assume:
β is random and unknown
but σ² is fixed and known
Convenient matrix representation: y = Xβ + u
where y = (y_1, …, y_T)′ and X stacks the regressors x_t′ row by row
The density function for the data is:
p(y|β) = (2πσ²)^(−T/2) exp( −(1/(2σ²)) (y − Xβ)′(y − Xβ) )

Slide 21





Examples: Regression Model I (2)
Assume that the prior for β is multivariate normal, β ~ N(m, σ²M):
	where the key parameters of the prior distribution are m and M
Bayes' rule states: p(β|y) ∝ p(y|β) p(β)
i.e., the posterior of β is proportional to the product of the data density and the prior

Slide 22





Examples: Regression Model I (3)
We mix information, the densities of the data and the prior, to get the posterior distribution!
Result: the density function of β is p(β|y) ∝ p(y|β) p(β)…
… which means that the posterior distribution is again (!) normal: β|y ~ N(m*, σ²M*)
with mean and variance
m* = (M⁻¹ + X′X)⁻¹ (M⁻¹m + X′y),  M* = (M⁻¹ + X′X)⁻¹

Slide 23





Since we do not like black boxes… there are 2 ways to get m* and M* (the 2 parameters that characterize the posterior)
The long: manipulate the product of density functions (see Hamilton, 1994, p. 367)
The smart: use a GLS regression…

We have 2 ingredients:
the prior distribution β ~ N(m, σ²M), which implies m = β + ε_m with ε_m ~ N(0, σ²M)
and our regression model y = Xβ + u that "catches" the impact of the data on the estimate of β

Slide 24





Define a "new" regression model:
[m; y] = [I; X] β + [ε_m; u]
We simply stack our "ingredients" together to mix the information (prior and data) so that now β takes both into account!
The GLS estimator of β is exactly our posterior mean: m* = (M⁻¹ + X′X)⁻¹ (M⁻¹m + X′y)
And the posterior variance of β is σ²M* = σ² (M⁻¹ + X′X)⁻¹
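The equivalence between the stacked GLS estimator and the closed-form posterior mean can be checked numerically; everything below (the simulated data, the prior m and M) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
T, k, sigma2 = 50, 3, 1.0
X = rng.normal(size=(T, k))
beta_true = np.array([0.5, -1.0, 2.0])
y = X @ beta_true + rng.normal(0, np.sqrt(sigma2), T)

m = np.zeros(k)                  # prior mean
M = np.eye(k)                    # prior: beta ~ N(m, sigma2 * M)
Minv = np.linalg.inv(M)

# GLS on the stacked system [m; y] = [I; X] beta + [eps_m; u],
# with error covariance sigma2 * blockdiag(M, I) (sigma2 cancels).
Z = np.vstack([np.eye(k), X])
w = np.concatenate([m, y])
Omega_inv = np.block([[Minv, np.zeros((k, T))],
                      [np.zeros((T, k)), np.eye(T)]])
m_star_gls = np.linalg.solve(Z.T @ Omega_inv @ Z, Z.T @ Omega_inv @ w)

# Closed-form posterior mean: (M^-1 + X'X)^-1 (M^-1 m + X'y)
m_star = np.linalg.solve(Minv + X.T @ X, Minv @ m + X.T @ y)

print(np.allclose(m_star_gls, m_star))   # True
```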

Slide 25





Examples: Regression Model II
So far life was easy(-ier): in the linear regression model
β was random and unknown, but σ² was fixed and known
What if σ² is random and unknown?..
Bayes' rule states: p(β, σ²|y) ∝ p(y|β, σ²) p(β|σ²) p(σ²)
i.e., the posterior of β and σ² is proportional to the product of the density of the data, the prior of β (given σ²), and the prior of σ²

Slide 26





Examples: Regression Model II (2)
To manipulate the product
…we assume the following distributions:
normal for the data
normal for the prior for β (conditional on σ²): β|σ² ~ N(m, σ²M)
and Inverse-Gamma for the prior for σ²: σ² ~ IG(λ, l)
Note: the Inverse-Gamma is handy! It guarantees that random draws satisfy σ² > 0!

Slide 27





Examples: Regression Model II (3)
By manipulating the product (see more details in appendix B)
…we get the following result
with the mean and variance of the posterior for β: β|σ² ~ N(m*, σ²M*)
and the parameters of the posterior for σ²: σ² ~ IG(λ*, l*)
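Sampling from such a posterior is hierarchical: draw σ² from the Inverse-Gamma, then β|σ² from the normal. A sketch, assuming the shape/scale convention σ² ~ IG(l/2, λ/2) and placeholder posterior parameters (m*, M*, λ*, l* would come from the formulas on this slide; the numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Placeholder posterior parameters (not computed from data here)
m_star = np.array([0.4, 1.9])
M_star = np.array([[0.05, 0.0], [0.0, 0.08]])
lam_star, l_star = 12.0, 20.0

draws = []
for _ in range(5000):
    # sigma^2 ~ IG(l*/2, lam*/2): the reciprocal of a Gamma(shape, rate) draw
    sigma2 = 1.0 / rng.gamma(l_star / 2, 2.0 / lam_star)
    # beta | sigma^2 ~ N(m*, sigma^2 * M*)
    beta = rng.multivariate_normal(m_star, sigma2 * M_star)
    draws.append((sigma2, *beta))

print(np.mean([d[0] for d in draws]))   # close to lam*/(l* - 2)
```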

Slide 28





Priors: summary 
In the examples above we dealt with 2 types of prior distributions for our parameters:
Case 1 prior
assumes β is unknown and normally distributed (Gaussian)
σ² is a known parameter
the assumption of Gaussian errors delivers a normal posterior distribution for β
Case 2 (conjugate) priors
assume β and σ² are unknown
β and σ² have normal and Inverse-Gamma prior distributions, respectively
with Gaussian errors, this delivers posterior distributions for β and σ² of the same family

Slide 29





Bayesian VARs
Linear Regression examples will help us to deal with our main object – Bayesian VARs
A VAR is typically written as
y_t = c + A_1 y_{t−1} + … + A_p y_{t−p} + e_t,  e_t ~ N(0, Σ_e)
	where y_t contains n variables, the VAR includes p lags, and the data sample size is T
We have seen that it is convenient to work with a matrix representation for a regression
Can we get it for our VAR? Yes! 
…and it will help to get posteriors for our parameters

Slide 30





VAR in a matrix form: example
Consider, as an example, a VAR for n variables and p = 2:
y_t = c + A_1 y_{t−1} + A_2 y_{t−2} + e_t
Stack the variables and coefficients: x_t = [1, y′_{t−1}, y′_{t−2}]′, A = [c, A_1, A_2]′
Then, the VAR is y_t = A′x_t + e_t
Let Y = [y_1, …, y_T]′, X = [x_1, …, x_T]′, α = vec(A), and rewrite: vec(Y) = (I_n ⊗ X) α + vec(E)
	where ⊗ is the Kronecker product
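The stacking is mechanical; a sketch with placeholder data showing the shapes of Y, X, and the Kronecker form (n = 3, p = 2 and the simulated data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, T = 3, 2, 100
y = rng.normal(size=(T, n))              # placeholder data, one row per period

# Build X with rows x_t = [1, y'_{t-1}, y'_{t-2}], for t = p+1, ..., T
X = np.hstack([np.ones((T - p, 1))] +
               [y[p - k - 1:T - k - 1] for k in range(p)])
Y = y[p:]                                # rows y'_t aligned with X

A_ols = np.linalg.lstsq(X, Y, rcond=None)[0]   # (1 + n*p) x n coefficients
print(A_ols.shape)                             # (7, 3)

# Stacked (vec) form: vec(Y) = (I_n kron X) vec(A) + vec(E)
Z = np.kron(np.eye(n), X)
print(Z.shape)                                 # (294, 21)
```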

Slide 31





How to Estimate a BVAR: Case 1 Prior
Consider Case 1 prior for a VAR:
coefficients in A are unknown with multivariate Normal prior distribution:
 
and a known Σ_e
"Old trick" to get the posterior: use the GLS estimator (see appendix C for details)
Result
So the posterior distribution is multivariate normal

Slide 32





How to Estimate a BVAR: Case 2 (conjugate) Priors
Before we see the case of an unknown Σe
need to introduce a multivariate distribution to characterize the unknown random error covariance matrix Σe
Consider an l×n matrix Z whose rows z_i are independent draws from N(0, S⁻¹)
The n×n matrix Σ_e = (Z′Z)⁻¹
has an Inverse Wishart distribution with l degrees of freedom: Σ_e ~ IW_n(S, l)
If Σ_e ~ IW_n(S, l), then Σ_e⁻¹ follows a Wishart distribution: Σ_e⁻¹ ~ W_n(S⁻¹, l)
The Wishart distribution might be more convenient
Σ_e⁻¹ is a measure of precision (since Σ_e is a measure of dispersion)
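The constructive definition can be simulated directly: stack l Gaussian rows, form Z′Z to get a Wishart draw (a Wishart draw with scale S satisfies E[Z′Z] = l·S), and invert it for an inverse-Wishart draw. The dimension, degrees of freedom, and S below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
n, l = 2, 500                          # dimension and degrees of freedom
S = np.array([[1.0, 0.3],
              [0.3, 2.0]])

# W = Z'Z with rows z_i ~ N(0, S) is a Wishart_n(S, l) draw;
# its inverse is then an inverse-Wishart draw (with scale S^-1).
Z = rng.multivariate_normal(np.zeros(n), S, size=l)
W = Z.T @ Z
Sigma_inv_draw = W                     # precision-side (Wishart) draw
Sigma_draw = np.linalg.inv(W)          # dispersion-side (inverse-Wishart) draw

print(W / l)                           # E[W] = l*S, so W/l is close to S
```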

Slide 33





How to Estimate a BVAR: Conjugate Priors
Assume Conjugate priors:
The VAR parameters A and Σe are both unknown 
prior for A is multivariate Normal:
and for Σe is Inverse Wishart:
Follow the analogy with the univariate regression examples to write down the moments of the posterior distributions
Recall matrix representation for our VAR:
Posterior for A is multivariate normal: 
Posterior for Σe is Inv. Wishart:
See appendix D for details

Slide 34





BVARs: Minnesota Prior Implementation
The Minnesota prior is a particular case of the "Case 1 prior" (unknown model coefficients, but known error variance):
Assume a random walk is a reasonable model for every y_it in the VAR
Hence, for every y_it:
the coefficient on the first own lag y_i,t−1 has a prior mean of 1
the coefficients on all other lags y_i,t−k, y_j,t−1, y_j,t−k have prior mean 0
So, our prior for the coefficients of the VAR(2) example would be: c = 0, A_1 = I_n, A_2 = 0

Slide 35





BVARs: Minnesota Prior Implementation
The Minnesota prior
The prior variance for the coefficient on lag k of variable j in equation i is:
V_ijk = (γ/k^q)² if i = j;  V_ijk = (γw/k^q)² (σ_i²/σ_j²) if i ≠ j
… and depends only on three hyperparameters:
the tightness parameter γ (typically the same in all equations)
the relative weight parameter w: 1 for own lags and <1 for other variables
the parameter q, which governs the tightness of the prior depending on the lag (often set to 1)
σ_i²/σ_j² is a "scale correction":
the ratio of residual variances for OLS-estimated AR models for variables i and j
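A sketch of the prior standard-deviation rule described above. The function name is ours; the defaults γ = 0.2, w = 0.5, q = 1 follow the practitioner values quoted later in these slides, and the σ's are illustrative AR residual standard deviations:

```python
import numpy as np

def minnesota_sd(i, j, k, gamma=0.2, w=0.5, q=1, sigma=None):
    """Prior std. dev. for the lag-k coefficient on variable j in
    equation i: gamma/k^q on own lags; scaled down by w and by the
    residual-std ratio sigma_i/sigma_j on other variables' lags."""
    if i == j:
        return gamma / k ** q
    return gamma * w / k ** q * (sigma[i] / sigma[j])

sigma = np.array([1.0, 4.0])                # AR residual std. devs (illustrative)
print(minnesota_sd(0, 0, 1))                # own first lag: 0.2
print(minnesota_sd(0, 1, 1, sigma=sigma))   # other variable's first lag: 0.025
print(minnesota_sd(0, 0, 4))                # own lag 4: 0.05
```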

Slide 36





BVARs: Minnesota Prior Implementation
The Minnesota prior
Interpretation:
the prior on the first own lag is N(1, γ²)
the prior on own lag k is N(0, (γ/k^q)²)

the prior std. dev. declines at the rate k^q, i.e. coefficients on longer lags are more likely to be close to 0
the prior on the first lag of another variable is N(0, (γw)² (σ_i²/σ_j²))

the prior std. dev. is reduced by the factor w: i.e. it is more likely that the first lags of other variables are irrelevant
the prior std. dev. on other variables' lags, (γw/k^q)(σ_i/σ_j),
declines at the rate k^q

Slide 37





Remarks:
The overall tightness of the prior is governed by γ
smaller γ: the model for y_it shrinks towards a random walk
The effect of other lagged variables is controlled by w
smaller w: estimates shrink towards an AR model (y_it is not affected by y_jt)
Practitioner's advice (RATS Manual) on the choice of hyperparameters:
Set γ = 0.2, w = 0.5
Focus on forecast error statistics when selecting alternative hyperparameters
Loosen priors on own lags and tighten them on other lags to improve forecasts
Substitute priors manually if there is a strong reason

Slide 38





BVARs: Prior Selection
Minnesota and conjugate priors are useful (e.g., to obtain closed-form solutions), but can be too restrictive:
Independence across equations
Symmetry in the prior can sometimes be a problem
Increased computer power allows us to simulate more general prior distributions using numerical methods
Three examples:
DSGE-VAR approach: Del Negro and Schorfheide (IER, 2004)
Explore different prior distributions and hyperparameters: Kadiyala and Karlsson (1997)
Choosing the hyperparameters to maximize the marginal likelihood: Giannone, Lenza and Primiceri (2011)

Slide 39





Del Negro and Schorfheide (2004): DSGE-VAR Approach
Del Negro and Schorfheide (2004)
We want to estimate a BVAR model 
We also have a DSGE model for the same variables
It can be solved and linearized: approximated with a reduced-form VAR
Then, we can use coefficients from the DSGE-based VAR as prior means to estimate the BVAR
Several advantages:
DSGE-VAR may improve forecasts by restricting parameter values 
At the same time, can improve empirical performance of DSGE relaxing its restrictions 
Our priors (from DSGE) are based on deep structural parameters consistent  with economic theory

Slide 40





Del Negro and Schorfheide (2004)
We estimate the following BVAR:
The solution for the DSGE model has a reduced-form VAR representation 
	where θ are deep structural parameters
Idea:
Generate T* = λT "artificial" observations (Y*, X*) from the DSGE model
Combine the artificial observations with the T actual observations (Y, X) to get the posterior distribution

Slide 41





Del Negro and Schorfheide (2004)
The parameter λ is the "weight" on the "artificial" (prior) data from the DSGE
λ=0 delivers the OLS-estimated VAR, i.e. the DSGE does not matter
Large λ shrinks the coefficients towards the DSGE solution, i.e. the data do not matter
To find an "optimal" λ, the marginal likelihood is maximized (Appendix E)
The procedure can be implemented analytically… let's see
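The λ-selection step can be sketched as a simple grid search. In Del Negro and Schorfheide the objective would be the analytical log marginal likelihood p(Y|λ) of the DSGE-VAR; the function below is a toy concave stand-in (our own assumption) used only to illustrate the search.

```python
import numpy as np

# Hypothetical stand-in for the DSGE-VAR log marginal likelihood p(Y | lambda);
# a toy concave curve that peaks at lambda = 0.6
def log_marginal_likelihood(lam):
    return -(np.log(lam) - np.log(0.6)) ** 2

# Grid search over the weight lambda on a log-spaced grid
grid = np.geomspace(0.05, 10.0, 200)
values = np.array([log_marginal_likelihood(l) for l in grid])
lam_opt = grid[np.argmax(values)]
```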

Slide 42





Likelihood of the VAR of a DSGE Model
Recall the likelihood function for an unconstrained VAR
Similarly, the (quasi-)likelihood for the "artificial" data:
	which is a prior density for the BVAR parameters
Rewrite the likelihood for the "artificial" data (expanding the brackets)
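The equations on this slide are images and did not survive extraction. As a hedged reconstruction from Del Negro and Schorfheide (2004), the quasi-likelihood of the λT artificial observations, which serves as the prior density for the BVAR parameters, has the kernel:

```latex
p(\Phi,\Sigma \mid \theta) \;\propto\;
|\Sigma|^{-\frac{\lambda T}{2}}
\exp\Big\{ -\tfrac{1}{2}\,\mathrm{tr}\big[\, \lambda T\, \Sigma^{-1}
  \big( \Gamma^{*}_{YY}(\theta) - \Phi'\Gamma^{*}_{XY}(\theta)
      - \Gamma^{*}_{YX}(\theta)\Phi + \Phi'\Gamma^{*}_{XX}(\theta)\Phi \big) \big] \Big\}
```

Here the Γ*(θ) are the DSGE-implied population moments that replace the sample moments Y'Y, X'Y, X'X of the actual-data likelihood.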

Slide 43





Likelihood of the VAR of a DSGE Model

Slide 44





DSGE-VAR prior

Slide 45





DSGE-VAR posterior
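The posterior formulas on this slide are images. A hedged reconstruction of the standard result from Del Negro and Schorfheide (2004): the posterior mean of the VAR coefficients combines the DSGE-implied moments (weighted by λT) with the sample moments,

```latex
\tilde{\Phi}(\theta) \;=\;
\big(\lambda T\,\Gamma^{*}_{XX}(\theta) + X'X\big)^{-1}
\big(\lambda T\,\Gamma^{*}_{XY}(\theta) + X'Y\big)
```

so that λ→0 recovers the OLS estimate and λ→∞ recovers the DSGE-implied coefficients, matching the interpretation on Slide 42.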

Slide 46





Results

Slide 47





Results

Slide 48





Kadiyala and Karlsson (1997)
Small model: a bivariate VAR with unemployment and industrial production
Sample period: 1964:1 to 1990:4
Estimate the model through 1978:4
Criterion to choose hyperparameters: forecasting performance over 1979:1-1982:3
Use the remaining sub-sample 1982:4-1990:4 for forecasting
Large "Litterman" model: a VAR with 7 variables (real GNP, inflation, unemployment, money, investment, interest rate and inventories)
Sample period: 1948:1 to 1986:4
Estimate the model through 1980:1
Use the remaining sub-sample 1980:2-1986:4 for forecasting
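The out-of-sample exercise above can be sketched as a recursive forecasting loop: estimate on the data through the cutoff, forecast one step, expand the window, and collect the errors into an RMSE. A toy sketch with our own assumptions (an AR(1) on a simulated series stands in for the K&K VARs; the split index is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy series standing in for one of the K&K variables (e.g. unemployment)
y = np.cumsum(0.1 * rng.standard_normal(160))

# Recursive out-of-sample exercise: estimate on data up to t, forecast t+1,
# then expand the window, as K&K roll through their forecast sub-sample
split = 120
errors = []
for t in range(split, len(y) - 1):
    ylag, ycur = y[:t - 1], y[1:t]
    phi = ylag @ ycur / (ylag @ ylag)   # OLS AR(1) slope, no intercept
    errors.append(y[t + 1] - phi * y[t])

rmse = np.sqrt(np.mean(np.square(errors)))
```

The same loop, run once per prior, produces the RMSE comparison reported on the next slides.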

Slide 49





Kadiyala and Karlsson (1997)
Compare different priors based on the VAR forecasting performance (RMSE)
Standard VAR(p)…
… can be rewritten (see slide 29):
… and
	where
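The stacked rewriting of the VAR(p) can be made concrete: one row per observation, regressors collecting an intercept and p lags of all variables, coefficients in a matrix A whose column-stacking gives the vectorized form. Sizes and names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# A VAR(p)  y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t
# stacked as Y = X A + E, with one row per observation
T, n, p = 50, 2, 2
data = rng.standard_normal((T, n))

Y = data[p:]                                   # (T - p) x n
X = np.hstack([np.ones((T - p, 1))] +          # intercept column
              [data[p - j:T - j] for j in range(1, p + 1)])  # lags 1..p

# a = vec(A) stacks the coefficient matrix column by column
A = np.linalg.lstsq(X, Y, rcond=None)[0]       # (1 + n*p) x n
a = A.flatten(order="F")
```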

Slide 50





Prior distributions in K&K
K&K use a number of competing prior distributions… 
Minnesota, Normal-Wishart, Normal-Diffuse, Extended Natural Conjugate (see appendix E)
… for       and
Parameters of the prior distribution for     :
each y_it is a random walk (just as with the Minnesota priors above)
The variance of each coefficient depends on two hyper-parameters w, :
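The slide's second hyperparameter symbol is lost in extraction; a common textbook choice, assumed here, is a lag-decay parameter so that the prior variance is w/l^d on own lags and is scaled by the ratio of residual variances on cross-variable lags. A minimal sketch under that assumption (names `w`, `d`, `sigmas` are ours):

```python
import numpy as np

# Minnesota-style prior variances for VAR coefficients.
# w: overall tightness; d: lag decay (assumed functional form w / l**d);
# sigmas: scale of each variable, used to rescale cross-variable terms.
def minnesota_prior_var(n_vars, n_lags, w, d, sigmas):
    """V[i, j, l-1]: prior variance of the coefficient on lag l of
    variable j in equation i."""
    V = np.empty((n_vars, n_vars, n_lags))
    for l in range(1, n_lags + 1):
        for i in range(n_vars):
            for j in range(n_vars):
                v = w / l**d                        # tighter at longer lags
                if i != j:                          # cross-variable terms
                    v *= (sigmas[i] / sigmas[j]) ** 2
                V[i, j, l - 1] = v
    return V

V = minnesota_prior_var(2, 4, w=0.2, d=1.0, sigmas=np.array([1.0, 2.0]))
```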

Slide 51





Prior distributions in K&K

Slide 52





Forecast Comparison in K&K: Small Model, unemployment

Slide 53





Forecast Comparison in K&K: Large Model

Slide 54





Giannone, Lenza and Primiceri (2011)
Use three VARs to compare forecasting performance
Small VAR: GDP, GDP deflator, Federal Funds rate for the U.S.
Medium VAR: the small VAR plus consumption, investment, hours worked and wages
Large VAR: expands the medium VAR to up to 22 variables
The prior distributions of the VAR parameters ϴ={, Σ, Σe} depend on a small number of hyperparameters
The hyperparameters are themselves uncertain and follow either gamma or inverse-gamma distributions
This contrasts with the Minnesota priors, where the hyperparameters are fixed!

Slide 55





Giannone, Lenza and Primiceri (2011)
The marginal likelihood is obtained by integrating out the parameters of the model:
But the prior distribution of θ is itself a function of the hyperparameters of the model, i.e. p(θ)=p(θ|γ)
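The integral on this slide is an image; the standard expression it refers to is:

```latex
p(Y) \;=\; \int p(Y \mid \theta)\, p(\theta)\, d\theta
\qquad\text{with}\qquad
p(\theta) = p(\theta \mid \gamma),
```

so the marginal likelihood inherits a dependence on the hyperparameters γ, written p(Y|γ).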

Slide 56





Giannone, Lenza and Primiceri (2011)
We interpret the model as a hierarchical model by replacing p(θ) with p(θ|γ) and evaluate the marginal likelihood:
The hyperparameters γ are uncertain
The informativeness of their prior distribution is chosen by maximizing the posterior distribution
Maximizing the posterior of γ corresponds to maximizing the one-step-ahead forecasting accuracy of the model
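The hierarchical idea can be sketched in a toy conjugate setting of our own: data y ~ N(μ, 1) with prior μ ~ N(0, γ), so the marginal likelihood p(y|γ) is Gaussian with covariance I + γ11' and can be evaluated in closed form and maximized over γ, mirroring how GLP select their hyperparameters (with a flat prior on γ this maximizes its posterior).

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy data standing in for the VAR: y_i ~ N(mu, 1), prior mu ~ N(0, gamma)
y = rng.normal(loc=1.5, scale=1.0, size=40)

def log_marginal(gamma, y):
    """log p(y | gamma) up to a constant: y ~ N(0, I + gamma * 11')."""
    T = len(y)
    # Sherman-Morrison: (I + g 11')^{-1} = I - g/(1+gT) 11'
    quad = y @ y - gamma / (1 + gamma * T) * y.sum() ** 2
    logdet = np.log1p(gamma * T)          # det(I + g 11') = 1 + gT
    return -0.5 * (logdet + quad)

# Choose the hyperparameter by maximizing the marginal likelihood on a grid
grid = np.geomspace(1e-3, 100, 300)
gamma_hat = grid[np.argmax([log_marginal(g, y) for g in grid])]
```

Because the sample mean is far from the prior mean of zero, the selected γ is comfortably positive: the data ask for a loose prior.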

Slide 57





Giannone, Lenza and Primiceri (2011)

Slide 58





In all cases the BVARs demonstrate better forecasting performance than the unrestricted VARs
The BVARs are roughly on par with factor models, which are known to be good forecasting devices

Slide 59





Conclusions
BVARs are a useful tool for improving forecasts
They are not a "black box":
posterior distribution parameters are typically functions of prior parameters and the data
The choice of priors can range:
from a simple Minnesota prior (convenient for analytical results)…
…to a full-fledged DSGE model that incorporates theory-consistent structural parameters
The choice of hyperparameters for the prior depends on the nature of the time series we want to forecast
No "one size fits all" approach

Slide 60





Thank You!

Slide 61





Appendix A: Remarks about the marginal likelihood
Remarks about the marginal likelihood:
If we have M1,…,MN competing models, the marginal likelihood of model Mj, f({yt}|Mj), can be seen as:
The update on the weight of model Mj after observing the data
The out-of-sample prediction record of model j
Model comparison between two models is performed with the posterior odds ratio:
Favors parsimonious modeling: a built-in "Occam's Razor"
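The posterior odds ratio referred to above (the equation itself is a slide image) is the standard decomposition into prior odds times the Bayes factor:

```latex
\frac{p(M_i \mid \{y_t\})}{p(M_j \mid \{y_t\})}
\;=\;
\underbrace{\frac{p(M_i)}{p(M_j)}}_{\text{prior odds}}
\times
\underbrace{\frac{f(\{y_t\} \mid M_i)}{f(\{y_t\} \mid M_j)}}_{\text{Bayes factor}}
```

The Bayes factor is the ratio of marginal likelihoods, which is where the built-in penalty on over-parameterized models comes from.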

Slide 62





Appendix A: Remarks about the marginal likelihood
Remarks about the marginal likelihood:
Predict the first observation using the prior:
Record the first observable and its probability, f(y1^o). Update your beliefs:
Predict the second observation:
Record f(y2^o|y1^o).
Eventually, you get f({y^o}) = f(y1^o) f(y2^o|y1^o) ··· f(yT^o|y1^o, y2^o,…, y(T-1)^o).
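The chain above can be verified numerically in a toy conjugate model of our own (y_t ~ N(μ,1) with prior μ ~ N(0,1)): accumulating the one-step-ahead predictive densities reproduces the marginal likelihood computed jointly.

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(size=6)

def log_norm_pdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

# Sequential decomposition: f(y) = f(y1) f(y2|y1) ... f(yT|y1..yT-1).
# Model: y_t ~ N(mu, 1), prior mu ~ N(0, 1); conjugate updating gives
# predictive y_t | y_{1:t-1} ~ N(m, 1 + v).
m, v = 0.0, 1.0
log_ml_seq = 0.0
for obs in y:
    log_ml_seq += log_norm_pdf(obs, m, 1.0 + v)
    v_new = 1.0 / (1.0 / v + 1.0)       # posterior update for mu
    m = v_new * (m / v + obs)
    v = v_new

# Direct computation: jointly, y ~ N(0, I + 11')
T = len(y)
S = np.eye(T) + np.ones((T, T))
log_ml_joint = -0.5 * (T * np.log(2 * np.pi)
                       + np.linalg.slogdet(S)[1]
                       + y @ np.linalg.solve(S, y))
```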

Slide 63





Appendix B: Linear Regression with conjugate priors
To calculate the posterior distribution for parameters
…we assume the following distributions:
Normal for the data
Normal for the prior for β (conditional on σ²): β|σ² ~ N(m, σ²M)
and inverse-gamma for the prior for σ²: σ² ~ IΓ(λ,k)
Next, consider the product
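Working out that product gives the usual Normal-Inverse-Gamma posterior updates, which can be computed directly. A minimal numerical sketch: the shape/scale names `a`, `b` and all toy values are ours, since the slide's (λ,k) parametrization is not pinned down in the extracted text.

```python
import numpy as np

rng = np.random.default_rng(5)

# Conjugate Normal-Inverse-Gamma posterior for y = X beta + eps,
# eps ~ N(0, sigma^2 I), beta | sigma^2 ~ N(m, sigma^2 M), sigma^2 ~ IG(a, b)
T, k = 60, 3
X = rng.standard_normal((T, k))
y = X @ np.array([1.0, -0.5, 0.0]) + 0.3 * rng.standard_normal(T)

m, M = np.zeros(k), 10.0 * np.eye(k)   # loose prior on beta
a, b = 2.0, 1.0                        # prior shape and scale for sigma^2

Minv = np.linalg.inv(M)
M_post = np.linalg.inv(Minv + X.T @ X)
m_post = M_post @ (Minv @ m + X.T @ y)
a_post = a + T / 2
b_post = b + 0.5 * (y @ y + m @ Minv @ m
                    - m_post @ np.linalg.solve(M_post, m_post))
```

With a loose prior the posterior mean is close to the OLS estimate, as the conjugate algebra predicts.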

Slide 64



Slide 65



Slide 66





Appendix C: How to Estimate a BVAR, Case 1 prior
Use GLS estimator for the regression
Continue (next slide)

Slide 67





Appendix C: How to Estimate a BVAR, Case 1 Prior
Continue
So, the moments for the posterior distribution are:
The posterior distribution is then multivariate normal
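The GLS route of the two slides above can be sketched numerically: stack the prior "observations" β ~ N(b0, V) on top of the data regression, apply GLS to the stacked system, and check that this reproduces the closed-form posterior mean. All names and toy values here are ours.

```python
import numpy as np

rng = np.random.default_rng(6)

# Data regression y = X beta + e, e ~ N(0, s2 I)
T, k = 40, 2
X = rng.standard_normal((T, k))
y = X @ np.array([0.8, -0.2]) + 0.5 * rng.standard_normal(T)

b0 = np.zeros(k)          # prior mean
V = 0.5 * np.eye(k)       # prior variance
s2 = 0.25                 # residual variance (treated as known, Case 1 style)

# GLS on the stacked system  [y; b0] = [X; I] beta + [e; u]
Z = np.vstack([X, np.eye(k)])
w = np.concatenate([y, b0])
Omega_inv = np.diag(np.concatenate([np.full(T, 1 / s2), np.full(k, 1 / 0.5)]))
b_gls = np.linalg.solve(Z.T @ Omega_inv @ Z, Z.T @ Omega_inv @ w)

# Equivalent closed form for the posterior mean
b_post = np.linalg.solve(np.linalg.inv(V) + X.T @ X / s2,
                         np.linalg.inv(V) @ b0 + X.T @ y / s2)
```

The two routes are algebraically identical, which is exactly why the GLS estimator delivers the posterior moments.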

Slide 68





Appendix D: How to Estimate a BVAR: Conjugate Priors
Note that in the case of conjugate priors we rely on the following VAR representation
… while in the Minnesota-priors case we employed
However, if we have priors for the vectorized coefficients in the form
we can also derive priors for the coefficients in matrix form
For the mean, we simply need to convert α back to the matrix form A
The variance matrix for     can be obtained from the variance for    :
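The vec-to-matrix conversion can be sketched concretely: the mean converts by reshaping α back into A column by column, and when the vectorized variance has Kronecker structure Σ⊗Ω, each coefficient's variance is the product of the corresponding diagonal entries. Matrix names and values below are illustrative.

```python
import numpy as np

# n equations, k regressors per equation
n, k = 2, 3
alpha_mean = np.arange(k * n, dtype=float)      # vec(A), stacked by column
A_mean = alpha_mean.reshape((k, n), order="F")  # back to matrix form

Omega = np.diag([1.0, 0.5, 0.25])   # k x k, covariance across regressors
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])      # n x n, covariance across equations
V_alpha = np.kron(Sigma, Omega)     # (nk) x (nk) variance of vec(A)

# Variance of the coefficient on regressor i in equation j:
i, j = 1, 1
var_ij = V_alpha[j * k + i, j * k + i]   # = Sigma[j, j] * Omega[i, i]
```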

Slide 69





Appendix E: Prior and Posterior distributions in Kadiyala and Karlsson (1997)

Slide 70





Appendix E: Posterior distributions of forecast for unemployment and industrial production in K&K (1997), h=4, T0 =1985:4

Slide 71





Appendix E: Posterior distribution of the unemployment rate forecast in K&K (1997)

Slide 72





Appendix E: Choosing λ


