Image credit: Nikos Chatsios chatsios.n@gmail.com

Wald-test function in Julia

Wald test in Julia. Julia vs R code and F vs Chi-square distribution

It all started when I was trying to impose some joint restrictions on a regression’s coefficients… in Julia… and I failed… And that is because there is no function (at least not yet) to perform a Wald test on an estimated regression model. So I made one! In short, I rewrote the wald.test function from the aod package in R as a new function for Julia. It’s called, guess what… wald_test.

For those already familiar with wald.test from the aod package in R, the wald_test function in Julia works exactly the same way. If you are interested, you can download the Julia function from my GitHub repo.

The rest of the article shows some examples of restrictions on regression coefficients using the Wald test, implemented first in R and then in Julia.

  1. The first example is a simple joint restriction.
  2. In the second example we jointly restrict all the regression coefficients except for the intercept to 0, which is in effect the standard F test of a regression.
  3. Finally, in the third example, we include the intercept in the restricted coefficients.

Through these examples, we explore the relationship between the F and the \(\chi^2\) distributions, we solve a single problem using two different tests (ANOVA (F test) and Wald (\(\chi^2\) test)), and we also show that these two tests are not always both available, which is how the need for this function in Julia was born.
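For reference, given estimated coefficients \(\hat\beta\) with covariance matrix \(\hat\Sigma\), a Wald test of the \(q\) linear restrictions \(H_0: L\beta = h_0\) is based on the statistic

\[W = (L\hat\beta - h_0)'\,(L\hat\Sigma L')^{-1}\,(L\hat\beta - h_0),\]

which is asymptotically \(\chi^2\) with \(q\) degrees of freedom under \(H_0\). In wald.test (and its Julia port), the Terms argument selects rows of an identity matrix to form \(L\), and H0 supplies \(h_0\) (zeros by default).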

Packages and data

First things first. Let’s get ready with the packages and the data we are going to use.

In R, the iris dataset comes preloaded in the object iris, and we will also need the package aod, which contains the wald.test function.

In Julia, we will need the package RDatasets for the iris dataset, the package DataFrames to work with data frames, and the package GLM for the ftest function.

Oh, and don’t forget our new function wald_test! Assuming you have already downloaded the wald_test.jl file, all you have to do is load it too [1].

R:

library(aod)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa
4.6 3.1 1.5 0.2 setosa
5.0 3.6 1.4 0.2 setosa
5.4 3.9 1.7 0.4 setosa
[…the remaining rows of the iris dataset are omitted here…]

Julia:

using RDatasets, DataFrames, GLM
include("wald_test.jl")
iris = dataset("datasets", "iris");

1. Joint restrictions

First let’s form a regression model using the iris dataset and name it model.

R:

model <- lm(Sepal.Length ~ Petal.Length + Petal.Width, iris)
summary(model)
## 
## Call:
## lm(formula = Sepal.Length ~ Petal.Length + Petal.Width, data = iris)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.18534 -0.29838 -0.02763  0.28925  1.02320 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   4.19058    0.09705  43.181  < 2e-16 ***
## Petal.Length  0.54178    0.06928   7.820 9.41e-13 ***
## Petal.Width  -0.31955    0.16045  -1.992   0.0483 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4031 on 147 degrees of freedom
## Multiple R-squared:  0.7663, Adjusted R-squared:  0.7631 
## F-statistic:   241 on 2 and 147 DF,  p-value: < 2.2e-16

Julia:

model = lm(@formula(SepalLength ~ PetalLength + PetalWidth), iris)
SepalLength ~ 1 + PetalLength + PetalWidth

Coefficients:
──────────────────────────────────────────────────────────────────────────────
              Estimate  Std. Error   t value  Pr(>|t|)  Lower 95%    Upper 95%
──────────────────────────────────────────────────────────────────────────────
(Intercept)   4.19058    0.0970459  43.1815     <1e-84   3.9988     4.38237   
PetalLength   0.541777   0.0692818   7.81991    <1e-12   0.40486    0.678694  
PetalWidth   -0.319551   0.160453   -1.99156    0.0483  -0.636642  -0.00245875
──────────────────────────────────────────────────────────────────────────────

As you can see, our model includes two regressors and an intercept. We also notice that the estimated coefficient of the variable PetalLength is statistically significant with a p-value close to 0, while the variable PetalWidth is statistically insignificant at the 1% significance level, with a p-value of 0.048.

In this situation, suppose that we want to test whether the coefficients of these two variables are jointly significantly different from 0.

To do so, we perform a Wald test on both coefficients.

R:

wald_results <- wald.test(Sigma = vcov(model), b = coef(model), Terms = 2:3)
wald_results
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 481.9, df = 2, P(> X2) = 0.0

Julia:

wald_results = wald_test(Sigma = vcov(model), b = coef(model), Terms = 2:3)
Wald test:
----------

Chi-squared test:
X2 = 481.90740153204536, df = 2, P(> X2) = 2.265360705998325e-105
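To demystify what wald_test computes here, the same \(\chi^2\) statistic can be reproduced by hand in a few lines (a sketch of the underlying algebra, not the function’s actual internals; model is the regression fitted above):

```julia
using Distributions, LinearAlgebra

# Hand-rolled Wald statistic for the restriction "coefficients 2 and 3 are 0".
b = coef(model)                  # estimated coefficients
Σ = vcov(model)                  # covariance matrix of the estimates
L = [0 1 0; 0 0 1.0]             # selection matrix equivalent to Terms = 2:3
w = (L * b)' * inv(L * Σ * L') * (L * b)   # Wald statistic, ≈ 481.907
p = ccdf(Chisq(size(L, 1)), w)             # upper-tail χ² p-value
```

The degrees of freedom equal the number of rows of \(L\), i.e. the number of restrictions.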

2. Joint restrictions of all the coefficients except for the intercept

For this example, we will create a new variable, the square of Sepal Length (SepalLength2), and fit a new model.

R:

iris <- data.frame(iris, Sepal.Length2 = iris$Sepal.Length^2)
model2 <- lm(Sepal.Width ~ Sepal.Length + Sepal.Length2, iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length2
5.1 3.5 1.4 0.2 setosa 26.01
4.9 3.0 1.4 0.2 setosa 24.01
4.7 3.2 1.3 0.2 setosa 22.09
4.6 3.1 1.5 0.2 setosa 21.16
5.0 3.6 1.4 0.2 setosa 25.00
5.4 3.9 1.7 0.4 setosa 29.16
[…the remaining rows of the augmented dataset are omitted here…]

Julia:

iris.SepalLength2 = iris.SepalLength .^ 2
model2 = lm(@formula(SepalWidth ~ SepalLength + SepalLength2), iris)
SepalWidth ~ 1 + SepalLength + SepalLength2

Coefficients:
─────────────────────────────────────────────────────────────────────────────────
                Estimate  Std. Error   t value  Pr(>|t|)    Lower 95%   Upper 95%
─────────────────────────────────────────────────────────────────────────────────
(Intercept)    6.41584      1.58499    4.04787    <1e-4    3.28352      9.54815  
SepalLength   -1.08556      0.536246  -2.02437    0.0447  -2.14531     -0.0258139
SepalLength2   0.0857066    0.044755   1.91502    0.0574  -0.00273979   0.174153 
─────────────────────────────────────────────────────────────────────────────────

In essence, this is the F test that is usually printed at the bottom of a regression summary, accompanied by its degrees of freedom and p-value. It tells us whether the regressors’ coefficients are jointly equal to zero, and thus whether the regression as a whole is useless. If they are, the (unconditional) mean of the dependent variable would be as good a predictor as all of these variables together.
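In terms of \(R^2\), with \(k\) estimated coefficients (intercept included) and \(n\) observations, this overall statistic is

\[F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} \sim F_{k-1,\,n-k},\]

and plugging in \(R^2 = 0.0378\), \(k = 3\), \(n = 150\) reproduces the 2.89 on 2 and 147 degrees of freedom that R reports.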

In R, so far so good. As you can see, at the bottom line of summary(model2) we have what we want.

R:

summary(model2)
## 
## Call:
## lm(formula = Sepal.Width ~ Sepal.Length + Sepal.Length2, data = iris)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.13070 -0.26310 -0.02446  0.25728  1.38725 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    6.41584    1.58499   4.048 8.33e-05 ***
## Sepal.Length  -1.08556    0.53625  -2.024   0.0447 *  
## Sepal.Length2  0.08571    0.04476   1.915   0.0574 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4304 on 147 degrees of freedom
## Multiple R-squared:  0.03783,    Adjusted R-squared:  0.02474 
## F-statistic:  2.89 on 2 and 147 DF,  p-value: 0.05877

On the other hand, the regression output in Julia doesn’t provide that much information. To retrieve the F test we need a small workaround: we use ANOVA, which is an F test, to compare our model (sometimes called the full model) against the null model, i.e. a regression with no independent variables except for the constant term. So, first we have to create the null model, and then perform an F test (ANOVA) comparing these two models, using the ftest function from the GLM package.

Julia:

nullmodel2 = lm(@formula(SepalWidth ~ 1), iris)
ftest(model2.model, nullmodel2.model)
        Res. DOF DOF ΔDOF     SSR    ΔSSR     R²    ΔR²     F*  p(>F)
Model 1      147   4      27.2362         0.0378                     
Model 2      149   2   -2 28.3069 -1.0708 0.0000 0.0378 2.8895 0.0588

As for Julia: not so straightforward, right? Now, what if we could skip all this workaround and just impose a joint restriction on all the coefficients except for the intercept? That’s right! We can use our new Julia function wald_test exactly the same way as we did before.

Julia:

wald_results = wald_test(Sigma = vcov(model2), b = coef(model2), Terms = 2:3)
Wald test:
----------

Chi-squared test:
X2 = 5.779097987201187, df = 2, P(> X2) = 0.05560128349216448

But as you are about to notice, our result is a chi-square (\(\chi^2\)) test instead of an F test. If we want to see the equivalent F test, some basic distribution theory lets us go from a chi-square statistic to an F statistic using the following formula:

\[\frac{\chi^2}{df} = F\]

where the degrees of freedom of the chi-square distribution equal the number of restrictions, which is also the numerator degrees of freedom of the equivalent F distribution.

Julia:

wald_chi2 = wald_results.result["chi2"].chi2
wald_df = wald_results.result["chi2"].df
fstat = wald_chi2 / wald_df
2.8895489936005934
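This F statistic can then be turned into a p-value via the upper tail of the F distribution, for example with the Distributions package (a sketch; fstat is the value just computed and 147 is the denominator degrees of freedom, n − k):

```julia
using Distributions

# Upper-tail (survival) probability of F(2, 147) at the observed statistic.
pval = ccdf(FDist(2, 147), fstat)   # ≈ 0.0588
```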

Now that we have the F statistic, we could compute the corresponding p-value as the upper-tail probability of the F distribution at that value. But this is already too much work for something so simple… Once again, we can skip this mess just by asking our new wald_test function to perform an F test for our joint restrictions. All we need is to add the denominator degrees of freedom via the df argument.

Julia:

dendf = nrow(iris)-length(coef(model2))
wald_results = wald_test(Sigma = vcov(model2), b = coef(model2), Terms = 2:3, df = dendf)
Wald test:
----------

Chi-squared test:
X2 = 5.779097987201187, df = 2, P(> X2) = 0.05560128349216448

F test:
W = 2.8895489936005934, df1 = 2, df2 = 147, P(> W) = 0.05876576520988747

As you notice, the p-values of the \(\chi^2\) and the F tests are very close but not identical. But as the denominator degrees of freedom grow (i.e. as the number of observations rises or the number of coefficients falls), the p-value of the F test gets closer and closer to the p-value of the \(\chi^2\) test.
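This convergence is easy to check numerically: fix the \(\chi^2\) statistic and let the denominator degrees of freedom grow (a quick sketch using the Distributions package; x2 is the statistic from the test above):

```julia
using Distributions

x2, q = 5.779, 2            # the χ² statistic and its degrees of freedom from above
for df2 in (10, 100, 1_000, 100_000)
    println(df2, " => ", ccdf(FDist(q, df2), x2 / q))
end
println("χ² limit => ", ccdf(Chisq(q), x2))   # ≈ 0.0556
```

As df2 grows, the F p-values approach the \(\chi^2\) p-value from above.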

3. Joint restrictions of all the coefficients including the intercept

When I faced the previous problem I thought I could take the long way and perform an ANOVA, just to get the job done. That was until I faced the next problem. Consider testing whether all of the regression’s coefficients, including the intercept, are jointly equal to zero. What is the null model in this case? It’s not even the sample mean of the dependent variable, as in the previous case. And would it even be nested within our full model? This is the situation that led me to create the Wald test function for Julia.

See how easy it is to test whether all the coefficients are equal to 0, or whether the intercept, the first and the second coefficients are equal to 4.2, 0.54 and -0.3 respectively. In both cases, the \(\chi^2\) test and the F test come to the same conclusion.

Julia:

model
SepalLength ~ 1 + PetalLength + PetalWidth

Coefficients:
──────────────────────────────────────────────────────────────────────────────
              Estimate  Std. Error   t value  Pr(>|t|)  Lower 95%    Upper 95%
──────────────────────────────────────────────────────────────────────────────
(Intercept)   4.19058    0.0970459  43.1815     <1e-84   3.9988     4.38237   
PetalLength   0.541777   0.0692818   7.81991    <1e-12   0.40486    0.678694  
PetalWidth   -0.319551   0.160453   -1.99156    0.0483  -0.636642  -0.00245875
──────────────────────────────────────────────────────────────────────────────
dendf = nrow(iris)-length(coef(model))
wald_results = wald_test(Sigma = vcov(model), b = coef(model), Terms = 1:3, df = dendf).result
Wald test:
----------

Chi-squared test:
X2 = 32008.931513284944, df = 3, P(> X2) = 0.0

F test:
W = 10669.643837761649, df1 = 3, df2 = 147, P(> W) = 1.0022050012971983e-171
wald_results = wald_test(Sigma = vcov(model), b = coef(model), Terms = 1:3, H0 = [4.2, 0.54, -0.3], df = dendf).result
Wald test:
----------

Chi-squared test:
X2 = 0.763308780543053, df = 3, P(> X2) = 0.858221426416421

F test:
W = 0.25443626018101767, df1 = 3, df2 = 147, P(> W) = 0.8580771658399398

Feel free to download, modify and use the code!

Comments of all kinds are more than welcome!


  1. The wald_test function requires the Distributions package, which is loaded automatically when wald_test.jl is included.

Kleanthis Natsiopoulos
Ph.D. Candidate in Economics

My research interests include data science, machine learning, econometrics and computational economics.
