The estimator $\hat\theta_N$ solves
$$\min_{\theta\in\Theta} N^{-1}\sum_{i=1}^N q(w_i,\theta).$$
If the solution $\hat\theta_N$ is an interior point of $\Theta$, the objective function is twice differentiable, and the gradient of the objective function is zero at $\hat\theta_N$, then the Hessian of the objective function (which is $\hat H$) is positive semi-definite.
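A one-dimensional illustration (my own, not from Wooldridge's book) shows that positive semi-definite is the most one can claim at an interior minimum:
$$q(\theta) = \theta^4, \quad \Theta = [-1,1]: \qquad q'(0) = 0, \quad q''(0) = 0,$$
so at the interior minimiser $\theta = 0$ the Hessian is positive semi-definite but not positive definite.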
Now what Wooldridge is saying is that for a given sample the empirical Hessian is not guaranteed to be positive definite or even positive semi-definite. This is true, since Wooldridge does not require that the objective function $N^{-1}\sum_{i=1}^N q(w_i,\theta)$ have nice properties; he only requires that there exists a unique solution $\theta_0$ of
$$\min_{\theta\in\Theta} E\,q(w,\theta).$$
So for a given sample the objective function $N^{-1}\sum_{i=1}^N q(w_i,\theta)$ may be minimized at a boundary point of $\Theta$, where the Hessian of the objective function need not be positive definite.
Further on in his book Wooldridge gives examples of estimates of the Hessian which are guaranteed to be numerically positive definite. In practice, a non-positive-definite Hessian should indicate that the solution is either at a boundary point or that the algorithm failed to find the solution, which is usually a further indication that the fitted model may be inappropriate for the given data.
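As a practical illustration of that check (only a sketch, not something from Wooldridge; it uses the fitted object opt and the Hessian function hesf defined in the example below), one might inspect both the optimiser's convergence code and the eigenvalues of the empirical Hessian:

## Minimal diagnostic sketch: did optim report convergence, and is the
## empirical Hessian positive definite at the reported solution?
check_fit <- function(opt, hessian_fun, tol = 1e-8) {
  ev <- eigen(hessian_fun(opt$par), symmetric = TRUE)$values
  list(converged = (opt$convergence == 0),  # 0 means optim reports success
       eigenvalues = ev,
       positive_definite = all(ev > tol))
}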
Here is a numerical example. I generate a non-linear least squares problem:
$$y_i = c_1 x_i^{c_2} + \varepsilon_i$$
I take $x$ uniformly distributed on the interval $[1,2]$ and $\varepsilon$ normal with zero mean and variance $\sigma^2$. I generated a sample of size 10 in R 2.11.1 using set.seed(3). Here is the link to the values of $x_i$ and $y_i$.
As the objective function I chose the square of the usual non-linear least squares objective function:
$$q(w,\theta) = (y - c_1 x^{c_2})^4$$
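A quick side calculation (my addition, not part of the original argument) explains the exact zeros in the first test below. Writing $r = y - c_1 x^{c_2}$ for the residual of one observation,
$$\frac{\partial q}{\partial c_1} = -4 r^3 x^{c_2}, \qquad \frac{\partial^2 q}{\partial c_1^2} = 12 r^2 x^{2c_2},$$
and the remaining derivatives carry similar powers of $r$, so both the gradient and the Hessian vanish whenever the fit is exact ($r = 0$).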
Here is the R code for the function being optimised, its gradient and Hessian.
##First set up the expressions for the function being optimised, its gradient and Hessian.
##I use R's symbolic derivation to guard against human error
mt <- expression((y-c1*x^c2)^4)
gradmt <- c(D(mt,"c1"),D(mt,"c2"))
hessmt <- lapply(gradmt,function(l)c(D(l,"c1"),D(l,"c2")))
##Evaluate the expressions on the data to get the empirical values.
##Note: there was a bug in a previous version of the answer; res should not be squared.
optf <- function(p) {
res <- eval(mt,list(y=y,x=x,c1=p[1],c2=p[2]))
mean(res)
}
gf <- function(p) {
evl <- list(y=y,x=x,c1=p[1],c2=p[2])
res <- sapply(gradmt,function(l)eval(l,evl))
apply(res,2,mean)
}
hesf <- function(p) {
evl <- list(y=y,x=x,c1=p[1],c2=p[2])
res1 <- lapply(hessmt,function(l)sapply(l,function(ll)eval(ll,evl)))
res <- sapply(res1,function(l)apply(l,2,mean))
res
}
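Before running the test, one could also cross-check the symbolic derivatives against numerical differentiation. This is an optional sanity check, not part of the original answer; it assumes the numDeriv package is installed and that x and y have already been defined.

## Optional cross-check of the symbolic gradient and Hessian against
## numerical derivatives from the numDeriv package (assumed installed)
p0 <- c(0.5, 0.5)
max(abs(gf(p0) - numDeriv::grad(optf, p0)))        # should be near zero
max(abs(hesf(p0) - numDeriv::hessian(optf, p0)))   # should be near zero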
First, test that the gradient and Hessian work as advertised.
set.seed(3)
x <- runif(10,1,2)
y <- 0.3*x^0.2
> optf(c(0.3,0.2))
[1] 0
> gf(c(0.3,0.2))
[1] 0 0
> hesf(c(0.3,0.2))
[,1] [,2]
[1,] 0 0
[2,] 0 0
> eigen(hesf(c(0.3,0.2)))$values
[1] 0 0
The Hessian is zero, so it is positive semi-definite. Now for the values of x and y given in the link we get
> df <- read.csv("badhessian.csv")
> df
x y
1 1.168042 0.3998378
2 1.807516 0.5939584
3 1.384942 3.6700205
4 1.327734 -3.3390724
5 1.602101 4.1317608
6 1.604394 -1.9045958
7 1.124633 -3.0865249
8 1.294601 -1.8331763
9 1.577610 1.0865977
10 1.630979 0.7869717
> x <- df$x
> y <- df$y
> opt <- optim(c(1,1),optf,gr=gf,method="BFGS")
> opt$par
[1] -114.91316 -32.54386
> gf(opt$par)
[1] -0.0005795979 -0.0002399711
> hesf(opt$par)
[,1] [,2]
[1,] 0.0002514806 -0.003670634
[2,] -0.0036706345 0.050998404
> eigen(hesf(opt$par))$values
[1] 5.126253e-02 -1.264959e-05
The gradient is (numerically) zero, but the Hessian is not positive semi-definite: it has a negative eigenvalue.
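If one wants to rule out that the negative eigenvalue is just rounding noise, a simple check (my addition, not part of the argument above) is to compare it with the largest eigenvalue:

## Relative size of the negative eigenvalue: small, but far larger in
## magnitude than machine precision, so the Hessian really is indefinite
## (and close to singular) at opt$par
ev <- eigen(hesf(opt$par))$values
ev[2] / ev[1]   # about -2.5e-4 for the data above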
Note: This is my third attempt to give an answer. I hope I finally managed to give precise mathematical statements, which eluded me in the previous versions.