Spatial dependence tests

Session 4: Univariate spatial dependence tests:
advanced ESDA techniques
Instructor: Coro Chasco Yrigoyen
Universidad Autónoma de Madrid
17-21 May 2010
2010, Coro Chasco Yrigoyen
All Rights Reserved
Course Outline

S1: Introduction to Spatial Econometrics
SP1: Introduction to the GeoDa program
S2: Spatial effects: spatial dependence
S3: Exploratory Spatial Data Analysis (ESDA): basic techniques
SP2: ESDA in GeoDa: basic techniques
S4: Spatial dependence tests: advanced ESDA techniques
S5: Confirmatory spatial data analysis: specification of spatial dependence models
SP3: ESDA in GeoDa: advanced techniques
S6: Estimation and testing of a spatial regression model by Ordinary Least Squares
S7: Estimation and testing of spatial dependence models
SP4: The spatial regression module in GeoDa
S8: Estimation and testing of the spatial error model, and spatial modelling strategies
SP5: Applying the classical modelling strategy to case studies with GeoDa
CHASCO, C. (2003), “Econometría espacial aplicada a la predicción-extrapolación de datos microterritoriales”. Comunidad de Madrid; pp. 62-78.
Session 4
Overview and Goals

Global spatial autocorrelation
1. Moran’s I
2. Moran’s scatterplot
3. Geary’s c
4. Mantel’s Γ
5. Getis and Ord’s G(d)
Local spatial autocorrelation
1. Getis and Ord’s local statistics
2. LISA tests & maps
Bivariate & space-time spatial autocorrelation
Spatial autocorrelation tests for rates
4.1. Global spatial autocorrelation


Used to test for the presence of general spatial trends in the distribution of a geographical variable over a whole space.
But how can we determine the existence of spatial autocorrelation?
4.1.1. Moran’s I
4.1.2. Moran’s scatterplot
4.1.3. Geary’s c
4.1.4. Mantel’s Γ
4.1.5. Getis and Ord’s G(d)
[Figure: map of disposable income per capita (1997, thousand pesetas); legend classes: 900 to 1,125; 1,125 to 1,400; 1,400 to 1,800]
MORAN, P. (1948), “The interpretation of statistical maps”. Journal of the Royal Statistical Society B, vol. 10; pp. 243-251.
4.1. Global spatial autocorrelation
4.1.1. Moran’s I

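$$I = \frac{N}{S_0}\,\frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i=1}^{N}(x_i-\bar{x})^2}, \qquad S_0 = \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}$$
Theoretical mean of Moran’s I: E(I) = -1/(N-1)
For a row-standardised weights matrix W*, S_0 = N and the factor N/S_0 drops out.
Positive autocorrelation: I > E(I)
Negative autocorrelation: I < E(I)
N: sample size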
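As a minimal sketch of the statistic on this slide, the following pure-Python function computes Moran’s I for a small hypothetical example: four units along a line with binary contiguity weights. The function name, the data and the weights are illustrative assumptions, not part of the course material.

```python
def morans_i(x, w):
    """Moran's I for values x and a spatial weights matrix w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]                  # deviations from the mean
    s0 = sum(sum(row) for row in w)              # S0: sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

trend = morans_i([1, 2, 3, 4], W)      # similar values side by side
checker = morans_i([1, 0, 1, 0], W)    # alternating values
expected = -1 / (len(W) - 1)           # theoretical mean E(I)
print(trend, checker, expected)
```

In this toy example the increasing series lies above E(I) and the alternating series lies below it, matching the positive/negative reading of the statistic.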
CLIFF, A. and J. ORD (1973), “Spatial autocorrelation”. London: Pion.
CLIFF, A. and J. ORD (1981), “Spatial processes, models and applications”. London: Pion.
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (II)



Inference is typically based on a standardized z-value:
$$z(I) = \frac{I - E(I)}{SD(I)}$$
For N → ∞, z(I) follows a standard normal distribution: z(I) ~ N(0,1)
Assumptions:
Normalisation: the variable X follows an asymptotic normal distribution.
Randomisation by permutation: unknown distribution function for X.
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (III)

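Normalisation: the variable X follows a normal distribution.
1) For N → ∞, z_N(I) follows a standard normal distribution: z_N(I) ~ N(0,1)
2) Significance of z_N(I): read from a standard normal table.
$$E_N(I) = -\frac{1}{N-1}$$
$$\mathrm{Var}_N(I) = \frac{4AN^2 - 8(A+D)N + 12A^2}{4A^2(N^2-1)} - \left[E_N(I)\right]^2$$
$$A = \frac{1}{2}\sum_{i=1}^{N} L_i = \frac{S_0}{2}, \qquad D = \frac{1}{2}\sum_{i=1}^{N} L_i(L_i-1)$$
L_i is the number of neighbours of unit i (binary, symmetric W). The fraction on its own is the second moment E_N(I²); subtracting the squared mean gives the variance.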
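The moments under the normality assumption can be evaluated directly from the neighbour counts. The sketch below assumes a binary, symmetric W and treats the fraction in A and D as the second moment E(I²), from which the squared mean is subtracted to obtain the variance. The function name and the four-unit chain are hypothetical.

```python
import math


def moran_normal_moments(w):
    """E(I) and Var(I) under normality for a binary, symmetric w."""
    n = len(w)
    links = [sum(row) for row in w]                # L_i: neighbours of unit i
    a = 0.5 * sum(links)                           # A: number of joins
    d = 0.5 * sum(l * (l - 1) for l in links)      # D
    mean = -1.0 / (n - 1)
    second_moment = (4 * a * n**2 - 8 * (a + d) * n + 12 * a**2) / (
        4 * a**2 * (n**2 - 1))
    return mean, second_moment - mean**2


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

e_i, var_i = moran_normal_moments(W)
observed_i = 1 / 3                     # Moran's I of the series 1, 2, 3, 4 on this chain
z_i = (observed_i - e_i) / math.sqrt(var_i)
print(e_i, var_i, z_i)
```

A quick sanity check of the variance: with only two units joined to each other, Moran’s I equals -1 for any data, so its variance must be zero, which is what the function returns.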
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (IV)

Permutation: randomisation with unknown distribution function
1) A reference distribution for I is generated empirically.
2) The observations are randomly permuted over the locations and Moran’s I is recomputed for each new sample (in practice, a random subset of the N! possible permutations).
3) E[I] & SD[I] are computed directly from the generated distribution of Moran’s I values.
4) Significance of z(I): read from a standard normal table.
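The four steps above can be sketched as follows. This is a minimal illustration, assuming 999 random permutations, a fixed seed for reproducibility and a hypothetical six-unit chain; none of these choices come from the slides.

```python
import random
import statistics


def morans_i(x, w):
    """Moran's I for values x and a spatial weights matrix w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


def permutation_z(x, w, n_perm=999, seed=0):
    """Return observed I, empirical E[I], empirical SD[I] and z(I)."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    values = list(x)
    simulated = []
    for _ in range(n_perm):
        rng.shuffle(values)                  # reassign values to locations
        simulated.append(morans_i(values, w))
    mean = statistics.mean(simulated)        # step 3: empirical E[I]
    sd = statistics.pstdev(simulated)        # step 3: empirical SD[I]
    return observed, mean, sd, (observed - mean) / sd


# Hypothetical example: 6 units on a line with an increasing series.
n_units = 6
W = [[1 if abs(i - j) == 1 else 0 for j in range(n_units)]
     for i in range(n_units)]
obs, perm_mean, perm_sd, z = permutation_z([1, 2, 3, 4, 5, 6], W)
print(obs, perm_mean, perm_sd, z)
```

The empirical mean of the permuted values should be close to the theoretical E(I) = -1/(N-1), which offers a simple check on the simulation.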
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (V)
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (VI)

Interpretation:
Non-significant values of z(I) mean that H0 (no spatial autocorrelation) cannot be rejected.
• Significant z(I) > 0 → positive spatial autocorrelation: similar values (high or low) of a variable X are more spatially clustered than would be expected by chance.
• Significant z(I) < 0 → negative spatial autocorrelation: similar values (high or low) of X are less spatially clustered than would be expected by chance. A checkerboard is the perfect example of this pattern.
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (VII)

A significant negative z(I) indicates negative spatial autocorrelation (less clustering of similar values than would occur in a random pattern).
CLIFF, A. and J. ORD (1981), “Spatial processes, models and applications”. London: Pion; chapter 5.
4.1. Global spatial autocorrelation
4.1.1. Moran’s I (VIII)



Correlogram: an analytic method that is of value
in assessing the spatial scale of a process.
Sometimes the strength of spatial interaction will
vary in a complex way with distance.
Higher-order spatial autocorrelation: spatial
correlogram
[Figure: spatial correlogram; vertical axis: Z(I) of Moran’s I, from -1.5 to 1.5; horizontal axis ticks: 1 to 9]
4.1. Global spatial autocorrelation
4.1.2. Moran’s scatterplot

Visualizes I as the slope of the regression line in a
scatterplot with Wz on Y-axis and z on X-axis.
ANSELIN, L. & S. BAO (1997), “Exploratory Spatial Data Analysis”. In “Recent developments in spatial analysis” (Eds. Fischer and Getis), Springer-Verlag, Berlin; pp. 35-59.
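To illustrate the slope interpretation, the sketch below standardises a variable, builds its spatial lag Wz with a row-standardised W and fits the regression line of Wz on z. It assumes every unit has at least one neighbour. The quadrant labels follow the usual counter-clockwise numbering starting from the upper right; the function names and data are hypothetical.

```python
def quadrant(z_i, lag_i):
    """Quadrant of the Moran scatterplot (I and III: positive association)."""
    if z_i >= 0:
        return "I" if lag_i >= 0 else "IV"
    return "II" if lag_i >= 0 else "III"


def moran_scatterplot(x, w):
    """Return the OLS slope of Wz on z and the quadrant of each unit."""
    n = len(x)
    mean = sum(x) / n
    sd = (sum((xi - mean) ** 2 for xi in x) / n) ** 0.5
    z = [(xi - mean) / sd for xi in x]                      # standardised values
    rows = [[wij / sum(row) for wij in row] for row in w]   # row-standardised W
    lag = [sum(rows[i][j] * z[j] for j in range(n)) for i in range(n)]  # Wz
    lag_mean = sum(lag) / n
    # OLS slope with intercept: cov(z, Wz) / var(z); z already has mean zero.
    slope = (sum(zi * (li - lag_mean) for zi, li in zip(z, lag))
             / sum(zi * zi for zi in z))
    return slope, [quadrant(zi, li) for zi, li in zip(z, lag)]


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
slope, quadrants = moran_scatterplot([1, 2, 3, 4], W)
print(slope, quadrants)
```

With a row-standardised W the slope is Moran’s I itself, so a cloud concentrated in quadrants I and III goes with a positive I.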
4.1. Global spatial autocorrelation
4.1.2. Moran’s scatterplot (II)
[Figure: Moran scatterplot and Moran scatterplot map. Quadrants: I (+), II (-), III (+), IV (-)]
4.1. Global spatial autocorrelation
4.1.2. Moran’s scatterplot (III)
4.1. Global spatial autocorrelation
4.1.3. Geary’s c
$$c = \frac{N-1}{2S_0}\,\frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(x_i-x_j)^2}{\sum_{i=1}^{N}(x_i-\bar{x})^2}$$
Theoretical mean of Geary’s c: E(c) = 1
Perfect positive autocorrelation: c = 0, since x_i = x_j ⇒ x_i - x_j = 0 for neighboring units.
Geary’s c: depends on the (absolute) difference between neighboring values of a variable. It is similar to the Durbin-Watson test. It’s a variance test.
Moran’s I: depends on the difference between each value of the variable X and its mean. It is similar to the Pearson correlation coefficient. It’s a covariance test.
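A minimal sketch of Geary’s c on the same kind of toy data makes the contrast with Moran’s I concrete: the numerator uses differences between neighbouring values rather than cross-products of deviations. The function name and the four-unit chain are hypothetical.

```python
def gearys_c(x, w):
    """Geary's c for values x and a spatial weights matrix w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)                       # S0: sum of all weights
    diffs = sum(w[i][j] * (x[i] - x[j]) ** 2
                for i in range(n) for j in range(n))      # neighbour differences
    total = sum((xi - mean) ** 2 for xi in x)             # overall variation
    return (n - 1) / (2 * s0) * diffs / total


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

c_trend = gearys_c([1, 2, 3, 4], W)      # similar neighbours: c below 1
c_checker = gearys_c([1, 0, 1, 0], W)    # dissimilar neighbours: c above 1
print(c_trend, c_checker)
```

Values below E(c) = 1 point to positive autocorrelation and values above 1 to negative autocorrelation, the opposite direction to Moran’s I.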
4.1. Global spatial autocorrelation
4.1.3. Geary’s c (II)


Inference is typically based on a standardized z-value:
$$z(c) = \frac{c - E(c)}{SD(c)}$$
For N → ∞, z(c) follows a standard normal distribution: z(c) ~ N(0,1)
Normalisation: the variable X follows an asymptotic normal distribution.
Randomisation by permutation: unknown distribution function for X.
4.1. Global spatial autocorrelation
4.1.3. Geary’s c (III)

Interpretation:
$$z(c) = \frac{c - E(c)}{SD(c)}$$
Non-significant values of z(c) mean that H0 (no spatial autocorrelation) cannot be rejected.
• Significant z(c) < 0 → positive spatial autocorrelation: similar values (high or low) of a variable X are more spatially clustered than would be expected by chance.
• Significant z(c) > 0 → negative spatial autocorrelation: similar values (high or low) of X are less spatially clustered than would be expected by chance.
4.1. Global spatial autocorrelation
4.1.4. Mantel’s Γ

Mantel (1967): matrix association index, which is the sum of the cross-products of the corresponding elements of two matrices A and B:
$$\Gamma = \sum_{i}\sum_{j} a_{ij}\,b_{ij}$$
With a_ij = w_ij:
b_ij = x_i x_j → Moran’s I
b_ij = (x_i - x_j)^2 → Geary’s c
Spatial association measures can be obtained, in general, by expressing similarities by means of matrices: 1) spatial similarity (e.g., the spatial weight matrix) and 2) value similarity.
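The link between Mantel’s index and the two previous statistics can be checked numerically. In the sketch below, A is the spatial weights matrix and B is built in two ways: cross-products of deviations from the mean (the Moran case, assuming x is measured in deviations) and squared differences (the Geary case). All names and data are hypothetical.

```python
def mantel_gamma(a, b):
    """Sum of the cross-products of corresponding elements of a and b."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1, 2, 3, 4]
n = len(x)
mean = sum(x) / n
z = [xi - mean for xi in x]                 # deviations from the mean
ss = sum(zi * zi for zi in z)               # sum of squared deviations
s0 = sum(sum(row) for row in W)

# Value-similarity matrices for the two special cases.
b_moran = [[z[i] * z[j] for j in range(n)] for i in range(n)]
b_geary = [[(x[i] - x[j]) ** 2 for j in range(n)] for i in range(n)]

gamma_moran = mantel_gamma(W, b_moran)
gamma_geary = mantel_gamma(W, b_geary)

# Rescaling each Gamma gives the corresponding global statistic.
moran_i = (n / s0) * gamma_moran / ss
geary_c = (n - 1) / (2 * s0) * gamma_geary / ss
print(gamma_moran, gamma_geary, moran_i, geary_c)
```

Only the value-similarity matrix changes between the two cases; the spatial-similarity matrix W is the same.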
4.1. Global spatial autocorrelation
4.1.5. Getis and Ord G(d)

Spatial autocorrelation is measured with a distance-based spatial clustering statistic. For this test, two spatial units are neighbors if they are located within a certain distance (d) of each other.
$$G(d) = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}(d)\,x_i x_j}{\sum_{i=1}^{N}\sum_{j=1}^{N} x_i x_j}, \qquad \text{for } j \neq i;\quad X > 0$$
W = binary, symmetric
It measures the degree of association between the values of X around “i” and the values of X around “j”.
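A minimal sketch of G(d), assuming point coordinates for each unit and binary distance-band weights (w_ij(d) = 1 when the distance between i and j is at most d). The coordinates, values and function name are hypothetical.

```python
import math


def getis_ord_g(x, coords, d):
    """Global G(d) with binary distance-band weights; x must be positive."""
    n = len(x)
    num = 0.0
    den = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                        # sums run over j != i
            w_ij = 1 if math.dist(coords[i], coords[j]) <= d else 0
            num += w_ij * x[i] * x[j]           # products within distance d
            den += x[i] * x[j]                  # all products
    return num / den


# Hypothetical example: 4 units located at unit intervals along a line.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
x = [1, 2, 3, 4]

g_near = getis_ord_g(x, coords, d=1)    # only adjacent units are neighbours
g_all = getis_ord_g(x, coords, d=3)     # every pair is within distance d
print(g_near, g_all)
```

As d grows, more pairs enter the numerator, so G(d) rises towards 1 once every pair lies within the distance band.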
4.1. Global spatial autocorrelation
4.2. Local spatial autocorrelation


Concentration, in a particular zone of the global space, of particularly high or low values of a variable relative to its expected value (the mean of the variable).
This phenomenon takes place in non-stationary spatial processes: spatial dependence changes with location.
• Sometimes a variable shows no global spatial autocorrelation but contains small spatial clusters with a significant concentration (or lack) of high values.
• Sometimes there is global spatial autocorrelation in a variable, but each region contributes differently to it.
TESTS:
1. Getis and Ord’s local statistics
2. LISA tests
4.2. Local spatial autocorrelation
4.2.1. Getis and Ord’s local statistics

Gi(d), Gi*(d), New Gi(d), New Gi*(d)

Gi(d) measures the concentration (or lack of it) of the weighted sum of values of variable X in a subregion of “j” locations around “i” in the global space.
GLOBAL:
$$G(d) = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}(d)\,x_i x_j}{\sum_{i=1}^{N}\sum_{j=1}^{N} x_i x_j}$$
LOCAL:
$$G_i(d) = \frac{\sum_{j=1}^{N} w_{ij}(d)\,x_j}{\sum_{j=1}^{N} x_j}, \qquad j \neq i;\quad \text{for all } x_j > 0$$
W binary & symmetric
4.2. Local spatial autocorrelation
4.2.1. Getis and Ord’s local statistics (II)

Gi*(d): local spatial concentration that also considers the value of variable X in “i”.
Since w_ii = 0, the only difference from Gi(d) is in the denominator.
$$G_i^*(d) = \frac{\sum_{j=1}^{N} w_{ij}(d)\,x_j}{\sum_{j=1}^{N} x_j}, \qquad \text{for all } x_j > 0$$
W binary & symmetric
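The two local statistics can be compared in a short sketch. It follows the convention stated on the slide (w_ii = 0, so Gi and Gi* share the numerator and differ only in whether x_i enters the denominator); other presentations of Gi* set w_ii different from zero, which would also change the numerator. The function name and data are hypothetical.

```python
def local_g(x, w):
    """Return the lists of G_i(d) and G_i*(d) under the w_ii = 0 convention."""
    n = len(x)
    total = sum(x)
    g, g_star = [], []
    for i in range(n):
        # Weighted sum of the values around i (x_i itself is excluded).
        around_i = sum(w[i][j] * x[j] for j in range(n) if j != i)
        g.append(around_i / (total - x[i]))     # denominator excludes x_i
        g_star.append(around_i / total)         # denominator includes x_i
    return g, g_star


# Hypothetical example: 4 units on a line; w is already evaluated at distance d.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
g, g_star = local_g([1, 2, 3, 4], W)
print(g, g_star)
```

Because the denominator of Gi* is larger whenever x_i > 0, Gi* is never above Gi under this convention.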
4.2. Local spatial autocorrelation
4.2.1. Getis and Ord’s local statistics (III)




New Gi(d), New Gi*(d): the standardized versions of Gi(d) and Gi*(d).
They are distributed as normal variables.
Significant positive values of these tests = positive spatial autocorrelation = concentration of high values of the variable.
Significant negative values of these tests = positive spatial autocorrelation = concentration of low values of the variable.
Anselin, L. (1995). “Local indicators of spatial
association - LISA.” Geographical Analysis 27, 93–115.
4.2. Local spatial autocorrelation
4.2.2. LISA tests & maps





LISA: Local Indicators of Spatial Association
They detect the contribution of each location to global spatial autocorrelation.
Local spatial autocorrelation statistics are useful to identify hot spots: spatial concentrations of high/low values, or spatial outliers.
Local autocorrelation is always present when there is global spatial autocorrelation, but it can also exist in its absence.
Local Moran’s I is the most popular.
4.2. Local spatial autocorrelation
4.2.2. LISA tests… (II)


Gives an indication of the extent of significant spatial clustering of similar values around one observation “i”.
The sum of LISAs for all observations is proportional to the global Moran’s I.
Local Moran’s I (LISA):
$$I_i = z_i \sum_{j} w_{ij}\,z_j$$
z_i, z_j: standardized y_i values
For a row-standardised W

OBS   I_DIST01     Z_DIST01     P_DIST01
1     168.0678     -1.845431    0.000557
2     -1.155578    0.842808     0.677512
3     -0.88391     0.342822     0.673284
4     0.044727     0.291255     0.950351
5     -5.440304    1.480166     0.019687

The moments of the I_i statistic, under the null hypothesis of no spatial association, can be derived under a randomisation hypothesis.
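The proportionality between the local statistics and the global one can be verified on a toy example. The sketch assumes the common form I_i = z_i Σ_j w_ij z_j, values standardised with the population standard deviation and a row-standardised W, in which case the local values add up to N times the global Moran’s I. The function name and data are hypothetical.

```python
def local_morans(x, w):
    """Local Moran's I_i for standardised values and a row-standardised w."""
    n = len(x)
    mean = sum(x) / n
    sd = (sum((xi - mean) ** 2 for xi in x) / n) ** 0.5
    z = [(xi - mean) / sd for xi in x]                      # standardised values
    rows = [[wij / sum(row) for wij in row] for row in w]   # row-standardised W
    return [z[i] * sum(rows[i][j] * z[j] for j in range(n)) for i in range(n)]


# Hypothetical example: 4 units on a line; adjacent units are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
local = local_morans([1, 2, 3, 4], W)
global_i = sum(local) / len(local)      # global Moran's I recovered from the LISAs
print(local, global_i)
```

Units at the two ends of the chain contribute more to the global statistic than the interior units, which shows how each location can weigh differently in the same global value.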
4.2. Local spatial autocorrelation
4.2.2. LISA tests & maps (III)
Non-significant Moran’s I
over the whole area of the
Spanish provinces
4.2. Local spatial autocorrelation
4.2.2. LISA tests & maps (III)
LÓPEZ, F. & C. CHASCO (2004), “Space-time lags: Specification strategy in spatial regression models”. REAL 04-T17, http://www2.uiuc.edu/unit/real/d-paper/real04-t-17.pdf
4.3. Bivariate & space-time plots
Multivariate spatial correlation (Wartenberg, 1985):
$$m_{kl} = z_k' W^{s} z_l, \qquad z_k = [Y_k - \mu_k]/\sigma_k, \qquad z_l = [Y_l - \mu_l]/\sigma_l$$
W^s is a doubly standardized W.

Bivariate Moran spatial autocorrelation (Anselin et al., 2002):
$$I_{kl} = \frac{z_k' W z_l}{z_k' z_k}$$

Moran space-time autocorrelation (our proposal):
$$I_{t-k,t} = \frac{z_{t-k}' W z_t}{z_{t-k}' z_{t-k}}, \qquad z_t = [Y_t - \mu_t]/\sigma_t, \qquad z_{t-k} = [Y_{t-k} - \mu_{t-k}]/\sigma_{t-k}$$
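A minimal sketch of the space-time coefficient, written as the ratio z_{t-k}' W z_t / z_{t-k}' z_{t-k}, which is also the no-intercept slope of W z_t on z_{t-k}. The row-standardised weights, the two hypothetical series and the function names are assumptions made for illustration.

```python
def standardise(y):
    """Standardise a series with its mean and population standard deviation."""
    n = len(y)
    mean = sum(y) / n
    sd = (sum((yi - mean) ** 2 for yi in y) / n) ** 0.5
    return [(yi - mean) / sd for yi in y]


def space_time_moran(y_past, y_now, w):
    """I_{t-k,t}: association between y at t-k and the spatial lag of y at t."""
    n = len(y_now)
    z_past = standardise(y_past)                 # z_{t-k}
    z_now = standardise(y_now)                   # z_t
    wz_now = [sum(w[i][j] * z_now[j] for j in range(n)) for i in range(n)]
    return (sum(a * b for a, b in zip(z_past, wz_now))
            / sum(a * a for a in z_past))


# Hypothetical example: row-standardised contiguity for 4 units on a line.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

same = space_time_moran([1, 2, 3, 4], [1, 2, 3, 4], W)       # pattern persists
reversed_ = space_time_moran([1, 2, 3, 4], [4, 3, 2, 1], W)  # pattern reverses
print(same, reversed_)
```

When the two periods hold the same values the coefficient reduces to the ordinary Moran’s I, and it changes sign when the spatial pattern is reversed between periods.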
4.3. Bivariate & space-time plots (II)
I_{t-k,t}: Moran space-time autocorrelation coefficient. Its value coincides with the slope of the regression line of Wz_t on z_{t-k}.
[Figure: space-time Moran scatterplots (Wz_t against z_{t-k}) for Employment rate (E) and Population (P); t = 2002, k = 4; W: contiguity matrix (0-1); reported p-values: 0.001 and 0.427]
4.3. Bivariate & space-time plots (IV)
Bivariate LISA
Anselin, L. (2005). “Exploring spatial data with GeoDa”
University of Illinois, Urbana-Champaign.
3.3. Spatial correlation analysis for rates




The computation of spatial dependence statistics, and inference based on them, rest on the assumption that the original variables are stationary (mean and variance are constant over space).
Constant variance: this assumption usually fails when the observations differ greatly from one another in area or population.
When variables are transformed into rates or proportions, many observations tend to have very small or zero values (in small or sparsely populated areas), or extremely high outliers (when an unexpected event occurs in a small area).
This is the usual situation for variables expressed as rates of population-related events (mortality, unemployment, crime, etc.).
3.3. Spatial correlation analysis for rates



In medicine and epidemiology, methods are used to remove (or reduce) heteroscedasticity in ratios.
Computing rates directly (e.g. mortality) as the number of deaths divided by the total population of a geographical unit can be biased, because the population at risk (the denominator of the rate) may differ widely from place to place.
In small-area economic analysis, it is easy to obtain rates or proportions with zero (or near-zero) values in very small or sparsely populated units (rural areas with an ageing population), which hinders or biases comparisons between territories in terms of that variable.
3.3. Spatial correlation analysis for rates


Empirical Bayes (EB) standardisation: standardisation or smoothing of rates.
Direct method of rate standardisation (Sáez and Saurina, 2007, p. 34):
$$\hat{p}_i = p_i \cdot \frac{n_i}{n}, \qquad p_i = \frac{x_i}{n_i}$$
x_i: the value of variable X in spatial unit i
n_i: the population of unit i
n: the total population of all units in the system
3.3. Spatial correlation analysis for rates

Empirical Bayes (EB) standardisation: the method implemented in the GeoDa program (Anselin, 2005).
3.3. Spatial correlation analysis for rates

Assunção and Reis (1999) adapt Moran’s I statistic to avoid the bias that arises in this situation.