To prove that the coefficient of determination equals the square of the correlation coefficient, we first need to prove the following results. Throughout, an overline denotes the arithmetic mean over the $n$ data points, so that $\sum y = n\,\overline{y}$ and $\sum y^{2} = n\,\overline{y^{2}}$.
Theorem 1: $SQ_{\text{tot}} = n\left(\overline{y^{2}} - \overline{y}^{2}\right)$
Proof: $$SQ_{\text{tot}} = (y_{1} - \overline{y})^{2} + (y_{2} - \overline{y})^{2} + \cdots + (y_{n} - \overline{y})^{2}$$
$$= (y_{1}^{2} - 2 y_{1}\,\overline{y} + \overline{y}^{2}) + \cdots + (y_{n}^{2} - 2 y_{n}\,\overline{y} + \overline{y}^{2})$$
$$= \left(\sum y^{2}\right) - 2\,\overline{y}\left(\sum y\right) + n\,\overline{y}^{2}$$
$$= n\,\overline{y^{2}} - 2\,\overline{y}\,(n\,\overline{y}) + n\,\overline{y}^{2}$$
$$= n\left(\overline{y^{2}} - 2\,\overline{y}^{2} + \overline{y}^{2}\right)$$
$$= n\left(\overline{y^{2}} - \overline{y}^{2}\right) \qquad \text{Q.E.D.}$$
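The identity in Theorem 1 can be checked numerically. The following is a minimal sketch, assuming a small made-up data set; the values in `y` and the helper `mean` are illustrative and not part of the original text.

```python
# Numerical check of Theorem 1 on made-up data (illustrative values only).
y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(y)


def mean(values):
    """Arithmetic mean, i.e. the overline operator used in the proof."""
    return sum(values) / len(values)


y_bar = mean(y)                      # mean of y
y2_bar = mean([v * v for v in y])    # mean of y squared

# Left-hand side: sum of squared deviations from the mean.
sq_tot_definition = sum((v - y_bar) ** 2 for v in y)
# Right-hand side: closed form stated by Theorem 1.
sq_tot_theorem = n * (y2_bar - y_bar ** 2)

print(sq_tot_definition, sq_tot_theorem)
```

Both printed values should agree up to floating-point rounding.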
Theorem 2: $SQ_{\text{res}} = n\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}} + SQ_{\text{tot}}$
Proof: First, we rewrite the expression for the value estimated by the linear regression, using the least-squares intercept $B = \overline{y} - A\,\overline{x}$:
$$\hat{y}_{k} = A\,x_{k} + B$$
$$= A\,x_{k} + (\overline{y} - A\,\overline{x})$$
$$= A\,(x_{k} - \overline{x}) + \overline{y}$$
$$SQ_{\text{res}} = (\hat{y}_{1} - y_{1})^{2} + (\hat{y}_{2} - y_{2})^{2} + \cdots + (\hat{y}_{n} - y_{n})^{2}$$
$$= [A(x_{1} - \overline{x}) + \overline{y} - y_{1}]^{2} + \cdots + [A(x_{n} - \overline{x}) + \overline{y} - y_{n}]^{2}$$
$$= [A(x_{1} - \overline{x}) + (\overline{y} - y_{1})]^{2} + \cdots + [A(x_{n} - \overline{x}) + (\overline{y} - y_{n})]^{2}$$
$$= A^{2}(x_{1} - \overline{x})^{2} + 2A(x_{1} - \overline{x})(\overline{y} - y_{1}) + (\overline{y} - y_{1})^{2} + \cdots + A^{2}(x_{n} - \overline{x})^{2} + 2A(x_{n} - \overline{x})(\overline{y} - y_{n}) + (\overline{y} - y_{n})^{2}$$
$$= A^{2}(x_{1}^{2} - 2x_{1}\,\overline{x} + \overline{x}^{2}) + 2A(x_{1}\,\overline{y} - x_{1}y_{1} - \overline{x}\,\overline{y} + \overline{x}\,y_{1}) + (\overline{y}^{2} - 2\,\overline{y}\,y_{1} + y_{1}^{2}) + \cdots$$
$$= A^{2}x_{1}^{2} - 2A^{2}x_{1}\,\overline{x} + A^{2}\,\overline{x}^{2} + 2A\,x_{1}\,\overline{y} - 2A\,x_{1}y_{1} - 2A\,\overline{x}\,\overline{y} + 2A\,\overline{x}\,y_{1} + \overline{y}^{2} - 2\,\overline{y}\,y_{1} + y_{1}^{2} + \cdots$$
$$= A^{2}\left(\sum x^{2}\right) - 2A^{2}\,\overline{x}\left(\sum x\right) + n A^{2}\,\overline{x}^{2} + 2A\,\overline{y}\left(\sum x\right) - 2A\left(\sum xy\right) - 2A\,n\,\overline{x}\,\overline{y} + 2A\,\overline{x}\left(\sum y\right) + n\,\overline{y}^{2} - 2\,\overline{y}\left(\sum y\right) + \left(\sum y^{2}\right)$$
$$= A^{2}(n\,\overline{x^{2}}) - 2A^{2}\,\overline{x}\,(n\,\overline{x}) + n A^{2}\,\overline{x}^{2} + 2A\,\overline{y}\,(n\,\overline{x}) - 2A\,(n\,\overline{xy}) - 2A\,n\,\overline{x}\,\overline{y} + 2A\,\overline{x}\,(n\,\overline{y}) + n\,\overline{y}^{2} - 2\,\overline{y}\,(n\,\overline{y}) + n\,\overline{y^{2}}$$
$$= n\left(A^{2}\,\overline{x^{2}} - 2A^{2}\,\overline{x}^{2} + A^{2}\,\overline{x}^{2} + 2A\,\overline{x}\,\overline{y} - 2A\,\overline{xy} - 2A\,\overline{x}\,\overline{y} + 2A\,\overline{x}\,\overline{y} + \overline{y}^{2} - 2\,\overline{y}^{2} + \overline{y^{2}}\right)$$
$$= n\left(A^{2}\,\overline{x^{2}} - A^{2}\,\overline{x}^{2} + 2A\,\overline{x}\,\overline{y} - 2A\,\overline{xy} - \overline{y}^{2} + \overline{y^{2}}\right)$$
$$= n\left[A^{2}\left(\overline{x^{2}} - \overline{x}^{2}\right) + 2A\left(\overline{x}\,\overline{y} - \overline{xy}\right) + \overline{y^{2}} - \overline{y}^{2}\right]$$
Substituting the least-squares slope $A = \dfrac{\overline{x}\,\overline{y} - \overline{xy}}{\overline{x}^{2} - \overline{x^{2}}}$ and applying Theorem 1 to the last two terms:
$$= n\left[\left(\dfrac{\overline{x}\,\overline{y} - \overline{xy}}{\overline{x}^{2} - \overline{x^{2}}}\right)^{2}\left(\overline{x^{2}} - \overline{x}^{2}\right) + 2\,\dfrac{\overline{x}\,\overline{y} - \overline{xy}}{\overline{x}^{2} - \overline{x^{2}}}\left(\overline{x}\,\overline{y} - \overline{xy}\right)\right] + n\left(\overline{y^{2}} - \overline{y}^{2}\right)$$
$$= n\left[\dfrac{-(\overline{x}\,\overline{y} - \overline{xy})^{2}\left(\overline{x}^{2} - \overline{x^{2}}\right)}{\left(\overline{x}^{2} - \overline{x^{2}}\right)^{2}} + 2\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}}\right] + SQ_{\text{tot}}$$
$$= n\left[\dfrac{-(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}} + 2\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}}\right] + SQ_{\text{tot}}$$
$$= n\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}} + SQ_{\text{tot}} \qquad \text{Q.E.D.}$$
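Theorem 2 can also be checked numerically. This is a minimal sketch, assuming made-up data in `x` and `y` and a hypothetical helper `mean`. The slope `A` and intercept `B` are the ordinary least-squares values, written in the same form as in the proof.

```python
# Numerical check of Theorem 2 on made-up data (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(x)


def mean(values):
    """Arithmetic mean, i.e. the overline operator used in the proof."""
    return sum(values) / len(values)


x_bar, y_bar = mean(x), mean(y)
x2_bar = mean([a * a for a in x])
y2_bar = mean([b * b for b in y])
xy_bar = mean([a * b for a, b in zip(x, y)])

# Least-squares slope and intercept, written as in the proof.
A = (x_bar * y_bar - xy_bar) / (x_bar ** 2 - x2_bar)
B = y_bar - A * x_bar

# Left-hand side: sum of squared residuals of the fitted line.
sq_res_definition = sum((A * a + B - b) ** 2 for a, b in zip(x, y))

# Right-hand side: closed form stated by Theorem 2.
sq_tot = n * (y2_bar - y_bar ** 2)
sq_res_theorem = n * (x_bar * y_bar - xy_bar) ** 2 / (x_bar ** 2 - x2_bar) + sq_tot

print(sq_res_definition, sq_res_theorem)
```

The first term of the closed form is never positive, because $\overline{x}^{2} - \overline{x^{2}} \le 0$, so the residual sum of squares cannot exceed the total sum of squares.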
Theorem 3: $R^{2} = \dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\left(\overline{x^{2}} - \overline{x}^{2}\right)\left(\overline{y^{2}} - \overline{y}^{2}\right)}$
Proof: $$R^{2} = 1 - \dfrac{SQ_{\text{res}}}{SQ_{\text{tot}}} = \dfrac{SQ_{\text{tot}}}{SQ_{\text{tot}}} - \dfrac{SQ_{\text{res}}}{SQ_{\text{tot}}} = \dfrac{SQ_{\text{tot}} - SQ_{\text{res}}}{SQ_{\text{tot}}} = \dfrac{SQ_{\text{tot}} - \left[n\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x}^{2} - \overline{x^{2}}} + SQ_{\text{tot}}\right]}{SQ_{\text{tot}}}$$
The $SQ_{\text{tot}}$ terms in the numerator cancel, and the remaining minus sign is absorbed by reversing the denominator $\overline{x}^{2} - \overline{x^{2}}$:
$$= n\,\dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\overline{x^{2}} - \overline{x}^{2}} \cdot \dfrac{1}{SQ_{\text{tot}}} = \dfrac{n\,(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\left(\overline{x^{2}} - \overline{x}^{2}\right) n \left(\overline{y^{2}} - \overline{y}^{2}\right)} = \dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\left(\overline{x^{2}} - \overline{x}^{2}\right)\left(\overline{y^{2}} - \overline{y}^{2}\right)} \qquad \text{Q.E.D.}$$
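As a sanity check of Theorem 3, the sketch below compares $R^{2}$ computed from its definition, $1 - SQ_{\text{res}}/SQ_{\text{tot}}$, with the closed form in terms of means. The data and the helper `mean` are assumptions made for illustration only.

```python
# Numerical check of Theorem 3 on made-up data (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]


def mean(values):
    """Arithmetic mean, i.e. the overline operator used in the proof."""
    return sum(values) / len(values)


x_bar, y_bar = mean(x), mean(y)
x2_bar = mean([a * a for a in x])
y2_bar = mean([b * b for b in y])
xy_bar = mean([a * b for a, b in zip(x, y)])

# Fit the least-squares line y_hat = A * x + B.
A = (x_bar * y_bar - xy_bar) / (x_bar ** 2 - x2_bar)
B = y_bar - A * x_bar

# R^2 from its definition: 1 - SQ_res / SQ_tot.
sq_res = sum((A * a + B - b) ** 2 for a, b in zip(x, y))
sq_tot = sum((b - y_bar) ** 2 for b in y)
r2_definition = 1.0 - sq_res / sq_tot

# R^2 from the closed form stated by Theorem 3.
r2_theorem = (x_bar * y_bar - xy_bar) ** 2 / (
    (x2_bar - x_bar ** 2) * (y2_bar - y_bar ** 2)
)

print(r2_definition, r2_theorem)
```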
Theorem 4: (correlation coefficient)² = coefficient of determination
Proof: The correlation coefficient is $$R = \dfrac{\sum (x - \overline{x})(y - \overline{y})}{\sqrt{\sum (x - \overline{x})^{2}}\,\sqrt{\sum (y - \overline{y})^{2}}}$$
To square it, we treat the numerator and the denominator separately.
Square of the numerator: $$\left[\sum (x - \overline{x})(y - \overline{y})\right]^{2}$$
$$= \left[\sum \left(xy - x\,\overline{y} - \overline{x}\,y + \overline{x}\,\overline{y}\right)\right]^{2}$$
$$= \left[(x_{1}y_{1} - x_{1}\,\overline{y} - \overline{x}\,y_{1} + \overline{x}\,\overline{y}) + \cdots + (x_{n}y_{n} - x_{n}\,\overline{y} - \overline{x}\,y_{n} + \overline{x}\,\overline{y})\right]^{2}$$
$$= \left[\left(\sum xy\right) - \overline{y}\left(\sum x\right) - \overline{x}\left(\sum y\right) + n\,\overline{x}\,\overline{y}\right]^{2}$$
$$= \left[n\,\overline{xy} - \overline{y}\,(n\,\overline{x}) - \overline{x}\,(n\,\overline{y}) + n\,\overline{x}\,\overline{y}\right]^{2}$$
$$= \left[n\left(\overline{xy} - \overline{x}\,\overline{y}\right)\right]^{2}$$
$$= n^{2}\left(\overline{xy} - \overline{x}\,\overline{y}\right)^{2}$$
Now the square of the denominator:
$$\left[\sqrt{\sum (x - \overline{x})^{2}}\,\sqrt{\sum (y - \overline{y})^{2}}\right]^{2}$$
$$= \left[\sum (x - \overline{x})^{2}\right]\left[\sum (y - \overline{y})^{2}\right]$$
$$= \left[\sum \left(x^{2} - 2x\,\overline{x} + \overline{x}^{2}\right)\right]\left[\sum \left(y^{2} - 2y\,\overline{y} + \overline{y}^{2}\right)\right]$$
$$= \left[\left(\sum x^{2}\right) - 2\,\overline{x}\left(\sum x\right) + n\,\overline{x}^{2}\right]\left[\left(\sum y^{2}\right) - 2\,\overline{y}\left(\sum y\right) + n\,\overline{y}^{2}\right]$$
$$= \left[n\,\overline{x^{2}} - 2\,\overline{x}\,(n\,\overline{x}) + n\,\overline{x}^{2}\right]\left[n\,\overline{y^{2}} - 2\,\overline{y}\,(n\,\overline{y}) + n\,\overline{y}^{2}\right]$$
$$= \left(n\,\overline{x^{2}} - n\,\overline{x}^{2}\right)\left(n\,\overline{y^{2}} - n\,\overline{y}^{2}\right)$$
$$= \left[n\left(\overline{x^{2}} - \overline{x}^{2}\right)\right]\left[n\left(\overline{y^{2}} - \overline{y}^{2}\right)\right]$$
$$= n^{2}\left(\overline{x^{2}} - \overline{x}^{2}\right)\left(\overline{y^{2}} - \overline{y}^{2}\right)$$
Putting the two together, we have:
$$(\text{correlation coefficient})^{2} = R^{2} = \dfrac{n^{2}\left(\overline{xy} - \overline{x}\,\overline{y}\right)^{2}}{n^{2}\left(\overline{x^{2}} - \overline{x}^{2}\right)\left(\overline{y^{2}} - \overline{y}^{2}\right)}$$
Cancelling $n^{2}$ and using $(\overline{xy} - \overline{x}\,\overline{y})^{2} = (\overline{x}\,\overline{y} - \overline{xy})^{2}$, this matches the expression of Theorem 3:
$$= \dfrac{(\overline{x}\,\overline{y} - \overline{xy})^{2}}{\left(\overline{x^{2}} - \overline{x}^{2}\right)\left(\overline{y^{2}} - \overline{y}^{2}\right)} = \text{coefficient of determination } (R^{2}) \qquad \text{Q.E.D.}$$
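The final result can be illustrated end to end. The sketch below computes the correlation coefficient directly from its definition, squares it, and compares it with the coefficient of determination of the fitted line. The data set and the helper `mean` are assumptions chosen only for illustration.

```python
import math

# End-to-end check of Theorem 4 on made-up data (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]


def mean(values):
    """Arithmetic mean, i.e. the overline operator used in the proof."""
    return sum(values) / len(values)


x_bar, y_bar = mean(x), mean(y)

# Correlation coefficient from its definition.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)
r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Coefficient of determination of the least-squares line y_hat = A * x + B.
A = s_xy / s_xx
B = y_bar - A * x_bar
sq_res = sum((A * a + B - b) ** 2 for a, b in zip(x, y))
r_squared = 1.0 - sq_res / s_yy

print(r ** 2, r_squared)
```

Here the slope is written as `s_xy / s_xx`, which equals the form used in the proofs after multiplying numerator and denominator by $-n$.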