CHAPTER 11 The Regression Fallacy

Only for the sake of this exercise we will assume that "intelligence" is an innate property of individuals and can be represented by a real number $z$. If one picks at random a student entering the U of U, the intelligence of this student is a random variable which we assume to be normally distributed with mean $\mu$ and standard deviation $\tau$. Also assume every student has to take two intelligence tests, the first at the beginning of his or her studies, the other half a year later. The outcomes of these tests are $x$ and $y$. Each of them measures the intelligence $z$ (which is assumed to be the same in both tests) plus a random error, $\varepsilon$ and $\delta$ respectively, i.e.,

$$x = z + \varepsilon \tag{11.0.14}$$
$$y = z + \delta \tag{11.0.15}$$

Here $z \sim N(\mu, \tau^2)$, $\varepsilon \sim N(0, \sigma^2)$, and $\delta \sim N(0, \sigma^2)$, i.e., we assume that both errors have the same variance. The three variables $\varepsilon$, $\delta$, and $z$ are independent of each other. Therefore $x$ and $y$ are jointly normal, with

$$\operatorname{var}[x] = \tau^2 + \sigma^2, \qquad \operatorname{var}[y] = \tau^2 + \sigma^2, \qquad \operatorname{cov}[x, y] = \operatorname{cov}[z + \varepsilon,\, z + \delta] = \tau^2 + 0 + 0 + 0 = \tau^2.$$

Therefore $\rho = \tau^2 / (\tau^2 + \sigma^2)$. The contour lines of the joint density are ellipses with center $(\mu, \mu)$ whose main axes lie on the line $y = x$ and on the perpendicular line through the center, $y = 2\mu - x$, in the $x,y$-plane.

Now what is the conditional mean? Since $\operatorname{var}[x] = \operatorname{var}[y]$, (10.3.17) gives the line

$$E[y \mid x] = \mu + \rho\,(x - \mu),$$

i.e., it is a line which goes through the center of the ellipses but which, because $0 < \rho < 1$ whenever $\sigma^2 > 0$, is flatter than the line $x = y$ representing the real underlying linear relationship if there are no errors. Geometrically one can get it as the line which intersects each ellipse exactly where the ellipse is vertical. Therefore the parameters of the best prediction of $y$ on the basis of $x$ are not the parameters of the underlying relationship.
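The flattening of the prediction line can be checked numerically. The following is a minimal simulation sketch, not part of the original notes: the function names `simulate_scores` and `ols_slope`, the values $\mu = 100$, $\tau = 15$, $\sigma = 10$, the sample size, and the two-standard-deviation cutoff for "high scorers" are all assumptions chosen only for illustration. It draws students according to (11.0.14) and (11.0.15), compares the least-squares slope of $y$ on $x$ with $\rho = \tau^2/(\tau^2 + \sigma^2)$, and looks at the students whose first score was far above average.

```python
import random
import statistics

# Illustrative parameter values (assumptions, not taken from the text).
MU, TAU, SIGMA = 100.0, 15.0, 10.0


def simulate_scores(n, mu=MU, tau=TAU, sigma=SIGMA, seed=0):
    """Draw n students as (z, x, y) with x = z + eps and y = z + delta."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        z = rng.gauss(mu, tau)          # true intelligence, N(mu, tau^2)
        x = z + rng.gauss(0.0, sigma)   # first test, error eps ~ N(0, sigma^2)
        y = z + rng.gauss(0.0, sigma)   # second test, error delta ~ N(0, sigma^2)
        students.append((z, x, y))
    return students


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs: sample cov(x, y) / sample var(x)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


students = simulate_scores(200_000)
zs, xs, ys = (list(col) for col in zip(*students))

slope = ols_slope(xs, ys)
rho = TAU**2 / (TAU**2 + SIGMA**2)  # theoretical slope, 225/325

# Students whose first score is more than two standard deviations above mu.
cutoff = MU + 2.0 * (TAU**2 + SIGMA**2) ** 0.5
top = [(z, x, y) for z, x, y in students if x > cutoff]
mean_z_top = statistics.fmean([z for z, _, _ in top])
mean_x_top = statistics.fmean([x for _, x, _ in top])
mean_y_top = statistics.fmean([y for _, _, y in top])

print(f"estimated slope {slope:.3f}, theoretical rho {rho:.3f}")
print(f"high scorers: mean x {mean_x_top:.1f}, "
      f"mean z {mean_z_top:.1f}, mean y {mean_y_top:.1f}")
```

Under these assumptions the estimated slope should come out close to $\rho$ rather than 1, and among the high scorers both the average true value $z$ and the average second score $y$ should lie between $\mu$ and the average first score.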
Why not? Because not only $y$ but also $x$ is subject to errors. Assume you pick an individual at random, and it turns out that his or her first test result is very much higher than the average. Then it is more likely that this is an individual who was lucky in the first exam, and whose true IQ is lower than the one measured, than that the individual is an Einstein who had a bad