Introduction to Random Variables

Statistics (Estadística). Professor: María Durbán

Outline
1 Definition of random variable
2 Discrete and continuous random variables
   Probability function
   Distribution function
   Density function
3 Characteristic measures of a random variable
   Mean, variance
   Other measures
4 Transformation of random variables

1 Definition of random variable

Sometimes it is not enough to describe all possible results of an experiment:

Toss a coin 3 times: {(HHH), (HHT), …}
Throw a die twice: {(1,1), (1,2), (1,3), …}

Sometimes it is useful to associate a number with each result of an experiment, that is, to define a variable. We don't know the result of the experiment before we carry it out, so we don't know the value of the variable before the experiment either.

A random variable is a function that associates a real number with each element of the sample space.

X = number of heads on the first toss: X[(HHH)] = 1, X[(THT)] = 0, …
Y = sum of points: Y[(1,1)] = 2, Y[(1,2)] = 3, …

Random variables are represented by capital letters, generally the last letters of the alphabet: X, Y, Z, etc. The values taken by the variable are represented by small letters, for example:

x = 1 is a possible value of X
y = 3.2 is a possible value of Y
z = -7.3 is a possible value of Z

Examples
• Number of defective units in a random sample of 5 units
• Number of faults per cm² of material
• Lifetime of a lamp
• Resistance to compression of concrete

[Figure: the random variable X maps each outcome of the sample space E to a point of $R_X$, e.g. $X(s_i) = b$ and $X(s_k) = a$, with $s_i, s_k \in E$]

• The space $R_X$ is the set of ALL possible values of X(s).
• Each possible event of E has an associated value in $R_X$.
• We can consider $R_X$ as another sample space.

The elements of E have a probability distribution, and this distribution is also associated with the values of the variable X. That is, every r.v. preserves the probability structure of the random experiment that generates it:

\[ \Pr(X = x) = \Pr(\{ s \in E : X(s) = x \}) \]

2 Discrete and continuous random variables

The range of a random variable is the set of possible values taken by the variable.
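The idea that a r.v. inherits its probabilities from the sample space, and that its range is the set of values it can take, can be sketched by enumeration. This is a minimal illustration for Y = sum of points when a die is thrown twice; it assumes a fair die (all 36 outcomes equally likely), and the names `sum_of_points` and `induced_distribution` are hypothetical helpers, not part of the course material.

```python
from fractions import Fraction
from itertools import product

# Sample space E for throwing a die twice; outcomes are assumed equally likely.
sample_space = list(product(range(1, 7), repeat=2))
prob = {s: Fraction(1, len(sample_space)) for s in sample_space}

def sum_of_points(s):
    """Random variable Y: maps an outcome (d1, d2) to the real number d1 + d2."""
    return s[0] + s[1]

def induced_distribution(X, prob):
    """Pr(X = x) = Pr({s in E : X(s) = x}), accumulated over the sample space."""
    dist = {}
    for s, p in prob.items():
        dist[X(s)] = dist.get(X(s), Fraction(0)) + p
    return dist

dist_Y = induced_distribution(sum_of_points, prob)
range_Y = sorted(dist_Y)          # R_Y: the integers 2 through 12
print(range_Y)
print(dist_Y[2], dist_Y[7])       # 1/36 1/6
```

Exact fractions are used so that the probabilities can be compared without rounding error.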
Depending on the range, variables can be classified as:

Discrete: those that take a finite or countably infinite number of values. They generally count the number of times that something happens.

Examples of discrete random variables
• Number of faults on a glass surface
• Proportion of defective parts in a sample of 1000
• Number of bits transmitted and received correctly

Continuous: those whose range is an interval of real numbers. They generally measure a magnitude.

Examples of continuous random variables
• Electric current
• Length
• Temperature
• Weight

2 Discrete random variables

The values taken by a random variable change from one experiment to another, since the results of the experiment are different. A r.v. is defined by:
• The values that it takes.
• The probability of taking each value.

The probability function indicates the probability of each possible value $x_i$:

\[ p(x_i) = P(X = x_i) \]

[Figure: bar chart of $p(x_i)$ over the values $x_1, x_2, \dots, x_n$]

The properties of the probability function come from the axioms of probability:

1. $0 \le P(A) \le 1$ implies $0 \le p(x_i) \le 1$.
2. $P(E) = 1$ implies $\sum_{i=1}^{n} p(x_i) = 1$.
3. $P(A \cup B) = P(A) + P(B)$ if $A \cap B = \emptyset$. Taking $a < b < c$, $A = \{a \le X \le b\}$ and $B = \{b < X \le c\}$, this implies
\[ \Pr(a \le X \le c) = \Pr(a \le X \le b) + \Pr(b < X \le c) \]

Example

Experiment: toss 2 coins. X = number of tails.

Outcome   Pr    X
HH        1/4   0
HT        1/4   1
TH        1/4   1
TT        1/4   2

x    P(X = x)
0    1/4
1    1/2
2    1/4
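The two-coin example can be reproduced by enumeration, and the three properties of the probability function checked directly. This is a small sketch that assumes fair coins; `prob_between` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Outcomes of tossing 2 coins; the coins are assumed fair, so each has probability 1/4.
outcomes = ["".join(t) for t in product("HT", repeat=2)]   # HH, HT, TH, TT
p_outcome = Fraction(1, 4)

# Probability function p(x) = P(X = x), with X = number of tails.
pmf = {}
for s in outcomes:
    x = s.count("T")
    pmf[x] = pmf.get(x, Fraction(0)) + p_outcome

def prob_between(pmf, lo, hi, left_closed=True):
    """Pr(lo <= X <= hi), or Pr(lo < X <= hi) when left_closed is False."""
    return sum(p for x, p in pmf.items()
               if (lo <= x if left_closed else lo < x) and x <= hi)

# Properties 1 and 2: each p(x) lies in [0, 1] and the p(x) add up to 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

# Property 3 with a = 0, b = 1, c = 2.
lhs = prob_between(pmf, 0, 2)
rhs = prob_between(pmf, 0, 1) + prob_between(pmf, 1, 2, left_closed=False)
print(pmf, lhs == rhs)
```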
[Figure: bar chart of the probability function p(x) at x = 0, 1, 2]

Sometimes we might be interested in the probability that a variable takes a value less than or equal to a given quantity $x_0$. This is given by the distribution function:

\[ F(x_0) = P(X \le x_0), \qquad F(-\infty) = 0, \qquad F(+\infty) = 1 \]

If X takes the values $x_1 \le x_2 \le \dots \le x_n$:

\[ F(x_1) = P(X \le x_1) = p(x_1) \]
\[ F(x_2) = P(X \le x_2) = p(x_1) + p(x_2) \]
\[ \vdots \]
\[ F(x_n) = P(X \le x_n) = \sum_{i=1}^{n} p(x_i) = 1 \]

Example

Experiment: toss 2 coins. X = number of tails.

x    P(X = x)   F(x)
0    1/4        1/4
1    1/2        3/4
2    1/4        1

[Figure: step plot of F(x), rising to 0.25 at x = 0, 0.75 at x = 1 and 1 at x = 2]

2 Continuous random variables

When a random variable is continuous, it doesn't make sense to sum

\[ \sum_{i=1}^{\infty} p(x_i) = 1 \]

since the set of values taken by the variable is not countable. We can generalize by replacing the sum with an integral, $\sum \to \int$, and we introduce a new concept in place of the probability function of discrete random variables: the density function.

The density function describes the probability distribution of a continuous random variable. It is a function that satisfies:

\[ f(x) \ge 0, \qquad \int_{-\infty}^{+\infty} f(x)\,dx = 1, \qquad P(a \le X \le b) = \int_{a}^{b} f(x)\,dx \]
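The step from sums to integrals can be illustrated numerically: a sum of f(x)·dx over many narrow bins approximates the integral of the density. The density below, f(x) = 2x on [0, 1], is a hypothetical choice made only for this sketch (it does not come from the slides), and `riemann_probability` is an illustrative helper name.

```python
def f(x):
    """Hypothetical density used only for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def riemann_probability(f, a, b, n=100_000):
    """Approximate P(a <= X <= b) by a midpoint sum of f(x_i) * dx over n narrow bins."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

total = riemann_probability(f, 0.0, 1.0)   # total probability, should be close to 1
p_mid = riemann_probability(f, 0.2, 0.5)   # exact value: 0.5**2 - 0.2**2 = 0.21
print(round(total, 6), round(p_mid, 6))
```

Each term f(x_i)·dx plays the role that p(x_i) plays in the discrete case.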
For a continuous random variable, the probability of any single value is zero:

\[ P(X = a) = \int_{a}^{a} f(x)\,dx = 0 \]

so including or excluding the endpoints of an interval does not change its probability:

\[ P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) \]

[Figure: $P(a \le X \le b)$ is the area below the curve f(x) between a and b]

The density function doesn't have to be symmetric, or be defined for all values. The form of the curve will depend on one or more parameters, $f_X(x \mid \beta)$.

If we measure a continuous variable and represent the values in a histogram, and then make the intervals smaller and smaller, the histogram approaches the curve f(x).

[Figure: histograms with progressively narrower intervals approaching the density f(x)]

Example

The density function of the use of a machine in a year (in hours ×100) is:

\[ f(x) = \begin{cases} \dfrac{0.4}{2.5}\,x, & 0 < x < 2.5 \\[4pt] 0.8 - \dfrac{0.4}{2.5}\,x, & 2.5 \le x < 5 \\[4pt] 0, & \text{elsewhere} \end{cases} \]

[Figure: triangular density on (0, 5) with maximum 0.4 at x = 2.5]

What is the probability that a randomly selected machine has been used less than 320 hours?

\[ P(X < 3.2) = \int_{0}^{2.5} \frac{0.4}{2.5}\,x\,dx + \int_{2.5}^{3.2} \left( 0.8 - \frac{0.4}{2.5}\,x \right) dx = 0.74 \]

As in the case of discrete random variables, we can define the distribution of a continuous random variable by means of the distribution function:

\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du, \qquad -\infty < x < \infty \]

In the discrete case, the probability function is obtained as the difference of two adjacent values of F(x). In the case of continuous variables:

\[ f(x) = \frac{dF(x)}{dx} \]

The distribution function satisfies the following properties:

• It is non-decreasing: $a < b \Rightarrow F(a) \le F(b)$.
• $F(-\infty) = 0$ and $F(+\infty) = 1$.
• It is right-continuous.

To see that F is non-decreasing, define the disjoint events $\{X \le a\}$ and $\{a < X \le b\}$, so that $\{X \le a\} \cup \{a < X \le b\} = \{X \le b\}$. By the third axiom of probability,

\[ \Pr(X \le b) = \Pr(X \le a) + \Pr(a < X \le b) \]

and by the first axiom $\Pr(a < X \le b) \ge 0$, hence $F(a) \le F(b)$.

The limits follow from the definition:

\[ F(-\infty) = \Pr(X \le -\infty) = \int_{-\infty}^{-\infty} f(x)\,dx = 0, \qquad F(+\infty) = \Pr(X \le +\infty) = \int_{-\infty}^{+\infty} f(x)\,dx = 1 \]

Example

For the density function of the use of a machine in a year (in hours ×100) given above, the distribution function is obtained by integrating piece by piece:

\[ F(x) = \begin{cases} 0, & x \le 0 \\[4pt] \displaystyle\int_{0}^{x} \frac{0.4}{2.5}\,u\,du, & 0 < x < 2.5 \\[4pt] \underbrace{\int_{0}^{2.5} \frac{0.4}{2.5}\,u\,du}_{\Pr(0 < X < 2.5)} + \underbrace{\int_{2.5}^{x} \left( 0.8 - \frac{0.4}{2.5}\,u \right) du}_{\Pr(2.5 \le X < x)}, & 2.5 \le x < 5 \\[4pt] 1 = \Pr(X \le 5), & x \ge 5 \end{cases} \]

that is,

\[ F(x) = \begin{cases} 0, & x \le 0 \\ 0.08\,x^2, & 0 < x < 2.5 \\ -1 + 0.8\,x - 0.08\,x^2, & 2.5 \le x < 5 \\ 1, & x \ge 5 \end{cases} \]

[Figure: P(X < 3.2) shown as the area under f(x) to the left of x = 3.2, and as the value F(3.2)]

3 Characteristic measures of a r.v.

Central measures

In the case of a sample of data, the sample mean allocates a weight of 1/n to each value:

\[ \bar{x} = \frac{1}{n}x_1 + \frac{1}{n}x_2 + \dots + \frac{1}{n}x_n \]

The mean μ or expectation of a r.v. uses the probability as a weight:

\[ \mu = E[X] = \sum_{i} x_i\,p(x_i) \qquad \text{discrete r.v.} \]
\[ \mu = E[X] = \int_{-\infty}^{+\infty} x\,f(x)\,dx \qquad \text{continuous r.v.} \]

Intuitively, the median m is the value that divides the total probability into two equal parts: $P(X \le m) = 0.5$ for a continuous r.v., and $F(m) \ge 0.5$ in general.

[Figure: density split by the median into two areas of 0.5 each]

Example

What is the average time of use of the machines, for the density f(x) of the machine example?

\[ E[X] = \int_{-\infty}^{+\infty} x\,f(x)\,dx = \int_{0}^{2.5} \frac{0.4}{2.5}\,x^2\,dx + \int_{2.5}^{5} \left( 0.8\,x - \frac{0.4}{2.5}\,x^2 \right) dx = 2.5 \]

that is, 250 hours.
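The machine example can be checked numerically. The sketch below codes the density and the closed-form distribution function from the example, and compares F(3.2) and E[X] with a simple midpoint-rule integration; `integrate` is a hypothetical helper written for this illustration, not a library routine.

```python
def f(x):
    """Density of machine use (hours x 100) from the example."""
    if 0 < x < 2.5:
        return 0.4 / 2.5 * x
    if 2.5 <= x < 5:
        return 0.8 - 0.4 / 2.5 * x
    return 0.0

def F(x):
    """Distribution function obtained by integrating f piece by piece."""
    if x <= 0:
        return 0.0
    if x < 2.5:
        return 0.08 * x**2
    if x < 5:
        return -1 + 0.8 * x - 0.08 * x**2
    return 1.0

def integrate(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

p_num = integrate(f, 0, 3.2)                    # numerical P(X < 3.2)
mean_num = integrate(lambda x: x * f(x), 0, 5)  # numerical E[X]
print(round(F(3.2), 4), round(p_num, 4), round(mean_num, 4))
```

The closed form gives F(3.2) = 0.7408, which the slides round to 0.74.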
Example

If we want to know the time of use such that 50% of the machines have a use less than or equal to that value, we look for the median m, which satisfies F(m) = 0.5 with

\[ F(x) = \begin{cases} 0.08\,x^2, & 0 < x < 2.5 \\ -1 + 0.8\,x - 0.08\,x^2, & 2.5 \le x < 5 \\ 1, & x \ge 5 \end{cases} \]

Both branches give the same answer:

\[ 0.08\,x^2 = 0.5 \;\rightarrow\; m = 2.5, \qquad -1 + 0.8\,x - 0.08\,x^2 = 0.5 \;\rightarrow\; m = 2.5 \]

Other measures

The percentile p of a random variable is the value $x_p$ that satisfies:

\[ \Pr(X < x_p) \le p \ \text{ and } \ \Pr(X \le x_p) \ge p \qquad \text{discrete r.v.} \]
\[ F(x_p) = p \qquad \text{continuous r.v.} \]

A special case is the quartiles, which divide the distribution into 4 parts:

\[ Q_1 = x_{0.25}, \qquad Q_2 = x_{0.5} = \text{Median}, \qquad Q_3 = x_{0.75} \]

Measures of dispersion

The sample variance of a set of data is given by:

\[ s^2 = \frac{1}{n}(x_1 - \bar{x})^2 + \frac{1}{n}(x_2 - \bar{x})^2 + \dots + \frac{1}{n}(x_n - \bar{x})^2 \]

The variance of a r.v. also uses the probability as a weight:

\[ \mathrm{Var}[X] = E\left[ (X - E[X])^2 \right] \]
\[ \sigma^2 = \mathrm{Var}[X] = \sum_{i} (x_i - \mu)^2\,p(x_i) \qquad \text{discrete r.v.} \]
\[ \sigma^2 = \mathrm{Var}[X] = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx \qquad \text{continuous r.v.} \]

An equivalent expression is $\mathrm{Var}[X] = E[X^2] - (E[X])^2$, since

\[ E\left[ (X - E[X])^2 \right] = E\left[ X^2 + (E[X])^2 - 2XE[X] \right] = E[X^2] + (E[X])^2 - 2E[X]E[X] = E[X^2] - (E[X])^2 \]
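For the machine example, the quartiles and the variance can be approximated numerically. The sketch below solves F(x_p) = p by bisection and computes Var[X] as E[X²] − (E[X])² with a midpoint-rule integral. The helpers `percentile` and `integrate` are hypothetical names, and bisection is just one convenient way of inverting F; it assumes F is continuous and increasing on [0, 5].

```python
def f(x):
    """Density of machine use (hours x 100) from the example."""
    if 0 < x < 2.5:
        return 0.4 / 2.5 * x
    if 2.5 <= x < 5:
        return 0.8 - 0.4 / 2.5 * x
    return 0.0

def F(x):
    """Distribution function of the machine example."""
    if x <= 0:
        return 0.0
    if x < 2.5:
        return 0.08 * x**2
    if x < 5:
        return -1 + 0.8 * x - 0.08 * x**2
    return 1.0

def percentile(F, p, lo=0.0, hi=5.0, tol=1e-10):
    """Solve F(x_p) = p by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def integrate(g, a, b, n=200_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

q1, median, q3 = (percentile(F, p) for p in (0.25, 0.5, 0.75))
mean = integrate(lambda x: x * f(x), 0, 5)
second_moment = integrate(lambda x: x**2 * f(x), 0, 5)
variance = second_moment - mean**2      # Var[X] = E[X^2] - (E[X])^2
print(round(q1, 4), round(median, 4), round(q3, 4), round(variance, 4))
```

Because this density is symmetric around 2.5, the first and third quartiles lie at the same distance from the median.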
In the derivation of $\mathrm{Var}[X] = E[X^2] - (E[X])^2$, E[X] is a constant that does not depend on X, and the expectation is a linear operator.

4 Transformation of random variables

In some situations we will need to know the probability distribution of a transformation of a random variable.

Examples
• Change units
• Use a logarithmic scale

Typical transformations Y = h(X) include $aX + b$, $\sin X$, $1/X$, $e^X$, $|X|$, $\log X$, $X^2$ and $\sqrt{X}$.

Let X be a r.v. If we change to Y = h(X), we obtain a new r.v.

Distribution function

\[ F_Y(y) = \Pr(Y \le y) = \Pr(h(X) \le y) = \Pr(X \in A), \qquad A = \{ x : h(x) \le y \} \]

Example

A company packs microchips in lots. It is known that the probability distribution of the number of microchips per lot is given by:

x     p(x)   F(x)
11    0.03   0.03
12    0.03   0.06
13    0.03   0.09
14    0.06   0.15
15    0.26   0.41
16    0.09   0.50
17    0.12   0.62
18    0.21   0.83
19    0.14   0.97
20    0.03   1.00

What is $\Pr(X^2 \le 144)$?

\[ \Pr(X^2 \le 144) = \Pr(X \in A), \qquad A = \{ x : x^2 \le 144 \} \]

Since X only takes positive values, A corresponds to $X \le 12$, so

\[ \Pr(X^2 \le 144) = \Pr(X \le 12) = 0.06 \]

Density function

If h is continuous and monotonic increasing:

\[ F_Y(y) = \Pr(h(X) \le y) = \Pr(X \le h^{-1}(y)) = F_X(h^{-1}(y)) \]

If h is continuous and monotonic decreasing:

\[ F_Y(y) = \Pr(h(X) \le y) = \Pr(X \ge h^{-1}(y)) = 1 - F_X(h^{-1}(y)) \]

Differentiating with respect to y, with $x = h^{-1}(y)$:

\[ f_Y(y) = \frac{\partial F_Y(y)}{\partial y} = \frac{\partial F_X(h^{-1}(y))}{\partial y} = \begin{cases} \dfrac{\partial F_X(x)}{\partial x}\,\dfrac{dx}{dy} = f_X(x)\,\dfrac{dx}{dy}, & h \text{ increasing} \\[8pt] \dfrac{\partial (1 - F_X(x))}{\partial x}\,\dfrac{dx}{dy} = -f_X(x)\,\dfrac{dx}{dy}, & h \text{ decreasing} \end{cases} \]

In the decreasing case $dx/dy < 0$, so both cases can be written together. In general, if X is a continuous r.v. and Y = h(X), where h is differentiable and injective:

\[ f_Y(y) = f_X(x) \left| \frac{dx}{dy} \right|, \qquad x = h^{-1}(y) \]

For discrete r.v.:

\[ p_Y(y) = \Pr(Y = y) = \sum_{x_i :\, h(x_i) = y} \Pr(X = x_i) \]

Example

The velocity of a gas particle is a r.v. V with density function

\[ f_V(v) = \begin{cases} \dfrac{b^3}{2}\,v^2 e^{-bv}, & v > 0 \\[4pt] 0, & \text{elsewhere} \end{cases} \]

where the constant $b^3/2$ makes the density integrate to 1, since $\int_0^{\infty} v^2 e^{-bv}\,dv = 2/b^3$. The kinetic energy of the particle is $W = mV^2/2$. What is the density function of W?

Solving for v gives $v = \pm\sqrt{2w/m}$. Since $v > 0$, only the positive root $v = \sqrt{2w/m}$ applies, and

\[ \frac{dv}{dw} = \frac{1}{\sqrt{2mw}} \]

\[ f_V(h^{-1}(w)) = \frac{b^3}{2} \left( \sqrt{2w/m} \right)^2 e^{-b\sqrt{2w/m}} \]

Multiplying both factors:

\[ f_W(w) = \begin{cases} \dfrac{b^3}{2m}\,\sqrt{2w/m}\; e^{-b\sqrt{2w/m}}, & w > 0 \\[4pt] 0, & \text{elsewhere} \end{cases} \]

Expectation

For Y = h(X):

\[ E[h(X)] = \int_{-\infty}^{+\infty} h(x)\,f_X(x)\,dx \qquad \text{continuous r.v.} \]
\[ E[h(X)] = \sum_{i} h(x_i)\,\Pr(X = x_i) \qquad \text{discrete r.v.} \]

For example, when h is increasing, the change of variable y = h(x) gives:

\[ E[Y] = \int_{-\infty}^{+\infty} y\,f_Y(y)\,dy = \int_{-\infty}^{+\infty} h(x)\,f_X(x)\,\frac{dx}{dy}\,dy = \int_{-\infty}^{+\infty} h(x)\,f_X(x)\,dx \]

Linear transformations

\[ Y = a + bX, \qquad E[Y] = a + bE[X], \qquad \mathrm{Var}[Y] = b^2\,\mathrm{Var}[X] \]
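The discrete transformation rule and the linear-transformation formulas can be checked on the microchip table. This is a minimal sketch: `transform`, `mean` and `var` are hypothetical helper names, and the constants a = 2, b = 3 are arbitrary values chosen only for illustration.

```python
from fractions import Fraction

# Probability function of the number of microchips per lot (table from the example),
# stored as exact fractions: 0.03 -> 3/100, 0.26 -> 26/100, and so on.
pmf_X = {x: Fraction(p, 100) for x, p in
         zip(range(11, 21), (3, 3, 3, 6, 26, 9, 12, 21, 14, 3))}

def transform(pmf, h):
    """p_Y(y) = sum of Pr(X = x_i) over all x_i with h(x_i) = y."""
    out = {}
    for x, p in pmf.items():
        out[h(x)] = out.get(h(x), Fraction(0)) + p
    return out

def mean(pmf):
    """E[X] = sum of x_i * p(x_i)."""
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    """Var[X] = sum of (x_i - mu)^2 * p(x_i)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Pr(X^2 <= 144), computed from the distribution of Y = X^2.
pmf_sq = transform(pmf_X, lambda x: x ** 2)
p_sq = sum(p for y, p in pmf_sq.items() if y <= 144)

# Linear transformation Y = a + bX; a and b are arbitrary illustrative constants.
a, b = 2, 3
pmf_lin = transform(pmf_X, lambda x: a + b * x)
print(p_sq, mean(pmf_lin) == a + b * mean(pmf_X), var(pmf_lin) == b**2 * var(pmf_X))
```

With exact fractions, both identities hold as equalities rather than approximately, and Pr(X² ≤ 144) comes out as 3/50 = 0.06, matching the example.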