Statistics II & III

Chapter 8: The Binomial and Geometric distributions

Binomial Distributions

Binomial Formula

Binomial Setting

1) Each observation has only 2 outcomes, "success" and "failure".
2) There is a fixed number of observations, n.
3) The n observations are all independent.
4) The probability of success, p, is the same for each observation.
It is important to recognize in which situations binomial distributions can and cannot be used.

Sampling Distribution of a count

Choose an SRS (simple random sample) of size n from a population with proportion p of successes. When the population is much larger than the sample, the count X of successes in the sample has approximately the binomial distribution with parameters n and p.

Binomial Probability

If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, ..., n.
P(X=k) = C(n,k) * p^k * (1-p)^(n-k)
C(n,k), read "n choose k", is known as the binomial coefficient. It counts the number of ways in which k successes can be distributed among n observations.

Probability distribution function (pdf)

Given a discrete random variable X, the pdf assigns a probability to each value of X.
Calculator function: tistat.binomPdf(n,p,X)

Example

Binomial Distribution.
n = 10000 balls
p = 0.2 are white balls
SRS of 10 balls
What is the probability that there are exactly 2 white balls?
P(X=2) = C(10,2) * (0.2)^2 * (0.8)^8 = 0.30199
tistat.binomPdf(10,0.2,2) = 0.30199
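
For readers without the TI calculator, the same value can be checked in Python; a minimal sketch (using scipy is my assumption, not part of the original notes):

    from scipy.stats import binom

    # P(X = 2) for n = 10 observations with success probability p = 0.2
    print(binom.pmf(2, 10, 0.2))  # ≈ 0.30199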

Cumulative distribution function (cdf)

Given a random variable X, the cdf of X calculates the sum of the probabilities for 0, 1, 2, ... up to X. It gives the probability of obtaining at most X successes in n trials.
Calculator function: tistat.binomCdf(n,p,X)

Example

Binomial Distribution.
p = 0.06 are out of shape
SRS of 20 bears
What is the probability that there are more than 3 such bears?
P(X>3) = 1 - P(X=0) - P(X=1) - P(X=2) - P(X=3) = 0.028966
tistat.binomCdf(20,0.06,4,20) = 0.028966
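
The same tail probability in Python, as a hedged sketch (scipy assumed; binom.sf(k, n, p) returns P(X > k)):

    from scipy.stats import binom

    # P(X > 3) for n = 20, p = 0.06
    print(binom.sf(3, 20, 0.06))  # ≈ 0.028966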

Example

Large number of red and white balls; 25% are red.
If balls are picked at random, what is the least number of balls to be picked so that the probability of getting at least 1 red ball is greater than 0.95?
Let X = number of red balls.
P(X≥1) = 1 - P(X=0) = 1 - (0.75)^n
1 - (0.75)^n > 0.95
(0.75)^n < 0.05
n > 10.4133
n = 11
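
A quick brute-force check of this answer in Python (a sketch; the loop simply finds the smallest n satisfying the inequality):

    # smallest n with 1 - 0.75**n > 0.95
    n = 1
    while 1 - 0.75**n <= 0.95:
        n += 1
    print(n)  # 11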

Binomial Mean and Standard Deviation

µ = np
σ = √(np(1-p))

Normal Approximation to Binomial Distributions

When n is large, the distribution of X is approximately Normal.
Can be used when np ≥ 10 and n(1-p) ≥ 10.
Most accurate when p is close to 0.5; least accurate when p is near 0 or 1.
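
A small comparison of the exact binomial probability with its Normal approximation, sketched in Python (scipy assumed; the numbers are illustrative):

    from math import sqrt
    from scipy.stats import binom, norm

    n, p = 100, 0.5
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    print(binom.cdf(55, n, p))      # exact P(X <= 55) ≈ 0.8644
    print(norm.cdf(55, mu, sigma))  # Normal approximation ≈ 0.8413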

Geometric Distributions

Conditions

1) Each observation has only 2 outcomes, "success" and "failure".
2) The observations are all independent.
3) The probability of success, p, is the same for each observation.
4) The variable of interest, X, is the number of trials required to obtain the first success.
The number of trials in a geometric setting is not fixed but is the variable of interest.

Calculating Geometric Probabilities

The probability that the first success occurs on the nth trial:
P(X=n) = (1-p)^(n-1) * p
p = probability of success
Calculator functions: tistat.geomPdf(p,n) and tistat.geomCdf(p,n)

Geometric Mean and Standard Deviation

Mean: µ = 1/p
Variance: σ^2 = (1-p)/p^2
The probability that it takes more than n trials to see the first success is P(X>n) = (1-p)^n
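
These facts can be checked with scipy's geometric distribution; a sketch (scipy assumed, and p = 0.2 is illustrative):

    from scipy.stats import geom

    p = 0.2
    print(geom.pmf(3, p))  # P(first success on trial 3) = 0.8**2 * 0.2 = 0.128
    print(geom.sf(5, p))   # P(X > 5) = 0.8**5 = 0.32768
    print(geom.mean(p))    # 1/p = 5.0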

Chapter 9: Sampling Distributions

Parameter and statistic

A parameter is a number that describes the population: µ, p, σ.
A statistic is a number that can be computed from the sample data without the use of any unknown parameters: x-bar, p-hat, s.

Sampling

Sampling distribution

Distribution of values taken by the statistic in all possible samples of the same size from the same population.

Example

9.5, 9.6, 9.7 (pages 571-573)
When describing a histogram:
Center: the center of the distribution is very close to the true value of p.
Shape: the overall shape is roughly symmetric and approximately Normal.
Spread: the values of p-hat range from 0.2 to 0.55.
Since the shape is approximately Normal, we can use the standard deviation to describe the spread.

Sample proportion

We often take an SRS of size n and use p-hat to estimate the unknown parameter p.
The mean of the sampling distribution is p.
The standard deviation of the sampling distribution is √((p(1-p))/n).

Conditions

The formula for the standard deviation of p-hat is used only when the population is at least 10 times as large as the sample.
The Normal approximation is used when np ≥ 10 and n(1-p) ≥ 10.

Sample mean

Mean of x-bar: µ
Standard deviation of x-bar: σ/√n

Conditions

1. The formula for the standard deviation of x-bar is used only when the population is at least 10 times as large as the sample.
2. The facts above about the mean and standard deviation of x-bar are true no matter what the population distribution looks like.
3. The shape of the distribution of x-bar depends on the shape of the population distribution. In particular, if the population distribution is Normal, then the sampling distribution of x-bar is also Normal.

Bias and variability

Bias: how far the mean of the sampling distribution is from the true value of the parameter being estimated.
Variability: the spread of the sampling distribution. Larger samples give a smaller spread.

Central Limit Theorem

For a large sample size (n > 30), the sampling distribution of x-bar is approximately Normal for any population with a finite standard deviation.
The mean is given by µ and the standard deviation by σ/√n.
The sample size n needed depends on the population. More observations are required if the shape is skewed.
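
A short simulation illustrating the CLT, sketched in Python with numpy (assumed; the skewed exponential population and the sample size are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    # 10,000 sample means, each from a sample of size 50 drawn from a
    # right-skewed exponential population with mean 2
    means = rng.exponential(scale=2.0, size=(10_000, 50)).mean(axis=1)
    print(means.mean())  # ≈ µ = 2.0
    print(means.std())   # ≈ σ/√n = 2/√50 ≈ 0.283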

Chapter 10: Estimating with confidence

Confidence intervals

A range of plausible values that is likely to contain the unknown population parameter.
Generated using a set of sample data.

Confidence level

The confidence level C gives the probability that the interval will capture the true parameter value in repeated samples.

Confidence interval for a population mean

Known σ

x.bar ± z*(σ/√n)

Conditions

1. The sample must be an SRS from the population of interest.
2. The sampling distribution of the sample mean x-bar is at least approximately Normal. If the population distribution is not Normal, the central limit theorem tells us that it is approximately Normal if n is large.
3. Individual observations are independent.
4. The population size is at least 10 times as large as the sample size.

Procedure for inference with Confidence Intervals

1. State the parameter of interest.
2. Name the inference procedure and check conditions.
3. Calculate the confidence interval.
4. Interpret the results in the context of the problem.

Reducing Margin of error

The margin of error decreases when:
The confidence level C decreases (z* gets smaller)
The population standard deviation decreases
The sample size increases

Unknown σ

x.bar ± t*(s/√n)
df = n - 1
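
A minimal Python sketch of this t interval (scipy and numpy assumed; the data values are made up for illustration):

    import numpy as np
    from scipy import stats

    data = np.array([4.8, 5.1, 5.6, 4.9, 5.3, 5.0])  # illustrative sample
    n = len(data)
    t_star = stats.t.ppf(0.975, df=n - 1)            # 95% critical value
    me = t_star * data.std(ddof=1) / np.sqrt(n)      # margin of error
    print(data.mean() - me, data.mean() + me)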

Conditions

SRS: The data are an SRS of size n from the population of interest or come from a randomised experiment.
Normality: Observations from the population have a Normal distribution. It is enough that the distribution be symmetric and single-peaked.
Independence: Individual observations are independent. The population size should be at least 10 times the sample size.

t-distributions

When σ is unknown, we substitute the sample standard deviation s for σ; the result, s/√n, is the standard error of x-bar.
The resulting distribution is not Normal; it is a t distribution.
There is a different t distribution for each sample size n. We specify a particular t distribution by giving its degrees of freedom (df).
The density curves of the t distributions are similar in shape to the standard Normal curve, but the spread of the t distributions is a bit greater than that of the standard Normal distribution.
As the degrees of freedom increase, the t(k) density curve approaches the N(0,1) curve ever more closely.
The t interval is exactly correct when the population distribution is Normal and approximately correct for large n in other cases.

Robustness of t procedures

Procedures that are not strongly affected by a lack of Normality are called robust.
t procedures are not robust against outliers.
They are quite robust against non-Normality of the population when there are no outliers, even if the distribution is asymmetric.
Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal. This is because of the central limit theorem.

Using the t procedures

Except in the case of small samples, the assumption that the data are an SRS from the population of interest is more important than the assumption that the population distribution is Normal.
Sample size less than 15: use t procedures if the data are close to Normal.
Sample size at least 15: the t procedures can be used except in the presence of outliers or strong skewness.
Large samples: the t procedures can be used even for clearly skewed distributions when the sample is large (central limit theorem).

Paired t procedures

Used for matched pairs designs or before-and-after measurements on the same subjects; apply one-sample t procedures to the observed differences.

Estimating a population proportion

p.hat ± z*√((p.hat(1-p.hat))/n)
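
A hedged Python sketch of this interval (scipy assumed; the counts are illustrative):

    from math import sqrt
    from scipy.stats import norm

    # 95% CI for p with 520 successes in a sample of 1012
    p_hat, n = 520 / 1012, 1012
    z_star = norm.ppf(0.975)                    # ≈ 1.96
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)
    print(p_hat - me, p_hat + me)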

Conditions

SRS: The data are an SRS from the population of interest.
Normality: For a confidence interval, n is so large that both n(p.hat) and n(1-p.hat) are 10 or more.
Independence: Individual observations are independent. When sampling without replacement, the population is at least 10 times as large as the sample.

Choosing the sample size

The margin of error involves the sample proportion of successes, so we need to guess this value when choosing n.
The guess is called p*.
Use a guess p* based on a pilot study or past experience with similar studies.
Or use p* = 0.5 as the guess: the margin of error is largest when p-hat = 0.5.
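
Choosing n for a target margin of error m, sketched in Python (scipy assumed; m = 0.03 and the conservative guess p* = 0.5 are illustrative):

    from math import ceil
    from scipy.stats import norm

    m, p_star = 0.03, 0.5
    z_star = norm.ppf(0.975)                             # 95% confidence
    n = ceil((z_star / m) ** 2 * p_star * (1 - p_star))
    print(n)  # 1068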

Chapter 11: Testing a claim

The Basics

Basic:

An outcome that would rarely happen if a claim were true is good evidence that the claim is false.
The results of a test are expressed in terms of a probability that measures how well the data and the hypothesis agree.

P-value

Rule of thumb: α = 0.05 unless otherwise stated.
A result with a small P-value (less than α) is called statistically significant.

Large P-value

Large P-values fail to give evidence against H0.

Small P-value

Small P-values are evidence against H0 because they say that the observed result is unlikely to occur just by chance.

Hypotheses

We can have one-sided or two-sided alternative hypotheses.

null hypothesis

The null hypothesis is the statement that this effect is not present in the population.

alternative hypothesis

The alternative hypothesis states that it is present in the population.

Conditions for significance tests

SRS from the population of interest.
Normality: np > 10 and n(1-p) > 10.
Independent observations.

Test statistic

The test is based on a statistic that compares the value of the parameter as stated in the null hypothesis with an estimate of the parameter from the sample data.
Values of the estimate far from the parameter value in the direction specified by the alternative hypothesis give evidence against H0.
Standardise the estimate:
test statistic = (estimate - hypothesised value) / (standard deviation of the estimate)

Carrying out significance tests

General procedure

1. Hypotheses: Identify the population of interest and the parameter you want to draw conclusions about.
2. Conditions: Choose the appropriate inference procedure. Verify the conditions for using it.
3. Calculations: Calculate the test statistic and the P-value.
4. Interpretation: Interpret your results in the context of the problem.

z-test for population mean

z = (x.bar - µ0)/(σ/√n)

P-value

µ > µ0: P(Z > z)
µ < µ0: P(Z < z)
µ ≠ µ0: 2P(Z ≥ |z|)
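
A minimal Python sketch of a two-sided z test (scipy assumed; all numbers are illustrative, with σ treated as known):

    from math import sqrt
    from scipy.stats import norm

    x_bar, mu0, sigma, n = 51.2, 50.0, 4.0, 64
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * norm.sf(abs(z))   # two-sided: 2P(Z >= |z|)
    print(z, p_value)               # z = 2.4, P ≈ 0.0164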

Interpretation

These P-values are exact if the population distribution is Normal and are approximately correct for large n in other cases.
Failing to find evidence against H0 means only that the data are consistent with H0, not that we have clear evidence that H0 is true.

Confidence intervals and two-sided tests

A level α two-sided significance test rejects H0: µ = µ0 exactly when the value µ0 falls outside a level 1-α confidence interval for µ.
The link between two-sided significance tests and confidence intervals is called duality.
For a two-sided hypothesis test for the mean, a significance test (level α) and a confidence interval (level C = 1-α) will yield the same conclusion.

Importance of Significance

Choosing a level of significance

There is no sharp border between "statistically significant" and "statistically insignificant"; giving the P-value allows each of us to decide individually if the evidence is sufficiently strong.

Statistical significance and practical importance

A statistically significant effect need not be practically important.
Use confidence intervals to estimate the actual values of parameters: confidence intervals estimate the size of an effect rather than simply asking whether it is too large to reasonably occur by chance alone.

Don't ignore lack of significance

There is a tendency to infer that there is no effect whenever a P-value fails to attain the usual 5% standard.
Lack of significance does not imply that H0 is true.
In some areas of research, small effects that are detectable only with large sample sizes can be of great practical significance.

Statistical inference is not valid for all sets of data

Badly designed surveys or experiments often produce invalid results.
Faulty data collection, outliers in the data, and testing a hypothesis on the same data that suggested it can invalidate a test.
Beware of multiple analyses: many tests run at once will probably produce some significant results by chance alone, even if all the null hypotheses are true.

Using inference to make decisions

Type I error

Reject H0 when H0 is actually true.

Significance

The significance level α of any fixed level test is the probability of a Type I error.
α is the probability that the test will reject the null hypothesis when it is in fact true.

Type II error

Fail to reject H0 when H0 is false.

Probability

1. Calculate when the test stops accepting H0 (i.e. find the critical value of the statistic).
2. Standardise that critical value using the distribution based on the alternative hypothesis to find the probability.

Power

The probability that a fixed level α test will reject H0 when a particular alternative value of the parameter is true is called the power of the test against that alternative.

Increasing power

Increase α.
Consider a particular alternative farther away from the mean.
Increase the sample size (this decreases the standard error).
Decrease σ.

Chapter 12: Tests about a population mean

One-proportion z test

z = (p.hat - p0)/√((p0(1-p0))/n)

Conditions

Normality condition: np0 ≥ 10 and n(1-p0) ≥ 10

Alternative hypotheses

p > p0: P(Z > z)
p < p0: P(Z < z)
p ≠ p0: 2P(Z ≥ |z|)
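
A Python sketch of this one-proportion z test (scipy assumed; the counts are illustrative):

    from math import sqrt
    from scipy.stats import norm

    # Two-sided test of H0: p = 0.5 with 520 successes in 1012 trials
    p_hat, p0, n = 520 / 1012, 0.5, 1012
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    print(z, 2 * norm.sf(abs(z)))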

One-sample t test

t = (x.bar - µ0)/(s/√n)

Alternative hypotheses

µ > µ0: P(T > t)
µ < µ0: P(T < t)
µ ≠ µ0: 2P(T ≥ |t|)
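
scipy provides this test directly; a minimal sketch (the data are made up for illustration; the default alternative is two-sided):

    import numpy as np
    from scipy import stats

    data = np.array([48.5, 51.2, 49.8, 50.6, 47.9, 52.3])
    t_stat, p_value = stats.ttest_1samp(data, popmean=50.0)
    print(t_stat, p_value)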

Chapter 13: Comparing two population parameters

Conditions

SRS: We have two SRSs from two distinct populations.
Independence: The samples are independent; that is, one sample has no influence on the other. When sampling without replacement, each population must be at least 10 times as large as the corresponding sample size.
Normality: Both populations are Normally distributed.

Two-sample tests

Two-sample z statistic

z = ((x.bar_1 - x.bar_2) - (µ_1 - µ_2)) / √((σ_1^2)/n_1 + (σ_2^2)/n_2)

Two-sample t procedure

(x.bar_1 - x.bar_2) ± t*√((s_1^2)/n_1 + (s_2^2)/n_2)
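
A Python sketch of the corresponding two-sample t test (scipy assumed; equal_var=False gives the unpooled statistic matching the formula above; the data are illustrative):

    import numpy as np
    from scipy import stats

    g1 = np.array([12.1, 14.3, 13.5, 12.8, 15.0])
    g2 = np.array([10.9, 12.0, 11.4, 13.1, 11.8])
    t_stat, p_value = stats.ttest_ind(g1, g2, equal_var=False)
    print(t_stat, p_value)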

Two-proportion z interval

(p.hat_1 - p.hat_2) ± z*√((p.hat_1(1-p.hat_1))/n_1 + (p.hat_2(1-p.hat_2))/n_2)

Two-proportion z test

z = (p.hat_1 - p.hat_2)/√(p.hat_c(1-p.hat_c)(1/n_1 + 1/n_2))
where p.hat_c is the combined (pooled) proportion of successes in the two samples together.
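
A Python sketch of the two-proportion z test with the pooled proportion (scipy assumed; the counts are illustrative):

    from math import sqrt
    from scipy.stats import norm

    x1, n1, x2, n2 = 60, 200, 45, 180
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)   # pooled (combined) proportion
    z = (p1 - p2) / sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    print(z, 2 * norm.sf(abs(z)))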

Robustness

More robust than the one-sample t methods, particularly when the distributions are not symmetric.
Choose equal sample sizes if possible.
n1 and n2 must both be at least 5.
If n1 + n2 > 30, the two-sample t procedures can be used even for skewed distributions.

Chapter 14: Chi-square procedures

Chi-square test for goodness of fit

Hypotheses

H0: The actual population proportions are equal to the hypothesised proportions.
Ha: At least one of the actual population proportions differs from the hypothesised proportions.

Calculations

X^2 = ∑ (O - E)^2 / E
df = k - 1
P-value = P(χ^2 > X^2)
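
A minimal goodness-of-fit sketch in Python (scipy assumed; the observed and expected counts are illustrative, and their totals must be equal):

    from scipy.stats import chisquare

    observed = [18, 22, 39, 21]
    expected = [25, 25, 25, 25]   # counts expected under H0
    stat, p_value = chisquare(observed, f_exp=expected)
    print(stat, p_value)          # df = k - 1 = 3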

Chi-square distributions

The total area under a chi-square curve is equal to 1.
Each chi-square curve (except when degrees of freedom = 1) begins at 0 on the horizontal axis, increases to a peak, and approaches the horizontal axis asymptotically from above.
Each chi-square curve is skewed to the right. As the number of degrees of freedom increases, the curve becomes more symmetric and looks more like a Normal curve.

Chi-square test and the z test

We can compare two proportions using the z test.
The chi-square statistic is the square of the z statistic, and the P-value for chi-square is the same as the two-sided P-value for z.

Uses

The z test is preferred for comparing two proportions because it gives the choice of a one-sided test and is related to a confidence interval for p1 - p2.

Chi-square test for homogeneity of populations

Select an SRS from each of c populations. Each individual is classified according to a categorical response variable with r possible values. There are c different sets of proportions to be compared, one for each population.

Hypotheses

The null hypothesis is that the distribution of the response variable is the same in all c populations. The alternative hypothesis is that these c distributions are not all the same.

Implications

If H0 is true, the chi-square statistic has approximately a chi-square distribution with (r-1)(c-1) degrees of freedom.

Conditions

No more than 20% of the expected counts are less than 5, and all individual expected counts are at least 1.
All counts in a 2 × 2 table should be at least 5.
Expected count = (row total × column total) / n
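
For an r × c table, scipy computes the statistic, the P-value, and the expected counts in one call; a hedged sketch with illustrative counts:

    import numpy as np
    from scipy.stats import chi2_contingency

    table = np.array([[30, 10, 15],
                      [20, 25, 20]])
    stat, p_value, df, expected = chi2_contingency(table)
    print(stat, p_value, df)   # df = (r-1)(c-1) = 2
    print(expected)            # (row total x column total) / n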

Chi-square test of association & independence

Hypotheses

H0: There is no association between the two categorical variables.
Ha: There is an association between the two categorical variables.

Uses

A two-way table from a single SRS, with each individual classified according to both categorical variables.

Chapter 15: Inference for regression

The slope b and intercept a of the least-squares regression line are statistics.

Conditions

Repeated responses y are independent of each other.
Scatterplot: the overall pattern is roughly linear. The residual plot has a random pattern.
The standard deviation σ of y (σ is unknown) is the same for all values of x.
For any fixed value of x, the response y varies according to a Normal distribution.

Calculations

degrees of freedom = n - 2
Residual = observed y - predicted y
Standard error: s = √(∑(y - y.hat)^2 / (n-2))
Confidence interval for the slope: b ± t*(s/√(∑(x - x.bar)^2))
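
These quantities can be obtained in Python with scipy's linregress; a sketch on made-up data:

    import numpy as np
    from scipy import stats

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8])
    res = stats.linregress(x, y)
    print(res.slope, res.intercept)  # b and a
    print(res.stderr)                # standard error of the slope
    print(res.pvalue)                # two-sided P-value for H0: slope = 0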

Significance tests for regression slope

Hypotheses

H0: β = 0
This H0 says that there is no true linear relationship between x and y.
This H0 also says that there is no correlation between x and y.
Testing correlation makes sense only if the observations are a random sample.

t-statistics

t = (b √(∑(x - x.bar)^2)) / s

Poisson Distribution

Conditions

1) The events occur singly and randomly.
2) The events occur uniformly (at a constant average rate).
3) The events occur independently.
4) The probability of more than one occurrence within a sufficiently small fixed interval is negligible.

Mean & variance

If X ~ Po(λ), then E(X) = λ and Var(X) = λ.
λ = the average number of occurrences.

Additive Property of the Poisson distribution

If X and Y are independent Poisson random variables with X ~ Po(λ) and Y ~ Po(µ), then
X + Y ~ Po(λ + µ)

Approximating the Binomial distribution with the Poisson distribution

Given X ~ B(n,p) such that n is large (> 50) and np < 5 (normally p < 0.1), the binomial distribution can be approximated by the Poisson distribution with mean λ = np.
The approximation is more accurate when n gets larger and p gets smaller.
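
A quick numerical comparison of the two distributions in Python (scipy assumed; n, p and k are illustrative):

    from scipy.stats import binom, poisson

    n, p = 100, 0.03
    lam = n * p                  # λ = np = 3
    print(binom.pmf(2, n, p))    # exact: ≈ 0.225
    print(poisson.pmf(2, lam))   # Poisson approximation: ≈ 0.224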

Chapter 1: Exploring data

Categorical data

Pie charts, dotplots, bar charts

Qualitative categories

Quantitative data

Numerical data

Histogram

Area represents the size of the data.
Relative frequencies on the vertical axis.

Stemplots

Cumulative frequency plot (Ogive)

Description of graphical display

Mode

Center

Mean
Median

Spread

Range
Percentile
Interquartile Range (IQR)
Box plots (five-number summary)
Variance
Standard deviation, s

Clusters

Gaps

Outliers

Shape

Symmetric
Skewed (spreads far and thinly)
Uniform
Bell-shaped

Resistant measure

The median is not affected by outlier values.

Changing units of measure

Linear transformations

Transformed variables

Under the linear transformation x → a + bx:
mean: a + bx
median: a + bM
standard deviation: |b|s
IQR: |b|R

Comparing distributions

Side-by-side graphs
Back-to-back stemplots
Narrative comparisons

Chapter 2: Describing location in a distribution

Z-score

z = (x - x.bar)/s

Percentile

The pth percentile of a distribution is the value with p% of the observations less than or equal to it.

Chebyshev's inequality

The percentage of observations falling within k standard deviations of the mean is at least 100(1 - 1/k^2)%.

Density curves

A mathematical model for the distribution.
A curve that is always on or above the x-axis.
The area underneath it is always exactly 1.

Mean and median

The median is the equal-areas point.
The mean is the balance point.
In a symmetric density curve, the mean and median are the same.

Normal distribution

Probability density function:
f(x) = (1/(σ√(2π))) e^(-(x-µ)^2/(2σ^2))

Empirical rule

68% fall within 1 standard deviation of the mean
95% fall within 2 standard deviations of the mean
99.7% fall within 3 standard deviations of the mean

Standard Normal distribution

For any Normal distribution we can perform a linear transformation to obtain the standard Normal distribution.
If the variable x has the Normal distribution N(µ, σ), then the standardised variable z = (x - µ)/σ has the standard Normal distribution N(0,1).
The area under the standard Normal curve can be found from a standard Normal table or the GC.
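
The table lookup can also be done in Python (scipy assumed; the numbers are illustrative):

    from scipy.stats import norm

    # P(X < 70) for X ~ N(µ = 64.5, σ = 2.5), via the z-score
    z = (70 - 64.5) / 2.5
    print(z, norm.cdf(z))   # z = 2.2, area ≈ 0.9861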

Assessing Normality

On a Normal probability plot:
Close to a straight line → Normal
Systematic deviations → non-Normal

Chapter 3: Examining Relationships

Response and explanatory variables

Response (dependent): measures an outcome of the study.
Explanatory (independent): explains or influences changes in the response variable.

Scatterplot

A scatterplot shows the relationship between 2 quantitative variables.
Explanatory variable on the x-axis.
Response variable on the y-axis.

Interpreting a scatterplot

1. Look for the overall pattern and for striking deviations from that pattern.
2. Describe the pattern by the direction, form and strength of the relationship.
3. Look for outliers.

Associations

Positive: above-average values of one tend to accompany above-average values of the other, and vice versa.
Negative: above-average values of one tend to accompany below-average values of the other.

Correlation

Correlation measures the direction and strength of the linear relationship between two quantitative variables.
r = (1/(n-1)) ∑((x_i - x.bar)/s_x)((y_i - y.bar)/s_y)

Least Squares Regression Line

A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
You can use a regression line to predict the value of y for any value of x.
y.hat = a + bx

Residuals & Residual plot

Residual = observed y - predicted y. The sum of the residuals is 0.
The residual plot should show no obvious pattern.

Standard deviation

To measure the size of a typical prediction error, compare the standard deviation of the residuals to the actual data values.

Coefficient of determination

r^2 = 1 - (∑(y - y.hat)^2)/(∑(y - y.bar)^2)

Outliers and influential observations

Outlier: an observation that lies outside the overall pattern of the other observations.
Influential: an observation is influential if removing it would markedly change the result of the calculation.

Lurking variable

A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among these variables.

Chapter 4: Relationships between two variables

Transforming to achieve linearity

Transform the data so that we are able to apply the least-squares regression line.

Exponential growth model

Exponential growth increases by a fixed percent of the previous total in each equal time period.
y = ab^x
ln y = ln(ab^x) = ln a + x ln b
With intercept c = ln a and gradient m = ln b, plot ln y against x to obtain a straight line with gradient ln b.

Power law model

y = ax^p
ln y = ln a + p ln x
Plot ln y against ln x to obtain a straight line with gradient p.
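
Both transformations reduce to fitting a straight line on logged data; a Python sketch with numpy (assumed; the data roughly follow y = e^x for illustration):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.7, 7.4, 20.1, 54.6, 148.4])

    # Exponential model y = a*b**x: regress ln y on x
    m, c = np.polyfit(x, np.log(y), deg=1)
    a, b = np.exp(c), np.exp(m)
    print(a, b)   # ≈ 1 and ≈ e ≈ 2.718

    # Power model y = a*x**p: regress ln y on ln x instead
    p, c2 = np.polyfit(np.log(x), np.log(y), deg=1)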

Relationship between categorical variables

A two-way table organizes data about 2 categorical variables.
Row and column totals give the marginal distributions (marginal frequencies).
To find the conditional distribution of the row variable for one specific value of the column variable, look only at that one column in the table. Express each entry in the column as a percent of the column total.

Simpson's paradox

Simpson's paradox (or the Yule-Simpson effect) is a statistical paradox wherein the successes of groups seem reversed when the groups are combined.

Explaining association

Causation: a change in x causes a direct change in y.
Common response: the observed association between x and y can be explained by a lurking variable z.
Confounding: the effects of variables cannot be distinguished from each other.
To establish causation, we need to conduct carefully designed experiments.

Chapter 5: Producing data

Observational study

Designing samples

Population: the entire group we want information about.
Sample: the part of the population that we examine.
Sampling: studying a part in order to gain information about the whole.
Census: attempts to contact every individual.
Voluntary response sample: people who choose themselves.
Convenience sampling: choosing individuals who are easiest to reach.

Simple Random Sample (SRS)

Consists of n individuals chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.
1. Label: assign a numerical label to every individual in the population.
2. Table: use the random number table to select labels at random.
3. Stopping rule: indicate when you should stop sampling.
4. Identify sample: use the labels to identify the subjects selected to be in the sample.

Other sampling methods

Probability sample: a sample chosen by chance.
Stratified random sampling: the population is first divided into strata, then an SRS is taken from each stratum.
Cluster sample: divide the population into clusters, then randomly select some clusters.
Multi-stage sampling design.

Cautions

Undercoverage: some groups in the population are left out.
Non-response: individuals do not respond or cooperate.
Response bias: lying.
Wording of questions: confusing and misleading questions.

Experiment

Definition

Deliberately impose some treatment on individuals in order to observe their responses.
Individuals: experimental units, or subjects (humans).
Treatment: the experimental condition applied.
Factors: the explanatory variables.

Control

An effort to minimise variability in the way experimental units are obtained and treated.
Helps reduce problems from confounding and lurking variables.
One group receives the treatment while the other group does not; compare the responses between the 2 groups.

Placebo

See if there is any placebo effect which could have affected the results.

Replication

Even with control, there is still natural variability.
Replication reduces the role of chance variation and increases the sensitivity of the experiment to differences between the treatments.

Randomisation

Ensures that the treatment groups are essentially similar and that there are no systematic differences between them.

Designs

Block design

Block: a group of experimental subjects that are known to be similar in some way that is expected to systematically affect the response to the treatments.
Blocks are a form of control.
Blocks are chosen to reduce variability, based on the likelihood that the blocking variable is related to the response.
Blocks should be formed based on the most important unavoidable sources of variability among the experimental units.
Blocking allows us to draw separate conclusions about each block.

Matched pairs design

An example of a block design.
Compares two treatments, with the subjects matched in pairs.

Cautions

Double-blind experiment

Neither the subjects nor those who measure the response know which treatment a subject received.
Controls the placebo effect.

Lack of realism

We cannot duplicate the exact conditions that we want to study.
This limits our ability to apply conclusions to the settings of greater interest.
Statistical analysis cannot tell us how far the results will generalise to other settings.

Chapter 6: Permutation, combination and probability

Permutation and combination

Addition principle

If there are m objects of one kind and n objects of another, the number of ways of selecting a single object is m + n.

Multiplication principle

If you can do task 1 in m ways and task 2 in n ways, both tasks can be done in m × n ways.

Permutation

The order of the objects is important.
If there are n distinct objects, we have n! ways of arranging all the objects in a row.

Identical objects

With p identical objects of one kind (and q identical objects of another kind) among a total of n objects, the number of ways to arrange all n objects in a row is n!/p! or n!/(p!q!).

Distinct

Arranging r objects chosen from n distinct objects:
nPr = n!/(n-r)!

Combination

The unordered selection of objects from a set.
If there are n distinct objects, then we can select r objects in nCr = n!/(r!(n-r)!) ways.
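
Python's standard library computes both counts directly; a small sketch:

    from math import comb, perm

    print(perm(5, 2))   # 5P2 = 5!/3! = 20
    print(comb(5, 2))   # 5C2 = 5!/(2! * 3!) = 10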

Circular permutation

When objects are arranged in a circle, each object has the same neighbours after a rotation, so rotations are not counted as different arrangements.
We have (n-1)! ways to arrange n distinct objects in a circle.

Probability

Random - individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions.

Probability models

Sample space of a random phenomenon: the set of all possible outcomes.
Event: any outcome or set of outcomes.
Probability model: a mathematical description of a random phenomenon.

Probability of an event with equally likely outcomes

P(A) = n(A)/n(S)
where n(A) is the number of outcomes in event A and n(S) is the number of outcomes in the sample space S.

Independent events

Two events A and B are independent if the chance of one event happening or not happening does not change the probability that the other event occurs.
If A and B are independent, then P(A and B) = P(A) × P(B).

Probability Tree

A diagrammatic representation of the possible outcomes of a series of events.
A probability tree for flipping a coin and getting heads three times in a row would have three levels: the first reflects the chances of throwing heads or tails; the second reflects the chances of throwing heads or tails after the first throw came up heads, and after it came up tails; the third shows the chances of throwing heads or tails after all the possible outcomes of the first two throws.
The probabilities along a series of branches can be multiplied to give the overall probability of a possible event occurring.
The probabilities add up to 1.

Conditional probability

Conditional probability is the probability of some event A, given the occurrence of some other event B. It is written P(A|B) and read "the probability of A, given B".
P(A|B) = P(A ∩ B)/P(B)
If A and B are mutually exclusive, then P(A|B) = 0.
If A and B are independent, then P(A|B) = P(A).

Chapter 7: Random Variables

A variable whose value is a random numerical outcome.

Discrete random variable

The values that might be observed are restricted to a pre-defined list of possible values.

Conditions

All probabilities must add up to 1.
0 ≤ p_k ≤ 1

Probability Histogram

Probability distributions of real-valued random variables

Equations

µ_X = ∑ x_i·p_i (summing over i = 1 to k)
σ_X^2 = ∑ (x_i - µ_X)^2·p_i
Var(X) = E[(X - µ_X)^2]
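
These sums are easy to compute directly; a Python sketch on an illustrative distribution:

    # Mean and variance of a discrete random variable
    xs = [0, 1, 2, 3]
    ps = [0.1, 0.3, 0.4, 0.2]
    mu = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))
    print(mu, var)   # 1.7 and 0.81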

Continuous random variable

Takes all values in an interval of numbers.
For all continuous probability distributions, P(any individual outcome) = 0.

Probability distribution of X is described by a probability distribution function

The total area under the graph is 1.
f(x) ≥ 0
P(a ≤ X ≤ b) = ∫_a^b f(x) dx

Cumulative distribution function

F(x) = ∫_-∞^x f(t) dt

Properties for expectations and variances

E(a) = a
E(aX + b) = aE(X) + b
E(X + Y) = E(X) + E(Y)
Var(a) = 0
Var(aX + b) = a^2 Var(X)
Var(X + Y) = Var(X) + Var(Y)
Var(X - Y) = Var(X) + Var(Y)
(The last two hold when X and Y are independent.)