Apr. 5 Probability and Distributions

Introduction Assignment

Due Date: April 12th, 9:30am

Read: Sampling and Allocation

Guided Summary:

- Define “inferential statistics” in your own words.

Inferential statistics describe the many ways in which statistics derived from observations on samples from study populations can be used to deduce whether or not those populations are truly different.

- Define “probability” as you understand it.

Probability is the ratio of the number of outcomes that produce a given event to the total number of possible outcomes, where the outcomes form an exhaustive set of equally likely outcomes. It is also the branch of mathematics concerned with the study of probabilities.

- Review: Briefly define the following:

Cluster sampling: A probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample. The clusters should ideally each be mini-representations of the population as a whole.

Quota sampling: A type of non-probability sampling method. This means that elements from the population are chosen on a non-random basis, and not all members of the population have an equal chance of being selected to be part of the sample group.

Convenience sampling: Using respondents who are “convenient” to the researcher. There is no pattern whatsoever in acquiring these respondents; they may be recruited merely by asking people who are present in the street, in a public building, or in a workplace, for example.

- Why does allocation need to be random?

Randomization is important because it is almost the only way to balance all the other variables between the groups, so that the groups differ only in the factor (A and B) in which we are interested. However, even with randomization, some very important confounding variables can end up unequally distributed between the two groups by chance.
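The idea of random allocation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the participant list, the `age` covariate, and the function name `randomize` are all hypothetical and do not come from the reading. It shuffles the participants and splits them into groups A and B, then compares the mean age of each group to show that a covariate is usually, but not always, roughly balanced.

```python
import random
from statistics import mean

def randomize(participants, seed=None):
    """Randomly split participants into two groups, A and B."""
    rng = random.Random(seed)
    shuffled = participants[:]      # copy so the input list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants as (id, age) pairs; ages are made up for the demo.
data_rng = random.Random(0)
participants = [(i, data_rng.randint(18, 80)) for i in range(200)]

group_a, group_b = randomize(participants, seed=1)

# Age was not used to form the groups, so any imbalance is due to chance alone.
mean_age_a = mean(age for _, age in group_a)
mean_age_b = mean(age for _, age in group_b)
print(f"Mean age A: {mean_age_a:.1f}, mean age B: {mean_age_b:.1f}")
```

Running this with different seeds gives slightly different group means, which matches the caveat above: randomization balances covariates on average, not in every single allocation.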

Read: Inference and Confidence Intervals (stop at page 12)

Guided Summary:

- Is a larger or smaller confidence interval more precise?

A smaller confidence interval is more precise. A wide confidence interval suggests that the sample does not provide a precise representation of the population mean, whereas a narrow confidence interval demonstrates a greater degree of precision.

- What are the two most common confidence intervals?

There is much confusion over the interpretation of the probability attached to confidence intervals.

- How do you determine the degrees of freedom of a sample? What would the degrees of freedom be for a sample of 80 people?

To understand it we have to resort to the concept of repeated sampling. Imagine taking repeated samples of the same size from the same population.

- Which one changes as the sample size increases?

For each sample, calculate a 95% confidence interval. Since the samples are different, so are the confidence intervals. In the long run, about 95% of these intervals will include the population parameter.
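The repeated-sampling interpretation can be checked with a small simulation. The sketch below is an illustration, not the reading's own method: it assumes a normal population with a known standard deviation (so a z-based interval applies), and the values `mu=50`, `sigma=10`, `n=30`, and the function name `coverage` are hypothetical choices made for the demo. It draws many samples of the same size, builds a 95% confidence interval from each, and reports the fraction of intervals that contain the true mean.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=30, repeats=2000, level=0.95, seed=0):
    """Fraction of confidence intervals that contain the true mean mu."""
    rng = random.Random(seed)
    # Critical value for a two-sided interval (about 1.96 when level is 0.95).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Assumes sigma is known, so every interval has the same half-width.
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = mean(sample)
        if m - half_width <= mu <= m + half_width:
            hits += 1
    return hits / repeats

rate = coverage()
print(f"Proportion of intervals containing the true mean: {rate:.3f}")
```

The proportion should land close to 0.95 but will rarely equal it exactly, because the simulation uses a finite number of repeated samples.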