STATS 250

Categorical variable

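A variable that places each individual into one of several discrete categories. The categories may have no natural order (yes/no answers) or a natural order (sizes), but arithmetic on them is not meaningful. **Ex.** yes/no questions, drink sizes, shirt sizes.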

Distribution

The values a variable can take and how often it takes those values.
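
As a rough illustration, the distribution of a small categorical sample can be tabulated by counting each value. The shirt-size data below is made up for the example, and the names `sizes`, `counts`, and `proportions` are illustrative only.

```python
from collections import Counter

# Hypothetical sample of a categorical variable (shirt sizes).
sizes = ["S", "M", "M", "L", "M", "S", "L", "M"]

# Count how often each value occurs.
counts = Counter(sizes)

# Convert counts to relative frequencies (proportions that sum to 1).
proportions = {value: count / len(sizes) for value, count in counts.items()}

for value in sorted(counts):
    print(value, counts[value], proportions[value])
```

The counts and the proportions describe the same distribution; proportions are easier to compare across samples of different sizes.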

Mean

The numerical average of a data set: the sum of the data points divided by the number of data points. Best suited to describing the center of a roughly symmetric distribution, because extreme values pull the mean toward them.
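
A minimal sketch of this calculation, using made-up data. The helper name `mean` and both data sets are assumptions for illustration.

```python
def mean(data):
    """Return the sum of the data points divided by the number of data points."""
    if not data:
        raise ValueError("mean of an empty data set is undefined")
    return sum(data) / len(data)

# Hypothetical, roughly symmetric sample.
scores = [70, 75, 80, 85, 90]
print(mean(scores))  # 80.0

# Replacing one value with an extreme one shifts the mean noticeably.
skewed = [70, 75, 80, 85, 190]
print(mean(skewed))  # 100.0
```

In the second data set, four of the five values lie below the mean, which is why the mean is a less useful description of center when the data are not symmetric.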

Median

The middle element when the elements are sorted in increasing order. In the case of an even number of data points, the median is the average of the two centermost elements.
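
The odd and even cases can be sketched as follows. The function name `median` and the sample values are illustrative assumptions.

```python
def median(data):
    """Return the middle value of the sorted data (average of the two middle values if even)."""
    if not data:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count: average the two centermost elements.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```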

Modality

The number of peaks that a histogram has.

Outliers

Data points that do not fit the typical pattern of the data set. They should be explained, corrected (if they are typographical errors on the part of the statistician), or studied further. They should never simply be thrown away.

Parameter

A summary measure of *population* data.

Percentiles

The \\(p^{\text{th}}\\) percentile is a value such that \\(p\\) percent of the observations fall at or below that value.
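
One way to turn this definition into a calculation is to take the smallest observation that has at least \\(p\\) percent of the data at or below it. This is only one of several conventions (statistical software often interpolates between observations instead), and the function name `percentile` and the data are assumptions for the sketch.

```python
import math

def percentile(data, p):
    """Smallest observation with at least p percent of the data at or below it."""
    if not data:
        raise ValueError("percentile of an empty data set is undefined")
    if not 0 < p <= 100:
        raise ValueError("p must be in the interval (0, 100]")
    ordered = sorted(data)
    n = len(ordered)
    # Position (1-based) of the first observation whose rank / n reaches p / 100.
    rank = math.ceil(p * n / 100)
    return ordered[rank - 1]

# Hypothetical data set.
data = [15, 20, 35, 40, 50]
print(percentile(data, 40))   # 20
print(percentile(data, 50))   # 35
print(percentile(data, 100))  # 50
```

Under this convention, the result is always an actual observation from the data set, so for an even number of data points the 50th percentile can differ from the median as defined above.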

Population data

Data collected from an *entire* population.

Quantitative variable

A variable whose values are numbers that can be meaningfully compared with one another using "greater than" or "less than" relationships. Can be either discrete or continuous. **Ex.** height, GPA, age.

Sample data

Data collected from a *subset* of a larger population.

Statistic

A summary measure of *sample* data.

Statistics

A collection of procedures and principles for gathering and analyzing information in order to help make decisions.

Variable

A characteristic that differs from one individual to the next. Can be *quantitative* or *categorical*.