Data Analysis With Graphs
Raw Data - Unprocessed information collected
for an analysis
Variable - The quantity being measured
Continuous variable - Can take any value within a given
range
Discrete variable - Restricted to certain
separate values, usually integers
Histogram - Bar graph in which the areas of the bars are
proportional to the frequencies of the values
of the variable
Frequency polygon - Plot frequency vs. variable
and join the points with straight lines. It shows the same information
as a histogram
Intervals - When the data cover a large number of values,
they are grouped into classes or intervals
Relative Frequency - Shows the frequency of a data group
as a fraction or percent of the whole data set
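The frequency and relative-frequency definitions above can be sketched in a few lines of Python. The data set and the helper name frequency_table are made up for illustration; this is one possible way to build the table, not a prescribed method.

```python
from collections import Counter

# Hypothetical raw data: number of siblings reported by 10 students.
raw_data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]

def frequency_table(values):
    """Return {value: (frequency, relative frequency)}, sorted by value."""
    counts = Counter(values)
    total = len(values)
    return {v: (counts[v], counts[v] / total) for v in sorted(counts)}

table = frequency_table(raw_data)
for value, (freq, rel) in table.items():
    # Each row is one bar of the histogram (discrete variable).
    print(f"{value}: frequency={freq}, relative frequency={rel:.0%}")
```

The relative frequencies always add up to 1 (100%), which is a quick check on the table.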
Indices
Index - Relates the value of a variable to a base level,
which is often the value on a particular date
Time-series Graph - Shows how a variable changes over time
Consumer Price Index (CPI) - Measures changes in the price level of a weighted-average market basket of consumer goods. It is an important measure of inflation
Inflation - An overall increase in prices, which
corresponds to a decrease in the value of money
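As a rough illustration of how an index relates values to a base level, the sketch below converts a series of basket costs into an index with the first year set to 100. The prices are invented, and index_series is a hypothetical helper; the real CPI uses a weighted basket that is not modelled here.

```python
def index_series(values, base_value, base_index=100.0):
    """Express each value relative to the base value (base level = base_index)."""
    return [base_index * v / base_value for v in values]

# Hypothetical cost of the same basket of goods in four consecutive years.
prices = [50.0, 52.0, 55.0, 60.0]

# Use the first year as the base date.
index = index_series(prices, base_value=prices[0])

# Percent change in the index from one year to the next
# (a simple stand-in for the annual inflation rate).
inflation = [100 * (b - a) / a for a, b in zip(index, index[1:])]

print(index)
print([round(rate, 2) for rate in inflation])
```

Plotting index against year would give a time-series graph of the price level.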
Sampling Techniques
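Simple Random Sample - A subset of individuals chosen from a larger set so that every individual has an equal chance of being selected.
Systematic Sample - Selection of elements from an ordered sampling frame at regular intervals, starting from a randomly chosen element.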
Cluster Sample - The researcher divides the population into separate groups, called clusters. Then, a simple random sample of clusters is selected from the population. The researcher analyzes data from the sampled clusters.
Multi-stage Sample - Uses several levels of random
sampling
Voluntary-response Sample - A sample made up of volunteers. Compared to a random sample, this type of sample is prone to bias
Convenience Sample - A type of non-probability sample drawn from the part of the population that is close at hand
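Three of the probability sampling techniques above can be sketched with Python's standard random module. The population of ID numbers, the fixed seed, and the function names are assumptions made for illustration; the clusters here are simply consecutive blocks of IDs.

```python
import random

# Hypothetical population: ID numbers 1 to 100.
population = list(range(1, 101))
rng = random.Random(0)  # fixed seed so reruns give the same samples

def simple_random_sample(pop, n, rng):
    """Every individual has an equal chance of being chosen."""
    return rng.sample(pop, n)

def systematic_sample(pop, n, rng):
    """Random starting point, then every step-th element of the ordered frame."""
    step = len(pop) // n
    start = rng.randrange(step)
    return pop[start::step][:n]

def cluster_sample(pop, cluster_size, n_clusters, rng):
    """Split the population into clusters, then randomly choose whole clusters."""
    clusters = [pop[i:i + cluster_size] for i in range(0, len(pop), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [member for cluster in chosen for member in cluster]

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(cluster_sample(population, 10, 2, rng))
```

A multi-stage sample could be built from these pieces, for example by drawing a simple random sample inside each chosen cluster.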
Bias in Surveys
Non-Response Bias - When a particular group is
under-represented because its members choose not to respond
Response Bias - When the participants provide
false or misleading answers
Measures of Central Tendency
Mean - Sum of the values of a variable divided by
the number of values
Median - The middle value of the data when they are ranked
from highest to lowest
Mode - The value that occurs most often in a distribution
Outliers - Values that are distant from the majority of the data
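The three measures above, and the effect of an outlier on them, can be checked with Python's built-in statistics module. The data set is invented; the value 40 plays the role of the outlier.

```python
import statistics

# Hypothetical data set; 40 is far from the rest of the values (an outlier).
data = [2, 3, 3, 4, 5, 5, 5, 6, 40]

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # middle value of the ranked data
mode = statistics.mode(data)      # most frequent value

# The same measures with the outlier removed, for comparison.
trimmed = [x for x in data if x != 40]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)

print(f"with outlier:    mean={mean:.2f}, median={median}, mode={mode}")
print(f"without outlier: mean={trimmed_mean:.3f}, median={trimmed_median}")
```

In this example the mean moves from about 8.11 to 4.125 when the outlier is dropped, while the median only moves from 5 to 4.5, so the median is less sensitive to outliers.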
Measures of Spread
Dispersion - Quantities that show how closely a set
of data clusters around the center
Population Standard Deviation -
$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
Population Variance -
$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
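The two population formulas can be written out directly in Python. The function names and the data set are illustrative assumptions; the standard library's statistics.pvariance and statistics.pstdev are used only as a cross-check.

```python
import math
import statistics

def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(data)
std_dev = population_std_dev(data)
print(variance, std_dev)

# Cross-check against the standard library.
print(statistics.pvariance(data), statistics.pstdev(data))
```

For this data the squared deviations sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.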
Quartiles and Interquartile Range
Interquartile Range - The difference Q3 - Q1 between the third and first quartiles
Semi-interquartile Range - One half of the
interquartile range
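A minimal sketch of the interquartile and semi-interquartile ranges follows. It assumes the "median of each half" convention for quartiles (the overall median is left out of both halves when the number of values is odd); other conventions exist and can give slightly different quartiles. The function name and data are made up.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median
    upper = ordered[half + n % 2:]   # values above the median
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

# Hypothetical data set with nine values.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range

print(q1, q2, q3, iqr, semi_iqr)
```

Here Q1 = 3.5, Q2 = 7 and Q3 = 11, so the interquartile range is 7.5 and the semi-interquartile range is 3.75.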
Scatter Plots and Linear Correlation
Linear Correlation - Changes in one variable tend to
be proportional to changes in the other
Perfect Positive - Y increases at a constant rate
as X increases
Perfect Negative - Y decreases at a constant rate
as X increases
Scatter Plot - Shows the relationship between two
variables graphically, as a set of plotted points
Line of Best Fit - A straight line that passes as close
as possible to all points on a scatter plot
Linear Regression
Regression - Analytic technique for determining
the relationship between independent and dependent
variables
Least-Squares Fit - Statistical procedure that finds the best fit for a set of data points by minimizing the sum of the squares of the offsets (residuals) of the points from the fitted curve.
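For a straight line y = mx + c, the least-squares fit has a closed-form solution, sketched below. The function name least_squares_line and the data points are assumptions for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

# Hypothetical measurements that lie close to y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")

# Residuals: vertical offsets of the points from the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print([round(r, 2) for r in residuals])
```

The residuals of a least-squares line with an intercept always sum to zero (up to rounding), which is a useful check on the arithmetic.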
Non-linear Regression
Non-linear Regression - Analytic technique for finding
a curve of best fit for data from non-linear relationships
Coefficient of Determination -
$R^2 = \left\{ \frac{1}{N} \cdot \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} \right\}^2$
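The formula above is the square of the linear correlation coefficient, computed with population standard deviations. The sketch below follows it term by term; the function names and data are illustrative assumptions.

```python
import math

def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r, using population standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return covariance / (sigma_x * sigma_y)

def coefficient_of_determination(xs, ys):
    """R squared for a linear fit: the square of r."""
    return correlation_coefficient(xs, ys) ** 2

# Hypothetical data lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = correlation_coefficient(xs, ys)
r_squared = coefficient_of_determination(xs, ys)
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

A value of R^2 close to 1 means the line of best fit accounts for most of the variation in y; a perfect positive or perfect negative correlation gives exactly 1.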
Exponential Regression - The process of finding the equation of the exponential function that best fits a set of data. The result is an equation of the form
y = ab^x, where a ≠ 0
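One common way to carry out an exponential regression is to take logarithms, since ln y = ln a + x ln b is a straight line in x, and then apply a linear least-squares fit. The sketch below assumes this approach and requires every y value to be positive (so a > 0); it minimizes squared residuals of ln y rather than of y, so it can differ slightly from a direct non-linear fit. The function name and data are made up.

```python
import math

def exponential_fit(xs, ys):
    """Fit y = a * b**x by linear least squares on (x, ln y). Requires all y > 0."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxy = sum((x - mean_x) * (ly - mean_log) for x, ly in zip(xs, logs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                      # estimate of ln b
    intercept = mean_log - slope * mean_x  # estimate of ln a
    return math.exp(intercept), math.exp(slope)

# Hypothetical data that doubles at each step, starting from 3.
xs = [0, 1, 2, 3, 4]
ys = [3, 6, 12, 24, 48]

a, b = exponential_fit(xs, ys)
print(f"y = {a:.3f} * {b:.3f}^x")
```

Because this data follows y = 3(2^x) exactly, the fit recovers a = 3 and b = 2 up to rounding error.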
Cause and Effect
Cause-and-Effect Relationship - A change in X produces
a change in Y
Reverse Cause-and-Effect Relationship - The dependent
and independent variables are reversed in the process
of establishing causality
Common-Cause Factor - An external variable causes
two variables to change in the same way
Accidental Relationship - A correlation exists without
any causal relationship between variables
Presumed Relationship - A correlation does not seem
to be accidental even though no cause-and-effect relationship or common-cause factor is apparent