Analytics Test Answers
What is true about these 5 numbers? -2, -1, 0,
5, 10
The median is larger than the mean.
Cannot compare means and medians.
The median and mean are equal to each other.
The mean is larger than the median.
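The comparison above can be checked directly. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the quiz.

```python
import statistics

values = [-2, -1, 0, 5, 10]

mean_value = statistics.mean(values)      # the values sum to 12, so 12 / 5
median_value = statistics.median(values)  # middle element of the sorted list

print("mean:", mean_value, "median:", median_value)
print("mean larger than median:", mean_value > median_value)
```

The two large positive values pull the mean upward while leaving the middle value untouched.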
Where would the outliers be if a distribution
had a skewness of -1?
No outliers
Right
Left
Far right
A normal distribution generally takes the form
of a:
None of these
square root
asymptote
bell curve
What is the median of: 5, 10, 15?
15
10
5
9.5
What power is used in the formula for variance?
4
2
1
3
Where would the outliers be if a distribution
had a skewness of -50?
Far left
Far right
Right
No outliers
Where would the outliers be if a distribution
had a skewness of +1?
Left
Far right
No outliers
Right
True or false? Qualitative data is strictly
numerical.
False
True
The median grade on a midterm exam in a math
class is 72. The teacher feels this is too low, so they award 10 extra points
to every student in the class. What is the new median grade for the class?
62
Not enough information.
72
82
Why is the primary key of a dataset important?
It's the most descriptive field
It's the unique identifier of every record
It's the key to the dataset
It's the most important field
The average grade on a midterm exam in a math
class is 72. The teacher feels this is too low, so they award 10 extra points
to every student in the class. What is the new average grade for the class?
82
72
62
Not enough information.
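Adding the same constant to every score shifts both the mean and the median by that constant. The sketch below illustrates this with a hypothetical set of grades (an assumption, chosen so that the mean and median are both 72; the quiz gives no individual scores).

```python
import statistics

# Hypothetical grades, not from the quiz; mean and median are both 72.
grades = [60, 68, 72, 76, 84]
curved = [g + 10 for g in grades]  # every student gets 10 extra points

mean_before = statistics.mean(grades)
mean_after = statistics.mean(curved)
median_before = statistics.median(grades)
median_after = statistics.median(curved)

print(mean_before, "->", mean_after)
print(median_before, "->", median_after)
```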
Your college professor standardizes everyone's
test scores. Your standardized score is -1.35. Which of the following
statements is true?
You scored within one standard deviation of the
average test score.
Your test score was above the average.
Your test score was below the average.
Your test score had the highest standard deviation.
Which describes the dispersion of a data set?
None of these
standard deviation
median
mean
What is the difference between the highest and
lowest scores in a data set?
the sample
the mode
the mean
the range
Where would the outliers be if a distribution
had a skewness of 0?
Far right
Right
Left
No outliers
What is the mean of: 5, 10, 15?
10
1
5
11
What is the average of the following 5
numbers? 1, 2, 3, 4, 10
10
3
1
4
Which of the following is a measure of spread?
Lower Quartile
Range
Median
Mean
Discrete and continuous data are both forms
of:
qualitative data
quantitative data
incomplete data
(None of these)
Using previous games to predict the score of a
game is an example of ________ statistics.
incomplete
descriptive
None of these
inferential
As a market researcher which of the following
algorithms would you use to divide the general population of consumers into
market segments and to better understand the relationships between groups of
potential customers?
Recommendation analysis
Bayesian Estimation
Linear Regression
Cluster Analysis
Over the last 360 days of winter in Raleigh,
NC (5 winters) we have had snow on 36 days. What is the probability that we
will have snow on any random winter day this year?
0.05
0.01
0.1
0.2
If you were surveying the United States
population, New York City would be a:
subset
population (N)
standard deviation
What is the mode of: 5, 10, 10, 15, 17?
11.2
10
12
5
What is the median of the following 5 numbers?
1, 2, 3, 4, 10
1
3
10
4
Which of the following is not a measure of
spread?
Standard Deviation
Upper Quartile
Range
Variance
Symbolism: What does the small sigma represent
(without any other symbols) in statistics?
Standard Deviation
Mean
Skewness
Variance
A ______ is a numerical characteristic of a population
rectang
category
parameter
constraint
Which of the following best describes a mean
calculation?
non-parametric
Neither of these
parametric
You flip an unbiased coin 2 times - what is
the probability of getting 2 heads?
75%
100%
50%
25%
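One way to check the two-flip question is to enumerate the equally likely outcomes. This is a small sketch, assuming a fair coin and independent flips as the question states.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
two_heads = [o for o in outcomes if o == ("H", "H")]

p_two_heads = Fraction(len(two_heads), len(outcomes))
print(p_two_heads, float(p_two_heads))
```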
What is the probability of rolling a fair die
and getting an even number?
1/2
1/3
4/6
1/6
A forestry researcher recorded many variables
on the trees of a large forest. These variables include the height (in meters),
the diameter (in centimeters), the species (pine, oak, etc.), and if the tree
had Dutch Elm disease. In this study which variables that were recorded were
quantitative?
Only height and diameter.
Only height.
All of the variables.
Only species and height.
How can you convert a variance to a standard
deviation?
Take the square root of the variance
Take the log of the variance
Square the variance
Take the cubed root of the variance
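The relationship between variance and standard deviation can be sketched as follows. The data set is hypothetical (an assumption for illustration), and the population formulas from Python's `statistics` module are used.

```python
import math
import statistics

# Hypothetical data, not from the quiz.
data = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(data)  # population variance
std_dev = math.sqrt(variance)          # standard deviation is its square root

print(variance, std_dev)
# Squaring the standard deviation recovers the variance.
print(math.isclose(std_dev ** 2, variance))
```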
Bob is a high school basketball player, who is
a 72% free throw shooter. Bob has missed his first four free throws of the
game. What is the probability that Bob makes his fifth free throw?
90%
100%
0%
72%
Which of the following is NOT a characteristic
of a Normal distribution?
Defined completely by mean and variance
Unimodal
Right Skewed
Symmetric
True or false? A continuous variable can only
have a whole value.
True
False
You calculate the standard deviation of a data
set and find that it is -1.23. From this you can determine which of the
following is true?
The mean must be negative.
All of the values in the data set are negative.
You made an arithmetic mistake because standard
deviation cannot be negative.
Every value in the data set is the same.
What is true about these 5 numbers? -10, -5,
0, 1, 2
The mean is larger than the median.
Cannot compare means and medians.
The median and mean are equal to each other.
The median is larger than the mean.
What is the skewness of a normal distribution?
0
1
2
3
The variable X is the value of an uneven die
after one roll. It produced the following probability distribution P(X): P(1) =
0.05, P(2) = 0.28, P(3) = 0.12, P(4) = 0.23, P(5) = ?, P(6) = ?. What is the
probability that X = 5 or X = 6?
0.23
0.55
0.42
0.32
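Because the six probabilities must sum to 1, the two unknown faces together take whatever the known faces leave over. A minimal sketch, using `Fraction` to avoid floating-point rounding:

```python
from fractions import Fraction

# Known probabilities from the question; P(5) and P(6) are not given.
known = {
    1: Fraction("0.05"),
    2: Fraction("0.28"),
    3: Fraction("0.12"),
    4: Fraction("0.23"),
}

# Complement rule: P(X = 5 or X = 6) = 1 - P(X <= 4).
p_five_or_six = 1 - sum(known.values())
print(float(p_five_or_six))
```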
The number of cars that went through a car
wash during the noon hour on each of the past 8 days is as follows: 5, 9,
2, 3, 3, 9, 8, 6. What is the range of this data?
8
7
9
5.6
The process of adding latitudes and longitudes
to customer site locations is called:
Customer Segmentation
Geocoding
Spatial analysis
Location analysis
_____ are collections of observations.
Random variables
Snippets
Populations
Data
What is the probability of rolling a fair die
and getting a 1, and flipping a fair coin and getting a head?
3/12
1/6
1/2
1/12
Which of the following is an example of
mutually exclusive events?
Having it rain on the same day that the sun comes
out in the same city
Being late to a meeting and being early to the same
meeting
Having one product off the assembly line be
defective, but another product on that same assembly line work properly.
Ordering a burger at a fast food restaurant and
ordering fries at that same restaurant
Suppose a fair die is tossed twice. What is
the probability of rolling two fours?
1/6
1/36
2/6
2/36
What is the probability of rolling a fair die
and getting an even number, and flipping a fair coin and getting a head?
0.75
0.5
0.25
0
In a simple linear regression analysis, why do
we call the best line the "least squares" regression line? 1. The
line minimizes the sum of the squared errors in the data. 2. The line maximizes
the squared error in the data. 3. The model uses squared predictor variables.
1 only
3 only
1 and 2 only
2 only
Within a normal curve, ___% fall within 1
standard deviation of the mean.
25%
100%
68%
0%
The variable X is the value of an uneven die
after one roll. It produced the following probability distribution P(X): P(1) =
0.05, P(2) = 0.28, P(3) = 0.12, P(4) = 0.23, P(5) = ?, P(6) = ?. What is the
probability that X = 2 or X = 3?
0.28
0.4
0.12
0.45
The average price of a car in a used car lot
is $18,000. These prices are Normally distributed with a standard deviation of
$3,000. What is the probability that any random car is below $18,000?
42%
68%
50%
95%
How can you convert a standard deviation to a
variance?
Take the square root of the standard deviation
Square the standard deviation
Take the cubed root of the standard deviation
Take the log of the standard deviation
When does a Type I error occur?
You reject the null hypothesis when it is true
There is no such term as a "Type I Error"
You fail to reject the null hypothesis when it is
false
None of the other choices
A survey has 3 multiple choice questions on
it. What type of survey is this?
closed
All of these
combination
open
Where would the outliers be if a distribution
had a skewness of +50?
Far right
Right
No outliers
Left
Which of the following statements are true? 1.
Categorical variables are the same as qualitative variables. 2. Categorical
variables are the same as quantitative variables. 3. Quantitative variables can
be continuous variables.
1 and 3 only
1 only
3 only
2 only
What is heteroskedastic?
There is no adjustment factor (e.g. epsilon)
The adjustment factor changes (e.g. epsilon)
The adjustment factor doesn't change (e.g. epsilon)
A single 6-sided die is rolled. What is the
probability of rolling a prime number?
0.5
1
0.17
0.33
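The prime-number question comes down to counting faces. The sketch below uses a small helper, `is_prime`, which is written here for illustration; note that 1 is not prime.

```python
from fractions import Fraction

def is_prime(n):
    """Return True if n is a prime number (trial division)."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

faces = range(1, 7)
prime_faces = [f for f in faces if is_prime(f)]

p_prime = Fraction(len(prime_faces), len(faces))
print(prime_faces, p_prime)
```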
What is the median of the following 5 numbers?
10, 2, 4, 3, 1
10
1
4
3
If the sample mean of a dataset is 10 and the
standard deviation is 6, what percent of the data would you expect to fall
between 4 and 16 assuming the data distribution is normal?
68%
99.7%
81.5%
95%
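The interval from 4 to 16 is exactly one standard deviation either side of the mean, so the share of data inside it can be read off the normal CDF. A minimal sketch using `statistics.NormalDist` from the standard library:

```python
from statistics import NormalDist

scores = NormalDist(mu=10, sigma=6)

# Probability mass between mean - 1 SD (4) and mean + 1 SD (16).
p_within_one_sd = scores.cdf(16) - scores.cdf(4)
print(round(p_within_one_sd, 4))
```

The result is the familiar "about 68%" figure from the empirical rule.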
What type of SQL join is needed when you wish
to include rows that do not have matching values?
Natural join
Equi-join
Outer join
Inner join
Experts rank athletic teams 1 through 10. What
type of data is this?
ordinal
discrete
categorical
quantitative
A manager of a large bank wants to compute the
average interest rates across all bonds that the bank invests in. The manager
randomly sampled 127 bonds that the bank invests in and calculated the average
interest rate over the past year of the sample was 2.47%. What is the parameter
of interest in this study?
All bonds that the bank invests in.
The average interest rate of all bonds that the
bank invests in.
The 127 bonds used in the calculation.
2.47%
What is the mean of a standard normal
distribution?
1
50
0
0.5
What is derived from the second moment of
distribution?
Variance
Kurtosis
Skewness
Mean
Which is an example of nonmetric data?
Ordinal
Sample
Parametric
Quantitative
Which of the following statements are true
about confidence intervals for means? 1. The center of the confidence interval
is always 0. 2. The bigger the confidence interval, the smaller the margin of
error. 3. The bigger your sample, the smaller the margin of error.
1 and 2 only
2 only
1 only
3 only
A survey was conducted to find the average
weight of students living in the dorms of a university. To help improve the
accuracy of the study, an equal number of students were randomly selected from
each dorm for the sample. This sample is an example of what?
Standard Random Sample
Block Design
Experiment
Stratified Random Sample
As CEO of a casino, I'm trying to understand
which of my customers would respond to a certain ad campaign. So I randomly
select 1,000 customers from each demographic, income group, and region to
conduct a survey. This experiment uses which of the following sampling
techniques?
Systematic
Panel sampling
Stratified sampling
Simple random
What is synonymous with the coefficient of
determination?
Sigma
Mu
R
R-Squared
Which of the following is a characteristic of
an F-distribution?
No lower or upper bound
Right Skewed
Symmetric
Bimodal
What is the first moment of distribution?
Skewness
Mean
Standard Deviation
Kurtosis
Compute the following nested square root:
√(2 + √(2 + √(2 + ...)))
2
4
2.4
3
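The nested radical can be read as the fixed point of x = √(2 + x): squaring gives x² - x - 2 = 0, whose positive root is 2. The sketch below approximates the same value by repeated substitution; the starting value and the iteration count are arbitrary choices.

```python
import math

x = 0.0  # arbitrary non-negative starting value
for _ in range(50):
    # Wrap one more square root around the current approximation.
    x = math.sqrt(2 + x)

print(x)
```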
Assume the coefficient on the predictor
variable in the simple linear regression model is 2. What is the interpretation
of this coefficient? The predictor variable will be called x and the response
variable will be called y.
If x increases by one unit, then y must decrease by
2 units.
If x increases by one unit, then y must increase by
2 units.
On average, if x increases by one unit, y decreases
by 2 units.
On average, if x increases by one unit, y increases
by 2 units.
The median grade on a midterm exam in a math
class of 60 students is 85. The teacher gives an additional 5 bonus points to
the 3 students who scored the highest on the exam. What is the new median grade
for the class?
Not enough information.
90
85
80
Which of the following must be true for the
standard deviation of a set of observations to be 0?
All observations have the same value
All observations are 0
All observations are equally dispersed from the
mean
The mean of the set is 0
If you have a hypothesis test with a
significance level of 0.05 and a p-value of 0.01, what is the result of your
hypothesis test?
You accept the null hypothesis
Not enough information.
You reject the null hypothesis
You fail to reject the null hypothesis
Symbolism: What does the Greek letter mu
represent (without any other symbols) in statistics?
Sample mean
Population mean
Sample standard deviation
Population standard deviation
What does R^2 (R-Squared) calculate?
The coefficient of variation^2
The squared covariance
The slope of the regression
The closeness of a regression to the underlying
data
As manager of your sales organization you want
to compare the mean number of sales calls made per salesman in a week using two
different messaging techniques. Two hundred employees from your company are
randomly selected and each is randomly assigned to one of the two messaging
techniques. After teaching 100 salesmen one technique and 100 salesmen the
other technique, you record the number of successful calls each salesman makes
in one month. Which of the following tests would you use for comparison?
Chi square test
Two sample z test
Two sample t test
Bayesian test
A multivariate regression model exhibits a
highly significant F-statistic, but each predictor's individual t-statistic is
insignificant. What phenomenon explains this?
Heteroskedasticity
Multicollinearity
Homoskedasticity
E & G
What is homoskedastic?
The adjustment factor changes (e.g. epsilon)
I do not know
The adjustment factor doesn't change (e.g. epsilon)
There is no adjustment factor (e.g. epsilon)
Which of the following holds true for the
correlation coefficient R?
-1 <= R <= 1
0 <= R <= 1
R >= 0
-1 >= R <= 1
60% of the population in DC supports the
Redskins, 40% support the Ravens, and 20% support both. The proportion of the
populace that supports neither is:
20%
30%
50%
40%
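The "supports neither" share follows from inclusion-exclusion on the two fan bases. A minimal sketch, with all figures in percentage points taken from the question:

```python
redskins = 60  # supports the Redskins
ravens = 40    # supports the Ravens
both = 20      # supports both teams

# Inclusion-exclusion: people in both groups are counted twice, so subtract once.
at_least_one = redskins + ravens - both
neither = 100 - at_least_one

print(at_least_one, neither)
```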
For an airline, many times small cities have
limited flights that go into their airports. To get a flight to Columbia, SC
you must go through one of three cities: Raleigh, NC, Atlanta, GA, or
Charlotte, NC. Two customers from Orlando, FL are trying to get to Columbia, SC
with only one stop (one of the three above mentioned cities). Assume that they
are equally likely to go through any of the above cities. What is the probability
neither of the customers flies through Charlotte, NC?
4/9
1/3
2/3
2/9
For an airline, many times small cities have
limited flights that go into their airports. To get a flight to Columbia, SC
you must go through one of three cities: Raleigh, NC, Atlanta, GA, or
Charlotte, NC. Two customers from Orlando, FL are trying to get to Columbia, SC
with only one stop (one of the three above mentioned cities). Assume that they
are equally likely to go through any of the above cities. What is the
probability one of the customers flies through Charlotte, NC, while the other
does not fly through Charlotte, NC?
2/9
4/9
2/3
1/3
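Both airline questions can be checked by listing the nine equally likely city pairs for the two customers. This sketch assumes, as the questions state, that each customer picks a connecting city independently and uniformly; `charlotte_count` is a helper introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

cities = ["Raleigh", "Atlanta", "Charlotte"]
pairs = list(product(cities, repeat=2))  # (customer 1, customer 2)

def charlotte_count(pair):
    """Number of customers in the pair routed through Charlotte."""
    return sum(city == "Charlotte" for city in pair)

p_neither = Fraction(sum(charlotte_count(p) == 0 for p in pairs), len(pairs))
p_exactly_one = Fraction(sum(charlotte_count(p) == 1 for p in pairs), len(pairs))

print(p_neither, p_exactly_one)
```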
What power is used in the formula for
skewness?
1
4
3
2
What is the coefficient of variation?
Kurtosis minus standard deviation
Normalized measure of dispersion of a probability
distribution
Mean^2
Standard Deviation
A card is drawn randomly from an ordinary deck
of playing cards. You win a prize if the card is a heart or the card is an ace.
What is the probability that you will win the prize?
13/52
1/13
17/52
16/52
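The card question is another inclusion-exclusion count: 13 hearts plus 4 aces, minus the ace of hearts counted twice. A sketch that builds a standard 52-card deck and counts the winning cards directly:

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# A card wins if it is a heart or an ace (the ace of hearts counts once).
winning = [card for card in deck if card[1] == "hearts" or card[0] == "A"]

p_win = Fraction(len(winning), len(deck))
print(len(winning), p_win)
```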
What is the formula for the Pearsonian
Coefficient of Skewness?
3*(population mean - population mode) / Variance
3*(population mean - population mode) / Standard
deviation
5*(population mean - population mode) / Standard
deviation
2*(population mean - population mode) / Variance
What power is used in the formula for
kurtosis?
1
4
3
2
Consider 2 tables A & B which have 10 and
6 rows respectively. A and B have one field, ID, in common. ID also happens to
be the primary key for both tables, and therefore cannot be null. If I did a
Cartesian join between them on the field ID, how many rows will my result have?
120
60
10
16
Which of the following numbers (measures of
kurtosis of a distribution) would represent a leptokurtic distribution?
4
1
2
3
You appear for two tests as part of a history
class. In test 1 you score 60 points out of a maximum of 100. The class mean is
50 and the standard deviation is 10. On test 2 you score 60 points out of a
total of 70. The class mean is 40 and the standard deviation is 20. In which
test did you do better relative to the rest of the class?
Test 1
You did equally well
Can't tell without class size
Test 2
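Relative standing is usually compared with standardized scores (z-scores): the distance from the class mean in units of standard deviation. A minimal sketch, where `z_score` is a helper defined here for illustration:

```python
def z_score(score, mean, std_dev):
    """Number of standard deviations a score lies above the mean."""
    return (score - mean) / std_dev

z_test1 = z_score(60, mean=50, std_dev=10)
z_test2 = z_score(60, mean=40, std_dev=20)

print(z_test1, z_test2)
```

The maximum possible score on each test does not enter the calculation; only the class mean and standard deviation do.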
The probability that a vendor is able to
sell X cupcakes daily is represented by the function P(X): P(0) = 0.3,
P(10) = 0.4, P(50) = 0.2, P(100) = 0.1. If each cupcake sells for $3, what's the
expected daily revenue?
$46
$60
$50
$72
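Expected revenue is the price per cupcake times the expected number sold, where the expectation weights each sales level by its probability. A minimal sketch, using `Fraction` to keep the arithmetic exact:

```python
from fractions import Fraction

# Sales level -> probability, as given in the question.
distribution = {
    0: Fraction("0.3"),
    10: Fraction("0.4"),
    50: Fraction("0.2"),
    100: Fraction("0.1"),
}
price = 3  # dollars per cupcake

expected_sales = sum(x * p for x, p in distribution.items())
expected_revenue = price * expected_sales

print(expected_sales, expected_revenue)
```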
What is the expected value of rolling an
unbiased die (6 sided)?
3
2.5
3.5
4.5
What is synonymous with the correlation
coefficient?
Sigma
Mu
R-Squared
R
ANOVA (analysis of variance) is used to
determine if a relationship exists between which two types of variables?
Continuous response variable, Continuous predictor
variable
Categorical response variable, Categorical
predictor variable
Categorical predictor variable, Continuous response
variable
Categorical response variable, Continuous predictor
variable
Which of the following is not a supervised
learning algorithm?
Naïve Bayes
Self organizing map
Logistic Regression
Support Vector Machines
The variance of X is 15. Y=X+5. What is the
Variance of Y?
40
15
17.5
20
Consider 3 overlapping sets A, B & C. U
represents the union operation and ∩ represents the intersection; n(A)
represents the number of elements in set A. Which of the following would give
you the number of elements in exactly one set?
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2n(A ∩ C) – 2n(B
∩ C)
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2
n(B ∩ C) + 3 n(A ∩ B ∩ C)
n(A) + n(B) + n(C) + 3n(A ∩ B ∩ C)
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2n(A ∩ C) – 2n(B
∩ C) + 2n(A ∩ B ∩ C)
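A candidate formula can be tested against a direct count. An element in exactly two sets is counted twice in n(A) + n(B) + n(C) and once among the pairwise intersections; an element in all three is counted three times in each of those. The sketch below uses three small hypothetical sets (an assumption for illustration) to compare the option with the "+ 3 n(A ∩ B ∩ C)" term against a brute-force count.

```python
# Hypothetical sets, not from the quiz.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 6, 7, 8}

n = len
formula = (
    n(A) + n(B) + n(C)
    - 2 * n(A & B) - 2 * n(A & C) - 2 * n(B & C)
    + 3 * n(A & B & C)
)

# Direct count: elements that belong to exactly one of the three sets.
direct = sum(
    1 for x in A | B | C
    if (x in A) + (x in B) + (x in C) == 1
)

print(formula, direct)
```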
How many social security numbers are possible
before they run out? A Social Security number is a 9 digit random number that
can have repeating digits
A billion
A million
100 million
A Trillion
Which of the following numbers (measures of
kurtosis of a distribution) would represent a platykurtic distribution?
2
3.5
20
10
If I tossed a fair coin 3 times, what is the
probability of it landing on tails only once?
1/8
3/8
1/4
1/2
Consider 2 tables A & B which have 10 and
6 rows respectively. A and B have one field, ID, in common. ID also happens to
be the primary key for both tables, and therefore cannot be null. How many rows
will A right outer join B on the field ID yield?
4
6
10
60
What kind of statistics primarily uses
Chi-square?
parametric
non-parametric
none of these
descriptive
What is the cumulative probability at -1
standard deviation?
50%
84.13%
15.87%
34.13%
Consider 2 tables A & B which have 10 and
6 rows respectively. A and B have one field, ID, in common. ID also happens to
be the primary key for both tables, and therefore cannot be null. How many rows
will A left outer join B on the field ID, yield?
10
6
4
60
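The join questions about tables A and B can be tried out with an in-memory SQLite database. Because ID is a primary key in both tables, each row matches at most one row on the other side, so an outer join returns exactly as many rows as its preserved table, whatever the overlap. A Cartesian (cross) join pairs every row with every row. The ID values below are hypothetical; the quiz does not say which IDs overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (ID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE B (ID INTEGER PRIMARY KEY)")
# Hypothetical IDs: A has 1..10 (10 rows), B has 8..13 (6 rows).
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(8, 14)])

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

cross_rows = count("SELECT COUNT(*) FROM A CROSS JOIN B")
left_rows = count("SELECT COUNT(*) FROM A LEFT JOIN B ON A.ID = B.ID")
# Older SQLite versions lack RIGHT JOIN, so "A RIGHT JOIN B" is written
# as the equivalent "B LEFT JOIN A".
right_rows = count("SELECT COUNT(*) FROM B LEFT JOIN A ON A.ID = B.ID")

print(cross_rows, left_rows, right_rows)
```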
Which of the following is not an unsupervised
machine learning algorithm?
Blind signal separation
Linear Regression
Clustering
Principal component analysis
What is the kurtosis of a normal distribution?
3
1
0
2
Which of the following situations would entail
the use of dummy variables?
When qualitative predictors have to be modeled
When categorical predictors have to be modeled
When residual analysis needs to be performed
D& E
Which of the following databases should you
not normalize?
Read-only database
Updateable database
Transactional database
D & G
Which of the following is not a
characteristic of a data warehouse?
Non volatile
Time invariant
Capable of integrating data from a variety of
sources
Subject oriented
Which of the following can a de-identified
patient database contain?
Social security numbers
Patient expenses
All of these
Patient addresses
Which of the following correlation
coefficients indicates the strongest correlation?
-0.7
1.5
0.66
2.5
Consider the following set of data points:
7,5,7,4,5,6,5,5,10 What does the (Mean - Median + Mode) compute to?
5
0
3
6
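The three summary statistics in the question above can be computed with the standard `statistics` module. A minimal sketch:

```python
import statistics

data = [7, 5, 7, 4, 5, 6, 5, 5, 10]

mean_value = statistics.mean(data)      # the values sum to 54 over 9 points
median_value = statistics.median(data)  # middle of the sorted values
mode_value = statistics.mode(data)      # most frequent value

result = mean_value - median_value + mode_value
print(mean_value, median_value, mode_value, result)
```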
Consider a table "Team" with the
columns : {Teamname, Location, State, Wins, Losses, NetPoints}. If there are 10
rows in "Team" - one row for each team, what's the SQL code to
retrieve the team(s) that scored the most points?
select Teamname from team where netpoints =
max(netpoints)
select Teamname from team where netpoints = (select
max(netpoints) from team)
select Teamname from team where count(*) =
max(netpoints)
select Teamname from team where count(*) =
max(count(netpoints))
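The subquery pattern can be tried against an in-memory SQLite table. This is a sketch under assumptions: the team rows are invented for illustration, and the query filters on NetPoints, the column named in the table definition. An aggregate such as MAX cannot be used directly in a WHERE clause, which is why the maximum is computed in a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Team (Teamname TEXT, Location TEXT, State TEXT, "
    "Wins INTEGER, Losses INTEGER, NetPoints INTEGER)"
)
# Hypothetical rows; two teams tie for the highest NetPoints.
rows = [
    ("Hawks", "Raleigh", "NC", 7, 2, 310),
    ("Owls", "Atlanta", "GA", 5, 4, 280),
    ("Bears", "Columbia", "SC", 6, 3, 310),
]
conn.executemany("INSERT INTO Team VALUES (?, ?, ?, ?, ?, ?)", rows)

query = (
    "SELECT Teamname FROM Team "
    "WHERE NetPoints = (SELECT MAX(NetPoints) FROM Team) "
    "ORDER BY Teamname"
)
top_teams = [row[0] for row in conn.execute(query)]
print(top_teams)
```

Comparing against the subquery returns every team tied for the maximum, not just one.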
Consider a table "Team" with the
columns : {Teamname, Location, State, Wins, Losses, NetPoints}. There are 10
rows in "Team" - one row for each team. You've been given another
table "Game" with one row for each game played in the regular season
(Assume all teams play every other team only once). How many rows will
"Game" have?
45
65
35
60
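If every team plays every other team exactly once, each game corresponds to one unordered pair of teams. The sketch below counts those pairs two ways; the team names are placeholders.

```python
from itertools import combinations
from math import comb

teams = [f"Team {i}" for i in range(1, 11)]  # 10 placeholder team names

# One row in "Game" per unordered pair of distinct teams.
games = list(combinations(teams, 2))
game_count = comb(len(teams), 2)  # "10 choose 2"

print(len(games), game_count)
```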
Consider 3 overlapping sets A, B & C. U
represents the union operation and ∩ represents the intersection; n(A)
represents the number of elements in set A. Which of the following would give
you the number of elements in two or more sets?
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2
n(B ∩ C) + 2 n(A ∩ B ∩ C)
n(A ∩B ∩ C )
n(A) + n(B) + n(C) + 2 n(A ∩B ∩ C )
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2
n(B ∩ C) + 3 n(A ∩ B ∩ C)
The variance of X is 3. Y=3X. What is the
variance of Y?
1
3
27
9
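Two scaling rules cover the variance questions in this quiz: Var(X + c) = Var(X) and Var(aX) = a² Var(X). The sketch below checks both on a hypothetical data set (an assumption; the quiz only gives the variances), using `Fraction` values so the comparison is exact.

```python
import statistics
from fractions import Fraction

# Hypothetical data, not from the quiz.
x = [Fraction(v) for v in (1, 2, 4, 7)]

var_x = statistics.pvariance(x)
var_shifted = statistics.pvariance([v + 5 for v in x])  # Y = X + 5
var_scaled = statistics.pvariance([3 * v for v in x])   # Y = 3X

print(var_x, var_shifted, var_scaled)
```

Applied to the quiz figures, a variance of 3 becomes 9 × 3 = 27 under Y = 3X, and a variance of 15 is unchanged under Y = X + 5.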
Which formula can determine sample size?
Slovin's
Nash's
Tetrahedral's
Moore's
In a star schema what's the cardinality of the
relationship between a fact table and its dimensions?
Many to Many
One to many
All of these
Many to one
What type of SQL join would return rows that
have matching values?
All of these
Equi-join
Natural join
Outer join
Suppose a fair die is tossed three times. What
is the probability of rolling exactly 2 fours?
1/36
2/36
1/12
1/6
What topology does a back propagation neural
network use?
All of these
Feed backward
Feed forward
Feed either
Consider a table "Team" with the
columns : {Teamname, Location, State, Wins, Losses, NetPoints}. There are 10
rows in "Team" - one row for each team. You've been given another
table "Game" with one row for each game played in the regular season
(Assume all teams play every other team only once). Now you've been asked to
add playoff data to "Game". In the playoffs, the top 5 teams of the
regular season play again between themselves, followed by semi finals and a
final. How many more rows would you expect as a result of adding playoff data?
15
10
9
13
An unfair coin is flipped 4 times and has a
probability of .6517 of getting 3 or 4 heads. What is the probability of
getting heads on a single flip?
.8
.85
.6
.75
.7
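With four independent flips and heads probability p, the chance of 3 or 4 heads is 4p³(1 - p) + p⁴. Rather than solving that for p, the sketch below evaluates it at each listed option and picks the closest match to 0.6517. The function name is illustrative.

```python
from math import comb

def p_three_or_four_heads(p):
    """Binomial probability of 3 or 4 heads in 4 flips."""
    return comb(4, 3) * p**3 * (1 - p) + p**4

target = 0.6517
candidates = [0.8, 0.85, 0.6, 0.75, 0.7]

best = min(candidates, key=lambda p: abs(p_three_or_four_heads(p) - target))

for p in candidates:
    print(p, round(p_three_or_four_heads(p), 4))
print("closest:", best)
```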