Analytics Test Answers



What is true about these 5 numbers? -2, -1, 0, 5, 10
The median is larger than the mean.
Cannot compare means and medians.
The median and mean are equal to each other.
The mean is larger than the median.
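
The comparison can be checked directly. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

values = [-2, -1, 0, 5, 10]

avg = mean(values)    # the values sum to 12, so the mean is 12 / 5
mid = median(values)  # middle value of the sorted list

print(avg, mid)
print("mean larger than median:", avg > mid)
```

The large positive values pull the mean above the median, which is the usual signature of a right-skewed set of numbers.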


Where would the outliers be if a distribution had a skewness of -1?
No outliers
Right
Left
Far right


A normal distribution generally takes the form of a:
None of these
square root
asymptote
bell curve


What is the median of: 5, 10, 15?
15
10
5
9.5


What power is used in the formula for variance?
4
2
1
3


Where would the outliers be if a distribution had a skewness of -50?
Far left
Far right
Right
No outliers


Where would the outliers be if a distribution had a skewness of +1?
Left
Far right
No outliers
Right


True or false? Qualitative data is strictly numerical.
False
True


The median grade on a midterm exam in a math class is 72. The teacher feels this is too low, so they award 10 extra points to every student in the class. What is the new median grade for the class?
62
Not enough information.
72
82


Why is the primary key of a dataset important?
It's the most descriptive field
It's the unique identifier of every record
It's the key to the dataset
It's the most important field


The average grade on a midterm exam in a math class is 72. The teacher feels this is too low, so they award 10 extra points to every student in the class. What is the new average grade for the class?
82
72
62
Not enough information.


Your college professor standardizes everyone's test scores. Your standardized score is -1.35. Which of the following statements is true?
You scored within one standard deviation of the average test score.
Your test score was above the average.
Your test score was below the average.
Your test score had the highest standard deviation.


Which describes the dispersion of a data set?
None of these
standard deviation
median
mean


What is the difference between the highest and lowest scores in a data set?
the sample
the mode
the mean
the range


Where would the outliers be if a distribution had a skewness of 0?
Far right
Right
Left
No outliers


What is the mean of: 5, 10, 15?
10
1
5
11


What is the average of the following 5 numbers? 1, 2, 3, 4, 10
10
3
1
4


Which of the following is a measure of spread?
Lower Quartile
Range
Median
Mean


Discrete and continuous data are both forms of:
qualitative data
quantitative data
incomplete data
(None of these)


Using previous games to predict the score of a game is an example of ________ statistics.
incomplete
descriptive
None of these
inferential


As a market researcher, which of the following algorithms would you use to divide the general population of consumers into market segments and to better understand the relationships between groups of potential customers?
Recommendation analysis
Bayesian Estimation
Linear Regression
Cluster Analysis


Over the last 360 days of winter in Raleigh, NC (5 winters) we have had snow on 36 days. What is the probability that we will have snow on any random winter day this year?
0.05
0.01
0.1
0.2


If you were surveying the United States population, New York City would be a:
subset
population (N)
standard deviation


What is the mode of: 5, 10, 10, 15, 17?
11.2
10
12
5


What is the median of the following 5 numbers? 1, 2, 3, 4, 10
1
3
10
4


Which of the following is not a measure of spread?
Standard Deviation
Upper Quartile
Range
Variance


Symbolism: What does the small sigma represent (without any other symbols) in statistics?
Standard Deviation
Mean
Skewness
Variance


A ______ is a numerical characteristic of a population
rectang
category
parameter
constraint


Which of the following best describes a mean calculation?
non-parametric
Neither of these
parametric


You flip an unbiased coin 2 times - what is the probability of getting 2 heads?
75%
100%
50%
25%


What is the probability of rolling a fair die and getting an even number?
1/2
1/3
4/6
1/6


A forestry researcher recorded many variables on the trees of a large forest. These variables include the height (in meters), the diameter (in centimeters), the species (pine, oak, etc.), and if the tree had Dutch Elm disease. In this study which variables that were recorded were quantitative?
Only height and diameter.
Only height.
All of the variables.
Only species and height.


How can you convert a variance to a standard deviation?
Take the square root of the variance
Take the log of the variance
Square the variance
Take the cubed root of the variance


Bob is a high school basketball player, who is a 72% free throw shooter. Bob has missed his first four free throws of the game. What is the probability that Bob makes his fifth free throw?
90%
100%
0%
72%


Which of the following is NOT a characteristic of a Normal distribution?
Defined completely by mean and variance
Unimodal
Right Skewed
Symmetric


True or false? A continuous variable can only have a whole value.
True
False


You calculate the standard deviation of a data set and find that it is -1.23. From this you can determine which of the following is true?
The mean must be negative.
All of the values in the data set are negative.
You made an arithmetic mistake because standard deviation cannot be negative.
Every value in the data set is the same.


What is true about these 5 numbers? -10, -5, 0, 1, 2
The mean is larger than the median.
Cannot compare means and medians.
The median and mean are equal to each other.
The median is larger than the mean.


What is the skewness of a normal distribution?
0
1
2
3


The variable X is the value of an uneven die after one roll. It has the following probability distribution P(X): P(1) = 0.05, P(2) = 0.28, P(3) = 0.12, P(4) = 0.23, P(5) = ?, P(6) = ?. What is the probability that X = 5 or X = 6?
0.23
0.55
0.42
0.32
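
Because the six outcomes must sum to 1, the unknown part of the table can be recovered from the known part. The sketch below uses exact fractions to avoid floating-point noise; the dictionary name `known` is an illustrative choice. The same table also covers the related question about X = 2 or X = 3.

```python
from fractions import Fraction

# Probabilities given in the question; P(5) and P(6) are not given individually.
known = {
    1: Fraction("0.05"),
    2: Fraction("0.28"),
    3: Fraction("0.12"),
    4: Fraction("0.23"),
}

# All outcomes sum to 1, so P(X = 5 or X = 6) is whatever remains.
p_5_or_6 = 1 - sum(known.values())

# Single-roll outcomes are mutually exclusive, so "or" is a plain sum.
p_2_or_3 = known[2] + known[3]

print(float(p_5_or_6), float(p_2_or_3))
```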


The number of cars that went through a car wash during the noon hour over each of the past 8 days are the following: 5, 9, 2, 3, 3, 9, 8, 6. What is the range of this data?
8
7
9
5.6


The process of adding latitudes and longitudes to customer site locations is called:
Customer Segmentation
Geocoding
Spatial analysis
Location analysis


_____ are collections of observations.
Random variables
Snippets
Populations
Data


What is the probability of rolling a fair die and getting a 1 and flipping a fair coin and getting a head?
3/12
1/6
1/2
1/12


Which of the following is an example of mutually exclusive events?
Having it rain on the same day that the sun comes out in the same city
Being late to a meeting and being early to the same meeting
Having one product off the assembly line be defective, but another product on that same assembly line work properly.
Ordering a burger at a fast food restaurant and ordering fries at that same restaurant


Suppose a fair die is tossed twice. What is the probability of rolling two fours?
1/6
1/36
2/6
2/36


What is the probability of rolling a fair die and getting an even number and flipping a fair coin and getting a head?
0.75
0.5
0.25
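
The die-and-coin questions can be checked by listing every equally likely (die, coin) pair, which is a small enough space to enumerate. This is a sketch; the helper `prob` is a hypothetical name introduced here.

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)
coin = ["H", "T"]
outcomes = list(product(die, coin))  # 12 equally likely pairs


def prob(event):
    """Share of the equally likely outcomes for which `event` is true."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))


p_even_and_head = prob(lambda o: o[0] % 2 == 0 and o[1] == "H")
p_one_and_head = prob(lambda o: o[0] == 1 and o[1] == "H")

print(p_even_and_head, p_one_and_head)
```

Since the die and the coin are independent, the same results follow from multiplying the individual probabilities: 1/2 × 1/2 and 1/6 × 1/2.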
0


In a simple linear regression analysis, why do we call the best line the "least squares" regression line? 1. The line minimizes the sum of the squared errors in the data. 2. The line maximizes the squared error in the data. 3. The model uses squared predictor variables.
1 only
3 only
1 and 2 only
2 only


Within a normal curve, ___% fall within 1 standard deviation of the mean.
25%
100%
68%
0%


The variable X is the value of an uneven die after one roll. It has the following probability distribution P(X): P(1) = 0.05, P(2) = 0.28, P(3) = 0.12, P(4) = 0.23, P(5) = ?, P(6) = ?. What is the probability that X = 2 or X = 3?
0.28
0.4
0.12
0.45


The average price of a car in a used car lot is $18,000. These prices are Normally distributed with a standard deviation of $3,000. What is the probability that any random car is below $18,000?
42%
68%
50%
95%


How can you convert a standard deviation to a variance?
Take the square root of the standard deviation
Square the standard deviation
Take the cubed root of the standard deviation
Take the log of the standard deviation


When does a Type I error occur?
You reject the null hypothesis when it is true
There is no such term as a "Type I Error"
You fail to reject the null hypothesis when it is false
None of the other choices


A survey has 3 multiple choice questions on it. What type of survey is this?
closed
All of these
combination
open


Where would the outliers be if a distribution had a skewness of +50?
Far right
Right
No outliers
Left


Which of the following statements are true? 1. Categorical variables are the same as qualitative variables. 2. Categorical variables are the same as quantitative variables. 3. Quantitative variables can be continuous variables.
1 and 3 only
1 only
3 only
2 only


What is heteroskedastic?
There is no adjustment factor (e.g. epsilon)
The adjustment factor changes (e.g. epsilon)
The adjustment factor doesn't change (e.g. epsilon)


A single 6-sided die is rolled. What is the probability of rolling a prime number?
0.5
1
0.17
0.33


What is the median of the following 5 numbers? 10, 2, 4, 3, 1
10
1
4
3


If the sample mean of a dataset is 10 and the standard deviation is 6, what percent of the data would you expect to fall between 4 and 16 assuming the data distribution is normal?
68%
99.7%
81.5%
95%
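
The interval from 4 to 16 is exactly one standard deviation either side of the mean. A minimal sketch with the standard library's `statistics.NormalDist` shows the share of a normal distribution that falls in that band.

```python
from statistics import NormalDist

scores = NormalDist(mu=10, sigma=6)

# Probability mass between mean - 1 sd and mean + 1 sd.
within_one_sd = scores.cdf(16) - scores.cdf(4)

print(round(within_one_sd, 4))  # 0.6827
```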


What type of SQL join is needed when you wish to include rows that do not have matching values?
Natural join
Equi-join
Outer join
Inner join


Experts rank athletic teams 1 through 10. What type of data is this?
ordinal
discrete
categorical
quantitative


A manager of a large bank wants to compute the average interest rates across all bonds that the bank invests in. The manager randomly sampled 127 bonds that the bank invests in and calculated the average interest rate over the past year of the sample was 2.47%. What is the parameter of interest in this study?
All bonds that the bank invests in.
The average interest rate of all bonds that the bank invests in.
The 127 bonds used in the calculation.
2.47%


What is the mean of a standard normal distribution?
1
50
0
0.5


What is derived from the second moment of distribution?
Variance
Kurtosis
Skewness
Mean


Which is an example of nonmetric data?
Ordinal
Sample
Parametric
Quantitative


Which of the following statements are true about confidence intervals for means? 1. The center of the confidence interval is always 0. 2. The bigger the confidence interval, the smaller the margin of error. 3. The bigger your sample, the smaller the margin of error.
1 and 2 only
2 only
1 only
3 only


A survey was conducted to find the average weight of students living in the dorms of a university. To help improve the accuracy of the study, an equal number of students were randomly selected from each dorm for the sample. This sample is an example of what?
Standard Random Sample
Block Design
Experiment
Stratified Random Sample


As CEO of a casino, I'm trying to understand which of my customers would respond to a certain ad campaign. So I randomly select 1000 customers from each demographic, income group, and region to conduct a survey. This experiment uses which of the following sampling techniques?
Systematic
Panel sampling
Stratified sampling
Simple random


What is synonymous with the coefficient of determination?
Sigma
Mu
R
R-Squared


Which of the following is a characteristic of an F-distribution?
No lower or upper bound
Right Skewed
Symmetric
Bimodal


What is the first moment of distribution?
Skewness
Mean
Standard Deviation
Kurtosis


Compute the following infinitely nested square root: √(2 + √(2 + √(2 + …)))
2
4
2.4
3
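
If the nested radical converges to a value x, then x = √(2 + x), so x² − x − 2 = 0 and the positive root is x = 2. The sketch below checks this numerically by applying the recurrence repeatedly; the starting value and the iteration count are arbitrary choices.

```python
import math

x = 0.0
for _ in range(50):
    # Each pass adds one more level of nesting: x -> sqrt(2 + x).
    x = math.sqrt(2 + x)

print(x)
```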


Assume the coefficient on the predictor variable in the simple linear regression model is 2. What is the interpretation of this coefficient? The predictor variable will be called x and the response variable will be called y.
If x increases by one unit, then y must decrease by 2 units.
If x increases by one unit, then y must increase by 2 units.
On average, if x increases by one unit, y decreases by 2 units.
On average, if x increases by one unit, y increases by 2 units.


The median grade on a midterm exam in a math class of 60 students is 85. The teacher gives an additional 5 bonus points to the 3 students who scored the highest on the exam. What is the new median grade for the class?
Not enough information.
90
85
80


Which of the following must be true for the standard deviation of a set of observations to be 0?
All observations have the same value
All observations are 0
All observations are equally dispersed from the mean
The mean of the set is 0


If you have a hypothesis test with a significance level of 0.05 and a p-value of 0.01, what is the result of your hypothesis test?
You accept the null hypothesis
Not enough information.
You reject the null hypothesis
You fail to reject the null hypothesis


Symbolism: What does the Greek letter mu represent (without any other symbols) in statistics?
Sample mean
Population mean
Sample standard deviation
Population standard deviation


What does R^2 (R-Squared) calculate?
The coefficient of variation^2
The squared covariance
The slope of the regression
The closeness of a regression to the underlying data


As manager of your sales organization you want to compare the mean number of sales calls made per salesman in a week using two different messaging techniques. Two hundred employees from your company are randomly selected and each is randomly assigned to one of the two messaging techniques. After teaching 100 salesmen one technique and 100 salesmen the other technique, you record the number of successful calls each salesman makes in one month. Which of the following tests would you use for comparison?
Chi square test
Two sample z test
Two sample t test
Bayesian test


A multivariate regression model exhibits a highly significant F-statistic, but each predictor's individual t-statistic is insignificant. What phenomenon explains this?
Heteroskedasticity
Multicollinearity
Homoskedasticity
E & G


What is homoskedastic?
The adjustment factor changes (e.g. epsilon)
I do not know
The adjustment factor doesn't change (e.g. epsilon)
There is no adjustment factor (e.g. epsilon)


Which of the following holds true for the correlation coefficient R?
-1 <= R <= 1
0 <= R <= 1
R > = 0
- R1 >= R <= 1r


60% of the population in DC supports the Redskins, 40% support the Ravens, and 20% support both. The proportion of the populace that supports neither is:
20%
30%
50%
40%


For an airline, many times small cities have limited flights that go into their airports. To get a flight to Columbia, SC you must go through one of three cities: Raleigh, NC, Atlanta, GA, or Charlotte, NC. Two customers from Orlando, FL are trying to get to Columbia, SC with only one stop (one of the three above mentioned cities). Assume that they are equally likely to go through any of the above cities. What is the probability that neither of the customers flies through Charlotte, NC?
4/9
1/3
2/3
2/9


For an airline, many times small cities have limited flights that go into their airports. To get a flight to Columbia, SC you must go through one of three cities: Raleigh, NC, Atlanta, GA, or Charlotte, NC. Two customers from Orlando, FL are trying to get to Columbia, SC with only one stop (one of the three above mentioned cities). Assume that they are equally likely to go through any of the above cities. What is the probability that one of the customers flies through Charlotte, NC, while the other does not fly through Charlotte, NC?
2/9
4/9
2/3
1/3
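
Both airline questions can be answered by enumerating the nine possible (customer 1, customer 2) hub pairs. The sketch assumes the two customers choose independently, which the question implies but does not state outright.

```python
from fractions import Fraction
from itertools import product

hubs = ["Raleigh", "Atlanta", "Charlotte"]

# Assumption: independent, equally likely choices, so all 9 pairs are equally likely.
pairs = list(product(hubs, repeat=2))

neither = sum(1 for p in pairs if p.count("Charlotte") == 0)
exactly_one = sum(1 for p in pairs if p.count("Charlotte") == 1)

p_neither = Fraction(neither, len(pairs))
p_exactly_one = Fraction(exactly_one, len(pairs))

print(p_neither, p_exactly_one)
```

The two probabilities come out equal: (2/3)² for neither, and 2 × (1/3) × (2/3) for exactly one.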


What power is used in the formula for skewness?
1
4
3
2


What is the coefficient of variation?
Kurtosis minus standard deviation
Normalized measure of dispersion of a probability distribution
Mean^2
Standard Deviation


A card is drawn randomly from an ordinary deck of playing cards. You win a prize if the card is a heart or the card is an ace. What is the probability that you will win the prize?
13/52
1/13
17/52
16/52
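
The ace of hearts is both a heart and an ace, so it must not be counted twice. A small sketch that builds the deck and counts winning cards makes this explicit; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# A card wins if it is a heart OR an ace; the ace of hearts appears once.
winners = [card for card in deck if card[1] == "hearts" or card[0] == "A"]

p_win = Fraction(len(winners), len(deck))
print(len(winners), len(deck), p_win)
```

This matches the addition rule: 13/52 + 4/52 − 1/52.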


What is the formula for the Pearsonian Coefficient of Skewness?
3*(population mean - population mode) / Variance
3*(population mean - population mode) / Standard deviation
5*(population mean - population mode) / Standard deviation
2*(population mean - population mode) / Variance


What power is used in the formula for kurtosis?
1
4
3
2


Consider 2 tables A & B which have 10 and 6 rows respectively. A and B have one field ID, in common. ID also happens to be the primary key for both tables, and therefore cannot be null. If I did a cartesian join between them on the field ID, how many rows will my result have?
120
60
10
16


Which of the following numbers (measures of kurtosis of a distribution) would represent a leptokurtic distribution?
4
1
2
3


You appear for two tests as part of a history class. In test 1 you score 60 points out of a maximum of 100. The class mean is 50 and the standard deviation is 10. On test 2 you score 60 points out of a total of 70. The class mean is 40 and the standard deviation is 20. In which test did you do better relative to the rest of the class?
Test 1
You did equally well
Can't tell without class size
Test 2
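
Relative standing is compared with standardized scores, z = (score − mean) / standard deviation. A minimal sketch, with `z_score` as a hypothetical helper name:

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above the class mean."""
    return (score - mean) / sd


z_test1 = z_score(60, mean=50, sd=10)
z_test2 = z_score(60, mean=40, sd=20)

print(z_test1, z_test2)
```

The maximum possible score on each test does not enter the calculation; only the class mean and standard deviation do.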


The probability that a vendor is able to sell X cupcakes daily is represented by the function P(X): P(0) = 0.3, P(10) = 0.4, P(50) = 0.2, P(100) = 0.1. If each cupcake sells for $3, what's the expected daily revenue?
$46
$60
$50
$72
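
Expected revenue is the price times the expected number of cupcakes sold, where the expectation is the probability-weighted sum of the outcomes. A sketch using exact fractions (the name `demand` is illustrative):

```python
from fractions import Fraction

price = 3  # dollars per cupcake

# Daily sales distribution from the question: cupcakes sold -> probability.
demand = {
    0: Fraction("0.3"),
    10: Fraction("0.4"),
    50: Fraction("0.2"),
    100: Fraction("0.1"),
}

expected_cupcakes = sum(x * p for x, p in demand.items())
expected_revenue = price * expected_cupcakes

print(expected_cupcakes, expected_revenue)
```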


What is the expected value of rolling an unbiased die (6 sided)?
3
2.5
3.5
4.5


What is synonymous with the correlation coefficient?
Sigma
Mu
R-Squared
R


ANOVA (analysis of variance) is used to determine if a relationship exists between which two types of variables?
Continuous response variable, Continuous predictor variable
Categorical response variable, Categorical predictor variable
Categorical predictor variable, Continuous response variable
Categorical response variable, Continuous predictor variable


Which of the following is not a supervised learning algorithm?
Naïve Bayes
Self organizing map
Logistic Regression
Support Vector Machines


The variance of X is 15. Y=X+5. What is the Variance of Y?
40
15
17.5
20


Consider 3 overlapping sets A, B & C. U represents the union operation, ∩ represents the intersection, and n(A) represents the number of elements in set A. Which of the following would give you the number of elements in exactly one set?
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2n(A ∩ C) – 2n(B ∩ C)
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2 n(B ∩ C) + 3 n(A ∩ B ∩ C)
n(A) + n(B) + n(C) + 3n(A ∩ B ∩ C)
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2n(A ∩ C) – 2n(B ∩ C) + 2n(A ∩ B ∩ C)
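
A candidate formula can be tested against a direct count. In the sketch below the three sets are made-up examples; the check compares a brute-force count of elements belonging to exactly one set with the expression n(A) + n(B) + n(C) − 2n(A ∩ B) − 2n(A ∩ C) − 2n(B ∩ C) + 3n(A ∩ B ∩ C).

```python
# Hypothetical example sets with pairwise and three-way overlap.
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 8, 9}

# Direct count: elements that belong to exactly one of the three sets.
direct = sum(1 for e in A | B | C if (e in A) + (e in B) + (e in C) == 1)

# Formula: each pairwise overlap is removed twice, and the triple overlap
# (which was removed six times in total) is added back three times.
formula = (
    len(A) + len(B) + len(C)
    - 2 * len(A & B) - 2 * len(A & C) - 2 * len(B & C)
    + 3 * len(A & B & C)
)

print(direct, formula)
```

One example does not prove the identity, but an element in one, two, or three sets contributes 1, 0, and 0 to the formula respectively, which is why the counts agree.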


How many social security numbers are possible before they run out? A Social Security number is a 9 digit random number that can have repeating digits
A billion
A million
100 million
A Trillion


Which of the following numbers (measures of kurtosis of a distribution) would represent a platykurtic distribution?
2
3.5
20
10


If I tossed a fair coin 3 times, what is the probability of it landing on tails only once?
1/8
3/8
1/4
1/2
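
With three tosses there are only eight equally likely sequences, so the answer can be found by listing them. A minimal sketch:

```python
from fractions import Fraction
from itertools import product

tosses = list(product("HT", repeat=3))  # 8 sequences such as ('H', 'T', 'H')

# Keep the sequences that contain tails exactly once.
one_tail = [t for t in tosses if t.count("T") == 1]

p_one_tail = Fraction(len(one_tail), len(tosses))
print(one_tail, p_one_tail)
```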


Consider 2 tables A & B which have 10 and 6 rows respectively. A and B have one field ID, in common. ID also happens to be the primary key for both tables, and therefore cannot be null. How many rows will A right outer join B on the field ID yield?
4
6
10
60


What kind of statistics primarily uses Chi-square?
parametric
non-parametric
none of these
descriptive


What is the cumulative probability at -1 standard deviation?
50%
84.13%
15.87%
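
Assuming the question refers to a normal distribution, the cumulative probability one standard deviation below the mean can be read from the standard normal CDF. A sketch with `statistics.NormalDist`:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

p_below_minus_one = standard_normal.cdf(-1)
print(round(p_below_minus_one * 100, 2))  # 15.87
```

This is consistent with the 68% rule: about 32% lies outside one standard deviation, split evenly between the two tails.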
34.13%


Consider 2 tables A & B which have 10 and 6 rows respectively. A and B have one field ID, in common. ID also happens to be the primary key for both tables, and therefore cannot be null. How many rows will A left outer join B on the field ID, yield?
10
6
4
60
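
The row counts in the join questions can be reproduced with an in-memory SQLite database. The ID values below are invented for illustration (the question does not say which IDs overlap); because ID is a primary key in both tables, the outer-join counts do not depend on that choice. The right outer join is written as a left join with the tables swapped, since older SQLite versions lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (ID INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE B (ID INTEGER PRIMARY KEY)")

# Hypothetical IDs: A has 1..10, B has 8..13, so three IDs overlap.
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(1, 11)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(8, 14)])


def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]


left_rows = count("SELECT COUNT(*) FROM A LEFT OUTER JOIN B ON A.ID = B.ID")
# Equivalent to A RIGHT OUTER JOIN B: every row of B is kept.
right_rows = count("SELECT COUNT(*) FROM B LEFT OUTER JOIN A ON A.ID = B.ID")
cross_rows = count("SELECT COUNT(*) FROM A CROSS JOIN B")

print(left_rows, right_rows, cross_rows)
```

Each row of the preserved table matches at most one row on the other side, so a left join keeps exactly the 10 rows of A, a right join keeps exactly the 6 rows of B, and a Cartesian join pairs every row with every row.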


Which of the following is not an unsupervised machine learning algorithm?
Blind signal separation
Linear Regression
Clustering
Principal component analysis


What is the kurtosis of a normal distribution?
3
1
0
2


Which of the following situations would entail the use of dummy variables?
When qualitative predictors have to be modeled
When categorical predictors have to be modeled
When residual analysis needs to be performed
D& E


Which of the following databases should you not normalize?
Read-only database
Updateable database
Transactional database
D & G


Which of the following is not the characteristic of a data warehouse?
Non volatile
Time invariant
Capable of integrating data from a variety of sources
Subject oriented


Which of the following can a de-identified patient database contain?
Social security numbers
Patient expenses
All of these
Patient addresses


Which of the following correlation coefficients indicates the strongest correlation?
-0.7
1.5
0.66
2.5


Consider the following set of data points: 7, 5, 7, 4, 5, 6, 5, 5, 10. What does (Mean - Median + Mode) compute to?
5
0
3
6
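
The three summary statistics can be computed in one pass with the standard `statistics` module. A minimal sketch:

```python
from statistics import mean, median, mode

points = [7, 5, 7, 4, 5, 6, 5, 5, 10]

m = mean(points)      # 54 / 9
med = median(points)  # middle of the sorted values
mo = mode(points)     # most frequent value

result = m - med + mo
print(m, med, mo, result)
```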


Consider a table "Team" with the columns: {Teamname, Location, State, Wins, Losses, NetPoints}. If there are 10 rows in "Team" - one row for each team, what's the SQL code to retrieve the team(s) that scored the most points?
select Teamname from team where netpoints = max(netpoints)
select Teamname from team where netpoints = (select max(netpoints) from team)
select Teamname from team where count(*) = max(netpoints)
select Teamname from team where count(*) = max(count(netpoints))
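
The subquery pattern can be tried against an in-memory SQLite table. The team names and point totals below are invented for illustration, and only the two columns the query needs are created. Comparing NetPoints with a scalar subquery returns every team tied for the maximum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Team (Teamname TEXT, NetPoints INTEGER)")

# Hypothetical data with a tie for first place.
conn.executemany(
    "INSERT INTO Team VALUES (?, ?)",
    [("Ants", 120), ("Bees", 150), ("Cats", 150), ("Dogs", 90)],
)

rows = conn.execute(
    """
    SELECT Teamname
    FROM Team
    WHERE NetPoints = (SELECT MAX(NetPoints) FROM Team)
    ORDER BY Teamname
    """
).fetchall()

top_teams = [r[0] for r in rows]
print(top_teams)
```

The aggregate has to live in a subquery because a WHERE clause is evaluated row by row, before any aggregation takes place.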


Consider a table "Team" with the columns: {Teamname, Location, State, Wins, Losses, NetPoints}. There are 10 rows in "Team" - one row for each team. You've been given another table "Game" with one row for each game played in the regular season (Assume all teams play every other team only once). How many rows will "Game" have?
45
65
35
60
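
If every team plays every other team exactly once, each game corresponds to an unordered pair of teams, so the row count is "10 choose 2". A sketch that computes it two ways (the team labels are placeholders):

```python
from itertools import combinations
from math import comb

teams = [f"team_{i}" for i in range(1, 11)]  # 10 placeholder team names

games = list(combinations(teams, 2))  # one entry per unordered pair

print(len(games), comb(10, 2))
```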


Consider 3 overlapping sets A, B & C. U represents the union operation, ∩ represents the intersection, and n(A) represents the number of elements in set A. Which of the following would give you the number of elements in two or more sets?
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2 n(B ∩ C) + 2 n(A ∩ B ∩ C)
n(A ∩B ∩ C )
n(A) + n(B) + n(C) + 2 n(A ∩B ∩ C )
n(A) + n(B) + n(C) – 2 n (A ∩ B) – 2 n(A ∩ C) – 2 n(B ∩ C) + 3 n(A ∩ B ∩ C)


The variance of X is 3. Y=3X. What is the variance of Y?
1
3
27
9
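
The two variance questions rest on the identity Var(aX + b) = a² · Var(X): adding a constant leaves the variance unchanged, and multiplying by a scales it by a². The sketch below illustrates this on a made-up sample; the data are an assumption chosen only so the numbers come out round.

```python
from statistics import pvariance

# Hypothetical sample; its population variance happens to be 4.
x = [2, 4, 4, 4, 5, 5, 7, 9]

var_x = pvariance(x)
var_shifted = pvariance([v + 5 for v in x])  # Y = X + 5
var_scaled = pvariance([3 * v for v in x])   # Y = 3X

print(var_x, var_shifted, var_scaled)
```

Applied to the questions: with Var(X) = 15 and Y = X + 5 the variance stays 15, and with Var(X) = 3 and Y = 3X it becomes 9 × 3 = 27.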


Which formula can determine sample size?
Slovin's
Nash's
Tetrahedral's
Moore's


In a star schema, what's the cardinality of the relationship between a fact table and its dimensions?
Many to Many
One to many
All of these
Many to one


What type of SQL join would return rows that have matching values?
All of these
Equi-join
Natural join
Outer join


Suppose a fair die is tossed three times. What is the probability of rolling exactly 2 fours?
1/36
2/36
1/12
1/6


What topology does a back propagation neural network use?
All of these
Feed backward
Feed forward
Feed either


Consider a table "Team" with the columns : {Teamname, Location, State, Wins, Losses, NetPoints}. There are 10 rows in "Team" - one row for each team. You've been given another table "Game" with one row for each game played in the regular season (Assume all teams play every other team only once). Now you've been asked to add playoff data to "Game". In the playoffs, the top 5 teams of the regular season play again between themselves, followed by semi finals and a final. How many more rows would you expect as a result of adding playoff data?
15
10
9
13


An unfair coin is flipped 4 times and has a probability of .6517 of getting 3 or 4 heads. What is the probability of getting heads on a single flip?
.8
.85
.6
.75
.7
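
For a coin with heads probability p, the chance of 3 or 4 heads in 4 flips is 4p³(1 − p) + p⁴. Rather than solving that equation, the sketch below evaluates it for each answer choice and picks the one closest to .6517; the function name is a hypothetical helper.

```python
from math import comb


def p_three_or_four_heads(p):
    """Binomial probability of 3 or 4 heads in 4 independent flips."""
    return sum(comb(4, k) * p**k * (1 - p) ** (4 - k) for k in (3, 4))


candidates = [0.8, 0.85, 0.6, 0.75, 0.7]
target = 0.6517

for p in candidates:
    print(p, round(p_three_or_four_heads(p), 4))

best = min(candidates, key=lambda p: abs(p_three_or_four_heads(p) - target))
print("closest match:", best)
```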