Data Science for Non-Programmers Educative Quiz Answers

Ready to move past Excel for complex business analysis? Then you’ll find this course very helpful.

This hands-on introductory Data Science course is aimed at professionals and students who don’t have any experience with programming. It will help you advance your career by preparing you to conduct meaningful data analysis in Python on any dataset — large or small.

You’ll begin with the fundamentals of Python, with a focus on working with CSV files, and cover concepts like data preprocessing and Exploratory Data Analysis (EDA). In the second half, you’ll focus on predictive and inferential analysis using statistical and machine learning techniques, and learn how these techniques can help solve business problems.

Exercise 1: Average of a List

Answer:

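def average(input_list):
    # Accumulate the running total of all elements
    sum_list = 0
    for i in input_list:
        sum_list = sum_list + i

    # Divide by the element count (raises ZeroDivisionError on an empty list)
    avg = sum_list / len(input_list)
    return avg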
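
The same computation can be written more compactly with Python's built-in sum. This is only a sketch, not the course's answer: the name average_builtin and the choice to return None for an empty list are assumptions made for illustration.

```python
def average_builtin(input_list):
    # Assumption: return None for an empty list instead of raising ZeroDivisionError
    if not input_list:
        return None
    # sum() adds up the elements; len() counts them
    return sum(input_list) / len(input_list)


print(average_builtin([2, 4, 6]))
print(average_builtin([]))
```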
    


Exercise 2: Factorial of a Number

Answer:

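def factorial(n):
    # By definition, 0! and 1! are both 1
    if n == 0 or n == 1:
        return 1
    # Factorial is undefined for negative numbers; -1 signals invalid input
    if n < 1:
        return -1

    # Multiply n * (n-1) * ... * 2
    product = 1
    while n > 1:
        product = product * n
        n = n - 1

    return product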
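
To check a hand-written factorial, the standard library's math.factorial can serve as a reference. The short sketch below only shows that reference function; note that it raises ValueError for negative input, whereas the answer above returns -1.

```python
import math

# Reference values for 0! through 5! from the standard library
values = [math.factorial(n) for n in range(6)]
print(values)

# math.factorial rejects negative numbers with an exception
try:
    math.factorial(-3)
except ValueError:
    print("negative input is rejected")
```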

Quiz 1:

Q1. A Dataframe is a 2-Dimensional object to store tabular data.

  • True
  • False

Q2. Suppose we have a Gender column in our dataframe (df) which has the values Male and Female. Which of these will give us a dataframe filtered to only the males? Select all answers you think are correct.

  • Option 1
df = df['Male']
  • Option 2
df = df[df['Male']]
  • Option 3
condition = df['Gender'] == 'Male'
df = df[condition]
  • Option 4
df = df[ df['Gender'] == 'Male']
  • Option 5
condition = df['Gender'] != 'Female'
df = df[condition]
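
The difference between indexing by a column name and indexing by a boolean condition can be seen on a small example. This is a minimal sketch on made-up data (the names and values are hypothetical), not the course's dataset.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "Name": ["Ali", "Sara", "Omar", "Lina"],
    "Gender": ["Male", "Female", "Male", "Female"],
})

# Comparing a column to a value yields a boolean Series (one True/False per row)
condition = df["Gender"] == "Male"

# Indexing with that boolean Series keeps only the rows where it is True
males = df[condition]
print(males)

# df["Male"] looks up a COLUMN named "Male", which does not exist here
try:
    df["Male"]
except KeyError:
    print("KeyError: no column named 'Male'")
```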

Q3. Which of these can be used to set the value of the first cell in the Age column to 23 if Age is the first column in the dataset? Select all answers you think are correct.

  • Option 1
df[0,'Age'] = 23
  • Option 2
df.loc[0,'Age'] = 23
  • Option 3
df.iloc[0,'Age'] = 23
  • Option 4
df.iloc[0,0] = 23
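
The options above hinge on the difference between label-based indexing (.loc) and position-based indexing (.iloc). The sketch below uses a small hypothetical dataframe with the default integer index, where Age is the first column.

```python
import pandas as pd

# Hypothetical data; 'Age' is the first column and the index is the default 0..n-1
df = pd.DataFrame({"Age": [30, 41, 25], "City": ["A", "B", "C"]})

# .loc is label-based: row label 0, column label 'Age'
df.loc[0, "Age"] = 23

# .iloc is position-based: second row, first column
df.iloc[1, 0] = 24

# .iloc does not accept column labels, so reading with one fails
try:
    df.iloc[0, "Age"]
except (ValueError, TypeError, IndexError) as err:
    print("iloc with a label failed:", type(err).__name__)

print(df)
```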

Q4. Which of the following are aggregation functions, i.e., functions that take in a series and return a single value? Select all answers you think are correct.

  • min
  • mean
  • sum
  • groupby
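
To make the distinction concrete, the sketch below calls a few aggregation functions on a small hypothetical series and contrasts them with groupby, which returns a grouped object that still needs an aggregation applied to it.

```python
import pandas as pd

# Hypothetical values for illustration only
ages = pd.Series([20, 30, 40])

# Each of these reduces the whole series to a single value
lowest = ages.min()
average = ages.mean()
total = ages.sum()
print(lowest, average, total)

# groupby only splits the rows into groups; it does not return a single value
df = pd.DataFrame({"Team": ["x", "x", "y"], "Score": [1, 3, 5]})
grouped = df.groupby("Team")
print(type(grouped).__name__)

# An aggregation applied to the groups yields one value per group
team_means = grouped["Score"].mean()
print(team_means)
```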

Q5. The apply function is used to apply custom functions to the data.

  • True
  • False

Q6. We can NOT group data for more than one variable.

  • True
  • False

Q7. Both groupby and pivot_table are used for summarizing data.

  • True
  • False
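
As an illustration of the two questions above, the sketch below groups a small hypothetical sales table by two variables at once and then builds the same summary with pivot_table. The column names Region, Product, and Revenue are made up for this example.

```python
import pandas as pd

# Hypothetical sales data for illustration only
sales = pd.DataFrame({
    "Region": ["East", "East", "West", "West", "East"],
    "Product": ["A", "B", "A", "B", "A"],
    "Revenue": [100, 150, 200, 50, 300],
})

# Passing a list of columns groups by more than one variable
by_both = sales.groupby(["Region", "Product"])["Revenue"].sum()
print(by_both)

# pivot_table produces the same totals laid out as a Region x Product grid
table = sales.pivot_table(index="Region", columns="Product",
                          values="Revenue", aggfunc="sum")
print(table)
```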

Q8.

df.plot(kind = 'box',subplots = True, sharex=False, sharey = False)

In the above use of the plot function, subplots=True tells the function to arrange all boxplots in rows and columns inside a group of plots.

  • True
  • False

Exercise 3: Cleaning NYC Property Sales

Change values:

Answer:

def change_values(df):
    
    condition = df['BOROUGH'] == 1
    df.loc[condition,'BOROUGH'] = 'Manhattan'

    condition = df['BOROUGH'] == 2
    df.loc[condition,'BOROUGH'] = 'Bronx'

    condition = df['BOROUGH'] == 3
    df.loc[condition,'BOROUGH'] = 'Brooklyn'

    condition = df['BOROUGH'] == 4
    df.loc[condition,'BOROUGH'] = 'Queens'

    condition = df['BOROUGH'] == 5
    df.loc[condition,'BOROUGH'] = 'Staten Island'
    
    return df
        


Missing values:

Answer:

def remove_missing(df):
    present = df['SALE PRICE'].notnull()
    df = df[present]
    return df

Duplicate values:

Answer:

def remove_duplicates(df):
    df = df.drop_duplicates(subset=df.columns)
    return df

Outliers:

Answer:

def remove_outliers(df):
    # Retrieve only the numeric columns that are checked for outliers
    new_df = df[['RESIDENTIAL UNITS', 'COMMERCIAL UNITS', 'TOTAL UNITS', 'LAND SQUARE FEET', 'GROSS SQUARE FEET', 'YEAR BUILT']]

    # Find the bounds with an IQR-style rule; note that this answer uses the
    # 10th and 90th percentiles rather than the usual 25th and 75th
    Q1 = new_df.quantile(0.10)
    Q3 = new_df.quantile(0.90)
    IQR = Q3 - Q1
    minimum = Q1 - 1.5 * IQR
    maximum = Q3 + 1.5 * IQR

    # A row passes only if every checked column lies within its bounds
    condition = (new_df <= maximum) & (new_df >= minimum)
    condition = condition.all(axis=1)

    # Keep only the rows that have no outliers
    df = df[condition]

    return df
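
For comparison, the conventional IQR rule uses the 25th and 75th percentiles, whereas the answer above uses the 10th and 90th. The sketch below applies the conventional rule to a single hypothetical series; it is an illustration only and not a replacement for the exercise answer.

```python
import pandas as pd

# Hypothetical values with one obvious outlier
values = pd.Series([10, 12, 11, 13, 12, 11, 500])

# Conventional quartiles and interquartile range
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Anything beyond 1.5 * IQR from the quartiles is treated as an outlier
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
kept = values[(values >= lower) & (values <= upper)]
print(kept.tolist())
```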

Quiz 2: Analyzing Individual Quantities

Q1. What is the mean of LIMIT_BAL?

  • 176488
  • 160000
  • 167488
  • 170000

Q2. How many LIMIT_BAL values fall in the interval (100000.0, 200000.0]?

  • 7882
  • 5061
  • 2054

Q3. What is the 75th percentile of LIMIT_BAL?

  • 50000
  • 140000
  • 240000

Q4. What is the skew value of LIMIT_BAL?

  • 0.50
  • 1.99
  • 2.53
  • 0.99
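
Questions like these can be answered with a handful of pandas calls. The sketch below runs them on a tiny made-up series, so its numbers are not the quiz answers; on the real data the same calls would be applied to the LIMIT_BAL column, and the bin edges here are an assumption.

```python
import pandas as pd

# Hypothetical credit limits; NOT the course dataset
limit_bal = pd.Series([50000, 120000, 200000, 80000, 150000, 360000],
                      name="LIMIT_BAL")

# Mean and 75th percentile
mean_value = limit_bal.mean()
p75 = limit_bal.quantile(0.75)

# Count values per interval; pd.cut builds right-closed bins such as (100000, 200000]
bins = [0, 100000, 200000, 300000, 400000]
counts = pd.cut(limit_bal, bins=bins).value_counts().sort_index()

# Skewness of the distribution
skew_value = limit_bal.skew()

print(mean_value, p75, skew_value)
print(counts)
```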

Quiz 3: Exploring Categorical Quantities

Q1. How many married persons have defaulted in our dataset?

  • 10455
  • 5209
  • 3206
  • 3342

Q2. How many single persons have NOT defaulted in our dataset?

  • 12628
  • 5206
  • 3342
  • 10455

Q3. What is the probability of a married person defaulting next month?

  • 0.24
  • 0.23
  • 0.20
  • 0.21

Q4. A single person is more likely to default the next month than a married person in our dataset.

  • True
  • False
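
Counts and probabilities of this kind can be read from a cross-tabulation. The sketch below uses a small hypothetical dataframe; the column names MARRIAGE and DEFAULT and their values are assumptions for illustration, so the output does not match the quiz answers.

```python
import pandas as pd

# Hypothetical data; 1 means the person defaulted, 0 means they did not
df = pd.DataFrame({
    "MARRIAGE": ["married", "single", "married", "single", "married", "single"],
    "DEFAULT": [1, 0, 0, 0, 1, 1],
})

# Raw counts of each marital status against each default outcome
counts = pd.crosstab(df["MARRIAGE"], df["DEFAULT"])
print(counts)

# normalize="index" turns each row into proportions, i.e. P(default | status)
probs = pd.crosstab(df["MARRIAGE"], df["DEFAULT"], normalize="index")
print(probs)
```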

Quiz 4: Exploring Numerical Quantities

Q1. How many people lie in the interval (0, 100000] of LIMIT_BAL who have defaulted?

  • 3684
  • 8817
  • 3454

Q2. What is the probability of defaulting for people whose LIMIT_BAL lies in the interval (100000, 200000]?

  • 0.24
  • 0.13
  • 0.19
  • 0.34

Q3. As the LIMIT_BAL given to a person increases, the probability of the person defaulting decreases.

  • True
  • False
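
For a numerical quantity, one way to get a default probability per interval is to bin the values and average a 0/1 default flag within each bin. The sketch below does this on hypothetical data; the column name DEFAULT and the bin edges are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; 1 means the person defaulted
df = pd.DataFrame({
    "LIMIT_BAL": [50000, 80000, 120000, 150000, 250000, 300000],
    "DEFAULT": [1, 1, 1, 0, 0, 0],
})

# Assign each row to a right-closed interval such as (0, 100000]
bins = [0, 100000, 200000, 300000]
intervals = pd.cut(df["LIMIT_BAL"], bins=bins)

# The mean of a 0/1 flag within a bin is the share of defaulters in that bin
rates = df.groupby(intervals, observed=True)["DEFAULT"].mean()
print(rates)
```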

Exercise 4: Exploring E-Commerce

Answer:

def exercise_1(df):
  temp = df.groupby('CustomerID').size()
  temp = temp.sort_values(ascending=False)
  temp = temp.iloc[:5]
  return temp

def exercise_2(df):
  # Select the column before summing so non-numeric columns are not aggregated
  temp = df.groupby('CustomerID')['AmountSpent'].sum()
  temp = temp.sort_values(ascending=False)
  temp = temp.iloc[:5]
  return temp

def exercise_3(df):
  temp = df.groupby('Country').size()
  temp = temp.sort_values(ascending=False)
  temp = temp.iloc[:5]
  return temp
    
def exercise_4(df):
  condition = df['PurchaseYear'] == 2011
  temp = df[condition]
  temp = temp.groupby('PurchaseMonth').size()
  return temp

def exercise_5(df):
  # Select the column before summing so non-numeric columns are not aggregated
  temp = df.groupby('Description')['Quantity'].sum()
  temp = temp.sort_values(ascending=False)
  temp = temp.iloc[:10]
  return temp 


Exercise 5: Churn Prediction

Answer:

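from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def churn_predict_acc(X, Y, test_inputs, test_outputs):
    # Fit a logistic regression classifier on the training data
    lr = LogisticRegression()
    lr.fit(X, Y)

    # Predict on the held-out inputs and score against the true labels
    preds = lr.predict(test_inputs)
    acc = accuracy_score(y_true=test_outputs, y_pred=preds)
    return acc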


Quiz 5: Machine Learning in Python

Q1. Artificial Intelligence is a subdomain of Machine Learning.

  • True
  • False

Q2. Decision Trees capture nonlinear relationships between variables.

  • True
  • False

Q3. Linear Regression models can NOT capture nonlinear relationships.

  • True
  • False

Q4. Out of the following algorithms:

  1. Decision Trees
  2. Support Vector Machines

Which performs better?

  • Decision Trees
  • Support Vector Machines
  • Depends on the problem and the dataset

Q5. Random Forest is a boosting algorithm.

  • True
  • False

Q6. In bagging, individual models train on data that is sampled _____.

  • without replacement
  • with replacement
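
The difference between the two sampling schemes can be shown with the standard library alone. In the sketch below, random.choices draws with replacement (the same item may be picked more than once), while random.sample draws without replacement. The data and seed are arbitrary choices for illustration.

```python
import random

random.seed(0)  # arbitrary seed so repeated runs draw the same samples
data = list(range(10))

# With replacement: items can repeat and some may be left out entirely
bootstrap = random.choices(data, k=len(data))

# Without replacement: every item appears exactly once (a shuffled copy)
subsample = random.sample(data, k=len(data))

print(bootstrap)
print(subsample)
```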

Q7. Which of the following algorithms can be used for unsupervised learning? Check all answers that you think are correct.

  • SVMs
  • KMeans
  • Mean Shift
  • Random Forests
  • AdaBoost

Q8. PCA is used for

  • clustering
  • dimensionality reduction
  • none of these

Q9.

km = KMeans(n_clusters = 2)
km.fit(data)
result = km.predict(data)

In the above code, what is being stored in result?

  • The cluster centers
  • The cluster numbers to which each observation in data belongs
  • None of these
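
Running the snippet from the question on a small hypothetical dataset shows what predict returns compared with the fitted cluster centers. The data points and the n_init and random_state settings below are assumptions added to make the sketch self-contained and repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2-D points forming two well-separated groups
data = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.8, 8.1],
])

km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(data)

# predict returns one cluster number per row of data
result = km.predict(data)
print(result)

# The cluster centers are stored separately on the fitted model
print(km.cluster_centers_)
```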

Q10. Clustering can NOT be used to segment customer groups.

  • True
  • False

Conclusion:

I hope these Data Science for Non-Programmers Educative quiz answers help you learn something new. If they helped you, don’t forget to bookmark our site for more coding solutions.

This course is intended for audiences of all experience levels who are interested in learning about Data Science in a business context; there are no prerequisites.

Keep Learning!
