Master the skills that can get you a $100K+ salary even if you bunked your statistics classes.

No need to waste hours browsing from one article to the next, piecing together the information you need to grasp important topics, or to get overwhelmed by information overload. Find easy-to-follow, hands-on, and fun explanations of all the essential topics in one place so you can quickly and efficiently learn what you need to thrive as a data scientist.

“Is this course right for me?” Continue to read to decide for yourself!

– “I want to understand this data science concept. Let me Google it.” Then, after hours of surfing, reading random articles, and invoking the heavens, you are more confused than before.

– “Data science is the sexiest and highest paying job of the 21st century. I want to become a data scientist too”.

– “I have a basic knowledge of Python, willingness to learn, and commitment to become a great data scientist.”

Is that you? If yes, you are at the right place.

Answer:

`Z = np.zeros(10)`

Answer:

`arr = np.arange(10)`

Answer:

`arr = np.random.random((3,3,3))`

Answer:

```
import numpy as np

arr4 = np.random.random((10, 10))  # 10x10 array of uniform values in [0, 1)
min_val = arr4.min()
max_val = arr4.max()
```

Answer:

`grid = np.arange(1, 10).reshape((3, 3))`

Answer:

```
arr6 = np.arange(10)
arr6[arr6.argmax()] = -1
```

Answer:

```
# Input
arr7 = np.arange(9).reshape(3,3)
# Solution
arr7 = arr7[::-1]
```

Answer:

```
arr8 = np.random.rand(3, 10)
transformed_arr8 = arr8 - arr8.mean(axis=1, keepdims=True)
```

Answer:

```
import pandas as pd
# `data` and `labels` come from the exercise prompt, which is not shown here
df = pd.DataFrame(data, index=labels)
print(df)
```

Answer:

```
row = df.iloc[0] # or df.loc['AG']
col = df['Listeners']
```

Answer:

`pop_artists = df[df['Genre'] == "Pop" ]`

Answer:

`top_pop = df[((df['Genre'] == "Pop") & (df['Listeners'] > 2000000))]`

Answer:

`grouped = df.groupby('Genre').sum()`

Q1. Suppose we ran an experiment to study the chemical reaction of two substances, A and B. We measured the quantities of A and B, and we also observed the temperature and color of the product resulting from the reaction. Now we want to visualize all four of these variables in a meaningful way. Which of the following plots would be a good choice to get a good overall picture of our experiment?

- **Scatter Plot**
- Box Plot
- Histogram

Q2. One of the big shops in your city has increased its product prices. Which of the plots would be a good choice to show the sales trend against product prices for the past 5 months?

- Histogram
- **Line Plot**
- Bar Plot

Q3. Suppose you want to conduct a survey to find the kind of sports your friends like best. Which of the following plots could you use to visualize this best?

- **Bar Plot**
- Box Plot
- Histogram

Q4. Suppose we have IELTS test scores and we want to find the following information:

- highest score
- average score
- lowest score
- the interquartile range

Which of the following plots should we use?

- Bar Plot
- **Box Plot**
- Line Plot
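The values listed in Q4 can also be computed directly, which is a handy way to check what a box plot is drawing. Here is a minimal sketch; the `scores` array is made up for illustration and is not part of the quiz. Keep in mind that a standard box plot marks the median rather than the average, so the mean usually has to be shown explicitly (for example with matplotlib's `showmeans=True`).

```python
import numpy as np

# Hypothetical IELTS band scores, made up for illustration
scores = np.array([5.0, 5.5, 6.0, 6.5, 6.5, 7.0, 7.5, 8.0, 8.5])

lowest = scores.min()    # end of the lower whisker (if there are no outliers)
highest = scores.max()   # end of the upper whisker (if there are no outliers)
average = scores.mean()  # not drawn by default in most box plots

# Quartiles: bottom edge of the box, middle line, top edge of the box
q1, median, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1            # interquartile range: the height of the box

print(lowest, q1, median, q3, highest, iqr)
```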

Answer:

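```
# `data` is the dataset given in the exercise
mean = np.mean(data)
median = np.median(data)
standard_deviation = np.std(data)  # population std (ddof=0); pass ddof=1 for the sample std
```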

Q1. The HR committee of a company wants to determine the average number of employees per team in their company. There are 50 teams in the company. They divide the total number of employees by 50 and determine that the average number of employees per team is 4.2. Which of the following must be true?

- **There are a total of 210 employees in the company.**
- The most common number of employees per team is 4.2.
- Half of the teams have more than 4 employees.
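To see why only the total is guaranteed, here is a small sketch. The team sizes are hypothetical, chosen only so that 50 teams average 4.2 employees; the quiz does not give the real distribution.

```python
from statistics import mean, median, multimode

teams = 50
average_per_team = 4.2

# The mean is total / count, so the total must be count * mean
total_employees = round(teams * average_per_team)

# Hypothetical distribution: 40 teams of 3 people and 10 teams of 9 people
team_sizes = [3] * 40 + [9] * 10

print(total_employees)        # total implied by the average
print(sum(team_sizes))        # same total for this made-up distribution
print(mean(team_sizes))       # the average is still 4.2
print(multimode(team_sizes))  # most common team size
print(median(team_sizes))     # middle team size
```

In this made-up company the most common team size and the median are both 3, so neither of the other two options has to hold, even though the average is 4.2.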

Q1. There is a 3% chance that women who smoke will have a mean pregnancy duration of 266 days.

- Valid
- **Invalid**

Q2. If smoking has no effect on the duration of pregnancy, there is a 3% random chance that a sample of 40 smokers will have pregnancies lasting less than 260 days.

- **Valid**
- Invalid

Q1. Suppose you have a record of the number of rainy days in October for the last 20 years. What is the best model to estimate the number of rainy days for the current October?

- **Linear Regression**
- Logistic Regression
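As a rough illustration of the rainy-days question, the sketch below fits a straight line to 20 yearly counts and extrapolates one year ahead. The counts are made up, and `np.polyfit` with `deg=1` is just one simple way to do an ordinary least-squares line fit.

```python
import numpy as np

# Hypothetical counts of rainy October days for the last 20 years (made up)
rainy_days = np.array([12, 9, 11, 14, 10, 13, 12, 15, 11, 13,
                       14, 12, 16, 13, 15, 14, 17, 13, 16, 15], dtype=float)
years = np.arange(len(rainy_days))  # 0 = oldest year, 19 = last year

# Fit rainy_days ~ slope * year + intercept by ordinary least squares
slope, intercept = np.polyfit(years, rainy_days, deg=1)

# Estimate for the current October (year index 20)
estimate = slope * len(rainy_days) + intercept
print(round(estimate, 1))
```

The output is a continuous number rather than a category, which is what makes this a regression task.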

Q2. To train a model that distinguishes cats from dogs, which of the following regressions would be used?

- Linear Regression
- **Logistic Regression**

Q3. Which of the following models supports both regression and classification?

- **Decision Trees**
- Linear Regression
- K-Nearest Neighbor

Q4. Suppose I am building a search application and I’m looking for a model to find items similar to my new item. Which model would you suggest I use?

- Decision Tree
- Support Vector Machine
- **K Nearest Neighbor**
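The idea behind nearest-neighbor search can be sketched in a few lines. This is a brute-force version using Euclidean distance; the `nearest_neighbors` helper, the `catalog` vectors, and `new_item` are all hypothetical names and data made up for illustration.

```python
import numpy as np

def nearest_neighbors(items, query, k=3):
    """Return the indices of the k rows of `items` closest to `query`."""
    # Euclidean distance from the query to every item
    distances = np.linalg.norm(items - query, axis=1)
    # Indices of the k smallest distances, closest first
    return np.argsort(distances)[:k]

# Hypothetical items described by two numeric features each
catalog = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [0.9, 1.1], [6.0, 5.5]])
new_item = np.array([1.0, 1.05])

similar = nearest_neighbors(catalog, new_item, k=3)
print(similar)
```

A real search application would likely use an index structure or an approximate method instead of scanning every item, but the notion of "similar" is the same.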

Q5. Suppose you have unlabeled training data and you have to group data points of similar kind. Which is the best algorithm below that would solve your problem?

- Decision Tree
- K Nearest Neighbor
- **K Means**
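For intuition, here is a minimal sketch of the k-means loop (Lloyd's algorithm) on made-up points. The `k_means` function is a hypothetical helper, and using the first `k` points as starting centroids is a simplification to keep the run deterministic; library implementations usually pick the starting centroids randomly.

```python
import numpy as np

def k_means(points, k, n_iter=10):
    """Minimal k-means sketch; returns (labels, centroids)."""
    # Simplification: start from the first k points (copied as floats)
    centroids = points[:k].astype(float)
    for _ in range(n_iter):
        # Step 1: assign every point to its nearest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: move each centroid to the mean of its assigned points
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Hypothetical unlabeled 2-D points forming two obvious groups
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No labels are passed in at any point: the grouping comes only from the distances between points, which is why k-means fits the unlabeled case.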

Q6. Dimensionality reduction means to:

- **reduce the number of variables in your dataset by obtaining a set of principal variables**
- reduce the size of the data by stripping rows from the dataset
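One common way to obtain such principal variables is principal component analysis (PCA). The sketch below does it with a plain SVD; the `pca_reduce` helper and the data are hypothetical, and PCA is only one of several dimensionality reduction techniques.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    # Center each feature so the components describe variance around the mean
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Hypothetical data: 5 samples with 3 strongly correlated features
X = np.array([[1.0, 2.0, 3.1],
              [2.0, 4.1, 6.0],
              [3.0, 6.0, 9.2],
              [4.0, 8.2, 12.0],
              [5.0, 9.9, 15.1]])

X_reduced = pca_reduce(X, n_components=1)
print(X.shape, "->", X_reduced.shape)
```

The number of rows stays the same and only the number of columns shrinks, which matches the correct option above.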

Q7. The _______ of a dataset represents the number of features in the dataset.

- Resolution
- Density
- **Dimensionality**
- Coarseness

Q8. Suppose you are working on stock market prediction. Typically tens of millions of shares of Microsoft stock are traded each day. You would like to predict the number of Microsoft shares that will be traded tomorrow. Would you treat this as a classification or a regression problem?

- **Regression**
- Classification

Q9. The “regression” technique is a group of algorithms used for:

- **predicting a continuous value, for example, predicting the price of a house based on its characteristics.**
- predicting a class/category, for example, whether a cell is benign or malignant, or whether a customer will buy your product.
- finding items/events that often occur together, for example grocery items that are usually bought together by a customer.

Q10. Which of the following is NOT TRUE about Machine Learning?

- Machine learning was inspired by the human learning process.
- Machine learning models iteratively learn from data and allow computers to find hidden insights.
- Machine learning models help us with tasks such as object recognition, summarization, and recommendation.
- **Machine learning gives computers the ability to make decisions by writing down rules and methods and being explicitly programmed.**

Q1. What are the recall, specificity, and precision for the confusion matrix below?

- **Recall = 20%, Specificity = 30%, Precision = 22%**
- Recall = 30%, Specificity = 20%, Precision = 22%
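For reference, the three metrics follow directly from the four cells of a binary confusion matrix. The sketch below uses hypothetical counts, not the values from the quiz's matrix, and the `confusion_metrics` helper is a made-up name.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (recall, specificity, precision) from confusion-matrix counts."""
    recall = tp / (tp + fn)        # share of actual positives that were found
    specificity = tn / (tn + fp)   # share of actual negatives that were found
    precision = tp / (tp + fp)     # share of predicted positives that were right
    return recall, specificity, precision

# Hypothetical counts, made up for illustration
recall, specificity, precision = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
print(f"Recall = {recall:.0%}, Specificity = {specificity:.0%}, Precision = {precision:.0%}")
```

Plugging in the counts from any confusion matrix gives the three percentages to compare against the answer options.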

I hope these Grokking Data Science Educative quiz answers help you learn something new. If they helped, don’t forget to bookmark our site for more coding solutions.

This course is intended for audiences of all experience levels who are interested in learning about data science in a business context; there are no prerequisites.

Keep Learning!
