If you’re a software engineer looking to add Machine Learning to your skillset, this is the place to start.

This course will teach you to write useful code and create impactful Machine Learning applications immediately. From the start, you’ll be given all the tools that you need to create industry-level machine learning projects. Rather than reading through dense theory, you’ll learn practical skills and gain actionable insights. Topics covered include data analysis/visualization, feature engineering, supervised learning, unsupervised learning, and deep learning. All topics are taught with industry-standard frameworks: NumPy, pandas, scikit-learn, XGBoost, TensorFlow, and Keras.

Basic knowledge of Python is a prerequisite for this course.

This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.

Q1. What does the `arange` function do?

- **Returns a 1-D array of numbers based on a given range and interval**
- Returns a random array
- Squares each value of an array
- Returns the largest element of an array
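
As a quick illustration of the answer, here is a minimal sketch of `np.arange`. The variable names and the specific ranges are illustrative choices, not taken from the course.

```python
import numpy as np

# np.arange(start, stop, step): values from start up to, but not including, stop.
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# With a single argument, start defaults to 0 and step defaults to 1.
first_five = np.arange(5)
print(first_five)  # [0 1 2 3 4]
```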

Q2. What happens if we try to use `np.save` on a file without a .npy extension?

- The function raises an exception
- The data is saved to the file without the .npy extension
- **The function automatically adds the .npy extension**
- The function does nothing
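
The behavior can be checked with a small sketch that saves to a temporary directory. The file name `data` and the array contents are hypothetical.

```python
import os
import tempfile

import numpy as np

arr = np.array([1, 2, 3])

with tempfile.TemporaryDirectory() as tmp_dir:
    # The path is given as a string with no extension.
    base_path = os.path.join(tmp_dir, "data")
    np.save(base_path, arr)

    # np.save appended ".npy" to the file name it was given.
    saved_files = os.listdir(tmp_dir)
    print(saved_files)  # ['data.npy']

    loaded = np.load(base_path + ".npy")
    print(loaded)  # [1 2 3]
```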

Q3. What is the difference between `np.sum` and `np.cumsum`?

- The former is computationally more expensive than the latter
- Both perform the same thing
- The former is significantly slower than the latter
- **The former produces the overall sum while the latter calculates cumulative sums**
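
A short sketch of the difference, using an illustrative four-element array:

```python
import numpy as np

values = np.array([1, 2, 3, 4])

# np.sum collapses the array into a single overall total.
total = np.sum(values)
print(total)  # 10

# np.cumsum keeps one running total per position.
running = np.cumsum(values)
print(running)  # [ 1  3  6 10]
```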

Q4. Consider the 2-D array, `arr`. What is the output of `np.sum(arr, axis=1)`?

- A 1-D array containing the column sums of `arr`
- **A 1-D array containing the row sums of `arr`**
- A 2-D array containing the cumulative column sums of `arr`
- A 1-D array containing the cumulative row sums of `arr`
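
To make the `axis` argument concrete, the sketch below sums a small example array along each axis. The contents of `arr` are made up for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# axis=1 sums across the columns of each row, giving one value per row.
row_sums = np.sum(arr, axis=1)
print(row_sums)  # [ 6 15]

# axis=0 sums down the rows of each column, giving one value per column.
col_sums = np.sum(arr, axis=0)
print(col_sums)  # [5 7 9]
```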

Q1. Which of the following are methods for indexing into a DataFrame?

- Use the `loc` and `iloc` functions
- Directly index columns similar to a Python dictionary
- Using slices to retrieve a set of rows
- **All of the above**
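
The three indexing styles can be sketched on a tiny DataFrame. The column names, row labels, and values here are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [90, 80, 70]},
    index=["r1", "r2", "r3"],
)

# loc selects by row and column labels.
by_label = df.loc["r2", "score"]
print(by_label)  # 80

# iloc selects by integer position.
by_position = df.iloc[0, 1]
print(by_position)  # 90

# Dictionary-style indexing retrieves a column as a Series.
score_column = df["score"]

# A slice retrieves a set of rows (here, the first two).
first_two_rows = df[0:2]
print(first_two_rows.shape)  # (2, 2)
```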

Q2. What is the difference between a Series and a DataFrame?

- **A Series is 1-D while a DataFrame is 2-D**
- The DataFrame is the replacement of the deprecated Series
- A DataFrame is faster but has less functionality
- They are both aliases for the same object

Q3. What is a categorical feature?

- **A data feature that can be placed into one of several different categories**
- A list-like data feature that has many different categories of sub-data
- A data feature that categorizes other data features
- A data feature that is the accumulation of many other data features

Q4. What is an indicator feature?

- A feature that indicates whether or not the data has been pre-processed
- **A feature that represents categorical data, using 1’s and 0’s to denote which categories are present**
- An indicator of whether or not a category is quantitative or categorical
- None of the above
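
One common way to build indicator features in pandas is `pd.get_dummies`. The sketch below assumes a hypothetical `color` feature; passing `dtype=int` requests 1's and 0's rather than booleans.

```python
import pandas as pd

# A categorical feature with three possible categories (illustrative data).
colors = pd.Series(["red", "blue", "red", "green"], name="color")

# Each category becomes its own column; a 1 marks the category that is present.
indicators = pd.get_dummies(colors, dtype=int)
print(indicators)
```

Each row contains exactly one 1, in the column matching that row's category.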

Q1. What is the main purpose of standardizing data?

- **It lets us view data in terms of standard deviations from the average case (i.e. mean)**
- It lets us use much smaller data values, which makes computation much quicker
- It makes it easier to calculate the mean of each column in the data
- It removes outliers from the dataset
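
Standardization can be sketched directly in NumPy: subtract each column's mean and divide by its standard deviation. The data values below are illustrative.

```python
import numpy as np

data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Column-wise mean and standard deviation.
col_means = data.mean(axis=0)
col_stds = data.std(axis=0)

# Each value is now expressed as a number of standard deviations from its column mean.
standardized = (data - col_means) / col_stds

print(standardized.mean(axis=0))  # approximately 0 for every column
print(standardized.std(axis=0))   # approximately 1 for every column
```

Although the two columns start on very different scales, they end up identical after standardization because each value sits the same number of standard deviations from its column mean.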

Q2. What is data imputation?

- Combining multiple datasets into one large dataset
- Removing certain features from a dataset
- Subtracting the column-wise means from the values in a dataset
- **Replacing missing data in the dataset with substituted values**
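
As a minimal sketch of imputation, the code below fills missing values with the column mean. Mean imputation is just one possible strategy and is chosen here as an assumption; the quiz does not specify which substitute values to use.

```python
import numpy as np

# np.nan marks the missing entries (illustrative data).
data = np.array([[1.0, np.nan],
                 [3.0, 4.0],
                 [np.nan, 8.0]])

# Column means computed while ignoring the missing entries.
col_means = np.nanmean(data, axis=0)
print(col_means)  # [2. 6.]

# Substitute each missing entry with the mean of its column.
imputed = np.where(np.isnan(data), col_means, data)
print(imputed)
```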

Q3. Why do we use PCA?

- To increase the number of features in a dataset
- To group together correlated features
- **To remove correlation from dataset features and perform dimensionality reduction**
- To remove repeated observations from the dataset
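
Both effects of PCA can be seen in a small sketch using scikit-learn's `PCA`. The synthetic dataset is an assumption made for illustration: three features, two of which are strongly correlated.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))

# Feature 1 is nearly a multiple of feature 0; feature 2 is independent noise.
data = np.hstack([
    base,
    2 * base + 0.1 * rng.normal(size=(100, 1)),
    rng.normal(size=(100, 1)),
])
original_corr = np.corrcoef(data, rowvar=False)[0, 1]

# Reduce from 3 features to 2 principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(data)
component_corr = np.corrcoef(components, rowvar=False)[0, 1]

print(components.shape)  # (100, 2)
print(original_corr)     # close to 1: the original features are correlated
print(component_corr)    # approximately 0: the components are uncorrelated
```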

Q4. What is a good practice for plotting data separated by class label?

- Plot a percentage of points from each class based on the class distribution
- **Use different colors and include a legend to distinguish data by class label**
- Put each class’ points on separate plots, never on the same plot
- None of the above

Q1. Why is regularization important in regression modeling?

- It helps the model take into account each of the outlier points in the dataset
- It lets the model use larger weight values, which improves the model’s performance
- **It prevents the model from being too sensitive to random errors or outliers in the dataset**
- None of the above
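
One way to see the effect of regularization is to compare ordinary least squares with ridge (L2-regularized) regression in scikit-learn. The choice of ridge, the `alpha` value, and the synthetic data are all assumptions for this sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Small noisy dataset: 20 observations, 5 features (illustrative).
X = rng.normal(size=(20, 5))
true_weights = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_weights + rng.normal(scale=0.5, size=20)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

# The L2 penalty shrinks the weights, so the model reacts less to noise in y.
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
print(ols_norm, ridge_norm)
```

The ridge weight vector has a smaller norm than the unregularized one, which is the mechanism that limits sensitivity to random errors.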

Q2. What is the purpose of cross-validation?

- It provides an additional evaluation metric on the test set
- **It gives us a better evaluation of the model without needing a third dataset split**
- It gives us the best possible split for the training and testing sets
- It automatically chooses the best hyperparameters for our model

Q3. What is logistic regression used for?

- **Classification**
- Regression
- Cross-validation
- None of the above
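
Despite its name, logistic regression predicts class labels. The sketch below fits scikit-learn's `LogisticRegression` on a toy one-feature dataset invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# One feature; small values belong to class 0, large values to class 1.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

new_points = np.array([[0.5], [4.5]])
predicted_classes = model.predict(new_points)
print(predicted_classes)  # [0 1]

# predict_proba returns one probability per class for each observation.
class_probabilities = model.predict_proba(new_points)
print(class_probabilities.shape)  # (2, 2)
```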

Q4. What does the number of folds in K-fold cross-validation represent?

- The number of subsets of the training set used to train the model each iteration
- **The number of total subsets that the training set is split into**
- The number of subsets of the training set used to validate the model each iteration
- The number of total subsets that the test set is split into
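
The role of the fold count can be sketched with scikit-learn's `KFold`. The ten-observation dataset and the choice of 5 folds are illustrative.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten training observations, split into K = 5 folds.
X = np.arange(10).reshape(-1, 1)
kfold = KFold(n_splits=5)

# Each iteration trains on 4 folds and validates on the remaining 1 fold.
fold_sizes = [
    (len(train_idx), len(val_idx)) for train_idx, val_idx in kfold.split(X)
]
print(fold_sizes)  # [(8, 2), (8, 2), (8, 2), (8, 2), (8, 2)]
```

There are five iterations in total, and every observation is used for validation exactly once.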

Q1. What does cosine similarity measure (in terms of a dataset)?

- The number of identical feature values between two data observations
- The similarity in a particular feature column for two data observations
- The similarity between two feature values for a particular data observation
- **The proportional similarity in feature values between two data observations**
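
Cosine similarity is the dot product of two observations divided by the product of their L2 norms. The helper name `cosine_similarity` and the vectors below are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Dot product of a and b, normalized by their L2 norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


obs = np.array([1.0, 2.0, 3.0])
scaled = np.array([2.0, 4.0, 6.0])       # same proportions, larger magnitude
opposite = np.array([-1.0, -2.0, -3.0])  # opposite proportions

same_direction = cosine_similarity(obs, scaled)           # approximately 1
opposite_direction = cosine_similarity(obs, opposite)     # approximately -1
orthogonal = cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0]))  # 0
print(same_direction, opposite_direction, orthogonal)
```

Scaling an observation does not change its similarity score, which is why the measure is described as proportional.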

Q2. What does the *K* represent in K-means clustering?

- **The number of clusters specified prior to starting the algorithm**
- The number of clusters automatically determined by the algorithm
- The average number of data points in each cluster
- The minimum number of data points in each cluster
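
In scikit-learn, *K* is passed as `n_clusters` before fitting. The sketch below uses six made-up points forming two well-separated groups; `n_init` and `random_state` are set only to make the run reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans

points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],        # group near the origin
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],  # group far from the origin
])

# K = 2 is chosen by us up front; the algorithm does not discover it.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)

labels = kmeans.labels_
print(labels)
print(kmeans.cluster_centers_.shape)  # (2, 2)
```

Which group receives label 0 and which receives label 1 is arbitrary, so only the grouping itself is meaningful.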

Q3. Which of the following algorithms automatically determines the number of clusters?

- K-means clustering
- **DBSCAN clustering**
- Adjusted Rand index
- None of the above
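
By contrast with K-means, scikit-learn's `DBSCAN` takes no cluster count. In this sketch the `eps` and `min_samples` values and the points themselves are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

points = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0],  # dense group 1
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0],  # dense group 2
    [20.0, 20.0],                        # isolated point
])

# No number of clusters is given; only the density parameters.
dbscan = DBSCAN(eps=0.5, min_samples=2).fit(points)
labels = dbscan.labels_
print(labels)  # [ 0  0  0  1  1  1 -1]

# The label -1 marks noise, so it is excluded from the cluster count.
n_clusters_found = len(set(labels) - {-1})
print(n_clusters_found)  # 2
```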

Q4. Which of the following is inaccurate about the adjusted Rand index?

- The values range from -1 to 1
- It is not affected by permutations in the labeling
- The ARI function in scikit-learn is symmetric (i.e. can swap argument order)
- **It is corrected-for-chance: random labelings always get a score of -1**
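
The properties listed above can be checked with scikit-learn's `adjusted_rand_score`. The label lists and the random-labeling experiment are illustrative; the experiment shows that random labelings score near 0, which is why the bolded statement is the inaccurate one.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

true_labels = [0, 0, 1, 1]

# Permuting the label names does not change the score.
permuted = [1, 1, 0, 0]
permuted_score = adjusted_rand_score(true_labels, permuted)
print(permuted_score)  # 1.0

# Swapping the argument order gives the same score.
other = [0, 1, 1, 1]
forward = adjusted_rand_score(true_labels, other)
backward = adjusted_rand_score(other, true_labels)

# Random labelings average out to a score near 0, not -1.
rng = np.random.default_rng(0)
random_scores = [
    adjusted_rand_score(rng.integers(0, 3, size=200), rng.integers(0, 3, size=200))
    for _ in range(50)
]
mean_random_score = float(np.mean(random_scores))
print(mean_random_score)
```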

Q1. What type of model does XGBoost use?

- **Gradient boosted decision trees**
- Multilayer perceptrons
- Gradient boosted neural networks
- Long short-term memory

Q2. What is the main benefit of using XGBoost models over scikit-learn models?

- **They’re much faster than scikit-learn models**
- They’re much more accurate than scikit-learn models
- They’re easier to use than scikit-learn models
- There is no benefit, it’s a matter of personal preference

Q3. How do we save models in XGBoost?

- Using the `dump` function of the `joblib` module
- Using the `save_model` function for Booster objects
- **Both A and B are viable**
- None of the above

Q4. What does the `feature_importances_` property of an XGBoost model tell us?

- The F1 score of each data feature in making model predictions
- The magnitude of values for each data feature
- The number of non-zero values for each data feature
- **The relative importance of each data feature in making model predictions**

Q1. What is a benefit of using more hidden layers in a neural network?

- **Additional hidden layers allow the model to learn more complex decision boundaries**
- Using more hidden layers will speed up model training
- A neural network with only a single hidden layer is more computationally expensive
- Using more hidden layers helps regularize the neural network

Q2. Why do we use the sigmoid function for binary classification?

- The sigmoid function converts the model’s output into a real number
- The sigmoid function extracts a bounded absolute value from the model’s output
- **The sigmoid function converts the model’s output into a probability**
- The sigmoid function is faster to calculate than other functions
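
The sigmoid function, 1 / (1 + e^(-x)), maps any real number into the interval (0, 1). A minimal NumPy sketch, with illustrative logit values:

```python
import numpy as np


def sigmoid(x):
    """Map real-valued model outputs (logits) into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


logits = np.array([-2.0, 0.0, 2.0])
probabilities = sigmoid(logits)
print(probabilities)  # roughly [0.119 0.5 0.881]
```

A logit of 0 maps to a probability of exactly 0.5, and larger logits map to probabilities closer to 1.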

Q3. Which of the following is an example of overfitting?

- A model is trained until its weights converge, at which point training is halted
- **A model is trained until it reaches 99% accuracy on the training set, but performs poorly on the test set**
- Both A and B
- None of the above

Q4. Which of the following statements is true?

- A single layer (i.e. no hidden layers) neural network can learn non-linear decision boundaries
- **The weights of a neural network can be either positive or negative**
- Cross entropy loss is the only loss function used to train neural networks
- Logits are the output of a neural network, with values strictly between 0 and 1

Q1. What is the main benefit of Keras over TensorFlow?

- It is better for writing industry-level production code
- It provides more model architectures to use for different applications
- **The API is simpler to use, especially for model training and evaluation**
- There is no benefit, since Keras is built on top of TensorFlow

Q2. What does the `Dense` object represent in Keras?

- A neuron in a neural network
- A weighted connection between two neurons
- A multilayer perceptron
- **A fully-connected neural network layer**
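
Conceptually, a fully-connected layer multiplies its inputs by a weight matrix, adds a bias, and applies an activation. The NumPy sketch below illustrates that computation only; it is not Keras's implementation, and the `dense_forward` helper, the ReLU activation, and the layer sizes are assumptions.

```python
import numpy as np


def dense_forward(inputs, weights, bias):
    """ReLU(inputs @ weights + bias): every input feeds every output unit."""
    return np.maximum(0.0, inputs @ weights + bias)


rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 3))   # batch of 2 observations, 3 features each
weights = rng.normal(size=(3, 4))  # one weight per (input feature, output unit) pair
bias = np.zeros(4)                 # one bias per output unit

outputs = dense_forward(inputs, weights, bias)
print(outputs.shape)  # (2, 4)
```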

Q3. For multiclass classification, what is the output of the `predict` function of a Keras `Sequential` model?

- **A 2-D NumPy array of class probabilities**
- A 1-D NumPy array of class predictions
- A dictionary of class predictions
- A 2-D NumPy array of logits
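
To show the shape of such an output, the sketch below builds a 2-D array of class probabilities by hand with a softmax in NumPy. No Keras model is created here; the logit values are made up, and the array merely stands in for what `predict` would return.

```python
import numpy as np

# Hypothetical logits for 2 observations and 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])

# Softmax per row; subtracting the row max keeps the exponentials stable.
shifted = logits - logits.max(axis=1, keepdims=True)
exp_values = np.exp(shifted)
class_probabilities = exp_values / exp_values.sum(axis=1, keepdims=True)

print(class_probabilities.shape)  # (2, 3): one row per observation, one column per class

# A 1-D array of class predictions comes from taking the argmax of each row.
predicted_classes = np.argmax(class_probabilities, axis=1)
print(predicted_classes)  # [0 1]
```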

We hope these Machine Learning for Software Engineers quiz answers help you check your understanding of the course material.
