Spark Fundamentals I Quiz Answers – Cognitive Class

Learn the fundamentals of Spark, the technology that is revolutionizing the analytics and big data world! Spark is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the way to go.

Module 1 – Introduction to Spark

Question: What gives Spark its speed advantage for complex applications?

  • Spark extends the MapReduce model
  • Spark makes extensive use of in-memory computations
  • Various libraries provide Spark with additional functionality
  • Spark can cover a wide range of workloads under one system
  • All of the above

Question: For what purpose would an Engineer use Spark? Select all that apply.

  • Analyzing data to obtain insights
  • Programming with Spark’s API
  • Transforming data into a usable form for analysis
  • Developing a data processing system
  • Tuning an application for a business use case

Question: Which of the following statements are true of the Resilient Distributed Dataset (RDD)? Select all that apply.

  • There are three types of RDD operations.
  • RDDs allow Spark to reconstruct transformations
  • RDDs only add a small amount of code due to tight integration
  • RDD action operations do not return a value
  • RDD is a distributed collection of elements parallelized across the cluster.

Module 2 – Resilient Distributed Dataset and DataFrames

Question: Which of the following methods can be used to create a Resilient Distributed Dataset (RDD)? Select all that apply.

  • Creating a directed acyclic graph (DAG)
  • Parallelizing an existing Spark collection
  • Referencing a Hadoop-supported dataset
  • Using data that resides in Spark
  • Transforming an existing RDD to form a new one

Question: What happens when an action is executed?

  • A cache is created for storing partial results in memory
  • Data is partitioned into different blocks across the cluster
  • Executors prepare the data for operation in parallel
  • The driver sends code to be executed on each block
  • All of the above

Question: Which of the following statements are true of RDD persistence? Select all that apply.

  • Persistence through caching provides fault tolerance
  • Future actions can be performed significantly faster
  • Each partition is replicated on two cluster nodes
  • RDD persistence always improves space efficiency
  • By default, objects that are too big for memory are stored on the disk

Module 3 – Spark application programming

Question: What is SparkContext?

  • An object that represents the connection to a Spark cluster
  • A tool for linking to nodes
  • A tool that provides fault tolerance
  • The built-in shell for the Spark engine
  • A programming language for applications

Question: Which of the following methods can be used to pass functions to Spark? Select all that apply.

  • Transformations and actions
  • Passing by reference
  • Static methods in a global singleton
  • Import statements
  • Anonymous function syntax

Question: Which of the following is a main component of a Spark application’s source code?

  • SparkContext object
  • Business Logic
  • Import statements
  • Transformations and actions
  • All of the above

Module 4 – Introduction to the Spark libraries

Question: Which of the following is NOT an example of a Spark library?

  • Spark Streaming
  • GraphX
  • Hive
  • Spark SQL
  • MLlib

Question: From which of the following sources can Spark Streaming receive data? Select all that apply.

  • Kafka
  • JSON
  • Parquet
  • HDFS
  • Hive

Question: In Spark Streaming, processing begins immediately when an element of the application is executed. True or false?

  • True
  • False

Module 5 – Spark configuration, monitoring and tuning

Question: Which of the following are main components of a Spark cluster? Select all that apply.

  • Driver Program
  • SparkContext
  • Cluster Manager
  • Worker node
  • Cache

Question: What are the main locations for Spark configuration? Select all that apply.

  • The SparkConf object
  • The Spark Shell
  • Executor Processes
  • Environment variables
  • Logging properties

Question: Which of the following techniques can improve Spark performance? Select all that apply.

  • Scheduler Configuration
  • Memory Tuning
  • Data Serialization
  • Using Broadcast variables
  • Using nested structures

Final Exam

Question: Which of the following are types of Spark RDD operations? Select all that apply.

  • Parallelization
  • Action
  • Persistence
  • Transformation
  • Evaluation

Question: Spark must be installed and run on top of a Hadoop cluster. True or false?

  • True
  • False

Question: Which of the following operations will work improperly when using a Combiner?

  • Minimum
  • Maximum
  • Average
  • Count
  • All of the above operations will work properly
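To see why a combiner can be a problem for some aggregations, here is a minimal pure-Python sketch. It does not use Hadoop or Spark; the two lists stand in for the values of one key held by two map tasks, and `combine_then_reduce` is a hypothetical helper that applies the same function once per partition (the combiner step) and once more on the partial results (the reducer step). The data values are made up for illustration.

```python
from statistics import mean

# Hypothetical values for one key, split unevenly across two map tasks.
partition_a = [1, 2, 3, 4]
partition_b = [10]
partitions = [partition_a, partition_b]
all_values = partition_a + partition_b


def combine_then_reduce(parts, fn):
    """Apply fn to each partition (combiner), then to the partial results (reducer)."""
    partials = [fn(p) for p in parts]
    return fn(partials)


# Minimum and maximum give the same answer with or without the combiner step.
min_ok = combine_then_reduce(partitions, min) == min(all_values)
max_ok = combine_then_reduce(partitions, max) == max(all_values)

# Count also works: the combiner emits per-partition counts and the reducer sums them.
count_combined = sum(len(p) for p in partitions)

# Averaging the per-partition averages ignores how many values each partition held.
naive_avg = combine_then_reduce(partitions, mean)
true_avg = mean(all_values)

# One common workaround: carry (sum, count) pairs and divide only at the end.
pairs = [(sum(p), len(p)) for p in partitions]
total, n = (sum(column) for column in zip(*pairs))
fixed_avg = total / n

print(min_ok, max_ok, count_combined, naive_avg, true_avg, fixed_avg)
```

In this sketch the naive average differs from the true average because the partitions have different sizes, while minimum, maximum, and count are unaffected by the intermediate combining step.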

Question: Spark supports which of the following libraries?

  • Spark SQL
  • Spark Streaming
  • MLlib
  • GraphX
  • All of the above

Question: Spark supports which of the following programming languages?

  • C++ and Python
  • Scala, Java, C++, Python, Perl
  • Scala, Perl, Java
  • Java and Scala
  • Scala, Python, Java, R

Question: A transformation is evaluated immediately. True or false?

  • True
  • False
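Spark transformations are lazy: they are recorded and only run when an action asks for a result. The following is a toy pure-Python model of that idea, not Spark's implementation. `LazyDataset`, `traced_double`, and the sample data are hypothetical names introduced only for this sketch.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, data, steps=()):
        self._data = data
        self._steps = tuple(steps)

    def map(self, fn):
        # Transformation: returns a new dataset description and does no work yet.
        return LazyDataset(self._data, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: also only recorded.
        return LazyDataset(self._data, self._steps + (("filter", pred),))

    def collect(self):
        # Action: replay the recorded steps and materialize the result.
        items = iter(self._data)
        for kind, fn in self._steps:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


calls = []


def traced_double(x):
    # Records each invocation so we can observe when the work actually happens.
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(traced_double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # nothing has run yet
result = ds.collect()             # the action triggers the recorded steps
calls_after_action = len(calls)

print(calls_before_action, result, calls_after_action)
```

The mapped function is not called while the chain of transformations is being built; it runs only once `collect()` is invoked.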

Question: Which storage level does the cache() function use?

  • MEMORY_ONLY_SER
  • MEMORY_ONLY
  • MEMORY_AND_DISK
  • MEMORY_AND_DISK_SER

Question: Which of the following statements does NOT describe accumulators?

  • They are read-only
  • They can only be added to through an associative operation
  • They can only be read by the driver
  • Programmers can extend them beyond numeric types
  • They implement counters and sums

Question: You must explicitly initialize the SparkContext when creating a Spark application. True or false?

  • True
  • False

Question: The “local” parameter can be used to specify the number of cores to use for the application. True or false?

  • True
  • False

Question: Spark applications can ONLY be packaged using one, specific build tool. True or false?

  • True
  • False

Question: Which of the following parameters of the “spark-submit” script determine where the application will run?

  • --conf
  • --class
  • --master
  • --deploy-mode
  • None of the above

Question: Which of the following is NOT supported as a cluster manager?

  • YARN
  • Mesos
  • Spark
  • Helix
  • All of the above are supported

Question: Spark SQL allows relational queries to be expressed in which of the following?

  • HiveQL only
  • SQL only
  • Scala, SQL, and HiveQL
  • Scala and SQL
  • Scala and HiveQL

Question: Spark Streaming processes live streaming data in real-time. True or false?

  • True
  • False

Question: The MLlib library contains which of the following algorithms?

  • Dimensionality Reduction
  • Regression
  • Classification
  • Clustering
  • All of the above

Question: What is the purpose of the GraphX library?

  • To generate data-parallel models
  • To create a visual representation of the data
  • To perform graph-parallel computations
  • To convert from data-parallel to graph-parallel algorithms
  • To create a visual representation of a directed acyclic graph (DAG)

Question: Which list describes the correct order of precedence for Spark configuration, from highest to lowest?

  • Properties set on SparkConf, flags passed to spark-submit, values in spark-defaults.conf
  • Properties set on SparkConf, values in spark-defaults.conf, flags passed to spark-submit
  • Values in spark-defaults.conf, properties set on SparkConf, flags passed to spark-submit
  • Flags passed to spark-submit, values in spark-defaults.conf, properties set on SparkConf
  • Values in spark-defaults.conf, flags passed to spark-submit, properties set on SparkConf
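As a rough illustration of how layered configuration lookup works, the sketch below models the three sources as plain Python dictionaries and resolves keys with `collections.ChainMap`. It does not call Spark. The ordering assumes the precedence described in the Spark configuration documentation (properties set on SparkConf first, then spark-submit flags, then spark-defaults.conf), and all keys and values are hypothetical examples.

```python
from collections import ChainMap

# Hypothetical settings from each source; the values are made up for illustration.
spark_defaults_conf = {
    "spark.master": "local[2]",
    "spark.executor.memory": "1g",
    "spark.app.name": "from-defaults",
}
spark_submit_flags = {
    "spark.master": "yarn",
    "spark.executor.memory": "2g",
}
spark_conf_in_code = {
    "spark.executor.memory": "4g",
}

# ChainMap searches its maps left to right, so the first map has the highest precedence.
effective = ChainMap(spark_conf_in_code, spark_submit_flags, spark_defaults_conf)

for key in sorted(effective):
    print(key, "=", effective[key])
```

Each key resolves to the value from the highest-precedence source that defines it, and lower-precedence sources only fill in keys that nothing above them sets.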

Question: Spark monitoring can be performed with external tools. True or false?

  • True
  • False

Question: Which serialization libraries are supported in Spark? Select all that apply.

  • Apache Avro
  • Java Serialization
  • Protocol Buffers
  • Kryo Serialization
  • TPL
