POWER OF APACHE SPARK API
Spark is a distributed engine that supports several languages, such as Java, Python, Scala, and SQL, so a programmer can write code in whichever supported language suits them.
In this article, we discuss the API support that Apache Spark provides, which makes interacting with data easy and flexible.
Spark RDD (Resilient Distributed Dataset):- This API provides the following support
(i) An RDD is a dataset and Spark's fundamental data structure.
(ii) It enforces no row, column, or schema structure.
(iii) "Resilient" means the API is fault tolerant.
(iv) An RDD partition can be recreated and reprocessed anywhere in the cluster.
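To make points (iii) and (iv) concrete, here is a minimal sketch in plain Python of how a lost partition can be rebuilt by replaying the recorded transformations (the lineage) over the original input. ToyRDD and its methods are hypothetical names invented for this illustration. They are not part of the Spark API, and real Spark tracks lineage in a far more elaborate way.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: source partitions plus a lineage of functions."""

    def __init__(self, source_partitions, lineage=()):
        self.source_partitions = source_partitions  # input data, one list per partition
        self.lineage = lineage                      # functions to apply, in order

    def map(self, fn):
        # Lazy transformation: nothing is computed, fn is only recorded in the lineage.
        return ToyRDD(self.source_partitions, self.lineage + (fn,))

    def compute_partition(self, index):
        # Rebuild a single partition from its source data by replaying the lineage.
        data = self.source_partitions[index]
        for fn in self.lineage:
            data = [fn(x) for x in data]
        return data

    def collect(self):
        # Compute every partition and concatenate the results.
        out = []
        for i in range(len(self.source_partitions)):
            out.extend(self.compute_partition(i))
        return out


rdd = ToyRDD([[1, 2], [3, 4], [5, 6]]).map(lambda x: x * 10).map(lambda x: x + 1)

# Pretend each computed partition lives on a different worker.
cache = {i: rdd.compute_partition(i) for i in range(3)}

# Simulate losing the worker that held partition 1, then recompute only that partition.
del cache[1]
cache[1] = rdd.compute_partition(1)

recovered = [x for i in sorted(cache) for x in cache[i]]
```

Only the lost partition is recomputed, and the recomputation needs nothing but the source data and the lineage, which is why it can run on any node. In Spark itself, the lineage of an RDD can be inspected with its toDebugString() method.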
Catalyst Optimizer (Spark SQL engine):- The optimizer takes a query through the following phases
(i) Analysis
(ii) Logical Optimization
(iii) Physical Planning
(iv) Code Generation
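The phases listed above can be pictured with a toy example in plain Python. This is only a sketch under loud assumptions: the tuple-based expression format, the SCHEMA dictionary, and the functions analyze, optimize, and generate are all hypothetical and have nothing to do with Catalyst's real classes. Physical planning is left out, because a single arithmetic expression offers only one way to execute it.

```python
import operator

# Hypothetical schema for the table being queried.
SCHEMA = {"price": float}

# Expressions are nested tuples:
#   ("col", name), ("lit", value), ("add" | "mul", left, right)
OPS = {"add": operator.add, "mul": operator.mul}
SYMBOLS = {"add": "+", "mul": "*"}


def analyze(expr, schema):
    """Analysis: make sure every column reference resolves against the schema."""
    kind = expr[0]
    if kind == "col":
        if expr[1] not in schema:
            raise ValueError(f"unresolved column: {expr[1]}")
        return expr
    if kind == "lit":
        return expr
    return (kind, analyze(expr[1], schema), analyze(expr[2], schema))


def optimize(expr):
    """Logical optimization: fold operations whose operands are both literals."""
    if expr[0] in ("col", "lit"):
        return expr
    kind, left, right = expr[0], optimize(expr[1]), optimize(expr[2])
    if left[0] == "lit" and right[0] == "lit":
        return ("lit", OPS[kind](left[1], right[1]))
    return (kind, left, right)


def generate(expr):
    """Code generation: emit Python source text that evaluates the expression."""
    kind = expr[0]
    if kind == "col":
        return f"row[{expr[1]!r}]"
    if kind == "lit":
        return repr(expr[1])
    return f"({generate(expr[1])} {SYMBOLS[kind]} {generate(expr[2])})"


# Query expression: price * (1 + 1)
plan = ("mul", ("col", "price"), ("add", ("lit", 1), ("lit", 1)))

resolved = analyze(plan, SCHEMA)             # columns checked against the schema
optimized = optimize(resolved)               # (1 + 1) folded into the literal 2
source = generate(optimized)                 # source text for the optimized plan
compiled = eval(f"lambda row: {source}")     # turn the generated source into a function
result = compiled({"price": 10.0})
```

In Spark itself, calling explain(True) on a DataFrame prints the parsed, analyzed, and optimized logical plans followed by the physical plan, which is a practical way to watch these phases on a real query.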
Note: Please see the article below for more details on the Catalyst Optimizer.