Today we will learn about Spark Lazy Evaluation: what it is, why it is required, how Spark implements it, and what […]
Category: Spark Tutorial
This category contains blog posts from the Spark Tutorial series, written to make Spark topics easy to understand.
Spark Dataframe Actions
When we call an Action on a Spark Dataframe, all the Transformations get executed one by one. This happens because of Spark Lazy Evaluation, which […]
Spark Dataframe drop rows with NULL values
The data we normally deal with may not be clean. In such cases, we may need to clean the data by applying some logic. […]
Spark Dataframe withColumn
Using the Spark withColumn() function, we can add, rename, derive, split, etc. a Dataframe column. There are many other things which can be achieved […]
Spark Dataframe Union and UnionAll
Using Spark Union and UnionAll, you can merge the data of two Dataframes and create a new Dataframe. Remember, you can merge two Spark Dataframes only […]
Spark distinct and dropDuplicates
Both the Spark distinct and dropDuplicates functions help in removing duplicate records. One additional advantage of dropDuplicates() is that you can specify the columns to be […]
Spark Filter Function
Using the Spark filter function, you can retrieve the records from a Dataframe or Dataset that satisfy a given condition. People from a SQL background can also use […]
Spark Dataframe Select
Today we will learn how to select columns from a Spark Dataframe. While selecting, we can show the complete list of columns or select only a few […]
Hive/Spark – Find External Tables in Hive from a List of Tables
Let's say there is a scenario in which you need to find the list of External Tables from all the Tables in a Hive […]
Show full column content of Spark Dataframe
When we do a dataframe.show(), it does not show the full column content. It shows only 20 records, which is the default number of rows […]