Using the Spark filter function, you can retrieve records from a DataFrame or Dataset that satisfy a given condition. People from a SQL background can also use […]
Category: Spark Tutorial
This category contains Spark tutorial posts that explain Spark topics in an easy-to-follow way.
Spark Dataframe Select
Today we will learn how to select columns from a Spark DataFrame. While selecting, we can show the complete list of columns or select only a few […]
Hive/Spark – Find External Tables in Hive from a List of Tables
Let's say there is a scenario in which you need to find the list of external tables among all the tables in a Hive […]
Show full column content of Spark Dataframe
When we call dataframe.show(), it does not show the full column content. It shows only 20 records, which is the default number of rows […]
Spark Difference between Cache and Persist
If we use an RDD multiple times in our program, the RDD will be recomputed every time. This is a performance issue. To […]
Spark – Difference between Coalesce and Repartition in Spark
Before we understand the difference between Coalesce and Repartition, we first need to understand what a Spark partition is. Simply put, partitioning data means to divide the […]
Spark – Create Dataframe From List
One can create a DataFrame from a List or Seq using the toDF() function. We need to import spark.implicits._ to use toDF(). Here the column names are […]