Spark SQL has a count function, which is used to count the number of rows of a DataFrame or table. We can also count for specific […]
Category: Spark Tutorial
This category contains Spark tutorial posts that explain Spark topics in an easy-to-understand way.
Spark Escape Double Quotes in Input File
Here we will see how to escape double quotes in an input file in Spark. Ideally, having double quotes in a column in a file is not an issue. […]
How to Create Empty Dataframe in Spark Scala
Today we will learn how to create an empty DataFrame in Spark Scala. We will cover various methods to create an empty DataFrame with no […]
Repartition in SPARK
Repartition in Spark does a full shuffle of data and splits the data into chunks based on user input. Using this we can increase or […]
Spark Sql Inner Join
In this blog, we will see how to join two or more DataFrames in Spark. An inner join in Spark works exactly like an inner join in SQL. […]
ArrayType Column in Spark SQL
Apart from basic data types such as numeric, string, and datetime, Spark SQL also has an ArrayType column. This type is not limited to only […]
Correct Column Order During Insert into Spark Dataframe
We need to maintain the correct column order when inserting into a Spark DataFrame. If we don’t maintain the order, data can get inserted into […]
How to drop columns in dataframe using Spark scala
Here we will learn how to drop columns in a DataFrame using Spark Scala. For this, we will use the drop() method. We will see […]
Spark Read JSON file
In this blog, we will see how to read a JSON file using Spark and load it into a DataFrame. All the code examples are […]
Spark Broadcast Variable explained
A broadcast variable lets the programmer keep a read-only copy of a variable on each machine/node where Spark is executing its job. The variable […]