In this blog, we will see how to join two or more DataFrames in Spark. An inner join in Spark works exactly like an inner join in SQL. […]
Blogs
ArrayType Column in Spark SQL
Apart from basic data types such as numeric, string, and datetime, Spark SQL also has an ArrayType column. This type is not limited to only […]
Correct column order during insert into Spark DataFrame
We need to maintain the correct column order when inserting into a Spark DataFrame. If we don’t maintain the order, data can get inserted into […]
How to drop columns in dataframe using Spark scala
Here we will learn how to drop columns in a DataFrame using Spark with Scala. For this we will use the drop() method. We will see […]
Spark Read JSON file
In this blog we will see how to read a JSON file using Spark and load it into a DataFrame. All the code examples are […]
Spark Broadcast Variable explained
A broadcast variable lets the programmer keep a read-only copy of a variable on each machine/node where Spark is executing its job. The variable […]
Hive Create new table using existing table metadata
In this post we will see how to create a new table in Hive using an existing table's metadata. There are various ways in which this can be achieved […]
Spark Read multiline (multiple line) CSV file with Scala
The Spark DataFrame API allows us to read CSV files using [spark.read.csv()]. If the CSV file contains records that span multiple lines, they can be read using […]
Scala Try Catch Finally
Scala catches and manages exceptions using the Try Catch Finally construct. In short, it is used for exception handling. If you are not aware of what “exception” […]
Scala String
A Scala String is a sequence of characters. It is an immutable object, which means it cannot be changed once created. In this blog we will look […]