Today we will learn different ways of creating a Delta table in Databricks. We will check how the tables can be created using the existing Apache […]
Blogs
Spark SQL Count Function
Spark SQL has a count function, which is used to count the number of rows of a DataFrame or table. We can also count for specific […]
Spark Escape Double Quotes in Input File
Here we will see how to escape double quotes in an input file in Spark. Ideally, having double quotes in a column in a file is not an issue. […]
Spark UDF to Check Count of Nulls in each column
In this blog, we will create a Spark UDF to check the count of nulls in each column. There could be a scenario where we would […]
Spark Function to check Duplicates in Dataframe
Here we will create a function to check if a DataFrame has duplicates. We will not only create one method but will try and create […]
How to Create Empty Dataframe in Spark Scala
Today we will learn how to create an empty DataFrame in Spark Scala. We will cover various methods to create an empty DataFrame with no […]
Repartition in Spark
Repartition in Spark does a full shuffle of data and splits the data into chunks based on user input. Using this, we can increase or […]
Spark Sql Inner Join
In this blog, we will understand how to join two or more DataFrames in Spark. An inner join in Spark works exactly like an inner join in SQL. […]
ArrayType Column in Spark SQL
Apart from basic data types such as numeric, string, and datetime, Spark SQL also has an ArrayType column. This type is not limited to only […]
Correct Column Order During Insert into Spark Dataframe
We need to maintain the correct column order during an insert into a Spark DataFrame. If we don’t maintain the order, then data can get inserted into […]