Spark SQL has a count function that counts the number of rows in a DataFrame or table. We can also count for specific […]
Category: SparkSQL
Spark SQL Inner Join
In this blog, we will understand how to join two or more DataFrames in Spark. Inner Join in Spark works exactly like joins in SQL. […]
ArrayType Column in Spark SQL
Apart from basic data types such as Numeric, String, and Datetime, Spark SQL also has an ArrayType column. This type is not limited to only […]
How to drop columns in a DataFrame using Spark Scala
Here we will learn how to drop columns in a DataFrame using Spark Scala. For this, we will use the drop() method. We will see […]
Spark Read JSON file
In this blog, we will understand how to read a JSON file using Spark and load it into a DataFrame. All the code examples are […]
Spark Read multiline (multiple line) CSV file with Scala
The Spark DataFrame API allows us to read CSV files using spark.read.csv(). If a CSV file contains records that span multiple lines, they can be read using […]
Hive/Spark – Find External Tables in Hive from a List of Tables
Let's say you need to find the list of external tables among all the tables in a Hive […]
Spark – Create Dataframe From List
One can create a DataFrame from a List or Seq using the toDF() function. To use toDF(), we need to import spark.implicits._ first. Here the column names are […]