Latest from the Blog

Spark SQL Count Function

Spark SQL has a count function, which is used to count the number of rows in a DataFrame or table. We can also count only specific rows. People who have exposure to SQL should already be familiar with this, as the implementation is the same. Let’s see the syntax and an example. But before that, let’s create a […]

Spark Escape Double Quotes in Input File

Here we will see how to escape double quotes in an input file in Spark. Ideally, having double quotes in a column of a file is not an issue. But we face an issue when the content inside the double quotes also contains double quotes along with the file separator. Let’s see an example of this. Below is the data we […]

Spark UDF to Check Count of Nulls in each column

In this blog we will create a Spark UDF to check the count of nulls in each column. There could be a scenario where we need to find the number of [nulls, ‘NA’, “”, etc.] in each column. This could help in analyzing the quality of the data. Let us see […]
