Latest from the Blog
Spark SQL has a count function that returns the number of rows in a DataFrame or table. We can also count only the rows that match a condition. People who have exposure to SQL should already be familiar with this, as the behaviour is the same. Let's see the syntax and an example. But before that, let's create a […]
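The full post covers the Spark syntax; as a minimal sketch, the distinction between a total count and a conditional count can be shown in plain Python (the rows and column names here are hypothetical, standing in for a small DataFrame):

```python
# Hypothetical rows standing in for a Spark DataFrame of employees.
rows = [
    {"name": "Ann", "dept": "IT"},
    {"name": "Bob", "dept": "HR"},
    {"name": "Cid", "dept": "IT"},
]

# df.count() in Spark returns the total number of rows.
total = len(rows)

# A conditional count, e.g. SELECT COUNT(*) FROM emp WHERE dept = 'IT',
# counts only the rows that satisfy the predicate.
it_count = sum(1 for r in rows if r["dept"] == "IT")

print(total, it_count)
```

In Spark the conditional form is typically written as `df.filter(...).count()` or with a `WHERE` clause in Spark SQL.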
Here we will see how Spark escapes double quotes in an input file. Ideally, having double quotes in a file column is not an issue. But we face a problem when the content inside the double quotes also contains double quotes along with the file separator. Let's see an example of this. Below is the data we […]
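Spark's CSV reader handles this case through its `quote` and `escape` options; as a hedged sketch of the same idea, Python's `csv` module can parse a line where a quoted field contains both an escaped double quote and the separator (the sample line below is hypothetical):

```python
import csv
import io

# Hypothetical input: the second field is quoted and contains both an
# embedded (backslash-escaped) double quote and the comma separator.
raw = 'id,comment\n1,"He said \\"hi, there\\" loudly"\n'

# escapechar plays the role of Spark's escape option: the character
# after it is taken literally, so \" inside a quoted field survives.
reader = csv.reader(io.StringIO(raw), quotechar='"', escapechar="\\")
rows = list(reader)

print(rows[1])
```

In Spark the equivalent would be something like `spark.read.option("quote", '"').option("escape", "\\").csv(path)`, with the exact options depending on how the file was written.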
In this blog we will create a Spark UDF to check the count of nulls in each column. There could be a scenario where we need to find the number of [nulls, 'NA', "", etc.] in each column. This can help in analysing the quality of the data. Let us see […]
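The post builds this as a Spark UDF; the aggregation it computes can be sketched in plain Python as a per-column tally of missing-like values (the rows, column names, and the exact set of "missing" markers here are assumptions for illustration):

```python
# Values treated as "missing": real None, the string 'NA', and ''.
MISSING = {None, "NA", ""}

# Hypothetical rows standing in for a small Spark DataFrame.
rows = [
    {"name": "Ann", "city": "NA"},
    {"name": None,  "city": "Pune"},
    {"name": "",    "city": None},
]

# Per-column missing-value counts -- the aggregation the UDF
# (or a per-column when/isNull expression) would produce.
null_counts = {
    col: sum(1 for row in rows if row[col] in MISSING)
    for col in rows[0]
}

print(null_counts)
```

In Spark the same result is often reached without a UDF, by summing a `when(col(c).isNull() | col(c).isin("NA", ""), 1)` expression per column, which keeps the work inside Catalyst.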