This is a comprehensive hive database management cheat sheet.

Creating a New Database:

Basic Syntax:

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[MANAGEDLOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, …)];

Options:

  • CREATE DATABASE/SCHEMA: The keywords DATABASE and SCHEMA are interchangeable.
  • IF NOT EXISTS: Option to skip creation if the database already exists.
  • database_name: Name of the database to create.
  • COMMENT: Optional comment describing the database.
  • LOCATION: HDFS path where the database will be stored. This is for external tables.
  • MANAGEDLOCATION: HDFS path for managed tables. Recommended to be within metastore.warehouse.dir.
  • WITH DBPROPERTIES: A clause to set database properties.

Usage Notes:

  • MANAGEDLOCATION was added in Hive 4.0.0 and refers to the default directory for managed tables.
  • LOCATION now refers to the default directory for external tables.
  • It’s recommended that MANAGEDLOCATION be within metastore.warehouse.dir for common governance policies.
  • REMOTE databases support was added in Hive 4.0.0 for Data connectors.

Examples:

Creating a New Database:

CREATE DATABASE IF NOT EXISTS sales
COMMENT 'Database for sales data'
LOCATION '/user/hive/sales'
MANAGEDLOCATION '/user/hive/warehouse/sales'
WITH DBPROPERTIES ('creator'='John Doe', 'created_at'='2023-01-01');

Creating a Database with Default Locations:

CREATE SCHEMA sales;

ALTER DATABASE Syntax:

The ALTER DATABASE statement in Hive is used to change the properties of a database.

  • Set Database Properties
ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, …); 
This command is used to set the properties of a database.
  • Change Database Owner
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; 
This command changes the owner of the database to the specified user or role.
  • Change Database Location
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; 
This command changes the default location of the database. It does not move existing data.
  • Change Managed Location
ALTER (DATABASE|SCHEMA) database_name SET MANAGEDLOCATION hdfs_path; 
This command changes the default location for managed tables within the database.

Usage Examples:

*Set Database Properties*
ALTER DATABASE sales SET DBPROPERTIES ('creator'='John', 'date'='2012-01-02');

*Change Database Owner to User*
ALTER DATABASE sales SET OWNER USER john;

*Change Database Owner to Role*
ALTER DATABASE sales SET OWNER ROLE admin;

*Change Database Default Location*
ALTER DATABASE sales SET LOCATION 'hdfs://user/hive/sales';

*Change Managed Location*
ALTER DATABASE sales SET MANAGEDLOCATION 'hdfs://user/hive/managed/sales';

DROP DATABASE Syntax:

The DROP DATABASE statement in Hive is used to delete a database and potentially all the tables within it

  • Basic Syntax
DROP (DATABASE|SCHEMA) database_name; This command drops a database. 
It will fail if the database is not empty unless the CASCADE option is used.
  • Drop Database with Cascade
DROP (DATABASE|SCHEMA) database_name CASCADE; 
This command drops the database and all its contents (all tables within the database).
  • Drop Database if Exists
DROP (DATABASE|SCHEMA) IF EXISTS database_name [RESTRICT|CASCADE]; This command drops the database only if it exists. The RESTRICT option (default) does not drop the database if it contains tables, while CASCADE will drop all tables first.

Usage Examples:

*Drop an Empty Database*
DROP DATABASE sales;

*Drop a Database with Tables*
DROP DATABASE sales CASCADE;

*Drop a Database if it Exists*
DROP DATABASE IF EXISTS sales CASCADE;

USE DATABASE Syntax:

The USE statement in Hive is used to set the current database for all subsequent HiveQL statements. The syntax for the USE command is straightforward:

USE database_name;

Or, to revert to the default database, you can use:

USE DEFAULT;

To check which database is currently being used, you can run the following query:

SELECT current_database();

This command was added in Hive 0.6.

DESCRIBE DATABASE:

The DESCRIBE DATABASE statement in Hive provides metadata about a specified database. Here’s a comprehensive cheat sheet for using the DESCRIBE DATABASE statement:

  • Basic Syntax:
DESCRIBE DATABASE database_name;
*Returns the metadata of the specified database*

Database Name: sales
Comment: 'Database for sales records'
Location: 'hdfs://user/hive/warehouse/sales.db'
Owner: User 'admin'
  • Extended Syntax:
DESCRIBE DATABASE EXTENDED database_name;
*Returns the metadata along with the database properties*

Database Name: sales
Comment: 'Database for sales records'
Location: 'hdfs://user/hive/warehouse/sales.db'
Managed Location: 'hdfs://user/hive/warehouse/sales_managed.db'
Owner Name: admin
Owner Type: USER
Database Properties: '{'createdBy'='admin', 'createdOn'='2023-01-01'}'

Metadata Information Returned:

  1. Database Name: The name of the database.
  2. Comment: Any comment provided during the database creation.
  3. Location: The HDFS path where the database is stored.
  4. Managed Location: The default location for managed tables (Hive 4.0.0+).
  5. Owner Name: The user who owns the database.
  6. Owner Type: The type of owner (USER or ROLE).
  7. Database Properties: A set of user-defined key-value pairs.

SHOW DATABASES:

The SHOW DATABASES statement in Hive is used to list all the databases or schemas in the Hive metastore.

  • Basic Syntax:
SHOW DATABASES;
*Lists all databases present in the Hive metastore.*
Output:
default
sales
marketing
hr
  • Pattern Matching:
SHOW DATABASES LIKE 'pattern';
*Lists databases that match the given pattern*
Example: SHOW DATABASES LIKE 'sales%';
Output:
sales_east
sales_west
sales_central

The pattern can include SQL wildcards such as % (matches any number of characters) and _ (matches a single character).

Understanding the Output:

  • The command returns a list of database names.
  • The output is usually in alphabetical order for ease of reading.
  • If the pattern is used, the output will only include database names that match the pattern.

🙂 kudos for learning something new 🙂

Hive Download Link

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Discover more from UnderstandingBigData

Subscribe now to keep reading and get access to the full archive.

Continue reading