DuckDB Summarize Query
This snippet demonstrates how to use the SUMMARIZE command in DuckDB to calculate aggregate statistics for a dataset.
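-- summarize a specific table
SUMMARIZE my_table;
-- summarize a specific column (SUMMARIZE also accepts a query;
-- my_table.my_column would be parsed as schema.table)
SUMMARIZE SELECT my_column FROM my_table;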
The SUMMARIZE command in DuckDB provides a comprehensive overview of your data by computing various aggregates for each column:
min and max: The minimum and maximum values in the column.
approx_unique: An approximation of the number of unique values.
avg: The average value for numeric columns.
std: The standard deviation for numeric columns.
q25, q50, q75: The 25th, 50th (median), and 75th percentiles.
count: The total number of rows.
null_percentage: The percentage of NULL values in the column.
This command is particularly useful for quick data exploration and understanding the distribution of values across your dataset.
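As a minimal sketch of that kind of exploration, the statements below summarize a filtered query, store the summary, and then list the columns that contain NULLs. The table my_table, its columns id and label, and the name my_table_summary are hypothetical and exist only for illustration. Using SUMMARIZE as a subquery assumes a reasonably recent DuckDB version.

```sql
-- Hypothetical table, created only so the example is self-contained
CREATE TABLE my_table AS
SELECT * FROM (VALUES (1, 'a'), (2, NULL), (3, 'c')) AS t(id, label);

-- SUMMARIZE accepts a query as well as a table name
SUMMARIZE SELECT id FROM my_table WHERE id > 1;

-- Store the summary (one row per column of my_table) for later inspection
CREATE TABLE my_table_summary AS SELECT * FROM (SUMMARIZE my_table);

-- List only the columns that contain NULL values
SELECT column_name, column_type, null_percentage
FROM my_table_summary
WHERE null_percentage > 0;
```

Because the summary is an ordinary result set with one row per input column, it can be filtered, joined, or exported like any other table.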
You can read more about the SUMMARIZE command in the DuckDB documentation.