id: "duckdb-summarize"
title: "Summarize"
slug: "duckdb-summarize-query"
description: "Summarize a specific table or columns for a quick overview of the dataset's structure and statistics."
code: |
  -- summarize a specific table
  SUMMARIZE my_table;
  -- summarize a specific column (SUMMARIZE accepts a table name or a query)
  SUMMARIZE SELECT my_column FROM my_table;
# DuckDB Summarize Query
This snippet demonstrates how to use the `SUMMARIZE` statement in DuckDB to compute summary statistics for every column of a table or query result.
```sql
-- summarize a specific table
SUMMARIZE my_table;
-- summarize a specific column (SUMMARIZE accepts a table name or a query)
SUMMARIZE SELECT my_column FROM my_table;
```
The `SUMMARIZE` statement returns one row per column of the input, with the following fields:
- `column_name` and `column_type`: The name and data type of the summarized column.
- `min` and `max`: The minimum and maximum values in the column.
- `approx_unique`: An approximation of the number of unique values.
- `avg`: The average value for numeric columns.
- `std`: The standard deviation for numeric columns.
- `q25`, `q50`, `q75`: The approximate 25th, 50th (median), and 75th percentiles.
- `count`: The total number of rows.
- `null_percentage`: The percentage of NULL values in the column.
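
The fields above can be seen on a small example. The following is a minimal sketch, not part of the original snippet: the table `trips`, its columns, and its values are hypothetical, and the exact formatting of the output may vary between DuckDB versions.

```sql
-- Hypothetical sample table; names and values are illustrative only.
CREATE TABLE trips (id INTEGER, distance_km DOUBLE, city VARCHAR);
INSERT INTO trips VALUES
    (1, 2.5, 'Oslo'),
    (2, 7.1, 'Bergen'),
    (3, NULL, 'Oslo');

-- Returns one row per column of trips, with the fields listed above.
SUMMARIZE trips;

-- Keep only a few summary fields by using SUMMARIZE as a subquery.
-- The field names are quoted because they match aggregate function names.
SELECT column_name, column_type, "min", "max", null_percentage
FROM (SUMMARIZE trips);
```

Because `distance_km` contains one NULL among three rows, its `null_percentage` should be non-zero while the other columns report no missing values.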
This command is particularly useful for quick data exploration: a single statement shows the range, spread, and share of missing values for every column in your dataset.
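
For exploration, the same statement can be pointed at a query over a file, and the summary can be stored for later comparison. This is a sketch under assumptions: `data.csv` is a placeholder path and `my_table_summary` is a hypothetical table name, neither of which appears in the original snippet.

```sql
-- Summarize the contents of a file by wrapping it in a query.
-- 'data.csv' is a placeholder path.
SUMMARIZE SELECT * FROM 'data.csv';

-- Save the summary of an existing table so it can be queried later.
-- my_table_summary is a hypothetical name.
CREATE TABLE my_table_summary AS
SELECT * FROM (SUMMARIZE my_table);
```

Storing the summary as a table makes it possible to filter it with ordinary SQL, for example to list only the columns that contain NULL values.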
You can read more about the `SUMMARIZE` command in the [DuckDB documentation](https://duckdb.org/docs/guides/meta/summarize.html).