Data Stack

5 SQL Functions Every Analyst Should Actually Use

Quick answer

If an analyst only masters a small SQL toolkit, these five functions cover a lot of real work: use DATE_TRUNC for time grain, COALESCE for null handling, ROW_NUMBER for deduping or latest-row logic, LAG for period comparisons, and SUM() OVER() for running totals or share-of-total analysis.

Analysts do not need every advanced SQL feature to be effective. They need a small set of functions that keep showing up in reporting, metric validation, and exploratory analysis. This video centers on five that are genuinely useful in day-to-day work because they solve recurring problems without forcing you into overly complex query patterns.

Video walkthrough

A walkthrough of five practical SQL functions for analytics work: DATE_TRUNC(), COALESCE(), ROW_NUMBER(), LAG(), and SUM() OVER().

Watch on YouTube

Use DATE_TRUNC to report at the right time grain

Analysts constantly move between daily, weekly, and monthly reporting. DATE_TRUNC is one of the fastest ways to normalize timestamps into the time grain a dashboard or KPI review actually needs.

This is especially useful when the source data is event-level but the stakeholder question is monthly revenue, weekly orders, or daily active users. Instead of fighting raw timestamps throughout the query, collapse them once and group cleanly from there.

select
  date_trunc('month', order_date) as order_month,
  sum(revenue) as monthly_revenue
from fct_orders
group by 1
order by 1
  • Use DATE_TRUNC as early as possible when the final output is aggregated by time period.
  • Keep the chosen grain consistent between the select, group by, and order by clauses.
  • Name the derived field clearly so downstream dashboards do not confuse a raw timestamp with a reporting period.

Use COALESCE to keep nulls from breaking analytical logic

Null handling shows up everywhere in analytics work. COALESCE gives you a simple way to replace missing values before they distort arithmetic, filtering, or presentation logic.

The function is practical because it keeps intent visible. A reader can immediately see whether you want a fallback numeric value, a backup text field, or a default category for missing data.

select
  order_id,
  coalesce(discount_amount, 0) as discount_amount,
  coalesce(marketing_channel, 'Unknown') as marketing_channel
from fct_orders
  • Use COALESCE before calculations when nulls should behave like zero or a known fallback.
  • Be explicit about business meaning because replacing null with zero is not always correct.
  • Apply the fallback once in the model layer when many dashboards depend on the same rule.

Use ROW_NUMBER to dedupe records or isolate the latest row

ROW_NUMBER is one of the most practical window functions for analysts because it solves a common warehouse problem: multiple rows exist for the same entity, and you need the most recent or highest-priority one.

Instead of relying on a fragile join back to a max timestamp, assign a row number inside each entity partition and filter to the top row. That pattern is easier to audit and usually easier to adapt when the sort logic changes.

with ranked_customers as (
  select
    customer_id,
    status,
    updated_at,
    row_number() over (
      partition by customer_id
      order by updated_at desc
    ) as row_num
  from dim_customer_history
)
select *
from ranked_customers
where row_num = 1
  • Partition by the entity you want one final row for, such as customer_id or order_id.
  • Order by the field that reflects the business priority, not just the newest timestamp by default.
  • Use ROW_NUMBER when you want exactly one row back, even if ties exist.

Use LAG for period-over-period comparisons

LAG is what turns a plain trend table into a comparison table. Once the current period and previous period sit next to each other, it becomes much easier to calculate month-over-month change, week-over-week growth, or any other sequential comparison.

For analysts, that matters because stakeholders usually want movement, not just totals. LAG helps you keep the comparison logic in SQL instead of pushing it into a spreadsheet or dashboard workaround later.

with monthly_revenue as (
  select
    date_trunc('month', order_date) as order_month,
    sum(revenue) as revenue
  from fct_orders
  group by 1
)
select
  order_month,
  revenue,
  lag(revenue) over (order by order_month) as prior_month_revenue
from monthly_revenue
order by order_month
  • Build the base aggregation first, then apply LAG to the summarized result.
  • Order the window by the reporting period so the comparison is chronologically correct.
  • Use the lagged value to compute absolute change or percentage change in one more calculated field.

Use SUM() OVER() for running totals and contribution analysis

SUM() OVER() is the windowed version of aggregation that analysts reach for once they need context beyond one row. It is useful for running totals, percent of total, cumulative revenue, and other cases where each row needs access to a larger frame of data.

This is one of the functions that starts to make a query feel more analytical instead of just tabular. You are no longer collapsing rows with a group by alone. You are keeping row-level detail while adding summary context alongside it.

select
  order_month,
  revenue,
  sum(revenue) over (
    order by order_month
    rows between unbounded preceding and current row
  ) as running_revenue
from monthly_revenue
order by order_month
  • Use an ordered window for running totals and a partitioned window for percent-of-category analysis.
  • Be explicit about the window frame when you want a cumulative calculation.
  • Check whether you need row-level detail preserved before choosing a window function over a standard group by.

These five functions cover a large share of analyst SQL work

The reason this list is practical is not that it is exhaustive. It is that these functions solve recurring analytical tasks: setting time grain, handling missing values, deduping entities, comparing periods, and layering summary context onto detailed results.

If an analyst gets comfortable combining them, a lot of reporting logic becomes simpler to build and easier to QA. A monthly revenue model might start with DATE_TRUNC, use COALESCE for missing channel values, rely on LAG for period comparisons, and then add SUM() OVER() for cumulative tracking. That is a compact toolkit with a lot of leverage.

  • Master the patterns that recur in real reporting work before chasing more obscure SQL features.
  • Keep example queries around so the team can reuse proven window-function patterns.
  • Move repeated logic into modeled tables once the same calculations start showing up in multiple dashboards.

Frequently asked questions

What SQL functions should analysts learn first?

A strong starting set is DATE_TRUNC, COALESCE, ROW_NUMBER, LAG, and SUM() OVER(). Together they cover time-based reporting, null handling, deduping, period comparisons, and windowed aggregation, which are common tasks in analytical SQL.

What is DATE_TRUNC used for in analytics?

DATE_TRUNC is used to standardize timestamps to a reporting grain such as day, week, or month. Analysts use it when raw event timestamps need to roll up into clean KPI tables or dashboard-ready trend views.

When should I use ROW_NUMBER in SQL?

Use ROW_NUMBER when you need to rank rows within a group and then keep exactly one of them, such as the latest customer status row or the highest-priority event per account. It is especially useful for deduping warehouse tables with history.

Why are window functions important for analysts?

Window functions matter because they let analysts compare, rank, and summarize data without losing the row-level detail needed for investigation. Functions like LAG and SUM() OVER() are how many production KPI tables handle trend and cumulative analysis cleanly.

Related service

Need help standardizing analyst SQL, reducing reporting-query sprawl, or turning fragile dashboard logic into maintainable warehouse models?

Explore SQL analytics consulting

Available for new projects

Let's work
together

Ready to transform your analytics operations? Get in touch with our team to discuss how we can help unlock the value in your data.

Location

44 Montgomery St
San Francisco, CA 94104

Ready to get started?

Schedule a consultation

Discuss your analytics priorities and build an actionable roadmap with our team.

Schedule a Consultation

© 2026 DF Insights. All rights reserved.