Skip to content

Support aggregates and constant filters in QUALIFY #17210

@alamb

Description

@alamb

@haohuaijin added support for QUALIFY in

@Vedin has pointed out some follow on work here

Hi @haohuaijin, I accidentally worked on the same feature. The approach is the same. So, I don't want anyhow conflict with your contribution and will just wait until your PR is merged. I just want to highlight 2 cases that probably are not currently covered in your PR.

  1. Aggregate functions with QUALIFY
CREATE TABLE qt (i INT, p VARCHAR, o INT) AS VALUES
  (1, 'A', 1),
  (2, 'A', 2),
  (3, 'B', 1),
  (4, 'B', 2);

SELECT p, SUM(o) AS s
FROM qt
GROUP BY p
QUALIFY RANK() OVER (ORDER BY s DESC) = 1
ORDER BY p;
  1. Constant filter + QUALIFY
CREATE TABLE web_base_events_this_run (
  domain_sessionid VARCHAR,
  app_id VARCHAR,
  page_view_id VARCHAR,
  derived_tstamp TIMESTAMP,
  dvce_created_tstamp TIMESTAMP,
  event_id VARCHAR
) AS SELECT * FROM VALUES
  ('ds1', 'appA', NULL, '2025-01-01 10:00:00'::timestamp, '2025-01-01 10:05:00'::timestamp, 'e1'),
  ('ds1', 'appA', NULL, '2025-01-01 11:00:00'::timestamp, '2025-01-01 11:00:00'::timestamp, 'e2'),
  ('ds1', 'appA', 'pv', '2025-01-01 12:00:00'::timestamp, '2025-01-01 12:00:00'::timestamp, 'e3'),
  ('ds2', 'appB', NULL, '2025-01-01 09:00:00'::timestamp, '2025-01-01 09:10:00'::timestamp, 'e4'),
  ('ds2', 'appB', NULL, '2025-01-01 09:05:00'::timestamp, '2025-01-01 09:09:00'::timestamp, 'e5');
  
  SELECT domain_sessionid, app_id
FROM web_base_events_this_run
WHERE page_view_id IS NULL
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY domain_sessionid
  ORDER BY derived_tstamp, dvce_created_tstamp, event_id
) = 1
ORDER BY domain_sessionid;

I covered the first one by adding qualify expressions to aggr_expr_haystack:

        let aggr_expr_haystack = select_exprs
            .iter()
            .chain(having_expr_opt.iter())
            .chain(qualify_expr_opt_pre_aggr.iter());

The second one required me to change the logic in common_subexpr_eliminate.rs. You can check it out here (Embucket#34). Maybe you'll come up with a better solution.
Hope this helps.

Originally posted by @Vedin in #16933 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestsqlSQL Planner

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions