Summary
query66 from TPC-DS exhibits two distinct bugs at scale=1 (1G):
- Row drop: MO returns 4 rows; MySQL 8.4.2 and PostgreSQL 13.5 both return 5 rows. The missing row corresponds to the warehouse with
w_warehouse_name='' and w_warehouse_sq_ft IS NULL.
- Non-deterministic numeric values: Across 10 consecutive runs, MO returns 10 different result hashes (same 4 rows, but the aggregated numeric columns differ each time).
Environment
- MatrixOne: 8.0.30 (standalone)
- Dataset: TPC-DS scale = 1 (1G), standard 25 tables
- Database:
tpcds
- Reference engines: MySQL 8.4.2, PostgreSQL 13.5 (both stable, both return 5 rows)
Reproduction
Bug 1: Row drop
# MO returns 4 rows, MySQL/PG both return 5
mysql -h<MO_HOST> -P6001 -u<USER> -p<PWD> tpcds -N -B \
< query66.sql 2>/dev/null | wc -l
Bug 2: Non-determinism
# Row count is stable (always 4), but content hash differs every run
for i in {1..10}; do
mysql -h<MO_HOST> -P6001 -u<USER> -p<PWD> tpcds -N -B \
< query66.sql 2>/dev/null | sort | md5sum | awk '{print $1}'
done | sort | uniq -c
Observed:
1 0846396bcb46f6fc864baa157e338cfe
1 28998c0dcbfa1879b54902acf7483af5
1 2da4214ec33249476896e4c7c5479203
1 2ef6403e139f3ff13cc5095d73c521f5
1 390e8676810f9bdee7cb487733306950
1 71f02348d9af0c900032870dc56dcea9
1 75822de0f3df12655ff6dd45732840e7
1 b0403117e234176cba90309be7ec3a60
1 b9e212efcdc4e3e40dafa503fba0e267
1 f8b2735210c2d3963a085f75a0a7b9de
10 runs → 10 distinct content hashes.
The missing row
The row that MO drops, as returned by both MySQL and PG:
| Engine |
First column (w_warehouse_name) |
w_warehouse_sq_ft |
Rest |
| MySQL |
'' (empty) |
0 |
retained |
| PG |
'' (empty) |
NULL |
retained |
| MO |
— |
— |
missing |
The corresponding underlying data in MO:
mysql> select w_warehouse_sk, w_warehouse_name, w_warehouse_sq_ft, w_city from warehouse;
+----------------+----------------------+-------------------+----------+
| w_warehouse_sk | w_warehouse_name | w_warehouse_sq_ft | w_city |
+----------------+----------------------+-------------------+----------+
| 1 | Conventional childr | 977787 | Fairview |
| 2 | Important issues liv | 138504 | Fairview |
| 3 | Doors canno | 294242 | Fairview |
| 4 | Bad cards must make. | 621234 | Fairview |
| 5 | (empty) | NULL | Fairview | <-- this row's group is dropped
+----------------+----------------------+-------------------+----------+
Important: PG also stores w_warehouse_sq_ft = NULL for this row, yet PG retains the GROUP BY group. So the bug is not caused by the NULL value itself — it is in MO's GROUP BY / aggregation pipeline.
Why this is critical
- Data completeness: A whole result row silently disappears. End users have no signal that anything is wrong.
- Non-determinism: Even the rows that are returned have different aggregated values on every run, so the result of this query is unreliable as input for any downstream system.
- Cross-engine consistency: Both MySQL and PG agree on the row count (5) and on stable numeric output across runs. MO is the outlier on both axes.
- Self-contained minimal repro could not trigger the bug: I built a stripped-down 5-table test with hand-written rows mirroring the same shape (
w_warehouse_name='', w_warehouse_sq_ft=NULL, full outer/inner aggregation, || carriers, etc.) and MO returned the correct 2 rows. This strongly indicates the bug is scale- and parallel-execution related (only manifests at TPC-DS 1G volume across web_sales ≈ 720K rows joined with warehouse/date_dim/time_dim/ship_mode).
Suggested investigation direction
- Parallel Hash GROUP BY behavior on composite keys that include
(empty_string, NULL)
- Whether one worker's local NULL bucket is being discarded during the merge step
- Whether
SUM(x / NULL) in the projection list interacts with the GROUP BY pipeline in a way that affects which groups survive
- Whether the inner subquery's GROUP BY emits the row but the outer GROUP BY drops it
SQL (TPC-DS query66, unchanged)
The full SQL is the standard TPC-DS query66; available in the mo-auto-test repo at query_1g_mo/query66.sql.
Key shape:
SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, ..., ship_carriers, year,
SUM(jan_sales), ..., SUM(jan_sales / w_warehouse_sq_ft), ...,
SUM(jan_net), ...
FROM (
SELECT w_warehouse_name, w_warehouse_sq_ft, w_city, ...,
'ORIENTAL' || ',' || 'BOXBUNDLES' AS ship_carriers,
d_year AS year,
SUM(CASE WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity ELSE 0 END) AS jan_sales,
...
FROM web_sales, warehouse, date_dim, time_dim, ship_mode
WHERE ws_warehouse_sk = w_warehouse_sk
AND ws_sold_date_sk = d_date_sk
AND ws_sold_time_sk = t_time_sk
AND ws_ship_mode_sk = sm_ship_mode_sk
AND d_year = 2001
AND t_time BETWEEN 42970 AND 42970 + 28800
AND sm_carrier IN ('ORIENTAL', 'BOXBUNDLES')
GROUP BY w_warehouse_name, w_warehouse_sq_ft, w_city, ..., d_year
UNION ALL
-- second arm: catalog_sales, same shape
...
) x
GROUP BY w_warehouse_name, w_warehouse_sq_ft, w_city, ..., ship_carriers, year
ORDER BY w_warehouse_name, w_warehouse_sq_ft, w_city, ...
LIMIT 100;
Data preparation reference
Standard TPC-DS 1G dataset; load scripts available in https://github.com/matrixorigin/mo-auto-test (load_tpcds_1g.sh).
Summary
query66from TPC-DS exhibits two distinct bugs at scale=1 (1G):w_warehouse_name=''andw_warehouse_sq_ft IS NULL.Environment
tpcdsReproduction
Bug 1: Row drop
Bug 2: Non-determinism
Observed:
10 runs → 10 distinct content hashes.
The missing row
The row that MO drops, as returned by both MySQL and PG:
w_warehouse_name)w_warehouse_sq_ft''(empty)0''(empty)NULLThe corresponding underlying data in MO:
Important: PG also stores
w_warehouse_sq_ft = NULLfor this row, yet PG retains the GROUP BY group. So the bug is not caused by the NULL value itself — it is in MO's GROUP BY / aggregation pipeline.Why this is critical
w_warehouse_name='',w_warehouse_sq_ft=NULL, full outer/inner aggregation,||carriers, etc.) and MO returned the correct 2 rows. This strongly indicates the bug is scale- and parallel-execution related (only manifests at TPC-DS 1G volume acrossweb_sales≈ 720K rows joined withwarehouse/date_dim/time_dim/ship_mode).Suggested investigation direction
(empty_string, NULL)SUM(x / NULL)in the projection list interacts with the GROUP BY pipeline in a way that affects which groups surviveSQL (TPC-DS query66, unchanged)
The full SQL is the standard TPC-DS query66; available in the mo-auto-test repo at
query_1g_mo/query66.sql.Key shape:
Data preparation reference
Standard TPC-DS 1G dataset; load scripts available in https://github.com/matrixorigin/mo-auto-test (
load_tpcds_1g.sh).