
Conversation

@knudtty
Contributor

@knudtty knudtty commented Dec 17, 2025

Closes HDX-3070

Adding mapContains allows a bloom filter index to be used to skip searching a granule when a key for a given map is not present in that granule. In the testing I've done, this yielded 40% fewer rows scanned.
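
For illustration, a minimal sketch of the rewrite this enables; the table, column, and key names below are made up, not from this PR:

```ts
// Hypothetical example: the optimizer ANDs a mapContains guard into the
// WHERE clause, so a bloom filter index over the map's keys can rule out
// granules that never contain the key at all.
const before = `SELECT Body FROM logs WHERE LogAttributes['level'] = 'error'`;
const after = `SELECT Body FROM logs WHERE mapContains(LogAttributes, 'level') AND LogAttributes['level'] = 'error'`;
```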

@changeset-bot

changeset-bot bot commented Dec 17, 2025

🦋 Changeset detected

Latest commit: 6fed8ca

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name                    Type
@hyperdx/common-utils   Patch


@vercel

vercel bot commented Dec 17, 2025

The latest updates on your projects.

Project              Deployment   Review             Updated (UTC)
hyperdx-v2-oss-app   Ready        Preview, Comment   Dec 18, 2025 11:14pm

@knudtty knudtty requested review from a team and wrn14897 and removed request for a team December 17, 2025 21:04
@claude

claude bot commented Dec 17, 2025

Code Review

Critical Issues

  • SQL injection vulnerability in renderChartConfig.ts:510: map keys are not escaped before string interpolation. Use backticks or proper SQL escaping for mapName, and escape quotes in keyName.

  • Feature flag logic error in renderChartConfig.ts:15-17: the || operator causes the optimization to be ALWAYS enabled (truthy in non-production). Should be: process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED === 'true' || process.env.NODE_ENV !== 'production'. A sketch of these first two fixes follows this list.

  • Missing null safety in renderChartConfig.ts:467: cur.default_expression could be undefined/null for materialized columns. Add a null check before .indexOf().
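
A minimal sketch of the first two fixes, using the names from the review rather than the actual PR code:

```ts
// Feature flag: compare against the string 'true' so an unset or arbitrary
// value doesn't silently enable the optimization.
const MAP_CONTAINS_OPTIMIZATION_ENABLED =
  process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED === 'true' ||
  process.env.NODE_ENV !== 'production';

// Escape backslashes and single quotes in the key, and backtick-quote the
// map name, before interpolating into mapContains(...).
const escapeKeyName = (keyName: string): string =>
  keyName.replace(/\\/g, '\\\\').replace(/'/g, "\\'");

const mapContainsExpr = (mapName: string, keyName: string): string =>
  `mapContains(\`${mapName}\`, '${escapeKeyName(keyName)}')`;
```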

Important Issues

  • Inconsistent error handling in renderChartConfig.ts:378, 533: empty catch blocks silently swallow all errors, including TypeErrors. Consider logging errors in development or catching specific parser errors only.

  • Missing validation in extractIdent:319: node.array_index[0].index.value needs type validation; verify it is a string before using it as keyName (see the sketch after this list).
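
A possible guard for the second point; the field path comes from the review, the surrounding code is assumed:

```ts
// Inside extractIdent: only accept a plain string key as keyName.
const key = node.array_index?.[0]?.index?.value;
if (typeof key !== 'string') {
  return undefined; // numeric or expression index: not a simple map key
}
```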

Minor Issues

  • Tests are comprehensive and cover edge cases well
  • CTE detection logic is correct
  • Deduplication using objectHash is appropriate

Recommendation: Fix the SQL injection and feature flag logic before merge.

@github-actions
Contributor

github-actions bot commented Dec 17, 2025

E2E Test Results

All tests passed • 46 passed • 3 skipped • 816s

Status Count
✅ Passed 46
❌ Failed 0
⚠️ Flaky 1
⏭️ Skipped 3

Tests ran across 4 shards in parallel.


Comment on lines -7 to +9
exports[`renderChartConfig HAVING clause should render HAVING clause with SQL language 1`] = `"SELECT count(),severity FROM default.logs WHERE (timestamp >= fromUnixTimestamp64Milli(1739318400000) AND timestamp <= fromUnixTimestamp64Milli(1739491200000)) GROUP BY severity HAVING count(*) > 100"`;
exports[`renderChartConfig HAVING clause should render HAVING clause with SQL language 1`] = `"SELECT count(),severity FROM default.logs WHERE (timestamp >= fromUnixTimestamp64Milli(1739318400000) AND timestamp <= fromUnixTimestamp64Milli(1739491200000)) GROUP BY severity HAVING COUNT(*) > 100"`;

exports[`renderChartConfig HAVING clause should render HAVING clause with granularity and groupBy 1`] = `"SELECT count(),event_type,toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\` FROM default.events WHERE (timestamp >= fromUnixTimestamp64Milli(1739318400000) AND timestamp <= fromUnixTimestamp64Milli(1739491200000)) GROUP BY event_type,toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\` HAVING count(*) > 50 ORDER BY toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\`"`;
exports[`renderChartConfig HAVING clause should render HAVING clause with granularity and groupBy 1`] = `"SELECT count(),event_type,toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\` FROM default.events WHERE (timestamp >= fromUnixTimestamp64Milli(1739318400000) AND timestamp <= fromUnixTimestamp64Milli(1739491200000)) GROUP BY event_type,toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\` HAVING COUNT(*) > 50 ORDER BY toStartOfInterval(toDateTime(timestamp), INTERVAL 5 minute) AS \`__hdx_time_bucket\`"`;
Contributor Author

These cases never ran through the SQL parser previously, so the parser now just capitalizes a few things. Otherwise the tests didn't change.

Comment on lines +701 to +703
expect(actual.toLowerCase()).toContain(
'avg(response_time) > 500 and count(*) > 10',
);
Contributor Author

Just case changes.

};
};

const optimizeMapAccessWhere = ({
Contributor Author

Builds the AST, traverses it extracting each ident, adds mapContains to the AST if the proper conditions are met, and builds it back into SQL.
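
A rough sketch of that shape, assuming node-sql-parser's astify/sqlify API; the SQLMapValueIdent shape and the collectIdents traversal are hypothetical stand-ins, and addIdentToAst mirrors the helper shown further down:

```ts
import { Parser } from 'node-sql-parser';

// Assumed shapes, declared only so the sketch type-checks.
type SQLMapValueIdent = { type: 'map'; name: string; keyName: string };
declare function collectIdents(
  ast: unknown,
): { ident: SQLMapValueIdent; doesContain: boolean }[];
declare function addIdentToAst(
  ident: SQLMapValueIdent,
  doesContain: boolean,
): void;

const parser = new Parser();

function optimizeMapAccessWhere(
  sql: string,
  mapFieldNames: Set<string>,
): string | undefined {
  try {
    const ast = parser.astify(sql); // build the AST
    for (const { ident, doesContain } of collectIdents(ast)) {
      if (mapFieldNames.has(ident.name)) {
        addIdentToAst(ident, doesContain); // AND in a mapContains(...) guard
      }
    }
    return parser.sqlify(ast); // build back into SQL
  } catch {
    return undefined; // on parse failure, fall back to the original query
  }
}
```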

case 'column_ref': {
const ident = extractIdent(node as ColumnRef);
if (ident) {
idents.push({ doesContain: true, ident });
Contributor Author

Either a column or a map; we want both, since a materialized column could be a map key optimization in disguise.

Comment on lines +475 to +485
// replace materialized idents with map ident
for (const curIdent of idents) {
if (curIdent.ident.type === 'column') {
const materializedMapIdent = materializedColumnToMapIdent.get(
curIdent.ident.name,
);
if (materializedMapIdent) {
curIdent.ident = materializedMapIdent;
}
}
}
Contributor Author

This allows us to add mapContains even when that map entry is materialized, which is advantageous because it still makes use of the map key index.
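
For context, a hedged sketch of how that mapping could be built from column metadata; the regex and ident shape are illustrative, not the PR's actual code:

```ts
// A column defined as, e.g.
//   `mat_severity` String MATERIALIZED LogAttributes['severity']
// is a map key access in disguise; mapping it back to the map ident lets
// the optimizer still emit mapContains(LogAttributes, 'severity').
function buildMaterializedColumnToMapIdent(
  columns: { name: string; default_expression?: string | null }[],
): Map<string, { mapName: string; keyName: string }> {
  const result = new Map<string, { mapName: string; keyName: string }>();
  for (const col of columns) {
    const expr = col.default_expression;
    if (expr == null) continue; // guard: may be missing (see review above)
    const match = expr.match(/^(\w+)\['([^']+)'\]$/);
    if (match) {
      result.set(col.name, { mapName: match[1], keyName: match[2] });
    }
  }
  return result;
}
```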

Member

I'm not sure the materialized field would benefit from this optimization, but good to know.

import { CustomSchemaSQLSerializerV2, SearchQueryBuilder } from '@/queryParser';

const MAP_CONTAINS_OPTIMIZATION_ENABLED =
process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED ||
Member

@wrn14897 wrn14897 Dec 19, 2025

style: better to follow the pattern like (process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED ?? 'false') === 'true'
and move it to config.ts
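
i.e., something along these lines (a sketch of the suggested pattern, placed in config.ts per the comment):

```ts
// config.ts
export const IS_MAP_CONTAINS_OPTIMIZATION_ENABLED =
  (process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED ?? 'false') ===
  'true';
```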

Comment on lines +459 to +461
const maps = new Set(
columns.filter(v => v.type.startsWith('Map')).map(v => v.name),
);
Member

@wrn14897 wrn14897 Dec 19, 2025

perf: We can move this out of the method, and we don't need to traverse the AST if it's empty. Also, maps is a bit ambiguous; maybe mapFieldNames?
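
A sketch of the suggested shape; the column type and the early-return placement are assumed:

```ts
// Compute once from the column metadata instead of on every call:
function getMapFieldNames(
  columns: { name: string; type: string }[],
): Set<string> {
  return new Set(
    columns.filter(v => v.type.startsWith('Map')).map(v => v.name),
  );
}

// Then, at the top of the optimizer, skip the AST traversal entirely:
//   if (mapFieldNames.size === 0) return undefined;
```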


return parser.sqlify(ast);
} catch {
// ignore
Member

Should we log errors during development for debugging purposes?
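
e.g., a dev-only log in that catch (a sketch, not the PR's code):

```ts
try {
  return parser.sqlify(ast);
} catch (e) {
  if (process.env.NODE_ENV !== 'production') {
    // Surface parser failures while developing; stay silent in production.
    console.warn('mapContains optimization skipped:', e);
  }
}
```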

}

// add map idents to AST
const addIdentToAst = (ident: SQLMapValueIdent, doesContain: boolean) => {
Member

style: We can probably move this function out instead of manipulating the AST directly


const MAP_CONTAINS_OPTIMIZATION_ENABLED =
process.env.NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED ||
process.env.NODE_ENV !== 'production';
Member

I'd suggest removing this (in case NODE_ENV is somehow not set properly in prod) and adding NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED to your .env.local instead.
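
i.e., drop the NODE_ENV fallback entirely and opt in locally, e.g.:

```
# .env.local
NEXT_PUBLIC_MAP_CONTAINS_OPTIMIZATION_ENABLED=true
```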
