End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails by Copilot · Pull Request #141 · microsoft/PowerPlatform-DataverseClient-Python

Copilot · 2026-03-13T16:12:41Z

SQL developers working with Dataverse shouldn't need to learn OData or the Web API. This PR closes that gap by making the Python SDK's SQL endpoint usable end-to-end -- from schema discovery, through querying (including SELECT *, JOINs, aggregates), to writing results back via DataFrames.

Live-validated against Aurora VM: 66/66 SQL tests + 15-table JOIN confirmed + all anti-patterns guarded.

What's new

Schema discovery APIs

list_columns(table), list_relationships(), list_table_relationships(table) (OneToMany + ManyToOne + ManyToMany)

SELECT * auto-expansion

SDK transparently rewrites SELECT * into explicit column names (server blocks * directly). Warns when used with JOINs.
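
A minimal sketch of how such a rewrite could work, assuming the column names have already been fetched (for example via list_columns). The function name `expand_select_star` and both regexes are hypothetical illustrations, not the SDK's actual implementation.

```python
import re
import warnings

# Hypothetical patterns: a leading "SELECT *" (optionally "SELECT TOP n *") and any JOIN.
_SELECT_STAR_RE = re.compile(r"^\s*SELECT\s+(TOP\s+\d+\s+)?\*\s+FROM\b", re.IGNORECASE)
_JOIN_RE = re.compile(r"\bJOIN\b", re.IGNORECASE)


def expand_select_star(sql: str, columns: list[str]) -> str:
    """Rewrite SELECT * into an explicit column list; return other queries unchanged."""
    match = _SELECT_STAR_RE.match(sql)
    if match is None:
        return sql
    if _JOIN_RE.search(sql):
        # Only the FROM table's columns are known here, so the expansion is partial.
        warnings.warn(
            "SELECT * with JOIN only expands columns of the FROM table",
            UserWarning,
            stacklevel=2,
        )
    top = match.group(1) or ""
    return f"SELECT {top}{', '.join(columns)} FROM{sql[match.end():]}"


print(expand_select_star("SELECT TOP 10 * FROM account", ["accountid", "name"]))
```

A query that already names its columns passes through untouched, so the rewrite is safe to apply to every statement.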

DataFrame SQL bridge

client.dataframe.sql() -- SQL to pandas DataFrame. Fully bidirectional: SQL read -> pandas transform -> SDK write-back.

SQL helpers (`client.query`)

sql_columns, sql_select, sql_joins, sql_join -- auto-discover columns and JOINs from metadata.

OData helpers (`client.query`)

odata_select, odata_expands, odata_expand, odata_bind -- auto-discover nav properties, build @odata.bind.

SQL safety guardrails (3 layers of protection)

Layer 1 -- Server rejects (user cannot do these at all):

JOIN without ON -- server error
INSERT/UPDATE/DELETE -- server error
SELECT * -- server error (SDK rewrites transparently)
TOP > 5000 -- server error
RIGHT/FULL OUTER/CROSS JOIN -- server error
Subqueries -- server error

Layer 2 -- SDK blocks or warns (before hitting the server):

INSERT/UPDATE/DELETE -- ValidationError with clear message
FROM a, b (cartesian) -- UserWarning (server allows but produces N*M intermediate rows)
LIKE '%value' -- UserWarning (full table scan advisory)
SELECT * with JOIN -- UserWarning (partial expansion advisory)
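
The block-or-warn split in Layer 2 can be sketched roughly as follows. This is an illustration only: the `ValidationError` class here is a local stand-in, and `check_sql` and its regexes are hypothetical names rather than the SDK's code.

```python
import re
import warnings


class ValidationError(ValueError):
    """Local stand-in for the SDK's ValidationError (hypothetical definition)."""


_WRITE_RE = re.compile(r"^\s*(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)
_LEADING_WILDCARD_RE = re.compile(r"\bLIKE\s+'%", re.IGNORECASE)
# FROM <table> [[AS] alias], <table>: a comma join, which has no ON clause.
_COMMA_JOIN_RE = re.compile(
    r"\bFROM\s+\w+(?:\s+(?:AS\s+)?\w+)?\s*,\s*\w+", re.IGNORECASE
)


def check_sql(sql: str) -> None:
    """Raise on write statements; warn on risky but server-permitted patterns."""
    if _WRITE_RE.match(sql):
        raise ValidationError("SQL endpoint is read-only")
    if _COMMA_JOIN_RE.search(sql):
        warnings.warn(
            "Implicit cross join (FROM a, b) produces N*M intermediate rows",
            UserWarning,
            stacklevel=2,
        )
    if _LEADING_WILDCARD_RE.search(sql):
        warnings.warn(
            "Leading-wildcard LIKE forces a full table scan",
            UserWarning,
            stacklevel=2,
        )


check_sql("SELECT TOP 5 name FROM account WHERE name LIKE 'corp%'")  # passes silently
```

Raising stops the query before any network call, while warnings let the query proceed, which matches the distinction between blocked and advisory patterns.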

Layer 3 -- Documentation warns (sql_examples.py Section 32):

Cartesian products, leading-wildcard LIKE, unfiltered system tables, SELECT * on wide tables, deep JOINs without TOP

Anti-patterns and best practices (Section 32 of examples)

Dedicated section documenting 5 SQL anti-patterns with clear explanations of WHY they're dangerous (shared database impact) and HOW to fix them. The SDK's guardrails automatically catch patterns 1 and 2.

Key findings from live testing

SQL JOINs use raw attribute name (parentcustomerid), not _value suffix
No server depth limit on JOINs -- tested up to 15 tables (all PASS, ~0.6s)
JOINs with proper ON clauses use indexed FK lookups -- safe for production
Server enforces: 5000-row cap, ON clause required for JOINs, blocks write SQL
This exceeds OData's 10-level limit (SQL JOINs are flat, not nested)

Test coverage

756 unit tests passing
66 live validation tests (custom + built-in tables + all helpers)
15-table JOIN depth confirmed on Aurora VM
SQL vs OData comparison with live benchmark in examples

Comprehensive examples (`sql_examples.py` -- 34 sections, 1194 lines)

saurabhrb · 2026-03-13T16:24:08Z

@copilot
Update the PR description to also show the case where no columns are selected, i.e., with the wildcard *.

Copilot · 2026-03-13T16:25:29Z

Updated the PR description to show the wildcard case — calling list_columns("account") with no select parameter returns all column properties (equivalent to *). The description now clearly distinguishes between "all properties" and "narrowed via select".
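
As a rough illustration of the difference between "all properties" and "narrowed via select", the sketch below builds the metadata request path with and without a `$select` option. The helper name `attributes_path` and its parameters are hypothetical; only the `EntityDefinitions({MetadataId})/Attributes` path shape comes from this PR.

```python
from typing import Optional
from urllib.parse import quote


def attributes_path(metadata_id: str, select: Optional[list[str]] = None) -> str:
    """Build the relative metadata path; omit $select to request all properties."""
    path = f"EntityDefinitions({metadata_id})/Attributes"
    if select:
        # Commas separate property names and must stay unescaped.
        path += "?$select=" + quote(",".join(select), safe=",")
    return path


# Placeholder id, for illustration only.
print(attributes_path("<metadata-id>"))
print(attributes_path("<metadata-id>", ["LogicalName", "AttributeType"]))
```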

Copilot

Pull request overview

Adds public schema discovery methods under client.tables so SDK users can programmatically discover table columns (attributes) and relationship definitions, which is particularly helpful when authoring client.query.sql() queries.

Changes:

Added TableOperations.list_columns(), list_relationships(), and list_table_relationships() public APIs that delegate to data-layer helpers.
Implemented the corresponding data-layer methods _ODataClient._list_columns() and _RelationshipOperationsMixin._list_relationships() / _list_table_relationships().
Added unit tests and README examples for the new discovery APIs.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/PowerPlatform/Dataverse/operations/tables.py` | Adds the three new public `client.tables.*` schema discovery methods and docstrings/examples. |
| `src/PowerPlatform/Dataverse/data/_odata.py` | Implements `_list_columns()` to fetch `EntityDefinitions({MetadataId})/Attributes`. |
| `src/PowerPlatform/Dataverse/data/_relationships.py` | Implements `_list_relationships()` and `_list_table_relationships()` (1:N + N:N) metadata discovery. |
| `tests/unit/test_tables_operations.py` | Verifies the new public methods delegate correctly to the underlying `_odata` methods. |
| `tests/unit/data/test_relationships.py` | Adds data-layer tests for `_list_relationships()` and `_list_table_relationships()`. |
| `tests/unit/data/test_list_columns.py` | New data-layer tests for `_ODataClient._list_columns()`. |
| `README.md` | Documents usage examples for the new schema discovery APIs. |

…ith tests and README examples Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

…me bridge, guardrails, examples

- Schema discovery APIs: list_columns, list_relationships, list_table_relationships
- SELECT * auto-expansion via list_columns (server blocks SELECT *)
- client.dataframe.sql() for SQL -> DataFrame bridge
- SQL safety guardrails: block writes, auto-inject TOP 5000, warn on leading-wildcard LIKE and implicit cross joins
- Comprehensive sql_examples.py (28 sections): JOINs, aggregates, GROUP BY, DISTINCT, OFFSET FETCH, polymorphic lookups, audit trail, SQL read -> DataFrame transform -> SDK write-back
- Updated query.sql() docstring with accurate SQL capabilities
- 732 tests (was 318), all passing

…dd SELECT * + JOIN warning

Live-tested against Aurora VM (aurorabapenv71aff.crm10.dynamics.com):

- Server auto-caps at 5000 rows without TOP -> no client-side injection needed
- Server blocks SELECT * on both single-table and JOIN queries -> SDK expansion confirmed needed
- Added warning when SELECT * used with JOIN (expansion only includes first table columns)

732 tests passing.

New methods on client.query for SQL-first developers:

- sql_columns(table) -> simplified column metadata list
- sql_select(table) -> comma-separated column list for SELECT
- sql_joins(table) -> all possible JOINs with ready-to-use clauses
- sql_join(from, to) -> auto-generated JOIN clause between tables

Key finding from live testing: SQL JOINs use the raw attribute name (e.g. parentcustomerid), NOT the _value suffix. The ReferencingAttribute from relationship metadata matches exactly.

13 new unit tests, 745 total passing.
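
To illustrate the key finding, here is a minimal sketch of deriving a JOIN clause from one relationship metadata record. The function `build_join` and the sample record are hypothetical; the dictionary keys follow the Dataverse relationship metadata property names, and the ON clause uses the raw ReferencingAttribute with no _value suffix.

```python
def build_join(relationship: dict, from_alias: str = "", to_alias: str = "") -> str:
    """Build a JOIN ... ON clause from a single lookup relationship record."""
    referencing_table = relationship["ReferencingEntity"]
    referenced_table = relationship["ReferencedEntity"]
    # Fall back to full table names when no alias is supplied.
    left = from_alias or referencing_table
    right = to_alias or referenced_table
    target = f"{referenced_table} {to_alias}" if to_alias else referenced_table
    return (
        f"JOIN {target} "
        f"ON {left}.{relationship['ReferencingAttribute']} = "
        f"{right}.{relationship['ReferencedAttribute']}"
    )


# Illustrative record: contact has a lookup (parentcustomerid) to account.
contact_to_account = {
    "ReferencingEntity": "contact",
    "ReferencingAttribute": "parentcustomerid",
    "ReferencedEntity": "account",
    "ReferencedAttribute": "accountid",
}

print(build_join(contact_to_account, "c", "a"))
# JOIN account a ON c.parentcustomerid = a.accountid
```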

New methods on client.query for OData users (parallel to SQL helpers):

- odata_select(table) -> list[str] for records.get(select=)
- odata_expands(table) -> all navigation properties with entity sets
- odata_expand(from, to) -> PascalCase nav property name for expand=
- odata_bind(from, to, id) -> @odata.bind dict for create/update payloads

These eliminate the most error-prone parts of OData queries:

- No more guessing PascalCase navigation property names
- No more manually constructing @odata.bind with entity set names
- Column discovery matches records.get(select=) format directly

11 new unit tests, 756 total passing.
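
For context, the payload shape that an @odata.bind helper produces can be sketched as below. The function `build_odata_bind` is a hypothetical stand-in that takes the navigation property and entity set name as inputs; the real odata_bind helper is described as discovering those from metadata.

```python
def build_odata_bind(nav_property: str, entity_set: str, record_id: str) -> dict[str, str]:
    """Return a single-entry dict that associates a lookup with an existing row."""
    return {f"{nav_property}@odata.bind": f"/{entity_set}({record_id})"}


# Illustrative values: link a contact to an account through its customer lookup.
payload = {
    "lastname": "Example",
    **build_odata_bind(
        "parentcustomerid_account",
        "accounts",
        "00000000-0000-0000-0000-000000000001",
    ),
}
print(payload)
```

The entity set name (plural, e.g. accounts) differs from the table's logical name (account), which is one of the details such a helper is meant to look up rather than leave to guesswork.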

_list_table_relationships now fetches all 3 relationship types:

- OneToManyRelationships (incoming: where table is referenced)
- ManyToOneRelationships (outgoing: where table has a lookup) <-- NEW
- ManyToManyRelationships

This was critical for sql_joins() and odata_expands() which need outgoing lookups. Without ManyToOne, custom table lookups were invisible to the helpers.

Also fixed: create_lookup_field requires lowercase logical names (not SchemaName) -- documented in copilot-instructions.

Live-validated: 47/47 SQL tests passed on Aurora VM. Unit tests: 756 passed.

Columns with AttributeOf set are auto-generated display names (e.g. createdbyname, createdbyyominame) that cause errors in OData and are not real data columns. Now filtered out.

Also: _list_table_relationships docstring updated for ManyToOne.

756 unit tests passing.
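
A minimal sketch of the AttributeOf filter described above, assuming each attribute record is a dict with LogicalName and AttributeOf keys. The function name `real_data_columns` and the sample records are illustrative, not taken from the SDK.

```python
def real_data_columns(attributes: list[dict]) -> list[str]:
    """Keep only columns whose AttributeOf is unset (i.e. real data columns)."""
    return [a["LogicalName"] for a in attributes if not a.get("AttributeOf")]


# Illustrative metadata: the *name columns are derived from the createdby lookup.
sample_attributes = [
    {"LogicalName": "createdby", "AttributeOf": None},
    {"LogicalName": "createdbyname", "AttributeOf": "createdby"},
    {"LogicalName": "createdbyyominame", "AttributeOf": "createdby"},
    {"LogicalName": "name", "AttributeOf": None},
]

print(real_data_columns(sample_attributes))
# ['createdby', 'name']
```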

… depth

- Moved write statement check (INSERT/UPDATE/DELETE) before _extract_logical_table() so users get the clear 'SQL endpoint is read-only' error instead of confusing 'no FROM clause' ValueError
- Live-tested JOIN depth: 8-table JOINs confirmed working on Aurora VM
- Live-tested nested polymorphic: customerid -> account -> owner -> businessunit chain works
- Fixed validate_sql_full.py: lookup existence check (no more 95s waits), COUNT assertion relaxed for accumulated data, AttributeOf filter
- 66/66 validation tests passed, 12/12 JOIN depth tests passed
- 756 unit tests passing

…vs OData comparison

New sections (27-31):

- 27: AND/OR, NOT IN, NOT LIKE boolean logic
- 28: Deep JOINs (5-8 tables) with built-in tables
- 29: SQL helper functions (sql_columns, sql_select, sql_joins, sql_join)
- 30: OData helper functions (odata_select, odata_expands, odata_expand, odata_bind)
- 31: SQL vs OData side-by-side comparison with live benchmark
- 32: Updated summary table with all new features
- 33: Cleanup

Summary table now includes: AND/OR, NOT IN/LIKE, 8+ table JOINs, nested polymorphic, self-JOIN, DISTINCT+JOIN, all helper functions. SQL-first workflow updated with helper-driven steps.

756 unit tests passing.

…vel cap)

Live-tested on Aurora VM:

- 9-table: 2.14s PASS
- 10-table: 0.69s PASS (= OData limit)
- 11-table: 0.56s PASS (BEYOND OData limit)
- 12-table: 0.66s PASS
- 13-table: 0.63s PASS
- 15-table: 0.63s PASS (no degradation!)

Updated sql_examples.py summary: 15+ tables (no depth limit)
Updated SQL vs OData comparison table.

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 11 comments.

tests/unit/data/test_select_star_expansion.py

examples/advanced/sql_examples.py

src/PowerPlatform/Dataverse/data/_odata.py

src/PowerPlatform/Dataverse/operations/tables.py

README.md

src/PowerPlatform/Dataverse/data/_odata.py

src/PowerPlatform/Dataverse/operations/query.py

New section 32: IMPORTANT Anti-Patterns & Best Practices

- Cartesian products (FROM a, b without ON) -- AVOID
- Leading-wildcard LIKE -- forces full table scans
- No filter on large system tables -- returns max rows
- SELECT * on wide tables -- transfers 260+ columns
- Deep JOINs without TOP -- always use TOP N

Also: toned down JOIN depth messaging to 'no limit (tested 15)' instead of promoting 15-table JOINs as a feature. Fixed duplicate SQL vs OData table in examples. 34 sections, 1194 lines.

756 unit tests passing.

Implicit cross joins (FROM a, b without ON) now raise ValidationError instead of UserWarning. The server allows these but they produce cartesian products (N*M intermediate rows) that degrade shared DB. SDK now blocks with clear error: 'Implicit cross join detected... Use explicit JOIN...ON syntax instead.'

New subcode: validation_sql_cross_join_blocked

SDK guardrail summary:

- INSERT/UPDATE/DELETE -> ValidationError (blocked)
- FROM a, b (cartesian) -> ValidationError (blocked) <-- upgraded
- LIKE '%value' -> UserWarning (performance advisory)
- SELECT * with JOIN -> UserWarning (partial expansion)

756 unit tests passing.

Principle: SDK should not be more restrictive than the server. The server allows cartesian queries (FROM a, b) -- they work and return results (capped at 5000). The SDK warns users but does not block, because:

1. Small-table cartesian products are safe (3 rows * 3 rows = 9)
2. Server has its own resource governance (5000-row cap, timeouts)
3. Blocking what the server allows creates user frustration

Final guardrail design:

- ValidationError: INSERT/UPDATE/DELETE (server blocks too, but SDK catches earlier with clearer message)
- UserWarning: FROM a, b (cartesian), LIKE '%value', SELECT * + JOIN

756 unit tests passing.

All checks in one place (_sql_guardrails) for easy future bypass.

BLOCKED (ValidationError -- saves network round-trip):

- INSERT/UPDATE/DELETE/DROP/etc. (write statements)
- CROSS JOIN, RIGHT JOIN, FULL OUTER JOIN (unsupported join types)
- UNION / UNION ALL
- HAVING
- CTE (WITH ... AS)
- Subqueries (IN (SELECT ...), EXISTS (SELECT ...))

WARNED (UserWarning -- query executes, advisory only):

- LIKE '%value' (leading-wildcard, full table scan)
- FROM a, b (implicit cartesian, server allows but risky)
- SELECT * with JOIN (partial expansion)

Principle: block what server blocks (save time), warn what server allows but is risky (respect user intent).

14 new tests, 770 total passing.

…* + JOIN behavior

- SELECT * with JOIN only expands first (FROM) table columns
- Joined table columns are NOT included in expansion
- Added sql_select()/sql_columns() as recommended alternative
- Updated guardrail summary to mention all blocked patterns

1. Remove unused imports (patch, PropertyMock) from test_select_star_expansion
2. Fix JOIN depth '6+' -> 'no depth limit' in query.py docstring
3. Fix stale 'auto-inject TOP' comment in _odata.py
4. Fix Learn-incompatible docstring types in tables.py list_columns
5. Fix Learn-incompatible docstring types in tables.py list_relationships
6. Fix list_table_relationships docstring: add ManyToOne
7. Fix cross-join regex to catch unaliased FROM account, contact
8. Fix write regex to catch comment-prefixed writes (/* */ and --)
9. Narrow odata_expands exception handling (KeyError/AttributeError/ValueError)
10. Consistent JOIN depth claims across all docs

770 unit tests passing.

saurabhrb · 2026-03-22T05:29:01Z

All 11 review comments addressed in commit a595a72:

Removed unused imports (patch, PropertyMock) from test_select_star_expansion.py
Fixed JOIN depth claim from '6+' to 'no depth limit' in query.py docstring
Fixed stale 'auto-inject TOP' comment in _odata.py
Fixed Learn-incompatible docstring types in tables.py list_columns() -- uses bracket generics now
Fixed Learn-incompatible docstring types in tables.py list_relationships()
Fixed list_table_relationships() docstring to include ManyToOne (matches implementation)
Broadened cross-join regex to catch both aliased and unaliased comma joins (FROM account, contact)
Improved write regex to catch comment-prefixed writes (/* */DELETE, --\nDELETE)
Narrowed odata_expands() exception handling to (KeyError, AttributeError, ValueError)
Consistent JOIN depth claims across all documentation
README comment about one-to-many+many-to-many was already corrected in earlier commits

770 unit tests passing.

src/PowerPlatform/Dataverse/data/_odata.py

…taverse API)

Dataverse Web API supports EntityDefinitions({id})/ManyToOneRelationships as a standard metadata endpoint. ManyToOne is the outgoing lookup perspective (child -> parent), complementing OneToMany (parent -> child).

- README: 'one-to-many + many-to-many' -> 'one-to-many + many-to-one + many-to-many'
- _relationships.py docstring updated to match
- copilot-instructions.md: added SKILL.md reference to governance list

770 unit tests passing.

…ributeOf test

Critical fixes:

- tables.py: Learn-incompatible ':class:`list` of :class:`str`' -> 'list[str]'
- _odata.py, _relationships.py, _error_codes.py: add __all__ = [] per SKILL.md

Important fixes:

- _odata.py: _delete_multiple param 'logical_name' -> 'table_schema_name'
- _odata.py: _lowercase_list docstring removed misleading \ mention
- _odata.py: _query_sql guardrails comment clarified (defense-in-depth)

New test:

- test_excludes_attribute_of_columns: verifies computed display-name columns (with AttributeOf set) are filtered from sql_columns()

771 unit tests passing.

…ng approach

CodeQL found exponential backtracking in _SQL_WRITE_RE which used nested quantifiers for SQL comment matching: (?:/\*.*?\*/\s*|...)*

Fix: separate comment stripping (_SQL_COMMENT_RE) from write detection.

1. Strip SQL comments with a safe non-backtracking regex
2. Check for write keywords with the simple anchored regex

The comment regex uses [^*]*\*+(?:[^/*][^*]*\*+)*/ which is the standard safe pattern for matching C-style block comments.

3 new tests for comment-prefixed write detection. 774 unit tests passing.
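
The two-step approach can be sketched as follows. The block-comment pattern is the one quoted in the commit message with its `/\*` opener restored; the `--` line-comment alternative, the exact keyword list, and the function name `is_write_statement` are assumptions made for illustration.

```python
import re

# Step 1 pattern: C-style block comments (unrolled form) or "--" line comments.
_SQL_COMMENT_RE = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|--[^\n]*")
# Step 2 pattern: a write keyword anchored at the start of the statement.
_SQL_WRITE_RE = re.compile(r"^\s*(?:INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def is_write_statement(sql: str) -> bool:
    """Strip comments first, then test the remaining text for a leading write keyword."""
    stripped = _SQL_COMMENT_RE.sub(" ", sql)
    return _SQL_WRITE_RE.match(stripped) is not None


print(is_write_statement("/* cleanup */DELETE FROM account"))  # True
print(is_write_statement("--\nDELETE FROM account"))           # True
print(is_write_statement("SELECT name FROM account"))          # False
```

Because the stripped text is used only for detection, the sketch does not need to handle comment markers inside string literals; a statement that begins with a write keyword is flagged either way.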

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

src/PowerPlatform/Dataverse/operations/query.py

examples/advanced/sql_examples.py

src/PowerPlatform/Dataverse/data/_odata.py

src/PowerPlatform/Dataverse/operations/tables.py

src/PowerPlatform/Dataverse/data/_odata.py

- sql_joins(): fix docstring example to not use aliases with join_clause (join_clause uses full table name; use sql_join() for aliased queries)
- sql_examples: replace _value JOIN pattern with sql_join() helper
- list_table_relationships: docstring now mentions ManyToOne
- remove unused VALIDATION_SQL_CROSS_JOIN_BLOCKED import
- PR description: cartesian updated from ValidationError to UserWarning

Copilot AI assigned Copilot and saurabhrb Mar 13, 2026

Copilot started work on behalf of saurabhrb March 13, 2026 16:12

Copilot AI changed the title ~~[WIP] Add public API for discovering entity/table column definitions~~ Add public schema discovery APIs: list_columns, list_relationships, list_table_relationships Mar 13, 2026

Copilot AI requested a review from saurabhrb March 13, 2026 16:22

Copilot finished work on behalf of saurabhrb March 13, 2026 16:22

Copilot started work on behalf of saurabhrb March 13, 2026 16:24

Copilot finished work on behalf of saurabhrb March 13, 2026 16:25

This was referenced Mar 13, 2026

Closed — see #144 #142

Closed

Closed — see #144 #143

Closed

Community Contribution Intake — Triage #1 (2026-03-13) #144

Open

saurabhrb unassigned Copilot Mar 20, 2026

saurabhrb marked this pull request as ready for review March 20, 2026 23:44

saurabhrb requested a review from a team as a code owner March 20, 2026 23:44

Copilot AI review requested due to automatic review settings March 20, 2026 23:44

Copilot started reviewing on behalf of saurabhrb March 20, 2026 23:45

Copilot AI reviewed Mar 20, 2026

Copilot AI and others added 2 commits March 20, 2026 16:54

Initial plan

5b08ea0

Add list_columns, list_relationships, list_table_relationships APIs w…

d61ffca

…ith tests and README examples Co-authored-by: saurabhrb <32964911+saurabhrb@users.noreply.github.com>

saurabhrb changed the title ~~Add public schema discovery APIs: list_columns, list_relationships, list_table_relationships~~ End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, and comprehensive examples Mar 21, 2026

saurabhrb force-pushed the copilot/add-public-api-for-metadata branch from e9a25aa to 7f06533 Compare March 21, 2026 05:30

Saurabh Badenkal added 6 commits March 20, 2026 22:35

saurabhrb changed the title ~~End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, and comprehensive examples~~ End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails (8-table JOINs validated) Mar 22, 2026

Saurabh Badenkal added 2 commits March 21, 2026 21:30

saurabhrb changed the title ~~End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails (8-table JOINs validated)~~ End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails (15-table JOINs validated) Mar 22, 2026

saurabhrb requested a review from Copilot March 22, 2026 04:37

Copilot started reviewing on behalf of saurabhrb March 22, 2026 04:38

Copilot AI reviewed Mar 22, 2026

Saurabh Badenkal added 5 commits March 21, 2026 21:49

saurabhrb changed the title ~~End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails (15-table JOINs validated)~~ End-to-end SQL support: schema discovery, SELECT * expansion, DataFrame bridge, helpers, guardrails Mar 22, 2026

github-advanced-security bot found potential problems Mar 22, 2026

src/PowerPlatform/Dataverse/data/_odata.py Fixed

Saurabh Badenkal added 2 commits March 21, 2026 22:31

saurabhrb requested a review from Copilot March 22, 2026 06:30

Copilot started reviewing on behalf of saurabhrb March 22, 2026 06:30

Copilot AI reviewed Mar 22, 2026

3 participants
