0.7 alpha 2#258

Open
Chenglong-MS wants to merge 207 commits into main from dev

@Chenglong-MS
Collaborator

@Chenglong-MS Chenglong-MS commented Mar 18, 2026

PR Summary

Agents & AI Pipeline

  • Unified data agents: Consolidated agent_py_data_rec, agent_sql_data_rec, agent_py_data_transform, agent_sql_data_transform, agent_concept_derive, agent_py_concept_derive, agent_data_clean, and agent_exploration into three unified agents: data_agent.py, agent_data_rec.py, and agent_data_transform.py
  • Semantic type system: New semantic_types.py backend module and full frontend type registry (src/lib/agents-chart/core/type-registry.ts, field-semantics.ts, semantic-types.ts) with domain shape inference, tick constraints, zero-baseline classification, and snap-to-bound heuristics
  • Chart insight agent: New agent_chart_insight.py for AI-generated chart takeaways
  • Language agent: New agent_language.py for i18n-aware prompts
  • Diagnostics agent: New agent_diagnostics.py with unified diagnostic information builder for better error reporting
  • Improved agent robustness: Better handling of missing output blocks, output variable detection, multimodal fallback for text-only models

Visualization

  • Agents-chart library: Complete new chart rendering library (src/lib/agents-chart/, 120 files, ~44K lines) with multi-backend support for Vega-Lite, ECharts, Chart.js, and GoFish — includes template system, semantic-aware axis/domain/tick handling, color decisions, layout computation, faceting, and overflow filtering
  • Chart gallery: New ChartGallery.tsx with expanded chart type support including pie, US map, world map, bump, candlestick, density, lollipop, pyramid, radar, rose, streamgraph, strip plot, waterfall, and more
  • Chart render service: New ChartRenderService.tsx replacing static SVG rendering with vega-embed for interactive charts
  • Insight panel redesign: Insight takeaways now display as styled cards (matching concept explanation style) with 2-column grid layout instead of bullet lists
  • Chart recommendations: New SimpleChartRecBox.tsx and chartRecommendation.ts for improved chart suggestion workflow
  • Score tick fix: Score type with small domain spans (e.g., [0,1]) no longer forces integer-only ticks, preserving intermediate decimal ticks

Data Thread & Workflow

  • Hybrid thread redesign: Unified data thread with reports integrated into threads (DataThread.tsx rewrite, new DataThreadCards.tsx, InteractionEntryCard.tsx)
  • Unified formulate data hook: New useFormulateData.ts consolidating data derivation logic
  • Report editor: New Tiptap-based report editor (TiptapReportEditor.tsx) with richer editing support

Data Loading & Management

  • Unified upload dialog: New UnifiedDataUploadDialog.tsx replacing the old table selection view — supports file upload, URL, paste, database, and sample datasets in a single dialog with loading state indicators
  • Multi-table preview: New MultiTablePreview.tsx for previewing multiple tables before loading
  • Unified table loading thunk: New tableThunks.ts handling all data source types with server-side workspace storage
  • Live data & refresh: New useDataRefresh.tsx with auto-refresh, stream data sources, and RefreshDataDialog.tsx
  • Virtual table sorting: Server-side sorting now returns original row IDs (#rowId) via ROW_NUMBER() in DuckDB and pandas paths, preserving original row positions after sort
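
The row-ID behavior described under "Virtual table sorting" can be illustrated with a minimal sketch. It uses plain Python lists rather than DuckDB or pandas, and the helper name `sort_with_row_ids` and the 1-based numbering are assumptions for illustration, not the PR's actual code. Each row is numbered before sorting (the role `ROW_NUMBER()` plays in SQL), so the original position survives the sort.

```python
from typing import Any

ROW_ID = "#rowId"  # column name taken from the PR summary


def sort_with_row_ids(rows: list[dict[str, Any]], key: str,
                      descending: bool = False) -> list[dict[str, Any]]:
    """Return rows sorted by `key`, each tagged with its original position."""
    # Number the rows before sorting so the original position is preserved.
    # 1-based numbering mirrors SQL ROW_NUMBER(); the PR's base is not stated.
    numbered = [{**row, ROW_ID: i} for i, row in enumerate(rows, start=1)]
    return sorted(numbered, key=lambda r: r[key], reverse=descending)


rows = [
    {"city": "Oslo", "temp": 4},
    {"city": "Cairo", "temp": 31},
    {"city": "Lima", "temp": 19},
]
sorted_rows = sort_with_row_ids(rows, "temp", descending=True)
print([(r["city"], r[ROW_ID]) for r in sorted_rows])
```

Because the ID is attached before the sort, a client can map any sorted row back to its position in the unsorted table.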

Data Loaders (Database Plugins)

  • New data loaders: Added Athena, BigQuery, and MongoDB data loaders
  • Enhanced existing loaders: Improved MySQL, PostgreSQL, MSSQL, S3, Azure Blob, and Kusto loaders with better error handling, connection cleanup, and password sanitization

Datalake / Workspace Backend

  • New workspace system: Complete datalake/ package with workspace.py, azure_blob_workspace.py, cached_azure_blob_workspace.py, file_manager.py, metadata.py, cache_manager.py, parquet_utils.py, and table_names.py
  • Workspace factory: New workspace_factory.py for configuration-driven workspace initialization
  • Session management: New session_routes.py for session-level API endpoints
  • Unicode & encoding: Support for Unicode filenames, path traversal checks, safe filename processing, UTF-8/GBK encoding detection
  • Atomic metadata updates: Prevent lost updates in concurrent scenarios
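
One common way to avoid lost updates, shown here as a sketch and not necessarily what metadata.py does, is to hold a lock across the whole read-modify-write cycle and swap the new file in with `os.replace`. The function name `update_metadata` and the in-process `threading.Lock` are assumptions; a multi-process deployment would need a file lock or a storage-level conditional write instead.

```python
import json
import os
import tempfile
import threading
from typing import Callable

_metadata_lock = threading.Lock()  # hypothetical lock; guards one process only


def update_metadata(path: str, mutate: Callable[[dict], None]) -> dict:
    """Apply `mutate` to the JSON metadata at `path` as one read-modify-write."""
    with _metadata_lock:
        try:
            with open(path, "r", encoding="utf-8") as f:
                data = json.load(f)
        except FileNotFoundError:
            data = {}
        mutate(data)
        # Write to a temp file in the same directory, then swap it in atomically
        # so readers never observe a half-written file.
        directory = os.path.dirname(path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)
        except BaseException:
            os.unlink(tmp_path)
            raise
        return data
```

Without the lock, two writers could both read the same old state and the second write would silently discard the first writer's change.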

Security

  • Code signing: New code_signing.py for generated code integrity verification
  • Auth module: New auth.py for authentication handling
  • URL allowlist: New url_allowlist.py for URL validation
  • Error sanitization: New sanitize.py to prevent leaking sensitive info in error messages
  • Sandbox system: New sandbox/ package with local_sandbox.py, docker_sandbox.py, not_a_sandbox.py, and Dockerfile.sandbox replacing the old py_sandbox.py
  • Identity management: New identity.ts with browser-based identity for multi-user support
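
As an illustration of the URL allowlist idea, the following sketch checks scheme and host with the standard library. The function name `is_url_allowed`, the HTTPS-only rule, and the example hosts are all hypothetical; the actual rules in url_allowlist.py are not shown in this PR summary.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}  # assumption: only HTTPS is accepted
ALLOWED_HOSTS = {"example.com", "data.example.org"}  # hypothetical hosts


def is_url_allowed(url: str, allowed_hosts: set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the URL's scheme and host are on the allowlist."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    # Accept an exact host match or a true subdomain; a bare suffix match
    # would wrongly accept hosts such as "evil-example.com".
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

Matching on the parsed hostname rather than on the raw string avoids tricks such as `https://example.com@evil.net/`, where the allowed name appears only in the userinfo part.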

Internationalization (i18n)

  • Full i18n framework: Added react-i18next with English and Chinese locale files across 7 namespaces (common, chart, encoding, messages, model, navigation, upload)
  • Translation guide: Comprehensive TRANSLATION_GUIDE.md for contributors

UI & Design System

  • Design tokens: New tokens.ts with centralized color, spacing, shadow, transition, and radius tokens
  • Canvas redesign: Refactored DataFormulator.tsx and App.tsx with TopNavButton, AppShell navigation, and model management UI
  • Encoding shelf updates: Reworked EncodingShelfCard.tsx and EncodingShelfThread.tsx
  • Removed legacy components: Deleted ConceptCard.tsx, ConceptShelf.tsx, DerivedDataDialog.tsx

Model Management

  • Server-side global models: New model_registry.py for managing model configurations server-side
  • Model selection dialog: Enhanced ModelSelectionDialog.tsx with multi-model support

Infrastructure & DevOps

  • Docker support: New Dockerfile, docker-compose.yml, docker-compose.test.yml with volume permissions and sandbox user handling
  • Updated dev container: Refreshed .devcontainer/devcontainer.json
  • Dependency management: Migrated from npm to yarn, added uv.lock, updated pyproject.toml and requirements.txt

Testing

  • Comprehensive test suite: 69 new test files (~8K lines) covering backend unit, integration, contract, security, plugin, and frontend unit tests
  • Test infrastructure: New vitest.config.ts, pytest.ini, conftest.py, frontend setup, and test_plan.md
  • Database plugin tests: Docker-based test harnesses for MySQL, PostgreSQL, MongoDB, and BigQuery

IAMkecheng and others added 30 commits February 27, 2026 21:35
Bumps [immutable](https://github.com/immutable-js/immutable-js) from 5.1.4 to 5.1.5.
- [Release notes](https://github.com/immutable-js/immutable-js/releases)
- [Changelog](https://github.com/immutable-js/immutable-js/blob/main/CHANGELOG.md)
- [Commits](immutable-js/immutable-js@v5.1.4...v5.1.5)

---
updated-dependencies:
- dependency-name: immutable
  dependency-version: 5.1.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
add src/lib/agents-chart/core/color-decisions.ts, updated corresponding ECharts code
…ble-5.1.5

Bump immutable from 5.1.4 to 5.1.5
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.5.4 to 6.5.5.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](tornadoweb/tornado@v6.5.4...v6.5.5)

---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [pyjwt](https://github.com/jpadilla/pyjwt) from 2.11.0 to 2.12.0.
- [Release notes](https://github.com/jpadilla/pyjwt/releases)
- [Changelog](https://github.com/jpadilla/pyjwt/blob/master/CHANGELOG.rst)
- [Commits](jpadilla/pyjwt@2.11.0...2.12.0)

---
updated-dependencies:
- dependency-name: pyjwt
  dependency-version: 2.12.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
fix color setting of echarts and chart.js
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Remove user: "0:0" override in docker-compose.yml — the Dockerfile
already creates /home/appuser/.data_formulator and chowns it to appuser
before switching to USER appuser, so the override was causing the app to
run as root and write to /root/.data_formulator, bypassing the mounted
volume entirely.

Pass --user with host uid:gid to docker run in DockerSandbox so the
sandbox container UID matches the host user that created the bind-mounted
output directory. Without this, the non-root sandbox user cannot write
the output parquet file, silently breaking all Docker sandbox executions.
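
The `--user` fix described in this commit message can be sketched as follows. The helper `build_sandbox_command`, the image name, and the container path are hypothetical; only the `docker run` flags `--rm`, `--user`, and `-v` are standard Docker CLI options. The sketch assumes a POSIX host, where `os.getuid()` and `os.getgid()` are available.

```python
import os


def build_sandbox_command(image: str, host_output_dir: str,
                          container_output_dir: str = "/output") -> list[str]:
    """Build a `docker run` argv that runs as the invoking host user."""
    # Match the container UID:GID to the host user that owns the bind mount,
    # so the non-root sandbox process can write its output file there.
    uid_gid = f"{os.getuid()}:{os.getgid()}"
    return [
        "docker", "run", "--rm",
        "--user", uid_gid,
        "-v", f"{host_output_dir}:{container_output_dir}",
        image,
    ]


print(build_sandbox_command("sandbox:latest", "/tmp/sandbox-out"))
```

Passing the command as a list rather than a shell string also avoids quoting problems if it is later handed to `subprocess.run`.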
Bumps [pyasn1](https://github.com/pyasn1/pyasn1) from 0.6.2 to 0.6.3.
- [Release notes](https://github.com/pyasn1/pyasn1/releases)
- [Changelog](https://github.com/pyasn1/pyasn1/blob/main/CHANGES.rst)
- [Commits](pyasn1/pyasn1@v0.6.2...v0.6.3)

---
updated-dependencies:
- dependency-name: pyasn1
  dependency-version: 0.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
zhb-y-agent and others added 8 commits April 11, 2026 04:46
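When the frontend detects that an anonymous user is logging in for the first time, it automatically triggers the migration flow and offers data import options. The backend implements a safe data-copy mechanism that keeps the migration idempotent and never deletes the source data. Security constraints are also added to reject illegitimate migration requests.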
Implement workspace data migration from anonymous browser identity to authenticated user

- Add migration dialog component and internationalization text
- Extend workspace manager to support copying workspaces
- Add /sessions/migrate API endpoint
- Detect identity change after login and prompt for migration
…ndpoint

Add local logout handling logic to perform local cleanup and redirect when IdP does not provide end_session_endpoint
… authenticated user migration

Add local storage flag to track migration status, ensuring migration is only performed once per user
…ge management

- Change anonymous workspace migration from copy to move operation, with merge functionality added
- Add cleanup anonymous workspace API endpoint
- Persist identity type and browser ID in local storage
- Modify identity migration dialog logic to use the new cleanup API
- Fix local storage state inconsistency after migration
…and cleanup

fix(superset): Enhance SSO login popup handling and add documentation

refactor(workspace): Change migration operation to copy-then-delete pattern

docs: Add Superset SSO bridge configuration guide documentation

test: Add test cases for workspace locking scenarios
Add internationalization text and handling logic to support the IdP-initiated SSO login flow. When users are redirected directly from the SSO system, automatically re-initiate the standard SP flow and display a corresponding waiting message.
Contributor

@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

zhb-y-agent and others added 10 commits April 11, 2026 16:19
…ty configuration

Add silent refresh capability for expired tokens in OIDC authentication, along with enhanced Flask session security configuration
Add related test cases to verify token refresh and session configuration
Add offline_access to OIDC scopes to support refresh tokens
…a during anonymous user migration

test(IdentityMigrationDialog): Add unit tests for migration dialog

refactor(DataFormulator): Refactor workspace list display logic
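Add a detailed data source plugin development guide covering backend and frontend development conventions, directory structure conventions, authentication route design, CredentialVault integration, internationalization, and testing conventions, giving developers a complete reference for plugin development.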
Add functionality to export tables as CSV or TSV files, with support for custom delimiters (comma or tab). Handles data deduplication and internal column filtering, and returns appropriate HTTP response with file download headers.
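
The export behavior described above might look roughly like the sketch below. Several details are assumptions: internal columns are taken to be those starting with `#` (such as `#rowId`), "deduplication" is read as dropping exact duplicate rows, and the function name `export_table` is hypothetical. The response is modeled as a `(body, headers)` pair rather than a framework-specific object.

```python
import csv
import io
import re
from typing import Any

INTERNAL_PREFIX = "#"  # assumption: internal columns look like "#rowId"


def export_table(rows: list[dict[str, Any]], table_name: str,
                 delimiter: str = ",") -> tuple[str, dict[str, str]]:
    """Serialize rows as CSV or TSV and build file-download headers."""
    if delimiter not in (",", "\t"):
        raise ValueError("delimiter must be a comma or a tab")
    first = rows[0] if rows else {}
    columns = [c for c in first if not c.startswith(INTERNAL_PREFIX)]
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerow(columns)
    seen = set()
    for row in rows:
        values = tuple(row.get(c) for c in columns)
        if values in seen:  # drop exact duplicate rows (values must be hashable)
            continue
        seen.add(values)
        writer.writerow(values)
    is_tsv = delimiter == "\t"
    ext = "tsv" if is_tsv else "csv"
    mime = "text/tab-separated-values" if is_tsv else "text/csv"
    # Restrict the filename to safe characters to avoid header injection.
    safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", table_name)
    headers = {
        "Content-Type": f"{mime}; charset=utf-8",
        "Content-Disposition": f'attachment; filename="{safe_name}.{ext}"',
    }
    return buf.getvalue(), headers
```

The `csv` module handles quoting of values that contain the delimiter, which a hand-rolled `join` would get wrong.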
…bering login state

- Add credential vault feature with encrypted storage of user credentials
- Implement auto-login functionality for automatic login when users choose to remember credentials
- Add credential expiration detection mechanism to prompt users for re-entry when credentials become invalid
- Extend Superset plugin to support credential vault integration
- Add multi-language support including Chinese and English prompt messages
- Implement complete frontend-backend interaction flow including credential storage, retrieval, and deletion
- Add comprehensive unit test and integration test coverage
- Provide detailed development documentation explaining usage and security model
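Add the ConfinedDir path-safety primitive to unify safety checks for path joining
Fix a path traversal vulnerability in AzureBlobWorkspace
Fix an HTTP response header injection issue
Add related unit tests and security documentation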
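
To illustrate what a path-confinement primitive of this kind typically does, here is a minimal sketch. The class name `ConfinedDir` comes from the commit message, but its real interface is not shown in this PR, so the constructor and the `resolve` method below are assumptions.

```python
from pathlib import Path


class ConfinedDir:
    """Join untrusted relative paths onto a root, rejecting any escape."""

    def __init__(self, root: str) -> None:
        self.root = Path(root).resolve()

    def resolve(self, relative: str) -> Path:
        # Resolve ".." segments and symlinks first, then verify the result
        # is still the root itself or located somewhere beneath it.
        candidate = (self.root / relative).resolve()
        if candidate != self.root and self.root not in candidate.parents:
            raise ValueError(f"path escapes confined directory: {relative!r}")
        return candidate
```

Checking the resolved path rather than scanning the input for `..` also covers absolute inputs such as `/etc/passwd`, which would otherwise replace the root when joined.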
…onse handling

Implement security-related error message sanitization and replace original error response methods across multiple routes
Use unified safe_error_response function for error handling to ensure sensitive information is not leaked to clients
zhb-y-agent and others added 10 commits April 11, 2026 18:30
… messages

In dataset loading and authentication handling, provide more specific error messages for different types of exceptions. For dataset loading, distinguish between ValueError/TypeError and other exceptions; for authentication failures, provide different error prompts based on HTTP status codes and connection errors.
fix: Improve error handling logic to provide more user-friendly error…
- Modify safe_error_response function to prioritize caller-provided safe messages
- Add default message _GENERIC_4XX for 4xx errors
- Remove logic that generates client messages directly from exception objects
- Unify error handling across routes using predefined safe messages
- Refactor sanitize_db_error_message to use predefined pattern matching for safe errors
fix(security): Improve error message handling to enhance security
…rmation leakage

- Remove error handling that directly exposes exception information, use fixed safe messages instead
- Clean up unused error message handling functions
- Update related tests to verify secure message handling
fix(security): Unify error message handling to prevent sensitive info…
… related tests

Add pattern matching classification for LLM/external API errors, returning predefined safe user messages
Update test cases to verify error classification functionality
Modify error handling logic in agent_routes to use the new classification feature
feat(security): Add LLM error classification functionality and update…
docs: Add and update multiple documentation files and skills

7 participants