feat(glossary): import glossary from CSV #15172
base: master
…ility functions, hooks, and UI components
… dropdown actions
🔴 Meticulous spotted visual differences in 167 of 1236 screens tested. Meticulous evaluated ~8 hours of user flows against your PR. Last updated for commit 10dc9ed. This comment will update as new commits are pushed.
…new' and 'updated' statuses for import count
Bundle Report
Changes will increase total bundle size by 102.48kB (0.36%) ⬆️. This is within the configured threshold ✅
Affected Assets, Files, and Routes: bundle datahub-react-web-esm
❌ 2 Tests Failed
…nsistency across glossary components
…use record name instead of fully qualified name
…s in comprehensive import hooks and utilities
…ng for glossaryTerm entities
…n glossary navigation tests
…sary CSV import functionality
…status in ImportProgressModal
…precated comments in glossary.utils.ts
…lity functions in import components
…structure in glossary-related files
… waits with dynamic checks and optimizing file upload handling
…ty checks and ensuring file input is enabled
…s with dynamic visibility checks for improved stability
…amic visibility checks for modal elements
Comprehensive Glossary Import Feature
Overview
This PR adds a glossary import feature that lets users bulk import glossary terms and nodes from CSV files through the UI. It compares incoming entities with existing ones, detects changes, orders entities hierarchically, and applies the result as an atomic batch operation through a single GraphQL call.
🎯 Key Features
Core Functionality
Supported Entity Types
- Glossary terms (glossaryTerm)
- Glossary nodes (glossaryNode)
Supported Metadata
📋 CSV Format
Required Columns
- entity_type: glossaryTerm or glossaryNode
- name
- description
- term_source: INTERNAL, EXTERNAL, etc.
- source_ref
- source_url
Optional Columns
- urn: e.g. urn:li:glossaryTerm:abc123
- ownership_users: user:ownershipType|user2:ownershipType2, e.g. admin:Technical Owner|jdoe:Business Owner
- ownership_groups: group:ownershipType|group2:ownershipType2, e.g. engineering:Technical Owner|product:Business Owner
- parent_nodes: e.g. Business Terms or Business Terms.Customer Data
- related_contains: e.g. Customer ID,Order ID
- related_inherits: e.g. Personal Data,Financial Data
- domain_urn: e.g. urn:li:domain:engineering
- domain_name: e.g. Engineering
- custom_properties: e.g. {"key1":"value1","key2":"value2"}
Example CSV
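A minimal illustrative file using the columns above might look like the string in the sketch below. The row values are made up, and `parseRows` and `parseOwners` are hypothetical helpers rather than functions from this PR. The splitting is deliberately naive and assumes no quoted commas, so fields such as related_contains or custom_properties would need a real CSV parser.

```typescript
// Illustrative rows only: names, owners, and descriptions are invented.
const exampleCsv = [
  "entity_type,name,description,term_source,source_ref,source_url,parent_nodes,ownership_users",
  "glossaryNode,Business Terms,Top-level business vocabulary,,,,,admin:Technical Owner",
  "glossaryTerm,Customer ID,Unique customer identifier,INTERNAL,,,Business Terms,admin:Technical Owner|jdoe:Business Owner",
].join("\n");

interface Owner {
  name: string;
  ownershipType: string;
}

// Naive row parser (assumption: no quoted fields, no embedded commas).
function parseRows(csv: string): Record<string, string>[] {
  const [headerLine, ...lines] = csv.split("\n");
  const headers = headerLine.split(",");
  return lines
    .filter((line) => line.trim().length > 0)
    .map((line) => {
      const cells = line.split(",");
      const row: Record<string, string> = {};
      headers.forEach((header, i) => {
        row[header] = cells[i] ?? "";
      });
      return row;
    });
}

// Parse "user:ownershipType|user2:ownershipType2" into structured owners.
function parseOwners(cell: string): Owner[] {
  return cell
    .split("|")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const idx = part.indexOf(":");
      return idx === -1
        ? { name: part, ownershipType: "" }
        : { name: part.slice(0, idx), ownershipType: part.slice(idx + 1) };
    });
}
```

Calling `parseOwners` on the ownership_users cell of the second data row yields two owners, `admin` and `jdoe`, each with its ownership type.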
🚀 How to Use
1. Access the Import Page
/glossary/import
2. Upload CSV File
.csv
3. Review Entities
After upload, the system will:
4. Review Changes (Optional)
5. Filter and Search
6. Start Import
7. Review Results
After import completes:
🔧 Technical Implementation
Architecture
The feature is implemented using a modular architecture with clear separation of concerns:
Backend Changes
New GraphQL Mutation: patchEntities
A new batch mutation endpoint that processes multiple patch operations atomically:
Key Features:
Implementation Files:
- datahub-graphql-core/src/main/resources/patch.graphql - GraphQL schema
- datahub-graphql-core/src/main/java/com/linkedin/datahub/graphql/resolvers/mutate/PatchEntitiesResolver.java - Resolver implementation
- datahub-graphql-core/src/main/java/com/linkedin/datahub/graphql/resolvers/mutate/util/PatchResolverUtils.java - Utility functions
Import Process Flow
CSV Parsing
Entity Normalization
Comparison
Import Planning
Execution
- patchEntities mutation
- (addRelatedTerms mutation)
Results
Key Algorithms
Hierarchical Ordering
Entities are sorted by hierarchy level to ensure parents are created before children:
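A minimal sketch of such an ordering, assuming the hierarchy level can be read from the dot-separated parent_nodes path (as in `Business Terms.Customer Data`). The `ImportEntity` shape and both function names are hypothetical and not taken from the PR's code, and a node name that itself contains a dot would break this depth calculation.

```typescript
interface ImportEntity {
  name: string;
  // Dot-separated ancestor path, "" for a root entity (assumed format).
  parentNodes: string;
}

// Level 0 = root; each ancestor in the path adds one level.
function hierarchyLevel(entity: ImportEntity): number {
  const path = entity.parentNodes.trim();
  return path === "" ? 0 : path.split(".").length;
}

// Returns a new array with shallower entities first. Array.prototype.sort
// is stable (ES2019), so entities at the same level keep their CSV order.
function sortByHierarchy<T extends ImportEntity>(entities: T[]): T[] {
  return [...entities].sort((a, b) => hierarchyLevel(a) - hierarchyLevel(b));
}
```

Sorting by depth alone is enough to guarantee that a parent precedes its children, because a child's path is always one segment longer than its parent's.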
URN Pre-generation
URNs are pre-generated for new entities to enable:
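As one illustration, a child entity can reference its parent's URN before either exists on the server, which is assumed here to be part of what makes a single batch call possible. The sketch below follows the `urn:li:glossaryTerm:abc123` shape from the CSV format section. `preGenerateUrn`, `planUrns`, and the injected `makeId` generator are hypothetical names, and the real implementation may generate IDs differently.

```typescript
type EntityType = "glossaryTerm" | "glossaryNode";

interface NewEntity {
  entityType: EntityType;
  name: string;
  parentName?: string; // name of the parent node in the same import, if any
}

interface PlannedEntity extends NewEntity {
  urn: string;
  parentUrn?: string;
}

// Build a URN of the form urn:li:<entityType>:<id>.
function preGenerateUrn(entityType: EntityType, makeId: () => string): string {
  return `urn:li:${entityType}:${makeId()}`;
}

// First pass assigns every new entity a URN; second pass resolves parent
// names to those URNs so no server round trip is needed in between.
function planUrns(entities: NewEntity[], makeId: () => string): PlannedEntity[] {
  const urnByName = new Map<string, string>();
  for (const entity of entities) {
    urnByName.set(entity.name, preGenerateUrn(entity.entityType, makeId));
  }
  return entities.map((entity) => ({
    ...entity,
    urn: urnByName.get(entity.name) as string,
    parentUrn:
      entity.parentName === undefined ? undefined : urnByName.get(entity.parentName),
  }));
}
```

In practice `makeId` could be a UUID generator. It is injected here so the example stays deterministic and testable.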
Ownership Type Management
Change Detection
Intelligent comparison algorithm that:
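A minimal sketch of field-level change detection, assuming a CSV row is compared with the matching existing entity after whitespace normalization. The 'new' and 'updated' statuses appear in this PR's commit titles; the 'unchanged' status, the compared field list, and all type and function names below are assumptions for illustration.

```typescript
type ChangeStatus = "new" | "updated" | "unchanged";

interface GlossaryRecord {
  name: string;
  description?: string;
  termSource?: string;
}

interface ChangeResult {
  status: ChangeStatus;
  changedFields: string[];
}

// Fields compared in this sketch; the real feature covers more metadata.
const COMPARED_FIELDS = ["description", "termSource"] as const;

// Treat missing values and surrounding whitespace as insignificant.
function normalize(value: string | undefined): string {
  return (value ?? "").trim();
}

function detectChange(incoming: GlossaryRecord, existing?: GlossaryRecord): ChangeResult {
  if (existing === undefined) {
    return { status: "new", changedFields: [] };
  }
  const changedFields: string[] = COMPARED_FIELDS.filter(
    (field) => normalize(incoming[field]) !== normalize(existing[field]),
  );
  return {
    status: changedFields.length > 0 ? "updated" : "unchanged",
    changedFields,
  };
}
```

Entities classified as unchanged can then be skipped when building the batch, so only real differences are sent.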
🧪 Testing
Test Coverage
Comprehensive test suite covering:
🔒 Security & Permissions
📝 Migration Notes
For Existing Users
🎨 UI/UX Features
📊 Performance Characteristics