Entity Extraction
Vishwajeetsingh Desurkar edited this page Aug 23, 2025 · 1 revision
- Input: Translated and combined text from parsing + OCR.
- Approach:
  - Targets Maharashtra SRO Index II documents, which follow a standardized format.
  - Implemented as a spaCy NLP pipeline with regex-based pattern matching for structured fields.
- Extracted Entities:
  - Property Details: State, District, Tehsil, Village/Area, Survey No, Plot No, Building/Flat Name.
  - Parties: Buyer(s) & Seller(s) – Name, Address, PAN.
  - Additional Info: SRO Office, Document Number.
- Output: Structured JSON with all extracted entities, ready for crawling & ownership search.
{
  "success": true,
  "error": "",
  "data": {
    "property": {
      "state": "Maharashtra",
      "district": "Pune",
      "tehsil": "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated Through eSearch Module. For the original report",
      "village": "Aundh",
      "survey_no": "128",
      "plot_no": "57"
    },
    "parties": {
      "sellers": [
        {
          "name": "Sandhya Sachin Hande",
          "address": "Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: M.G. Road, Naupada, Thane, Maharashtra, Thane. PIN Code: 400602, PAN No.: - 2) Name: Sachin Laxman Hande, Age: 50, Address: Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: M.G. Road, Naupada, Thane, Maharashtra, Thane. PIN Code: 400602,",
          "pan": "-"
        }
      ],
      "buyers": [
        {
          "name": "Saraswat Co-op Bank Ltd.",
          "address": "Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: Eknath Thakur Bhavan, 953, Appasaheb Marathe Marg, Prabhadevi, Mumbai, Maharashtra, Mumbai. PIN Code: 400025,",
          "pan": "ASSPB3858R"
        }
      ]
    },
    "sro_office": "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated Through eSearch Module. For the original report",
    "document_no": "14929/2025"
  }
}
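
The regex-based matching for structured fields can be illustrated with a minimal sketch. It uses only Python's standard `re` module rather than spaCy, so it is not the pipeline's actual code. The patterns are assumptions inferred from the sample output above: a `Document No.:` label, a `dd-mm-yyyy` date stamp that follows the office name, and numbered `N) Name:` markers between parties. The helper names (`extract_document_no`, `trim_at_date`, `split_parties`) are hypothetical.

```python
import re

# PAN format: five letters, four digits, one letter (e.g. "ASSPB3858R").
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
# Assumed label, as seen in the sample output: "Document No.: 14929/2025".
DOC_NO_RE = re.compile(r"Document No\.?\s*:\s*(\d+/\d{4})")
# Assumed dd-mm-yyyy date stamp that follows the office name.
DATE_RE = re.compile(r"\b\d{2}-\d{2}-\d{4}\b")
# Assumed party separator: "1) Name:", "2) Name:", ...
PARTY_START_RE = re.compile(r"\b\d+\)\s*Name:\s*")


def extract_document_no(text):
    """Return the 'number/year' document number, or None if absent."""
    match = DOC_NO_RE.search(text)
    return match.group(1) if match else None


def trim_at_date(value):
    """Cut a field value at the first date stamp (assumed field boundary)."""
    match = DATE_RE.search(value)
    return (value[:match.start()] if match else value).strip()


def split_parties(block):
    """Split a buyer/seller block into one record per numbered party."""
    parties = []
    for chunk in PARTY_START_RE.split(block):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Assumes the name runs up to the first comma after "Name:".
        name, _, rest = chunk.partition(",")
        pan = PAN_RE.search(rest)
        parties.append({"name": name.strip(),
                        "pan": pan.group(0) if pan else "-"})
    return parties


raw = "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated Through eSearch Module."
print(extract_document_no(raw))  # 14929/2025
print(trim_at_date(raw))         # Haveli 3

sellers = ("1) Name: Sandhya Sachin Hande, Address: M.G. Road, Naupada, Thane, "
           "PAN No.: - 2) Name: Sachin Laxman Hande, Age: 50, "
           "Address: M.G. Road, Naupada, Thane. PIN Code: 400602,")
print(split_parties(sellers))
```

Bounding each pattern at the next known label or marker is one way to keep a field from absorbing the text that follows it, as happens with `tehsil`, `sro_office`, and the first seller's `address` in the sample output.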