Entity Extraction

Vishwajeetsingh Desurkar edited this page Aug 23, 2025 · 1 revision

  • Input: Translated text, combined from the parsing and OCR stages.
  • Approach:
    • Focused on Maharashtra SRO Index II documents (standardized format).
    • Implemented using a spaCy NLP pipeline, with regex-based pattern matching for structured fields.
  • Extracted Entities:
    • Property Details: State, District, Tehsil, Village/Area, Survey No, Plot No, Building/Flat Name.
    • Parties: Buyer(s) & Seller(s) – Name, Address, PAN.
    • Additional Info: SRO Office, Document Number.
  • Output: Structured JSON containing all extracted entities, ready for crawling and ownership search.
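
The regex-based matching of structured fields could look roughly like the sketch below. It uses only Python's standard `re` module and leaves out the spaCy pipeline. The pattern names, the `extract_fields` helper, and the label wording ("Document No.:", "PIN Code:") are assumptions for illustration, based on the labels visible in the sample output; they are not the project's actual rules.

```python
import json
import re

# Hypothetical patterns: the label wording is assumed from the sample output.
DOCUMENT_NO = re.compile(r"Document No\.?\s*:\s*(\d+/\d{4})")
# Indian PAN format: five letters, four digits, one letter.
PAN = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
PIN_CODE = re.compile(r"PIN Code\s*:\s*(\d{6})")


def extract_fields(text: str) -> dict:
    """Return the structured fields found in `text` (illustrative helper)."""
    doc_no = DOCUMENT_NO.search(text)
    return {
        "document_no": doc_no.group(1) if doc_no else None,
        "pans": PAN.findall(text),
        "pin_codes": PIN_CODE.findall(text),
    }


# Sample text assembled from fragments of the sample output.
sample = (
    "Haveli 3 23-08-2025 Document No.: 14929/2025 "
    "Name: Saraswat Co-op Bank Ltd. PIN Code: 400025, PAN No.: ASSPB3858R"
)
result = extract_fields(sample)
print(json.dumps(result, indent=2))
```

Matching on the fixed-format value (the PAN shape, a six-digit PIN) rather than on surrounding free text keeps these patterns from depending on how names and addresses happen to be worded.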

JSON Output (Recommended)

{
  "success": true,
  "error": "",
  "data": {
    "property": {
      "state": "Maharashtra",
      "district": "Pune",
      "tehsil": "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated Through eSearch Module. For the original report",
      "village": "Aundh",
      "survey_no": "128",
      "plot_no": "57"
    },
    "parties": {
      "sellers": [
        {
          "name": "Sandhya Sachin Hande",
          "address": "Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: M.G. Road, Naupada, Thane, Maharashtra, Thane. PIN Code: 400602, PAN No.: - 2) Name: Sachin Laxman Hande, Age: 50, Address: Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: M.G. Road, Naupada, Thane, Maharashtra, Thane. PIN Code: 400602,",
          "pan": "-"
        }
      ],
      "buyers": [
        {
          "name": "Saraswat Co-op Bank Ltd.",
          "address": "Plot No. 0, Floor No. 0, Building Name: -, Block No. -, Road No.: Eknath Thakur Bhavan, 953, Appasaheb Marathe Marg, Prabhadevi, Mumbai, Maharashtra, Mumbai. PIN Code: 400025,",
          "pan": "ASSPB3858R"
        }
      ]
    },
    "sro_office": "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated Through eSearch Module. For the original report",
    "document_no": "14929/2025"
  }
}
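
In the sample above, the `tehsil` and `sro_office` values appear to run on into the date, document number, and footer note. One possible post-processing step, sketched here under assumptions, is to cut a captured value at the first date or known label. The `STOP` pattern and the `trim_overcapture` helper are hypothetical and not part of the project.

```python
import re

# Assumed stop markers: a dd-mm-yyyy date, or a "Document No." / "Note:" label.
STOP = re.compile(r"\s+(?:\d{2}-\d{2}-\d{4}|Document No\.?\s*:|Note\s*:)")


def trim_overcapture(value: str) -> str:
    """Cut `value` at the first stop marker, if one is present."""
    match = STOP.search(value)
    return value[: match.start()].strip() if match else value.strip()


raw = (
    "Haveli 3 23-08-2025 Document No.: 14929/2025 Note: Generated "
    "Through eSearch Module. For the original report"
)
cleaned = trim_overcapture(raw)
print(cleaned)
```

Values without a stop marker, such as `"Aundh"`, pass through unchanged apart from whitespace stripping.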