Add TFLOP pipeline walkthrough documentation #21

ankitaggnitt · 2025-06-18T12:52:06Z

This commit adds a comprehensive Markdown document that provides a step-by-step walkthrough of the TFLOP pipeline, from data preprocessing to model architecture and training/evaluation. It bridges the concepts from the research paper with their concrete implementations in the Python codebase, citing relevant files and code snippets.

Key areas covered:

- Data Preprocessing: Raw inputs, HTML-to-OTSL conversion, text region processing (bounding box normalization, layout embedding), pointer target creation, and final batch assembly.
- Model Architecture: Overall TFLOP class structure, Image Encoder (Swin), Layout Encoder components, Logical Structure Decoder (MBART), Layout Pointer mechanism (pointer head, dot-product similarity, pointer loss, empty cell handling), and Span-aware Contrastive Supervision (loss module, positive/negative set construction, span coefficients, final contrastive loss).
- Training/Evaluation: Main training script, key arguments, optimizer/scheduler, inference process (autoregressive generation, pointer usage, HTML construction), and TEDS evaluation details (use of the `apted` library).
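
For reviewers unfamiliar with the Layout Pointer item above, here is a minimal sketch of the dot-product similarity and pointer loss idea. It is an illustration only, not code from this repository: the function names (`pointer_probs`, `pointer_loss`), the `temperature` parameter, and the toy feature vectors are all assumptions. It also leaves out the learned projection layers and empty cell handling that the walkthrough describes.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pointer_probs(box_feats, cell_feats, temperature=1.0):
    """Hypothetical helper: for each text box, a distribution over cell tokens.

    Similarity is a plain dot product between a box feature and each
    cell-token feature, scaled by an assumed temperature.
    """
    return [
        softmax([dot(b, c) / temperature for c in cell_feats])
        for b in box_feats
    ]

def pointer_loss(probs, targets):
    """Hypothetical helper: mean negative log-likelihood of the target cell."""
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

# Toy data (made up for illustration): 2 text boxes, 3 candidate cell tokens.
box_feats = [[1.0, 0.0], [0.0, 1.0]]
cell_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

probs = pointer_probs(box_feats, cell_feats)
# At inference, each box is assigned to its highest-probability cell token.
assigned = [max(range(len(p)), key=p.__getitem__) for p in probs]
```

In this toy setup each box is assigned to the cell token whose feature it aligns with most strongly, and the loss is lower when the targets match those assignments than when they are swapped.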

ankitaggnitt

Let me know if it's any good.

ankitaggnitt commented Jun 18, 2025
