# Data Encoding Formats

| Name | Base | Concept | Schema | Pros | Cons | Use cases |
| --- | --- | --- | --- | --- | --- | --- |
| XML | Text | Extensible Markup Language | Optional | <ul><li>Structured</li><li>Self-descriptive</li><li>Widely supported</li><li>Human-readable</li></ul> | <ul><li>Verbose</li><li>Parsing complexity</li><li>Complex schema definitions</li><li>Data type ambiguity (cannot distinguish a number from a string)</li><li>No native support for binary strings</li></ul> | |
| JSON | Text | JavaScript Object Notation | Optional | <ul><li>Lightweight</li><li>Simple</li><li>Human-readable</li><li>Widely supported</li><li>Distinguishes numbers from strings</li><li>Good support for Unicode character strings</li><li>Built-in support in web browsers</li></ul> | <ul><li>Limited data types</li><li>No built-in schema</li><li>Verbose for large data</li><li>No native support for binary strings</li></ul> | |
| CSV | Text | Comma-Separated Values | None | <ul><li>Simple</li><li>Human-readable</li><li>Widely supported</li></ul> | <ul><li>No schema</li><li>Data type ambiguity (cannot distinguish a number from a string)</li><li>Ambiguous escaping (what if a value contains a comma or a newline character?)</li></ul> | |
| TSV | Text | Tab-Separated Values | None | <ul><li>Simple</li><li>Human-readable</li><li>Widely supported</li></ul> | <ul><li>No schema</li><li>Data type ambiguity (cannot distinguish a number from a string)</li><li>Ambiguous escaping (what if a value contains a tab or a newline character?)</li></ul> | |
| Protocol Buffers | Binary | | Required | <ul><li>Efficient</li><li>Strong typing</li><li>Backward-compatible schema changes</li></ul> | <ul><li>Not human-readable</li><li>Schema changes need careful consideration</li></ul> | |
| Thrift | Binary | | Required | <ul><li>Efficient</li><li>Strong typing</li><li>Backward-compatible schema changes</li></ul> | <ul><li>Not human-readable</li><li>Not widely adopted</li></ul> | |
| Avro | Binary | | Required | <ul><li>Efficient</li><li>Supports schema evolution</li><li>Built-in data compression</li></ul> | <ul><li>Not human-readable</li><li>Not widely adopted</li></ul> | |
| BSON | Binary | Binary JSON; provides a binary representation of JSON-like documents | | | | |
| CBOR | Binary | Concise Binary Object Representation | | | | |
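
A few of the text-format trade-offs listed here (CSV's data type ambiguity and delimiter escaping, JSON's number/string distinction, and JSON's lack of a binary string type) can be seen in a minimal Python sketch that uses only the standard library. The `record` fields are hypothetical, and Base64 is just one common workaround for carrying bytes in JSON, not something the format itself prescribes.

```python
import base64
import csv
import io
import json

# Hypothetical record: an integer, a numeric-looking string,
# a string containing a comma, and some raw bytes.
record = {"id": 42, "zip": "02134", "note": "a, b", "blob": b"\x00\xff"}

# CSV round trip: the writer quotes the field that contains a comma,
# but every value comes back as a string, so the type information is lost.
buf = io.StringIO()
csv.writer(buf).writerow([record["id"], record["zip"], record["note"]])
csv_text = buf.getvalue()
csv_row = next(csv.reader(io.StringIO(csv_text)))

# JSON round trip: numbers and strings stay distinct, but bytes must be
# encoded as text first (Base64 here, as an assumed convention).
json_doc = {
    "id": record["id"],
    "zip": record["zip"],
    "note": record["note"],
    "blob": base64.b64encode(record["blob"]).decode("ascii"),
}
json_text = json.dumps(json_doc)
decoded = json.loads(json_text)
blob = base64.b64decode(decoded["blob"])

print(csv_row)                          # all fields are str
print(type(decoded["id"]).__name__)     # int
print(type(decoded["zip"]).__name__)    # str
print(blob == record["blob"])           # bytes survive only via Base64
```

Passing the raw `bytes` value straight to `json.dumps` raises a `TypeError`, which is the practical meaning of "no native support for binary strings"; the Base64 step also inflates the payload by roughly a third.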