Skip to content

Latest commit

 

History

History
90 lines (58 loc) · 2.16 KB

File metadata and controls

90 lines (58 loc) · 2.16 KB

Reading Bytes with Pandas

Sometimes, data is stored in binary formats or needs to be read from byte streams. Pandas provides several methods to handle such scenarios effectively. Below are some common ways to read bytes into a Pandas DataFrame.

In a simple use case, you might have a byte string representing CSV data:

import pandas as pd
import io

df = pd.read_csv(io.BytesIO(byte_data))

What io.BytesIO Is

  • It’s a file-like object that lives in memory instead of on disk.
  • It behaves like a file you can read from and write to, but the content is stored as bytes in RAM, not in an actual .txt or .csv file.
  • You can use it anywhere a function expects a “file object” (.read(), .write(), .seek(), etc.).

Why It’s Useful

  • Many libraries (like pandas, PIL, requests, zipfile) want a file as input/output.
  • If you don’t want to write to disk (slower, messy, permissions issues), you can hand them a BytesIO object instead.
  • When you’re done, you can grab the raw bytes using .getvalue().

Example 1: Pretend File in Memory

import io

# Create a BytesIO object
buffer = io.BytesIO()

# Write some bytes
buffer.write(b"hello world")

# Move cursor back to the start
buffer.seek(0)

# Read the content
print(buffer.read())  # b'hello world'

Example 2: With Pandas (no disk write)

import pandas as pd
import io

df = pd.DataFrame({"x": [10, 20], "y": [30, 40]})

# Write DataFrame as CSV into BytesIO
buf = io.BytesIO()
df.to_csv(buf, index=False)

# Get raw bytes
csv_bytes = buf.getvalue()

print(csv_bytes[:50])  # shows the first few bytes of the CSV

Example 3: Sending Bytes Over a Network

import requests

# Suppose you want to POST a CSV without saving to disk
buf = io.BytesIO()
df.to_csv(buf, index=False)
buf.seek(0)  # reset cursor so it reads from start

response = requests.post("http://example.com/upload", files={"file": buf})

Mental Model

Think of BytesIO as a USB stick in RAM:

  • You can open it, write data, move around, and read it back.
  • Nothing ever touches your hard drive unless you explicitly write the bytes out.