fix(csv): strip leading byte-order mark in CsvParseStream #7183

Open
LeSingh1 wants to merge 1 commit into denoland:main from LeSingh1:fix/csv-parse-stream-bom

@LeSingh1

The synchronous parse() function already strips a leading UTF-8
byte-order mark (U+FEFF) from its input, but CsvParseStream did not.

When a CSV file begins with a BOM -- common output from Excel and other
Windows tools -- the first field name arrives as "\uFEFFname" instead
of "name". That silently corrupts header-based lookups:

import { CsvParseStream } from "@std/csv/parse-stream";

// The first chunk starts with a BOM, written here as the escape \uFEFF.
const source = ReadableStream.from(["\uFEFFname,age\n", "Alice,34\n"]);
const records = await Array.fromAsync(
  source.pipeThrough(new CsvParseStream({ skipFirstRow: true })),
);
// before this fix:
// [{ "\uFEFFname": "Alice", age: "34" }]   -- BOM leaks into key
// after:
// [{ name: "Alice", age: "34" }]

The fix adds a #firstLine flag to StreamLineReader and strips the
BOM from the first line it reads, exactly matching what parse() does
via its BYTE_ORDER_MARK constant.
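
The first-line flag can be sketched roughly as follows. This is an illustration only, not the code in this PR: createLineReader is a hypothetical stand-in for StreamLineReader, it reads from an in-memory array rather than a stream, and a local firstLine variable plays the role of the #firstLine flag.

```typescript
// Illustrative sketch only; createLineReader is a hypothetical helper,
// not part of @std/csv.
const BYTE_ORDER_MARK = "\uFEFF";

function createLineReader(lines: string[]): () => string | null {
  const pending = lines.slice();
  // Plays the role of the #firstLine flag described above.
  let firstLine = true;

  // Returns the next line, or null once the input is exhausted.
  return function readLine(): string | null {
    const line = pending.shift();
    if (line === undefined) return null;
    if (firstLine) {
      firstLine = false;
      // Only the very first line is checked for a leading BOM.
      if (line.indexOf(BYTE_ORDER_MARK) === 0) {
        return line.slice(BYTE_ORDER_MARK.length);
      }
    }
    return line;
  };
}

const readLine = createLineReader(["\uFEFFname,age", "Alice,34"]);
const header = readLine(); // "name,age", with the BOM removed
const row = readLine(); // "Alice,34", unchanged
```

Because the flag is cleared after the first read, later lines are passed through untouched, so the check costs nothing beyond the first line.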

Two new tests cover the regression: one for plain string[][] output and
one for skipFirstRow: true (object output, where the BOM corrupts the
header key).

parse() already strips a leading BOM from its input string, but
CsvParseStream left it intact. When a UTF-8 CSV file starts with a BOM
(common output of tools like Excel), the first field name would arrive as
"\uFEFFname" instead of "name", corrupting headers and key lookups.

StreamLineReader now strips the BOM from the first line it reads,
matching the existing parse() behaviour exactly.
@github-actions github-actions Bot added the csv label Jun 12, 2026
@codecov

codecov Bot commented Jun 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.57%. Comparing base (cdf74a8) to head (9e34775).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7183      +/-   ##
==========================================
- Coverage   94.57%   94.57%   -0.01%     
==========================================
  Files         636      637       +1     
  Lines       52142    52159      +17     
  Branches     9401     9403       +2     
==========================================
+ Hits        49315    49328      +13     
- Misses       2249     2254       +5     
+ Partials      578      577       -1     
