PHOENIX-7267 CsvBulkLoadTool fails job due to a bad record with "(sta…#2399

Open
xavifeds8 wants to merge 1 commit into apache:master from xavifeds8:PHOENIX-7267

@xavifeds8
Contributor

…rtline 1) EOF reached before encapsulated token finished"

@xavifeds8 xavifeds8 force-pushed the PHOENIX-7267 branch 2 times, most recently from 2404a9d to 41a8c42 Compare April 27, 2026 10:43
@xavifeds8
Contributor Author

With commons-csv 1.0, CsvBulkLoadTool failed the entire MapReduce job when it encountered a malformed CSV record.
After the upgrade of commons-csv to 1.14.1, the behavior is:
With --ignore-errors: bad records are skipped, good records are loaded, and errors are counted in MR counters.
Without --ignore-errors: the job fails gracefully with a clear error message instead of crashing.

Sanity test for the upgrade: https://gist.github.com/xavifeds8/bd6015a1733ddbf630cbbdb453bdbc0d

@xavifeds8
Contributor Author

xavifeds8 commented Apr 27, 2026

Changes made:

  1. Upgraded commons-csv from 1.0 to 1.14.1
  2. Migrated deprecated CSVFormat.withXxx() calls to CSVFormat.Builder API
  3. Migrated deprecated new CSVParser(reader, format) to CSVParser.builder().setFormat(format).setReader(reader).get()
  4. Caught UncheckedIOException (thrown by commons-csv 1.14.1 during iteration) in UpsertExecutor and CsvToKeyValueMapper, so parse errors are now routed through the normal error-handling path
  5. Updated Pherf's CSVFileResultHandler and GoogleChartGenerator for the same API migration
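
The error routing in item 4 can be sketched roughly as follows. This is a minimal illustration that uses only the JDK, not the code in this PR: `BadRecordRoutingSketch`, `LoadResult`, `parseLine`, and `loadRecords` are hypothetical names, and `parseLine` is a stand-in that throws `UncheckedIOException` for a line with an unclosed quote, on the assumption that this mirrors how the commons-csv iterator surfaces parse failures. The `ignoreErrors` flag plays the role of `--ignore-errors`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class BadRecordRoutingSketch {

    /** Hypothetical result holder: loaded rows plus a count of skipped bad records. */
    static final class LoadResult {
        final List<String[]> rows = new ArrayList<>();
        long badRecords = 0;
    }

    // Stand-in for a CSV parser call (NOT commons-csv): an odd number of quote
    // characters means a quoted field was never closed, so fail the way an
    // iterator-based parser would, with an UncheckedIOException.
    static String[] parseLine(String line) {
        long quotes = line.chars().filter(c -> c == '"').count();
        if (quotes % 2 != 0) {
            throw new UncheckedIOException(
                new IOException("EOF reached before encapsulated token finished"));
        }
        return line.split(",", -1); // naive split; fine for this illustration only
    }

    static LoadResult loadRecords(List<String> lines, boolean ignoreErrors) {
        LoadResult result = new LoadResult();
        for (String line : lines) {
            try {
                result.rows.add(parseLine(line));
            } catch (UncheckedIOException e) {
                if (!ignoreErrors) {
                    // Fail with a clear message that keeps the parser's cause.
                    throw new IllegalStateException("Bad CSV record: " + line, e);
                }
                result.badRecords++; // skip the record, keep loading the rest
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("1,alice", "2,\"bob", "3,carol");
        LoadResult r = loadRecords(lines, true);
        System.out.println(r.rows.size() + " loaded, " + r.badRecords + " skipped");
    }
}
```

With `ignoreErrors` set to `false`, the same input stops at the second line with an `IllegalStateException` whose cause is the original `UncheckedIOException`, which is the "fail with a clear error" path rather than an unhandled crash.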

Tests:

  1. Updated existing CSVCommonsLoaderIT tests for the new API
  2. Fixed testCSVCommonsUpsertBadEncapsulatedControlChars assertion to match the new exception wrapping
  3. Added testCSVCommonsUpsertEOFInEncapsulatedToken, which directly tests the reported scenario (unclosed quote at EOF)
