NIFI-16000: Allow spaces in FileUtils.getSanitizedFilename #11315

Open
mcgilman wants to merge 1 commit into apache:main from mcgilman:NIFI-16000

Conversation

@mcgilman mcgilman commented Jun 5, 2026

Contributor

Remove the space character from the invalid-character set so spaces are preserved rather than replaced with an underscore. No other normalization is performed: leading, trailing, repeated, and interior spaces are kept exactly as supplied. This lets the asset-upload callers accept common valid filenames such as "driver (1).jar" that were previously rejected, while remaining backward compatible for names that contain no spaces. Add TestFileUtils covering the sanitization contract.

Summary

NIFI-16000

FileUtils.getSanitizedFilename(String) treats the space character (code point 32) as invalid and replaces it with an underscore. The space character is legal on every major file system (NTFS, ext4, APFS, etc.), so this is stricter than necessary.

This matters because of how the method is consumed. Both ConnectorResource and ParameterContextResource use it as a strict validation gate for the asset name supplied in the Filename request header — they sanitize the supplied name and reject the request if the sanitized value differs from the original:

final String sanitizedAssetName = FileUtils.getSanitizedFilename(assetName);
if (!assetName.equals(sanitizedAssetName)) {
    throw new IllegalArgumentException(FILENAME_HEADER + " header contains an invalid file name");
}

Because any name containing a space is rewritten during sanitization, the equality check fails and the upload is rejected. As a result, common valid filenames cannot be uploaded as assets — e.g. a file produced by browser/OS download de-duplication such as driver (1).jar is sanitized to driver_(1).jar and rejected with "... header contains an invalid file name."
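The rejection path can be reproduced in isolation. What follows is a minimal sketch, not NiFi's code: `GateSketch`, `sanitize`, and `isAccepted` are hypothetical names, and the reserved-character set is an assumption made for illustration. Only the treatment of code point 32 is taken from the description above.

```java
import java.util.Set;

// Hypothetical stand-in for the sanitize-then-compare gate described above.
final class GateSketch {

    // Assumed reserved characters; the real set in FileUtils may differ.
    private static final Set<Integer> RESERVED = Set.of(
            (int) '"', (int) '*', (int) '/', (int) ':', (int) '<',
            (int) '>', (int) '?', (int) '\\', (int) '|');

    // Replaces each invalid code point with an underscore.
    // spaceIsInvalid = true models the behavior before this change.
    static String sanitize(final String name, final boolean spaceIsInvalid) {
        if (name == null || name.isEmpty()) {
            return name;
        }
        final StringBuilder clean = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); ) {
            final int cp = name.codePointAt(i);
            final boolean invalid = cp < 32
                    || RESERVED.contains(cp)
                    || (spaceIsInvalid && cp == 32);
            if (invalid) {
                clean.append('_');
            } else {
                clean.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return clean.toString();
    }

    // Mirrors the callers' check: accept only names that sanitization leaves unchanged.
    static boolean isAccepted(final String name, final boolean spaceIsInvalid) {
        return name.equals(sanitize(name, spaceIsInvalid));
    }

    public static void main(final String[] args) {
        final String name = "driver (1).jar";
        System.out.println("before: " + sanitize(name, true) + " accepted=" + isAccepted(name, true));
        System.out.println("after:  " + sanitize(name, false) + " accepted=" + isAccepted(name, false));
    }
}
```

Under these assumptions, the same name fails the equality check when the space is treated as invalid and passes once it is not, while a name containing a reserved character such as `:` is still rejected in both cases.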

Changes

  • Removed the space character (32) from the invalid-character set so spaces are preserved rather than replaced.
  • Spaces are kept exactly as supplied (leading, trailing, repeated, and interior); no other normalization is performed. All other characters continue to be sanitized as before.
  • Added TestFileUtils covering the sanitization contract (null/empty, invalid-character replacement, spaces preserved, dots preserved).
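The contract in the list above can be illustrated with a small table of inputs and expected outputs. This is a sketch under stated assumptions rather than the contents of TestFileUtils: `ContractSketch`, its `sanitize` method, and the `ASSUMED_INVALID` characters are hypothetical, and plain Java checks stand in for whatever test framework the real test uses.

```java
import java.util.Objects;

// Hypothetical illustration of the sanitization contract after this change.
final class ContractSketch {

    // Assumed invalid punctuation; the space character is deliberately absent.
    private static final String ASSUMED_INVALID = "\"*/:<>?\\|";

    static String sanitize(final String name) {
        if (name == null || name.isEmpty()) {
            return name; // null and empty input are returned unchanged
        }
        final StringBuilder clean = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); ) {
            final int cp = name.codePointAt(i);
            final boolean invalid = cp < 32 || ASSUMED_INVALID.indexOf(cp) >= 0;
            if (invalid) {
                clean.append('_');
            } else {
                clean.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return clean.toString();
    }

    // Each row is {input, expected output}.
    static final String[][] CASES = {
        {null, null},                           // null passes through
        {"", ""},                               // empty passes through
        {"a/b:c.txt", "a_b_c.txt"},             // invalid characters replaced
        {"  two  spaces  ", "  two  spaces  "}, // all spaces kept as supplied
        {"archive.tar.gz", "archive.tar.gz"},   // dots preserved
    };

    static boolean allCasesHold() {
        for (final String[] c : CASES) {
            if (!Objects.equals(sanitize(c[0]), c[1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(final String[] args) {
        System.out.println(allCasesHold() ? "all cases hold" : "a case failed");
    }
}
```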

The change is backward compatible: any filename that contained no spaces is sanitized exactly as before. The only behavioral change is that the space character is now preserved instead of replaced, so filenames whose sole issue was a space are now accepted by the asset-upload callers instead of being rejected.

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as described in the issue tracking

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Remove the space character from the invalid-character set so spaces are
preserved rather than replaced with an underscore. No other normalization
is performed, so leading, trailing, repeated, and interior spaces are kept
exactly as supplied. This lets the asset-upload callers accept common
valid filenames such as "driver (1).jar" that were previously rejected,
while remaining backward compatible for names that contain no spaces. Add
TestFileUtils covering the sanitization contract.
