-
Notifications
You must be signed in to change notification settings - Fork 772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug47950 -- make stream/directory name lookup in OLE2 case insensitive #477
Conversation
@@ -297,11 +298,15 @@ public static String getWorkbookDirEntryName(DirectoryNode directory) { | |||
"It must be decrypted before use by XSSF, it cannot be used by HSSF"); | |||
} | |||
|
|||
// check for previous version of file format | |||
if (directory.hasEntry(OLD_WORKBOOK_DIR_ENTRY_NAME)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal if this is too hard but is there any chance that hasEntry
could remain case sensitive for backward compatibility reasons? And a new caseInsensitiveHasEntry
added?
If leaving hasEntry
case sensitive is too hard, could you could not using directory.getEntryNames().contains
and adding a caseSensitiveHasEntry
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Y, I don't like the backwards compatibility break that this proposed change brings about. If we don't do this, though, then we'll have to change hasEntry(name)
and getEntry(name)
to the case insensitive options throughout our codebase. I can do that if desired.
Yep, I'll add that now.
poi/src/main/java/org/apache/poi/hssf/usermodel/HSSFWorkbook.java
Outdated
Show resolved
Hide resolved
I look forward to running the regression tests after this change. 😄 Let me know what else you find. I haven't worked with POI in a while, and this is a major change in behavior. |
Results show no significant changes in ~400k msoffice files (not all ole2): https://corpora.tika.apache.org/base/reports/tika-2.8.1-msoffice-reports.tgz There is a major difference in handling "true"/"false" "1/0" in xlsx(?only?) between the versions of POI, but this change is unrelated. |
@tballison could you raise an issue for the xssf boolean issue with and xlsx attached that makes it easier to debug? With this PR, I'm still a little concerned about changing the behaviour of hasEntry/getEntry. I suppose that we could agree that the next release of POI has to be a v5.3.0 instead of v5.2.4 because of the behaviour change. |
I'm not sure the boolean issue is wrong, just different. When I opened a couple of the files, there was nothing in the cells. Will look deeper later. I'm also concerned about changing the behavior. We could go to 5.3.0, or I could have a major refactoring where I modify all the internal I'm willing to make that change, but it will be a large set of changes -- TBF, a bunch are in unit tests. Let me know what you think. The other thing I shouldn't have done was change the values of |
If possible and the IDE makes it easy, I think keeping getEntry/hasEntry case-sensitive and switching most code to a set of case-sensitive methods makes releasing easier. |
Should I add |
Seems best to expose them fully. |
Agree, but won't that break any implementations of DirectoryEntry? I can't imagine anyone is actually doing this, but, it still would break backward compatibility, no? |
I still think we should add it to the interface. The number of affected users will be low. |
…nsitive; restore InternalWorkbook.WORKBOOK_DIR_ENTRY_NAMES
PR just got huge...LOL... |
@pjfanning , wdyt? Ready to merge or further work required? Thanks, again! |
Sure - seems ok to me. You'll need to rebase your branch before submitting to svn because a few changes have gone into trunk. |
@tballison is this PR still something that you want to merge? It's a year since the last POI release and it would be good to get a release out in the next month or 2. |
Y, will update it today and merge. Thank you! |
Argh... Monday... |
# Conflicts: # poi-scratchpad/src/main/java/org/apache/poi/hpbf/model/HPBFPart.java # poi-scratchpad/src/main/java/org/apache/poi/hslf/record/CurrentUserAtom.java # poi-scratchpad/src/main/java/org/apache/poi/hslf/usermodel/HSLFSlideShowImpl.java # poi-scratchpad/src/main/java/org/apache/poi/hwpf/HWPFDocumentCore.java # poi-scratchpad/src/test/java/org/apache/poi/hsmf/parsers/TestPOIFSChunkParser.java # src/resources/ooxml-lite-report.xsb
I'm consistently getting two test failures now locally:
|
Sorry, my bad, different version of Java have different XML Parsers, should be fixed on latest trunk, please rebase/merge onto latest. |
I _think_ I had already brought in your changes. I've since merged from
upstream and am getting the same two exceptions.
…On Mon, Sep 18, 2023 at 2:26 PM Dominik Stadler ***@***.***> wrote:
Sorry, my bad, different version of Java have different XML Parsers,
should be fixed on latest trunk, please rebase/merge onto latest.
—
Reply to this email directly, view it on GitHub
<#477 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABTNNPXSZOGXHRGALNIQDLTX3CG6PANCNFSM6AAAAAAZO65RZM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
As this file is now set to IGNORED in stress.xls it should not be executed at all, so if you see the same file-name "clusterfuzz-testcase-minimized-POIXWPFFuzzer-6733884933668864.docx" in the failures, the necessary change seems to be still missing. CI-Jobs at https://ci-builds.apache.org/job/POI/ were failing as well and are now fixed, thus I am fairly confident that latest trunk has it fixed now. |
Pushed merge remote-tracking. I'm getting a clean build now. Thank you @centic9 ! |
Looks like I don't have write access?! @pjfanning or @centic9 would you be able to merge? THANK YOU! |
@tballison You need to commit this to svn. https://svn.apache.org/repos/asf/poi/trunk The git repo is a read-only mirror afaik. |
🤦
…On Wed, Sep 20, 2023 at 4:00 PM PJ Fanning ***@***.***> wrote:
@tballison <https://github.com/tballison> You need to commit this to svn.
https://svn.apache.org/repos/asf/poi/trunk
The git repo is a read-only mirror afaik.
—
Reply to this email directly, view it on GitHub
<#477 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABTNNPWPIU2BR325IY7TBALX3NDNNANCNFSM6AAAAAAZO65RZM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
47950.zip |
@tballison could you close this? I believe that this was applied and was part of POI 5.2.4 release. |
This is a first draft at making stream/directory name look up case insensitive for OLE2 containers.