It takes about 2 hours to detect the Excel file which contains some pictures. #78

Open
ivanliu-microsoft opened this issue Aug 19, 2024 · 4 comments

Comments

@ivanliu-microsoft

Our CSV parser uses PROSE to check whether a string's content is in CSV format. A customer reported that parsing took more than 2 hours before it returned false (not a qualified CSV file). I found that it is blocked on the line of code shown below (see the full code):
image
The sample Excel file is attached:
Quotation-Personal care wipes.xlsx
My questions are:

  • Why does PROSE take so long to "learn" this Excel file?
  • Is there a best practice you would recommend for this case?
@ashishxtiwari
Member

Can you clarify what exactly is being used to set "strData"? In other words, how is "strData" generated from the shared excel file? (I can't access the Babylon repo to find out.)

@ivanliu-microsoft
Author

ivanliu-microsoft commented Aug 20, 2024 via email

@ivanliu-microsoft
Author

Our CSV parser does not feed any data format to PROSE. We just call its Learn() API to detect whether the file is CSV (that is, whether it returns a CsvProgram). See the screenshot attached above.

@ivanliu-microsoft
Author

Can you clarify what exactly is being used to set "strData"? In other words, how is "strData" generated from the shared excel file? (I can't access the Babylon repo to find out.)

Hi Ashish, any updates from your side? We need a workaround for this. Many thanks.
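
One possible workaround, offered as a sketch under assumptions and not as anything PROSE itself provides: reject obviously binary payloads before calling Learn(), so that an .xlsx file (which is a ZIP container) never reaches the learner. The function name `looks_like_binary`, the sample size, and the control-character threshold below are all hypothetical, and the sketch assumes the raw file bytes are still available and that genuine CSV input is UTF-8 or a similar single-byte-compatible encoding. It is written in Python for brevity; the same check should port directly to the parser's own language.

```python
# Hypothetical pre-check to run before handing content to a CSV learner.
# Nothing here is part of PROSE; names and thresholds are illustrative only.

ZIP_MAGIC = b"PK\x03\x04"                          # ZIP containers: .xlsx, .docx, .zip
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # legacy Office files: .xls, .doc


def looks_like_binary(data: bytes, sample_size: int = 4096,
                      max_control_ratio: float = 0.01) -> bool:
    """Return True if the payload is very unlikely to be delimited text."""
    # Cheap signature test: catches Excel workbooks without scanning content.
    if data.startswith(ZIP_MAGIC) or data.startswith(OLE2_MAGIC):
        return True

    sample = data[:sample_size]
    if not sample:
        return False  # empty input is not binary; let the caller decide

    # A NUL byte almost never appears in UTF-8 text.
    # Caveat: UTF-16 text contains NULs and would need decoding first.
    if b"\x00" in sample:
        return True

    # Count control bytes other than tab, line feed, and carriage return.
    control = sum(1 for b in sample if b < 0x20 and b not in (0x09, 0x0A, 0x0D))
    return control / len(sample) > max_control_ratio
```

With a check like this, the parser could return "not a CSV file" immediately when `looks_like_binary` is true and only call Learn() otherwise. It does not explain why learning is slow on this input; it only avoids the slow path for files that cannot be CSV.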
