CATEGORY_LV1 is not as described in README.md #15
Comments
In addition, these are:
2022.3.1.0
2022.3.1.1
2022.3.1.2
2022.3.1.3
2022.3.1.4
2022.3.1.5
2022.3.1.6
2022.3.1.7
2022.3.1.8
2022.3.1.9
2022.3.1.10
2022.3.2.472
2022.3.2.473
2022.3.2.474
2022.3.2.475
2022.3.2.476
2022.3.2.477
2022.3.2.478
2022.3.2.479
2022.3.2.480
2022.3.2.481
2022.3.2.482
2022.3.2.483
2022.3.2.484
2022.3.2.485
2022.3.2.486
2022.3.2.520
2022.3.3(1).431
2022.3.3(1).432
2022.3.3(2).584
2022.3.3(2).585
2022.3.3(2).586
2022.3.3(2).587
2022.3.3(2).588
2022.3.3(2).589
2022.3.3(2).590
2022.3.3(2).887
2022.3.3(2).888
2022.3.3(2).889
2022.3.3(2).890
2022.3.3(2).891
2022.3.3(2).892
2022.3.3(2).893
2022.3.3(4).957
2022.3.3(4).958
2022.3.3(4).959
2022.3.3(4).960
2022.3.3(4).961
2022.3.3(4).962
2022.3.3(4).963
2022.3.3(4).964
2022.3.3(4).965
2022.3.3(4).966
2022.3.3(4).967
2022.3.3(4).968
2022.3.3(4).969
2022.3.3(4).970
2022.3.3(4).971
2022.3.3(4).972
2022.3.3(4).973
2022.3.3(4).974
2022.3.3(4).975
2022.3.3(4).976
2022.3.3(4).977
2022.3.3(4).978
2022.3.3(4).979
2022.3.3(4).980
2022.3.3(4).981
2022.3.3(4).982
2022.3.3(4).983
2022.3.3(4).984
2022.3.3(4).985
2022.3.3(4).986
2022.3.3(4).987
2022.3.3(4).988
2022.3.7.891
2022.3.7.892
2022.3.7.893
2022.3.7.1091
2022.3.7.1092
2022.3.7.1093
2022.3.8.252
2022.3.8.253
2022.3.8.254
2022.3.8.442
2022.3.8.443
2022.3.8.444
2022.3.8.445
2022.3.8.446
2022.3.8.447
2022.3.8.448
2022.3.8.449
2022.3.8.450
2022.3.8.451
2022.3.8.452
2022.3.10.1489
2022.3.10.1490
2022.3.10.1491
2022.3.10.1492
2022.3.10.1493
2022.3.10.1494
2022.3.10.1495
2022.3.10.1496
2022.3.10.1497
2022.3.10.1498
2022.3.12.124
2022.3.12.125
2022.3.12.126
2022.3.12.127
2022.3.12.128
2022.3.12.129
2022.3.12.197
2022.3.12.1667
2022.3.12.1718
2022.3.13(2).79
2022.3.13(2).187
2022.3.14.501
2022.3.16(2).665
2022.3.16(2).666
2022.3.16(2).667
Hi @JanYanisa, thanks for reporting the issue here! I've done some investigation and flagged the errors you provided in the image above into three categories, as shown in the image below.

Syntactic Error (S)
A syntactic error means that the data is written incorrectly in the source PDFs. For example, "ค่าครุภัณฑ์ ที่ดินและสิ่งก่อสร้าง" (equipment, land, and construction costs), as shown in the image below: you can see that it is an error produced by the Government Budgetary Office! This is the kind of syntactic error where we MUST CORRECT THE DATA BY HAND.

Exceptional Cases (E)
These cases are produced by "งบกลาง" (central fund) entries. We'll update the documentation to state the exceptions more clearly. Thanks for pointing this out :)

OCR Error (O)
These errors are produced by the OCR tool used in this project, the Google Cloud Vision API. There is nothing we can do except edit the data by hand as well.

Summary
Some syntactic errors, such as wrong indentation and bulleting, can be resolved by adding logic to the compiler to make it more robust to those errors. We plan to make the compiler more robust and then run it over all of the source PDFs to regenerate the output file. Hence, the errors that must be edited by hand will be the last to be fixed. I'll leave this issue open and keep you updated when we make further releases. Cheers ;)
Add exceptional cases explanation to `CATEGORY_LV1` for "งบกลาง" reported by this issue [#15].
Thanks for the response! There are also problematic rows that do not show in the image above, which are:
con3 = df["CATEGORY_LV1"] == "งบบุคลากร"
con4 = df["CATEGORY_LV1"] == "งบดำเนินงาน"
con5 = df["CATEGORY_LV1"] == "งบลงทุน"
con6 = df["CATEGORY_LV1"] == "งบเงินอุดหนุน"
con7 = df["CATEGORY_LV1"] == "งบรายจ่ายอื่น"
df[["ITEM_ID", "CATEGORY_LV1"]][~(con3 | con4 | con5 | con6 | con7)]
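The five comparisons above can also be collapsed into a single membership test. This is only a minimal sketch: it assumes `df` has the `ITEM_ID` and `CATEGORY_LV1` columns used in this thread, and the sample rows and the name `DOCUMENTED_LV1` are made up for illustration rather than taken from the dataset.

```python
import pandas as pd

# The five level-1 categories listed in README.md.
DOCUMENTED_LV1 = [
    "งบบุคลากร",
    "งบดำเนินงาน",
    "งบลงทุน",
    "งบเงินอุดหนุน",
    "งบรายจ่ายอื่น",
]

# Hypothetical sample rows; the real data comes from the project's output file.
df = pd.DataFrame(
    {
        "ITEM_ID": ["a.1", "a.2", "a.3"],
        "CATEGORY_LV1": ["งบลงทุน", "ค่าครุภัณฑ์ ที่ดินและสิ่งก่อสร้าง", None],
    }
)

# Keep rows whose CATEGORY_LV1 is missing or outside the documented set.
unexpected = df.loc[
    ~df["CATEGORY_LV1"].isin(DOCUMENTED_LV1),
    ["ITEM_ID", "CATEGORY_LV1"],
]
print(unexpected)
```

Using `.loc` with a boolean mask and a column list selects rows and columns in one step, and `isin` keeps the list of allowed values in one place if the documented set changes.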
@napatswift I think we should produce a message to Valid
Since it is a minor change (just producing a log message), I think we could wait for other major changes and then include this fix in the next release.
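One way such a log message could be produced is sketched below. This is a rough illustration only: the function name `warn_unexpected_lv1`, the logger name, and the row format (dicts with `ITEM_ID` and `CATEGORY_LV1` keys) are assumptions, not the project's actual validation code.

```python
import logging

logger = logging.getLogger("category_check")  # hypothetical logger name

# The five level-1 categories listed in README.md.
DOCUMENTED_LV1 = {
    "งบบุคลากร",
    "งบดำเนินงาน",
    "งบลงทุน",
    "งบเงินอุดหนุน",
    "งบรายจ่ายอื่น",
}


def warn_unexpected_lv1(rows):
    """Log a warning for each row whose CATEGORY_LV1 is outside the documented set.

    `rows` is an iterable of dicts (an assumed format). Returns the ITEM_IDs
    of the flagged rows so callers can review them by hand.
    """
    flagged = []
    for row in rows:
        category = row.get("CATEGORY_LV1")
        if category not in DOCUMENTED_LV1:
            logger.warning(
                "ITEM_ID %s has unexpected CATEGORY_LV1: %r",
                row.get("ITEM_ID"),
                category,
            )
            flagged.append(row.get("ITEM_ID"))
    return flagged


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # Made-up rows for demonstration.
    sample = [
        {"ITEM_ID": "x.1", "CATEGORY_LV1": "งบลงทุน"},
        {"ITEM_ID": "x.2", "CATEGORY_LV1": "???"},
    ]
    print(warn_unexpected_lv1(sample))
```

Logging a warning instead of raising keeps the run going, which fits rows that have to be corrected by hand later.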
Hello there,
According to README.md, CATEGORY_LV1 is described as follows: "The level-1 expenditure category consists only of
งบบุคลากร (personnel),
งบดำเนินงาน (operations),
งบลงทุน (investment),
งบเงินอุดหนุน (subsidies),
งบรายจ่ายอื่น (other expenses)."
But this is what I got when using Python pandas.
So I would like to report the issue here, and I hope it will be fixed soon.