Capturing DOC documents #608

Open
zhumingyu opened this issue Jan 15, 2025 · 2 comments

Comments

@zhumingyu

Hello author, can this kind of DOC or ZIP document be captured? https://www.zxxk.com/soft/45705861.html

@xifangczy
Owner

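Uh, no... In principle it can't be done.
What these sites serve is not the original file. Take the site you linked: it converts the DOC into SVG files. If you add the svg suffix in the settings, you can capture those.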
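
To illustrate the suffix idea, here is a minimal sketch of filtering resource URLs by file extension. It is not the extension's actual code: the function names and the example.com URLs are hypothetical, and it only assumes that matching is done on the extension of the URL path.

```python
from urllib.parse import urlparse
import posixpath


def has_suffix(url, suffixes):
    """Return True if the URL path ends with one of the given extensions."""
    # Only the path is inspected, so query strings such as ?x=1 are ignored.
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    return ext in suffixes


def filter_by_suffix(urls, suffixes):
    """Keep only the URLs whose extension is in the user-configured list."""
    # Normalize entries like ".SVG" or "svg" to the same form.
    wanted = {s.lower().lstrip(".") for s in suffixes}
    return [u for u in urls if has_suffix(u, wanted)]


if __name__ == "__main__":
    # Hypothetical requests seen while a document page loads.
    seen = [
        "https://example.com/doc/page1.svg?x=1",
        "https://example.com/static/app.js",
        "https://example.com/doc/page2.SVG",
    ]
    for url in filter_by_suffix(seen, ["svg"]):
        print(url)
```

With svg in the suffix list, only the two page images are kept and the script is skipped.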

@louiesun

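> Uh, no... In principle it can't be done. What these sites serve is not the original file. Take the site you linked: it converts the DOC into SVG files. If you add the svg suffix in the settings, you can capture those.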

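This case is actually one of the better ones: you can get the document in PDF form (a PDF is, on the whole, vector graphics).
With sites like doc88 you only get bitmaps, so the only way to restore the content as closely as possible is OCR.
There should be projects of this kind on Greasy Fork; you can search for them there.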
