Crawling Anjuke fails with an error. Has the site added anti-scraping measures? #3

Open
YaphetsH opened this issue Dec 29, 2016 · 2 comments
Comments

@YaphetsH

Crawl started
Crawling http://shenzhen.anjuke.com/tycoon/nanshan/p1
Traceback (most recent call last):
  File "anjuke.py", line 34, in <module>
    content = anjukeSpider.getContent(currenturl)
  File "anjuke.py", line 11, in getContent
    conn = request.urlopen(url)
  File "E:\Python3.5\lib\urllib\request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "E:\Python3.5\lib\urllib\request.py", line 472, in open
    response = meth(req, response)
  File "E:\Python3.5\lib\urllib\request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "E:\Python3.5\lib\urllib\request.py", line 504, in error
    result = self._call_chain(*args)
  File "E:\Python3.5\lib\urllib\request.py", line 444, in _call_chain
    result = func(*args)
  File "E:\Python3.5\lib\urllib\request.py", line 696, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "E:\Python3.5\lib\urllib\request.py", line 472, in open
    response = meth(req, response)
  File "E:\Python3.5\lib\urllib\request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "E:\Python3.5\lib\urllib\request.py", line 510, in error
    return self._call_chain(*args)
  File "E:\Python3.5\lib\urllib\request.py", line 444, in _call_chain
    result = func(*args)
  File "E:\Python3.5\lib\urllib\request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

@ipfono
Collaborator

ipfono commented Dec 30, 2016

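Probably, yes. You could try setting the request headers with urllib2 (urllib.request in Python 3, which is what the traceback shows) so the request looks like it comes from a browser, or use selenium to drive a real browser for the crawl.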
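A minimal sketch of the header approach, assuming Python 3 and only the standard library. The header values and the helper names `build_request` and `get_content` are illustrative assumptions, not code from this repository, and there is no guarantee that browser-like headers alone get past the site's checks. The traceback above shows a 302 redirect followed by a 404, so the sketch also prints the URL that finally failed, which may help tell a block page from a changed URL.

```python
import urllib.error
import urllib.request

# Assumed browser-like headers; the exact values are illustrative only.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
}


def build_request(url, headers=None):
    """Return a Request carrying browser-like headers (hypothetical helper)."""
    merged = dict(BROWSER_HEADERS)
    if headers:
        merged.update(headers)  # caller-supplied headers override the defaults
    return urllib.request.Request(url, headers=merged)


def get_content(url, timeout=10):
    """Fetch a page as text; return None if the server answers with an error status."""
    req = build_request(url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as err:
        # err.url is the URL that finally failed; after a redirect it can
        # differ from the URL that was originally requested.
        print("HTTP", err.code, "for", err.url)
        return None
```

Calling `get_content("http://shenzhen.anjuke.com/tycoon/nanshan/p1")` would then return either the page text or `None`. If the 404 persists even with these headers, driving a real browser with selenium, as suggested above, is the other option.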

@dengwen168

I'm getting exactly the same error as above. I thought something was wrong with my own program.
