
qsbk (a Scrapy practice project)

Anonymous · 2024-04-25


I'm learning to write Scrapy spiders — could someone help me figure out where the problem is?

zou@zou-VirtualBox:~/qsbk$ tree
.
├── items.py
├── qsbk
│   ├── __init__.py
│   ├── items.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders
│       ├── __init__.py
│       └── qsbk_spider.py
└── scrapy.cfg

-------------------------

vi items.py

from scrapy.item import Item, Field

class TutorialItem(Item):
    # define the fields for your item here like:
    # name = Field()
    pass

class Qsbk(Item):
    title = Field()
    link = Field()
    desc = Field()
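For context, each `Field()` declares one key the item accepts, and a Scrapy `Item` then behaves like a dict restricted to those keys. A minimal sketch of that behavior (emulated with plain Python so it runs without scrapy installed — this is an illustration, not code from the post):

```python
# Emulating how a scrapy Item restricts assignment to its declared fields.
class DictItem(dict):
    # the fields declared in Qsbk above
    FIELDS = ("title", "link", "desc")

    def __setitem__(self, key, value):
        # a real scrapy Item raises KeyError for undeclared fields
        if key not in self.FIELDS:
            raise KeyError("%s is not a declared field" % key)
        super().__setitem__(key, value)

item = DictItem()
item["title"] = "sample joke title"
print(item["title"])  # sample joke title
# item["author"] = "x"  # would raise KeyError, just like a real Item
```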

-----------------------

vi qsbk/spiders/qsbk_spider.py

from scrapy.spiders import Spider  # the old 'scrapy.spider' path is deprecated

class QsbkSpider(Spider):
    name = "qsbk"
    allowed_domains = ["qiushike.com"]
    start_urls = ["

    def parse(self, response):
        # 'filename = response' passed the Response object itself to open();
        # open() needs a string path, so derive one from the URL instead
        filename = response.url.split("/")[-2] + ".html"
        with open(filename, "wb") as f:
            f.write(response.body)
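One clear bug in the spider as posted: `filename = response` hands the Response object itself to `open()`, which needs a string path. The Scrapy tutorial's usual pattern derives a name from `response.url`; a standalone sketch of that string manipulation (runnable without scrapy):

```python
def filename_for(url):
    # 'http://www.qiushike.com/' splits into
    # ['http:', '', 'www.qiushike.com', ''] -> take the segment at [-2]
    return url.split("/")[-2] + ".html"

print(filename_for("http://www.qiushike.com/"))  # www.qiushike.com.html
```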

------------------------

Then I ran `scrapy shell www.qiushike.com`, planning to fetch the page first and then use XPath to pull out the child nodes (i.e. the content I want).

That approach should be fine, right? But when I get to `scrapy shell www.qiushike.com`, the page content won't come back at all.
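The fetch-first-then-XPath plan itself is sound. As a side note, here is a minimal XPath-style sketch using only the Python stdlib (an illustration with made-up markup, not code from the post; Scrapy's own `response.xpath` is more capable):

```python
import xml.etree.ElementTree as ET

# A tiny well-formed fragment standing in for the fetched page
html = "<div><span class='title'>hello</span></div>"
root = ET.fromstring(html)

# ElementTree supports a limited XPath subset: './/span' finds
# any <span> descendant of the root element
print(root.find(".//span").text)  # hello
```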

Error output:


zou@zou-VirtualBox:~/qsbk$ scrapy shell
ScrapyDeprecationWarning: Module `scrapy.spider` is deprecated, use `scrapy.spiders` instead
  from scrapy.spider import Spider
2015-12-21 00:18:30 [scrapy] INFO: Scrapy 1.0.3 started (bot: qsbk)
2015-12-21 00:18:30 [scrapy] INFO: Optional features available: ssl, http11
2015-12-21 00:18:30 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'qsbk.spiders', 'SPIDER_MODULES': ['qsbk.spiders'], 'LOGSTATS_INTERVAL': 0, 'BOT_NAME': 'qsbk'}
2015-12-21 00:18:30 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, CoreStats, SpiderState
2015-12-21 00:18:30 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-12-21 00:18:30 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-12-21 00:18:30 [scrapy] INFO: Enabled item pipelines:
2015-12-21 00:18:30 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2015-12-21 00:18:30 [scrapy] INFO: Spider opened
2015-12-21 00:18:30 [scrapy] DEBUG: Retrying <GET (failed 1 times): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionDone'>>]
2015-12-21 00:18:30 [scrapy] DEBUG: Retrying <GET (failed 2 times): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionDone'>>]
2015-12-21 00:18:30 [scrapy] DEBUG: Gave up retrying <GET (failed 3 times): [<twisted.python.failure.Failure <class 'twisted.internet.error.ConnectionDone'>>]
Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/shell.py", line 63, in run
    shell.start(url=url)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/shell.py", line 44, in start
    self.fetch(url, spider)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/shell.py", line 87, in fetch
    reactor, self._schedule, request, spider)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "<string>", line 2, in raiseException
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure <class 't
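The traceback ends in `ResponseNeverReceived` / `ConnectionDone`: the server closed the connection before sending a response, for all three retries. One common cause (an assumption on my part, not something the log confirms) is the site dropping requests that carry Scrapy's default user agent; another is invoking the shell with a bare hostname instead of a full `http://` URL. A hypothetical `settings.py` tweak to try while debugging:

```python
# qsbk/settings.py -- hypothetical additions, not from the original post
BOT_NAME = 'qsbk'

# Some sites close connections from the default 'Scrapy/x.y' user agent;
# a browser-like UA string is a common workaround.
USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64)'

# Fail faster while diagnosing the connection problem
RETRY_TIMES = 1
DOWNLOAD_TIMEOUT = 15
```

It is also worth retrying with an explicit scheme, e.g. `scrapy shell "http://www.qiushike.com"`, so the shell does not have to guess how to build the request URL.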
