Python crawler tutorial: an example of crawling and downloading Baidu Tieba pages

yipeiwu_com, 6 years ago (2020-03-06), Python crawlers

Test URL: http://tieba.baidu.com/p/27141123322?pn=
begin 1
end 4
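
To make these inputs concrete: the script appends each page number from begin to end (inclusive) to the test URL. The following is a minimal Python 3 sketch of that step only; the helper name `page_urls` is hypothetical and does not appear in the original script.

```python
def page_urls(base_url, begin_page, end_page):
    """Return the URL of every page from begin_page to end_page, inclusive."""
    return [base_url + str(i) for i in range(begin_page, end_page + 1)]


if __name__ == "__main__":
    # With the test values above this lists the four page URLs, pn=1 through pn=4.
    for u in page_urls("http://tieba.baidu.com/p/27141123322?pn=", 1, 4):
        print(u)
```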

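# Python 3 version (urllib2 and raw_input exist only in Python 2)
import urllib.request

def baidu_tieba(url, begin_page, end_page):
    # Download every page from begin_page to end_page, inclusive
    for i in range(begin_page, end_page + 1):
        # Zero-pad the page number to five digits, e.g. 00001.html
        sName = str(i).zfill(5) + '.html'
        print('Downloading page ' + str(i) + ' and saving it as ' + sName + ' ...')
        # urlopen().read() returns bytes, so the file is opened in binary mode
        m = urllib.request.urlopen(url + str(i)).read()
        with open(sName, 'wb') as f:
            f.write(m)

bdurl = input('url:\n').strip()
begin_page = int(input('begin:\n'))
end_page = int(input('end:\n'))

baidu_tieba(bdurl, begin_page, end_page)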
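
The script above stops at the first network error. As one possible variation (not part of the original), the sketch below adds a timeout, skips pages that fail, and writes into a chosen directory. The names `fetch_bytes` and `save_pages` and the `out_dir` and `fetch` parameters are assumptions introduced for illustration; only standard-library calls are used.

```python
import os
import urllib.request


def fetch_bytes(url, timeout=10):
    """Fetch a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def save_pages(url, begin_page, end_page, out_dir=".", fetch=fetch_bytes):
    """Save each page as a zero-padded .html file; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i in range(begin_page, end_page + 1):
        path = os.path.join(out_dir, str(i).zfill(5) + ".html")
        try:
            data = fetch(url + str(i))
        except OSError as exc:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            # Skip the failed page instead of aborting the whole run.
            print("page %d failed: %s" % (i, exc))
            continue
        with open(path, "wb") as f:  # the body is bytes, so write in binary mode
            f.write(data)
        saved.append(path)
    return saved
```

Calling `save_pages(bdurl, begin_page, end_page)` would behave like `baidu_tieba` for pages that download successfully. Passing a different `fetch` function makes it possible to try the file-naming logic without touching the network.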



© YiPeiWu.com 【宜配屋】粤ICP备17031333号

Powered By Z-BlogPHP. Theme by TOYEAN.
