Python web scraping for beginners (1): fetching page source and sending POST and GET requests


1. Importing the urllib and urllib2 packages

    # GET and POST requests need the urllib package
    import urllib
    # import the urllib2 package
    import urllib2
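Note that these examples are written for Python 2, where urllib and urllib2 are separate modules. On Python 3 the two were merged; a rough equivalent of the imports (not used in the rest of this article) would be:

    # Python 3 only: urllib2 no longer exists; requests live in urllib.request
    # and urlencode moved to urllib.parse
    import urllib.request
    import urllib.parse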

2. a) Fetching a site's page source

    # Get a response object for the URL; the read method returns the page source
    response = urllib2.urlopen("http://www.baidu.com")
    print response.read()
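As a small aside beyond the original example, the response object returned by urlopen also exposes some metadata about the request, which can help when debugging:

    # Inspect the response before (or instead of) reading the body
    response = urllib2.urlopen("http://www.baidu.com")
    print response.getcode()  # HTTP status code, e.g. 200
    print response.geturl()   # final URL after any redirects
    print response.info()     # response headers
    print response.read()     # page source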

b) Written a bit more formally, it looks like this:

    # Build a Request instance explicitly, then open it
    request = urllib2.Request("http://www.baidu.com")
    response = urllib2.urlopen(request)
    print response.read()
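One benefit of the explicit Request form is that headers can be attached. Many sites reject requests that do not look like they come from a browser, so a common extension (not in the original article; the header value here is just a placeholder) is to set a User-Agent:

    # Attach a User-Agent header so the request resembles a browser request
    headers = {"User-Agent": "Mozilla/5.0"}  # placeholder value, adjust as needed
    request = urllib2.Request("http://www.baidu.com", headers=headers)
    response = urllib2.urlopen(request)
    print response.read()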

3. Building a POST request

    # POST request
    values = {"username": "geek", "password": "**********"}
    # or, equivalently
    values = {}
    values["username"] = "geek"
    values["password"] = "**********"
    # URL-encode the dictionary into form data
    data = urllib.urlencode(values)
    url = "https://passport.csdn.net/account/login?from=http://my.csdn.net/my/mycsdn"
    # passing data as the second argument makes urllib2 send a POST
    request = urllib2.Request(url, data)
    response = urllib2.urlopen(request)
    print response.read()
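A login request like this one can fail for many reasons (unreachable host, non-2xx status, and so on). As a sketch, the same POST can be wrapped in urllib2's error handling; how the failures are reported below is an assumption, not part of the original article:

    # Catch HTTPError before URLError, since HTTPError is a subclass of URLError
    try:
        request = urllib2.Request(url, data)
        response = urllib2.urlopen(request)
        print response.read()
    except urllib2.HTTPError as e:
        print "HTTP error:", e.code
    except urllib2.URLError as e:
        print "Failed to reach the server:", e.reason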

4. Building a GET request

    # GET request: the encoded parameters travel in the URL itself
    values = {"username": "geek", "password": "**********"}
    data = urllib.urlencode(values)
    url = "https://passport.csdn.net/account/login?from=http://my.csdn.net/my/mycsdn"
    # this URL already contains a query string, so join the extra fields with "&"
    full_url = url + "&" + data
    response = urllib2.urlopen(full_url)
    print response.read()
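Printing the composed URL makes the difference from the POST version clear: the form fields are visible in the URL itself, which is also why GET is a poor choice for credentials.

    # The fields are appended to the URL, e.g. "...&username=geek&password=..."
    print full_url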

