Web Crawlers with HttpClient - Day 02
HttpClient
A web crawler is a program that accesses resources on the network for us. Since we have always browsed web pages over the HTTP protocol, a crawler likewise needs program code that fetches pages over that same HTTP protocol.
Here we use HttpClient, a Java HTTP client library, to fetch web page data.
1. GET Request
Access the JD.com homepage; the request URL is: http://www.jd.com/
import java.io.IOException;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public static void main(String[] args) throws IOException {
    // Create the HttpClient object
    CloseableHttpClient httpClient = HttpClients.createDefault();
    // Create the HttpGet request
    HttpGet httpGet = new HttpGet("http://www.jd.com/");
    CloseableHttpResponse response = null;
    try {
        // Send the request with HttpClient
        response = httpClient.execute(httpGet);
        // Check whether the response status code is 200
        if (response.getStatusLine().getStatusCode() == 200) {
            // 200 means the request succeeded; read the response body
            String content = EntityUtils.toString(response.getEntity(), "UTF-8");
            // Print the page content
            System.out.println(content);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Release the connection; only close the response if one was received
        if (response != null) {
            try {
                response.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        httpClient.close();
    }
}
Crawl result:
2. GET Request with Parameters
Search for c++ on Nowcoder; the address is:
https://www.nowcoder.com/search?query=c%2B%2B&type=all
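The `query=c%2B%2B` part of this URL is simply the keyword `c++` in percent-encoded form, since `+` is a reserved character in query strings. As a small sketch (not part of the original lesson), the JDK's own `java.net.URLEncoder` can produce this encoding before the URL is handed to HttpGet:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class QueryEncodeDemo {
    public static void main(String[] args) {
        // "+" is reserved in query strings, so "c++" must be encoded first.
        // Note: URLEncoder does form encoding (a space becomes "+"),
        // which is fine here because the keyword contains no spaces.
        String keyword = URLEncoder.encode("c++", StandardCharsets.UTF_8);
        System.out.println(keyword); // prints c%2B%2B

        // Assemble the same search URL used in the request code
        String uri = "https://www.nowcoder.com/search?query=" + keyword + "&type=all";
        System.out.println(uri);
    }
}
```

If the keyword were passed unencoded, the server would interpret each `+` as a space, and the search would not match `c++`.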
public static void main(String[] args) throws IOException {
    // Create the HttpClient object
    CloseableHttpClient httpClient = HttpClients.createDefault();
    // Create the HttpGet request; the keyword c++ appears URL-encoded as c%2B%2B
    String uri = "https://www.nowcoder.com/search?query=c%2B%2B&type=all";
    HttpGet httpGet = new HttpGet(uri);
    CloseableHttpResponse response = null;
    try {
        // Send the request with HttpClient
        response = httpClient.execute(httpGet);
        // Check whether the response status code is 200
        if (response.getStatusLine().getStatusCode() == 200) {
            // 200 means the request succeeded; read the response body
            String content = EntityUtils.toString(response.getEntity(), "UTF-8");
            // Print the page content
            System.out.println(content);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Release the connection; only close the response if one was received
        if (response != null) {
            try {
                response.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        httpClient.close();
    }
}
Crawl result: