PHP: Fetching and Organizing Data from an RSS Feed (Imperfect Version)

Method in the Action class:

   public function curl_get($url){
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);          // URL to request
        curl_setopt($curl, CURLOPT_TIMEOUT, 5);         // give up after 5 seconds
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);  // 1: return the response as a string; 0: print it directly
        curl_setopt($curl, CURLOPT_HEADER, 0);          // 0: leave response headers out of the output; 1: include them
        // Certificate checks are switched off for HTTPS endpoints. This is insecure,
        // and both lines can be dropped when the endpoint is plain HTTP.
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
        $output = curl_exec($curl);         // run the request; returns false on failure
        curl_close($curl);
        return $output;
    }
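
The method above returns whatever curl_exec() gives back, including false when the request fails. A minimal sketch of a stricter variant follows, assuming you would rather get an exception than a silent false. The name fetch_feed and its parameters are hypothetical and not part of the original class, and certificate verification is left at its default (enabled).

```php
<?php
/**
 * Fetch a URL and return the response body, or throw on failure.
 * Hypothetical helper for illustration; not part of the original Action class.
 */
function fetch_feed(string $url, int $timeoutSeconds = 5): string
{
    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL            => $url,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
    ]);

    $body = curl_exec($curl);
    if ($body === false) {
        // Transport-level failure: DNS, timeout, unsupported protocol, TLS error, ...
        throw new RuntimeException('Request failed: ' . curl_error($curl));
    }

    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    if ($status < 200 || $status >= 300) {
        throw new RuntimeException('Unexpected HTTP status ' . $status);
    }

    return $body;
}
```

The caller can then wrap fetch_feed() in try/catch and show an error message instead of parsing an empty response.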

PATH_RSSHUB_APP is defined in config.php:

define('PATH_RSSHUB_APP','https://rsshub.app');
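
As a small sketch of how this constant might be combined with a route, the helper below joins the two with exactly one slash. The function name rsshub_url is hypothetical; only the constant and its value come from the post.

```php
<?php
// Same constant as in config.php; guarded so defining it twice does not raise a warning.
if (!defined('PATH_RSSHUB_APP')) {
    define('PATH_RSSHUB_APP', 'https://rsshub.app');
}

/**
 * Join the RSSHub base URL and a route with exactly one slash between them.
 * Hypothetical helper for illustration; not part of the original post.
 */
function rsshub_url(string $route): string
{
    return rtrim(PATH_RSSHUB_APP, '/') . '/' . ltrim($route, '/');
}
```

With this in place, rsshub_url('readhub/category/news') and rsshub_url('/readhub/category/news') build the same address.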


Controller action: the regex-based parsing was adapted from an article I found online; I no longer remember which one.

 // Display the subscribed feed
    public function show(){
        $act = new Action();
        $url = PATH_RSSHUB_APP.'/readhub/category/news'; // Readhub
        $html = $act->curl_get($url);                  // fetch the raw RSS data
        $item = '#<item>(.*)</item>#isU';              // matches the full content of every <item>
        preg_match_all($item, $html, $html2);          // $html2[0]: full matches, $html2[1]: captured item bodies
        // Keep only the captured item bodies (a flat list of strings)
        $html3 = isset($html2[1]) ? $html2[1] : array();
        // Patterns applied to each item
        $pattern_tle = '#<title>(.*)</title>#isU';                   // 1. title
        $pattern_desp = '#<description>(.*)</description>#isU';    // 2. content
        $pattern_pubdate = '#<pubDate>(.*)</pubDate>#isU';          // 3. date
        $pattern_desp2 = '#阅读:([1-9]\d*+).*</description>#isU';  // 4. read count ("阅读:" is the literal label in the feed text)
        $pattern_link = '#<link>(.*)</link>#isU';                   // 5. link to the original article
        $pattern_author = '#<author>(.*)</author>#isU';             // 6. author name
        $arr = array();
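        foreach($html3 as $k3=>$v3){
            // Apply each pattern to the item body
            preg_match($pattern_tle, $v3,$n1); // 1. title
            if(!empty($n1)){
                $str = addslashes(htmlspecialchars($n1['1']));
                // Titles arrive wrapped as <![CDATA[...]]>: take the text between the last "[" and the first "]".
                // Titles that themselves contain brackets are still truncated by this approach.
                $lastshow = strrpos($str,"[");  // position of the last "["
                $firstshow = strpos($str,"]");  // position of the first "]"
                if($lastshow !== false && $firstshow !== false && $firstshow > $lastshow){
                    $getstr = substr($str,$lastshow + 1,$firstshow - $lastshow - 1);
                }else{
                    $getstr = $str; // no wrapper found: keep the title unchanged
                }
                $arr['b_tilte'][] = $getstr;
            }else{
                return 'empty';
            }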
            preg_match($pattern_desp, $v3, $n2); // 2. content
            if(!empty($n2)){
                $str2 = addslashes(htmlspecialchars($n2['1']));
                $arr['b_content'][] = $str2; // already escaped above; a second addslashes() would double the backslashes
            }else{
                $arr['b_content'][]  = '';
            }
            preg_match($pattern_pubdate, $v3, $n3);  // 3. publication time
            if(!empty($n3)){
                $arr['b_time'][] = date('Y-m-d H:i:s',strtotime($n3['1']));
            }else{
                $arr['b_time'][] = '';
            }
            preg_match($pattern_desp2, $v3, $n4); // 4. read count
            if(!empty($n4)){
                $arr['b_number'][] = $n4['1'];
            }else{
                $arr['b_number'][] = '';
            }
            preg_match($pattern_link, $v3, $n5);  // 5. link to the original article
            if(!empty($n5)){
                $arr['b_url'][] = $n5['1'];
            }else{
                $arr['b_url'][] = '';
            }
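            preg_match($pattern_author, $v3, $n6); // 6. author name
            if(!empty($n6)){
                $str6 = addslashes(htmlspecialchars($n6['1']));
                $lastshow6 = strrpos($str6,"[");  // position of the last "["
                $firstshow6 = strpos($str6,"]");  // position of the first "]"
                if($lastshow6 !== false && $firstshow6 !== false && $firstshow6 > $lastshow6){
                    $getstr6 = substr($str6,$lastshow6 + 1,$firstshow6 - $lastshow6 - 1);
                }else{
                    $getstr6 = $str6; // no wrapper found: keep the name unchanged
                }
                $arr['b_author'][] = $getstr6; // already escaped above
            }else{
                $arr['b_author'][] = '';
            }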
          }
        // Pivot $arr[field][index] into $data[index][field] so the template gets one row per item
        $data = array();
        foreach($arr as $ka=>$va){
            foreach($va as $key=>$val){
                $data[$key][$ka] = $val;
            }
        }
        $this ->assign('da',$data);
        return $this ->fetch();
    }
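
The regex approach above has to peel the CDATA wrapper off by hand and breaks on titles that contain brackets. As an alternative (a swapped-in technique, not what the original code does), here is a minimal sketch using PHP's SimpleXML extension. It assumes the feed is standard RSS 2.0 with items under channel; the function name parse_rss_items is hypothetical. The array keys reuse the ones from the controller, and the read count is left out because it is embedded in the description text.

```php
<?php
/**
 * Parse an RSS 2.0 document into one associative array per <item>.
 * Hypothetical helper for illustration; not part of the original controller.
 */
function parse_rss_items(string $xml): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // LIBXML_NOCDATA merges CDATA sections into plain text,
    // so no manual "[" / "]" trimming is needed.
    $feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel)) {
        return [];   // not parseable as RSS
    }

    $rows = [];
    foreach ($feed->channel->item as $item) {
        $timestamp = strtotime((string) $item->pubDate);
        $rows[] = [
            'b_tilte'   => trim((string) $item->title),   // key spelling kept from the controller
            'b_content' => trim((string) $item->description),
            'b_time'    => $timestamp === false ? '' : date('Y-m-d H:i:s', $timestamp),
            'b_url'     => trim((string) $item->link),
            'b_author'  => trim((string) $item->author),
        ];
    }
    return $rows;
}
```

Because each item is built as a complete row, the pivot step at the end of show() would not be needed with this approach. The values are returned unescaped, so escaping is left to the template layer.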

HTML template:

<!DOCTYPE html>
<html lang="zh-cn">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width,initial-scale=1.0,maximum-scale=1.0,user-scalable=no">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1">
    <title>RSS feed display</title>
    <script src="/public/static/js/jquery-3.3.1.min.js"></script>
<style>
    .rss-box{max-width: 980px;text-align: center;margin:20px auto;background-color: #f9f9f9;}
    .rss-item{width:240px;text-align: left;display: inline-block;padding:10px;margin:10px;background-color: #fff;}
    .rss-item a{width:240px;display: block;color: #0000FF;text-decoration: none;overflow: hidden;text-overflow: ellipsis;white-space: nowrap;}
    .rss-item p{margin:3px 0;}
</style>
</head>
<body>
<div class="rss-box">
    {volist name='da' id='da'}
    <div class="rss-item">
        <a href="{$da.b_url}">{$da.b_tilte}</a>
        {if $da.b_time != ''}
        <p>Time: {$da.b_time}</p>
        {/if}

        {if $da.b_content != ''}
        <p>{$da.b_content}</p>
        {/if}

        {if $da.b_number != ''}
        <p>Reads: {$da.b_number}</p>
        {/if}

        {if $da.b_author != ''}
        <p>Author: {$da.b_author}</p>
        {/if}
    </div>
    {/volist}
</div>
</body>
<script></script>
</html>
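
The template puts the feed's link straight into an href attribute. Since that value comes from a remote feed, it may be worth checking it before assigning it to the view. A minimal sketch, assuming only absolute http and https links should be clickable; the helper name safe_link is hypothetical and not part of the original code.

```php
<?php
/**
 * Return $url only when it is an absolute http(s) URL; otherwise return '#'.
 * Hypothetical helper for illustration; not part of the original post.
 */
function safe_link(string $url): string
{
    $url = trim($url);
    // parse_url() returns null for a missing scheme and false for a malformed URL.
    $scheme = parse_url($url, PHP_URL_SCHEME);
    if (!is_string($scheme) || !in_array(strtolower($scheme), ['http', 'https'], true)) {
        return '#';
    }
    return $url;
}
```

In the controller this could be applied where the link is stored, for example `$arr['b_url'][] = safe_link($n5['1']);`. HTML escaping of the value is still left to the template layer.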

Reposted from blog.csdn.net/qq_41408081/article/details/101520275