
Environment used in this tutorial: Windows 7, PHP 8.1, Dell G3 computer.

How do you get only the text content of an article in PHP?

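The goal is to fetch only the text inside a web page's body with PHP and filter out the HTML tags. The script requests the page with cURL, converts the response to UTF-8, keeps what lies between the opening and closing body tags, strips the tags, and removes whitespace. Let's get started!

The code is as follows: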

<?php
/**
 * Fetch a URL with cURL.
 * With $returnCookie set, returns ["cookie" => ..., "content" => ...].
 * Otherwise returns the plain text found inside the page's <body>.
 * If cURL fails, returns the cURL error message instead.
 */
function curl_request($url, $post = "", $cookie = "", $returnCookie = 0)
{
    // Forward the visitor's User-Agent when there is one; otherwise use a fixed browser string.
    $ua = !empty($_SERVER["HTTP_USER_AGENT"])
        ? $_SERVER["HTTP_USER_AGENT"]
        : "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 732; .NET4.0C; .NET4.0E; LBBROWSER)";

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, $ua);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($curl, CURLOPT_AUTOREFERER, 1);
    curl_setopt($curl, CURLOPT_REFERER, "https://www.baidu.com");
    if ($post) {
        curl_setopt($curl, CURLOPT_POST, 1);
        // Accept either an array of fields or an already encoded string.
        curl_setopt($curl, CURLOPT_POSTFIELDS, is_array($post) ? http_build_query($post) : $post);
    }
    if ($cookie) {
        curl_setopt($curl, CURLOPT_COOKIE, $cookie);
    }
    // Note: these two options turn off TLS certificate checks, as in the original script.
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
    curl_setopt($curl, CURLOPT_HEADER, $returnCookie);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

    $data = curl_exec($curl);
    if (curl_errno($curl)) {
        $error = curl_error($curl);
        curl_close($curl);
        return $error;
    }
    curl_close($curl);

    if ($returnCookie) {
        // Split the response headers from the body at the first blank line.
        list($header, $body) = array_pad(explode("\r\n\r\n", $data, 2), 2, "");
        preg_match_all("/Set\-Cookie:([^;]*);/i", $header, $matches);
        $info["cookie"]  = isset($matches[1][0]) ? trim($matches[1][0]) : "";
        $info["content"] = $body;
        return $info;
    }

    // Normalize the page to UTF-8, trying the listed encodings in order.
    $data = mb_convert_encoding($data, "UTF-8", "UTF-8,GBK,GB2312,BIG5");
    // Keep only the <body> contents; fall back to the whole response if there is no <body>.
    $str = preg_match("/<body.*?>(.*?)<\/body>/is", $data, $match) ? trim($match[1]) : $data;
    $html = strip_tags($str);
    // The first two entries appear as blanks in the source; "&nbsp;" and a plain space are assumed here.
    $search = array("&nbsp;", " ", "\n", "\r", "\t");
    return str_replace($search, "", $html);
}

// Placeholder address: the original script never assigns $url, so set your own here.
$url = "https://www.example.com/";
echo curl_request($url);
?>
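The tag-filtering step can also be tried on its own, without a network request. The following is a minimal sketch rather than the article's code: `extract_body_text` is a hypothetical helper name, and two of its steps are additions that the listing above does not perform, namely dropping `<script>`/`<style>` blocks and decoding HTML entities. It also collapses whitespace to a single space instead of deleting it, which tends to suit text that separates words with spaces.

```php
<?php
// Hypothetical helper for illustration; not part of the original script.
function extract_body_text(string $html): string
{
    // Use the <body> contents when present; otherwise keep the whole input.
    if (preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $m)) {
        $html = $m[1];
    }
    // Drop <script> and <style> blocks; strip_tags() alone would keep their contents as text.
    $html = preg_replace('/<(script|style)\b[^>]*>.*?<\/\1>/is', '', $html) ?? $html;
    // Put a space before every tag so adjacent elements do not run together once tags are removed.
    $html = str_replace('<', ' <', $html);
    // Remove the tags, then decode entities such as &amp; and &nbsp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace, including the non-breaking space, into one space.
    $text = preg_replace('/[\s\x{00A0}]+/u', ' ', $text) ?? $text;
    return trim($text);
}

$sample = '<html><head><title>t</title></head><body class="x"><h1>Hello</h1>'
        . '<script>var a = 1;</script><p>PHP &amp;&nbsp;cURL</p></body></html>';
echo extract_body_text($sample), PHP_EOL; // prints "Hello PHP & cURL"
```

For pages with malformed markup, a regular expression may miss the body boundaries; PHP's DOM extension (`DOMDocument::loadHTML`) is a common alternative in that case.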
