For my day-to-day work I wanted to pull specific things out of the access log: files that could not be found, whether a visit came from Google, Yahoo, or somewhere else, and whether it was a search-engine spider. The principle is simple: open the file, filter out the unwanted records, split each record into fields, and list the results you need. Nearly all of it is done with a single PHP function, preg_match(). The source code is below; study it for yourself!

<html>
<head>
<title> Simple tools for website logs </title>
</head>
<body>
<form name="my_form" method="post">
Select your type :<br>
<select name="type">
<option value="">Get the broken links (404)</option>
<option value="yahoo">Access from Yahoo</option>
<option value="google">Access from Google</option>
<option value="msn">Access from MSN</option>
<option value="robot">Access by robots</option>
</select>
<input type="submit" name="submit" value="get the result">
</form>
<table border=1>
<tr bgcolor="#FFCCFF">
<td><font color="#000000">ClientIP</font></td>
<td><font color="#000000">AccessTime</font></td>
<td><font color="#000000">TargetPage</font></td>
<td><font color="#000000">Code</font></td>
<td><font color="#000000">FromURL</font></td>
<td><font color="#000000">Client ENV</font></td>
</tr>
<?PHP
// The log is expected at <document root>/logs/access_log; make sure the root ends with a slash.
$doc_path= $_SERVER["DOCUMENT_ROOT"];
if(substr($doc_path,-1)!="/"){
    $doc_path=$doc_path."/";
}
// Read the selected report type from the submitted form
// (the original relied on register_globals, which modern PHP no longer has).
$type = isset($_POST['type']) ? strtolower($_POST['type']) : '';

// One pattern splits a log line into its fields:
// [1] client IP, [4] access time, [7] target page, [9] status code, [13] referrer, [15] client environment.
$log_pattern = "/([0-9.]+)?([ -]+)?(\[)?([0-9a-zA-Z+: \/]+)?(\])?( \"GET \/)?([a-z0-9A-Z.\/\?&=%_\-:+]+)?( HTTP\/1\.[012]\" )?([0-9.]+)?( )?([0-9.\-]+)?( \")?([a-z0-9A-Z.\/\?&=%_\-:+]+)?(\" \")?(.*)/i";

// Decide whether a log line belongs in the report for the selected type.
function keep_line($type, $line) {
    if ($type == 'yahoo') {
        // Visitors referred by Yahoo, excluding Yahoo's own spider (Slurp).
        return preg_match("/yahoo/i", $line) && !preg_match("/slurp/i", $line);
    } elseif ($type == 'robot') {
        // Search-engine spiders, ignoring their robots.txt requests.
        return !preg_match("/robots\.txt/i", $line) && preg_match("/slurp|msnbot|googlebot|psbot/i", $line);
    } elseif ($type != '') {
        // Visitors referred by the chosen site (google, msn), excluding its spider.
        $word = preg_quote($type, '/');
        return preg_match("/$word/i", $line) && !preg_match("/{$word}bot/i", $line);
    }
    // Default: requests that returned 404, ignoring robots.txt.
    return preg_match("/ 404 /", $line) && !preg_match("/robots\.txt/", $line);
}

$lines = file($doc_path.'logs/access_log');
if ($lines === false) {
    $lines = array();
}
foreach ($lines as $line) {
    if (!keep_line($type, $line)) {
        continue;
    }
    preg_match($log_pattern, $line, $matches);
    echo "<tr>";
    foreach (array(1, 4, 7, 9, 13, 15) as $i) {
        // Escape the value so that log content cannot inject HTML into the page.
        $cell = isset($matches[$i]) ? $matches[$i] : '';
        echo "<td>".htmlspecialchars($cell)."</td>";
    }
    echo "</tr>\n";
}
?>
</table>
</body>
</html>
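
The "split each record into fields" step can also be sketched with named capture groups, so that fields are read by name rather than by group number. This is only a minimal sketch, not the code from the listing: it assumes the log uses Apache's combined format, and the function name parse_log_line(), the group names, and the sample line are all made up for illustration.

```php
<?php
// Hypothetical helper: split one combined-format log line into named fields.
// Returns null when the line does not look like a combined-format entry.
function parse_log_line($line)
{
    $pattern = '/^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
             . '"(?P<method>[A-Z]+) (?P<page>\S+)[^"]*" '
             . '(?P<code>\d{3}) (?P<bytes>\S+)'
             . '(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return array(
        'ip'      => $m['ip'],
        'time'    => $m['time'],
        'page'    => $m['page'],
        'code'    => (int) $m['code'],
        // The referrer and agent are optional, so fall back to empty strings.
        'referer' => isset($m['referer']) ? $m['referer'] : '',
        'agent'   => isset($m['agent']) ? $m['agent'] : '',
    );
}

// Made-up sample line, for illustration only.
$sample = '192.0.2.7 - - [10/Oct/2005:13:55:36 +0800] "GET /missing.html HTTP/1.1" 404 209 '
        . '"http://www.google.com/search?q=test" "Mozilla/4.0 (compatible; MSIE 6.0)"';
$fields = parse_log_line($sample);
echo $fields['page'], ' ', $fields['code'], ' ', $fields['referer'], "\n";
```

With fields available by name, a filter such as "status code is 404" or "referrer contains google" can test a single field instead of searching the whole line, which avoids false hits when the search word happens to appear in another field.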