Tema: PHP & XPath
View Single Post
Staro 03.11.2010., 17:26   #1
svebee
/
 
Datum registracije: Oct 2006
Lokacija: /
Postovi: 2,053
Arrow PHP & XPath

Želio bih "skinuti" ZET-ov vozni red, tj. linkove na vozne redove...stvar je da mi prozuji kroz sve linkove osim zadnjih 6, ako spremim stranicu kao .HTML i nju probam "skinuti" - sve uredno prolazi.

gdje je problem? možda u 6. linku odozdola koji sadrži "/(" u linku? to mi sad jedino pada na pamet, ali ne vidim to kao problem uglavnom ispiše sve do /media/39180/182.pdf. nakon toga - ništa.

Link je - http://www.zet.hr/autobus/dnevni.aspx

Kod je

Code:
<?php    
    $target_url = "http://www.zet.hr/autobus/dnevni.aspx";
    
    $userAgent = 'IE 6 – Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
    
    // make the cURL request to $target_url
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL,$target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 50);
    $html= curl_exec($ch);
    if (!$html) {
        echo "<br />cURL error number:" .curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        exit;
    }
    
    // parse the html into a DOMDocument
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    
    // grab all the on the page
    $xpath = new DOMXPath($dom);
    $hrefs = $xpath->evaluate("/html/body/div[@id='container']/div[@id='content']/div[@id='autobus']/ul/li//a");
    echo "length: " . $hrefs->length;
        
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $url = $href->getAttribute('href');
        echo "<br />" . ($i+1) . " | " . $url;
    }
?>

Zadnje izmijenjeno od: svebee. 03.11.2010. u 17:32.
svebee je offline   Reply With Quote