2015-07-08 2 views

I would like to scrape the following HTML using DOMDocument + PHP:

<div class="venue-event-list " rel="GB"> 
          <div class="tracks-list"> 
<div class="single-track"> 
      <a href="//livevideo.betfair.com/Default.do?mi=119408124" target="_blank" class="live-video-link"><div class="bf-icon-live-video tag-i13n i13n-ltxt-LVid i13n-sec-GB i13n-tab-today" title="Watch now on Betfair Live Video"></div></a> 
    <div class="info-container"> 
     <span class="track-name"> 
      <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a> 
     </span> 
     <div class="races-list"> 


<div class="single-race" id="m-1_119408124"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408124" 
      title="5f Nursery | 7 Runners">14:10</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408128"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408128" 
      title="6f Mdn Stks | 11 Runners">14:40</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408132"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408132" 
      title="7f Mdn Stks | 6 Runners">15:10</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408136"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408136" 
      title="2m Hcap | 12 Runners">15:40</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408140"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408140" 
      title="1m2f Sell Stks | 6 Runners">16:10</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408144"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408144" 
      title="1m3f Hcap | 8 Runners">16:40</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408148"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408148" 
      title="1m1f Hcap | 14 Runners">17:10</a> 
    </span> 
</div> 
     </div> 
    </div> 
</div> 
        </div> 
          <div class="tracks-list"> 
<div class="single-track"> 
      <a href="//livevideo.betfair.com/Default.do?mi=119408153" target="_blank" class="live-video-link"><div class="bf-icon-live-video tag-i13n i13n-ltxt-LVid i13n-sec-GB i13n-tab-today" title="Watch now on Betfair Live Video"></div></a> 
    <div class="info-container"> 
     <span class="track-name"> 
      <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408153">Wolverhampton</a> 
     </span> 
     <div class="races-list"> 


<div class="single-race" id="m-1_119408153"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408153" 
      title="5f Mdn Stks | 7 Runners">14:20</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408157"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408157" 
      title="1m6f Hcap | 7 Runners">14:50</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408161"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408161" 
      title="1m4f Sell Stks | 5 Runners">15:20</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408165"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408165" 
      title="1m1f Hcap | 13 Runners">15:50</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408169"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408169" 
      title="1m1f Hcap | 11 Runners">16:20</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408173"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408173" 
      title="1m Mdn Stks | 11 Runners">16:50</a> 
    </span> 
     <span class="separator">|</span> 
</div> 


<div class="single-race" id="m-1_119408177"> 
    <span class="race-time link-text"> 
     <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
      href="/exchange/plus/#/horse-racing/market/1.119408177" 
      title="1m Hcap | 13 Runners">17:20</a> 
    </span> 
</div> 
     </div> 
    </div> 
</div> 
        </div> 

I used the following code to pull the race name and race time:

$url   = "";
$html  = file_get_contents($url);
$dom   = new DOMDocument();
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$xpath = new DOMXPath($dom);
// pull the individual cards for the day
$getdropdown  = '//div[contains(@class, "tracks-list")]';
$getdropdown2 = $xpath->query($getdropdown);
// loop through each individual card and print its flattened text
foreach ($getdropdown2 as $dropresults) {
    echo $dropresults->textContent . "<br />";
}

What I would like is to match links like the one below on "GB" and "today" (both appear in the class text):

> <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" 
> href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a> 

so the result would be Lingfield. If that matches, I would then like to pull the race time and market ID from the following:

<a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" 
     href="/exchange/plus/#/horse-racing/market/1.119408124" 
     title="5f Nursery | 7 Runners">14:10</a> 

so that the result would be:

Lingfield 14:10 1.119408124 
Lingfield 14:40 1.119408128 
............................. 
Wolverhampton 14:20 1.119408153 

Possible duplicate of [Scrape web page contents](http://stackoverflow.com/questions/584826/scrape-web-page-contents)

Answer

$xpath->query("a[contains(@class,'GB') and contains(@class,'today')]"); 

This should help.
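
For reference, here is a minimal sketch of how the output requested in the question might be produced with nested XPath queries. It is not the answerer's code. It assumes the live page has the same shape as the markup in the question (abbreviated below), that the market ID is always the last path segment of the `href`, and the variable names `$rows`, `$meeting` and `$marketId` are chosen only for illustration.

```php
<?php
// Abbreviated copy of the markup from the question (assumed to match the live page).
$html = <<<'HTML'
<div class="venue-event-list " rel="GB">
  <div class="tracks-list">
    <div class="single-track">
      <div class="info-container">
        <span class="track-name">
          <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a>
        </span>
        <div class="races-list">
          <div class="single-race" id="m-1_119408124">
            <span class="race-time link-text">
              <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                 href="/exchange/plus/#/horse-racing/market/1.119408124"
                 title="5f Nursery | 7 Runners">14:10</a>
            </span>
          </div>
          <div class="single-race" id="m-1_119408128">
            <span class="race-time link-text">
              <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                 href="/exchange/plus/#/horse-racing/market/1.119408128"
                 title="6f Mdn Stks | 11 Runners">14:40</a>
            </span>
          </div>
        </div>
      </div>
    </div>
  </div>
  <div class="tracks-list">
    <div class="single-track">
      <div class="info-container">
        <span class="track-name">
          <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408153">Wolverhampton</a>
        </span>
        <div class="races-list">
          <div class="single-race" id="m-1_119408153">
            <span class="race-time link-text">
              <a class="race-link tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                 href="/exchange/plus/#/horse-racing/market/1.119408153"
                 title="5f Mdn Stks | 7 Runners">14:20</a>
            </span>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$rows = [];
// One iteration per meeting (track).
foreach ($xpath->query('//div[contains(@class, "single-track")]') as $track) {
    // The meeting link must carry both the GB and the today class.
    $meeting = $xpath->query(
        './/span[contains(@class, "track-name")]'
        . '/a[contains(@class, "i13n-sec-GB") and contains(@class, "i13n-tab-today")]',
        $track
    )->item(0);
    if ($meeting === null) {
        continue; // not a GB/today meeting
    }
    $name = trim($meeting->textContent);

    // Race links are looked up relative to the current track only.
    foreach ($xpath->query('.//a[contains(@class, "race-link")]', $track) as $race) {
        $href     = $race->getAttribute('href');
        $marketId = substr($href, strrpos($href, '/') + 1); // text after the last "/"
        $rows[]   = $name . ' ' . trim($race->textContent) . ' ' . $marketId;
    }
}

echo implode("\n", $rows), "\n";
```

With the abbreviated markup above, this prints `Lingfield 14:10 1.119408124`, `Lingfield 14:40 1.119408128` and `Wolverhampton 14:20 1.119408153`, one per line. Passing `$track` as the second argument to `query()` keeps each race attached to its own meeting name.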

I never thought jQuery could work.

I tried the above and it did not return anything.
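
One likely reason for the empty result is the path itself. When no context node is passed, `DOMXPath::query()` evaluates relative to the root element, so a bare `a` only matches `<a>` elements that are direct children of it, whereas `//a` searches the whole document. The sketch below compares the two on a one-link fragment taken from the question; the variable names are illustrative.

```php
<?php
// Single meeting link copied from the question's markup.
$html = '<div><span class="track-name">'
      . '<a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" '
      . 'href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a>'
      . '</span></div>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

$predicate = "[contains(@class,'GB') and contains(@class,'today')]";

// Relative step: only looks at direct children of the context node.
$relative = $xpath->query('a' . $predicate);
// Descendant search: looks at every <a> in the document.
$anywhere = $xpath->query('//a' . $predicate);

echo $relative->length, "\n";               // 0
echo $anywhere->length, "\n";               // 1
echo $anywhere->item(0)->textContent, "\n"; // Lingfield
```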

Have you tried with only one condition, like $xpath->query("a[contains(@class,'GB')]");?
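
A side note on the conditions themselves: `contains(@class, 'GB')` is a substring test, so it would also match any other class value that happens to contain those letters. A stricter, whole-token test can be sketched as below. The helper name `classToken` and the `i13n-sec-GBR` class are hypothetical, introduced only to show the difference.

```php
<?php
// Hypothetical helper: builds an XPath predicate that matches one whole class token.
function classToken(string $token): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' " . $token . " ')";
}

// Two links: the second uses a made-up class that merely starts with the same text.
$html = '<p>'
      . '<a class="i13n-sec-GB i13n-tab-today">Lingfield</a>'
      . '<a class="i13n-sec-GBR i13n-tab-today">Other</a>'
      . '</p>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Substring test: matches both links.
$loose  = $xpath->query("//a[contains(@class,'i13n-sec-GB')]");
// Whole-token test: matches only the first link.
$strict = $xpath->query('//a[' . classToken('i13n-sec-GB') . ']');

echo $loose->length, "\n";   // 2
echo $strict->length, "\n";  // 1
```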
