2013-11-20 3 views

Extracting specific content from an HTML table with Jsoup

I want to extract specific content from the HTML table below, and I am using Jsoup.

Here is the structure of my table:

<table id="main_widget_table" class="table table-striped table-hover table-condensed table-bordered"> 
       <tbody> 
       <!-- ngRepeat: object in currentView --><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_BACKUP</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task backup</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUBACKUP</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_TOTO</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task toto</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUTOTO</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_FTP</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task ftp</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUFTP</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_MSSQL</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task mssql</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUMSSQL</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_ORACLE</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task oracle</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUORA1</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_TUTU</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task tutu</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUTUTU</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_TITI</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task titi</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: TASKMUTITI</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_WSB</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task wsb</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: MUWSB</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_SAP</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task sap</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: FRQPMDEV18</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr><tr ng-repeat="object in currentView" class="ng-scope"> 
        <td> 
         <a id="main_widget_table_object_name_action" href="#//object/" target="_blank"> 
          <b class="ng-binding">TASK_BATCH</b> 
         </a> 

         <p style="font-size:11px"> 
          <span class="text-success" ng-show="object.label"><em class="ng-binding"> task batch</em></span> 
          <br ng-show="object.label"> 
          <span ng-show="object.session" class="ng-binding" style="display: none;"> 
           <span class="label label-default">WORKFLOW</span> &nbsp; <em class="ng-binding"> 
           </em> 
          </span> 
          <br ng-show="object.session" style="display: none;"> 
          <span ng-hide="object.session" class="ng-binding"> 
           <span class="label label-default">JOB</span> &nbsp; <em class="ng-binding"></em> 
          </span> 
          <br ng-hide="object.session"> 
          <span class="text-warning ng-binding">Location: MUFRQPMDE</span> 
          <span ng-show="isShowing('nextPlanified')" class="badge pull-right ng-binding" style="display: none;"> 

          </span> 
         </p> 
        </td> 
       </tr> 
       </tbody> 
      </table> 

I only want to extract the value between the bold tags; for example, for the first TD the value is TASK_TOTO.

Here is my Java code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HtmlParser {

    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("http://frstmwarwebsrv2.orsyptst.com:9000/ui/#/en/search?searchString=TSK&filterchecks=nameSWF").get();
        for (Element table : doc.select("#search_results_table")) {
            for (Element row : table.select("tr")) {
                Elements tds = row.select("td");
                System.out.println(tds.get(0).text());
            }
        }
    }

}

I am new to Jsoup and my code does not display anything so far. I am using the table ID to find the table.

Thanks for your help.

FYI: my table is generated with AngularJS, so Jsoup is probably not the best way to extract the table data.

When I use this code instead:

List<WebElement> resultsDiv = driver.findElements(By.xpath("id('search_results_table')"));
for (int i = 0; i < resultsDiv.size(); i++) {
    System.out.println(resultsDiv.get(i).getText());
    System.out.println(resultsDiv.size());
}

I still do not get the content on screen, and the size is 1! I am not sure what I am doing wrong.

Answer


Well, based on the HTML snippet you provided, the table ID is main_widget_table, not search_results_table. (Note that the URL in your code is no longer accessible, so I cannot tell whether there is some other search_results_table on that page.)

You can print the text of all b tags in that table with:

for (Element e : doc.select("#main_widget_table b")) 
    System.out.println(e.text()); 
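
One more thing worth checking, since the question mentions that the table is generated by AngularJS. Jsoup does not execute JavaScript, and the part of a URL after # (the fragment) is never sent to the server in an HTTP request. So Jsoup.connect(...).get() most likely receives only the page at /ui/, before Angular has rendered any rows. The sketch below uses only java.net.URI to show how the URL from the question splits; the class and method names (FragmentDemo, requestedUrl) are made up for illustration, and the assumption that the server returns an empty Angular shell cannot be verified because the URL is no longer reachable.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentDemo {

    // Returns the part of the URL that an HTTP client actually requests,
    // i.e. the URL with its fragment (everything after '#') dropped.
    static String requestedUrl(String url) {
        URI u = URI.create(url);
        try {
            // Rebuild the URI with a null fragment.
            return new URI(u.getScheme(), u.getAuthority(), u.getPath(), u.getQuery(), null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the fragment, which stays in the client and is read by the Angular router.
    static String fragment(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String url = "http://frstmwarwebsrv2.orsyptst.com:9000/ui/#/en/search?searchString=TSK&filterchecks=nameSWF";
        System.out.println("requested: " + requestedUrl(url));
        System.out.println("fragment:  " + fragment(url));
    }
}
```

Note that the search parameters sit inside the fragment, so they never reach the server either. If that is what happens here, a browser-driven tool such as the Selenium code in the question is the more suitable approach, provided the lookup uses the ID that actually appears in the rendered page.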