2016-04-15 2 views

Answer

1

The comments are in divs with the class comment-renderer-text-content:

from selenium.webdriver.common.by import By

for elem in browser.find_elements(By.XPATH, '//div[@class="comment-renderer-text-content"]'):
    print(elem.text)

Which gives you:

great stuff man. question: why use selenium for this site when the data you're looking for is in the source code and could be scraped with requests/beautifulsoup? disclaimer: i'm commenting a year later so the source code may be different :) 
Good question, if the data is in source you're right, selenium is overkill. I use selenium when I find it quicker to not have to reverse engineer a site looking for sever calls which return json data that only exists inside the browser etc... So the bottom line is if you're really crafty picking off JSON calls to the server and replicating that without needing to have the DOM built for you than it's a much better be to use BeautifulSoup or Python Requests. However if you're creating for instance an automated program to automatically pin, like stuff on facebook etc... you will most likely not be able to pull that off very easily just using BeautifulSoup. 
Answered my questions very well. 
Great job! I do have a questions though. What if the site is built in silverlight? Then I cannot see the Xpath of each element... 
the first test was slow because of a slow loading adserver, you can see it in firefox at the bottom bar. 
This is good stuff. 
Clear and useful although i'm using java. Thx 
YOU ARE BETTER THAN A PROFESSIONAL TEACHER MAN!!!.. 
thanks man 
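
One detail of the XPath above is that `@class="..."` compares the whole class attribute, so a div carrying any extra class would not match. A minimal sketch of that behaviour using only the standard library; the markup here is made up for illustration (including the hypothetical `expanded` class) and is not YouTube's real HTML:

```python
import xml.etree.ElementTree as ET

# Made-up markup for illustration only, not the real page source.
html = """
<body>
  <div class="comment-renderer-text-content">This is good stuff.</div>
  <div class="comment-renderer-text-content expanded">thanks man</div>
  <div class="other">not a comment</div>
</body>
""".strip()

root = ET.fromstring(html)

# Same predicate as in the answer; ElementTree needs the leading "." for a relative search.
xpath = './/div[@class="comment-renderer-text-content"]'
matches = [div.text for div in root.findall(xpath)]
print(matches)
```

Only the first div is selected here, because the second one has an additional class in its attribute value.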

The comments are loaded dynamically, so you may need to wait for the presence of the elements:

from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 


def wait(dr, x):
    # Wait up to 20 seconds until at least one element matching the XPath is present in the DOM.
    elements = WebDriverWait(dr, 20).until(
        EC.presence_of_all_elements_located((By.XPATH, x))
    )
    return elements


from selenium import webdriver 

browser = webdriver.Firefox() 
browser.get("https://www.youtube.com/watch?v=a6NhKKl-iR0") 

for elem in wait(browser, '//div[@class="comment-renderer-text-content"]'):
    print(elem.text)
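
Conceptually, an explicit wait like the one above is a polling loop: check a condition, sleep briefly, and give up after a timeout. The sketch below shows that idea in plain Python. It is a simplified illustration, not Selenium's actual implementation; `poll_until` and `FakePage` are hypothetical names introduced here.

```python
import time


def poll_until(condition, timeout=20.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(interval)


# Hypothetical stand-in for a page whose comments only appear on the third lookup.
class FakePage:
    def __init__(self):
        self.calls = 0

    def find_comments(self):
        self.calls += 1
        return ["thanks man"] if self.calls >= 3 else []


page = FakePage()
comments = poll_until(page.find_comments, timeout=5, interval=0.01)
print(comments)
```

If the condition never becomes truthy, the helper raises instead of returning an empty result, which makes "nothing was printed" easier to tell apart from "the elements never appeared".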

I used your code, but it didn't display anything.


Is this because of a proxy problem? Could you tell me?


@VinayakumarR, you most likely just need to wait. I edited the answer: the comments load after the page has loaded.
