Using XPath to extract node attributes when both the node and the attribute have colons in their identifiers
Using xpathSApply in R, I am trying to get the URL from the edgar:url attribute:
<edgar:xbrlFile edgar:sequence="3" edgar:file="edgr-2004_10k.xml" edgar:type="EX-100.INS" edgar:size="25257" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
I have tried several variations of the following:
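library(RCurl)  # provides getURL()
library(XML)    # provides xmlParse(), xpathSApply(), xmlValue()
url <- "http://www.sec.gov/Archives/edgar/monthly/xbrlrss-2005-04.xml"
data <- getURL(url)
doc <- xmlParse(data)
# This attempt returns the text content of edgar:xbrlFiling, not the edgar:url attributes
url <- xpathSApply(doc, "//item/*[name()='edgar:xbrlFiling']", xmlValue)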
Below is an example item from the URL given in the code above:
<item>
<title>EDGAR ONLINE INC (0001080224) (Filer)</title>
<link>http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/0001275287-05-001434-index.htm</link>
<description>8-K</description>
<pubDate>Mon, 25 Apr 2005 15:15:09 EDT</pubDate>
<edgar:xbrlFiling xmlns:edgar="http://www.sec.gov/Archives/edgar">
<edgar:companyName>EDGAR ONLINE INC</edgar:companyName>
<edgar:formType>8-K</edgar:formType>
<edgar:filingDate>04/25/2005</edgar:filingDate>
<edgar:cikNumber>0001080224</edgar:cikNumber>
<edgar:accessionNumber>0001275287-05-001434</edgar:accessionNumber>
<edgar:fileNumber>001-32194</edgar:fileNumber>
<edgar:acceptanceDatetime>20050425151509</edgar:acceptanceDatetime>
<edgar:period>20050425</edgar:period>
<edgar:assistantDirector>2 &amp; 3</edgar:assistantDirector>
<edgar:assignedSic>7389</edgar:assignedSic>
<edgar:fiscalYearEnd>1204</edgar:fiscalYearEnd>
<edgar:xbrlFiles>
<edgar:xbrlFile edgar:sequence="1" edgar:file="eo2425.txt" edgar:type="8-K" edgar:size="5282" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425.txt" />
<edgar:xbrlFile edgar:sequence="2" edgar:file="eo2425ex991.txt" edgar:type="EX-99.1" edgar:size="4469" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425ex991.txt" />
<edgar:xbrlFile edgar:sequence="3" edgar:file="edgr-2004_10k.xml" edgar:type="EX-100.INS" edgar:size="25257" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
<edgar:xbrlFile edgar:sequence="4" edgar:file="edgr-20050228.xsd" edgar:type="EX-100.SCH" edgar:size="12111" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228.xsd" />
<edgar:xbrlFile edgar:sequence="5" edgar:file="edgr-20050228_cal.xml" edgar:type="EX-100.CAL" edgar:size="18069" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_cal.xml" />
<edgar:xbrlFile edgar:sequence="6" edgar:file="edgr-20050228_lab.xml" edgar:type="EX-100.LAB" edgar:size="51434" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_lab.xml" />
<edgar:xbrlFile edgar:sequence="7" edgar:file="edgr-20050228_pre.xml" edgar:type="EX-100.PRE" edgar:size="27275" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_pre.xml" />
</edgar:xbrlFiles>
</edgar:xbrlFiling>
</item>
<item>
Possible duplicate: http://stackoverflow.com/a/25316044/423105 – LarsH
Not a duplicate. The point of this question is that both the node and the attribute have a colon in their names. – Optimus
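One possible approach, offered as a sketch rather than a confirmed answer: the colon in edgar:xbrlFile and edgar:url marks a namespace prefix, so the prefix can be bound to its URI through the namespaces argument of xpathSApply, and the attribute can be selected directly with @edgar:url. The snippet below inlines a trimmed copy of the item above (an assumption made so it runs offline; the real feed would be parsed as in the question), and the variable names ns, urls and instance_url are illustrative only.

```r
library(XML)

# Trimmed copy of the <item> from the question, inlined so that the
# sketch does not depend on network access (assumption for illustration).
xml_text <- '<rss><channel><item>
  <title>EDGAR ONLINE INC (0001080224) (Filer)</title>
  <edgar:xbrlFiling xmlns:edgar="http://www.sec.gov/Archives/edgar">
    <edgar:xbrlFiles>
      <edgar:xbrlFile edgar:sequence="1" edgar:type="8-K" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425.txt" />
      <edgar:xbrlFile edgar:sequence="3" edgar:type="EX-100.INS" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
    </edgar:xbrlFiles>
  </edgar:xbrlFiling>
</item></channel></rss>'

doc <- xmlParse(xml_text, asText = TRUE)

# Bind the "edgar" prefix used in the XPath to the namespace URI declared
# on edgar:xbrlFiling. The prefix in the XPath only has to match this map.
ns <- c(edgar = "http://www.sec.gov/Archives/edgar")

# Select the edgar:url attribute of every edgar:xbrlFile element.
urls <- unname(unlist(
  xpathSApply(doc, "//edgar:xbrlFile/@edgar:url", namespaces = ns)
))

# Narrow the selection to the XBRL instance document via edgar:type.
instance_url <- unname(unlist(
  xpathSApply(doc,
              "//edgar:xbrlFile[@edgar:type='EX-100.INS']/@edgar:url",
              namespaces = ns)
))

print(urls)
print(instance_url)
```

Passing the prefix map explicitly avoids relying on where the feed happens to declare xmlns:edgar; in the sample item it sits on edgar:xbrlFiling rather than on the root element.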