2013-12-24 3 views
1

Decoding JSON in Python

I'm new to Python, coming from PHP, so I'm having some trouble decoding the JSON in this script:

import json 
import mechanize 

# Create a dict of extensions and their page counts
extensions = {'com':'512','net':'55','co':'21','org':'62'}

# Loop through the extensions dict
for (ext, pages) in extensions.items():
    # Set up lists and counters for later use
    visited_urls = []
    found = 0
    member = 0
    not_found = 0
    repeated_url = 0
    added = 0
    # Loop over the page numbers
    page_number = 1
    while page_number <= 1:
        # Set the target URL
        target = "http://punkspider.hyperiongray.com/service/search/domain/?searchkey=url&searchvalue=."+ext+"&pagesize=10&pagenumber="+str(page_number)+"&filtertype=AND&sqli=1"
        br = mechanize.Browser()
        html = br.open(target).read()
        json_data = json.loads(html)
        for (key, val) in json_data.items():
            print val['id']
        page_number += 1

The target is just a JSON page that the service generates from the search query. Here is the JSON:

{"data":{"numberOfPages":512,"domainSummaryDTOs":[{"id":"http://www.cdfdmy.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"【推荐】成都空压机|四川空压机|成都空气压缩机|四川空气压缩机|成都螺杆空压机|四川螺杆空压机|成都双螺杆空压机|四川双螺杆空压机|成都福道贸易有限公司","exploitabilityLevel":4,"bsqli":2,"sqli":2,"url":"http://www.cdfdmy.com/","xss":0},{"id":"http://www.chushijob.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"餐饮世界人才网-中国厨师人才网-中国酒店人才网","exploitabilityLevel":5,"bsqli":3,"sqli":2,"url":"http://www.chushijob.com/","xss":2},{"id":"http://www.hbenshi.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"恩施旅游网--恩施大峡谷 腾龙洞 利川 清江闯滩 土司城 欢迎您!","exploitabilityLevel":5,"bsqli":3,"sqli":4,"url":"http://www.hbenshi.com/","xss":1},{"id":"http://bbs.laiyb.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"莱阳论坛_莱阳吧_莱阳人的网络社区 -","exploitabilityLevel":4,"bsqli":4,"sqli":1,"url":"http://bbs.laiyb.com/","xss":0},{"id":"http://photostudio-town.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"フォトスタジオ・タウン-就職証明写真・お受験写真・オーディション写真-","exploitabilityLevel":5,"bsqli":1,"sqli":1,"url":"http://photostudio-town.com/","xss":1},{"id":"http://sp.sosfang.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"上海商铺出租/转让,上海门面房出租信息/上海门面转让-上海商铺网","exploitabilityLevel":2,"bsqli":0,"sqli":1,"url":"http://sp.sosfang.com/","xss":0},{"id":"http://www.msdssafe.com/","timestamp":"Sat Apr 06 11:03:33 GMT 2013","title":"MSDS查询网 英文MSDS查询网 MSDS MSDS报告 MSDS下载 msds是什么意思 MSDS安全网","exploitabilityLevel":4,"bsqli":15,"sqli":3,"url":"http://www.msdssafe.com/","xss":0},{"id":"http://www.tiananjidian.com/","timestamp":"Sat Apr 06 11:15:03 GMT 2013","title":"上海精工阀门厂总代理★上海精工阀门|上工牌阀门|精工阀门厂|上海阀门|精工阀门|广东阀门|广州阀门|惠州阀门|东莞阀门|佛山阀门|深圳阀门|中山阀门|潮州阀门|珠海阀门|河源阀门|汕头阀门|肇庆阀门|","exploitabilityLevel":3,"bsqli":0,"sqli":2,"url":"http://www.tiananjidian.com/","xss":1},{"id":"http://www.ywscocie.com/","timestamp":"Sat Apr 06 11:20:46 GMT 
2013","title":"","exploitabilityLevel":2,"bsqli":0,"sqli":2,"url":"http://www.ywscocie.com/","xss":0},{"id":"http://bookingsbarbados.com/","timestamp":"Wed May 15 00:54:31 GMT 2013","title":"Bookings Caribbean | Barbados Bookings Center. Book barbados Hotels and Activities. Search, tourism ","exploitabilityLevel":5,"bsqli":4,"sqli":2,"url":"http://bookingsbarbados.com/","xss":18}],"rowsFound":5115,"qTime":1}} 

I'm trying to get the 'id' key from the JSON, which holds a URL; however, I get an error saying that the key 'id' does not exist.

+0

Since you would get exactly the same behavior in PHP, JavaScript, or any other language, I'm not sure why you think the differences between Python and PHP matter here. – abarnert

+0

@abarnert In PHP I could just run a foreach loop and give value => value['id'] – user3051232

+0

And in Python you can just run a 'for' loop and use 'value['id']'. It's exactly the same. The problem isn't that you don't know how to write loops in Python; it's that you're looping over the wrong thing. – abarnert

Answers

3

Indeed, the only top-level key that exists is 'data', and the value associated with that key has no 'id' key:

>>> json_data['data'].keys() 
[u'numberOfPages', u'domainSummaryDTOs', u'rowsFound', u'qTime'] 
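
To see why the loop in the question fails, here is a minimal Python 3 sketch (the thread itself uses Python 2 `print` statements). `SAMPLE` is a trimmed stand-in for the response, assumed to have the same nesting, and `try_original_loop` is a hypothetical helper that mimics the question's loop:

```python
import json

# Trimmed stand-in for the service response (assumption: same nesting as the real one).
SAMPLE = ('{"data": {"numberOfPages": 512, '
          '"domainSummaryDTOs": [{"id": "http://www.cdfdmy.com/"}], '
          '"rowsFound": 5115, "qTime": 1}}')

json_data = json.loads(SAMPLE)

# The decoded object has exactly one top-level key.
top_level_keys = list(json_data.keys())


def try_original_loop(decoded):
    """Mimic the question's loop and report the missing key, if any."""
    try:
        # val is the whole 'data' dict here, not one of the entries.
        return [val['id'] for key, val in decoded.items()]
    except KeyError as exc:
        return 'KeyError: %s' % exc


result = try_original_loop(json_data)
print(top_level_keys)
print(result)
```

The loop visits a single pair, `('data', {...})`, so `val['id']` is looked up on the wrong dictionary.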

The id keys are found in the json_data['data']['domainSummaryDTOs'] list of dictionaries:

for entry in json_data['data']['domainSummaryDTOs']: 
    print entry['id'] 

Demo:

>>> import json 
>>> json_data = json.loads('''{"data":{"numberOfPages":512,"domainSummaryDTOs":[{"id":"http://www.cdfdmy.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"【推荐】成都空压机|四川空压机|成都空气压缩机|四川空气压缩机|成都螺杆空压机|四川螺杆空压机|成都双螺杆空压 机|四川双螺杆空压机|成都福道贸易有限公司","exploitabilityLevel":4,"bsqli":2,"sqli":2,"url":"http://www.cdfdmy.com/","xss":0},{"id":"http://www.chushijob.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"餐饮世界人才网-中国厨师人才网-中国酒店人才网","exploitabilityLevel":5,"bsqli":3,"sqli":2,"url":"http://www.chushijob.com/","xss":2},{"id":"http://www.hbenshi.com/","timestamp":"Tue May 14 12:59:28 GMT 2013","title":"恩施旅游网--恩施大峡谷 腾龙洞 利川 清江闯滩 土司城 欢迎您!","exploitabilityLevel":5,"bsqli":3,"sqli":4,"url":"http://www.hbenshi.com/","xss":1},{"id":"http://bbs.laiyb.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"莱阳论坛_莱阳吧_莱阳人的网络社区 -","exploitabilityLevel":4,"bsqli":4,"sqli":1,"url":"http://bbs.laiyb.com/","xss":0},{"id":"http://photostudio-town.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"フォトスタジオ・タウン-就職証明写真・お受験写真・オーディション写真-","exploitabilityLevel":5,"bsqli":1,"sqli":1,"url":"http://photostudio-town.com/","xss":1},{"id":"http://sp.sosfang.com/","timestamp":"Mon Apr 29 03:30:09 GMT 2013","title":"上海商铺出租/转让,上海门面房出租信息/上海门面转让-上海商铺网","exploitabilityLevel":2,"bsqli":0,"sqli":1,"url":"http://sp.sosfang.com/","xss":0},{"id":"http://www.msdssafe.com/","timestamp":"Sat Apr 06 11:03:33 GMT 2013","title":"MSDS查 询网 英文MSDS查询网 MSDS MSDS报告 MSDS下载 msds是什么意思 MSDS安全网","exploitabilityLevel":4,"bsqli":15,"sqli":3,"url":"http://www.msdssafe.com/","xss":0},{"id":"http://www.tiananjidian.com/","timestamp":"Sat Apr 06 11:15:03 GMT 2013","title":"上海精工阀门厂总代理★上海精工阀门|上工牌阀门|精工阀门厂|上海阀门|精工阀门|广东阀门|广州阀门|惠州阀门|东莞阀门|佛山阀门|深圳阀门|中山阀门|潮州阀门|珠海阀门|河源阀门|汕头阀门|肇庆阀门|","exploitabilityLevel":3,"bsqli":0,"sqli":2,"url":"http://www.tiananjidian.com/","xss":1},{"id":"http://www.ywscocie.com/","timestamp":"Sat Apr 06 11:20:46 GMT 
2013","title":"","exploitabilityLevel":2,"bsqli":0,"sqli":2,"url":"http://www.ywscocie.com/","xss":0},{"id":"http://bookingsbarbados.com/","timestamp":"Wed May 15 00:54:31 GMT 2013","title":"Bookings Caribbean | Barbados Bookings Center. Book barbados Hotels and Activities. Search, tourism ","exploitabilityLevel":5,"bsqli":4,"sqli":2,"url":"http://bookingsbarbados.com/","xss":18}],"rowsFound":5115,"qTime":1}} 
... ''') 
>>> for entry in json_data['data']['domainSummaryDTOs']: 
...  print entry['id'] 
... 
http://www.cdfdmy.com/ 
http://www.chushijob.com/ 
http://www.hbenshi.com/ 
http://bbs.laiyb.com/ 
http://photostudio-town.com/ 
http://sp.sosfang.com/ 
http://www.msdssafe.com/ 
http://www.tiananjidian.com/ 
http://www.ywscocie.com/ 
http://bookingsbarbados.com/ 
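
If the service can also return a payload without these keys (an assumption; the thread shows only a successful response), a more defensive Python 3 variant might look like the sketch below. `extract_ids` is a hypothetical helper name, not something from the question:

```python
import json


def extract_ids(decoded):
    """Return the 'id' of every entry under data -> domainSummaryDTOs.

    Missing keys yield an empty list instead of raising KeyError.
    """
    entries = decoded.get('data', {}).get('domainSummaryDTOs', [])
    return [entry['id'] for entry in entries if 'id' in entry]


# Trimmed sample with the same nesting as the response in the question.
sample = json.loads(
    '{"data": {"domainSummaryDTOs": ['
    '{"id": "http://www.cdfdmy.com/"}, '
    '{"id": "http://www.chushijob.com/"}]}}'
)

ids = extract_ids(sample)
empty = extract_ids({})  # no 'data' key at all

for url in ids:
    print(url)
```

Whether silently returning an empty list is better than failing loudly depends on how the caller wants to treat an unexpected response.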

It usually helps to format your JSON as a more readable tree first. You can use the online JSONLint service, or you can run Python's json.tool module on a file from the command line:

python -m json.tool filename.json 

For your input, JSONLint produces:

{ 
    "data": { 
     "numberOfPages": 512, 
     "domainSummaryDTOs": [ 
      { 
       "id": "http://www.cdfdmy.com/", 
       "timestamp": "Tue May 14 12:59:28 GMT 2013", 
       "title": "【推荐】成都空压机|四川空压机|成都空气压缩机|四川空气压缩机|成都螺杆空压机|四川螺杆空压机|成都双螺杆空压机|四川双螺杆空压机|成都福道贸易有限公司", 
       "exploitabilityLevel": 4, 
       "bsqli": 2, 
       "sqli": 2, 
       "url": "http://www.cdfdmy.com/", 
       "xss": 0 
      }, 
      { 
       "id": "http://www.chushijob.com/", 
       "timestamp": "Tue May 14 12:59:28 GMT 2013", 
       "title": "餐饮世界人才网-中国厨师人才网-中国酒店人才网", 
       "exploitabilityLevel": 5, 
       "bsqli": 3, 
       "sqli": 2, 
       "url": "http://www.chushijob.com/", 
       "xss": 2 
      }, 
      { 
       "id": "http://www.hbenshi.com/", 
       "timestamp": "Tue May 14 12:59:28 GMT 2013", 
       "title": "恩施旅游网--恩施大峡谷 腾龙洞 利川 清江闯滩 土司城 欢迎您!", 
       "exploitabilityLevel": 5, 
       "bsqli": 3, 
       "sqli": 4, 
       "url": "http://www.hbenshi.com/", 
       "xss": 1 
      }, 
      { 
       "id": "http://bbs.laiyb.com/", 
       "timestamp": "Mon Apr 29 03:30:09 GMT 2013", 
       "title": "莱阳论坛_莱阳吧_莱阳人的网络社区 -", 
       "exploitabilityLevel": 4, 
       "bsqli": 4, 
       "sqli": 1, 
       "url": "http://bbs.laiyb.com/", 
       "xss": 0 
      }, 
      { 
       "id": "http://photostudio-town.com/", 
       "timestamp": "Mon Apr 29 03:30:09 GMT 2013", 
       "title": "フォトスタジオ・タウン-就職証明写真・お受験写真・オーディション写真-", 
       "exploitabilityLevel": 5, 
       "bsqli": 1, 
       "sqli": 1, 
       "url": "http://photostudio-town.com/", 
       "xss": 1 
      }, 
      { 
       "id": "http://sp.sosfang.com/", 
       "timestamp": "Mon Apr 29 03:30:09 GMT 2013", 
       "title": "上海商铺出租/转让,上海门面房出租信息/上海门面转让-上海商铺网", 
       "exploitabilityLevel": 2, 
       "bsqli": 0, 
       "sqli": 1, 
       "url": "http://sp.sosfang.com/", 
       "xss": 0 
      }, 
      { 
       "id": "http://www.msdssafe.com/", 
       "timestamp": "Sat Apr 06 11:03:33 GMT 2013", 
       "title": "MSDS查询网 英文MSDS查询网 MSDS MSDS报告 MSDS下载 msds是什么意思 MSDS安全网", 
       "exploitabilityLevel": 4, 
       "bsqli": 15, 
       "sqli": 3, 
       "url": "http://www.msdssafe.com/", 
       "xss": 0 
      }, 
      { 
       "id": "http://www.tiananjidian.com/", 
       "timestamp": "Sat Apr 06 11:15:03 GMT 2013", 
       "title": "上海精工阀门厂总代理★上海精工阀门|上工牌阀门|精工阀门厂|上海阀门|精工阀门|广东阀门|广州阀门|惠州阀门|东莞阀门|佛山阀门|深圳阀门|中山阀门|潮州阀门|珠海阀门|河源阀门|汕头阀门|肇庆阀门|", 
       "exploitabilityLevel": 3, 
       "bsqli": 0, 
       "sqli": 2, 
       "url": "http://www.tiananjidian.com/", 
       "xss": 1 
      }, 
      { 
       "id": "http://www.ywscocie.com/", 
       "timestamp": "Sat Apr 06 11:20:46 GMT 2013", 
       "title": "", 
       "exploitabilityLevel": 2, 
       "bsqli": 0, 
       "sqli": 2, 
       "url": "http://www.ywscocie.com/", 
       "xss": 0 
      }, 
      { 
       "id": "http://bookingsbarbados.com/", 
       "timestamp": "Wed May 15 00:54:31 GMT 2013", 
       "title": "Bookings Caribbean | Barbados Bookings Center. Book barbados Hotels and Activities. Search, tourism ", 
       "exploitabilityLevel": 5, 
       "bsqli": 4, 
       "sqli": 2, 
       "url": "http://bookingsbarbados.com/", 
       "xss": 18 
      } 
     ], 
     "rowsFound": 5115, 
     "qTime": 1 
    } 
} 

which is perhaps a little easier to decipher.
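
The same kind of pretty-printing can also be done from inside a script with `json.dumps`. This is a Python 3 sketch on a trimmed sample; `ensure_ascii=False` is used so that non-ASCII titles stay readable instead of being written as `\uXXXX` escapes:

```python
import json

# Trimmed sample with the same nesting as the response in the question.
raw = ('{"data":{"domainSummaryDTOs":[{"id":"http://www.chushijob.com/",'
       '"title":"餐饮世界人才网"}],"qTime":1}}')

decoded = json.loads(raw)

# indent=4 produces one key per line; ensure_ascii=False keeps CJK text as-is.
pretty = json.dumps(decoded, indent=4, ensure_ascii=False)
print(pretty)
```

Once the nesting is visible, it is much easier to spot which level actually holds the 'id' keys.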

+0

Thanks. I upvoted your answer but accepted the other one only because I saw it first. Thanks anyway, it worked perfectly – user3051232

+0

I ended up voting for yours just because of all the extra good information you included – user3051232

+0

Thanks! If it makes you feel any better, I was 30 seconds faster in posting my answer. :-P –

1

Your 'id' elements are in a list inside domainSummaryDTOs:

print json_data['data']['domainSummaryDTOs'][0]['id']

... will get you the first element. You need to fix your loop to follow this structure:

for item in json_data['data']['domainSummaryDTOs']: 
    print item['id'] 
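
Put back into the question's page loop, the corrected structure might look like this Python 3 sketch. `build_url`, `collect_ids`, and `fake_fetch` are hypothetical names; `fake_fetch` stands in for the `mechanize` call so the example runs without network access, and the query parameters are copied from the question's URL:

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://punkspider.hyperiongray.com/service/search/domain/"


def build_url(ext, page_number):
    """Build the search URL with the same parameters as in the question."""
    params = [
        ("searchkey", "url"),
        ("searchvalue", "." + ext),
        ("pagesize", 10),
        ("pagenumber", page_number),
        ("filtertype", "AND"),
        ("sqli", 1),
    ]
    return BASE_URL + "?" + urlencode(params)


def collect_ids(ext, fetch_page, max_pages=1):
    """Collect every 'id' from pages 1..max_pages.

    fetch_page is any callable that takes a URL and returns the JSON text.
    """
    ids = []
    for page_number in range(1, max_pages + 1):
        decoded = json.loads(fetch_page(build_url(ext, page_number)))
        for item in decoded['data']['domainSummaryDTOs']:
            ids.append(item['id'])
    return ids


def fake_fetch(url):
    """Stand-in for a real HTTP request; returns a fixed one-entry page."""
    return '{"data": {"domainSummaryDTOs": [{"id": "http://www.cdfdmy.com/"}]}}'


url = build_url("com", 1)
ids = collect_ids("com", fake_fetch, max_pages=2)
print(url)
print(ids)
```

Passing the fetch function in as an argument keeps the parsing logic testable; in real use it would wrap whatever HTTP client the script already relies on.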
+0

Thanks man, that worked great – user3051232
