2016-05-03 5 views

Pandas: removing duplicate rows

I have the following DataFrame:

import pandas as pd

url = 'https://raw.githubusercontent.com/108michael/ms_thesis/master/crsp.dime.mpl.abbridged'

zz = pd.read_csv(url)
zz.head(30)

    date feccandid feccandcfscore.dyn pacid paccfscore cid  catcode  type_x di amtsum state log_diff_unemployment party type_y bills years_exp disposition  billsum 
0 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
1 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
2 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
3 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
4 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
5 2006 S8NV00073 0.496 C00000422 0.330 N00006619 H1100 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  3 
6 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
7 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
8 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
9 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
10 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
11 2006 S8NV00073 0.496 C00375360 0.176 N00006619 H1100 24K  D 4500 NV -0.024693 Republican rep  s22-109  12 support  3 
12 2006 S8NV00073 0.496 C00113803 0.269 N00006619 H1130 24K  D 2500 NV -0.024693 Republican rep  s22-109  12 support  2 
13 2006 S8NV00073 0.496 C00113803 0.269 N00006619 H1130 24K  D 2500 NV -0.024693 Republican rep  s22-109  12 support  2 
14 2006 S8NV00073 0.496 C00249342 0.421 N00006619 H1130 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  2 
15 2006 S8NV00073 0.496 C00249342 0.421 N00006619 H1130 24K  D 5000 NV -0.024693 Republican rep  s22-109  12 support  2 

Some of the rows are exact copies of one another. Is there a way to remove the duplicate rows?

Answer


I think you can use drop_duplicates:

print(zz.drop_duplicates())
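
To see what this does without downloading the file, here is a minimal sketch on a small made-up frame. The values and the names df, deduped, and by_catcode are illustrative assumptions, not taken from the question's data. Note that drop_duplicates returns a new DataFrame and keeps the original index labels, so you will probably want to assign the result and, if you like, reset the index.

```python
import pandas as pd

# Toy frame standing in for the question's data (hypothetical values).
df = pd.DataFrame({
    "pacid": ["C00000422", "C00000422", "C00375360", "C00375360", "C00113803"],
    "amtsum": [5000, 5000, 4500, 4500, 2500],
    "catcode": ["H1100", "H1100", "H1100", "H1100", "H1130"],
})

# Drop rows that are identical across all columns; the first occurrence is kept.
deduped = df.drop_duplicates()

# Renumber the index so it runs 0..n-1 again.
deduped = deduped.reset_index(drop=True)

# Compare only selected columns: keep the first row seen for each catcode.
by_catcode = df.drop_duplicates(subset=["catcode"], keep="first")

print(deduped)
print(by_catcode)
```

With no arguments, all columns are compared, which is the case asked about here. The subset and keep parameters only matter if rows should count as duplicates based on some of the columns.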

That was embarrassing! For some reason I thought 'drop_duplicates' would only work on unique columns.
