2016-07-21 2 views

pandas: merging two multi-level Series

I have two Series with a multi-level index and would like to merge them on both index levels. The first Series looks like this:

                          # of restaurants
BORO     CUISINE
BRONX    American                      425
         Chinese                       330
         Pizza                         206
BROOKLYN American                     1254
         Chinese                       750
         Cafe/Coffee/Tea               350

The second one has more rows and looks like this:

                          # of votes
BORO     CUISINE
BRONX    American               2425
         Caribbean               320
         Chinese                3130
         Pizza                  3336
BROOKLYN American              21254
         Caribbean              2320
         Chinese                7250
         Cafe/Coffee/Tea        3350
         Pizza                 13336

Answer


Setup:

import pandas as pd

s1 = pd.Series({('BRONX', 'American'): 425, ('BROOKLYN', 'Chinese'): 750, ('BROOKLYN', 'Cafe/Coffee/Tea'): 350, ('BRONX', 'Pizza'): 206, ('BROOKLYN', 'American'): 1254, ('BRONX', 'Chinese'): 330})
s2 = pd.Series({('BRONX', 'Caribbean'): 320, ('BRONX', 'American'): 2425, ('BROOKLYN', 'Chinese'): 7250, ('BROOKLYN', 'Cafe/Coffee/Tea'): 3350, ('BRONX', 'Pizza'): 3336, ('BROOKLYN', 'American'): 21254, ('BROOKLYN', 'Pizza'): 13336, ('BRONX', 'Chinese'): 3130, ('BROOKLYN', 'Caribbean'): 2320}) 
s1 = s1.rename_axis(['BORO','CUISINE']).rename('restaurants') 
s2 = s2.rename_axis(['BORO','CUISINE']).rename('votes') 


print(s1)
BORO     CUISINE
BRONX    American            425
         Chinese             330
         Pizza               206
BROOKLYN American           1254
         Chinese             750
         Cafe/Coffee/Tea     350
Name: restaurants, dtype: int64

print(s2)
BORO     CUISINE
BRONX    American            2425
         Caribbean            320
         Chinese             3130
         Pizza               3336
BROOKLYN American           21254
         Caribbean           2320
         Chinese             7250
         Cafe/Coffee/Tea     3350
         Pizza              13336
Name: votes, dtype: int64

Use concat with the join parameter if you need an inner join:

print(pd.concat([s1, s2], axis=1, join='inner'))
                          restaurants  votes
BORO     CUISINE
BRONX    American                 425   2425
         Chinese                  330   3130
         Pizza                    206   3336
BROOKLYN American                1254  21254
         Cafe/Coffee/Tea          350   3350
         Chinese                  750   7250

# join='outer' is the default, so it can be omitted
print(pd.concat([s1, s2], axis=1))
                          restaurants  votes
BORO     CUISINE
BRONX    American               425.0   2425
         Caribbean                NaN    320
         Chinese                330.0   3130
         Pizza                  206.0   3336
BROOKLYN American              1254.0  21254
         Cafe/Coffee/Tea        350.0   3350
         Caribbean                NaN   2320
         Chinese                750.0   7250
         Pizza                    NaN  13336
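
The outer join above shows restaurants as floats because NaN cannot be stored in an int64 column. The following is a minimal sketch on a shortened copy of the data (the reduced Series and the conversion to the nullable Int64 dtype are illustrative additions, not part of the original answer, and assume pandas 1.0 or later):

```python
import pandas as pd

# Shortened versions of s1 and s2, built the same way as in the setup.
s1 = pd.Series(
    {("BRONX", "American"): 425, ("BRONX", "Chinese"): 330,
     ("BROOKLYN", "American"): 1254},
    name="restaurants",
).rename_axis(["BORO", "CUISINE"])
s2 = pd.Series(
    {("BRONX", "American"): 2425, ("BRONX", "Caribbean"): 320,
     ("BRONX", "Chinese"): 3130, ("BROOKLYN", "American"): 21254},
    name="votes",
).rename_axis(["BORO", "CUISINE"])

# Inner join keeps only (BORO, CUISINE) pairs present in both Series.
inner = pd.concat([s1, s2], axis=1, join="inner")
# Outer join (the default) keeps every pair and fills the gaps with NaN.
outer = pd.concat([s1, s2], axis=1)
print(len(inner), len(outer))

# NaN forces 'restaurants' to float64 in the outer result.
print(outer["restaurants"].dtype)

# Optional: the nullable integer dtype keeps whole numbers and marks gaps as <NA>.
outer["restaurants"] = outer["restaurants"].astype("Int64")
print(outer)
```

If the missing values should count as zero instead, `fillna(0).astype(int)` on that column is another option, at the cost of no longer distinguishing "no data" from "zero restaurants".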

Another solution is to use merge with reset_index:

# how='inner' is the default, so it can be omitted
print(pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO', 'CUISINE']))
       BORO          CUISINE  restaurants  votes
0     BRONX         American          425   2425
1     BRONX          Chinese          330   3130
2     BRONX            Pizza          206   3336
3  BROOKLYN         American         1254  21254
4  BROOKLYN          Chinese          750   7250
5  BROOKLYN  Cafe/Coffee/Tea          350   3350

# outer join
print(pd.merge(s1.reset_index(), s2.reset_index(), on=['BORO', 'CUISINE'], how='outer'))
       BORO          CUISINE  restaurants  votes
0     BRONX         American        425.0   2425
1     BRONX          Chinese        330.0   3130
2     BRONX            Pizza        206.0   3336
3  BROOKLYN         American       1254.0  21254
4  BROOKLYN          Chinese        750.0   7250
5  BROOKLYN  Cafe/Coffee/Tea        350.0   3350
6     BRONX        Caribbean          NaN    320
7  BROOKLYN        Caribbean          NaN   2320
8  BROOKLYN            Pizza          NaN  13336
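
Note that merge returns a flat frame with a default integer index, whereas concat keeps the two-level index. If the (BORO, CUISINE) index is wanted back after merging, something like the following sketch should work; the shortened Series and the final `sort_index()` call are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Shortened versions of s1 and s2, built the same way as in the setup.
s1 = pd.Series(
    {("BRONX", "American"): 425, ("BRONX", "Chinese"): 330,
     ("BROOKLYN", "American"): 1254},
    name="restaurants",
).rename_axis(["BORO", "CUISINE"])
s2 = pd.Series(
    {("BRONX", "American"): 2425, ("BRONX", "Caribbean"): 320,
     ("BRONX", "Chinese"): 3130, ("BROOKLYN", "American"): 21254},
    name="votes",
).rename_axis(["BORO", "CUISINE"])

# reset_index turns the index levels into ordinary columns to merge on.
merged = pd.merge(
    s1.reset_index(), s2.reset_index(), on=["BORO", "CUISINE"], how="outer"
)

# Rebuild the two-level index from the key columns and sort it for lookups.
merged = merged.set_index(["BORO", "CUISINE"]).sort_index()
print(merged)

# Rows can again be addressed by a (BORO, CUISINE) pair.
print(merged.loc[("BRONX", "American"), "votes"])
```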