风格(Idioms)

风格(Idioms)

These are some neat pandas idioms

if-then/if-then-else on one column, and assignment to another one or more columns:

In [1]: df = pd.DataFrame(
   ...:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ...: 
Out[1]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50

if-then…

An if-then on one column

In [2]: df.loc[df.AAA >= 5,'BBB'] = -1; df
Out[2]: 
   AAA  BBB  CCC
0    4   10  100
1    5   -1   50
2    6   -1  -30
3    7   -1  -50

An if-then with assignment to 2 columns:

In [3]: df.loc[df.AAA >= 5,['BBB','CCC']] = 555; df
Out[3]: 
   AAA  BBB  CCC
0    4   10  100
1    5  555  555
2    6  555  555
3    7  555  555

Add another line with different logic, to do the -else

In [4]: df.loc[df.AAA < 5,['BBB','CCC']] = 2000; df
Out[4]: 
   AAA   BBB   CCC
0    4  2000  2000
1    5   555   555
2    6   555   555
3    7   555   555

Or use pandas where after you’ve set up a mask

In [5]: df_mask = pd.DataFrame({'AAA' : [True] * 4, 'BBB' : [False] * 4,'CCC' : [True,False] * 2})
In [6]: df.where(df_mask,-1000)
Out[6]: 
   AAA   BBB   CCC
0    4 -1000  2000
1    5 -1000 -1000
2    6 -1000   555
3    7 -1000 -1000

if-then-else using numpy’s where()

In [7]: df = pd.DataFrame(
   ...:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ...: 
Out[7]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50
In [8]: df['logic'] = np.where(df['AAA'] > 5,'high','low'); df
Out[8]: 
   AAA  BBB  CCC logic
0    4   10  100   low
1    5   20   50   low
2    6   30  -30  high
3    7   40  -50  high

Splitting

Split a frame with a boolean criterion

In [9]: df = pd.DataFrame(
   ...:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ...: 
Out[9]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50
In [10]: dflow = df[df.AAA <= 5]; dflow
Out[10]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
In [11]: dfhigh = df[df.AAA > 5]; dfhigh
Out[11]: 
   AAA  BBB  CCC
2    6   30  -30
3    7   40  -50

Building Criteria

Select with multi-column criteria

In [12]: df = pd.DataFrame(
   ....:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ....: 
Out[12]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50

…and (without assignment returns a Series)

In [13]: newseries = df.loc[(df['BBB'] < 25) & (df['CCC'] >= -40), 'AAA']; newseries
Out[13]: 
0    4
1    5
Name: AAA, dtype: int64

…or (without assignment returns a Series)

In [14]: newseries = df.loc[(df['BBB'] > 25) | (df['CCC'] >= -40), 'AAA']; newseries;

…or (with assignment modifies the DataFrame.)

In [15]: df.loc[(df['BBB'] > 25) | (df['CCC'] >= 75), 'AAA'] = 0.1; df
Out[15]: 
   AAA  BBB  CCC
0  0.1   10  100
1  5.0   20   50
2  0.1   30  -30
3  0.1   40  -50

Select rows with data closest to certain value using argsort

In [16]: df = pd.DataFrame(
   ....:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ....: 
Out[16]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50
In [17]: aValue = 43.0
In [18]: df.loc[(df.CCC-aValue).abs().argsort()]
Out[18]: 
   AAA  BBB  CCC
1    5   20   50
0    4   10  100
2    6   30  -30
3    7   40  -50

Dynamically reduce a list of criteria using a binary operators

In [19]: df = pd.DataFrame(
   ....:      {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
   ....: 
Out[19]: 
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50
In [20]: Crit1 = df.AAA <= 5.5
In [21]: Crit2 = df.BBB == 10.0
In [22]: Crit3 = df.CCC > -40.0

One could hard code:

In [23]: AllCrit = Crit1 & Crit2 & Crit3

…Or it can be done with a list of dynamically built criteria

In [24]: CritList = [Crit1,Crit2,Crit3]
In [25]: AllCrit = functools.reduce(lambda x,y: x & y, CritList)
In [26]: df[AllCrit]
Out[26]: 
   AAA  BBB  CCC
0    4   10  100

风格(Idioms.md)

风格(Idioms)

if-then…

Splitting

Building Criteria