3. Selecting DataFrame rows and columns simultaneously

    # Read the college dataset, using the INSTNM column as the row index;
    # select the first 3 rows and the first 4 columns
    In[23]: college = pd.read_csv('data/college.csv', index_col='INSTNM')
            college.iloc[:3, :4]
    Out[23]:

[Figure 1: output of In[23]]

    # The same selection with loc
    In[24]: college.loc[:'Amridge University', :'MENONLY']
    Out[24]:

[Figure 2: output of In[24]]
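
The two selections above match because an `iloc` slice excludes its end position, while a `loc` slice includes its end label. The following is a minimal sketch of that behavior on a small made-up DataFrame; the school and city names are hypothetical stand-ins, not rows from college.csv.

```python
import pandas as pd

# Hypothetical toy data standing in for college.csv
toy = pd.DataFrame(
    {
        "CITY": ["City 1", "City 2", "City 3", "City 4"],
        "STABBR": ["AL", "AL", "AL", "AL"],
        "HBCU": [1.0, 0.0, 0.0, 0.0],
        "MENONLY": [0.0, 0.0, 0.0, 0.0],
        "WOMENONLY": [0.0, 0.0, 0.0, 0.0],
    },
    index=pd.Index(["School A", "School B", "School C", "School D"], name="INSTNM"),
)

# iloc: positions 0..2 and 0..3 (end positions 3 and 4 are excluded)
by_position = toy.iloc[:3, :4]

# loc: up to and including the labels 'School C' and 'MENONLY'
by_label = toy.loc[:"School C", :"MENONLY"]

print(by_position.equals(by_label))
```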

    # Select all rows of two columns
    In[25]: college.iloc[:, [4,6]].head()
    Out[25]:

[Figure 3: output of In[25]]

    # The loc equivalent (without head(), so every row is returned)
    In[26]: college.loc[:, ['WOMENONLY', 'SATVRMID']]
    Out[26]:

[Figure 4: output of In[26]]

    # Select non-contiguous rows and columns
    In[27]: college.iloc[[100, 200], [7, 15]]
    Out[27]:

[Figure 5: output of In[27]]

    # Select non-contiguous rows and columns with loc and lists of labels
    In[28]: rows = ['GateWay Community College', 'American Baptist Seminary of the West']
            columns = ['SATMTMID', 'UGDS_NHPI']
            college.loc[rows, columns]
    Out[28]:

[Figure 6: output of In[28]]
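
When only the labels are known, the matching integer positions for `iloc` can be looked up rather than counted by hand. The sketch below is one possible way to do this with `Index.get_indexer`, assuming the index labels are unique; the toy DataFrame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration
toy = pd.DataFrame(
    {
        "SATVRMID": [424.0, 570.0, 595.0],
        "SATMTMID": [420.0, 565.0, 590.0],
        "UGDS_NHPI": [0.002, 0.0007, 0.001],
    },
    index=pd.Index(["School A", "School B", "School C"], name="INSTNM"),
)

rows = ["School C", "School A"]
columns = ["SATMTMID", "UGDS_NHPI"]

# Translate labels into integer positions (requires a unique index)
row_pos = toy.index.get_indexer(rows)
col_pos = toy.columns.get_indexer(columns)

by_label = toy.loc[rows, columns]
by_position = toy.iloc[row_pos, col_pos]

print(by_label.equals(by_position))
```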

    # Select a scalar value with iloc
    In[29]: college.iloc[5, -4]
    Out[29]: 0.40100000000000002

    # Select a scalar value with loc
    In[30]: college.loc['The University of Alabama', 'PCTFLOAN']
    Out[30]: 0.40100000000000002
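
For single values, pandas also provides the scalar accessors `at` (label-based) and `iat` (position-based). The sketch below shows, on a hypothetical two-row DataFrame with made-up values, that all four accessors return the same value.

```python
import pandas as pd

# Hypothetical toy data; values are made up for illustration
toy = pd.DataFrame(
    {"PCTPELL": [0.7, 0.3], "PCTFLOAN": [0.8, 0.4]},
    index=pd.Index(["School A", "School B"], name="INSTNM"),
)

via_loc = toy.loc["School B", "PCTFLOAN"]   # label-based, general purpose
via_iloc = toy.iloc[1, -1]                  # position-based, general purpose
via_at = toy.at["School B", "PCTFLOAN"]     # label-based, scalar only
via_iat = toy.iat[1, 1]                     # position-based, scalar only

print(via_loc, via_iloc, via_at, via_iat)
```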

    # Slice rows with iloc and select a single column
    In[31]: college.iloc[90:80:-2, 5]
    Out[31]: INSTNM
             Empire Beauty School-Flagstaff     0
             Charles of Italy Beauty College    0
             Central Arizona College            0
             University of Arizona              0
             Arizona State University-Tempe     0
             Name: RELAFFIL, dtype: int64

    # Slice rows with loc and select a single column
    In[32]: start = 'Empire Beauty School-Flagstaff'
            stop = 'Arizona State University-Tempe'
            college.loc[start:stop:-2, 'RELAFFIL']
    Out[32]: INSTNM
             Empire Beauty School-Flagstaff     0
             Charles of Italy Beauty College    0
             Central Arizona College            0
             University of Arizona              0
             Arizona State University-Tempe     0
             Name: RELAFFIL, dtype: int64
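
The stepped slices above follow the same endpoint rules as ordinary slices: `iloc` stops before the end position, while `loc` includes the stop label. A minimal sketch on a hypothetical Series with single-letter labels:

```python
import pandas as pd

# Hypothetical Series with unique, sorted labels
s = pd.Series([10, 11, 12, 13, 14, 15], index=list("abcdef"))

# Positions 4 and 2; the stop position 0 is excluded
by_position = s.iloc[4:0:-2]

# Labels 'e' and 'c'; the stop label 'c' is included
by_label = s.loc["e":"c":-2]

print(by_position.equals(by_label))
```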