6. Slicing rows lazily

  # Read the college dataset; select every other row from integer position 10 up to (but not including) 20
  In[50]: import pandas as pd
          college = pd.read_csv('data/college.csv', index_col='INSTNM')
          college[10:20:2]
  Out[50]:

6. Slicing rows lazily - Figure 1

  # The same slice works on a Series
  In[51]: city = college['CITY']
          city[10:20:2]
  Out[51]: INSTNM
           Birmingham Southern College              Birmingham
           Concordia College Alabama                     Selma
           Enterprise State Community College       Enterprise
           Faulkner University                      Montgomery
           New Beginning College of Cosmetology    Albertville
           Name: CITY, dtype: object
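
The positional slices above can be reproduced without the CSV file. The following is a minimal sketch, assuming a small made-up DataFrame in place of `data/college.csv`; the name `df` and the `School NN` / `City N` values are hypothetical and only stand in for the real data. It checks that an integer slice in the bare indexing operator selects rows by position, for a DataFrame and for a Series alike, and matches the equivalent `.iloc` call.

```python
import pandas as pd

# Hypothetical stand-in for the college data: 20 rows with made-up labels
df = pd.DataFrame(
    {'CITY': [f'City {i}' for i in range(20)],
     'STABBR': ['AL'] * 20},
    index=pd.Index([f'School {i:02d}' for i in range(20)], name='INSTNM'),
)

# An integer slice inside [] selects rows by position:
# start at position 10, stop before position 20, step by 2
rows = df[10:20:2]

# The same slice applied to a single column (a Series)
cities = df['CITY'][10:20:2]

# Both agree with the explicit positional indexer
assert rows.equals(df.iloc[10:20:2])
assert cities.equals(df['CITY'].iloc[10:20:2])

print(rows.index.tolist())
```

Positions 10, 12, 14, 16 and 18 are selected; position 20 is never reached because the stop of a positional slice is exclusive.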

  # Look at the 4,002nd row index label (integer position 4001)
  In[52]: college.index[4001]
  Out[52]: 'Spokane Community College'

  # Both Series and DataFrames can also be sliced by label. Here a DataFrame is sliced by label
  In[53]: start = 'Mesa Community College'
          stop = 'Spokane Community College'
          college[start:stop:1500]
  Out[53]:

6. Slicing rows lazily - Figure 2

  # Here a Series is sliced by label
  In[54]: city[start:stop:1500]
  Out[54]: INSTNM
           Mesa Community College                            Mesa
           Hair Academy Inc-New Carrollton         New Carrollton
           National College of Natural Medicine          Portland
           Name: CITY, dtype: object
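
One difference between the two kinds of slice is easy to miss: a label slice includes its stop label, while a positional slice excludes its stop position. The following is a minimal sketch, assuming a five-row made-up Series (the name `s`, the `School A` to `School E` labels, and the `City N` values are all hypothetical).

```python
import pandas as pd

# Hypothetical Series with a unique string index
s = pd.Series(
    ['City 0', 'City 1', 'City 2', 'City 3', 'City 4'],
    index=pd.Index(['School A', 'School B', 'School C', 'School D', 'School E'],
                   name='INSTNM'),
    name='CITY',
)

# Label slice: both endpoints are included
by_label = s['School B':'School D']

# Positional slice: the stop position is excluded
by_position = s[1:3]

# A step can be combined with label endpoints, as in the recipe above
stepped = s['School A':'School E':2]

print(by_label.index.tolist())
print(by_position.index.tolist())
print(stepped.index.tolist())
```

Label slicing relies on the start and stop labels being present in the index; this sketch keeps the index unique and sorted to stay on the simple path.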

There's more...

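Lazy slicing with the bare indexing operator works only on the rows of a DataFrame and on a Series. It cannot slice columns, and it cannot select rows and columns at the same time.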

  # Trying to select the first ten rows and two columns in one step raises an error
  In[55]: college[:10, ['CITY', 'STABBR']]
  ---------------------------------------------------------------------------
  TypeError                                 Traceback (most recent call last)
  <ipython-input-55-92538c61bdfa> in <module>()
  ----> 1 college[:10, ['CITY', 'STABBR']]
  /Users/Ted/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
     1962             return self._getitem_multilevel(key)
     1963         else:
  -> 1964             return self._getitem_column(key)
     1965
     1966     def _getitem_column(self, key):
  /Users/Ted/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
     1969         # get column
     1970         if self.columns.is_unique:
  -> 1971             return self._get_item_cache(key)
     1972
     1973         # duplicate columns & possible reduce dimensionality
  /Users/Ted/anaconda/lib/python3.6/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
     1641         """Return the cached item, item represents a label indexer."""
     1642         cache = self._item_cache
  -> 1643         res = cache.get(item)
     1644         if res is None:
     1645             values = self._data.get(item)
  TypeError: unhashable type: 'slice'

  # Selecting rows and columns together requires .loc or .iloc
  In[56]: first_ten_instnm = college.index[:10]
          college.loc[first_ten_instnm, ['CITY', 'STABBR']]
  Out[56]:

6. Slicing rows lazily - Figure 3
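
The same selection can also be written positionally with `.iloc`. The following is a minimal sketch, assuming a four-row made-up DataFrame (the name `df`, the `UGDS` numbers, and all labels are hypothetical). The traceback above shows a `TypeError` from the pandas version used in the recipe; other versions may raise a different exception type for the same mistake, so the sketch catches `Exception` rather than assuming one.

```python
import pandas as pd

# Hypothetical stand-in for the college data
df = pd.DataFrame(
    {'CITY': ['City 0', 'City 1', 'City 2', 'City 3'],
     'STABBR': ['AL', 'AL', 'AK', 'AZ'],
     'UGDS': [100, 200, 300, 400]},
    index=pd.Index(['School A', 'School B', 'School C', 'School D'],
                   name='INSTNM'),
)
cols = ['CITY', 'STABBR']

# Bare [] treats the (slice, list) pair as a single key and fails.
# The exact exception type is version-dependent, hence the broad except.
try:
    df[:2, cols]
    failed = False
except Exception:
    failed = True

# Label-based: pass row labels and column labels to .loc
by_label = df.loc[df.index[:2], cols]

# Position-based: pass a row slice and integer column positions to .iloc
col_positions = df.columns.get_indexer(cols)
by_position = df.iloc[:2, col_positions]

assert failed
assert by_label.equals(by_position)
```

Using `.iloc` avoids building the intermediate list of row labels, at the cost of converting the column names to integer positions first.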