Indexing with list with missing labels is deprecated

Warning

Starting in 0.21.0, using .loc or [] with a list that contains one or more missing labels is deprecated in favor of .reindex.

In prior versions, using .loc[list-of-labels] would work as long as at least 1 of the keys was found (otherwise it would raise a KeyError). This behavior is deprecated and will show a warning message pointing to this section. The recommended alternative is to use .reindex().

For example:

  In [102]: s = pd.Series([1, 2, 3])
  In [103]: s
  Out[103]:
  0    1
  1    2
  2    3
  dtype: int64

Selection with all keys found is unchanged.

  In [104]: s.loc[[1, 2]]
  Out[104]:
  1    2
  2    3
  dtype: int64

Previous Behavior

  In [4]: s.loc[[1, 2, 3]]
  Out[4]:
  1    2.0
  2    3.0
  3    NaN
  dtype: float64

Current Behavior

  In [4]: s.loc[[1, 2, 3]]
  Passing list-likes to .loc with any non-matching elements will raise
  KeyError in the future, you can use .reindex() as an alternative.
  See the documentation here:
  http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
  Out[4]:
  1    2.0
  2    3.0
  3    NaN
  dtype: float64

Reindexing

The idiomatic way to select elements that may not be present is via .reindex(). See also the section on reindexing.

  In [105]: s.reindex([1, 2, 3])
  Out[105]:
  1    2.0
  2    3.0
  3    NaN
  dtype: float64
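
The float64 result above comes from NaN being inserted for the missing label. As a minimal sketch, assuming a placeholder value such as 0 is acceptable for missing labels in your data (that choice is an assumption, not something this section prescribes), the fill_value argument of .reindex() avoids the NaN and therefore the upcast:

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# Label 3 is not in the index, so reindex fills that slot with
# fill_value instead of NaN; with no NaN the integer dtype is kept.
# The value 0 is an illustrative assumption.
filled = s.reindex([1, 2, 3], fill_value=0)
print(filled)
```

Whether a sentinel like 0 is safe depends on whether it can be confused with real data; if it can, keeping NaN is usually the clearer choice.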

Alternatively, if you want to select only valid keys, the following is idiomatic and efficient; it is guaranteed to preserve the dtype of the selection.

  In [106]: labels = [1, 2, 3]
  In [107]: s.loc[s.index.intersection(labels)]
  Out[107]:
  1    2
  2    3
  dtype: int64
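
Selecting only the valid keys silently drops the rest, so it can help to see which labels were skipped. The following is a sketch of one way to do that with Index.difference; the variable names missing, present, and selected are illustrative only:

```python
import pandas as pd

s = pd.Series([1, 2, 3])
labels = [1, 2, 3]

# Labels that were requested but do not exist in the index.
missing = pd.Index(labels).difference(s.index)

# Labels that do exist; selecting them never introduces NaN,
# so the dtype of the selection is unchanged.
present = s.index.intersection(labels)
selected = s.loc[present]

print(list(missing))
print(selected)
```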

Calling .reindex() on an object with a duplicated index raises:

  In [108]: s = pd.Series(np.arange(4), index=['a', 'a', 'b', 'c'])
  In [109]: labels = ['c', 'd']

  In [17]: s.reindex(labels)
  ValueError: cannot reindex from a duplicate axis

Generally, you can intersect the desired labels with the current axis, and then reindex.

  In [110]: s.loc[s.index.intersection(labels)].reindex(labels)
  Out[110]:
  c    3.0
  d    NaN
  dtype: float64

However, this still raises if the resulting index is duplicated.

  In [41]: labels = ['a', 'd']
  In [42]: s.loc[s.index.intersection(labels)].reindex(labels)
  ValueError: cannot reindex from a duplicate axis
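
One possible workaround, sketched here under the assumption that keeping only the first row for each duplicated label is acceptable for your data, is to drop the duplicates with Index.duplicated before reindexing. This discards rows, so it is not a general substitute for resolving the duplicates deliberately:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(4), index=["a", "a", "b", "c"])
labels = ["a", "d"]

# Assumption: the first occurrence of each label is the one to keep.
# duplicated(keep="first") marks every later repeat as True.
deduped = s[~s.index.duplicated(keep="first")]

# The index is now unique, so reindex no longer raises.
result = deduped.reindex(labels)
print(result)
```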