6.3 Real-world datasets

scikit-learn provides tools for loading larger datasets, downloading them first if necessary.

These datasets can be loaded with the following functions:

Function                                          Description
fetch_olivetti_faces([data_home, shuffle, …])     Load the Olivetti faces dataset from AT&T (classification).
fetch_20newsgroups([data_home, subset, …])        Load the filenames and data from the 20 newsgroups dataset (classification).
fetch_20newsgroups_vectorized([subset, …])        Load the 20 newsgroups dataset and vectorize it into token counts (classification).
fetch_lfw_people([data_home, funneled, …])        Load the Labeled Faces in the Wild (LFW) people dataset (classification).
fetch_lfw_pairs([subset, data_home, …])           Load the Labeled Faces in the Wild (LFW) pairs dataset (classification).
fetch_covtype([data_home, …])                     Load the covertype dataset (classification).
fetch_rcv1([data_home, subset, …])                Load the RCV1 multilabel dataset (classification).
fetch_kddcup99([subset, data_home, shuffle, …])   Load the kddcup99 dataset (classification).
fetch_california_housing([data_home, …])          Load the California housing dataset (regression).
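
As a rough illustration of how one of these loaders might be used, the sketch below wraps fetch_california_housing in a small helper. The helper names load_housing and summarize are hypothetical and exist only for this example. The download_if_missing parameter does not appear in the abbreviated signatures above; it is taken from the scikit-learn API and is assumed to keep its default behaviour of downloading on first use and reading from the local cache afterwards.

```python
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.utils import Bunch


def load_housing(data_home=None):
    """Fetch the California housing dataset (hypothetical helper).

    data_home=None lets scikit-learn pick its default cache directory.
    The first call needs network access; later calls read the cached copy.
    """
    return fetch_california_housing(
        data_home=data_home,
        download_if_missing=True,  # assumption: keep the default download behaviour
    )


def summarize(bunch):
    """Report the array shapes held by a Bunch returned from a fetch_* loader."""
    return {
        "data_shape": bunch.data.shape,
        "target_shape": bunch.target.shape,
    }


# Typical use (not executed here because it may trigger a download):
#     housing = load_housing()
#     print(summarize(housing))
```

The fetch_* loaders return a dictionary-like Bunch object by default, so a helper such as summarize should work with the other loaders in the table as long as the result exposes data and target attributes.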

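Translator's note: as before, the detailed descriptions of the individual datasets are not translated here. To look one up, follow the link to the English description.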