Tabular learner

The function to immediately get a Learner ready to train for tabular data

    from fastai.tabular.data import *

The main function you probably want to use in this module is tabular_learner. It will automatically create a TabularModel suitable for your data and infer the right loss function. See the tabular tutorial for an example of use in context.

Main functions

class TabularLearner

TabularLearner(dls, model, loss_func=None, opt_func=Adam, lr=0.001, splitter=trainable_params, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95)) :: Learner

Learner for tabular data

It works exactly like a normal Learner. The only difference is that it implements a predict method designed to work on a single row of data.

tabular_learner

tabular_learner(dls, layers=None, emb_szs=None, config=None, n_out=None, y_range=None, loss_func=None, opt_func=Adam, lr=0.001, splitter=trainable_params, cbs=None, metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True, moms=(0.95, 0.85, 0.95))

Get a Learner using dls, with metrics, including a TabularModel created using the remaining params.

If your data was built with fastai, you probably won’t need to pass anything to emb_szs unless you want to change the library default (produced by get_emb_sz). The same goes for n_out, which should be inferred automatically. layers defaults to [200,100] and is passed to TabularModel along with the config.
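To give a feel for the embedding sizes the library picks by default, here is a minimal pure-Python sketch. It assumes the default heuristic is min(600, round(1.6 * n_cat**0.56)), which is how fastai's emb_sz_rule is usually written; check the source of your installed version before relying on it. The function name and the cardinalities below are hypothetical and only serve as illustration.

```python
def emb_sz_rule_sketch(n_cat: int) -> int:
    """Assumed default heuristic: embedding width grows slowly with cardinality, capped at 600."""
    return min(600, round(1.6 * n_cat ** 0.56))

# Hypothetical cardinalities (number of distinct categories per column).
cardinalities = {"workclass": 10, "binary_flag": 2, "huge_id_column": 100_000}

# One (n_cat, embedding_width) pair per categorical column,
# the same shape of list that emb_szs overrides in the real API.
emb_szs_sketch = {name: (n, emb_sz_rule_sketch(n)) for name, n in cardinalities.items()}

for name, (n, sz) in emb_szs_sketch.items():
    print(f"{name}: {n} categories -> embedding width {sz}")
```

Under this assumption, small vocabularies get very narrow embeddings and the cap only matters for columns with tens of thousands of categories.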

Use tabular_config to create a config and customize the model used. Only y_range is also exposed directly as an argument of tabular_learner, because it is used so often.
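As a rough illustration of what y_range does for a regression target, the sketch below squashes a raw model output into a (low, high) interval with a scaled sigmoid. This assumes the model applies sigmoid(x) * (high - low) + low to its final activations; the function name is hypothetical and the code is plain Python, not the library implementation.

```python
import math

def sigmoid_range_sketch(x: float, low: float, high: float) -> float:
    """Map any real-valued activation x into the open interval (low, high)."""
    return 1.0 / (1.0 + math.exp(-x)) * (high - low) + low

# Hypothetical target range, e.g. a rating between 0 and 5.
y_range = (0.0, 5.0)

for raw in (-10.0, 0.0, 10.0):
    print(raw, "->", sigmoid_range_sketch(raw, *y_range))
```

Because the sigmoid never quite reaches 0 or 1, predictions approach the bounds without hitting them, so it is common to set the upper bound slightly above the largest target value.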

All the other arguments are passed to Learner.

    path = untar_data(URLs.ADULT_SAMPLE)
    df = pd.read_csv(path/'adult.csv')
    cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
    cont_names = ['age', 'fnlwgt', 'education-num']
    procs = [Categorify, FillMissing, Normalize]
    dls = TabularDataLoaders.from_df(df, path, procs=procs, cat_names=cat_names, cont_names=cont_names,
                                     y_names="salary", valid_idx=list(range(800,1000)), bs=64)
    learn = tabular_learner(dls)

TabularLearner.predict

TabularLearner.predict(row)

Predict on a Pandas Series

We can pass an individual row of data to our TabularLearner’s predict method. Its output is slightly different from the other predict methods, as this one always returns the input as well:

    row, clas, probs = learn.predict(df.iloc[0])
    row.show()

       workclass  education   marital-status      occupation  relationship  race   education-num_na  age   fnlwgt         education-num  salary
    0  Private    Assoc-acdm  Married-civ-spouse  #na#        Wife          White  False             49.0  101320.001685  12.0           <50k

    clas, probs

    (tensor(0), tensor([0.5264, 0.4736]))
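To make the relationship between the returned values concrete, here is a small pure-Python sketch using the numbers from the output above. It assumes the predicted class index is the argmax of the probabilities and that the target vocabulary is ordered ['<50k', '>=50k']; that ordering is an assumption made for illustration, not something read from the DataLoaders.

```python
# Values copied from the example output, as plain Python numbers.
probs = [0.5264, 0.4736]

# Assumed label ordering for the salary target (hypothetical).
vocab = ["<50k", ">=50k"]

# The class index is the position of the largest probability.
clas = max(range(len(probs)), key=probs.__getitem__)
label = vocab[clas]

print(f"class index {clas} -> label {label!r} with probability {probs[clas]:.4f}")
```

With these numbers the decoded label matches the salary shown in the returned row, and the two probabilities being close to 0.5 suggests the model is not very confident about this prediction.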

©2021 fast.ai. All rights reserved.
Site last generated: Mar 31, 2021