PyTorch models in modAL workflows

Thanks to Skorch, you can seamlessly integrate PyTorch models into your modAL workflow. In this tutorial, we shall quickly introduce how to wrap a PyTorch model with Skorch, and we are going to see how to do active learning with it. More details on Skorch's scikit-learn-compatible API can be found here.

The executable script for this example can be found here!

Skorch API

By default, a PyTorch model's interface differs from what is used for scikit-learn estimators. However, with the Skorch wrapper, it is possible to adapt your model.

import torch
from torch import nn
from skorch import NeuralNetClassifier

# build class for the skorch API
class Torch_Model(nn.Module):
    def __init__(self,):
        super(Torch_Model, self).__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(1, 32, 3),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Dropout(0.25),
        )
        self.fcs = nn.Sequential(
            nn.Linear(12*12*64, 128),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        out = x
        out = self.convs(out)
        out = out.view(-1,12*12*64)
        out = self.fcs(out)
        return out

For our purposes, the classifier which we initialize now acts just like any scikit-learn estimator.

# create the classifier
device = "cuda" if torch.cuda.is_available() else "cpu"
classifier = NeuralNetClassifier(Torch_Model,
                                 criterion=nn.CrossEntropyLoss,
                                 optimizer=torch.optim.Adam,
                                 train_split=None,
                                 verbose=1,
                                 device=device)

Active learning with Pytorch

In this example, we are going to use the famous MNIST dataset, which is available for PyTorch through torchvision.

import numpy as np
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
from torchvision.datasets import MNIST

mnist_data = MNIST('.', download=True, transform=ToTensor())
dataloader = DataLoader(mnist_data, shuffle=True, batch_size=60000)
X, y = next(iter(dataloader))

# read training data
X_train, X_test, y_train, y_test = X[:50000], X[50000:], y[:50000], y[50000:]
X_train = X_train.reshape(50000, 1, 28, 28)
X_test = X_test.reshape(10000, 1, 28, 28)

# assemble initial data
n_initial = 1000
initial_idx = np.random.choice(range(len(X_train)), size=n_initial, replace=False)
X_initial = X_train[initial_idx]
y_initial = y_train[initial_idx]

# generate the pool
# remove the initial data from the training dataset
X_pool = np.delete(X_train, initial_idx, axis=0)[:5000]
y_pool = np.delete(y_train, initial_idx, axis=0)[:5000]
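As a sanity check on this split logic, here is a minimal toy sketch (with made-up data, not MNIST) showing that sampling indices without replacement and calling np.delete with the same index array keeps X and y aligned and makes the initial set and the pool disjoint:

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features; X[i] = [2i, 2i+1]
y = np.arange(10)                  # y[i] labels X[i]

# sample an initial set without replacement, then remove it from the pool
initial_idx = rng.choice(range(len(X)), size=3, replace=False)
X_initial, y_initial = X[initial_idx], y[initial_idx]
X_pool = np.delete(X, initial_idx, axis=0)
y_pool = np.delete(y, initial_idx, axis=0)

assert len(X_pool) == 7                       # 10 - 3 samples remain
assert not set(y_initial) & set(y_pool)       # initial set and pool are disjoint
assert all(X_pool[i, 0] == 2 * y_pool[i] for i in range(7))  # still aligned
```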

With the data and the classifier ready, active learning is as easy as always. Because training is very expensive in large neural networks, this time we are going to query the best 100 instances each time we measure the uncertainty of the pool.
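By default, modAL's ActiveLearner uses uncertainty sampling as its query strategy. As a rough numpy sketch of what batch-mode uncertainty sampling does (with fake probabilities standing in for the classifier's predict_proba output), the pool is ranked by 1 minus the maximum class probability and the k least confident instances are selected:

```python
import numpy as np

rng = np.random.default_rng(0)
proba = rng.dirichlet(np.ones(10), size=500)   # fake predict_proba output, 500 x 10

# least-confidence uncertainty: 1 - probability of the most likely class
uncertainty = 1 - proba.max(axis=1)

# pick the 100 most uncertain pool instances
query_idx = np.argsort(uncertainty)[-100:]

assert query_idx.shape == (100,)
# every selected instance is at least as uncertain as the pool median
assert uncertainty[query_idx].min() >= np.median(uncertainty)
```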

from modAL.models import ActiveLearner

# initialize ActiveLearner
learner = ActiveLearner(
    estimator=classifier,
    X_training=X_initial, y_training=y_initial,
)

To make sure that you train only on newly queried labels, pass only_new=True to the .teach() method of the learner.
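To illustrate what only_new changes, here is a hypothetical toy learner (not modAL's actual implementation): without the flag, each teach() call refits on the full accumulated training set; with only_new=True, only the fresh batch is passed to the fit step, which is much cheaper for a large network:

```python
import numpy as np

class TinyLearner:
    """Toy illustration of the only_new flag; not modAL's real code."""
    def __init__(self, X, y):
        self.X, self.y = X, y
        self.last_fit_size = None
        self._fit(self.X)

    def _fit(self, X):
        self.last_fit_size = len(X)   # stand-in for an expensive training run

    def teach(self, X_new, y_new, only_new=False):
        # the new labels are always recorded in the training set
        self.X = np.concatenate([self.X, X_new])
        self.y = np.concatenate([self.y, y_new])
        if only_new:
            self._fit(X_new)          # cheap: train on the fresh batch only
        else:
            self._fit(self.X)         # expensive: refit on everything so far

learner = TinyLearner(np.zeros((1000, 4)), np.zeros(1000))
learner.teach(np.zeros((100, 4)), np.zeros(100), only_new=True)
assert learner.last_fit_size == 100   # fit saw 100 samples, not 1100
```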

# the active learning loop
n_queries = 10
for idx in range(n_queries):
    print('Query no. %d' % (idx + 1))
    query_idx, query_instance = learner.query(X_pool, n_instances=100)
    learner.teach(
        X=X_pool[query_idx], y=y_pool[query_idx], only_new=True,
    )
    # remove queried instance from pool
    X_pool = np.delete(X_pool, query_idx, axis=0)
    y_pool = np.delete(y_pool, query_idx, axis=0)