ANN in Python

Next, we will see how to implement an ANN (Artificial Neural Network) in Python. For this, we will use the Keras library on top of TensorFlow (the most common setup).

Classification of tabular data sets

We are going to use the Pima Indian diabetes onset dataset. This is a standard Machine Learning dataset from the UCI Machine Learning repository. It describes the medical record data of Pima Indian patients and whether they had an onset of diabetes within five years.

Step 1: Reading the processed data set

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split

total_data = pd.read_csv("https://raw.githubusercontent.com/4GeeksAcademy/machine-learning-content/master/assets/clean-pima-indians-diabetes.csv")

X = total_data.drop("8", axis = 1)
y = total_data["8"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

X_train.head()
Out[1]:
0 1 2 3 4 5 6 7
60 -0.547919 -1.154694 -3.572597 -1.288212 -0.692891 -4.060474 -0.507006 -1.041549
618 1.530847 -0.278373 0.666618 0.217261 -0.692891 -0.481351 2.446670 1.425995
346 -0.844885 0.566649 -1.194501 -0.096379 0.027790 -0.417892 0.550035 -0.956462
294 -1.141852 1.255187 -0.987710 -1.288212 -0.692891 -1.280942 -0.658012 2.702312
231 0.639947 0.410164 0.563223 1.032726 2.519781 1.803195 -0.706334 1.085644

The train set will be used to train the model, while the test set will be used to evaluate its effectiveness. In addition, it is generally good practice to normalize the data before training an artificial neural network (ANN). Two target ranges are commonly used: from 0 to 1 or from -1 to 1.
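The two ranges mentioned above can be sketched with plain NumPy. This is a minimal illustration, not part of the original notebook: the helper name min_max_scale and the toy matrix are assumptions made for the example, and in a real project the minimum and maximum would be computed on the train set only and then reused on the test set.

```python
import numpy as np

def min_max_scale(values, low=0.0, high=1.0):
    """Rescale each column of `values` linearly into [low, high]."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_max = values.max(axis=0)
    # Guard against constant columns to avoid dividing by zero
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    unit = (values - col_min) / span          # now in [0, 1]
    return unit * (high - low) + low          # stretch to [low, high]

# Hypothetical toy matrix: 3 rows, 2 features (not the Pima data)
toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])

scaled_01 = min_max_scale(toy)                        # range 0 to 1
scaled_pm1 = min_max_scale(toy, low=-1.0, high=1.0)   # range -1 to 1
print(scaled_01)
print(scaled_pm1)
```

Each column is scaled independently, so features measured in very different units end up on the same scale.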

Step 2: Model initialization and training

Models in Keras are defined as a sequence of layers. We create a sequential model and add layers one by one until we are satisfied with our network architecture.

The input layer will always have as many neurons as there are predictor variables. In this case, we have a total of 8 (from 0 to 7). Next, we add two hidden layers, one of 12 neurons and one of 8. Finally, the fourth layer, the output layer, will have a single neuron, since the problem is dichotomous (binary). If it had n classes, the network would have n outputs.

NOTE: We have created a default network with an arbitrary number of hidden layers and of neurons in each hidden layer. Normally you would start this way and then perform a hyperparameter optimization.
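To get a feel for the size of the 8-12-8-1 network described above, we can count its trainable parameters by hand. The sketch below assumes fully connected layers with one bias per neuron; the helper name dense_param_count is made up for this example.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the neurons per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # One weight per connection plus one bias per receiving neuron
        total += n_in * n_out + n_out
    return total

pima_params = dense_param_count([8, 12, 8, 1])
print(pima_params)  # 221 = (8*12 + 12) + (12*8 + 8) + (8*1 + 1)
```

These 221 numbers are the weights that training adjusts.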

In [2]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import set_random_seed

set_random_seed(42)

model = Sequential()
model.add(Dense(12, input_shape = (8,), activation = "relu"))
model.add(Dense(8, activation = "relu"))
model.add(Dense(1, activation = "sigmoid"))

Then, once the model is defined, we can compile it. The backend automatically chooses the best way to represent the network for training and prediction on your hardware, such as a CPU, a GPU, or even a distributed setup.

When compiling, we must specify some additional properties required when training the network. Recall that training a network means finding the best set of weights to map inputs to outputs in our dataset.

In [3]:
model.compile(loss = "binary_crossentropy", optimizer = "adam", metrics = ["accuracy"])
model
Out[3]:
<keras.src.engine.sequential.Sequential at 0x7fa9cda46cd0>

We use the optimizer known as adam. It is a popular variant of gradient descent because it tunes itself automatically and gives good results on a wide range of problems. We also collect and report the classification accuracy, which is specified through the metrics argument.
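The loss passed to compile, binary_crossentropy, is the quantity the optimizer tries to minimize. As a rough sketch of what it measures, here is the standard formula written with NumPy. The function below is our own illustration, not the Keras implementation, and the clipping constant eps is an assumption chosen only to avoid log(0).

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Keep probabilities away from exactly 0 and 1 so the logarithm is finite
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_sample))

# Hypothetical predictions for two samples with true labels 1 and 0
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right)  # small loss
print(confident_wrong)  # large loss
```

Confident correct predictions give a loss close to 0, while confident wrong ones are penalized heavily.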

Training occurs in epochs and each epoch is divided into batches.

  • Epoch: One pass through all rows of the training data set.
  • Batch: One or more samples considered by the model within an epoch before the weights are updated.

The training process will run for a fixed number of iterations, which are the epochs. We must also set the number of rows in the data set that are considered before the model weights are updated within each epoch, which is called the batch size and is set by the batch_size argument.
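The relationship between rows, batch size and weight updates can be checked with a little arithmetic. The sketch below assumes 614 training rows, a figure that is not stated in the text but is consistent with the 62 steps per epoch shown in the training log; the helper name steps_per_epoch is hypothetical.

```python
import math

def steps_per_epoch(n_rows, batch_size):
    """Number of weight updates in one epoch (the last batch may be smaller)."""
    return math.ceil(n_rows / batch_size)

n_train_rows = 614   # assumption inferred from the training log, not from the text
updates_per_epoch = steps_per_epoch(n_train_rows, batch_size=10)
total_updates = updates_per_epoch * 150   # 150 epochs

print(updates_per_epoch)  # 62
print(total_updates)      # 9300
```

A smaller batch size means more updates per epoch, each computed from fewer rows.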

For this problem, we will run a small number of epochs (150) and use a relatively small batch size of 10:

In [4]:
# Fit the keras model on the data set
model.fit(X_train, y_train, epochs = 150, batch_size = 10)
Epoch 1/150
62/62 [==============================] - 1s 862us/step - loss: 0.7338 - accuracy: 0.4756
Epoch 2/150
62/62 [==============================] - 0s 775us/step - loss: 0.6790 - accuracy: 0.6173
Epoch 3/150
62/62 [==============================] - 0s 697us/step - loss: 0.6336 - accuracy: 0.6873
Epoch 4/150
62/62 [==============================] - 0s 689us/step - loss: 0.5910 - accuracy: 0.7296
Epoch 5/150
62/62 [==============================] - 0s 703us/step - loss: 0.5552 - accuracy: 0.7508
Epoch 6/150
62/62 [==============================] - 0s 683us/step - loss: 0.5276 - accuracy: 0.7541
Epoch 7/150
62/62 [==============================] - 0s 695us/step - loss: 0.5077 - accuracy: 0.7606
Epoch 8/150
62/62 [==============================] - 0s 689us/step - loss: 0.4935 - accuracy: 0.7573
Epoch 9/150
62/62 [==============================] - 0s 693us/step - loss: 0.4827 - accuracy: 0.7655
Epoch 10/150
62/62 [==============================] - 0s 671us/step - loss: 0.4748 - accuracy: 0.7720
Epoch 11/150
62/62 [==============================] - 0s 727us/step - loss: 0.4691 - accuracy: 0.7720
Epoch 12/150
62/62 [==============================] - 0s 710us/step - loss: 0.4661 - accuracy: 0.7720
Epoch 13/150
62/62 [==============================] - 0s 734us/step - loss: 0.4625 - accuracy: 0.7720
Epoch 14/150
62/62 [==============================] - 0s 721us/step - loss: 0.4587 - accuracy: 0.7785
Epoch 15/150
62/62 [==============================] - 0s 720us/step - loss: 0.4566 - accuracy: 0.7769
Epoch 16/150
62/62 [==============================] - 0s 725us/step - loss: 0.4539 - accuracy: 0.7801
Epoch 17/150
62/62 [==============================] - 0s 727us/step - loss: 0.4525 - accuracy: 0.7818
Epoch 18/150
62/62 [==============================] - 0s 723us/step - loss: 0.4507 - accuracy: 0.7818
Epoch 19/150
62/62 [==============================] - 0s 680us/step - loss: 0.4485 - accuracy: 0.7834
Epoch 20/150
62/62 [==============================] - 0s 699us/step - loss: 0.4469 - accuracy: 0.7850
Epoch 21/150
62/62 [==============================] - 0s 737us/step - loss: 0.4443 - accuracy: 0.7866
Epoch 22/150
62/62 [==============================] - 0s 735us/step - loss: 0.4428 - accuracy: 0.7866
Epoch 23/150
62/62 [==============================] - 0s 727us/step - loss: 0.4410 - accuracy: 0.7834
Epoch 24/150
62/62 [==============================] - 0s 719us/step - loss: 0.4390 - accuracy: 0.7899
Epoch 25/150
62/62 [==============================] - 0s 689us/step - loss: 0.4374 - accuracy: 0.7883
Epoch 26/150
62/62 [==============================] - 0s 712us/step - loss: 0.4370 - accuracy: 0.7899
Epoch 27/150
62/62 [==============================] - 0s 706us/step - loss: 0.4352 - accuracy: 0.7915
Epoch 28/150
62/62 [==============================] - 0s 694us/step - loss: 0.4354 - accuracy: 0.7866
Epoch 29/150
62/62 [==============================] - 0s 736us/step - loss: 0.4327 - accuracy: 0.7915
Epoch 30/150
62/62 [==============================] - 0s 745us/step - loss: 0.4314 - accuracy: 0.7915
Epoch 31/150
62/62 [==============================] - 0s 730us/step - loss: 0.4302 - accuracy: 0.7915
Epoch 32/150
62/62 [==============================] - 0s 763us/step - loss: 0.4297 - accuracy: 0.7899
Epoch 33/150
62/62 [==============================] - 0s 761us/step - loss: 0.4282 - accuracy: 0.7932
Epoch 34/150
62/62 [==============================] - 0s 734us/step - loss: 0.4269 - accuracy: 0.7964
Epoch 35/150
62/62 [==============================] - 0s 724us/step - loss: 0.4256 - accuracy: 0.8013
Epoch 36/150
62/62 [==============================] - 0s 725us/step - loss: 0.4247 - accuracy: 0.8013
Epoch 37/150
62/62 [==============================] - 0s 720us/step - loss: 0.4236 - accuracy: 0.7948
Epoch 38/150
62/62 [==============================] - 0s 710us/step - loss: 0.4225 - accuracy: 0.7948
Epoch 39/150
62/62 [==============================] - 0s 709us/step - loss: 0.4220 - accuracy: 0.7980
Epoch 40/150
62/62 [==============================] - 0s 702us/step - loss: 0.4208 - accuracy: 0.7980
Epoch 41/150
62/62 [==============================] - 0s 704us/step - loss: 0.4198 - accuracy: 0.7980
Epoch 42/150
62/62 [==============================] - 0s 700us/step - loss: 0.4196 - accuracy: 0.8029
Epoch 43/150
62/62 [==============================] - 0s 704us/step - loss: 0.4185 - accuracy: 0.8029
Epoch 44/150
62/62 [==============================] - 0s 701us/step - loss: 0.4178 - accuracy: 0.7980
Epoch 45/150
62/62 [==============================] - 0s 698us/step - loss: 0.4166 - accuracy: 0.7980
Epoch 46/150
62/62 [==============================] - 0s 693us/step - loss: 0.4164 - accuracy: 0.7948
Epoch 47/150
62/62 [==============================] - 0s 693us/step - loss: 0.4158 - accuracy: 0.7997
Epoch 48/150
62/62 [==============================] - 0s 704us/step - loss: 0.4148 - accuracy: 0.7980
Epoch 49/150
62/62 [==============================] - 0s 696us/step - loss: 0.4140 - accuracy: 0.8046
Epoch 50/150
62/62 [==============================] - 0s 693us/step - loss: 0.4129 - accuracy: 0.8029
Epoch 51/150
62/62 [==============================] - 0s 697us/step - loss: 0.4123 - accuracy: 0.7980
Epoch 52/150
62/62 [==============================] - 0s 702us/step - loss: 0.4117 - accuracy: 0.8013
Epoch 53/150
62/62 [==============================] - 0s 703us/step - loss: 0.4104 - accuracy: 0.8062
Epoch 54/150
62/62 [==============================] - 0s 716us/step - loss: 0.4087 - accuracy: 0.8078
Epoch 55/150
62/62 [==============================] - 0s 720us/step - loss: 0.4093 - accuracy: 0.8094
Epoch 56/150
62/62 [==============================] - 0s 743us/step - loss: 0.4082 - accuracy: 0.8046
Epoch 57/150
62/62 [==============================] - 0s 730us/step - loss: 0.4070 - accuracy: 0.8094
Epoch 58/150
62/62 [==============================] - 0s 724us/step - loss: 0.4061 - accuracy: 0.8062
Epoch 59/150
62/62 [==============================] - 0s 730us/step - loss: 0.4055 - accuracy: 0.8127
Epoch 60/150
62/62 [==============================] - 0s 697us/step - loss: 0.4054 - accuracy: 0.8111
Epoch 61/150
62/62 [==============================] - 0s 716us/step - loss: 0.4048 - accuracy: 0.8127
Epoch 62/150
62/62 [==============================] - 0s 696us/step - loss: 0.4031 - accuracy: 0.8111
Epoch 63/150
62/62 [==============================] - 0s 738us/step - loss: 0.4021 - accuracy: 0.8111
Epoch 64/150
62/62 [==============================] - 0s 721us/step - loss: 0.4017 - accuracy: 0.8127
Epoch 65/150
62/62 [==============================] - 0s 727us/step - loss: 0.4015 - accuracy: 0.8143
Epoch 66/150
62/62 [==============================] - 0s 691us/step - loss: 0.4011 - accuracy: 0.8078
Epoch 67/150
62/62 [==============================] - 0s 700us/step - loss: 0.3993 - accuracy: 0.8094
Epoch 68/150
62/62 [==============================] - 0s 725us/step - loss: 0.3992 - accuracy: 0.8192
Epoch 69/150
62/62 [==============================] - 0s 721us/step - loss: 0.3973 - accuracy: 0.8192
Epoch 70/150
62/62 [==============================] - 0s 729us/step - loss: 0.3972 - accuracy: 0.8192
Epoch 71/150
62/62 [==============================] - 0s 715us/step - loss: 0.3966 - accuracy: 0.8111
Epoch 72/150
62/62 [==============================] - 0s 689us/step - loss: 0.3955 - accuracy: 0.8192
Epoch 73/150
62/62 [==============================] - 0s 691us/step - loss: 0.3962 - accuracy: 0.8062
Epoch 74/150
62/62 [==============================] - 0s 694us/step - loss: 0.3949 - accuracy: 0.8160
Epoch 75/150
62/62 [==============================] - 0s 697us/step - loss: 0.3937 - accuracy: 0.8241
Epoch 76/150
62/62 [==============================] - 0s 692us/step - loss: 0.3945 - accuracy: 0.8208
Epoch 77/150
62/62 [==============================] - 0s 733us/step - loss: 0.3941 - accuracy: 0.8176
Epoch 78/150
62/62 [==============================] - 0s 705us/step - loss: 0.3932 - accuracy: 0.8176
Epoch 79/150
62/62 [==============================] - 0s 688us/step - loss: 0.3915 - accuracy: 0.8208
Epoch 80/150
62/62 [==============================] - 0s 685us/step - loss: 0.3903 - accuracy: 0.8176
Epoch 81/150
62/62 [==============================] - 0s 682us/step - loss: 0.3906 - accuracy: 0.8062
Epoch 82/150
62/62 [==============================] - 0s 701us/step - loss: 0.3899 - accuracy: 0.8241
Epoch 83/150
62/62 [==============================] - 0s 682us/step - loss: 0.3890 - accuracy: 0.8160
Epoch 84/150
62/62 [==============================] - 0s 719us/step - loss: 0.3879 - accuracy: 0.8257
Epoch 85/150
62/62 [==============================] - 0s 735us/step - loss: 0.3871 - accuracy: 0.8290
Epoch 86/150
62/62 [==============================] - 0s 735us/step - loss: 0.3858 - accuracy: 0.8257
Epoch 87/150
62/62 [==============================] - 0s 740us/step - loss: 0.3857 - accuracy: 0.8306
Epoch 88/150
62/62 [==============================] - 0s 701us/step - loss: 0.3858 - accuracy: 0.8257
Epoch 89/150
62/62 [==============================] - 0s 706us/step - loss: 0.3839 - accuracy: 0.8274
Epoch 90/150
62/62 [==============================] - 0s 723us/step - loss: 0.3857 - accuracy: 0.8225
Epoch 91/150
62/62 [==============================] - 0s 731us/step - loss: 0.3852 - accuracy: 0.8241
Epoch 92/150
62/62 [==============================] - 0s 693us/step - loss: 0.3837 - accuracy: 0.8290
Epoch 93/150
62/62 [==============================] - 0s 694us/step - loss: 0.3828 - accuracy: 0.8274
Epoch 94/150
62/62 [==============================] - 0s 693us/step - loss: 0.3821 - accuracy: 0.8290
Epoch 95/150
62/62 [==============================] - 0s 697us/step - loss: 0.3815 - accuracy: 0.8322
Epoch 96/150
62/62 [==============================] - 0s 693us/step - loss: 0.3815 - accuracy: 0.8322
Epoch 97/150
62/62 [==============================] - 0s 711us/step - loss: 0.3811 - accuracy: 0.8274
Epoch 98/150
62/62 [==============================] - 0s 729us/step - loss: 0.3815 - accuracy: 0.8306
Epoch 99/150
62/62 [==============================] - 0s 701us/step - loss: 0.3798 - accuracy: 0.8274
Epoch 100/150
62/62 [==============================] - 0s 683us/step - loss: 0.3802 - accuracy: 0.8339
Epoch 101/150
62/62 [==============================] - 0s 713us/step - loss: 0.3783 - accuracy: 0.8322
Epoch 102/150
62/62 [==============================] - 0s 718us/step - loss: 0.3797 - accuracy: 0.8306
Epoch 103/150
62/62 [==============================] - 0s 763us/step - loss: 0.3795 - accuracy: 0.8257
Epoch 104/150
62/62 [==============================] - 0s 741us/step - loss: 0.3783 - accuracy: 0.8322
Epoch 105/150
62/62 [==============================] - 0s 722us/step - loss: 0.3777 - accuracy: 0.8306
Epoch 106/150
62/62 [==============================] - 0s 704us/step - loss: 0.3778 - accuracy: 0.8290
Epoch 107/150
62/62 [==============================] - 0s 700us/step - loss: 0.3764 - accuracy: 0.8290
Epoch 108/150
62/62 [==============================] - 0s 728us/step - loss: 0.3766 - accuracy: 0.8306
Epoch 109/150
62/62 [==============================] - 0s 694us/step - loss: 0.3767 - accuracy: 0.8290
Epoch 110/150
62/62 [==============================] - 0s 690us/step - loss: 0.3770 - accuracy: 0.8290
Epoch 111/150
62/62 [==============================] - 0s 692us/step - loss: 0.3764 - accuracy: 0.8257
Epoch 112/150
62/62 [==============================] - 0s 691us/step - loss: 0.3746 - accuracy: 0.8290
Epoch 113/150
62/62 [==============================] - 0s 720us/step - loss: 0.3742 - accuracy: 0.8290
Epoch 114/150
62/62 [==============================] - 0s 738us/step - loss: 0.3744 - accuracy: 0.8322
Epoch 115/150
62/62 [==============================] - 0s 731us/step - loss: 0.3741 - accuracy: 0.8339
Epoch 116/150
62/62 [==============================] - 0s 741us/step - loss: 0.3728 - accuracy: 0.8339
Epoch 117/150
62/62 [==============================] - 0s 685us/step - loss: 0.3730 - accuracy: 0.8339
Epoch 118/150
62/62 [==============================] - 0s 704us/step - loss: 0.3711 - accuracy: 0.8355
Epoch 119/150
62/62 [==============================] - 0s 703us/step - loss: 0.3717 - accuracy: 0.8355
Epoch 120/150
62/62 [==============================] - 0s 748us/step - loss: 0.3710 - accuracy: 0.8339
Epoch 121/150
62/62 [==============================] - 0s 735us/step - loss: 0.3710 - accuracy: 0.8355
Epoch 122/150
62/62 [==============================] - 0s 745us/step - loss: 0.3706 - accuracy: 0.8388
Epoch 123/150
62/62 [==============================] - 0s 690us/step - loss: 0.3695 - accuracy: 0.8371
Epoch 124/150
62/62 [==============================] - 0s 710us/step - loss: 0.3711 - accuracy: 0.8388
Epoch 125/150
62/62 [==============================] - 0s 693us/step - loss: 0.3698 - accuracy: 0.8355
Epoch 126/150
62/62 [==============================] - 0s 697us/step - loss: 0.3683 - accuracy: 0.8339
Epoch 127/150
62/62 [==============================] - 0s 690us/step - loss: 0.3675 - accuracy: 0.8388
Epoch 128/150
62/62 [==============================] - 0s 716us/step - loss: 0.3689 - accuracy: 0.8371
Epoch 129/150
62/62 [==============================] - 0s 700us/step - loss: 0.3675 - accuracy: 0.8339
Epoch 130/150
62/62 [==============================] - 0s 727us/step - loss: 0.3664 - accuracy: 0.8339
Epoch 131/150
62/62 [==============================] - 0s 740us/step - loss: 0.3663 - accuracy: 0.8355
Epoch 132/150
62/62 [==============================] - 0s 733us/step - loss: 0.3658 - accuracy: 0.8274
Epoch 133/150
62/62 [==============================] - 0s 724us/step - loss: 0.3663 - accuracy: 0.8371
Epoch 134/150
62/62 [==============================] - 0s 697us/step - loss: 0.3674 - accuracy: 0.8355
Epoch 135/150
62/62 [==============================] - 0s 718us/step - loss: 0.3645 - accuracy: 0.8420
Epoch 136/150
62/62 [==============================] - 0s 693us/step - loss: 0.3643 - accuracy: 0.8355
Epoch 137/150
62/62 [==============================] - 0s 709us/step - loss: 0.3646 - accuracy: 0.8339
Epoch 138/150
62/62 [==============================] - 0s 694us/step - loss: 0.3642 - accuracy: 0.8355
Epoch 139/150
62/62 [==============================] - 0s 684us/step - loss: 0.3636 - accuracy: 0.8388
Epoch 140/150
62/62 [==============================] - 0s 715us/step - loss: 0.3630 - accuracy: 0.8371
Epoch 141/150
62/62 [==============================] - 0s 714us/step - loss: 0.3640 - accuracy: 0.8322
Epoch 142/150
62/62 [==============================] - 0s 711us/step - loss: 0.3619 - accuracy: 0.8371
Epoch 143/150
62/62 [==============================] - 0s 701us/step - loss: 0.3606 - accuracy: 0.8355
Epoch 144/150
62/62 [==============================] - 0s 694us/step - loss: 0.3610 - accuracy: 0.8371
Epoch 145/150
62/62 [==============================] - 0s 691us/step - loss: 0.3609 - accuracy: 0.8404
Epoch 146/150
62/62 [==============================] - 0s 720us/step - loss: 0.3626 - accuracy: 0.8355
Epoch 147/150
62/62 [==============================] - 0s 691us/step - loss: 0.3615 - accuracy: 0.8355
Epoch 148/150
62/62 [==============================] - 0s 688us/step - loss: 0.3584 - accuracy: 0.8420
Epoch 149/150
62/62 [==============================] - 0s 695us/step - loss: 0.3592 - accuracy: 0.8371
Epoch 150/150
62/62 [==============================] - 0s 722us/step - loss: 0.3576 - accuracy: 0.8420
Out[4]:
<keras.src.callbacks.History at 0x7fa9cd1d0b50>
In [5]:
_, accuracy = model.evaluate(X_train, y_train)

print(f"Accuracy: {accuracy}")
20/20 [==============================] - 0s 742us/step - loss: 0.3535 - accuracy: 0.8420
Accuracy: 0.8420195579528809

The training time of a model will depend, first of all, on the size of the dataset (instances and features), and also on the type of model and its configuration.

The accuracy on the training set is 84.20%.

Step 3: Model prediction

In [6]:
y_pred = model.predict(X_test)
y_pred[:15]
5/5 [==============================] - 0s 797us/step
Out[6]:
array([[2.6933843e-01],
       [5.7993677e-02],
       [7.6992743e-02],
       [4.8524177e-01],
       [3.1675667e-01],
       [6.4265609e-01],
       [7.3388085e-04],
       [2.8476545e-01],
       [8.7694836e-01],
       [4.1469648e-01],
       [1.6080230e-01],
       [8.2213795e-01],
       [2.1518065e-01],
       [5.3527528e-01],
       [1.2730679e-01]], dtype=float32)

As we can see, the model does not return the classes 0 and 1 directly. It returns the probability of class 1, so a post-processing step is needed to turn each probability into a class:

In [7]:
y_pred_round = [round(x[0]) for x in y_pred]
y_pred_round[:15]
Out[7]:
[0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0]
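Rounding is equivalent to applying a decision threshold of 0.5. A vectorized variant with an explicit threshold could look like the sketch below; the helper name to_labels is our own, and the probabilities are the first 15 values from the prediction output above, rounded to four decimals.

```python
import numpy as np

# First 15 predicted probabilities from the output above (rounded to 4 decimals)
y_prob = np.array([
    [0.2693], [0.0580], [0.0770], [0.4852], [0.3168],
    [0.6427], [0.0007], [0.2848], [0.8769], [0.4147],
    [0.1608], [0.8221], [0.2152], [0.5353], [0.1273],
])

def to_labels(probabilities, threshold=0.5):
    """Turn a column of class-1 probabilities into 0/1 labels."""
    return (np.asarray(probabilities).ravel() > threshold).astype(int)

labels = to_labels(y_prob)
print(labels.tolist())  # [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0]

# A lower threshold flags more patients as positive
labels_04 = to_labels(y_prob, threshold=0.4)
print(int(labels_04.sum()))  # 6
```

Making the threshold explicit is useful when false negatives and false positives do not have the same cost.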

From the raw predictions alone it is very difficult to know whether the model is getting it right or not. To find out, we must compare them with the true labels. There are a large number of metrics to measure how well a model predicts, including accuracy, which is the fraction of predictions that the model made correctly.

In [8]:
from sklearn.metrics import accuracy_score

accuracy_score(y_test, y_pred_round)
Out[8]:
0.7272727272727273
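Since accuracy is simply the fraction of correct predictions, it can also be computed by hand. The sketch below uses made-up labels purely for illustration; it is not a replacement for accuracy_score.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Hypothetical labels, for illustration only (not the Pima test set)
y_true_toy = [0, 1, 1, 0, 1]
y_pred_toy = [0, 1, 0, 0, 1]

toy_accuracy = accuracy(y_true_toy, y_pred_toy)
print(toy_accuracy)  # 0.8 (4 of 5 predictions are correct)
```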

Step 4: Saving the model

Once we have the model we were looking for (presumably after hyperparameter optimization), we need to store it in our directory to be able to use it in the future.

In [9]:
model.save("keras_8-12-8-1_42.keras")

Giving the model an explanatory name is vital: if we lose the code that generated it, we will still know its architecture (here 8-12-8-1, because it has 8 neurons in the input layer, 12 and 8 in the two hidden layers, and one neuron in the output layer) and the seed needed to replicate its random components, which we record by appending the number 42 to the file name.
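If the naming convention is followed consistently, the architecture and the seed can be recovered from the file name alone. The helper below is a hypothetical sketch that assumes names of the form library_architecture_seed.keras, as in the example above.

```python
from pathlib import Path

def parse_model_name(file_name):
    """Split a name such as 'keras_8-12-8-1_42.keras' into its parts."""
    stem = Path(file_name).stem                 # drops the '.keras' extension
    library, architecture, seed = stem.split("_")
    return {
        "library": library,
        "layers": architecture.split("-"),      # kept as text, e.g. '28x28'
        "seed": int(seed),
    }

info = parse_model_name("keras_8-12-8-1_42.keras")
print(info)  # {'library': 'keras', 'layers': ['8', '12', '8', '1'], 'seed': 42}
```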

Image set classification

The following is a simple example of how to train a neural network to classify images from the MNIST dataset. MNIST is a dataset of images of handwritten digits, from 0 to 9.

Step 1: Reading the data set

In [10]:
from tensorflow.keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize the data (transform pixel values from 0-255 to 0-1)
X_train, X_test = X_train / 255.0, X_test / 255.0

The pixel values of the images are normalized to be in the range 0 to 1 instead of 0 to 255.

Step 2: Model initialization and training

The architecture of the neural network is defined. In this case, we are using a simple sequential model with a flattening layer that transforms 2D images into 1D vectors, a dense layer with 128 neurons and an output layer with 10 neurons.
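What the flattening layer does to a single image, and how many parameters the two dense layers then need, can be sketched with NumPy. The all-zeros image is only a placeholder, and the counts assume fully connected layers with one bias per neuron.

```python
import numpy as np

image = np.zeros((28, 28))    # placeholder for one 28x28 grayscale image
flat = image.reshape(-1)      # flattening: 28 * 28 = 784 values in one vector
print(flat.shape)             # (784,)

# Parameters = weights (inputs * neurons) + biases (one per neuron)
hidden_params = flat.size * 128 + 128    # dense layer with 128 neurons
output_params = 128 * 10 + 10            # output layer with 10 neurons
total_params = hidden_params + output_params

print(hidden_params)  # 100480
print(output_params)  # 1290
print(total_params)   # 101770
```

The flattening layer itself has no parameters: it only rearranges the pixels.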

Below is an alternative way to create an ANN: instead of adding the layers one by one, we pass them all as a list to Sequential. Both approaches are valid:

In [11]:
from tensorflow.keras.layers import Flatten

set_random_seed(42)

model = Sequential([
  # Layer that flattens the 28x28 pixel input image to a vector of 784 elements
  Flatten(input_shape = (28, 28)),
  # Dense hidden layer with 128 neurons and ReLU activation function
  Dense(128, activation = "relu"),
  # Output layer with 10 neurons (one for each digit from 0 to 9)
  Dense(10)
])

We also compile the network to define the optimizer and the loss function, as we did before:

In [12]:
from tensorflow.keras.losses import SparseCategoricalCrossentropy

model.compile(optimizer = "adam", loss = SparseCategoricalCrossentropy(from_logits = True), metrics = ["accuracy"])
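Because the output layer has no activation function, the network returns raw scores (logits), and from_logits = True tells the loss to apply the softmax itself. The NumPy sketch below illustrates the idea with made-up logits for a 3-class problem; it shows the standard formulas and is not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 per row."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_categorical_crossentropy(labels, logits):
    """Mean negative log-probability assigned to the true integer label."""
    probs = softmax(np.asarray(logits, dtype=float))
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return float(np.mean(-np.log(picked)))

# Hypothetical logits for two samples and three classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])   # "sparse" means integer labels, not one-hot vectors

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, logits)
print(probs)
print(loss)
```

The second sample, with equal scores for every class, contributes log(3) to the loss because the model is effectively guessing.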

The model is trained on the training set for a certain number of epochs. Here we do not set the batch_size parameter, so Keras uses its default value of 32, which is why each epoch shows 1875 steps (60000 images / 32):

In [13]:
model.fit(X_train, y_train, epochs = 5)
Epoch 1/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.2530 - accuracy: 0.9276
Epoch 2/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.1111 - accuracy: 0.9671
Epoch 3/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.0759 - accuracy: 0.9757
Epoch 4/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.0566 - accuracy: 0.9831
Epoch 5/5
1875/1875 [==============================] - 2s 1ms/step - loss: 0.0432 - accuracy: 0.9865
Out[13]:
<keras.src.callbacks.History at 0x7fa9704ed010>
In [14]:
_, accuracy = model.evaluate(X_train, y_train)

print(f"Accuracy: {accuracy}")
1875/1875 [==============================] - 1s 669us/step - loss: 0.0441 - accuracy: 0.9858
Accuracy: 0.9858166575431824

Step 3: Model prediction

In [15]:
test_loss, test_acc = model.evaluate(X_test,  y_test, verbose=2)

print('\nTest accuracy:', test_acc)
313/313 - 0s - loss: 0.0841 - accuracy: 0.9751 - 271ms/epoch - 867us/step

Test accuracy: 0.9750999808311462
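To obtain the predicted digit for each image, we take the position of the largest of the 10 output values. The sketch below uses made-up logits for two images; with the trained model, the same idea would apply to the array returned by model.predict(X_test).

```python
import numpy as np

# Hypothetical logits for two images over the 10 digit classes (0 to 9)
logits = np.array([
    [-1.2, 0.3, 4.1, 0.0, -0.5, 0.2, -2.0, 1.1, 0.4, -0.3],
    [0.1, 5.0, -0.7, 0.2, 0.0, -1.5, 0.3, 0.9, 1.2, -0.4],
])

# The index of the highest score in each row is the predicted digit
predicted_digits = np.argmax(logits, axis=1)
print(predicted_digits.tolist())  # [2, 1]
```

No softmax is needed for this step, since it does not change which class has the highest score.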

Step 4: Saving the model

Once we have the model we were looking for (presumably after hyperparameter optimization), we need to store it in our directory to be able to use it in the future.

In [16]:
model.save("keras_28x28-128-10_42.keras")

Adding an explanatory name to the model is vital, since in the case of losing the code that has generated it we will know what architecture it has (in this case we say 28x28-128-10 because it has an input layer of 28 x 28 pixels, 128 neurons in the only hidden layer it has and 10 neurons in the output layer) and also the seed to replicate the random components of the model, which in this case we do by adding a number to the file name, 42.