1. Auto-reload modules
# Auto-reload modules in the Jupyter notebook so that changes to *.py files don't require manual reloading
%reload_ext autoreload
%autoreload 2
# Display the output of plotting commands inline in the Jupyter notebook frontend
%matplotlib inline
2. Import all main external libraries
# Confirm Python version 3
from platform import python_version
print(python_version())

import os
# Change the working directory to import the fastai libraries
os.chdir('/home/paperspace/fastai/courses/dl1')
%pwd

# Import all main external libraries
from fastai.imports import *
# Transformation library written from scratch in PyTorch.
# Its main purpose is data augmentation, but it also serves more general image transformation purposes
from fastai.transforms import *
# Get a pre-trained model with the fastai library
from fastai.conv_learner import *
# Deep learning models by fastai (http://files.fast.ai/models/)
from fastai.models import *
# Module with the functions needed to download several useful datasets
from fastai.dataset import *
# Module to perform SGDR
from fastai.sgdr import *
# Module to display plots, such as the learning rate
from fastai.plots import *

# Additional libraries for multi-class image classification
from fastai.imports import *
from fastai.torch_imports import *
3. Check NVIDIA GPU framework
# An NVIDIA GPU with the CUDA programming framework is critical; the following command must return True
torch.cuda.is_available()
# Make sure cuDNN, the CUDA deep learning package, is enabled to improve training performance (preferred)
torch.backends.cudnn.enabled
4. Set Parameters
# Example 1: Binary image classification
# PATH is the path to the data
PATH = '/home/paperspace/fastai/courses/SelfCodes/Binary_Image_Classiff_cats_dogs/Binary_Image_Classiff_cats_dogs/data/dogscats/'
# sz is the image size; smaller sizes train faster.
# Start with a small size and increase it once results improve
sz = 224
# Select the deep learning model architecture
arch = resnet34  # for dog breed we try arch = resnext101_64
# bs is the batch size; the default is 64
bs = 58  # start from 28

Information on Architecture
# Different architectures have different numbers of layers, kernel sizes, filters, etc.
# We have been using ResNet34, a great starting point and often a good finishing point,
# because it does not have too many parameters and works well with small datasets.
# Another architecture is ResNext, the second-place winner in last year's ImageNet competition.
# ResNext50 takes twice as long and 2–4 times more memory than ResNet34
5. Observations
a. Observe Folder Structure of path
# Example of cats and dogs
# List the directories of PATH
os.listdir(PATH)
['sample', 'valid', 'models', 'train', 'tmp', 'test1']
# List the directories of 'valid'
os.listdir(f'{PATH}valid')
['cats', 'dogs']

For dog breed
['train.zip', 'sample_submission.csv', '.ipynb_checkpoints', 'test', 'all.zip', 'labels.csv', 'train', 'test.zip']
b. Optional - for a structure similar to dog breed - CSV
c. Observe Files
Folder Structure
- Cats & Dogs
# Observe a file name
os.listdir(f'{PATH}valid/cats')[0]
# Observe a cat picture
img = plt.imread(PATH + 'valid/cats/' + os.listdir(f'{PATH}valid/cats')[0])
plt.imshow(img);
# Observe the image shape
img.shape
Folder Structure
- Dog breed-csv
Read head of file CSV
# Code for multi-breed classification
label_df = pd.read_csv(label_csv)
label_df.head()
Distribution of breed
# Code for multi-breed classification
label_df.pivot_table(index='breed', aggfunc=len).sort_values('id', ascending=False)
For dog - breed CSV
6. Model Development
a. Model Development step1
# Example of cats and dogs
# List the directories of PATH
os.listdir(PATH)
['sample', 'valid', 'models', 'train', 'tmp', 'test1']
# List the directories of 'valid'
os.listdir(f'{PATH}valid')
['cats', 'dogs']
b. Choosing Learning Rate
# Find an optimal learning rate.
# We simply keep increasing the learning rate from a very small value until the loss stops decreasing.
# We should re-run this command whenever the model changes
learn.lr_find()
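The idea of raising the learning rate from a very small value until the loss stops improving can be sketched in plain Python. This is a minimal illustration, not the fastai implementation: the helper names `lr_range_schedule` and `first_diverging_step`, the geometric growth, the stopping factor of 4, and the toy losses are all assumptions made for the example.

```python
def lr_range_schedule(start_lr, end_lr, n_iters):
    """Return n_iters learning rates growing geometrically from start_lr to end_lr."""
    if n_iters < 2:
        raise ValueError("n_iters must be at least 2")
    ratio = (end_lr / start_lr) ** (1.0 / (n_iters - 1))
    return [start_lr * ratio ** i for i in range(n_iters)]


def first_diverging_step(losses, factor=4.0):
    """Return the index of the first loss exceeding factor * best loss so far, or None."""
    best = float("inf")
    for i, loss in enumerate(losses):
        if loss > factor * best:
            return i
        best = min(best, loss)
    return None


# One learning rate per mini-batch, from 1e-5 up to 10
lrs = lr_range_schedule(1e-5, 10.0, 7)

# Hypothetical losses recorded at each of the seven rates (illustration only)
toy_losses = [1.00, 0.95, 0.80, 0.50, 0.30, 0.90, 3.00]
stop = first_diverging_step(toy_losses)
```

In practice one would pick a rate somewhat below the point where the loss is lowest, where the loss is still clearly decreasing.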
c. Model Development Step 2
d. Choosing Learning Rate
# Find an optimal learning rate.
# We simply keep increasing the learning rate from a very small value until the loss stops decreasing.
# We should re-run this command whenever the model changes
learn.lr_find()
Create learner with precompute=True
# So far we have been using a pre-trained network that has already learned to recognize features
# (i.e. we do not want to change the weights it learned),
# so we can pre-compute the activations of the hidden layers and train just the final linear portion.
# To use data augmentation, we have to set learn.precompute=False
learn.precompute=False
Fit Model on Current data
learn.fit(1e-2, 3, cycle_len=1)
Information
# cycle_len enables stochastic gradient descent with restarts (SGDR).
# Small changes to the weights may result in big changes to the loss.
# We want to encourage our model to find parts of the weight space that are both accurate and stable.
# Therefore, from time to time we increase the learning rate (this is the 'restarts' in 'SGDR'),
# which forces the model to jump to a different part of the weight space if the current area is "spiky".
# The basic idea is that as you get closer and closer to the spot with the minimal loss,
# you may want to start decreasing the learning rate (taking smaller steps) to reach exactly the right spot.
# This is called learning rate annealing.
# One approach is simply to pick some kind of functional form.
# A really good functional form is one half of the cosine curve,
# which maintains the high learning rate for a while at the beginning, then drops quickly as you get closer.
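The half-cosine annealing with restarts described above can be sketched as a small standalone function. This is an illustrative sketch rather than the fastai scheduler: the function name `sgdr_lr`, the `lr_min` parameter, and the cycle length of 4 iterations are assumptions chosen for the example.

```python
import math


def sgdr_lr(lr_max, iteration, iters_per_cycle, lr_min=0.0):
    """Cosine-annealed learning rate that restarts at lr_max every cycle."""
    # Position within the current cycle, in [0, 1)
    pos = (iteration % iters_per_cycle) / iters_per_cycle
    # Half of a cosine curve: starts at lr_max and decays towards lr_min
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * pos))


# Two cycles of four iterations each; the rate jumps back to 1e-2 at iteration 4
schedule = [sgdr_lr(1e-2, i, iters_per_cycle=4) for i in range(8)]
```

Within each cycle the rate falls slowly at first and faster towards the middle of the cycle; the jump back to `lr_max` at the start of the next cycle is the restart.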
Observe Learning rate
LR with iterations
# As the learning rate is increased with each mini-batch, the loss decreases at first and then gets worse
learn.sched.plot_lr()
LR with loss
# Plot of loss vs learning rate
learn.sched.plot()
Save Model
# Save model
learn.save('224_lastlayer')
Load Model
# Load saved model
learn.load('224_lastlayer')
View model arch
learn.summary()
e. Fine-tuning and differential learning rate annealing
Unfreeze Conv Filters
learn.unfreeze()
Set Differential Learning Rate
lr=np.array([1e-4,1e-3,1e-2])
Fit Model on Current data
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)

Check Accuracy
epoch  trn_loss  val_loss  accuracy
0      0.045694  0.025966  0.991
1      0.040596  0.019218  0.993
2      0.035022  0.018968  0.9925
3      0.027045  0.021262  0.992
4      0.027029  0.018686  0.993
5      0.022999  0.017357  0.993
6      0.02003   0.017553  0.993
Information
# Earlier we said 3 is the number of epochs, but it is actually the number of cycles.
# So with cycle_len=2 it would do 3 cycles of 2 epochs each (i.e. 6 epochs).
# Then why did it run 7 epochs here? Because of cycle_mult.
# cycle_mult=2 multiplies the length of the cycle after each cycle
# (1 epoch + 2 epochs + 4 epochs = 7 epochs).
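The epoch arithmetic above can be checked with a few lines of Python. The helper names `cycle_lengths` and `total_epochs` are hypothetical and exist only for this illustration; they mirror the meaning of the `cycle_len` and `cycle_mult` arguments as described in these notes.

```python
def cycle_lengths(n_cycles, cycle_len=1, cycle_mult=1):
    """Length in epochs of each cycle when every cycle is cycle_mult times the previous one."""
    return [cycle_len * cycle_mult ** i for i in range(n_cycles)]


def total_epochs(n_cycles, cycle_len=1, cycle_mult=1):
    """Total number of epochs run across all cycles."""
    return sum(cycle_lengths(n_cycles, cycle_len, cycle_mult))


# learn.fit(lr, 3, cycle_len=1, cycle_mult=2): cycles of 1, 2 and 4 epochs
lengths = cycle_lengths(3, cycle_len=1, cycle_mult=2)
epochs_with_mult = total_epochs(3, cycle_len=1, cycle_mult=2)

# 3 cycles of 2 epochs each, with no multiplier
epochs_fixed = total_epochs(3, cycle_len=2)
```

Here `epochs_with_mult` matches the seven rows (epochs 0 to 6) in the accuracy table above.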
Save Model
# Save model
learn.save('224_lastlayer')
Load Model
# Load saved model
learn.load('224_lastlayer')
7. Test time Augmentation
8. Analyzing Results
log_preds,y = learn.TTA()
probs = np.mean(np.exp(log_preds),0)
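To see what the averaging step does, here is a small NumPy sketch on made-up numbers. It assumes that `log_preds` holds log-probabilities with shape (augmented views, images, classes), which is why the mean is taken over axis 0; the array values and labels are hypothetical and do not come from the cats and dogs model.

```python
import numpy as np

# Hypothetical log-probabilities: 3 augmented views, 2 images, 2 classes
log_preds = np.log(np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.7, 0.3], [0.2, 0.8]],
]))
y = np.array([0, 1])  # hypothetical true labels

# Convert back to probabilities and average over the augmentation axis
probs = np.mean(np.exp(log_preds), axis=0)
# Most likely class per image
preds = np.argmax(probs, axis=1)
# Fraction of images classified correctly
accuracy = float(np.mean(preds == y))
```

Averaging the probabilities across views gives one prediction per image, which can then be compared against `y` to analyze the results.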
