Training is the most time-consuming part of developing a deep model, both in terms of configuration and computational cost. Having gone through the struggle ourselves, we list below a few lessons that we hope will make your experience much smoother!

  1. Start Simple

When training your deep model, the choice of hyperparameters (batch size, learning rate, ...) matters!

If the task or dataset is new, or the model or approach differs significantly from prior work, it is best to start with simple choices: a reasonable batch size (e.g. 16 or 32), a small learning rate (e.g. 1e-5), and slight data augmentation (e.g. image flipping, rotation). Once your model outputs something meaningful, you can tune these choices further.
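As a concrete sketch, a conservative starting point might look like the snippet below. The exact values and the `augment` helper are illustrative, not prescriptive; it is written with NumPy so it stays framework-agnostic:

```python
import numpy as np

# Conservative starting hyperparameters (illustrative values; adjust per task).
config = {
    "batch_size": 32,       # a reasonable default; drop to 16 if memory is tight
    "learning_rate": 1e-5,  # small, to avoid early divergence
}

def augment(image, rng):
    """Slight data augmentation: random horizontal flip and 90-degree rotation."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    if rng.random() < 0.5:
        image = np.rot90(image)
    return image

rng = np.random.default_rng(0)
img = np.arange(12).reshape(3, 4)   # stand-in for a real image
aug = augment(img, rng)
```

The point is not these specific numbers but that every choice is boring on purpose: once this setup produces something meaningful, each knob can be tuned one at a time.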

If you are tackling a certain task (e.g. semantic segmentation) and have a complex architecture in mind, don't try to train that complex model first. Instead, start with a simple vanilla architecture. You may be surprised by how well it works, and it may only need minor tweaks to boost its performance.
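To make the baseline-first idea concrete, here is a toy sketch: before training any complex segmentation network, a per-pixel logistic regression on raw intensities already establishes a floor to beat. Everything below (the synthetic data, the threshold, the training loop) is hypothetical and purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_pixel_baseline(images, masks, lr=0.1, steps=200):
    """Fit w, b so that sigmoid(w * pixel + b) approximates the mask."""
    x = images.ravel().astype(float)
    y = masks.ravel().astype(float)
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = sigmoid(w * x + b)
        grad = p - y                 # gradient of binary cross-entropy w.r.t. the logit
        w -= lr * np.mean(grad * x)
        b -= lr * np.mean(grad)
    return w, b

# Synthetic toy data: bright pixels (> 0.5) belong to the foreground class.
rng = np.random.default_rng(0)
images = rng.random((10, 8, 8))
masks = (images > 0.5).astype(int)

w, b = train_pixel_baseline(images, masks)
preds = (sigmoid(w * images + b) > 0.5).astype(int)
accuracy = (preds == masks).mean()
```

If a complex architecture barely beats a baseline this simple, the extra complexity is not paying for itself yet.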

  2. Don't Reinvent the Wheel

  3. Get Yourself Familiar with Repositories You Like

There are a lot of repositories out there; explore them and see what works best for you and your model. Examples: https://github.com/albumentations-team/albumentations, https://github.com/scikit-image/scikit-image, https://github.com/geopandas/geopandas, https://github.com/rasterio/rasterio, ...

  4. Make Sure You Use Data Augmentation Wisely

  5. Watch Your Pre-Trained Weights