VGG16 is outdated. CNNs based on architectures like EfficientNet and MobileNetV3 have superior accuracy. If attention mechanisms are acceptable in the network, Vision Transformers such as MobileViT are excellent.
You may use MobileNet models, as they use depthwise separable convolutions, which have fewer parameters and a shorter execution time than regular convolutions. Moreover, MobileNets are easy to train and set up (tf.keras.applications.* has pre-trained models) and can be used as a backbone for fine-tuning on datasets other than ImageNet.
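To make the parameter savings concrete, here is a rough sketch that counts the weights in one regular convolution versus a depthwise separable one. The layer sizes (3x3 kernel, 64 input channels, 128 output channels) and the function names are made up for illustration and are not taken from any particular MobileNet layer.

```python
def conv_params(kernel, c_in, c_out, bias=False):
    """Weights in a regular kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(kernel, c_in, c_out, bias=False):
    """Weights in a depthwise separable convolution:
    one kernel x kernel filter per input channel (depthwise),
    followed by a 1x1 convolution that mixes channels (pointwise)."""
    depthwise = kernel * kernel * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 -> 128 channels, no bias.
regular = conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(regular, separable, round(regular / separable, 2))  # 73728 8768 8.41
```

For this one layer the separable version needs roughly 8x fewer weights. The exact ratio depends on the kernel size and channel counts.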
Further, you can also explore quantization and weight pruning. These techniques optimize a model to have a smaller memory footprint and a shorter execution time on embedded devices.
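If it helps to see what those two techniques do to the weights, below is a minimal pure-Python sketch of 8-bit affine quantization and magnitude-based pruning on a toy weight list. This is only an illustration of the idea. It is not how TensorFlow Lite or any other toolkit implements it, and the function names and example weights are assumptions of mine.

```python
def quantize_int8(weights):
    """Map floats to int8 values in [-128, 127] with a scale and zero point."""
    lo = min(min(weights), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all weights are 0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]


def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest |w|."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Toy weights, chosen only for illustration.
weights = [-1.0, -0.5, 0.0, 0.25, 0.9, 1.5]

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_by_magnitude(weights, sparsity=0.5)
print(q, max_error, pruned)
```

Quantization stores each weight in 1 byte instead of 4 at the cost of a small rounding error. Pruning leaves zeros that can be compressed or skipped. In practice you would fine-tune or calibrate afterwards to recover accuracy.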
No, try MobileNet.
Try MobileNet or MobileViT.
Look at TensorFlow Lite Model Maker. It will walk you through everything you need to make a lightweight, mobile-friendly image classifier.
https://www.tensorflow.org/lite/models/modify/model_maker/image_classification