Which Model to Use?
February 19, 2023 | Author: Hafiz, CTO at Enygma

Over the past three decades (since the 90s), AI has become more and more popular, following a boom in hardware technology that keeps evolving toward larger storage, larger memory, faster speeds, and lower prices. By now, so many ML models have been proposed by research papers around the world that a model can become "obsolete" just months after release. But "obsolete" is not the best term for this scenario: each model has its own strengths and weaknesses across various aspects, and there is a wide spectrum we can look at to choose the right one.

What is Your Model Constraint?

Before we can choose a model, we first need to understand the project requirements, and from there we can extract the constraints. These are some of the important ones:

Accuracy

Does your model need to be super accurate in order to have a reliable system? Can it tolerate some false positives? Can it tolerate some false negatives? Is 70% accuracy enough? Or does it need to be >80%? Or perhaps >90%?

Obviously, the higher the better. But sometimes we don't have the privilege of getting super high accuracy, especially in situations where the data is limited, or where the labels are vague and subject to interpretation. It is important to identify the end-goal threshold.
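To make "tolerating false positives vs. false negatives" concrete, here is a minimal sketch with made-up labels and predictions. It computes accuracy alongside precision (which drops as false positives pile up) and recall (which drops as false negatives pile up), so you can check each against your target threshold:

    # Minimal sketch: accuracy alone can hide false positives/negatives.
    # The labels and predictions below are hypothetical placeholders.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # ground truth
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # model output

    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp)   # how many flagged positives are real
    recall = tp / (tp + fn)      # how many real positives were caught

    print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")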

Inference Speed

Does it need to be fast? Or does it need to be realtime-fast? What is the longest acceptable inference time or processing time?

I emphasize the keyword "processing" because that is what is often overlooked when deploying a model in production. Processing consists of data preprocessing + data validation + model inference + model output validation + data upload + other relevant operations. Say the maximum acceptable processing time is 2 seconds but model inference alone already takes 1.5 seconds; you would not have enough time to complete all the other operations. You would need a faster model, say one with 0.5 seconds of inference time, to keep everything within the constraint.
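As a rough illustration, here is a minimal sketch that times each stage of such a pipeline and checks the total against a budget. The stage functions and the 2-second budget are hypothetical stand-ins for your real pipeline:

    # Minimal sketch: checking the end-to-end processing budget, not just inference.
    import time

    # Hypothetical stand-ins for the real pipeline stages; replace with your own.
    def preprocess(x): return x
    def validate(x): return x
    def infer(x): time.sleep(1.5); return x   # pretend the model alone takes ~1.5s
    def check_output(x): return x
    def upload(x): return x

    BUDGET_SECONDS = 2.0  # maximum acceptable end-to-end processing time (assumed)

    def process(raw_input):
        stages = [("preprocess", preprocess), ("validate", validate), ("inference", infer),
                  ("output validation", check_output), ("upload", upload)]
        total, data = 0.0, raw_input
        for name, fn in stages:
            start = time.perf_counter()
            data = fn(data)
            elapsed = time.perf_counter() - start
            total += elapsed
            print(f"{name}: {elapsed:.3f}s")
        status = "within" if total <= BUDGET_SECONDS else "over"
        print(f"total: {total:.3f}s ({status} the {BUDGET_SECONDS}s budget)")
        return data

    process("dummy input")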

This is even more critical if you are running inference over a huge series of data. Say you are doing object detection on the frames of a video. The video runs at 10 frames per second and your inference time per frame is 1 second. A 1-minute video (600 frames) would take up to 10 minutes to process! That is not practical at all: very long processing time and very high cost, since you need to run the server longer. Ideally you want the processing time to be at most the video duration. In this case, you would want the inference time per frame to be lower than 100 ms to make it production-ready. Of course, it all depends on the situation and the model complexity.
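The back-of-the-envelope math behind that example, as a small sketch (the frame rate, duration, and per-frame latency are the assumed numbers from above):

    # Minimal sketch: throughput math for video inference.
    fps = 10                    # video frame rate (assumed)
    duration_s = 60             # video length in seconds (assumed)
    latency_per_frame_s = 1.0   # measured inference time per frame (assumed)

    frames = fps * duration_s                           # 600 frames
    total_processing_s = frames * latency_per_frame_s   # 600 s = 10 minutes
    realtime_budget_per_frame_s = 1.0 / fps             # 0.1 s = 100 ms per frame

    print(f"frames to process    : {frames}")
    print(f"total processing time: {total_processing_s / 60:.1f} minutes")
    print(f"per-frame budget     : {realtime_budget_per_frame_s * 1000:.0f} ms to keep up with the video")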

Deployment Environment

Where do you need to deploy the model? On the cloud or on edge devices? Running on CPU or GPU? x86 or ARM? Specifically, which edge device model will be used?

The cloud tends to be flexible and offers far more compute power, while edge devices are very limited. Certain edge devices are also limited to specific machine learning libraries. This is ultimately the biggest reason to answer these questions before training the model.

So Which Model Should I Use?

Start with Something Simple

Rule of thumb: the simpler, the better. Even if your marketing team or even your CEO sells it as the "cutting-edge state-of-the-art deep learning AGI model", just ignore it; it's all for marketing purposes. On your side, focus on your end goal. People don't care how you do it, they only care whether you get the job done or not. And you also need to do it in the most efficient way.

Start with something simple or small. This is generally the flow I usually follow:

  1. If-else logic
  2. Mathematical formula
  3. Non-deep learning model (logistic regression, decision trees, clustering algorithms, etc.)
  4. Deep learning model
  5. "AGI"-level model (LLM, multi-input, multi-output model)

Based on the list above, the first two steps are not even machine learning. If your problem can be solved with just if-else logic or some simple mathematical calculations, why would you even need to train a model? Less effort, less development time, and ultimately less cost (isn't that what your CEO wants to hear?). Non-ML solutions are also easier to deploy (you can even run them on your frontend/browser), easier to maintain, easier to tune, and easier to debug, unlike ML models that behave like a black box, full of mystery.
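As an illustration of steps 1 and 2, here is a minimal sketch of a rule-based "model" for a hypothetical spam-flagging task (the keywords and threshold are made up). If something this simple meets your accuracy constraint, there is no ML to train, deploy, or debug:

    # Minimal sketch: a rule-based baseline (steps 1-2), no ML involved.
    # The keywords and threshold are hypothetical; tune them for your own problem.
    SPAM_KEYWORDS = {"free", "winner", "click here", "limited offer"}

    def is_spam(message: str, threshold: int = 2) -> bool:
        text = message.lower()
        hits = sum(keyword in text for keyword in SPAM_KEYWORDS)
        return hits >= threshold  # simple score + cutoff: the "mathematical formula" step

    print(is_spam("You are a WINNER! Click here for your FREE prize"))  # True
    print(is_spam("Meeting moved to 3pm tomorrow"))                     # False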

Then gradually move to more complex approaches when the simpler ones won't cut it or produce sub-par results. If you can skip ML models altogether, that is the best. If you can't, business as usual.

Follow the Constraints and Evaluate the Trade-offs

We have a wide range of models to choose from; for convolutional-based models we have AlexNet, VGG, Inception V3, MobileNet V3, and ResNet, to name a few. Even within a single architecture, we have variations in size, such as ResNet18, ResNet34, ResNet50, and ResNet101, where the number represents the number of layers in the model.

Each architecture has its own pros and cons. You need to learn, at least in theory, how it works and then cross-match it with your situation: how it learns data patterns, how it solves certain known problems during training, what the researchers intended it for, and how it performs compared to other models.

The more layers, the better the prediction (not always), as more latent features are extracted from the raw data. But that also means inference takes longer. This is why we need to ask ourselves the questions specified earlier: how much accuracy is good enough and how fast the model should be. There will always be trade-offs. High accuracy but super slow, or poor accuracy but super fast; perhaps the best one is somewhere in the middle. You might need to do hyperparameter tuning to identify the best one.
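For example, here is a minimal sketch that compares inference latency across ResNet sizes on whatever hardware you run it on. It assumes a recent torchvision is installed, and the weights are left random since we only care about speed here, not accuracy:

    # Minimal sketch: comparing inference latency of ResNet variants on the current machine.
    # Weights are random (weights=None) because we only measure speed, not accuracy.
    import time
    import torch
    from torchvision import models

    variants = {
        "resnet18": models.resnet18,
        "resnet34": models.resnet34,
        "resnet50": models.resnet50,
        "resnet101": models.resnet101,
    }

    dummy = torch.randn(1, 3, 224, 224)  # one 224x224 RGB image

    for name, builder in variants.items():
        model = builder(weights=None).eval()
        with torch.no_grad():
            model(dummy)  # warm-up run
            start = time.perf_counter()
            for _ in range(10):
                model(dummy)
            latency_ms = (time.perf_counter() - start) / 10 * 1000
        print(f"{name}: {latency_ms:.1f} ms per image")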

If your fastest model can also achieve a very good result, I see no reason not to use it. Faster inference -> faster processing -> better user experience and lower computation cost.

Check Your ML Libraries Too

The trend nowadays is to use PyTorch for deep learning. It might suit most environments (x86, ARM, and so on), but that is not always the case. Devices like the Nvidia Jetson might support it, but devices from Coral AI only support TensorFlow Lite. So if you are planning to use Coral AI, you need to train your model on TensorFlow instead of PyTorch.
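For reference, a minimal sketch of converting a trained TensorFlow model to TensorFlow Lite (the saved-model path is a placeholder). Coral's Edge TPU additionally requires full integer quantization and a pass through its edgetpu_compiler, which is omitted here:

    # Minimal sketch: converting a TensorFlow SavedModel to TensorFlow Lite.
    # "my_saved_model/" is a placeholder path to your exported model.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model("my_saved_model/")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default size/latency optimizations
    tflite_model = converter.convert()

    with open("model.tflite", "wb") as f:
        f.write(tflite_model)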

When deploying on edge devices, you also need to understand that they have limited compute power, especially compared to the cloud, which usually uses industrial-grade GPUs. Your model might run much slower on edge devices than on your development machine. If you are planning to deploy on edge, keep your model as small as possible, or apply model optimization techniques such as pruning, knowledge distillation, quantization, and so on.
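As one example of those techniques, a minimal sketch of post-training dynamic quantization in PyTorch; the model here is a toy stand-in, and the actual size and speed gains on your real model and hardware will vary:

    # Minimal sketch: post-training dynamic quantization of a toy PyTorch model.
    # nn.Linear layers are converted to int8, shrinking the model and often speeding up CPU inference.
    import torch
    import torch.nn as nn

    model = nn.Sequential(          # toy stand-in for your real model
        nn.Linear(128, 64),
        nn.ReLU(),
        nn.Linear(64, 2),
    ).eval()

    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 128)
    print(quantized(x))  # same interface as the original model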

Conclusion

Identifying project requirements and model constraints is a very important step you need to take before you even touch your data. It is much cheaper to spend more development time on research upfront than to scrap everything and redo it after realising your model is not suitable for production. After that, as long as you stay within the constraints, you are free to do whatever you want!