Machine learning is quite a fashionable topic these days, and companies that leverage machine learning can have a significant competitive advantage over those that do not. At a high level, a successful machine learning project needs a machine learning expert, domain knowledge, machine-learning algorithms, and data. As executives and investors are trying to use machine learning to gain an edge, it is important to understand what makes a successful machine learning project. Our hope is that executives untrained in the art might find guidance in navigating the many decisions related to a machine learning project.
Machine Learning Expert vs Domain Expert
The first frequent mistake is to assign the project to either a domain expert or a machine learning expert alone, rather than pairing the two. ML is hard and takes a trained expert to perform successfully. A domain expert usually cannot select the appropriate algorithm and tends to choose whichever one is most fashionable at the moment. It is unlikely that an untrained domain expert could choose the correct algorithm for a problem and apply it with enough skill that the results would be valid and reliable. A domain expert also lacks a full understanding of the nuances of a machine learning pipeline, and there is not enough time to learn this skill within the scope of a project.
Conversely, machine learning experts do not have the domain expertise needed to make sense of the data, and cannot learn a field deeply enough to be effective within a few short months. Therefore, an executive or investor must make sure that both a domain expert and a machine learning expert are engaged to work together on ML projects.
Algorithms
A second mistake has to do with algorithms. By now, there are many machine learning algorithms to choose from, and each is appropriate only for certain classes of problems. The choice of algorithm should lie with the ML expert, of course. However, it may be tempting to invent new algorithms where existing ones could do the job. An ill-prepared expert may not be aware of the many open source and freely available alternatives. Here the mistake would be to authorize a project that involves inventing a new machine learning algorithm before all available alternatives have been explored. Because so many good algorithms are freely available, new ML algorithms are seldom the source of competitive differentiation.
Similarly, if a startup is pitching a new algorithm as its competitive advantage, thorough due diligence is advised to confirm this and to make sure that existing algorithms are not mathematically equivalent to the claimed innovation.
Data
The third mistake has to do with data, which can be categorized as freely available or proprietary. Competitive differentiation is seldom found in data that is freely available on the internet. Free data combined with open source machine learning algorithms does not lead to a competitive advantage, because any competitor can assemble the same combination. Only data that was collected in a proprietary way and kept secret can form the basis of such an advantage.
Data must also be handled expertly; if it is not, it presents many potential pitfalls. There are many ways to represent data. The data must be properly cleaned, normalized, and fused (enriched) with related information. Where the needed data does not exist, new data may have to be gathered in the field in the proper way. This last point reinforces our earlier point about the need to deploy machine learning experts in addition to domain experts.
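For readers who want a concrete picture of the cleaning, normalizing, and fusing steps mentioned above, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sensor records, the field names (site, temp_c, area), and the lookup table are invented, and a real project would choose its own cleaning rules and normalization method with the help of the ML and domain experts.

```python
from statistics import mean, pstdev

# Hypothetical raw records; field names and values are invented for illustration.
raw_readings = [
    {"site": "A", "temp_c": "21.5"},
    {"site": "B", "temp_c": ""},      # missing value
    {"site": "C", "temp_c": "19.0"},
    {"site": "D", "temp_c": "n/a"},   # unparseable value
    {"site": "E", "temp_c": "25.0"},
]

# Hypothetical related information to fuse in, keyed by site.
site_info = {"A": "urban", "C": "rural", "E": "urban"}


def clean(records):
    """Drop records whose temperature is missing or cannot be parsed as a number."""
    cleaned = []
    for rec in records:
        try:
            value = float(rec["temp_c"])
        except ValueError:
            continue
        cleaned.append({"site": rec["site"], "temp_c": value})
    return cleaned


def normalize(records):
    """Add a z-score: zero mean and unit (population) standard deviation."""
    values = [r["temp_c"] for r in records]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All values identical: there is no spread to rescale.
        return [{**r, "temp_z": 0.0} for r in records]
    return [{**r, "temp_z": (r["temp_c"] - mu) / sigma} for r in records]


def fuse(records, lookup):
    """Enrich each record with a related attribute; 'unknown' when no match exists."""
    return [{**r, "area": lookup.get(r["site"], "unknown")} for r in records]


prepared = fuse(normalize(clean(raw_readings)), site_info)
for row in prepared:
    print(row)
```

Even in this toy version, each step hides a judgment call: whether to drop or impute missing values, which normalization suits the algorithm, and how to treat records with no match in the related data. Those calls are where expert handling matters.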
Takeaways
In conclusion, what makes machine learning a competitive advantage is proprietary data handled by a machine learning expert working in combination with a domain expert. Investors and executives should weigh carefully whether to invest in a project that does not meet these prerequisites. Sidespin Group provides world-class machine learning experts for various projects, including machine learning expert witnesses for litigation support.