The global artificial intelligence market size was valued at USD 39.9 billion in 2019 and is expected to grow at a compound annual growth rate (CAGR) of 42.2% from 2020 to 2027.
Continuous research and innovation led by the tech giants are driving the adoption of advanced technologies in industry verticals such as automotive, healthcare, retail, finance, and manufacturing. Technology has always been an essential element for these industries, but AI has brought it to the center of the organization.
Following these market trends, AI-based solutions will be present in almost every aspect of our lives, from mobility, homes, and social interactions (social media, e-commerce, etc.) to crucial life-saving medical applications, affecting the quality of the human decision-making process. AI will likely transform our living standards in the coming years, but an AI model with near-perfect accuracy can also be problematic. As the accuracy of a model goes up, its ability to explain why it arrived at a certain answer often goes down, raising an issue companies must confront: the model’s lack of transparency and, therefore, our capacity to trust its results. In other words, the model’s lack of clarity causes the decision maker to lose control of the situation. So the unavoidable question is: despite all the advantages of AI-based applications, are we 100% comfortable with the current maturity level of market solutions to hand full control of our lives to applications that we probably do not understand?
The market needs more visibility and more transparency to be able to trust models that others are building. Should you trust the model that your cloud provider offers? What about the model embedded in the tool you depend on? What visibility do you have into how the model was put together and how it will be iterated? The answer right now is little to none.
To improve future AI-based developments, Forbes highlights the following problems to be solved by data scientists, developers, solution providers, and customers:
- Unexplainable algorithms: One well-acknowledged issue is that some machine learning algorithms are unexplainable. Deep learning algorithms in particular suffer from this problem. For example, when an image recognition model recognizes a turtle as a rifle, why does the model do so? We don’t actually know, and therefore those looking to take advantage of model weaknesses can exploit such vulnerabilities by throwing these deep learning nets for a loop. Nevertheless, not all machine learning algorithms suffer from the same explainability issues. For example, decision trees are explainable by nature, although when they are used in ensemble methods such as Random Forests, we lose elements of that explainability. So, the first question any model consumer should ask is how explainable the algorithm used to build the model is.
- Lack of visibility into training data sets: Simply knowing whether a given model uses an explainable algorithm is not enough to understand its performance. The strength of a model depends significantly on its training data. If the training data doesn’t represent the process being modeled, then even these highly trained models, built on cleaned and well-labelled data, will perform poorly. So, the next question a model user needs to ask is about the training data.
- Lack of visibility into methods of data selection: Simply having access to the training data does not answer all questions of transparency. Full transparency also means knowing how data was selected from the available training data, and ideally even being able to use the same selection methods on the training data yourself as a model consumer to see what data was included and what data was excluded. You might realize that your use of the model is failing because the data you are using in real life just happens to be the data that was excluded from the data used for training. Without having full visibility into the methods of data selection, a lack of transparency still exists.
- Limited understanding of the bias in training data sets: Many times, models run into problems not because of bad data or even poorly selected data, but because there’s some inherent bias in the training data. There have been many issues with facial recognition models trained on limited data sets due to the bias of model developers, and with loan decision models that have used historically biased data sets to determine availability of credit. Societal biases in data sets are significant and rife, and organizations need to find ways to eliminate them when they result in models that perpetuate those biases in unwanted ways.
- Limited visibility into model versioning: Another problem with model transparency is that models might be continuously versioned. Good models should indeed be iterated on so that they can become even better. With cloud-based models, however, seemingly random and spontaneous versioning can cause problems. For full transparency, those producing models should make clear, in a consistent way, how models will be versioned, how frequently new versions will appear, and whether older versions remain available if new ones start to perform poorly.
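A model consumer who does get access to the training data can check the data selection and bias points above with very little code. The following is a minimal sketch, not a full audit: the record layout, the `region` field, and the helper names `group_shares` and `unseen_values` are assumptions made up for this illustration.

```python
from collections import Counter


def group_shares(records, field):
    """Return the fraction of records that fall into each value of `field`."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def unseen_values(training_records, production_records, field):
    """Return values of `field` seen in production but absent from training."""
    seen_in_training = {record[field] for record in training_records}
    seen_in_production = {record[field] for record in production_records}
    return seen_in_production - seen_in_training


# Hypothetical data: the training selection happened to exclude one region.
training = [
    {"region": "north"}, {"region": "north"}, {"region": "north"},
    {"region": "south"},
]
production = [{"region": "north"}, {"region": "south"}, {"region": "east"}]

shares = group_shares(training, "region")
missing = unseen_values(training, production, "region")

print(shares)   # how strongly each group is represented in training
print(missing)  # real-life values the model never saw during training
```

A heavily skewed share or a non-empty set of unseen values does not prove that a model is biased, but it is a cheap signal that the real-life data differs from what the model was trained on.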
Unfortunately, almost all of today’s AI-based applications fail to address the problems listed above, and in many cases they follow a black-box business model. The black-box problem was acceptable to some degree in the early days of the technology, but it lost its merit once algorithmic bias was spotted. For example, AI developed to sort resumes disqualified people for certain jobs based on their race, and AI used in banking disqualified loan applicants based on their gender. The data the AI was trained on was not balanced to include enough data on all kinds of people, and the historical bias that lived in the human decisions was passed on to the models.
Another clear example of the lack of transparency in AI-based applications comes from a 2020 CNET report about UK border control systems. The decision to suspend the use of the “streaming tool,” which has been used by the UK Home Office since 2015, comes in direct response to a legal threat by tech accountability organization Foxglove and the Joint Council for the Welfare of Immigrants. Together they allege the tool is racist due to its use of nationality as a basis on which to decide whether applicants are high risk.
According to Cori Crider from Foxglove: “Racial bias in algorithms is a well-documented issue in facial recognition technology, but it’s also widely considered to be a problem in algorithms across the technology industry. Potentially life-changing decisions are partly made by a computer program that nobody on the outside was permitted to see or to test.” Furthermore, she argues, decisions made by the algorithm could have far-reaching implications.
At this point, the main question is how to reach a balance. As with any other computer program, AI needs optimization. To do that, we look at the specific needs of a certain problem and then tune our general model to fit those needs best. When implementing AI, an organization must pay attention to the following four factors:
- Legal needs: If the work requires explainability from a legal and regulatory perspective, there may be no choice but to provide transparency. To achieve that, an organization may have to resort to simpler but explainable algorithms.
- Severity: If the AI is going to be used in life-critical missions, transparency is a must. It is most likely that such tasks are not dependent on AI alone, so having a reasoning mechanism improves the teamwork with human operators. The same applies if AI affects someone’s life, such as algorithms that are used for job applications.
- Exposure: Depending on who has access to the AI model, an organization may want to protect the algorithm from unwanted reach. Explainability can be good even in the cybersecurity space if it helps experts reach a better conclusion. But, if outsiders can gain access to the same source and understand how the algorithm works, it may be better to go with opaque models.
- Data set: No matter the circumstances, an organization must always strive to have a diverse and balanced data set, preferably from as many sources as possible. Eventually, we’ll want to rely on AI as much as we can, and AI is only as smart as the data it is trained on. By cleaning the training data, removing noise and balancing the inputs, we can help to reduce bias and improve the model’s accuracy.
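One simple way to balance the inputs mentioned in the last factor is to downsample every group to the size of the smallest one. The sketch below is only an illustration under stated assumptions: the function name `downsample_to_balance`, the `group` field, and the sample records are all hypothetical, and downsampling is just one of several balancing techniques.

```python
import random
from collections import defaultdict


def downsample_to_balance(records, field, seed=0):
    """Randomly keep the same number of records for every value of `field`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record)
    if not groups:
        return []
    # Every group is cut down to the size of the smallest group.
    target = min(len(members) for members in groups.values())
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, target))
    return balanced


# Hypothetical, deliberately skewed data set: 8 records of group "a", 2 of "b".
records = [{"group": "a", "id": i} for i in range(8)]
records += [{"group": "b", "id": i} for i in range(8, 10)]

balanced = downsample_to_balance(records, "group")

counts = {}
for record in balanced:
    counts[record["group"]] = counts.get(record["group"], 0) + 1
print(counts)
```

Downsampling throws data away, so it trades volume for balance; collecting more data for the under-represented groups is usually preferable when that is possible. The fixed seed also keeps the selection repeatable, which helps anyone who wants to re-run the same selection method later.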
In future posts, we will continue writing about technology and business trends for enterprises. In the meantime, we recommend the following literature to continue your digital transformation journey:
- Designed for Digital: How to Architect Your Business for Sustained Success, by MIT Press
- The Future Is Faster Than You Think: How Converging Technologies Are Transforming Business, Industries, and Our Lives, by Simon & Schuster
- Artificial Intelligence: The Insights You Need, by Harvard Business Review
- The Year in Tech, 2021: The Insights You Need, by Harvard Business Review
- The Deep Learning Revolution, by MIT Press
- Competing in the Age of AI, by Harvard Business Review Press
The objective of this blog is to provide a personal vision of how digital transformation trends will impact our daily activities, businesses, and lifestyles.