We would like to start our weekly post with some figures of merit that are no longer a secret for executives, digital leaders and companies nowadays. Machine learning and artificial intelligence-based solutions will experience one of the highest Compound Annual Growth Rates (CAGR) in the coming years. According to a Forbes report, the global machine learning market alone was projected to grow at a CAGR of 42.8% from 2018 to 2024. In other words, the global machine learning market will be worth $30.6B in four years. Meanwhile, global artificial intelligence software revenue will grow from $10.1B in 2018 to $126.0B by 2025, achieving a CAGR of 43.41%.
Given this expected exponential market growth, machine learning and artificial intelligence applications will be adopted by businesses across industries. A survey published by MMC Ventures in 2019 shows the distribution of AI-based applications across sectors: one in ten enterprises now uses ten or more AI applications, with chatbots, process optimization and fraud analysis leading the survey's top use cases. Prevalent applications include consumer/market segmentation (15%), computer-assisted diagnostics (14%), call centre virtual assistants (12%), sentiment analysis/opinion mining (12%), face detection/recognition (11%), and HR applications (10%).
To get a better idea of the market impact: in the first quarter of 2019 alone, $28.5B was invested in machine learning applications, leading all other AI investment categories. In total, over $82B was invested across all AI categories, as shown in the following figure.
After analyzing the current and predicted market size for machine learning and artificial intelligence applications, it is worth surveying the most common use cases for customers and companies today. In this direction, Algorithmia has published an interesting report about the ways companies are using machine learning. The results shown in the Algorithmia dashboard anticipate a trend toward using ML to automate time-consuming processes and to reduce the human resources needed for a given task.
At this point, based on the figures reported above, we can be almost certain that machine learning and artificial intelligence will directly impact our businesses, lives and society in the coming years, empowering human-centered decision making, data-driven understanding of phenomena and analytics-based competitive advantages across many sectors. But, as with many other technological advances in human history, AI-based solutions have a flip side.
In parallel with the expansion of AI-based solutions from the lab to the market, the AI community has begun to complain about the lack of transparency, quality and versioning control, and ethical safeguards in the machine learning and artificial intelligence solutions in the marketplace. According to many experts, almost all commercial AI models today lack transparency. This phenomenon is known as black box AI, and it is set to become quite problematic in the coming years.
The black box problem was acceptable to some degree in the early days of the technology but lost its merit when algorithmic bias was spotted. For example, AI developed to sort resumes disqualified people from certain jobs based on their race, and AI used in banking disqualified loan applicants based on their gender. The data the AI was trained on was not balanced to include enough data on all kinds of people, and the historical bias embedded in human decisions was passed on to the models.
Below, we would like to share three clear examples of AI's lack of transparency reported this year by authorities in different countries and regions:
- US Predictive policing algorithms are racist. They need to be dismantled: Yeshimabeit Milner, director of Data for Black Lives, a grassroots digital rights organization, told MIT Technology Review that there are two broad types of predictive policing tools. Location-based algorithms draw on links between places, events, and historical crime rates to predict where and when crimes are more likely to happen, for example in certain weather conditions or at large sporting events. The tools identify hot spots, and the police plan patrols around these tip-offs. The problem lies with the data the algorithms feed upon. For one thing, predictive algorithms are easily skewed by arrest rates. According to US Department of Justice figures, you are more than twice as likely to be arrested if you are Black than if you are white. A Black person is five times as likely to be stopped without just cause as a white person.
- An Algorithm Determined UK Students' Grades. Chaos Ensued: In Scotland, the government was forced to completely change tack after tens of thousands of students were downgraded by an algorithm that changed grades based on a school's previous performance and other factors. Anticipating similar scenes for today's A-level results, the government in England has introduced what it's calling a 'triple lock' whereby, via stages of appeals, students will effectively get to choose their grade from a teacher assessment, their mock exam results, or a resit to be taken in the autumn. Forget the triple lock: ethnic minority students from poorer backgrounds could be hit with a triple whammy. First, their teacher assessments may be lower than those of white students because of unconscious bias, argues Pran Patel, a former assistant head teacher and an equity activist at Decolonise the Curriculum.
- Tech companies want facial recognition laws, but maybe not this one: The Facial Recognition and Biometric Technology Moratorium Act is the first facial recognition legislation introduced since that request came, and the companies calling for regulations have remained silent on the proposal. Tech companies understand those concerns, listing human rights issues and ethical questions when they announced their own moratoriums on providing facial recognition to law enforcement.
As can be appreciated, despite the great advances in AI-based applications over the last decade, there is still a lot to do to achieve the expected performance, accuracy and trust levels. The proper implementation of quality assurance, testing processes and standards can quickly spot bugs or deviations from established programming norms. Applications can be run through controlled tests to make sure that new patches and fixes don't cause more problems, and capabilities can be tested continuously as they are integrated with increasingly complex combinations of systems and application functionality.
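A minimal sketch of such a controlled test, with hypothetical model names and an illustrative 1% regression threshold, might look like this (the models here are stand-ins, not any real system):

```python
# Sketch of a controlled regression test for a model update.
# All names (gate_release, the toy models, the holdout set) are
# hypothetical placeholders for illustration only.

def accuracy(model, examples):
    """Fraction of held-out examples the model labels correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)

def gate_release(new_model, baseline_model, holdout, max_regression=0.01):
    """Block a release if the new model regresses more than
    max_regression in accuracy on a fixed holdout set."""
    new_acc = accuracy(new_model, holdout)
    base_acc = accuracy(baseline_model, holdout)
    return new_acc >= base_acc - max_regression

# Toy usage: classify a number as non-negative.
holdout = [((x,), x >= 0) for x in range(-5, 5)]
baseline = lambda features: features[0] >= 0   # labels all 10 correctly
candidate = lambda features: features[0] > 0   # mislabels 0, so 90% accuracy
print(gate_release(candidate, baseline, holdout))  # → False: release blocked
```

The point of the sketch is that the holdout set and the threshold are fixed before the new model is evaluated, so every patched or retrained version faces the same gate.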
The challenge, as with any other computer-based solution, is how to balance the algorithms' potential with human ethical and privacy concerns. As a model consumer, you just get the model: use it or lose it. Your choice is to accept the model as-is or go ahead and build your own. As the market shifts from model builders to model consumers, this is increasingly an unacceptable answer. The market needs more visibility and more transparency to be able to trust models that others are building. Should you trust the model that a cloud provider supplies? What about the model embedded in the tool you depend on? What visibility do you have into how the model was put together and how it will be iterated?
According to Ron Schmelzer, Principal Analyst at Cognilytica, an AI-focused research and advisory firm, in a publication for Forbes, the answer right now is little to none. Furthermore, the same publication highlights five key aspects to be considered in any future AI-based solution:
- Unexplainable algorithms: One well-acknowledged issue is that some machine learning algorithms are unexplainable. When a decision has been made or the model has come to some conclusion, there is little visibility into how the model reached that conclusion. Deep learning neural networks, the current darlings of machine learning, particularly suffer from this problem.
- Lack of visibility into training data sets: However, simply knowing whether a given model uses an explainable algorithm is not enough to understand its performance. The strength of a model depends significantly on its training data. Good, clean, well-labelled data will positively impact a model's performance. On the other hand, if the training data doesn't represent your real-world data, then even highly trained models built on clean, well-labelled data will perform poorly.
- Lack of visibility into methods of data selection: Access to the gigabytes or petabytes of data used to train a model doesn't mean you know which aspects of that data were actually used to train it. Simply having access to the training data does not answer all questions of transparency. Full transparency also means knowing how data was selected from the available training data, and ideally even being able, as a model consumer, to apply the same selection methods to the training data yourself to see what data was included and what was excluded.
- Limited understanding of the bias in training data sets: Many times, models run into problems not because of bad data or even poorly selected data, but because there is some inherent bias in the training data. The word "bias" is somewhat overloaded in machine learning, since we use it in three different ways: in the sense of the weights and "biases" set in a neural network, in the separate sense of the "bias"-variance tradeoff that balances overfitting against underfitting, and in the more widely understood sense of informational "bias" imposed by humans making decisions based on their own preconceived notions.
- Limited visibility into model versioning: Another problem with model transparency is that models might be continuously versioned. Indeed, good models should be iterated on so that they keep improving; this is part of well-established methodologies. For full transparency, those producing models should also make clear and consistent how models will be versioned, how frequently, and whether older model versions can be used if new models start to perform poorly. While good model providers will offer full visibility into model versioning, this is not done in a consistent or guaranteed manner, especially when comparing different model providers.
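To make the training-data bias point above concrete, here is a minimal sketch of checking whether a sensitive attribute is reasonably balanced in a training set. The "group" attribute and the 20% minimum-share threshold are purely illustrative choices, not part of any standard:

```python
from collections import Counter

def group_shares(records, attribute):
    """Share of training records per value of a sensitive attribute."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def flag_imbalance(records, attribute, min_share=0.2):
    """Return attribute values that are under-represented,
    i.e. hold less than min_share of the training records."""
    shares = group_shares(records, attribute)
    return [value for value, share in shares.items() if share < min_share]

# Toy training set: 90 records from group A, 10 from group B.
training = [{"group": "A"}] * 90 + [{"group": "B"}] * 10
print(flag_imbalance(training, "group"))  # → ['B'], only a 10% share
```

A check like this only surfaces representation gaps in the data you can see; it says nothing about the selection or versioning opacity described in the other points, which is precisely why visibility into those processes matters as well.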
In future posts, we will continue writing about technology and business trends for enterprises. Furthermore, we recommend consulting the following literature to continue your digital transformation journey:
- Designed for Digital: How to Architect Your Business for Sustained Success, by MIT Press
- The Future Is Faster Than You Think: How Converging Technologies Are Transforming Business, Industries, and Our Lives, by Peter H. Diamandis and Steven Kotler (Simon & Schuster)
- Artificial Intelligence: The Insights You Need, by Harvard Business Review
- The Year in Tech, 2021: The Insights You Need, by Harvard Business Review
- The Deep Learning Revolution, by MIT Press
- Competing in the Age of AI, by Harvard Business Review Press
The objective of this blog is to provide a personal vision of how digital transformation trends will impact our daily activities, businesses and lifestyles.