AWS announced Amazon SageMaker Ground Truth to help companies create training data sets for machine learning. This is a powerful new service for folks who have access to lots of data that hasn't been consistently annotated. In the past, humans would have to label a massive corpus of images or video frames to train a computer vision model. Ground Truth uses machine learning alongside human labelers to automatically label a training data set.
This is one example of a theme that has emerged over the past year or so: machine learning for machine learning. Machine-learning data catalogs (MLDCs), probabilistic or fuzzy matching, automated training-data annotation, and synthetic data creation all use machine learning to produce or prepare data for downstream machine learning, often to work around data that is scarce or scattered across systems. This is all well and good until we consider that machine learning itself relies on inductive reasoning and is therefore probability-based.
Let's consider how this may play out in the real world: A healthcare provider wants to use computer vision to diagnose a rare disease. Because data is sparse, an automated annotator is used to create more training data (more labeled images). The developer sets a 90 percent propensity threshold, meaning only records with at least a 90 percent probability of being accurately classified are used as training data. Once the model is trained and deployed, it is used on patients whose records are linked together from multiple databases using fuzzy matching on text fields: entities from disparate data sets with a 90 percent chance of being the same are matched. Finally, the model flags images with a 90 percent or greater likelihood of depicting the disease for diagnosis.
The problem is that, traditionally, data scientists and machine-learning experts focus only on that final propensity score as a representation of the overall accuracy of the prediction. This has worked well in a world where the data preparation leading up to training was deductive and deterministic. But when you stack probabilities on top of probabilities, that final propensity score no longer tells the whole story. In the case above, there's an argument to be made that the confidence in an accurate diagnosis falls from 90 percent to roughly 73 percent (0.9 × 0.9 × 0.9 = 0.729), which is hardly ideal in a life-and-death situation.
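The compounding described above can be sketched in a few lines. This is an illustrative calculation only: the stage names and the 0.90 figures come from the article's hypothetical example, and the stages are assumed to be independent, which is what makes the end-to-end confidence the product of the per-stage thresholds.

```python
# Hypothetical sketch of the article's example: three pipeline stages, each
# gated at a 90 percent threshold. Stage names and 0.90 values are the
# article's illustrative figures; independence between stages is assumed.
from math import prod

stage_confidence = {
    "automated annotation": 0.90,   # labels kept at >= 90% propensity
    "fuzzy record matching": 0.90,  # entities linked at >= 90% match probability
    "model inference": 0.90,        # diagnoses flagged at >= 90% likelihood
}

# Under independence, end-to-end confidence is the product of the stages.
end_to_end = prod(stage_confidence.values())
print(f"End-to-end confidence: {end_to_end:.3f}")  # 0.729, not 0.90
```

Each additional probabilistic stage multiplies in another factor below 1.0, so the gap between the reported propensity score and the true end-to-end confidence widens as pipelines grow.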
As the emphasis on the need for explainability in AI increases, there needs to be a new framework for analytics governance that incorporates all the probabilities included in the machine-learning process — from data creation to data prep to training to inference. Without it, erroneously inflated propensity scores will misdiagnose patients, mistreat customers, and mislead businesses and governments as they make critical decisions.