Machine learning (ML) models are only as good as the data you feed them. That’s true during training, but also once a model is put in production. In the real world, the data itself can change as new events occur, and even small changes to how databases and APIs report and store data can affect how the models react. Because ML models will simply return wrong predictions rather than throw an error, it’s imperative that businesses monitor their data pipelines for these systems.
That’s where tools like Aporia come in. The Tel Aviv-based company today announced that it has raised a $5 million seed round for its monitoring platform for ML models. The investors are Vertex Ventures and TLV Partners.
Aporia co-founder and CEO Liran Hason, after five years with the Israel Defense Forces, previously worked on the data science team at Adallom, a security company that was acquired by Microsoft in 2015. After the sale, he joined venture firm Vertex Ventures before starting Aporia in late 2019. But it was during his time at Adallom that he first encountered the problems that Aporia is now trying to solve.
“I was responsible for the production architecture of the machine learning models,” he said of his time at the company. “So that’s actually where, for the first time, I got to experience the challenges of getting models to production and all the surprises that you get there.”
The idea behind Aporia, Hason explained, is to make it easier for enterprises to implement machine learning models and leverage the power of AI in a responsible manner.
“AI is a super powerful technology,” he said. “But unlike traditional software, it highly relies on the data. Another unique characteristic of AI, which is very interesting, is that when it fails, it fails silently. You get no exceptions, no errors. That becomes really, really tricky, especially when getting to production, because in training, the data scientists have full control of the data.”
But as Hason noted, a production system may depend on data from a third-party vendor and that vendor may one day change the data schema without telling anybody about it. At that point, a model — say for predicting whether a bank’s customer may default on a loan — can’t be trusted anymore, but it may take weeks or months before anybody notices.
Aporia continuously tracks the statistical behavior of the incoming data and alerts its users when that behavior drifts too far from the training set.
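To make the idea of drift monitoring concrete, here is a minimal sketch of one common way to compare production data against a training set: the Population Stability Index (PSI). This is not Aporia's implementation, which the company has not described here. The function names, the 10-bin histogram and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions chosen for illustration.

```python
import math
import random


def psi(reference, live, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges come from the reference (training) sample; this is an
    illustrative choice, not a description of any vendor's method.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is defined.
        return [max(c / len(sample), eps) for c in counts]

    ref_p, live_p = proportions(reference), proportions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_p, live_p))


def check_drift(reference, live, threshold=0.2):
    """Return the PSI score and whether it crosses the (assumed) threshold."""
    score = psi(reference, live)
    return {"psi": score, "alert": score > threshold}


# Synthetic data standing in for one model input feature.
rng = random.Random(0)
training = [rng.gauss(50, 10) for _ in range(5000)]
same = [rng.gauss(50, 10) for _ in range(5000)]      # same distribution
shifted = [rng.gauss(65, 10) for _ in range(5000)]   # mean has moved

print(check_drift(training, same))
print(check_drift(training, shifted))
```

In this sketch the second sample, drawn from the same distribution, stays below the threshold, while the shifted sample triggers an alert. A real monitor would run a check like this per feature on a schedule and would likely need thresholds tuned to the model in question.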
One thing that makes Aporia unique is that it gives its users an almost IFTTT- or Zapier-like graphical tool for setting up the logic of these monitors. It comes pre-configured with more than 50 combinations of monitors and provides full visibility into how they work behind the scenes. That, in turn, allows businesses to fine-tune the behavior of these monitors for their own specific business case and model.
Initially, the team thought it could build generic monitoring solutions. But it realized that not only would this be a very complex undertaking, but the data scientists who build the models also know exactly how those models should work and what they need from a monitoring solution.
“Monitoring production workloads is a well-established software engineering practice, and it’s past time for machine learning to be monitored at the same level,” said Rona Segev, founding partner at TLV Partners. “Aporia’s team has strong production-engineering experience, which makes their solution stand out as simple, secure and robust.”