IBM Chief Data Scientist advocates building AI factories


IBM distinguished engineer and chief data scientist John Thomas argues that, to really take advantage of AI, companies must adopt a factory model that automates the model building process as much as possible.

Just as a traditional factory would reliably manufacture physical products on a large scale and at high speed, an AI factory would enable companies to quickly build and scale trustworthy AI models.

VentureBeat met with Thomas to better understand how an AI factory would actually work.

VentureBeat: What are the biggest challenges companies face with AI today?

John Thomas: There are some recurring themes that we have come across. Pretty much every major customer we work with has a data science team. They already have some kind of data science project going on. But many of these projects are just experiments. They don’t make it into production, and even if they do, it takes forever for things to move from concept to production.

When we start figuring out why this happens, there are a number of different causes. Sometimes it’s a misalignment between the business’s expectations and what the data science team is building. Sometimes it’s about the model: it looks great in development, but companies struggle to get it through the model validation and risk management processes and get it approved for use in production. Sometimes it’s about what happens after it goes into production. All of these are challenges outside of the actual model building piece itself.

Not enough attention has been paid to these different phases of the life cycle. We see this again and again, even with some of the most advanced data science teams. They are very talented at using the algorithms, libraries and frameworks for model building, but deploying, managing and monitoring models, and aligning them with ongoing business impact, seems to be problematic.

VentureBeat: How do you fix this?

Thomas: Software development went through this phase a long time ago. Application developers just wrote code, and it was difficult to get everything into production. They needed a structured approach, and DevOps came along. It’s the same mindset, but now in the world of AI and machine learning. A physical factory has a set of processes, a set of best practices, and people with specific skills to produce goods on a large scale and at high speed. You need a similar construct here. You need people, processes and technology.

If you look at the different phases of the life cycle, the first part, planning and scoping, is an important step. IBM uses design thinking to work out all aspects of the project in a very structured way. The next stage is data exploration, and the third stage is modeling. Then you start looking at trustworthiness and whether the data is skewed. All trustworthiness challenges should be part of the model building phase itself. The next phase is validation and deployment, where we set best practices. A validation team separate from the model development team must run the validation performance metrics, check for fairness, review the explanations of the model, generate reports and ensure that the criteria or thresholds the business has defined are met. The final phase is ongoing monitoring and management. Here you have guard rails to check the ongoing performance of the model. Once you set this up, it’s like being in a physical factory.
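As a rough illustration of the validation phase described above, the following Python sketch checks a candidate model's predictions against thresholds set by the business. It is not IBM's tooling: the function name `validate`, the `ValidationReport` fields, the choice of accuracy and a demographic parity gap as metrics, and the default thresholds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ValidationReport:
    accuracy: float
    parity_gap: float
    passed: bool


def validate(y_true, y_pred, group, min_accuracy=0.8, max_parity_gap=0.1):
    """Check binary predictions against business-defined thresholds.

    All names and thresholds here are hypothetical, for illustration only.
    """
    # Performance metric: share of predictions that match the labels.
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    # Fairness metric: positive-prediction rate per group, and the gap
    # between the highest and lowest rate (demographic parity difference).
    rates = {}
    for g in set(group):
        preds = [p for p, member in zip(y_pred, group) if member == g]
        rates[g] = sum(preds) / len(preds)
    parity_gap = max(rates.values()) - min(rates.values())

    # The model passes only if every threshold is met.
    passed = accuracy >= min_accuracy and parity_gap <= max_parity_gap
    return ValidationReport(accuracy, parity_gap, passed)


report = validate(
    y_true=[1, 0, 1, 0],
    y_pred=[1, 0, 1, 1],
    group=["a", "a", "b", "b"],
)
print(report)
```

In the setup Thomas describes, a check like this would be run by a validation team that is separate from the team that built the model.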

VentureBeat: Whose job is it to build this AI factory?

Thomas: It’s usually not the data science team, because they don’t want to be in the middle of it. What we have seen is that it’s the stakeholders. Each business unit has its own data science team working on a number of models as part of a hub-and-spoke construct. The person responsible for consistency and scale across those business units is the one committed to building a factory. They will involve people from different departments in the factory, and IBM helps them build it.

It’s not that everything has to flow through the hub. We don’t say that. We say the spokes have the freedom to innovate, but they follow the same guidelines. They follow the same design thinking process for defining and creating the action plan. They follow the same governance model. They have complete freedom in which algorithms and frameworks they use.

VentureBeat: Where do machine learning operations (MLOps) and DevOps fit into this factory?

Thomas: I didn’t use the term MLOps because so much more is needed beyond MLOps: understanding trustworthiness, bias, fairness, explanations and so on. AI and ML are by nature probabilistic and nondeterministic. A typical application development paradigm does not need to concern itself with this.

VentureBeat: Do the AI factory and the software development factory have to merge?

Thomas: There are similar constructs at a very high level, but there are unique challenges in the world of AI. I don’t think AI factories and software development factories will become just one thing. There will be similar constructs and similar paradigms, but unique challenges need to be addressed individually.

VentureBeat: A factory implies automation. How is the data engineering process automated?

Thomas: I don’t think the point is to automate everything. We want to automate the labor-intensive, manual drudge work as much as possible. When you are working with a very large data set with hundreds or thousands of features, it is quite a tedious, manual, and labor-intensive job. You want to rely on automation as much as possible. Building a pipeline for model deployment should be automated, but with a human in the loop. The point is to ensure that the domain experts are properly engaged at the various stages of the lifecycle while automating some of the more mundane tasks. That is the reality of where we are.
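One way to picture an automated deployment pipeline with a human in the loop is the minimal Python sketch below. The function `run_deployment`, its `steps` and `approve` parameters, and the status strings are hypothetical names chosen for this example; the interview does not describe a specific implementation.

```python
from typing import Callable, List


def run_deployment(
    steps: List[Callable[[], str]],
    approve: Callable[[List[str]], bool],
) -> str:
    """Run the automated steps, then ask a human reviewer to sign off.

    `steps` stands in for the mundane, automatable work; `approve` stands
    in for the domain expert who reviews the results before release.
    """
    # Automated part: run each step in order and collect its summary.
    log = [step() for step in steps]

    # Human-in-the-loop part: nothing ships without explicit approval.
    if approve(log):
        return "deployed"
    return "held for review"


status = run_deployment(
    steps=[lambda: "features built", lambda: "model packaged"],
    approve=lambda log: "model packaged" in log,  # stand-in for a reviewer
)
print(status)
```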

VentureBeat: We hear all the time about the democratization of AI where end users will be building their own little AI frameworks. How does that fit into a factory model?

Thomas: We look at the different phases from the beginning. Before even writing a single line of Python code, the business owner must be involved in the scoping and planning phase. Often the data science team chases data science metrics: ‘My model is great, just look at the precision.’ How that translates into actual business KPIs (key performance indicators), however, is not always clear, and sometimes it does not translate at all. It is important to understand how your model will relate to the business KPI before writing a single line of code. The business has to be part of this lifecycle.
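The gap between a data science metric and a business KPI can be shown with a small worked example in Python. Everything here is invented for illustration: the two models, their confusion-matrix counts, and the dollar values per outcome are assumptions, not figures from the interview.

```python
def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that were correct."""
    return tp / (tp + fp)


def net_value(tp, fp, fn, value_per_tp, cost_per_fp, cost_per_fn):
    """Hypothetical business KPI: value captured minus cost of errors."""
    return tp * value_per_tp - fp * cost_per_fp - fn * cost_per_fn


# Hypothetical counts: model A is cautious, model B catches far more cases.
model_a = {"tp": 40, "fp": 2, "fn": 60}
model_b = {"tp": 90, "fp": 30, "fn": 10}

# Hypothetical economics: a missed case costs as much as a caught one earns.
economics = {"value_per_tp": 100, "cost_per_fp": 10, "cost_per_fn": 100}

for name, m in (("A", model_a), ("B", model_b)):
    p = precision(m["tp"], m["fp"])
    v = net_value(**m, **economics)
    print(f"model {name}: precision={p:.2f}, net value={v}")
```

Under these assumed numbers, model A has the higher precision but a negative net value, while model B has lower precision and a clearly positive one, which is why the KPI has to be agreed on before the modeling starts.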

VentureBeat: Many end users are told, as part of the democratization argument, that they don’t need a data science team to build an AI model. Where is the line between the two?

Thomas: There are tools that certainly lower the barrier to entry, but at some point you need the domain expert and the data scientist. If the business person and the data scientist don’t work hand in hand, you can’t cover that last mile. You can’t get to something that’s going into production. You can’t just throw data into a magic box to produce AI. It’s not real.
