Ben Lorica, Host, The Data Exchange Podcast

Data Exchange Podcast (Episode 96): Bob Friday

Industry Voices | Leadership Voices | AI & ML
The screenshot shows a picture of Bob Friday, VP & CTO, Juniper Networks, on the bottom left, with the title of the podcast, “The Data Exchange with Ben Lorica,” above. A quote from Bob Friday is shown. It reads: “Machine learning has been around for a long time, we’ve solved a lot of problems. I think the big transition is, are you really doing something on par with humans? I’m a big fan of what they call human-assisted AI…you’re talking about an AI assistant where the human domain expert still solves the corner cases. You use AI to solve 80% of the problem, but there’s still 20% of the problem that requires some domain expert. And to some extent, that’s what we’re doing. Marvis solves about 70% of the problem; 30% of the time we still need a domain expert to come in and help push it over the finish line.”

Entrepreneur Bob Friday on the Mist AI journey

In this edition of The Data Exchange Podcast, Juniper’s Bob Friday does a deep dive into the past and future of Mist AI™. Listen as he describes how his team uses data technology, machine learning, and AI to optimize user experiences and simplify operations across the wireless access, wired access, and SD-WAN domains.

00:00:42 Vision of Mist: Building an AI Assistant

00:04:48 First projects in Data Mining

00:09:10 Deep Learning for Anomaly Detection

00:13:41 Deploying and Managing Models in Production

00:16:20 High-impact Data Teams

00:16:55 Marvis: AIOps Engine and Chatbot

00:23:35 Human-Assisted AI

00:26:43 Graph Neural Networks

00:29:38 How to organize Data Teams


You’ll learn

  • Why Juniper tied its data science team to its customer service team 

  • How location services were built with a machine learning algorithm 

Who is this for?

Network Professionals, Business Leaders

Host

Ben Lorica
Host, The Data Exchange Podcast

Guest speakers

Bob Friday
VP & CTO, AI-Driven Enterprise Business, Juniper Networks

Transcript

0:00 we have bob friday, vp and cto at mist

0:08 systems, a juniper company uh welcome bob to the data exchange podcast yeah thank you for having me ben

0:15 great to be here so uh i know you folks are doing a lot of interesting and quite challenging

0:22 things in machine learning ai and and data

0:28 infrastructure so to set the stage bob let's first describe

0:34 the use cases for data science and ai and big data at mist

0:40 yeah so mist is focused in kind of the networking space you know and one of the inspirations for mist was if

0:46 you remember watson playing jeopardy yes our mission here is really to see if we can't build something

0:52 that can answer questions and manage networks on par with human domain

0:57 experts right and so that was kind of the original vision of mist and that's what we've been working on for the last six years or so

1:03 and so bob for the people who don't follow networking so what what are the data sources

1:10 and what kinds of data are we talking about yeah so when you look at mist when we started really the question we're trying

1:16 to answer is you know why are you having a bad zoom experience why are you having a bad internet experience so when we

1:21 started mist we started on the wireless side yeah the wi-fi access point it turned out that about 80% of the data

1:28 we needed we got from that access point now as we've joined juniper we're starting to

1:34 expand our data sources from the access point to the switch to the router even to the client now where we basically

1:40 ingest data from all these devices you know if you kind of visualize anything between you and the internet or the

1:46 cloud we're starting to pull data from all those different sources that you know as your packet moves from

1:52 the laptop to whatever the application is you're trying to get to so it sounds like so

1:59 just generally network quality and infrastructure, the health of infrastructure it seems like uh

2:06 you have the classic ingredients of what we used to call the

2:13 three v's right so volume velocity and variety particularly volume and velocity

2:19 um so how important bob is the technical foundational layer for the rest of what

2:25 you do so in other words you know before i i imagine before you do anything interesting you have to be

2:31 able to collect ingest store and kind of uh clean up clean the data

2:37 right yeah yeah it's interesting you know when we started mist you know before we even could get to ai we probably spent two years

2:44 just getting the cloud infrastructure built up you know and one of the reasons i left cisco to start mist was

2:51 really what i noticed is okay if you're going to do stuff in real time like i want to process data in real time you've got to

2:57 have all the pipelines in place right you got to have spark, flink, all that real-time processing to get done so

3:03 that's that was like a two-year process just getting the cloud built the infrastructure the pipeline so we could actually ingest data from all the

3:10 networking elements and what uh what kind of volume of data are we talking about

3:16 uh i mean right now we're talking about like 200 terabytes a day you know so you kind of think about you

3:22 know if you look at networking devices, at this point we have a half million

3:28 aps, switches and routers out there across our customers sending us data back every minute every

3:35 minute or every second right so we have constant data flowing back at us from you know half a million networking

3:41 devices and growing out there so before you even get to machine learning

3:46 and ai i imagine you have to kind of have some basic applications like dashboards and

3:53 business intelligence and things like this and uh um so what about the uh challenges

4:01 around data quality and things like this so do you guys have to address any of

4:06 these yeah i mean the interesting thing we found when we started the exercise is we actually tied our data science team

4:12 to our customer support team because you had to make sure that you had the actual data needed to answer a question make sure we can get that to

4:19 the cloud and so that was kind of the first step in the process is making sure we had the right data to answer the question in the

4:26 cloud and and uh so so at what point

4:31 bob so so you spent two years you said building that foundational data infrastructure in the cloud

4:38 so what were your first initial projects in machine learning

4:43 uh once you got all of this in place yeah i think the first thing i wouldn't even call it machine learning it was

4:50 almost data mining: mutual information right i mean so we basically had created

4:55 a framework that basically could allow us to apply uh you know mutual information where you can take one

5:01 random variable like why did the customer have a problem and basically correlate it with a network feature
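The mutual-information step Bob describes, correlating a "did the customer have a problem" variable with a network feature, can be sketched like this (a minimal sketch; the feature names and data are invented for illustration):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete variables:
    I(X;Y) = sum over (x,y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) )."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint * n * n / (px[x] * py[y]))
    return mi

# Hypothetical ticket data: does the os type carry information about the problem?
os_type =     ["win", "win", "mac", "mac", "linux", "win", "mac", "linux"]
had_problem = [1,     1,     0,     0,     0,       1,     0,     0]
print(mutual_information(os_type, had_problem))  # higher: os predicts the problem

band =        ["2.4", "5",   "2.4", "5",   "2.4",  "5",   "2.4", "5"]
print(mutual_information(band, had_problem))     # lower: band is mostly uninformative
```

Ranking features this way is how a problem like the misbehaving warehouse robot below gets isolated down to an os type and driver type.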

5:06 you know so that led us to start to solve problems like hey if you have a robot in a distribution center that's

5:12 acting up we could start to isolate it down to the os type and the driver type right and that is

5:19 machine learning so that was kind of the first thing, what we call sle metrics, and amazingly i will tell you

5:25 what we learned right off the bat was customers just appreciate having the data in the cloud you know getting the

5:31 data into a framework that you could apply an ml model to you know that alone, customers loved that

5:37 you know because that was the first time they didn't have to go to the device they actually had the data so that alone had a lot of value to our

5:42 customers when we started by the way i've been involved in many surveys

5:49 across the data and machine learning space and i think one of the key things that always

5:56 emerges is that the people who seem to succeed are the people who did what you did which is

6:01 basically invest time on that foundation yeah you know the interesting thing i would say you know i've done several

6:08 startups in my career usually when you do a startup you usually have to kind of stop what you're doing when you scale

6:13 the business you know the interesting thing at mist is we built the actual cloud foundation we still have the same cloud

6:19 architecture we had when we started and that's that that's all due to the fact that we did the upfront work and spent

6:24 the time making sure we had a very scalable foundation to build on top of so so in the area of network quality bob

6:32 is location data another piece of information you guys collect yeah you know the interesting thing you

6:38 look what we did you know we solved the network problem but interesting it turns out almost all our b2c customers

6:43 hospitals hotels retail those are all about customer experiences

6:50 so part of our problem is really around location and interestingly that was probably another one of our very first

6:56 machine learning algorithms was really how to learn what i call the path loss model this is

7:02 basically how to correlate rssi with distance and that's different in every building type and that was

7:08 actually the very first thing we did is build this virtual ble machine learning algorithm that would

7:14 basically automatically calculate the path loss so you didn't have to do any fingerprinting right we didn't have to

7:19 have people walking around collecting labeled data and you think about it, that's really what people used to do that was

7:25 basically people walking around collecting labeled data for their models to work so we got rid of that that was

7:31 one of the first things we did here um so are you saying so this this uh

7:36 this model is unsupervised yeah this is an unsupervised machine learning model that

7:42 basically will learn the path loss which is the relationship between signal strength and distance you know

7:49 for a location engine do you uh i'm curious do you

7:54 folks use any kind of graph technologies like graph databases yeah we're actually just starting we

8:00 actually have a big graph database now for all our devices that we basically use you know to start to do the

8:06 graph neural networking the first application we're working on really is what i call temporal correlation

8:12 you know so now we have a graph of where the packet goes between you know the device and the cloud

8:18 and the first thing is to try to solve the problem like configuration changes if you if you ever work in the networking industry it turns out that

8:24 the configuration changes are usually the root cause of a lot of problems you know where you shoot

8:31 yourself in the foot by the way give us an idea of the size of

8:36 your graphs nodes and edges uh i think that you know sizes right now

8:42 we're up to millions of nodes uh with thousands of edges because of the clients you know if you think of a

8:49 network right we may have thousands of nodes in the network but the clients could have almost a 10 to 1 ratio

8:56 you know as you can kind of visualize the graph going out you know for every access point there could be 10 clients

9:01 attached to it and it's very dynamic right the network itself is pretty static but the clients are constantly

9:06 changing in the graph so i i would imagine so we're talking

9:12 quality of service, service levels

9:18 uh and things like this i would imagine anomaly detection would be something you

9:24 folks would be interested in anomaly detection was an interesting journey for us i mean we started the

9:30 anomaly detection we kind of started with the moving average uh then we went to arima

9:35 that's good enough in the beginning right well it turned out arima was not good enough i mean when you do anomaly

9:40 detection especially in networking um IT guys do not like to be woken up at three o'clock in the morning false

9:47 positives are not a good thing it really wasn't until we got to these lstm models where

9:54 we really got anomaly detection with low enough false positives that it was useful for an IT person right it's

10:01 like we got the false positives down to you know a couple percent where hey if you see an anomaly now it's worth paying

10:06 attention to and for those who are familiar with lstm models the big thing was really you're predicting multiple dimensions right

10:13 you're predicting both the average value and the confidence window and so that was kind of the magic of these and it's

10:19 probably a great example where deep learning actually made a difference um in our networking industry space

10:25 right now you know where deep learning makes a difference inside of networking so um

10:31 in so because you're using lstm i'm assuming it's not uh

10:36 it's not something you trained frequently right so um so

10:42 but what do you think you'll get to the point bob where you will need uh models that uh can be trained more

10:49 quickly than lstm i think we'll get there i mean right now even our lstm models right you know we

10:54 have, every site has an lstm model so we have like 25,000 sites

11:00 those 25,000 models get trained every week on the last three months of data so

11:06 every week we update the model for every site in the network
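The key idea Bob describes, flagging a point only when it falls outside a predicted confidence window, can be illustrated with a minimal sketch. A real deployment would use a per-site LSTM from a deep-learning framework predicting both the mean and the window; this stand-in forecaster just uses a rolling mean and standard deviation:

```python
import math

def forecast(history):
    """Stand-in for the per-site LSTM: predict the next value and a
    confidence width. A real model would output both heads directly."""
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    return mean, math.sqrt(var)

def detect_anomalies(series, window=12, k=3.0):
    """Flag a point only when it falls outside predicted mean +/- k*std,
    keeping false positives low enough that an alert is worth waking for."""
    alerts = []
    for t in range(window, len(series)):
        mean, std = forecast(series[t - window:t])
        if abs(series[t] - mean) > k * std + 1e-9:
            alerts.append(t)
    return alerts

# Invented per-site latency series with one genuine spike
latency_ms = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 9, 10, 48, 10, 11]
print(detect_anomalies(latency_ms))  # -> [13]
```

Widening or narrowing `k` is the precision knob: a wide window means fewer three-a.m. pages at the cost of missing smaller deviations.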

11:11 and uh because uh you know because i i've played around with lstms for time series and i've

11:17 one thing i found was they're kind of slow huh

11:23 yeah slow to train so and then i guess there's these new temporal

11:30 convolutional networks that are starting to become competitive that might be faster to train yeah and

11:37 that's what we're looking at and i think that's the same thing you know on the graph side you know now we're starting to look at

11:43 if you think of a graph as kind of an image with the edges um you know the question is can we start to use

11:48 convolutional techniques on networking data you know can we take the networking data and make it look more like an image

11:54 where you can start to apply convolutional techniques to it so so you you folks spent some time

12:00 building your foundational data infrastructure uh i'm curious what sorts of uh

12:08 tooling and infrastructure do you have

12:13 now for ml so now you have ml ops tools

12:19 uh that include you know the model server experiment tracking

12:24 feature stores so are you starting to dabble in any of these i would say we're

12:29 dabbling but i would say for the most part a lot of those tools we built ourselves

12:36 because they weren't around probably right they weren't around and i would say in our case it's

12:41 more about what i call digital twin you know right now we basically have what we call a staging network that

12:47 basically reflects our production network you know so everything goes into the staging network now there's a whole set

12:54 of continuous integration test tools that go through every time we check in whether it's a

13:00 model or code every time you check something in it has to go through continuous integration test to get

13:05 checked out so i would say that's probably the biggest part of bringing this thing to life
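The gate Bob describes, where every check-in (code or a retrained model) must clear staging before it ships, can be pictured as a simple metrics check. The metric names and thresholds below are hypothetical, not Juniper's actual gate:

```python
# Hypothetical staging thresholds a candidate model must clear before
# it moves toward the canary stage of the rollout.
STAGING_THRESHOLDS = {
    "false_positive_rate": 0.02,    # "a couple percent" or better
    "ticket_resolution_rate": 0.70, # match the current production bar
}

def passes_staging_gate(metrics: dict) -> bool:
    """Reject a candidate unless every staging metric clears its bar."""
    if metrics["false_positive_rate"] > STAGING_THRESHOLDS["false_positive_rate"]:
        return False
    if metrics["ticket_resolution_rate"] < STAGING_THRESHOLDS["ticket_resolution_rate"]:
        return False
    return True

candidate = {"false_positive_rate": 0.015, "ticket_resolution_rate": 0.72}
regression = {"false_positive_rate": 0.08, "ticket_resolution_rate": 0.72}
print(passes_staging_gate(candidate))   # True  -> promote toward canary
print(passes_staging_gate(regression))  # False -> back it out of staging
```

The point is that models and code go through the same automated gate, so a regressed model gets backed out of staging the same way a bad commit would.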

13:10 i would say, i think we talked about this right, the ml model by itself

13:17 is probably the simplest piece of the puzzle in terms of trying to scale this for production

13:22 um you know if you look at a day in the life of a data scientist the model itself

13:28 is the smallest piece of the problem right now you know yeah yeah so bob uh

13:34 talk to me about the staging environment so how so the staging environment is kind of a mirror of

13:41 the what's happening in production so if i have a new model i push it out to staging

13:46 and so then how do but uh

13:52 so so then do i start getting metrics around performance of this new model that

13:57 are mimicking what's happening in the real environment yeah i mean this kind

14:02 of follows you know if you look like cloud vendors like google netflix this kind of follows a cloud model of usually

14:08 you go into a staging you know digital twin then it goes into kind of like um a small subset of customers right yeah

14:15 yeah yeah like uh what do they call it, canary

14:20 testing or blue-green testing right you know that's kind of the classical way of you know releasing

14:26 things into a cloud environment you know you start in staging, the digital twin, bring it into a small set of you know scaled

14:33 customers right because that starts to test scale right making sure nothing breaks you know at some scale and make sure things still work and nothing has

14:39 changed and then you start to open it up to a larger set of cloud customers

14:45 so we we know that models once you deploy them uh

14:51 you have to retrain them like you folks retrain them uh once a week have you had situations where actually

14:58 uh something went wrong in between those two regularly scheduled events that you

15:04 had to diagnose what happened and then and then push out a fix earlier

15:11 we haven't had i mean we haven't had that happen you know what typically happens is you don't pass staging you know something

15:17 goes into staging uh there's production pushes every week you know and usually what typically

15:23 happens is you usually will catch it in staging and back it out it's like it doesn't work in staging you back it out

15:28 there we haven't had anything really get into the wild yet where we've had a major bug or a major

15:34 situation where a model breaks in production so how do you folks uh

15:41 uh uh what what's your tools around collaboration so that people can collaborate

15:47 across maybe even teams like you have data scientists collaborating with data

15:53 engineers collaborating with devops so what's your tooling to uh

15:58 enhance collaboration now the amazing thing is you've probably seen with engineers right they could

16:05 be sitting across the table from each other and they would prefer to slack each other than talk to each other

16:11 so i'd say slack is probably the preferred tool of the engineering team right now in terms of communicating i

16:17 would say probably the more interesting thing we found is you know to make this whole thing work

16:23 is we had to basically tie the data science team right to our customer support team

16:28 you know and that was probably the other thing i found kind of going from a big company to a small company is

16:33 organizationally you have to get the domain experts tied to your data scientists and that was probably one of the key parts of our

16:40 success here that you know every support ticket that comes into mist right now we basically use marvis

16:46 to answer that ticket right because you know if you think about us we are basically a gigantic service provider you know we have

16:52 visibility across all our customers right now and for our listeners,

16:58 marvis is our aiops engine that's basically our ai assistant that's what our customers use basically

17:04 to help them answer questions and manage networks on par with domain experts but this is a conversational chatbot yeah

17:13 there's two components to marvis right now one is what i call a conversational interface which is really around

17:19 troubleshooting problems uh then there's self-driving actions where we have customers who are now

17:25 letting marvis generate support tickets or make changes to the network automatically right

17:30 so if you look at there's two dimensions one is you know if marvis can figure it out let marvis fix it if marvis can't

17:36 figure it out marvis helps you troubleshoot and get to the data you need to solve a problem

17:42 so if you were to take marvis and point it to a

17:47 a different area so how much data does it need in order to provide value

17:53 uh well i mean if you if you look at the top of the pipe right you kind of think of a funnel you know at the top of the

17:59 pipe we have uh 200 terabytes of data coming in every day right

18:05 at the very bottom of the pipe you know out of that 200 terabytes of data you may have

18:11 you know three or four actions you know and that's kind of the filtering that's going on here right trying to get down

18:17 to actionable events right something that an IT person needs to take care of and that's the paradigm shift from

18:24 generating tons of events that they couldn't act on, it was too much noise you know and that's what we're doing with aiops right we're basically taking

18:31 all those events and trying to filter them down to some insightful action that an IT guy cares about
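The funnel, 200 terabytes of raw events in at the top and three or four actions out at the bottom, might look roughly like this. Every field name and threshold here is invented for illustration:

```python
# Hypothetical sketch of the AIOps funnel: raw events in at the top,
# a handful of actionable items out at the bottom.

def funnel(events, min_impacted_clients=20, min_confidence=0.9):
    """Keep only events that are severe, confidently diagnosed, and
    deduplicated by root cause: the few things an IT person should see."""
    actions = {}
    for e in events:
        if e["impacted_clients"] < min_impacted_clients:
            continue                   # too small to page anyone over
        if e["confidence"] < min_confidence:
            continue                   # root cause not certain enough
        key = (e["site"], e["root_cause"])
        actions.setdefault(key, e)     # collapse duplicates into one action
    return list(actions.values())

events = [
    {"site": "hq", "root_cause": "dhcp_exhaustion", "impacted_clients": 140, "confidence": 0.97},
    {"site": "hq", "root_cause": "dhcp_exhaustion", "impacted_clients": 150, "confidence": 0.96},
    {"site": "hq", "root_cause": "weak_signal", "impacted_clients": 2, "confidence": 0.99},
    {"site": "branch", "root_cause": "bad_cable", "impacted_clients": 60, "confidence": 0.5},
]
print(len(funnel(events)))  # -> 1 actionable event from 4 raw ones
```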

18:38 so uh so there's two metrics that uh as you

18:45 were talking i started thinking of two metrics precision and recall yes so you can have very high precision

18:52 but how do you know you have good recall yes uh how do you know you're not

18:57 missing something well i would say there's always the uh i think right now the thing is more

19:03 about making sure that when marvis gets into correcting things that it's 99% accurate

19:09 which is more important than actually missing something i see i see right and so i think if i look at the uh i think

19:16 the other metric, if you look at our support team right now, on the troubleshooting questions i

19:21 mean there's usually two types of questions troubleshooting and kind of q&a questions on the troubleshooting side we're at

19:26 seventy percent so seventy percent of the time marvis can basically solve the problem uh which is probably

19:33 our big metric right now is how close are we getting to jeopardy-champion-level networking you know how

19:40 close can marvis actually perform on par with the network domain expert

19:45 and uh so the remaining 30 percent is there a way to describe what

19:52 characterizes that remaining thirty percent i mean usually what happens like in our support team right they'll use

19:58 marvis to try to answer a question and if it doesn't come back with an answer

20:04 marvis will typically come back with the data you know most relevant to that question

20:10 and so that's probably the other paradigm shift of if i look at conversational interfaces and nlp

20:16 i think this is the new new user interface you know for networking other industries

20:22 right we're kind of moving from the cli dashboard into this conversational interface to

20:28 make it much easier to get to the data you need to get to so either you get the answer or marvis gets you to the data

20:33 that you know gets you closer to the answer so how does uh so does marvis provide you a link to the

20:40 data so how does that work what's the ui for that i mean so the ui for that is basically marvis will provide you the

20:47 top three most likely causes you know if you're basically saying hey you know why has

20:52 ben been having a bad zoom experience it'll get down to well it looks like he's either got interference on his wi-fi

21:00 or he's got congestion on his router these are the three most likely reasons for ben having a problem on a

21:05 zoom call and then uh as you guys gather more and more data

21:11 this uh and the answers that marvis supplies get better and better

21:16 right yeah i mean i think if you look like where we started the journey with the wi-fi access point

21:22 you know that let us answer a lot of questions about why you're having poor connectivity problems now that we

21:27 get the data from the router the router has a lot of visibility into the sessions and the applications right

21:35 so that lets us start to answer more questions about you know why is zoom or why is teams having a problem

21:41 and it starts to let us answer questions at more granularity right you know as i start to get data from switches i start to understand exactly

21:48 what went wrong in the switch you know did the vlan get misconfigured and then i can start correcting it right

21:53 if i have control of the switch marvis can actually fix the vlan if it got misconfigured
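A self-driving action of the kind Bob mentions could be sketched as follows. The device fields and function are hypothetical; the two-branch logic (fix it if you have control, otherwise surface the evidence to a human) is the point:

```python
# Hedged sketch of a "self-driving" remediation like the VLAN fix above.

def remediate_vlan(port, expected_vlan, can_configure):
    """Return the action taken for a port whose VLAN looks misconfigured."""
    if port["vlan"] == expected_vlan:
        return "no_action"
    if can_configure:
        port["vlan"] = expected_vlan  # the assistant fixes it directly
        return f"fixed: port {port['id']} set to vlan {expected_vlan}"
    # No write access: fall back to a support ticket carrying the evidence
    return f"ticket: port {port['id']} on vlan {port['vlan']}, expected {expected_vlan}"

port = {"id": "ge-0/0/7", "vlan": 30}
print(remediate_vlan(port, expected_vlan=20, can_configure=True))
print(port["vlan"])  # -> 20
```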

21:58 so uh our so i'm not sure uh

22:04 if this is something you've thought about or something that you guys uh grapple with but

22:09 uh this this notion that uh machine learning while it's great

22:14 there are areas where domain knowledge could help um

22:20 and if if you subscribe to that belief do you how do you folks

22:25 combine machine learning and domain knowledge well i think you know as i said between the support team data science because

22:31 for me i think people confuse the fact that you know ai is usually the concept of doing something on par with a human

22:39 there's a lot of things we solve with ml right i mean ml has been around for a long time we've solved a lot of problems i think the big transition is

22:45 are you really doing something on par with a human are you really building something that's going to be on par with

22:50 a human which really comes down to domain expertise right you know we're basically trying to take

22:57 domain expertise and stick it into marvis you know we're trying to build something that actually

23:02 does something on par with a network domain expert right

23:08 yeah sometimes uh so i'm not sure if this is the case in networking but there's

23:14 areas for example in industrial applications where uh maybe you just don't have that many

23:20 anomalies because the equipment is super super reliable and so but the humans

23:26 might have some level of expertise where you can build an initial system that's mostly

23:32 human knowledge and then over time the machine learning gets better and better right yeah yeah i'm a big fan

23:38 of what they call human-assisted ai you know this is kind of like you know the robot in your house yeah you've got a

23:44 robot in your house and all of a sudden it gets stuck in a corner case right you know and i've seen a

23:50 startup where the robot will call back for human assistance right if it gets stuck in a corner or something i think that's what

23:57 you're talking about is more of an ai assistant where the human domain expert still solves the corner

24:03 cases you know you use ai to solve 80% of the problem but there's still 20% of the problem that requires some domain

24:09 expert and to some extent that's what we're doing with marvis right marvis solves about 70% of the problem 30% of the

24:16 time yeah we still need a domain expert to come in and help push it over the finish line
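The 70/30 split Bob describes amounts to confidence-based routing: the assistant keeps what it is sure about and escalates the corner cases to a domain expert. A minimal sketch with invented numbers:

```python
# Human-assisted AI as a routing decision: auto-answer above a
# confidence threshold, escalate the rest. All values are illustrative.

def route(questions, threshold=0.9):
    """Partition incoming questions into auto-answered vs escalated ids."""
    auto, escalated = [], []
    for q in questions:
        (auto if q["confidence"] >= threshold else escalated).append(q["id"])
    return auto, escalated

tickets = [
    {"id": 1, "confidence": 0.98},  # clear-cut: answer automatically
    {"id": 2, "confidence": 0.95},
    {"id": 3, "confidence": 0.97},
    {"id": 4, "confidence": 0.60},  # corner case: send to the expert
]
auto, escalated = route(tickets)
print(auto, escalated)  # -> [1, 2, 3] [4]
```

The expert's resolution of the escalated cases is exactly the feedback loop discussed later: labeled corner cases that can be fed back to improve the model.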

24:21 so what directions are you looking forward to bob in the next six to 12

24:28 months in terms of you know things that you you and your team will push out or things that you

24:34 and your team will investigate yeah i mean we talked about a little earlier was you know i think right now

24:40 nlp conversational interfaces you know for me for a virtual assistant to really become a trusted member of a

24:48 team you need to be able to interact with that assistant right you know as you

24:53 would a a normal person and so getting that whole conversational natural language thing down we're

24:59 investing a lot of energy on that and you know to help marvis become a trusted trusted member of an i.t team uh the

25:05 other thing we're starting to work on is something that's kind of natural in human behavior you know as

25:11 you hire a new intern or someone they get trained to get better over time

25:16 you know i think that's the other piece we're starting to work on right now with marvis is how do we bring back that domain expertise feedback you know so we

25:23 can transfer domain knowledge into marvis now how does marvis really learn from you know an IT domain

25:29 expert that is interacting with it on a daily basis right right so how do

25:34 you how do you do that in a principled way right so yeah it's easy to do when you think about

25:40 training but you think about continuous you know how do you continuously improve the actual algorithm

25:47 with human you know human interaction so uh bob is reinforcement learning

25:53 something on your radar it sounds like it could be because of the nature of your data

26:00 it could be potentially something worth investigating yeah i mean so we use reinforcement learning you know what we

26:07 call network radio resource management uh so that's one application where we

26:13 actually use reinforcement learning and it's really about you know taking the user experience as that reward you know

26:18 now that we have actual visibility of the user experience you can start to adjust the power and the channels on the

26:26 radios now to kind of optimize that user experience and as the user experience gets better you kind of learn

26:32 you know what's the right channel and power assignment that optimizes the user experience metric and you had

26:38 mentioned something about graph neural networks so what are you folks planning in that area i mean i think

26:45 you know that's where we're starting to look you know can we leverage the power of all the convolutional stuff you know we got lstm kind

26:51 of solving kind of the time series problems um when you start to look at networking as kind of an image right you know if

26:58 you look at in a graph you know where we have switches aps and clients you can kind of

27:04 start to see if you see something disconnected you can treat that like an image problem right you treat each little node as a pixel

27:10 and the edges are basically giving you information about what's connected to that pixel right and that's

27:15 kind of what an image is right you got pixels and you got pixels near it um and that's what the graph

27:21 brings right it brings that topology into the thing which kind of lets us

27:27 start pushing down the convolutional neural net, graph neural network path yeah yeah actually the graph neural

27:33 network the reason i bring it up there's a couple of uh uh great use cases that uh

27:40 seem like they're right in your bailiwick the first one is recommenders

27:46 so people are starting to investigate it for recommenders and you know it sounds like some of

27:52 the things you folks are doing are kind of in that realm and then the other one is uh

27:59 i know that some people have had some good success with graph neural networks around uh

28:05 traffic you know traffic prediction like i know deepmind has worked with the google maps team

28:11 to improve kind of real-time estimated time of arrival for example yeah yeah i saw that one right they're

28:18 starting to be able to predict the traffic jams before we leave you know they're starting to look into the future and say yeah you better add an extra 10

28:23 minutes because by the time you get to that intersection it's going to be a little more crowded right so yeah that's

28:30 perfect it's the same sort of traffic pattern right if you look at the networks they kind of look like traffic

28:35 uh you know streets and highway thing you got a bunch of little streams all going back to

28:40 the land router turns to be the bottleneck and everything so uh in closing so you know

28:47 sometimes we in industry we over focus a lot on tools right so tools

28:53 models and infrastructure and things like that but when we last talked uh

28:59 i know that you have a lot of uh strong opinions and also a lot of good

29:05 lessons and best practices for our audience around organizations and best

29:11 practices for data and ai teams so um

29:16 anything you want to highlight bob in terms of uh lessons you've learned in terms of

29:22 in your career in terms of uh how to organize data and ai teams yeah i

29:28 mean i i think you know on the missed adventure i mean the one thing i would kind of recommend if you're headed down the ai path you know the first thing you

29:34 really want to start with is try to answer the question of what human behavior are you trying to mimic you

29:41 know if you're building you know are you building an ai solution or just an ml algorithm um you know once you get that

29:47 down that kind of really defines what data you need because that kind of leads to make sure you know what question you're

29:54 trying to answer and then organizationally i would say my experience you know between data science and domain expertise organizationally

30:01 organization is almost as important as the architecture you know you got to have that right cloud foundation you

30:06 know set to actually collect and process data but organizationally you got to make sure you have the data science and

30:12 domain experts kind of organizationally aligned to work together to solve problems and that's what we've done here is

30:18 basically you know it's not just modeling you know data science here lives from gradle cradle to grave they

30:23 start with the customer the problem and then they basically have to at the end of the day verify they've actually made a happy customer
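Returning briefly to the graph neural network discussion above: the analogy of nodes as pixels and edges as pixel adjacency corresponds to a single graph-convolution layer, where each node mixes its neighbors' features. Below is a minimal, generic sketch; the topology, node roles, and feature values are all invented for illustration, and this is not Juniper's Marvis implementation:

```python
import numpy as np

# Toy topology: node 0 = router, 1-2 = switches, 3-5 = APs/clients.
# (Hypothetical example; roles and numbers are made up.)
edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5)]
n = 6

# Symmetric adjacency matrix with self-loops (A_hat = A + I),
# the standard trick so each node also keeps its own features.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A_hat = A + np.eye(n)

# Symmetric degree normalization: D^{-1/2} A_hat D^{-1/2}
deg = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

# Per-node features, e.g. [retry rate, latency] -- invented numbers.
H = np.array([[0.1, 0.2],
              [0.4, 0.1],
              [0.3, 0.3],
              [0.9, 0.8],   # an unhealthy client
              [0.2, 0.1],
              [0.1, 0.2]])

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))  # learnable weights in a real GNN

# One graph-convolution layer: each node aggregates its neighbors,
# exactly the "pixel and its neighboring pixels" analogy.
H_next = np.maximum(A_norm @ H @ W, 0.0)  # ReLU
print(H_next.shape)  # (6, 4)
```

Stacking a few such layers lets information from a client's switch and router flow into its representation, which is what makes the "treat the topology like an image" framing work.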

30:29 Ben Lorica: Right. So how do you really move the needle in terms of the business, and improve business value, as opposed to just tweaking and tuning models all day?

30:42 Bob Friday: Yes. At the end of the day, someone has to care that your model actually did something useful that the customer cared about.

30:47 Ben Lorica: And by the way, as we discussed previously, in many cases the best investment of your time might be working on data problems, not model problems.

30:59 Bob Friday: Yeah, as we said before: for me, anybody who wants to be a data scientist can become a data scientist in three days. You can be up and running and training a model in probably three days. Almost all the work around this AI and data science stuff is really around feature engineering and data pipelines, and getting all of that correct before you can even get the model up and running. So for me, the model is the smallest piece of the problem when you're actually trying to get something working in production.
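Bob's point above, that pipelines and feature engineering dominate the work while the model itself is small, can be made concrete with a toy sketch. Everything here (the telemetry fields, window size, and anomaly rule) is invented for illustration; note how the one-line "model" at the end is dwarfed by the data preparation:

```python
import numpy as np
import pandas as pd

# Hypothetical raw client telemetry. In practice this stage (collecting,
# cleaning, joining, windowing) is where most engineering effort lives.
rng = np.random.default_rng(1)
raw = pd.DataFrame({
    "client": np.repeat(["a", "b"], 50),
    "rssi": rng.normal(-60, 5, 100),     # signal strength, dBm
    "retries": rng.poisson(2, 100),      # retry counts per interval
})

# Feature engineering: per-client rolling means over the last 10 samples.
feats = (
    raw.groupby("client")[["rssi", "retries"]]
       .rolling(10, min_periods=1)
       .mean()
       .rename(columns={"rssi": "rssi_mean10", "retries": "retry_mean10"})
       .reset_index(drop=True)
)

# The "model" is a single line: flag intervals whose rolling retry rate
# reaches the 95th percentile. A real system would train something here,
# but everything above it had to exist first.
feats["flag"] = feats["retry_mean10"] >= feats["retry_mean10"].quantile(0.95)
print(feats["flag"].sum())
```

The ratio of pipeline code to model code in even this toy example mirrors the point: get the features and the data flow right, and the modeling step at the end is comparatively small.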

31:30 Ben Lorica: So when you onboard new data scientists to your team, you set the expectation that, in this place, you've got to build your own pipelines, you've got to work with business users and domain experts, and you shouldn't expect to be building models all day.

31:53 Bob Friday: If you go through an interview here, there's the typical "do you know your ML math and theory," but the other big hurdle is really around how good a problem solver you are. Can you actually work with customers and customer support, and understand the features you need to solve a problem? Do you understand the problem that needs to be solved before you even start the modeling process? That's actually harder than you think.

32:22 Ben Lorica: One last thing I forgot to ask you: we talked about conversational assistants and Marvis. Do you envision a point where you're also going to be looking at speech and voice?

32:40 Bob Friday: That's interesting, and it's a hard problem. I've actually looked at that, along with meeting summarization problems. By the time you look at speech-to-text, you've got so much noise that solving that problem alone is a whole other piece of the puzzle that needs to be improved upon.

32:58 Ben Lorica: Right. And with that, thank you, Bob. This is super exciting. What I love about what you folks are doing is the scale and the immediacy, and the fact that you can't really make a lot of mistakes, because a lot of people require very reliable networks.

33:24 Bob Friday: I have to say, of all the journeys, we're getting closer to this Jeopardy networking championship. I feel like we're almost there; we're at 70 percent. I feel like in the next couple of years we will be at 97 percent, playing Jeopardy on par with these networking domain experts.

33:42 Ben Lorica: Thank you, Bob.

33:47 Bob Friday: Okay, thank you.
