Ask Me Anything: IISc and TalentSprint's Advanced Programme in Computational Data Science
Good morning, everyone. Thank you so much for joining in today. I see both Professor Shashi and Professor Deepak have joined us. I hope you can see the screen; if you can, could you just type a yes in the chat window? I can see a couple of yeses, so I'm guessing we are good. Thank you so much. Professor Deepak, Professor Shashi, a very warm welcome to you. I can see Professor Deepak, but I can't see Professor Shashi. I have some issues with the camera. No problem. So what we're going to do today is this: we have a presentation, and we have also been getting a lot of questions from people, so we are going to be answering some of those questions while we talk about data science and how IISc is doing it. By way of introduction, my name is Ernesto pituitary. I'm Senior Director for Sales and Marketing at TalentSprint. My chief role is to counsel professionals, and I have been doing that for my entire professional career of more than a decade. I have worked with the Indian School of Business, and with Pearson before that, primarily helping professionals upskill themselves in various areas, be it technology, be it softer skills, and so on. I'm also a very keen adopter of AI. I don't know too much about AI, though; I'm at the phase where I am learning. At TalentSprint we are using AI in very interesting ways, applying it to the data that we have in marketing and so on. So that's me. Professor Deepak, would you like to give a small introduction about yourself, followed by Professor Shashi, or, you know,
whichever order you prefer?
Sure. I am Deepak Subramani. I'm an assistant professor in the Department of Computational and Data Sciences at the Indian Institute of Science, Bangalore. I have a PhD in computational engineering from MIT, the Massachusetts Institute of Technology in the US, and before that I did my BTech at IIT Madras. I've been at IISc for a little more than two years now. My expertise is in using computation and data science, specifically machine learning and artificial intelligence techniques and tools, for application problems in climate sciences, environmental modeling, autonomous underwater vehicle routing, and things like that. So that's what I do. Over to you, Professor. Good morning.
Good morning to all of you, and to the members of TalentSprint. It's nice to meet you all. I'm Professor Sashikumaar Ganesan, currently the chair of the Computational and Data Sciences department at IISc Bangalore. I trained as a computational mathematician, working especially with the finite element method, CFD, and high performance computing, and nowadays mostly on hybrid cloud computing. So these are my specialist areas. Recently I have been concentrating mainly on data-driven science, and let me say how it is different from machine learning and AI. Conventional machine learning is about using algorithms to find regressions or classifications; you are probably going to learn all those things in the course. What we do with all these machine learning tools is try to model or design the flow parameters or scientific parameters that come from different applications. That is what we call data-driven science: we use all the tools, not for the conventional applications, but for scientific applications. That is the current research interest of our group. About this program, maybe I can talk a little bit and then others can take over. The idea of the Advanced Programme in Computational Data Science came probably two years ago, when TalentSprint approached us. Of course, as the Department of Computational and Data Sciences, we have been offering a data science program for the last five or six years. In fact, we were the first department in India, somewhere around the end of 2002.
We started computational data science as a full-fledged department at that time, and we have been training around 35 students in the regular MTech course every year. In addition to that, we also have an MTech research program, and we have several computational data scientists coming out of our department with PhDs. So that's our background. With this expertise, TalentSprint, which had done an industry survey, wanted us to really scale this up, because if you look at the demand for data scientists, it is growing exponentially compared to two years ago, and now even more, especially after the lockdown. What TalentSprint wanted us to do was to go beyond the regular programs, because there we have the restriction of around 30 seats per year, and for several reasons we cannot extend beyond that. So the idea is to scale up and share our knowledge of data science with industry professionals, so that they can take it up and use it in their careers. That's the whole idea: they don't need to leave their jobs, but they can become data scientists by getting trained through this program. So, how to scale up? That's how the idea of an advanced program, a 10 month program, came about. And because it is in online mode, we have the flexibility of admitting more than 30 to 50 students. That's how we got started, and that's the background of why we started the program. Once we had decided that yes, this is a good idea and we need to train industry professionals, the question then became: what kind of modules, what kind of course, what kind of training do we want to provide?
We did not want to be one among the many courses that are available; we wanted to bring some uniqueness. What kind of uniqueness could we bring in? That was the discussion. Of course, this discussion went through several rounds between TalentSprint, our colleagues in the department, and even other departments such as Management and ECE. We then came up with the idea that it is not only about training on the tools; those who get trained under this program should also know what is going on behind the tools. If you really want to become a successful data scientist, just using TensorFlow or some other tool to solve some classification or identification problem is not going to help. If you want to become a leader in your organization, or lead a computational data science group, you need to know the fundamentals, right from data collection and data analytics through to machine learning and AI, and you should know what is going on behind these tools. That's how the program has been designed, starting from the core maths, that is, the mathematical theory behind data science, and the computational programming concepts behind the tools. These are the fundamentals that we concentrate on in the first couple of modules. Then we move to the advanced modules: machine learning, data science, data engineering, and even business analytics. With this in mind, we have brought in faculty from different departments along with our own department, and we have come up with six or seven modules, including module zero, the interface modules, and so on. And the program has been designed in such a way.
So that is what we can call the unique feature of this program. In addition to that, we will also be discussing what we are planning, what kind of applications there will be, and also the capstone project, the assignments, and all these things. That's all I would like to say for now. After that I will come back, and if there are any questions about the faculty and other things, I will also share.
Yeah, thanks, Professor Shashi. You've covered most of the things that we want to talk about, so we'll keep the presentation very short, maybe about 10 minutes or so. And then we'll have a discussion about the different questions that you have, and things like that. So, can you give me access to change the slides?
Yes, I will. I can change them for you, or you can just let me know. Just give me a second. I am giving you remote control. Do you have that? Okay, I think there is an issue; I'm going to work on that, Professor. We can start off with these slides, and you just let me know when to switch. I'm going to move them along.
Sure. So, as Professor Shashi was saying, data science is growing. There is this famous quote that data is the new oil, but it is really valuable only once it has been processed, broken down, and turned into information. So information is really the currency, and analytics is the combustion engine that drives data science in business operations. Next slide. Now, what do we mostly do when we want to learn some topic? We Google for it. So if you look at the search trends for how terms like data science, Python, and machine learning have grown over the past 10 to 15 years, you will see a clear uptick. Around 2014-15 there is a clear uptick in the search term Python, and around the same time you also see an increase in the search terms data science and machine learning. So much so that people have said that data scientist is the sexiest job of the 21st century. Especially with this lockdown, you can see the kind of salaries that people in the data science industry are getting. So it's really, really good. But when we come to data science and start searching for it, we drown in a universe of jargon. It's a complex jargon world out there. People say machine learning, AI, deep learning; what do all these things mean? And not just that: if you noticed, the search volume for Python was much higher than for machine learning and data science per se. Python is a tool. C++ is a tool. PyTorch is a tool. These are all packages and tools that are out there. But if you think about it, these tools change really fast. PyTorch was not there five years ago; TensorFlow was not there five years ago.
But regression and classification, the concepts, those techniques have been around for a long time. And around for much longer than that are the fundamental mathematical ideas: statistics, curve fitting, optimization, matrix algebra. All of these have stood the test of time; they have been there for 100 or 200 years. The techniques have also been around for a long time, but tools change really fast. The main issue that we see here is that people find it extremely difficult to separate the tools from the techniques. Think of the fundamental concepts as the foundation of your building. On top of that foundation you build the techniques, which are the different floors, and the tools are just the interior, like the paint or the fittings that you keep in your house, which can change very fast. The strength of the building really comes from the firm foundation and from knowing the different techniques that are there. Of course, tools are also important. But we need a holistic approach: understand what these different terms really mean, and study them in order to apply them in practice. So that's the goal with which we are proceeding.
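To make the distinction between fundamentals and tools concrete, here is a minimal sketch of curve fitting by ordinary least squares written with only the Python standard library, that is, without any machine learning package. The function name `fit_line` and the sample data are illustrative assumptions, not material from the session.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (illustrative helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x; zero means all x values are identical.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must contain at least two distinct values")
    # Sum of cross deviations between x and y.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

A library would typically wrap the same calculation behind a single call; the formula underneath is the part that has stayed the same while the packages around it have changed.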
Yes. So let us take a unified view of what data science is. Many of you might be wondering what data science really means. Data science is an umbrella term. It involves a lot of different things like problem formulation, the Internet of Things, data visualization, telling stories from data, and of course machine learning, which is also a part of data science. Artificial intelligence per se is slightly different, in that it has existed as a field on its own for a long time. It is really at the intersection of artificial intelligence and data science that machine learning lies. Machine learning has a lot of techniques within it: regression, classification, support vector machines, decision trees. All of these are data-driven models, or machine learning algorithms, that sit in the intersection of data science and AI. And within machine learning is this emerging field of deep neural networks. That is what has caught the attention of most people, because deep neural networks today are able to learn from data in a way that was not possible before. Next slide, please. Yes. All of this has been made possible by the advent of data everywhere. Take one internet minute in 2020: in just one minute, the internet generates millions, even trillions, of data points. So there is this big data, this availability of data. All of your devices are now smart: smartwatches, smart TVs, smart fridges, smart washing machines. That is what is called the Internet of Things.
All these things are connected to the internet. They are smart, they collect data, and they send it to some server. On the server sit many machine learning and deep learning algorithms that make sense of this data: they process it and create information from it, and businesses can act on this information and make decisions. That possibility has come about because of the large amounts of data that people are generating today, and that is why data science has become really, really important. If you want some concrete examples, let's look at how the finance industry uses data. A lot of data gets generated in the finance industry, so let's look at some of the applications that use data, predictive models, and real-time simulations there. One major thing that the finance world is worried about is fraud detection: online fraud, credit card fraud. As soon as you swipe your credit card, you get a message saying that so much was swiped. But a decision needs to be made at that point in time on whether it was a legitimate transaction of your own or a fraudulent one. How do you make that decision? And it has to be done in real time. So the problem of fraud detection is a typical example of how big data, or data science at scale, really operates. Tons of data get generated on what is a correct transaction and what is an anomalous transaction, and the decision has to be made in real time across a large infrastructure. There are so many point-of-sale machines around the country and around the world, and it has to work across all of them. So that is one application, fraud detection. The other is in lending, that is, the loans that are processed.
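As a rough illustration of the fraud detection example, the sketch below flags a transaction as anomalous when its amount lies far from a customer's past spending, measured in standard deviations. This is a deliberately simple stand-in, assuming a made-up spending history and a hypothetical `is_anomalous` helper with an arbitrary threshold of 3; it is not how any particular bank's system works, and production systems use far richer features and models.

```python
from statistics import mean, stdev


def is_anomalous(history, amount, threshold=3.0):
    """Return True if `amount` is more than `threshold` standard deviations
    away from the mean of `history` (hypothetical helper for illustration)."""
    if len(history) < 2:
        return False  # not enough past data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts identical: anything different stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Made-up past transaction amounts for one customer.
past_amounts = [420.0, 380.0, 510.0, 450.0, 399.0, 475.0]

print(is_anomalous(past_amounts, 460.0))    # a typical amount
print(is_anomalous(past_amounts, 25000.0))  # far outside the usual range
```

The check itself is cheap, which is the point of the example: the difficulty described above lies in running such decisions in real time over a very large stream of transactions.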
How does a lending company decide whether a particular loan is going to become a non-performing asset or not? What interest rate must it charge a particular person? That decision is increasingly made by machine learning and data science applications. Information such as your profile, what your job is, what your past expenditure patterns are,
all of these are used in classification models to analyze the probability of you defaulting. They quantify the risk of lending you that amount of money and, based on that, decide the amount of interest that must be charged to you. So this is another major use of data. And, as I also mentioned, data is used a lot in marketing: identifying new customers, new investments, sales of new credit cards. All of this depends on analyzing the customer's purchasing power and credit history, and there are models that predict whether a particular profile is a potential new customer or not. So you can see that the finance industry has a lot of application areas where data is used, where machine learning is used, and where data engineering at scale is used. And now the IPL has started; on Friday the IPL started. The IPL is another place where a lot of data is used to make decisions, especially during the auctions. There are even these wearable devices which the players wear; you can see here a wearable device being worn. These devices collect data about the player's physical performance, and the coach is able to make decisions to minimize the injury risk or the level of strain that the players are placed under. Then there is the decision review system that we all know and all love to hate, with the umpire's call and things like that. The Hawk-Eye decision review system relies a lot on high-resolution cameras placed around the ground. It involves a combination of data-based models and physics-based models of how the ball is going to travel after it bounces, and based on data and physics these models predict the trajectory, that is, whether the ball is going to hit the stumps or not.
So there is a lot of data science and machine learning, in fact scientific machine learning, that goes on behind the scenes in predicting where the ball is going to go, and that in itself is a big data problem. There are some 16-odd high-resolution cameras, the data must come in in real time, it must be processed through a big data platform, analytics must be run on top of that, and all of this must happen in real time. That is the power of data science, or big data, in sports. So this is another interesting example. We all understand the power of data science, and of course a question arises: how have businesses been able to use data-driven decision making in their day-to-day activities? To answer this, there was actually a very interesting survey conducted last year among the Fortune 1000 companies. The CXOs of these companies were asked this question, with the intention of understanding whether data has transformed their business or not. The findings are in fact very interesting. So,
the key findings are that investments in big data and AI initiatives have actually leveled off. Most companies, 98.8% of them, have said that they have already invested, and only half of them are saying that they are investing more, that is, that their investment is increasing. What that means is that most of the companies have invested and many are not investing further. However, the surprising aspect is that only a very small fraction, 15% or so, have actually used AI in production. Why is that? We would have expected that at least 50% or 75% of the companies would have AI in production. What was found is that building a data-driven organization is a challenge: 75% of the firms said so, and only a quarter of the companies said that they were successful in building a data culture. So what does data culture mean? Data culture is the principal challenge that most people point to, and it is all about people and business-related processes. Technology and tools were not the challenge; it is the availability of trained data scientists who know the technology, who know the business, and who can bridge the gap. Next slide. That is, people who can bridge the gap between a proof of concept and taking it out as a system prototype that is production ready. This gap is what is called the innovation valley of death in most product life cycles. Trained people and champions are needed who know the data science, the math behind data science, the different models and tools that are there, and the industrial needs. That is what is required, and that gap is where you come in, as trained people.
Through this program, we hope that you get trained in the use of computational data science techniques and tools, so that with your existing business knowledge you can fill this gap and help your company take data-driven decisions and take them into production. That's the overall goal with which we are starting this program: training industry professionals in what is needed in the data science field. Next slide. So, why data scientists? New data scientists with knowledge of concepts, techniques, tools, and the domain are needed. You bring the domain knowledge; we will teach you the concepts, techniques, and tools that are needed to be a successful data scientist. You learn by doing and translate this into your practice. This is actually a challenging new way of thinking. It requires a diverse skill set, and that is what we are trying to bridge here. Next slide. Many of you may have a question about where you are in the whole hierarchy of data science. Usually a software engineer is involved in collecting data, that is, in writing the software that collects the data. Then there is the data engineer role, which is all about ensuring a reliable data flow: infrastructure, pipelines, extract, transform and load operations, working with structured or unstructured data and big data platforms. That is usually where the data engineer's role comes in. Then there is the traditional data analyst, who does the cleaning, anomaly detection, and preparation, then performs some basic analytics on top of it, and maybe does A/B testing and simple machine learning algorithms. That is traditionally the role of the data analyst. And then the data scientist: in fact, depending on the size of your company, a data scientist might be required to cover the entire hierarchy.
That means everything from data collection up to applying advanced machine learning and data engineering tools. Or, in big companies that have separate verticals for each, the role of the data scientist usually comes in from the problem formulation stage: building the models and drawing insights from them. So that's how the different roles that you might already be doing are related to the field of data science, and where you can fit as a data scientist. That's the idea.
Yeah. So to help you learn all that and get yourself trained as data scientists, our course curriculum has six modules. In addition to the six modules, there is actually a module zero, which is about training you to use Python. Python is the tool, the programming language, that we will be using throughout the course, so module zero is all about getting you up to speed with it. If you already know Python, it's a refresher course. If you don't know Python, the idea is that over the three to four weeks, or one month, that module zero runs, you will start using Python really hands-on. Then we have the computational data science in practice module, in which you will learn about the high performance computing aspects of data science: how parallel programming is used in developing these algorithms and the packages that implement them, that is, what parallel programming is doing behind the scenes of popular packages like TensorFlow and PyTorch. Then, to give you a base for understanding machine learning, neural networks, and deep networks, there is the mathematics, including probability, linear algebra, calculus, and optimization. That is the mathematics of data science module. Then we get into the data engineering aspects, especially using PySpark: the Spark paradigm for big data platforms, how distributed machines can be used to process data, the MapReduce paradigm, and why Spark should be used, along with hands-on experience of Spark. Then we get into the machine learning module, where we will look at both supervised and unsupervised learning algorithms. We will mostly be using scikit-learn in the machine learning module.
Scikit-learn is really powerful, and that is the tool that we will be using. But behind the tool, we will learn about regression, regularization, decision trees, random forests, support vector machines, ensemble methods, and dimensionality reduction such as principal component analysis. All these techniques and tools we will learn in the machine learning module. Then we get into the deep learning, or neural network, module: multilayer perceptrons, deep neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks. Graph neural networks are an emerging paradigm. We will look at all that, and we will conclude with reinforcement learning, which is another important machine learning and AI technique for gaining information from data. And we'll close all of that with the business analytics module, in which you will mostly be learning about financial analytics and time series analysis: how to analyze financial time series data using machine learning and deep learning, putting all the things that you have learned into the context of a business use case, which is basically a financial business use case. And then finally, there is the bring-your-own-project option and the capstone project. Maybe next slide. These are the different capstone projects that you can do. You can bring your own project if you feel like it, or we will have projects already assigned to you, which you will complete as a team, working with your cohort members as well as your mentors. Multiple aspects of machine learning and data engineering will be covered, and you will get good hands-on experience by working on this capstone project. We also have a unique module called the data stories module. In that, you will chronicle your learning throughout the process and prepare a portfolio of what you know.
This is to serve as an exhibition of your skills. It can come from the capstone project that you work on, or from all the different mini projects and assignments that you complete along the way. All of this can be put together in a nice form that you maintain. We'll tell you the best principles for showcasing your data science journey. So that is another unique aspect of our program.
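Going back to the data engineering module in the curriculum above, the MapReduce paradigm it mentions can be sketched on a single machine in plain Python. The sketch below counts words in three phases (map, shuffle, reduce). The function names and the input lines are illustrative assumptions; on a real cluster a framework such as Spark distributes these phases across machines, which this toy version does not attempt.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = word_count(["data is the new oil", "the data is processed"])
print(counts)
```

Because each line is mapped independently and each key is reduced independently, both phases can be spread over many machines, which is the property that makes the paradigm suitable for large data sets.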
Yes, that's it. So, our faculty. You already know me: I'm Deepak Subramani, and I have a PhD from MIT. Professor Shashi you have also met already; he has a PhD from Germany. Professor Yogesh Simmhan will be handling the data engineering module; he is an expert in data and computing systems, big data, and the Internet of Things. I will be handling the machine learning module, and Professor Aditya, Professor Shashi, and myself will be handling the mathematics module. Professor Andy will be handling the deep learning module; he is an expert in computational signal processing, deep learning, statistical inference, and graph neural networks. Professor Shashi Jain will be handling the business analytics module; he works on quantitative finance, derivative pricing, real option analysis, and so on. Professor Shashi will be handling computational data science in practice, the high performance computing aspects of data science. So that is how the modules and the learning are planned. Now we can have a discussion and take questions.
Absolutely. Thank you, Professor Shashi, and Professor Deepak as well, for taking us through this. One thing we would like to say is that you can send your questions on chat. We talk to a lot of people every day in this field, and I have some questions from those conversations, but we will take up your questions too. Send them on chat; we would love to answer all of them as and when they come in. It would also be apt to talk a little about the success of the cohort, both as we planned it and after the cohort started. In an executive program, a diverse batch is a very important aspect, because a lot of the learning happens not only inside the class but from each other. From that perspective, this slide on batch diversity is from the first cohort. Incidentally, the plan for the first cohort was to have just 50 people in the class, but such was the demand for the program that we started off with 100 people in class, and we plan the same again for the second cohort. The good part about this, and it speaks volumes about the quality of the learning, the faculty, and the IISc and TalentSprint partnership, is that we had planned to launch the second cohort sometime in June or July. As we speak today, on the 11th of April, 90% of the class is full, and there has been a huge demand from the cohort to start earlier. So now, instead of a July launch, we are going to start the current cohort early in the month of May. So probably in the first week of May, the module zero that Professor Deepak spoke about is going to start.
The interesting thing about the current class, which many of you may be interested in, is that the number of women in the class has gone up, which is a very positive thing for us. Instead of 20% women, we are close to having around 30% women in class right now. Once the final numbers come in, that number will be anywhere between 25 and 30%, or maybe a little higher. The average experience, again, is around 11 to 11 and a half years at the moment. We have around 20 cities and a couple of international locations from which participants are joining us, around 15 to 16 industries at last count, and somewhere around 84 to 85 companies as we speak. That number is going to go up very significantly by the time the class starts sometime in the month of May. So what you see here is from the first cohort, and we have similarly interesting things happening in the second cohort as well. Now, I can see some questions coming in, so let me try taking some of them. Okay, we have someone here from business development in the energy sector with an engineering background. How can a data science program, or a program like computational data science, help him?
Yeah, hello. Yes. So definitely, right. We will be teaching you what data science is and how to use data science in practice, with examples: how data can be used for making predictive models, how it can be used for building regression models or classification models, how the data can be arranged, and how you can manage and process all this data. This is the learning that you will take from here. Using that, with your business development and energy sector background, I'm sure you can do a lot of different things, so really the sky's the limit for you at that point. If you remember the valley-of-death-of-innovation kind of plot that I had shown you, you already have the right-hand side of that, which was the decision and business aspects of things. We are going to give you the left part, the data science skills that are needed, and help you bridge that gap and improve your business with the knowledge that you gain. You will see a lot of examples. They may not be specifically from the energy sector, but broadly you will see many different examples from which you can abstract out and get an idea of how they apply to your specific background. So that's what will be there.
Sure, and just to add to that, Professor: you always have the option of bringing your own project. If you have data available from your sector, you have identified a specific problem from your company, and they are willing to share that with us, I'm sure the program will allow you to formulate that problem and come up with a solution for it. I hope I'm right, Professor? Yes,
yes. So you can bring your own project. You can convince your cohort members that this is a project you should all work on, get a team together, work on it, and make it really interesting. We are open to that aspect.
Absolutely. Questions are coming in. Before we get to the other questions, I receive a lot of questions when we talk to prospects, and this one relates to data. We've been talking about data a lot. The complexity of data has increased a lot, as have the deep learning models that are used and the computing power available to us. For example, if we talk about image segmentation or something like that, you may have a 100-layer deep learning model in use. There is also this idea where people are trying to solve similar problems with less, but relevant, data and less complex models. What is your take, professors? Either of you can take this up. This is increasingly happening in the market.
Yes, definitely. These deep neural networks are what are called universal approximators. As you will see in our modules on machine learning and deep learning, at the end of the day they are solving an optimization problem. If you feed them lots and lots of data, these deep networks are capable of approximating any function that you want them to approximate, provided there are enough examples. But what those algorithms are not capable of doing is extrapolating beyond what they have seen in the data. This is the context in which things like explainable AI and good data-driven model building come in. We will cover a lot of models that are called white-box models. Decision trees, for example, are a white-box model in which you can explain what is happening inside, as opposed to neural networks and similar models, which are traditionally called black-box models, where you cannot really explain what is happening inside. We will be discussing all these aspects in our machine learning and neural network modules, including what it means to explain why a model is giving the output that it gives. So all those points will be covered in the course.
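To make the white-box idea concrete, here is a minimal sketch of the smallest possible decision tree, a single-split "stump", written in plain Python. It is not taken from the course material: the function names (`fit_stump`, `explain`, `predict`) and the temperature/demand data are hypothetical and chosen only for illustration. The point is that the fitted model can be printed as a rule a person can read.

```python
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold on one numeric feature that minimises
    misclassification error, with a majority label on each side.
    Assumes xs contains at least two distinct values."""
    best = None
    points = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return t, left_label, right_label

def explain(stump, feature_name):
    """Render the fitted stump as a human-readable rule."""
    t, left_label, right_label = stump
    return (f"if {feature_name} <= {t}: predict {left_label!r}; "
            f"else: predict {right_label!r}")

def predict(stump, x):
    t, left_label, right_label = stump
    return left_label if x <= t else right_label

# Hypothetical toy data: daily temperature (deg C) vs. energy demand level.
temps = [12, 15, 18, 21, 29, 31, 34, 36]
demand = ["low", "low", "low", "low", "high", "high", "high", "high"]

stump = fit_stump(temps, demand)
print(explain(stump, "temperature"))
```

A full decision tree repeats this split recursively on each side, so its explanation is a nested set of such rules. A deep network has no equally direct readout, which is the contrast drawn above.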
Excellent. Actually, I had a question on explainable AI as well. It's easy to interpret a decision tree, but once a model becomes more complex, it becomes inherently much more difficult to explain. So how do we handle large-scale, real-time problems? How can we have interpretable models that can be explained, so to speak?
Yeah, so in the simplest sense, the first thing in explaining a model is to understand what the model is doing and what it is supposed to do. If we know what the model is supposed to do, we can then explain why it is behaving the way it is. Most of the confusion, or this idea about non-explainable deep models, comes from the fact that we really don't know what is going on behind the scenes. Once we understand what goes on behind the scenes, we will be in a position to explain. Even decision trees are called explainable, right? But ask somebody who is not trained in decision trees what a tree is explaining, and it is difficult for them to get it. So you need to know what the model is and what goes on behind the scenes to explain it. The approach that we will take is to show, with examples and through a process, how these models learn and what goes on behind that; then we can explain what happens. The whole field of explainable AI is much bigger than the application to data science that we are talking about. We focus on practical applications and business decision applications, and, through examples, on how we can interpret the different data-driven models that we will build.
Absolutely, thank you, Professor. We'll take a question that has come in from Malika. She is inquiring about the kind of resources that participants will have access to: books, videos, research papers, etc. I'm assuming physical books will not be there. I can answer part of that. Firstly, all the classes are live classes. They are going to happen just like this: you can pause the session, ask questions, raise your hand, etc. You will have access to the whiteboard of the faculty. The platform itself is an excellent platform that gives you all aspects of a physical class: you can be broken down into groups, and the faculty can come in to those groups. All of that is part of the program, unlike other programs. This also answers Keisha, who asked how the program is different. This is a live class; you will be taught by faculty who are there. The course does not rely on recorded videos per se, where you are just on your own. In terms of videos and research papers, Professor, would you like to clarify that?
So all modules will have assigned reading material and our own slides, which have been prepared through a careful process to help you learn. That material will be available to all of you, and the recommended reference material and books will be mentioned. The program itself will not give you books; you will have to buy the books on your own if you want them. We will tell you what to focus on. As I also mentioned at the beginning, there is a deluge of data science material on the internet. If you Google for it, you will get hundreds or millions of hits on the different data science techniques and tools, and everybody has an explanation for what the different things are. What you gain from the program is learning that from experienced faculty. And it's going to be a live lecture; that live lecture will be recorded, and the video will be made available.
Yes. And the videos actually pass through an AI layer whereby the entire video is transcribed, so you can go to any section of the video, any topic that's being discussed. The same platform, by the way, is being used by IIM Calcutta to run their main MBA program during the pandemic, and by IIT Jammu as well. A couple of other institutions are also using the same platform to run their flagship programs. So it's an excellent platform. Malika has another question in terms of the main training process.
Right, the question about PDFs keeps coming, so maybe I will summarize so that the doubts can be clarified. To start with the lectures: for each module, the slides will be shared in advance, at least a few days or one weekend in advance. The material for the module will be either online material or books; where PDFs are available, the links will be shared. Most of the material you will be able to download and use; a few items you may need to buy, but we will make sure that most of it is available online. If you go through those links, the content, and the other things we recommend, that is more than enough for the modules. In addition, the assignments will be shared in advance, and then you will be doing the assignments. During the assignments there will be mentors; I think for every 10 or 15 participants there will be a mentor. The mentors will take care of guiding you and explaining how things work. In addition, there will be mini projects, and for the mini projects too you will get help from the mentors. So it is not that we will teach and then leave you to go and read the material on your own; it's not like that. The teaching material will be provided, the slides will be provided in PDF form, and the online materials and books will be shared. On assignments and mini projects you will get a lot of help; in fact, each group will get a dedicated mentor who will be helping. Towards the end, if you collect and keep all these materials, that is more than enough to revise your modules, whether to take the exam or even after the program.
You can also have access to the video material of the class lectures and other things. As far as I remember, it was previously decided that this would be for two years, but now TalentSprint is thinking of extending the validity period. So even after completion of your program you will have access to the videos. It is not that we take one class, close it, and go. So that is the whole content preparation of this program.
Right, thank you, Professor. Yes, you will anyway have access to the LMS, where all of the content and PDFs that the professor was talking about are available. The video part of it is a work in progress, but after you complete the program you will have access to it for a couple of years. There are also some questions that I think we can bunch together: the math required, any kind of preparation before DL, and which aspects of DL, whether NLP, computer vision, etc., will be covered or not. Professors, would you want to take that up, please?
Yes, sure. The math required is at first-year undergraduate level. You must know what a vector means, what a matrix means, what the probability of heads or the probability of tails means; that level of math is what is required. In our modules on mathematics, we will start from the basics, but go up to the advanced level needed to understand the advanced models for machine learning and neural networks. So the basic requirement is first-year undergraduate-level calculus: you should know what a function means and what probability means. If you know that, that is what is needed. You should also have an open mindset: if I don't understand some concept, I will read up on it and get an idea of what is happening there. We always have to start from the basics, but it will go up to very advanced levels, including what information entropy is and related ideas, all at the level of probability, and then up to machine learning and deep learning. Then there was the question about deep learning and NLP, so let's get to it. In the neural network module, we will look at the different neural network architectures that are used for computer vision applications and for NLP applications, and you are free to do your capstone project on any of those NLP or computer vision tasks. But the focus is not on NLP or computer vision per se. The focus is on getting you to know about machine learning, deep learning, data engineering, business analytics, and all the different aspects in the six pillars on which our program stands.
Computer vision and NLP are just small parts at the top. If you see that pyramid, they are right at the top. We will be doing the entire gamut of things that will lead you into NLP and computer vision.
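As an illustration of how far "probability of heads" level math can go, the sketch below computes information entropy for a list of outcomes using only the Python standard library. The `information_gain` helper is an addition for illustration (it is the quantity decision trees commonly use to choose a split) and is an assumption about what the course covers, not something stated in the session.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits: H = sum over classes of p * log2(1 / p)."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of its children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# A fair coin: heads and tails equally likely, so one full bit of uncertainty.
coin = ["H", "T"] * 4
print(entropy(coin))

# Splitting the outcomes perfectly by label removes all of that uncertainty.
print(information_gain(coin, [["H"] * 4, ["T"] * 4]))
```

Entropy is largest when all outcomes are equally likely and zero when the outcome is certain, which is why a split that separates the labels cleanly has the highest gain.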
Absolutely. Thank you, Professor. There are a couple of profile-related questions, but I think we can bunch in a couple of questions around placement assistance. You are not going to be part of the IISc placement process in any way; two people have asked this question. Placement is not a part of the program, primarily because the audience the program caters to is people with, on average, like I mentioned, around 11 to 12 years of experience. In such a situation it's not possible for placements to be there, because you are already a working professional, already working somewhere. For working professionals who are going to do this program, the primary requisite is not placement drives because, honestly speaking, with a decade of experience or more, you will probably have more contacts in the industry than the placement team at TalentSprint can ever provide you. What I say when people come to us for counseling is: build your skills up, and the job will follow you. Having said that, to specifically answer the question, there is no placement assistance. What we have is a career accelerator, which has certain components. One of them will be built around the data stories that you have over here. Other aspects will be helping you build up your Kaggle profile or your LinkedIn profile, as well as providing you resources, where you have access to certain blogs posted as part of our alumni network.
Once you pass out of the program, you will be part of the TalentSprint alumni network: thousands of professionals from top institutions. Not only IISc; we have programs with IIMs, we have programs with IITs, etc., and they are all part of the alumni network. Jobs get posted in that network on a very regular basis, so you will be able to use that. We also have many companies who come in and ask us for certain specific profiles and roles, and these are jobs that you will also be able to apply for. So in terms of job assistance, you should not look at this as a job assistance program. Look at it as building up your skills. We've been running these kinds of programs for years now, and in most cases people do not look at this as a way to get a job; most people are able to do that pretty successfully by building their skills up. Let's see if we have any other questions. How many capstone projects will they do in the program, Professor? I think it's one project. Can you please clarify?
Yeah, the capstone is the final project, which will be a culmination of all the things that you have learned. There will be mini projects along the way, each of which is a specific end-to-end project that abstracts out many of the other aspects and focuses on the particular tool or skill that you have just learned. So you will do several mini projects and one capstone project per team. All of that is going to be done on the Google Colab platform, so it's free to use. We will be using scikit-learn in the machine learning modules, TensorFlow with Keras (TF2, TensorFlow 2.0) in the deep learning modules, and PySpark for data engineering. Those are the tools that you will be using, and all of them are free to use, so you don't need a specific license or anything to do that.
Absolutely. So let's see. Again, a question.
So one clarification is on who will be teaching, and what is going to be held at IISc. That's the most important question: all modules will be taught by IISc faculty. If the pandemic situation improves, the final one-week capstone project presentation will be held at IISc. Apart from that, everything will be online. All courses will be taught by faculty from IISc, except module zero, where Python programming will be taught by, I think, Ashokan or somebody else from TalentSprint.
Yes. And the mentors are all industry professionals, right? Yeah, the mentors are all industry professionals with lots of experience. They know how these tools are used in practice. So the mentoring is done by TalentSprint? Yes. There was a question on whether faculty will be doing the mentoring. There will be faculty office hours, in which you can book a slot with the faculty, interact, and ask your questions.
Right, we also got this... Yes, Professor Shashi? No, no, go ahead. Go ahead. I was just looking at the other questions. We also got this question around infrastructure. While the professor has clarified about licensed software, I think any regular PC will be enough for most of the program. Am I right?
Yeah, that's right. For example, for the HPC kind of things, I will be showing how, even if participants don't have access to the underlying systems, they should be able to use them online. If they have a reasonably good computer, a laptop or a workstation, with a good internet connection, that's more than enough. If they are interested in doing the capstone project on a large-scale, big data problem, then they might need some additional access. But that will anyhow be done in a group, so maybe one of your partners will have that access and you can run it there. For development purposes, assignments and mini projects, everything will be done mostly on free platforms, especially Google Colab. So you really do not need any specific infrastructure for completing this course.
All our research papers are available online. You don't need to enroll in the program to read our research papers. Our funding mandates that all our research papers are available online, so you can just go to Google Scholar, search, and get them. Absolutely.
Thank you, Professor. There's a question on the duration of classes during the week. Again, classes will be on weekends. Professor, maybe you can clarify that.
The classes will be on Saturdays and Sundays. Mostly the lectures will be on Saturdays and Sundays, and the mornings especially are reserved for them. A few classes are planned for Sundays.
Sunday, I think. So for cohort two, most likely we will have classes on Sunday and mini projects on Saturday. Yeah,
that is a discussion we still have to work out, basically. Yeah.
Classes on a weekday? Can you clarify that?
No, there will be no classes on weekdays. Of course, the participants have to spend some time, at least a couple of hours, on weekdays in order to revise and prepare for the next class; otherwise there will be no continuity.
Absolutely. Okay, there's a question on the certificate. Let me clarify: it's not a diploma and it's not a degree. This is an advanced certificate program, so it's going to be a certificate. The certificate will come from IISc; the CDS department of IISc will be providing the certificate.
You will not have access to the IISc library.
On access to the library: with respect to advanced books on data science, machine learning and AI, as far as I remember almost all the advanced books in these fields are available online. I think hardly anybody reads the printed versions of these books nowadays. So you can get everything online. Absolutely.
Interesting question from Venkatesh: is it possible to pass on lessons learned from cohort one to cohort two candidates? You may not know this, but aspects of the program are being changed. To a certain extent I can talk from a TalentSprint perspective. One change that we are implementing after cohort one, and I don't know whether we'll be able to operationalize it by the time cohort two starts, but definitely after that, is this: other than the module zero that the professor spoke about, there is going to be an ongoing mathematical preliminaries and programming course running every month. Whoever enrolls can come into it. If you enrolled, say, three months before the start, you will probably have six opportunities to participate at no additional cost, where you can revise all of the mathematical preliminaries, basic programming, and all the programming concepts that you will need to succeed in programs like computational data science or any of the other programs that we run. We have received feedback from people, and this is something we are implementing; it should mostly be in place in the next 30 days. That may come too late for this cohort, but even so, if you want to attend those classes, we will not prevent you, in case you want to make time for them. In terms of feedback to the professors, we conduct a lot of surveys. We've already conducted a couple of surveys, I believe, with the current participants of cohort one, and changes that they have suggested are being incorporated in real time even for them. Learnings from that are also going to be incorporated for cohort two. Professors, you can add to that.
Just to add here: for every session we are collecting feedback, and in addition, towards the end of every module we are also collecting feedback. So if something is going wrong, or the pace is too fast or too slow, we are adjusting dynamically, session by session. That's how we are managing it. For instance, when we started with HPC and object-oriented programming, and how these concepts are used in almost all machine learning libraries, there were some comments saying that maybe the object-oriented part needs to be covered in more detail. So we are adjusting all those things session by session. These are things we follow on a regular basis. Content-wise, everyone is so far happy. As of Friday, we are completing around three modules; for those three modules everyone is happy content-wise, and we have not changed anything in terms of content. But there are other things. For example, as Deepak was already discussing, on Sundays we want to have a class in the morning and the mini projects in the afternoon, or move them to Saturdays; those are adjustments we are accommodating. Apart from that, we are open to all adjustments; as long as it is a genuine request to make the program a success, we are accommodating it dynamically. I do not see much else that needs to be changed. Maybe Deepak, you can add.
I think I'll just clarify the timing aspect again. During a weekend, there will be three hours of live faculty lecture, as two one-and-a-half-hour sessions with a break in between. There will be a mentor-led assignment or lab session, which will again be an interactive session where you work out problems. That is another two- or three-hour session. The faculty session is compulsory; you have to attend. The assignment session is optional: you can work on it on your own, but mentors will be available to help you in case you require it. Then there will be another three to three-and-a-half-hour session, with a break, on mini projects. Both the classes and the mini projects will be held in the morning, on either Saturday or Sunday; we will work that out. That is how the timing will be. In cohort one, what we have noticed, in addition to what Professor Shashi has mentioned, is that about 90 to 95% of the participants have said that their knowledge of computational data science has increased exponentially from the beginning of the cohort to the end of just module one. That is one outcome we have seen so far. Most of the candidates have said they spend about three hours beyond class time on learning. That's what they have said as part of cohort one.
Absolutely. I have a few interesting questions which I received from prospects; I don't know whether they are participants in the webinar. We have been talking about data a lot. There is a lot of noise in data; most data is not perfect, and that's one of the biggest problems we have in working with data these days. A lot of the data is also incomplete. So how do we go about training a good model in a very noisy environment where a lot of the data is incomplete? I know we can use concepts like SMOTE, upsampling, downsampling and all of that, but are there better ways of working with data like this?
I will invite you to attend our modules. Cleaning data is as important a part of building a solution as anything else. This is what is called data munging. As part of module zero, there are assignments on how to do this data munging. And when we discuss each individual model, we will talk about how to work with the data: for example, do you do bootstrapping or not? Do you train an ensemble? Do you perform what is called regularization? Regularization will help you work with small amounts of data and ensure that your model generalizes well. All these techniques and tools, including dropout regularization, will be discussed in detail in our modules. So, broadly, yes, there are several techniques, and you need to go through the course to get an idea of what exactly to do. Thank you, Professor. Yes. I mean, this is a question that had come in.
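A small sketch of what regularization does with a handful of noisy points, in plain Python. The speaker does not say which form of regularization the modules use; this example assumes L2 (ridge) regularization on a one-parameter model, and the function name `fit_slope` and the data are hypothetical. Dropout, also mentioned above, is a different technique specific to neural networks and is not shown here.

```python
def fit_slope(xs, ys, lam=0.0):
    """Slope w for the model y ~ w * x (no intercept), minimising
    sum((y - w * x) ** 2) + lam * w ** 2.
    Closed form: w = sum(x * y) / (sum(x * x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical data: three noisy observations of roughly y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.5, 3.5, 7.0]

plain = fit_slope(xs, ys)            # ordinary least squares, lam = 0
ridge = fit_slope(xs, ys, lam=5.0)   # L2 penalty pulls the slope toward zero

print(f"unregularized slope: {plain:.3f}")
print(f"ridge slope (lam=5): {ridge:.3f}")
```

The penalty trades a little bias for lower sensitivity to the noise in any single point, which is why it tends to help when data is scarce. The strength `lam` is a tuning choice, typically selected on held-out data.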
The other question that I'm seeing is Keshav's: how many students in cohort one are from the data science field? If memory serves me right, I have seen at least 12 to 15 people who had written "data scientist" in their résumé; that number could be higher as well. So we already have some who want to build more fundamental concepts. Something else we have seen is many people who have already passed out of a data science program from some other player coming in and doing a program with us, because of the rigor of the program, the kind of outcomes that people have, and obviously the great brand that the IISc certificate provides. I think we are a little past time, but we will take two more questions, if that's okay. There is also a question about participants joining this program from premier institutes. Of course, we've had cases where people who have passed out of IITs come in and join the programs. But it's not really about the premier institution they come from. When you juxtapose that with the kinds of companies they are working in, the kinds of roles they hold, and the years of work experience they bring in, that's the real richness that you get. All of these combined minds in class, or as part of your group, are one of the biggest takeaways of the program. I have been through the IIM Calcutta program on using AI in marketing, again with similar numbers of around 100 people, and ideas have come in from that, and we are still in touch with our cohort, which I'm sure happens across all of the programs that we run. That's a big takeaway for most of you.
Last question. Okay: any campus visit? The professor did mention that if COVID protocols permit, you may have a campus visit at the end of the program. Am I right, Professor? Yes.
Yes, that's right. In fact, it is part of the program design, but unfortunately, because of the COVID situation, we are not able to confirm or assure it. Otherwise, the last week of the project, especially the capstone project presentation, will be held at IISc, and during that period the participants will have time to spend at least three or four days on campus. But all of this now depends on how the pandemic situation improves. So I cannot say a hundred percent yes, but it is part of the program.
Yeah, and if it happens, I believe it's a mandatory requirement currently? Mandatory, yes. Right. If some people cannot be there, I think some exceptions will have to be worked out.
As always, there will be exceptions. Okay, just to summarize: what you are going to learn is not just the tools; it's about how to learn the tools. Learn the learning. This is what we are going to teach. There was also a question about premier institutes. It doesn't matter where you are coming from; where you get to at the end of the program and how much knowledge you gain is more important than the background. This is what we believe in at IISc, and this is what we are going to do.
Absolutely. Thank you, Professor Shashi. Thank you, Professor Deepak, for spending time with us. There is a small poll that I've opened up; it's been there for a while now. For those who have not voted, if you can vote, we'd love to get your opinion on that. Thank you so much, professors, for spending an hour, almost an hour and a half, with us.
All right, one more question; I think I can take that. So, what should be our approach to revision or practice after attending your classes? I think that is an important question. The idea is that our classes will have core content that is power-packed. You will learn maybe six to thirteen concepts during that day, and we will give you sufficient examples and practice materials that you will need to revise. So the ideal approach would be that, let's say a class gets over in the morning, that evening itself you just read through it, for maybe two or three hours, and make it concrete and keep it in your head. That will be the best approach. Then you can do the assignments over the week. And when you come to the mini project and finish off with the assignment sessions, the concept will be crystal clear for you. That is the design of the program. A lot of participants have also asked for pre-reading material. So, one week in advance, we will release the material that you will need to read before coming to the class. But again, we understand that working professionals might not have time to read prior to the class, so that is not expected. If you want, you can read before. But what you should definitely do is, after the class, maybe that same day itself, revise the concepts and get everything cleared, and use the office hours and the mentor sessions to the maximum that you can. And infrastructure-wise, there was a question about what infrastructure you need. It's not necessary, but I recommend that you have at least two screens to make full use of the platform. Right now, if you are watching on only one screen, we are going to go to the small box at the top right.
So in Zoom, if you have two screens, you can put the professor on one of them, and you will really be learning from the professor; it'll be like I'm there in your living room. And if you switch off your mobile for the three hours that you are in the class, then your learning will be better. That's the other recommendation that I give to all the participants in Cohort One and even to our own students. This will help you a lot. Right? Yeah.
So, this last bit, about emailing the registered participants the actual class schedule: I think we will finalize the actual class schedule in the next couple of weeks, is that right?
Before the start of the program, we will be sharing the schedule for the entire program. In addition to that, of course, they will know exactly which days they will be doing the mini project and when there will be classes; everything will be scheduled in advance before the program starts. To answer this question: yes. Once the participants join the LMS, they will have access to the pool, and they can also interact in the virtual classroom. In addition to that, they also have the WhatsApp group. I think nowadays everything is in hand, so I don't see any reason why they cannot interact.
And the program is packed, right? There is a question about what the free weekends are. There will be no free weekends: for eight months, we'll have a class every weekend. If you miss some weekends, that's okay. There will be recorded lectures available, and you can of course interact with the mentors and get your doubts clarified.
So I think we are mostly done with the questions. Like always, we have unfortunately overshot our time, but that's a good thing, you know: we can answer all the questions that people have. Thank you so much, professors, again, and thank you, everybody who joined us for this session today. It's been lovely having you, getting your perspectives, and telling our prospective students about the program, how it's different, etc. I look forward to all of you joining the cohort and seeing you in class in the month of May. So thank you again, professors. And thank you, everybody, for joining in.
Thank you. Thank you. Thank you. Good luck to all participants.
Watch the entire interview here https://youtu.be/vAc4EgmxVrE