Thursday, January 24, 2019

The evaluation metrics on the train and validation splits enable us to debug the model and discover whether it is underfitting or overfitting the training set. If it is underfitting (not learning enough), we can increase the power of the model; if it is overfitting (learning noise), we apply regularization. The concepts of overfitting and underfitting are explained further in the next chapter.
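
As a rough illustration of this debugging step, the sketch below compares a train score with a validation score and labels the result. The function name diagnose_fit and the thresholds min_score and max_gap are hypothetical, chosen only for this example; it also assumes a metric where higher is better, such as accuracy. Sensible thresholds depend on the problem and the metric.

```python
def diagnose_fit(train_score, val_score, min_score=0.80, max_gap=0.10):
    """Classify a model's fit from its train and validation scores.

    Assumes a "higher is better" metric (e.g. accuracy in [0, 1]).
    min_score and max_gap are illustrative thresholds, not standard values.
    """
    if train_score < min_score:
        # Poor score even on data the model has seen: not learning enough.
        return "underfitting"
    if train_score - val_score > max_gap:
        # Good on the training split but much worse on unseen data: learning noise.
        return "overfitting"
    return "reasonable fit"


print(diagnose_fit(train_score=0.62, val_score=0.60))  # underfitting
print(diagnose_fit(train_score=0.99, val_score=0.75))  # overfitting
print(diagnose_fit(train_score=0.91, val_score=0.88))  # reasonable fit
```

A result of "underfitting" suggests trying a more powerful model, while "overfitting" suggests applying regularization.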

Data Presentation

The last stage is all about presenting our findings in a form that is intuitive and understandable to non-technical professionals such as managers, marketers or business leaders. The importance of this step cannot be overemphasized, as it is the crowning jewel of the data science process. Presentation is usually done by leveraging visualizations and tables. The purpose of this step is to communicate the insights discovered from the data science process in such a way that the information provided is actionable. This means data presentation should enable a decision-making process: it should be clear from the presentation what steps need to be taken to solve the original problem, which was posed as a question in the first step. It may also be desirable to automate the process, as the insights produced may be so valuable that they need to be revisited regularly. Another possible outcome is bundling the model into a data product or application that is served to end users. To do this, the model would need to be optimized for production and deployed in a scalable fashion.