Mew isn’t really predicting the end of data science, as he says in his first post. He claims that the field is improving as more tools become available to us for managing the technical aspects of our work. Mew writes:

Problem solvers and critical thinkers are the most in-demand people for the job. They have to be able to understand the industry, its stakeholders and the business. The ability to use a handful of software packages and to recite a few lines of code will not suffice to define a data scientist.

I agree completely! This has been our philosophy since the beginning of the Elmhurst University Master’s program in Data Science and Analytics. It’s the main reason our graduates have been so successful.

Data Science Skills for the Next Ten Years

Shah’s optimistic post also fits well with the program’s approach.

Shah reminds us that data science is science, which has been around for centuries in many forms. Science is about using information to generate theories that can then be applied to solve problems. According to Shah, data science does exactly that. Only the terminology is slightly different: information is called “data” and theories are called “models.” Thanks to the explosion in computing power and readily available data, data science can now be used to solve a wide range of meaningful problems.

Shah also gives some tips on data science projects: models are only a small part of real projects, and real-world data science projects require iterative development.

These ideas, once again, are a perfect fit for Elmhurst’s project-based curriculum, which focuses on creating stakeholder value.

Case Study: Students Apply Modern Data Science Skills

These ideas were demonstrated in action during a recent student capstone project.

The setting is an organization that offers software and platforms for survey design and wants to improve the user experience. How can it do this?

Creating a survey on the platform requires a lot of customization, which means setting up specialized configuration settings for each survey. More customization means more chances for error. Unintentional user error can be reduced if default settings are adjusted dynamically.

These surveys use a lot of open-ended questions. The sentiment of these questions, meaning the degree to which they have an overall positive or negative tone, can be related to the survey configuration settings. If sentiment can be predicted from the text of the questions, then better default configuration settings may be possible.

The project goal was clear: predict survey sentiment with at least 80% accuracy.

Note the emphasis on the problem statement and on the iterative process required to arrive at it. This part is not technical. It all comes down to good domain knowledge and the scientific method.

Case Study: Conclusions

Next came understanding and correcting the data. When working with text data, this is not an easy task. To identify and quantify patterns in the text, techniques as simple as counting word frequencies and as complex as Latent Dirichlet Allocation and Short Text Topic Modeling were used. These patterns were then used to create candidate models that could predict survey question sentiment.

Although it may seem simple, many iterations were required to reach this point. Some ideas didn’t work. Others were promising and led to useful refinements later.
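To make the simplest of these techniques concrete, here is a minimal word-frequency sketch in Python. It is not the students’ code: the function name `word_frequencies` and the sample questions are invented for illustration, and a real project would likely add steps such as stop-word removal.

```python
import re
from collections import Counter


def word_frequencies(questions):
    """Count how often each lowercase word appears across all questions."""
    counts = Counter()
    for text in questions:
        # Keep runs of letters and apostrophes; drop punctuation and digits.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Hypothetical open-ended survey questions, made up for this example.
sample_questions = [
    "What did you enjoy most about the event?",
    "What frustrated you most about the checkout process?",
    "How could we make the event better?",
]

freqs = word_frequencies(sample_questions)
top_words = freqs.most_common(3)  # the three most frequent words with counts
```

Counts like these can serve as simple input features for a candidate sentiment model, alongside the richer topic-model features mentioned above.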

The modeling phase was the same. Numerous models were created, tested and refined. Each model had its strengths and weaknesses. SHAP plots, a powerful technique for revealing how models make their decisions, provided valuable insights. They highlighted the influence of everything from word counts to the number of comparative adjectives used in a question.

The final decision was made on technical and business grounds. This included balancing accuracy with interpretability, ease of deployment, and long-term maintainability.

Despite this great effort, only 70% accuracy was achieved, short of the 80% goal. The results were still very useful. Importantly, the systematic and careful approach revealed areas for improvement that could eventually allow the 80% goal to be reached.
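Checking a model against the project goal comes down to simple arithmetic, sketched below in Python. The labels and predictions are toy values invented to mirror the situation described, not the project’s data, and the names `accuracy` and `TARGET_ACCURACY` are assumptions made for this example.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


TARGET_ACCURACY = 0.80  # the goal stated in the problem statement

# Toy sentiment labels, made up for illustration only.
actual = ["pos", "neg", "pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg", "neg"]

score = accuracy(actual, predicted)
meets_target = score >= TARGET_ACCURACY
shortfall = max(0.0, TARGET_ACCURACY - score)  # distance left to the goal
```

In practice the score would be computed on held-out questions the model has not seen, and accuracy would be weighed against the other criteria above, such as interpretability and ease of deployment.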

Elmhurst: Refine your data science skills

The Data Science and Analytics program at Elmhurst University helps professionals excel in the business world. You can also earn your master’s degree online with our flexible format. Are you ready to find out more? Fill out the form below.
