The Data Science Foundations (CS2132.01)

Meltem Ballan

Data Science Foundations provides an interactive introduction to common algorithms and techniques in data science. This class covers data preprocessing, regression techniques, supervised and unsupervised learning algorithms, decision trees, neural networks, ensemble methods, model evaluation techniques, and ethics.

Throughout the course, students are encouraged to focus on developing an applied understanding of topics in data science. Each section covers topics from a conceptual standpoint without assuming prerequisite knowledge in statistics and programming. Concepts are reinforced by learning questions and coding questions that build understanding and confidence. At the end of the course, students will have the necessary skills to dive further into the more quantitative and technical aspects of data science and machine learning.

With the integration of Jupyter Notebooks, students gain access to a web-based computing platform that allows them to write and edit live code and create data visualizations. Students can experiment by changing the parameters of different models to evaluate their performance.

This class will be offered as a 2000-level class; however, students should be familiar with computation and have a basic understanding of Python and the Jupyter Notebook environment. Students who want to gauge their readiness for the class can download Jupyter Notebook from https://jupyter.org/ and test their abilities.

Students who are interested in ethics and public action in data protection, bias, and ethical use should talk with the instructor about an exemption.

Friday classes are hands-on lab sessions, and students are required to complete the work either in a group or alone. The tutors will lead these sessions and address your questions. If for any reason you need extra time or support, please email me.


Learning Outcomes:
In this class we will follow the zyBook, working through well-known problems and use cases. This course is designed for students who want to learn and pursue data analysis, data science, and AI. Topics covered:

1. Introduction to Data Science

2. Python for Data Science

3. Probability and Statistics

4. SQL for Data Science

5. Data Wrangling

6. Data Exploration

7. Regression

8. Evaluating Model Performance

9. Supervised Learning

10. Unsupervised Learning

11. Decision Trees

12. Artificial Neural Networks

13. Ensemble Techniques





Delivery Method: Hybrid
Course Level: 2000-level
Credits: 4
T/F 2:10PM - 4:00PM (Full-term)
Maximum Enrollment: 25
Course Frequency: Every 2-3 years

Categories: 2000, All courses, CAPA, Computer Science, Four Credit, Hybrid