Subject: Data Science

Scientific Area:

Computing

Workload:

80 Hours

Number of ECTS:

7.5 ECTS

Language:

English

Overall objectives:

1 - Enable students to learn to think independently and creatively about how to collect, process and analyze data to solve problems and answer posed questions.
2 - Introduce the data science process, its tools, and related technologies.
3 - Enable students to develop complete machine learning-based approaches using supervised and unsupervised learning, including ensemble techniques.
4 - Enable students to create data products as visualizations and oral presentations, grounded in rigorous communication that is also appealing and persuasive.

Syllabus:

1 - What is data science and the process of information discovery - question formulation
2 - Data acquisition and storage
3 - Exploratory data analysis and visualization
4 - Data preprocessing, dimensionality reduction, and balancing
5 - Feature creation and selection
6 - Classification and regression with supervised learning: training algorithms, overfitting, and cross-validation
7 - Model combination with boosting and bagging
8 - Performance evaluation with metrics, graphs, and significance analysis
9 - Unsupervised learning algorithms
10 - Case studies with time series, image processing, and text mining

Literature/Sources:

EMC Education Services, 2015, Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, Wiley
William McKinney, 2022, Python for Data Analysis 3e: Data Wrangling with pandas, NumPy, and Jupyter, O'Reilly
Joe Reis, Matt Housley, 2022, Fundamentals of Data Engineering: Plan and Build Robust Data Systems, O'Reilly

Assessment methods and criteria:

Classification Type: Quantitative (0-20)

Evaluation Methodology:
Teaching methodology: theoretical and theoretical-practical classes, plus tutorial supervision.
Evaluation:
1. Theory component - 40% (individual): two tests, 20% + 20% (minimum grade of 8 points in each test).
2. Practical component - 60% (in groups of 3 students): two projects, 25% + 35% (minimum grade of 8 points in each project).
2.1 First project: data analysis.
2.2 Second project: planning and development of machine learning models.
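
The weighting described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the weights and the minimum of 8 come from the evaluation methodology, while the function name, the dictionary keys, and the choice to return None when a minimum is missed are hypothetical. The course description does not say how a missed minimum or rounding is handled.

```python
from typing import Dict, Optional

# Weights as stated in the evaluation methodology (they sum to 100%).
WEIGHTS: Dict[str, float] = {
    "test1": 0.20,
    "test2": 0.20,
    "project1": 0.25,
    "project2": 0.35,
}
MIN_GRADE = 8.0  # minimum required in each test and each project (0-20 scale)


def final_grade(grades: Dict[str, float]) -> Optional[float]:
    """Return the weighted grade on the 0-20 scale.

    Assumption: returns None when any component is below the minimum,
    since the source does not specify what happens in that case.
    """
    for name in WEIGHTS:
        if not 0.0 <= grades[name] <= 20.0:
            raise ValueError(f"{name} must be between 0 and 20")
    if any(grades[name] < MIN_GRADE for name in WEIGHTS):
        return None
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"test1": 12, "test2": 14, "project1": 16, "project2": 15}
    print(final_grade(example))
```

For the example grades, the weighted sum is 0.20 × 12 + 0.20 × 14 + 0.25 × 16 + 0.35 × 15, which is about 14.45 before any rounding.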