The ability to create, manage, and use data has become one of the most important challenges for practitioners in almost all disciplines, sectors, and industries. In this course, students become familiar with basic tools and processes used in data science. Students work through the whole data lifecycle, from loading, through cleaning and modeling, to storing the data. The work is performed using a Python stack that includes, among others, IPython, NumPy, Pandas, Matplotlib, and Jupyter Notebooks. Students learn to structure their work using CRISP-DM and the Data Science Process (Ask, Get, Explore, Model, Communicate and Visualize).
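As a rough illustration of the lifecycle described above (load, clean, model, store), a minimal sketch might look as follows. The data set, the column names, and the choice of a straight-line fit are assumptions made for this example only and are not taken from the course material.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical raw data; in practice this would come from a file or database.
RAW_CSV = """day,temperature,sales
1,20.0,200
2,22.0,220
3,,240
4,26.0,260
4,26.0,260
5,28.0,280
"""

# Load: parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(RAW_CSV))

# Clean: drop exact duplicate rows, then rows with missing values.
clean = df.drop_duplicates().dropna().reset_index(drop=True)

# Model: fit a straight line sales = slope * temperature + intercept.
slope, intercept = np.polyfit(clean["temperature"], clean["sales"], deg=1)

# Store: serialize the cleaned data (here to an in-memory buffer).
buffer = io.StringIO()
clean.to_csv(buffer, index=False)
stored_csv = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; a real project would more likely write to a file or a database.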
Learning outcome
Knowledge:
Execute/Develop tools to load, parse, clean, transform, merge, reshape, and store data.
Compare regular Python, NumPy, and Pandas data structures and choose one for the given problem.
Use the IPython shell and Jupyter notebook for exploratory computing.
Execute/Develop simple machine learning or data mining algorithms.
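To make the comparison of data structures named in the outcomes above concrete, the following small sketch doubles the same values held in a regular Python list, a NumPy array, and a Pandas Series. The values and index labels are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

values = [1.5, 2.5, 4.0]  # hypothetical measurements

# Plain Python list: flexible, but element-wise arithmetic needs a loop.
# Note that 2 * values would repeat the list rather than double its elements.
doubled_list = [2 * v for v in values]

# NumPy array: a single dtype and vectorized arithmetic.
arr = np.array(values)
doubled_arr = 2 * arr

# Pandas Series: array-like values plus a labeled index.
series = pd.Series(values, index=["a", "b", "c"])
doubled_series = 2 * series
value_b = doubled_series["b"]
```

Which structure fits best depends on the problem: lists for small heterogeneous collections, arrays for numeric computation, and Series or DataFrames when labels and alignment matter.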
Skills:
Organize data analysis following CRISP-DM and the Data Science Process
Build engaging visualizations of data analysis using matplotlib
Optimize data analysis by applying the available data structures and methods
Evaluate, communicate and defend results of data analysis
General qualifications:
Solve real-world data analysis problems following a well-structured process
Required prerequisite knowledge
10 Credits in Programming, Databases or Software Engineering related courses.
Recommended prerequisites
Introduction to Programming (DAT120), Probability and Statistics 2 (STA500)
Exam
Form of assessment | Weight | Duration | Marks | Aid | Exam system | Withdrawal deadline | Exam date
Project work in groups | 3/5 | | Letter grades | | - | - | -
Written exam (Multiple Choice) | 2/5 | 3 Hours | Letter grades | | Inspera assessment | 21.11.2023 | 05.12.2023
Project Work in Groups
The project is completed in groups. Project work must be carried out in the groups that are assigned and published. Absence due to illness or other reasons must be reported to the lecturer as soon as possible.
A project report, including source code, contributes to the grade.
A student who fails the project work must retake this part the next time the course is taught.
Each week, the work consists of 6 hours of lectures, scheduled laboratory sessions, and supervised group work. Students are expected to spend an additional 6-8 hours a week on self-study, group discussions, and development work.
Open for
Admission to Single Courses at the Faculty of Science and Technology
Exchange programme at Faculty of Science and Technology
Course evaluation
There must be an early dialogue between the course supervisor, the student union representative, and the students. The purpose is to collect feedback from the students for changes and adjustments to the course in the current semester. In addition, a digital course evaluation must be carried out at least every three years. Its purpose is to gather the students' experiences with the course.
Literature
Book: Python for data analysis: data wrangling with pandas, NumPy, and Jupyter. McKinney, Wes. Sebastopol, Calif.: O'Reilly, 2022. 550 pp. ISBN 9781098104030. Both the 2nd and 3rd editions can be used.
Book: Python for data analysis: data wrangling with Pandas, NumPy, and IPython. McKinney, Wes. Beijing: O'Reilly, October 2017. xvi, 528 pp. ISBN 978-1-491-95766-0. Both the 2nd and 3rd editions can be used.
Book: Building machine learning systems with Python: explore machine learning and deep learning techniques for building intelligent systems using scikit-learn and TensorFlow. Coelho, Luis Pedro; Richert, W. Birmingham, England: Packt, 2018. 1 online resource (394 pages). ISBN 1-78862-222-7. Online: https://bibsys-ur.userservices.exlibrisgroup.com/view/uresolver/47BIBSYS_UBIS/openurl?ctx_enc=info:ofi/enc:UTF-8&ctx_id=10_1&ctx_tim=2019-11-25T08%3A29%3A34IST&ctx_ver=Z39.88-2004&url_ctx_fmt=info:ofi/fmt:kev:mtx:ctx&url_ver=Z39.88-2004&rfr_id=info:sid/primo.exlibrisgroup.com-BIBSYS_ILS&req_id=&rft_dat=ie=47BIBSYS_UBIS:5155298430002208,ie=47BIBSYS_UBB:51130169800002207,ie=47BIBSYS_UBA:5164517660002209,ie=47BIBSYS_UBTO:51114176660002205,ie=47BIBSYS_NMH:5116912120002309,ie=47BIBSYS_UBIN:5148118220002211,ie=47BIBSYS_NB:51243574860002202,ie=47BIBSYS_NTNU_UB:51212133650002203,ie=47BIBSYS_NMBU:5125993220002213,ie=47BIBSYS_NLA:5131219200002228,ie=47BIBSYS_MF:5128211320002227,ie=47BIBSYS_MH:5125391120002296,ie=47BIBSYS_HIB:5134925840002221,ie=47BIBSYS_HVO:5138469310002220,ie=47BIBSYS_HIO:5166313490002218,ie=47BIBSYS_HIT:5141235480002210,ie=47BIBSYS_HIOA:5166278170002212,ie=47BIBSYS_HIM:5134265290002223,ie=47BIBSYS_HH:5149075130002214,ie=47BIBSYS_BI:5172882850002215,ie=47BIBSYS_FHS:5118701700002275,ie=47BIBSYS_NETWORK:71520532450002201,language=eng,view=UBIS&svc_dat=viewit&u.ignore_date_coverage=true&user_ip=10.16.56.140&req.skin=primo
Book: Python data science handbook: essential tools for working with data. VanderPlas, Jacob T. Sebastopol, Calif.: O'Reilly, 2017. XVI, 529 pp. ISBN 978-1-491-91205-8.
The course description is retrieved from FS (Felles studentsystem). Version 1