Course

Reinforcement Learning (DAT605)

This course introduces AI and optimization in an accessible, immersive, and hands-on way. Optimization problems are becoming essential across multiple disciplines. The skills gained in this course allow students to apply efficient optimization strategies in work processes. Examples include optimizing complex machine learning models to make them more efficient, creating exploratory models that can evaluate a situation without prior training and gradually make better decisions, and exploring financial data to discover patterns that lead to beneficial outcomes.


This is the course description for the academic year 2024-2025. Note that changes may occur.

See course description and exam/assessment information for this semester (2024-2025)


Facts

Course code

DAT605

Credits (ECTS)

5

Semester tuition start

Autumn

Language of instruction

English

Number of semesters

1

Exam semester

Autumn

Content

NB! This is an elective course and may be cancelled if fewer than 10 students are enrolled by August 20th.

Artificial intelligence has today become nearly synonymous with supervised and unsupervised learning. Supervised learning is best suited to cases with a large set of example inputs and desired outputs, where the objective is to learn from those examples in order to generate outputs for future, currently unseen inputs. Text classification, image classification, object localization, regression problems, and sentiment analysis are areas where supervised learning is used extensively. Unsupervised learning, by contrast, aims to discover hidden structure in the data without requiring a distinction between input and output values. Such techniques are commonly used for clustering, which groups data items into a set of clusters that reveal relationships in the data.

Reinforcement learning (RL) lies somewhere between supervised and unsupervised learning. On the one hand, it builds on established supervised learning methods such as function approximation, stochastic gradient descent, and backpropagation to learn data representations. On the other hand, it does not require supervision to discover hidden patterns and relationships in data. Reinforcement learning focuses primarily on the problem of automatically learning optimal decisions over time in a complex environment, building on advances in computer science, behavioral psychology, and neuroscience. Due to its flexibility and generality, the field of RL is developing very quickly and attracting a lot of attention, both from researchers who are trying to improve existing methods or create new ones and from practitioners interested in solving their problems in the most efficient way.
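
To make the idea of learning decisions over time concrete, the following is a minimal sketch of tabular Q-learning, one of the methods listed under the course topics. The corridor environment, the function name `train_corridor_agent`, and all parameter values are illustrative assumptions and are not taken from the course materials.

```python
import random


def train_corridor_agent(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor environment.

    States are 0..n_states-1. The agent starts in state 0 and receives a
    reward of 1 only when it reaches the rightmost state, which ends the
    episode. Actions: 0 = move left, 1 = move right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon (or when both actions look
            # equally good); otherwise take the currently best action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition; moving left from state 0 stays put.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


q_table = train_corridor_agent()
# Greedy action per non-terminal state (0 = left, 1 = right).
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

No labelled examples are supplied in this sketch: the agent only observes rewards, and the learned values should come to favour moving right in every state.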

The target group for this course is professionals and students working in or interested in artificial intelligence, machine learning, game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, and statistics.

Learning outcome

The concepts covered in this course provide relevant theoretical and hands-on programming knowledge. Every topic is demonstrated using easy-to-understand real-world examples. The following topics are covered during the course:

Knowledge:

  • Topic 1: Reinforcement Learning - an introduction
  • Topic 2: Course Materials, Supplementary Resources, and Development Environment
  • Topic 3: Tabular Methods
    • Multi-Armed Bandit
    • Markov Decision Processes
    • Cross-Entropy Method
  • Topic 4: Dynamic Programming
  • Topic 5: Monte-Carlo & Temporal Difference and Q-Learning
  • Topic 6: Policy Gradients
  • Topic 7: The Actor-Critic Method
  • Topic 8: Deep Q-Network - an Overview
  • Topic 9: Further Exploration
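
As an illustration of the kind of problem covered under the multi-armed bandit topic above, the following is a minimal sketch of an epsilon-greedy agent. The function name `run_epsilon_greedy`, the arm means, and the parameter values are hypothetical and chosen only for illustration; they are not taken from the course materials.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection on a Gaussian multi-armed bandit.

    true_means holds the (hidden) mean reward of each arm. The agent only
    sees noisy rewards and keeps a running average estimate per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    pulls = [0] * n_arms        # number of times each arm was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward, std 1
        pulls[arm] += 1
        # Incremental update of the sample mean for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

    return estimates, pulls


estimates, pulls = run_epsilon_greedy([0.1, 0.5, 2.0])
most_pulled_arm = max(range(len(pulls)), key=lambda a: pulls[a])
print(most_pulled_arm, pulls)
```

The parameter `epsilon` controls the trade-off between exploration and exploitation: a larger value samples all arms more evenly, while a smaller value commits sooner to the arm that currently looks best.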

Skills:

  • Good programming knowledge
  • Knowledge of basic algebra, probability, and statistics
  • Python programming knowledge
  • Understanding of NumPy and Matplotlib

General competence:

  • Artificial Neural Networks
  • Deep Learning
  • Convolutional Neural Networks

Required prerequisite knowledge

None

Recommended prerequisites

Introduction to Programming (DAT120), Introduction to Data Science (DAT540), Probability and Statistics 2 (STA500)

Exam

Form of assessment: Project report (group)
  • Weight: 1/2
  • Marks: Letter grades
  • Date: 18.10.2024

Form of assessment: Oral exam
  • Weight: 1/2
  • Marks: Letter grades
  • Aid: None permitted
  • Withdrawal deadline: 22.11.2024
  • Exam date: 06.12.2024


Project Work in Groups

The project is completed in groups. Project work is to be performed in the groups that are assigned and published. Absence due to illness or for other reasons must be communicated as soon as possible to the lecturer.

A project report, including source code, and an oral exam contribute to the grade.

A student who fails the project work must retake it the next time the course is taught.

Requirements for taking the exam/assessment

Assignments

Throughout the semester, students work individually on 2 assignments. These assignments are mandatory and must be passed by the announced deadlines for the student to be eligible to take the exam.

Absence due to illness or for other reasons must be communicated to the laboratory personnel as soon as possible. Students cannot expect arrangements for completing lab assignments at other times unless this has been agreed with the laboratory personnel in advance.

Course teacher(s)

Head of Department:

Tom Ryen

Course coordinator:

Antorweep Chakravorty

Method of work

The work will consist of 4 hours of lecture and 2 hours of scheduled laboratory. Students are expected to spend an additional 4-8 hours a week on self-study, group discussions, and development work.

Open for

Admission to Single Courses at the Faculty of Science and Technology
Data Science - Master of Science Degree Programme
Exchange programme at Faculty of Science and Technology

Course evaluation

There must be an early dialogue between the course supervisor, the student union representative, and the students. The purpose is to collect feedback from the students for changes and adjustments to the course in the current semester. In addition, a digital course evaluation must be carried out at least every three years. Its purpose is to gather the students' experiences with the course.

Literature

  • Brown, Brandon; Zai, Alexander. Deep Reinforcement Learning in Action. Manning Publications, 2020. ISBN 1-63835-050-7.
  • Wiering, Marco; van Otterlo, Martijn. Reinforcement Learning: State-of-the-Art [electronic resource]. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. 653 p. ISBN 1-280-79537-9; 9786613705761; 3-642-27645-8.
  • Plaat, Aske. Learning to Play: Reinforcement Learning and Games. Cham, Switzerland: Springer, 2020. XIII, 330 p. ISBN 3-030-59238-3.
  • Sutton, Richard S.; Barto, Andrew G. Reinforcement Learning: An Introduction. Cambridge, Mass.: MIT Press, 1998. xviii, 322 p. ISBN 0-262-30384-1; 9786612096785; 1-282-09678-8; 0-262-25705-X; 0-585-02445-6.
The course description is retrieved from FS (Felles studentsystem). Version 1