UBC's Data Science Minor is an interdisciplinary program that enables students to gain the skills necessary to perform data science tasks in conjunction with the skills they learn in their major.
In this minor, students gain an understanding of key data science concepts, such as how to program with data, how to apply statistics to data, and how to use machine learning and statistical models. The Minor in Data Science is an interdisciplinary and interdepartmental undergraduate program administered through the Faculty of Science. This program is open to any UBC-Vancouver undergraduate student.
Prerequisites for admission to the minor
A minimum grade of 68% in DSCI 100 and a minimum grade of 68% in (a) one of CPSC 110, CPSC 107, CPSC 103, EOSC 211, MATH 210, PHYS 210, ECON 323, COMM 337, APSC 160 or (b) any CPSC course numbered 200 or higher.
Students applying for the Data Science minor must, at the time of application, satisfy the prerequisites for admission mentioned above and the requirements imposed by their home faculty.
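The grade rule above can be read as a simple check. The following is a minimal sketch, not an official eligibility tool: the function names and the transcript format (a dictionary mapping course codes to percentage grades) are assumptions made for illustration, and the sketch does not cover any requirements imposed by a student's home faculty.

```python
# Illustrative only: the course list and the 68% threshold come from the
# prerequisite text above; all names and the transcript format are assumptions.
PASSING_GRADE = 68

# Option (a): named courses accepted as the second prerequisite.
OPTION_A_COURSES = {
    "CPSC 110", "CPSC 107", "CPSC 103", "EOSC 211", "MATH 210",
    "PHYS 210", "ECON 323", "COMM 337", "APSC 160",
}


def is_upper_cpsc(course: str) -> bool:
    """Option (b): any CPSC course numbered 200 or higher."""
    subject, _, number = course.partition(" ")
    # Keep only the digits so a suffixed number such as "447B" still parses.
    digits = "".join(ch for ch in number if ch.isdigit())
    return subject == "CPSC" and digits != "" and int(digits) >= 200


def meets_admission_prereqs(grades: dict[str, float]) -> bool:
    """Return True if the grades satisfy the minor's admission prerequisites."""
    # DSCI 100 is always required at 68% or higher.
    if grades.get("DSCI 100", 0) < PASSING_GRADE:
        return False
    # At least one option (a) or option (b) course must also reach 68%.
    return any(
        grade >= PASSING_GRADE
        and (course in OPTION_A_COURSES or is_upper_cpsc(course))
        for course, grade in grades.items()
    )
```

For example, under these assumptions a transcript with 75% in DSCI 100 and 70% in CPSC 103 would pass the check, while one with 67% in DSCI 100 would not, regardless of other grades.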
Program requirements for the Data Science Minor
- Data Science: DSCI 100.
- Statistical Inference: STAT 201.
- Programming: One of CPSC 203, CPSC 210, CPEN 221. For most non-CS majors, we recommend CPSC 103 followed by CPSC 203.
- Calculus: One of MATH 100, MATH 102, MATH 104, MATH 110, MATH 120, MATH 180, MATH 184, SCIE 001.
The upper-level requirements consist of 6 courses (18 credits):
- Statistical inference: STAT 301
- Machine learning: CPSC 330
- Ethics: CPSC 430
- Three of the following five options:
  - Reproducible data science: DSCI 310
  - Data visualization: DSCI 320
  - Cloud computing and big data: CPSC 416
  - Databases: One of CPSC 368, CPSC 304, COMM 437
  - Discipline-specific data science courses: One of COMM 335, COMM 365, COMM 414, COMM 415, CPSC 322, CPSC 340, CPSC 406, ECON 398, ECON 425, EOSC 410, INFO 419, LING 342, MATH 441, MATH 442, MICB 405, MICB 425, PHYS 410, PSYC 359, STAT 406, STAT 447B, STAT 450.
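The upper-level rule above (three fixed courses plus three of five options, for six courses in total) can be sketched as a set-based check. This is a rough illustration rather than an official degree audit: the function names are hypothetical, and the sketch assumes that each option counts at most once, so two database courses would fill only the Databases option.

```python
# Illustrative only: course lists are copied from the requirements above;
# the names and the "each option counts once" reading are assumptions.
REQUIRED = {"STAT 301", "CPSC 330", "CPSC 430"}

OPTIONS = {
    "Reproducible data science": {"DSCI 310"},
    "Data visualization": {"DSCI 320"},
    "Cloud computing and big data": {"CPSC 416"},
    "Databases": {"CPSC 368", "CPSC 304", "COMM 437"},
    "Discipline-specific": {
        "COMM 335", "COMM 365", "COMM 414", "COMM 415", "CPSC 322",
        "CPSC 340", "CPSC 406", "ECON 398", "ECON 425", "EOSC 410",
        "INFO 419", "LING 342", "MATH 441", "MATH 442", "MICB 405",
        "MICB 425", "PHYS 410", "PSYC 359", "STAT 406", "STAT 447B",
        "STAT 450",
    },
}
OPTIONS_NEEDED = 3


def satisfied_options(courses: set[str]) -> list[str]:
    """List the options for which at least one accepted course was taken."""
    return [name for name, choices in OPTIONS.items() if courses & choices]


def meets_upper_level(courses: set[str]) -> bool:
    """Check the three fixed courses and at least three distinct options."""
    return REQUIRED <= courses and len(satisfied_options(courses)) >= OPTIONS_NEEDED
```

Representing each option as a set keeps the "one of" lists easy to update if the accepted courses change.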
New Data Science courses
DSCI 310 Reproducible and trustworthy workflows for data science
Data science methods to automate the running and testing of code and analytic reports, manage data analysis software dependencies, package and deploy software for data analysis, and collaborate with others using version control.
- Credits: 3
- Pre-reqs: DSCI 100 and either (a) one of CPSC 203, CPSC 210 or CPEN 221 or (b) MATH 210 and one of CPSC 107, CPSC 110.
DSCI 320 Visualization for data science
Analysis, design, and implementation of static and interactive visual representations. Visualization literacy. Data communication. Exploratory Data Analysis. Application of theoretical principles to visualization development.
- Credits: 3
- Pre-reqs: STAT 201 and one of CPSC 203, CPSC 210, or CPEN 221.
CPSC 368 Databases in Data Science
Overview of relational and non-relational database systems. Role and usage of a database when querying data. Topics include data modelling, query languages, and query optimization.
- Credits: 3
- Pre-reqs: One of CPSC 203, CPSC 210, CPEN 221.
As an example path, consider a student in Psychology who is interested in the Data Science minor. This example student might take the following courses:
- DSCI 100
- CPSC 103
- CPSC 203
- STAT 201
- MATH 102
- STAT 301
- CPSC 330
- CPSC 368
- DSCI 310
- DSCI 320
- PSYC 359
Program learning outcomes
- Identify and collect data necessary to answer a given research question through sampling and/or through extracting data from pre-existing sources (relational databases, HTML web pages, web APIs, etc.).
- Manipulate messy, ill-formed data to extract meaningful insights.
- Map and apply an appropriate data analysis approach to a given research question and the data at hand.
- Select data science methods to work with diverse data types across diverse subject-area domains.
- Build statistical models that are appropriate given the distribution(s) of the data, and appropriately quantify uncertainty of resulting estimates and predictions.
- Apply fundamental programming principles in the data analysis process to make analysis code readable, modular, accurate and scalable.
- Communicate results of data science experiments to diverse audiences through data visualizations, written work and oral presentations.
- Employ best practices for collaboration for projects that involve both code and people.
- Perform and communicate results from analyses that are fair, equitable and honest.
- Employ workflows that facilitate reproducible and transparent data analyses.