Department of Computer and Mathematical Sciences
STAC51: Categorical Data Analysis
Winter 2021
Instructor: Sohee Kang
E-mail: sohee.kang@utoronto.ca
Office: IC 483
Online Office Hours: Monday 5-6 pm and Wednesday 5-6 pm
(416) 208-4749
TA: Bo Chen
E-mail: bojacob.chen@mail.utoronto.ca
TA: Lehang Zhong
E-mail: lehang.zhong@mail.utoronto.ca
Course Description: This course covers statistical models for categorical data. Topics include
contingency tables, generalized linear models, logistic regression, multinomial responses, logit
models for nominal responses, log-linear models for two-way, three-way, and higher-dimensional
tables, models for matched pairs, repeated categorical response data, correlated and clustered
responses, and statistical analyses using R. Students will be expected to interpret R code and
output on tests and the exam.
Prerequisite(s): STAB27H3 or STAB57H3 or MGEB12H3 or PSYC08H3
Credit Hours: 3
Required Text: An Introduction to Categorical Data Analysis, 3rd Edition
Author(s): Alan Agresti
WebLink for 2nd edition: https://search.library.utoronto.ca/details?7961944
Sub-text1: Categorical Data with R, 3rd edition
Author: Alan Agresti
Sub-text2: Analysis of Categorical Data with R (2014)
Authors: Bilder, C. and Loughin, T.
Course Objectives:
At the completion of this course, students will be able to:
1. use R software to conduct categorical data analysis.
2. identify designs of contingency tables and recommend appropriate measures of association
and statistical tests.
3. develop models for binary response and polytomous categorical responses, interpret results
and diagnose model fits.
4. interpret and communicate categorical data methods to a technical audience.
Grade Components:
Case Study and Presentation 15%
Assignments 15%
Quizzes 15%
Midterm Exam 20%
Final Exam 30%
Attendance 5%
Course Policy:
• Communication
– Important announcements, lecture notes, additional material, and other course info will
be posted on Quercus. Check it regularly. You are responsible for keeping up with
announcements from instructors on Quercus and via e-mail.
– Check Piazza before you send an e-mail, and make sure that you are not asking for
information that is already posted there. In general, I will not answer questions about
the course material by e-mail. Such questions are better discussed during my office
hours or the TAs' office hours.
– E-mail is appropriate for private communication. Use your utoronto.ca account and
include STAC51 in the subject line.
• Oral Assessment
If the instructor has doubts about an assessment result (that is, the result deviates greatly
from what is expected), she will conduct a follow-up oral assessment. If the oral assessment
confirms those doubts, the previous assessment score will be replaced with 0.
• No makeup quizzes or exams will be given.
Learning Components:
• Tutorial
Students are expected to attend the weekly tutorial to gain practical R programming experience.
Quizzes will be conducted in tutorial. You must turn on your video so that the TAs can
invigilate.
• Assignments
Three assignments (5% each) will be distributed. All assignments are group work (teams of
two) unless you prefer to work individually.
• Quiz
Three quizzes (5% each) will take place after the assignments are handed in.
• Case Study and Presentation
Students will be required to work on a case study as a group and to submit a report. The
maximum group size is four. You can choose your group members. For the report,
students will write R code, interpret R output, and use R Markdown (an R package).
More details, such as the content and deadline, will be communicated later. No late reports
will be accepted. Each group will present its case study (5 minutes) on the last day of
lecture.
• Attendance
Attendance is expected and will be taken at each class and tutorial.
• Computing
Statistical computing is a key part of the class. In-class analysis will be conducted
in R, and all course material (code and data) is in R format. R is free and available for
download at http://www.r-project.org, and you can find manuals and installation guidelines
on this site.
For the basics of R, the following documents are suggested: R for Beginners by Emmanuel
Paradis; An Introduction to R by W. N. Venables, D. M. Smith, and the R Core Team; and
A (very) short introduction to R by Paul Torfs and Claudia Brauer. More information and
documentation are available on the R Project website. Students are expected to write R code
and interpret R output on assignments, tests, and the exam.
Outline of Topics:
Chapter   Content
Ch. 1     • Introduction
          • Distributions for categorical data
          • Statistical inference for categorical data
Ch. 2     • Describing contingency tables, independence of categorical variables
          • Comparing proportions, relative risk, odds ratio
Ch. 2     • Inference for contingency tables, chi-squared tests of independence
          • Exact tests for small samples
Ch. 3     • Introduction to generalized linear models: generalized linear models for binary
            data, Poisson loglinear models, negative binomial GLMs
Ch. 4     • Logistic regression
Ch. 5     • Building, checking, and applying logistic regression models
Ch. 6     • Models for multinomial responses
Ch. 7     • Loglinear models for two-way tables, loglinear models for three-way tables,
            inference for loglinear models
Ch. 8     • Models for matched pairs
University Policies
• Academic Integrity:
Academic integrity is essential to the pursuit of learning and scholarship in a university,
and to ensuring that a degree from the University of Toronto is a strong signal of each student's
individual academic achievement. As a result, the University treats cases of cheating
and plagiarism very seriously. The University of Toronto's Code of Behaviour on Academic
Matters (http://www.governingcouncil.utoronto.ca/policies/behaveac.htm) outlines the behaviours
that constitute academic dishonesty and the processes for addressing academic offences.
Potential offences include, but are not limited to:
In papers and assignments:
– Using someone else's ideas or words without appropriate acknowledgment.
– Submitting your own work in more than one course without the permission of the instructor.
– Making up sources or facts.
– Obtaining or providing unauthorized assistance on any assignment.
On tests and exams:
– Using or possessing unauthorized aids.
– Looking at someone else's answers during an exam or test.
– Misrepresenting your identity.
In academic work:
– Falsifying institutional documents or grades.
– Falsifying or altering any documentation required by the University, including (but not
limited to) doctors' notes.
All suspected cases of academic dishonesty will be investigated following procedures outlined
in the Code of Behaviour on Academic Matters. If you have questions or concerns about what
constitutes appropriate academic behaviour or appropriate research and citation methods, you
are expected to seek out additional information on academic integrity from your instructor
or from other institutional resources (see http://www.utoronto.ca/academicintegrity/).
• Accessibility:
Students with diverse learning styles and needs are welcome in this course. In particular,
if you have a disability/health consideration that may require accommodations, please feel
free to approach me and/or the AccessAbility Services Office as soon as possible. I will
work with you and AccessAbility Services to ensure you can achieve your learning goals in
this course. Enquiries are confidential. The UTSC AccessAbility Services staff (located in
S302) are available by appointment to assess specific needs, provide referrals and arrange
appropriate accommodations. Contact: (416) 287-7560 or ability@utsc.utoronto.ca.