Survival Data Analysis
Assignment One
Due: 5PM, Monday, 02/04/19
This assignment includes the following tasks:
Part 1. (Submission not required)
1. Install R. Use this URL or go directly to a mirror site, such as the
one at Duke University, to download and install R. Use the Windows
version for a PC and the (Mac) OS X version for a Mac. The package
“base” is what we need. RStudio is a user interface that supports
menu-driven commands and can be helpful to new users who are less
familiar with R commands. It is worth exploring, although the class
will not emphasize RStudio.
2. Become familiar with basic R operations (see R notes), including:
- Start and close an R session
- Execute commands in the Console window
- Execute commands in a Script window
- Save commands in the Script Window into an R script file (program
file) for future reference
- Open an existing script file in a Script window
- Save and update the content of an R session into a Workspace for
future use; use different workspaces for different projects;
- Use online help
- Import data: upload datasets by using the command read.csv,
read.table etc. (R commands will be in italic)
- Learn about the various data structures in R: vector, matrix, data
frame, list, and array, using commands is.data.frame, is.vector,
is.matrix, is.list, as.data.frame, as.vector, as.matrix, as.list
- Learn about data types: numeric, character, factor, and logical, by
using commands is.numeric, is.character, is.factor, is.logical,
as.numeric, as.character, as.factor, as.logical
- Become familiar with viewing and manipulating data in R. Basic
commands include: nrow, names, length, and those of data subsetting
and extraction (<-, [,], etc) and statistical operations (mean, var, sd
etc)
- Become familiar with commands for displaying data: table, plot,
boxplot, etc
- Learn about statistical analysis commands: fisher.test, chisq.test,
t.test, etc
- R notes are posted for your reference
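The basic operations listed above can be tried out with a short script such as the one below. It is only a sketch: the tiny dataset and its column names (id, group, days) are made up for illustration and are not the course data.

```r
# Write a tiny, made-up CSV to a temporary file so the script is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,group,days",
             "1,A,30",
             "2,B,120",
             "3,C,250",
             "4,B,75"), csv_path)

# Import the data
dat <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the data structure and the data types
print(is.data.frame(dat))
print(is.character(dat$group))
print(is.numeric(dat$days))

# Convert the grouping variable to a factor
dat$group <- as.factor(dat$group)

# View and subset the data
print(nrow(dat))
print(names(dat))
sub_b <- dat[dat$group == "B", ]   # rows belonging to group B only

# Simple statistical operations and a frequency table
print(mean(sub_b$days))
print(sd(sub_b$days))
print(table(dat$group))
```

Typing ?read.csv or help(table) in the Console opens the online help for any of these commands.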
Part 2. (Due 5PM 02/04/19) Analyses of 250-day Mortality & Survival
Data
3. Download the file “Death250” from Canvas, and read it into R using
the command read.csv. Also use the R command “?read.csv” to review the
online help for read.csv. Following the sample R script file, compare
the mortalities between groups B and C at three distinct days (other
than days 120 and 250). For the mortality rates determined at each time
point, construct a 2-by-2 table, and apply both the Chi-squared test
and Fisher’s exact test. State your hypotheses, then report and
interpret the results. Before the analyses, you need to convert the
data according to the days of your choosing (that is, determine who had
died by each chosen day). For this exercise, we will ignore group A.
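As a rough illustration of the conversion and the two tests, the sketch below uses simulated data, not the Death250 file. The column names (group, time, status), the chosen day, and the assumption that every subject is followed to day 250 are all hypothetical; adapt them to the actual file and the sample script.

```r
# Simulated stand-in for the real data (hypothetical columns):
#   group  = "B" or "C"
#   time   = day of death, or 250 if still alive at the end of follow-up
#   status = 1 if died by day 250, 0 otherwise
set.seed(1)
n <- 60
raw_time <- c(rexp(n, rate = 1 / 150), rexp(n, rate = 1 / 250))
d <- data.frame(group  = rep(c("B", "C"), each = n),
                status = as.integer(raw_time <= 250),
                time   = pmin(ceiling(raw_time), 250))

# With the real data, drop group A first, for example:
#   d <- droplevels(subset(d, group %in% c("B", "C")))

# Convert time-to-death into dead/alive status at a chosen day
day <- 60
d$outcome <- factor(d$time <= day & d$status == 1,
                    levels = c(TRUE, FALSE),
                    labels = c("Dead", "Alive"))

# 2-by-2 table of group by outcome
tab <- table(Group = d$group, Outcome = d$outcome)
print(tab)
print(prop.table(tab, margin = 1))   # mortality rate within each group

# Chi-squared test (continuity-corrected by default for a 2-by-2 table)
chi <- chisq.test(tab)
# Fisher's exact test
fis <- fisher.test(tab)
print(chi$p.value)
print(fis$p.value)
```

Repeating the block with a different value of day gives the analysis at another time point.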
4. In contrast to comparing mortalities, conduct a weighted log-rank
test to compare time-to-death (survival) between groups B and C. State
the hypotheses and interpret your findings. One can further select
different values of rho (0, the default; 0.5; and -0.5) to weight the
data differently, reflecting different types of alternative hypotheses.
Explain how the weights affect the significance (p-value) of the tests.
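One possible way to run the weighted tests is with survdiff from the survival package, whose rho argument weights each death time by S(t)^rho, where S(t) is the pooled Kaplan-Meier estimate; rho = 0 gives the ordinary log-rank test. The sketch below again uses simulated data with hypothetical column names (group, time, status), so the p-values it prints say nothing about the real dataset.

```r
library(survival)

# Simulated stand-in for the real data (hypothetical columns)
set.seed(1)
n <- 60
raw_time <- c(rexp(n, rate = 1 / 150), rexp(n, rate = 1 / 250))
d <- data.frame(group  = rep(c("B", "C"), each = n),
                status = as.integer(raw_time <= 250),  # 1 = death observed
                time   = pmin(ceiling(raw_time), 250)) # censored at day 250

# Run the test for each value of rho and collect the p-values
rhos <- c(0, 0.5, -0.5)
p_values <- sapply(rhos, function(r) {
  fit <- survdiff(Surv(time, status) ~ group, data = d, rho = r)
  # Chi-squared statistic with (number of groups - 1) degrees of freedom
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
})
names(p_values) <- paste0("rho=", rhos)
print(round(p_values, 4))
```

Because S(t) decreases over time, a positive rho places relatively more weight on early deaths and a negative rho on later deaths, which is a useful starting point for explaining how the p-values change.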
5. Obtain and plot the Kaplan-Meier estimates of survivorship for both
groups B and C. Report (a) the point estimate and confidence interval
for the median survival time, as well as (b) the survivorship at day
250.
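A minimal sketch of these steps, assuming the survival package and the same simulated stand-in data with hypothetical column names (group, time, status), might look like this:

```r
library(survival)

# Simulated stand-in for the real data (hypothetical columns)
set.seed(1)
n <- 60
raw_time <- c(rexp(n, rate = 1 / 150), rexp(n, rate = 1 / 250))
d <- data.frame(group  = rep(c("B", "C"), each = n),
                status = as.integer(raw_time <= 250),  # 1 = death observed
                time   = pmin(ceiling(raw_time), 250)) # censored at day 250

# Kaplan-Meier estimate for each group
km <- survfit(Surv(time, status) ~ group, data = d)

# Plot the two survival curves
plot(km, lty = 1:2, xlab = "Days", ylab = "Survival probability")
legend("bottomleft", legend = names(km$strata), lty = 1:2)

# (a) Median survival time with its 95% confidence limits, per group
med <- quantile(km, probs = 0.5)
print(med$quantile)
print(med$lower)
print(med$upper)

# (b) Estimated survivorship at day 250, with 95% confidence limits
s250 <- summary(km, times = 250, extend = TRUE)
at_250 <- data.frame(group = s250$strata,
                     surv  = s250$surv,
                     lower = s250$lower,
                     upper = s250$upper)
print(at_250)
```

Printing km directly also shows the number of subjects, number of events, and the median with its confidence interval for each group.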
6. In your own words, discuss the difference between the analysis of
mortality (measured over a fixed time period) and the analysis of
survivorship. Feel free to speculate on the advantages of analyzing
survivorship over mortality.
Additional learning objectives:
(1) Review the rationale for and approach to testing association between a
classification factor (treatment) and a dichotomous response (mortality,
prevalence or cumulative incidence) using either the Chi-squared or Fisher’s
exact test in the setting of a 2-by-2 table
(2) Contrast time-to-event data with incidence or prevalence data through
examples
(3) Be exposed to the Kaplan-Meier estimate and log-rank tests as a hands-on
introduction to survival data analyses
_______________________________________________________________
General Requirements for Assignments
Homework shall be summarized in a brief report following the style of a
scientific report. It shall include
(1) A statement of the problem, and a brief description of the background. The
description aims to help a lay person (e.g. one who is not in the class) to
understand what you are trying to address. In addition, there shall be a
statement and description of the statistical problem that is translated from
the substantive problem.
(2) A precise description of the statistical approach or methods taken in the
work.
(3) A results section that presents the results of your analyses. Try to
reorganize your results into tables and figures with relevant and succinct
information. It is neither necessary nor helpful to present every piece of
computer output in the results section.
(4) Interpretation of the results that addresses the substantive question. This
requires a translation of the statistical results back to the substantive question.
Simply stating that the results are significant because the p-value is less than
0.01 is not an adequate translation or interpretation.
(5) Conclusion. When feasible, draw a conclusion and discuss potential issues
and uncertainties.
(6) Appendix when necessary. Computer programs and additional outputs that are
useful supplements can be attached in appendices. You are encouraged to include
comments in your statistical analysis programs for documentation purposes.
Submission as an email attachment or through Canvas is also highly encouraged
to reduce the use of paper. As a convention, please name files using the
assignment number and your name. For example, Assignment1_Zhu,
Assignment1_app_Zhu.
In general, one may follow the style of technical and/or scientific journals. To
that end, figures and tables shall be labeled with a title (and footnotes) and
numbered (e.g. Figure 1, Table 2). Appropriate references should be made
and citations given. There is no fixed page requirement as long as the report is
complete and informative.