
HNSC 7150X Fundamentals of Biostatistics
Transplant Data Sets Report by
Ogbo Akpara
Professor: E.R. Pouget
11/29/17
Transplants
• I chose this data set because I was very interested in learning more about
transplants.
• To transplant is to transfer an organ or tissue from one individual to
another.
• Many people in this world need a transplant so that they can have a second
chance at life.
• According to the University of New Orleans School Transplant Living, there was
a 20% increase in transplants over 5 years. This is very impressive because
more individuals are receiving transplants each year.
• “Thousands more men, women and children are receiving a life-saving transplant
opportunity each year,” said Dr. Stuart Sweet.
Data: Xplant.xls
• Five variables were given to me:
• V1 (Months) -> Either the time between the transplant and death, or the time
between the transplant and the end of the patient's observation period in the
study.
• V2 (Status) -> Whether the patient was alive or dead at the end of the period
they were observed in the study.
• V3 (Xplant) -> Whether a transplant was performed.
• V4 (Age) -> Age at acceptance.
• V5 (Pheart) -> Whether the patient had previous open-heart surgery.
Data: Xplant.xls
• When I opened the transplant data set in SPSS, most of the variables were
coded as numeric, which made them easy to analyze.
• For this data set, I will use the topics that we learned in class, together
with SPSS, to analyze the transplant data.
• The topics I will use are Descriptive Statistics (Frequencies and
Descriptives), Survival Analysis, Regression, Correlation (Bivariate), and
Comparing Means (One-Sample T-test).
Descriptive Statistics
• N = 97 individuals
• No individuals were missing.
Descriptive Statistics
                     status   xplant   age        pheart
N Valid              97       97       97         97
N Missing            0        0        0          0
Mean                 0.71     0.65     45.14367   0.16
Median               1.00     1.00     47.75300   0
Mode                 1        1        47.981     0.00
Standard Deviation   0.455    0.480    9.813538   0.373
Minimum              0        0        8.787      0
Maximum              1        1        64.407     1
Sum                  69       63       4378.936   16
As you can see from this table, age had the largest values for the measures of
central tendency (mean, median and mode).
Age also had the highest measure of dispersion (standard deviation).
Age had the highest maximum. Status, xplant and pheart shared the lowest
minimum (0).
Age had a higher sum than pheart, xplant and status.
Because status, xplant and pheart are coded 0 or 1, their means are the
proportions of individuals coded 1.
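The means in the table can be checked against the reported sums and N. The short sketch below does that arithmetic; the dictionary and variable names are illustrative choices, and the only inputs are the Sum row and N = 97 from the table.

```python
# Sums and N as reported in the Descriptive Statistics table.
n = 97
sums = {"status": 69, "xplant": 63, "age": 4378.936, "pheart": 16}

# The mean of each variable is its sum divided by N.
means = {name: total / n for name, total in sums.items()}

for name, value in means.items():
    print(f"{name}: mean = {value:.3f}")
```

For the three variables coded 0 or 1, the mean is simply the share of individuals coded 1.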
Pheart Frequency
          Frequency   Percent   Valid Percent   Cumulative Percent
Valid 0   81          83.5      83.5            83.5
      1   16          16.5      16.5            100.0
Total     97          100.0     100.0

As you can see from this frequency table, the number 0 represents no previous
open-heart surgery and 1 represents previous open-heart surgery. This sample
contained a total of 97 individuals. Of those individuals, 16 had previous
open-heart surgery. As percentages, 83.5% of individuals had no previous
open-heart surgery and 16.5% did. The valid percent is identical to the percent
because no values are missing. The cumulative percent is 83.5% for the first
category and 100% after the second because it is a running total of the valid
percent column.
Xplant Frequency
          Frequency   Percent   Valid Percent   Cumulative Percent
Valid 0   34          35.1      35.1            35.1
      1   63          64.9      64.9            100.0
Total     97          100.0     100.0

As you can see from this frequency table, the number 0 represents no transplant
and 1 represents a transplant performed. This sample contained a total of 97
individuals. Of those individuals, 63 received a transplant. As percentages,
35.1% of individuals did not receive a transplant and 64.9% did. The valid
percent is identical to the percent because no values are missing. The
cumulative percent is 35.1% for the first category and 100% after the second
because it is a running total of the valid percent column.
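The Percent and Cumulative Percent columns of a frequency table can be rebuilt from the counts alone. The sketch below does this for the xplant counts (34 and 63); the variable names are illustrative, and with no missing values the valid percent equals the percent.

```python
# Counts for each xplant value, taken from the frequency table.
counts = {0: 34, 1: 63}
total = sum(counts.values())

rows = []
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / total
    cumulative += percent  # running total of the percent column
    rows.append((value, counts[value], round(percent, 1), round(cumulative, 1)))

for row in rows:
    print(row)
```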
Bar Graph on pheart versus frequency

Value labels:
1 = previous open-heart surgery
0 = no previous open-heart surgery
This graph plots the frequency of each pheart value. Matching the frequency
table, 81 people had no previous open-heart surgery and 16 people did.
Individuals without previous open-heart surgery have a greater frequency than
those with it.
Bar Graph on xplant versus frequency

Value labels:
1 = transplant performed
0 = no transplant
This graph plots the frequency of each xplant value. Matching the frequency
table, 63 people received a transplant and 34 people did not.
Individuals who received a transplant have a greater frequency than those who
did not.
Survival Function Block 1
Variables in the Equation
         B       S.E.    Wald     df   Sig.    Exp(B)
age      0.071   0.017   18.133   1    0.000   1.073
xplant   2.097   0.288   52.866   1    0.000   8.142

As you can see in this table, age and xplant have a p-value less than 0.001.
This indicates that they are statistically significant. The Sig. value is
reported as 0.000, indicating that it is less than 0.001 but not exactly 0.

Variables not in the Equation
         Score   df   Sig.
pheart   0.522   1    0.470

Residual Chi-Square Score = 0.522 with df of 1. The significance value is
0.470, which is not significant because the p-value is greater than 0.05.
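The Exp(B) column is the exponential of B, and the Wald statistic is the squared ratio of B to its standard error. The sketch below recomputes both from the rounded B and S.E. values for age and xplant; the names are illustrative, and because the inputs are rounded to three decimals the Wald values only approximate the ones SPSS printed.

```python
import math

# (B, S.E.) pairs as printed in the Variables in the Equation table.
coefficients = {"age": (0.071, 0.017), "xplant": (2.097, 0.288)}

results = {}
for name, (b, se) in coefficients.items():
    exp_b = math.exp(b)      # hazard ratio, Exp(B)
    wald = (b / se) ** 2     # Wald chi-square with 1 df
    results[name] = (exp_b, wald)
    print(f"{name}: Exp(B) = {exp_b:.3f}, Wald from rounded inputs = {wald:.2f}")
```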
Survival Function Block 2
         B       S.E.    Wald     df   Sig.    Exp(B)
age      0.070   0.016   17.907   1    0.000   1.072
xplant   2.035   0.298   46.627   1    0.000   7.654
pheart   0.272   0.378   0.519    1    0.471   1.313

As you can see from this survival table, age and xplant are statistically
significant, with p-values less than 0.001. Pheart is not significant.
P>0.05 (Not significant)
P<0.05 (Significant)

Covariate Means
         Mean
age      45.144
xplant   0.351
pheart   0.835

As you can see from the covariate means, age had the highest covariate mean,
followed by pheart and then xplant.
Survival Cox Regression
             Frequency   (1)
Xplant   0   34          1
         1   63          0
Pheart   0   81          1
         1   16          0

The categorical variables are xplant and pheart. This table shows how each
category was coded for the Cox regression. It does not show who was alive, dead
or censored. This sample contained a total of 97 individuals.
For xplant, the 34 individuals with value 0 (no transplant) were coded 1 and the
63 individuals with value 1 (transplant performed) were coded 0.
For pheart, the 81 individuals with value 0 (no previous open-heart surgery)
were coded 1 and the 16 individuals with value 1 were coded 0.
This reversed coding agrees with the covariate means of 0.351 for xplant (34 of
97) and 0.835 for pheart (81 of 97). With this coding, the xplant coefficient in
the Cox model compares individuals without a transplant to those with one.
Survival Function for Case Processing Summary
                                                         N    Percent
Cases available in analysis   Event                      69   71.1%
                              Censored                   28   28.9%
                              Total                      97   100.0%
Cases dropped                 Cases with missing value   0    0.0%
                              Cases with negative time   0    0.0%
                              Censored cases             0    0.0%
                              Total                      0    0.0%
Total                                                    97   100.0%

As you can see from this table, the total number of cases analyzed was 97. The
number of events among the cases available in analysis is 69 (71.1%). The
number of cases censored was 28 (28.9%).
Survival Function
The horizontal axis represents time in months, and the vertical axis shows the
cumulative probability of surviving, or the proportion of people surviving.
At time zero, the survival probability is 1.0 (100% of the participants are
alive).
A flat survival curve (i.e. one that stays close to 1.0) suggests very good
survival, whereas a survival curve that drops sharply toward 0 suggests poor
survival.
The survival function plots cumulative survival versus months.
As you can see from this line graph, as the months increase, the cumulative
survival decreases.
The survival function S(t) gives the probability that a subject will survive
past time t. As t ranges from 0 to 45, the survival function has the following
properties: it is non-increasing, and at time t = 0, S(t) = 1. In other words,
the probability of surviving past time 0 is 1. As time goes to infinity, the
survival curve goes to 0.
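The properties above (S(0) = 1 and a curve that never increases) can be illustrated with a small product-limit (Kaplan-Meier) estimate. This is only a sketch: the slide's curve was produced by SPSS, the `kaplan_meier` helper is a hypothetical name, and the months and status values below are made up rather than taken from Xplant.xls.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at time 0 and at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = [(0, 1.0)]          # S(0) = 1 by definition
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed      # deaths and censored cases leave the risk set
    return curve


# Hypothetical data: months observed and status (1 = died, 0 = censored).
months = [1, 3, 3, 6, 10, 12, 20, 45]
status = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(months, status)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Censored subjects never lower the curve themselves; they only shrink the number at risk for later event times.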
One Minus Survival Function
This graph plots one minus the cumulative survival versus months.
As you can see from this graph, as the months increase, one minus the
cumulative survival increases.

This function is the complement of the survival function on the previous
slide: 1 - S(t) is the cumulative probability of not surviving past time t. It
is obtained by subtracting the survival function from 1.
Regression
Are pheart and xplant correlated?

• Research Question #1:
• H0: Rp,x = 0.0
• H1: Rp,x is not equal to 0.0
Coefficient Correlations
Model 1                 pheart   xplant
Correlations   pheart   1.000    -0.152
               xplant   -0.152   1.000

The dependent variable is age. This table compares pheart versus xplant, and
the same value appears on both sides of the diagonal.
• A correlation coefficient has a value between +1 and -1, where 1 is total
positive linear correlation, 0 is no linear correlation, and -1 is total
negative linear correlation.
• The diagonal values of 1.000 show each term paired with itself.
• The off-diagonal value of -0.152 shows a weak negative linear correlation
between pheart and xplant.
• In SPSS, this table reports the correlation between the estimated regression
coefficients for pheart and xplant, which is not the same as the Pearson
correlation between the two variables.
Model Summary for Regression
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       0.130   0.017      -0.004              9.832830

Model   R Square Change   F Change   df1   df2   Sig. F Change
1       0.017             0.812      2     94    0.447

As you can see from the table, R2 (R Square) is a proportion that varies
between 0 and 1. The value of our R2 is 0.017, which means that the model
explains only 1.7 percent of the total variance in age, which is very small.
The adjusted R2 is even lower at -0.004.
What is important is that the Sig. F Change test is not statistically
significant at the .05 level. This means that this set of variables didn’t add
predictive value.
A negative adjusted R2 simply means you have a small R2, which turns negative
when you calculate the adjusted R2.
The adjusted R2 is given by [(n - 1)/(n - k)]*R2 + (1 - k)/(n - k), where n is
the number of observations and k is the number of estimated parameters
including the constant. It is negative whenever R2 is less than
(k - 1)/(n - 1). Here n = 97 and k = 3, so the cutoff is 2/96 = 0.021, and our
R2 of 0.017 falls below it.
A low R2, as well as a low adjusted R2, means that your model has a low fit.
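The adjusted R2 formula quoted on this slide can be evaluated directly with the reported values. The sketch below assumes k = 3 estimated parameters (the constant plus xplant and pheart), which is an assumption based on the regression's 2 and 94 degrees of freedom; the variable names are illustrative.

```python
n = 97       # number of observations
k = 3        # assumed: constant + xplant + pheart
r2 = 0.017   # R Square from the Model Summary

# Adjusted R2 as written on the slide.
adjusted_r2 = (n - 1) / (n - k) * r2 + (1 - k) / (n - k)

# Equivalent textbook form: 1 - (1 - R2) * (n - 1) / (n - k).
adjusted_r2_alt = 1 - (1 - r2) * (n - 1) / (n - k)

# Adjusted R2 is negative whenever R2 falls below this cutoff.
cutoff = (k - 1) / (n - 1)

print(f"adjusted R2 = {adjusted_r2:.4f}")
print(f"cutoff for a negative value = {cutoff:.4f}")
```

With these inputs the result rounds to the -0.004 that SPSS reported, because 0.017 lies below the cutoff.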
Unstandardized Coefficients vs Standardized Coefficients vs 95% CI for B
Model 1    B        Std. Error   Beta    t        Sig.    Lower Bound   Upper Bound
Constant   43.395   1.703                25.477   0.000   40.013        46.777
Xplant     2.631    2.117        0.129   1.243    0.217   -1.572        6.835
Pheart     0.241    2.722        0.009   0.088    0.930   -5.163        5.644

Unstandardized Coefficients -> B and Std. Error
Standardized Coefficients -> Beta, followed by t and Sig.
95% CI for B -> Lower Bound and Upper Bound

As you can see from this table, xplant and pheart are not significant because
their p-values are greater than 0.05.
The 95% CI for xplant (-1.572 to 6.835) and the 95% CI for pheart (-5.163 to
5.644) both contain 0, which agrees with the non-significant p-values.
In the Unstandardized Coefficients section, xplant has a lower standard error
than pheart.
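The t and 95% CI columns follow from B and its standard error: t is B divided by Std. Error, and the interval is B plus or minus a critical t value times Std. Error. The sketch below assumes a two-sided critical value of about 1.9855 for 94 degrees of freedom (hard-coded, since the Python standard library has no t quantile function); names are illustrative and small differences from the table come from rounding.

```python
T_CRIT = 1.9855  # assumed two-sided 95% critical value of t with 94 df

# (B, Std. Error) pairs as printed in the coefficients table.
coefficients = {
    "Constant": (43.395, 1.703),
    "Xplant": (2.631, 2.117),
    "Pheart": (0.241, 2.722),
}

results = {}
for name, (b, se) in coefficients.items():
    t = b / se
    lower = b - T_CRIT * se
    upper = b + T_CRIT * se
    results[name] = (t, lower, upper)
    print(f"{name}: t = {t:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that contains 0 corresponds to a coefficient that is not significant at the .05 level.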
Are age and pheart correlated?

Research Question #2

• H0: r a,p = 0.0
• H1: r a,p is not equal to 0.0.
Bivariate (Pearson’s) Correlation
                               Age     Pheart
Age      Pearson Correlation   1       0.029
         Sig. (2-tailed)               0.780
         N                     97      97
Pheart   Pearson Correlation   0.029   1
         Sig. (2-tailed)       0.780
         N                     97      97

• A correlation coefficient has a value between +1 and -1, where 1 is total
positive linear correlation, 0 is no linear correlation, and -1 is total
negative linear correlation.
• The diagonal values of 1 show each variable correlated with itself.
• The correlation between age and pheart is 0.029, which is positive but very
close to 0, so there is almost no linear correlation.
• The two-tailed Sig. value is 0.780, so the correlation between age and pheart
is not significant and we fail to reject H0.
• P<0.05 (Significant), P>0.05 (Not Significant)
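A Pearson correlation can be turned into a t statistic with n - 2 degrees of freedom, which shows why r = 0.029 is far from significant at n = 97. In the sketch below the critical value of about 1.985 for 95 degrees of freedom is a hard-coded assumption, and the names are illustrative.

```python
import math

r = 0.029        # Pearson correlation between age and pheart
n = 97           # sample size
T_CRIT = 1.985   # assumed two-sided 95% critical value of t with 95 df

# t statistic for testing H0: the population correlation is 0.
t = r * math.sqrt((n - 2) / (1 - r ** 2))

significant = abs(t) > T_CRIT
print(f"t = {t:.3f}, significant at .05: {significant}")
```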
Regression versus Residual
Anova Test
Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   156.983          2    78.491        0.812   0.447
Residual     9088.347         94   96.685
Total        9245.330         96

The dependent variable was age. The predictors were the constant, xplant and
pheart.
As you can see from this table, the p-value is greater than 0.05, so we fail to
reject the null hypothesis.
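Each mean square in the ANOVA table is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and R2 is the regression share of the total sum of squares. The sketch below redoes that arithmetic with the table's values; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 156.983, 2
ss_residual, df_residual = 9088.347, 94

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total

print(f"F = {f_statistic:.3f}")
print(f"R2 = {r_squared:.3f}")
```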
Scatter plot for the regression standardized residual for age versus pheart

This scatterplot shows a linear trend, but there also appears to be an outlier.
The outlier in this graph is an observation point that is distant from other
observations.
An outlier may be due to inconsistency in the measurement or it may indicate
experimental error.
Do xplant and pheart have similar or different t-test results?
Research Question 3

• H0: the mean of each variable (xplant, pheart) is equal to 0.0
• H1: the mean of each variable is not equal to 0.0
One-sample T-test
         t        df   Sig. (2-tailed)   Mean Difference   Lower Bound   Upper Bound
Xplant   13.337   96   0.000             0.649             0.55          0.75
Pheart   4.335    96   0.000             0.165             0.09          0.24

As you can see in this table, pheart and xplant both have a p-value less than
0.001. This indicates that both are statistically significant. The Sig. value
is reported as 0.000, indicating that it is less than 0.001 but not exactly 0.

The cutoff value for determining statistical significance is usually a value of
.05 or less.

Xplant and pheart reach the same t-test conclusion by looking at the Sig.
values: both means differ significantly from the test value, although their t
statistics (13.337 and 4.335) are different.
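A one-sample t statistic is the sample mean minus the test value, divided by the standard error of the mean. The sketch below rebuilds the 0/1 data for xplant and pheart from their frequency counts and assumes a test value of 0, an assumption suggested by the mean difference matching each variable's mean; the helper name is hypothetical, and the results land close to the table's t values rather than reproducing every digit.

```python
import math
import statistics


def one_sample_t(values, test_value=0.0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample SD (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1


# 0/1 data rebuilt from the frequency tables (63 of 97 and 16 of 97 coded 1).
xplant = [1] * 63 + [0] * 34
pheart = [1] * 16 + [0] * 81

t_xplant, df_xplant = one_sample_t(xplant)
t_pheart, df_pheart = one_sample_t(pheart)

print(f"xplant: t = {t_xplant:.3f}, df = {df_xplant}")
print(f"pheart: t = {t_pheart:.3f}, df = {df_pheart}")
```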
Conclusion
• Xplant.xls data had 97 individuals and no missing values.
• Xplant.xls data results displayed significant results for xplant and age in
the survival analysis (p < 0.001), while pheart was not significant
(p = 0.471).
• Xplant and pheart reach the same one-sample t-test conclusion by looking at
the Sig. values.
• Age and pheart are not significantly correlated (r = 0.029, p = 0.780).
• Pheart and xplant are only weakly correlated in the regression (-0.152), and
neither significantly predicts age.
