Data - Second Annual Data Science Bowl | Kaggle
https://www.kaggle.com/c/secondannualdatasciencebowl/data
05/02/2016
Data Files
File names: validate, train, train.csv, sample_submission_validate.csv
The volumes at systole, $V_S$, and diastole, $V_D$, form the basis of an important clinical
measurement known as the ejection fraction:

$$\mathrm{EF} = 100 \cdot \frac{V_D - V_S}{V_D}$$

This quantity represents the fraction of outbound blood pumped from the heart with
each heartbeat. An ejection fraction that is too low can signify a wide range of cardiac
problems.
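The ejection fraction defined above is a one-line calculation once both volumes are known. The following is a minimal sketch; the function name and the example volumes are hypothetical and are not taken from the competition data.

```python
def ejection_fraction(v_diastole: float, v_systole: float) -> float:
    """Return the ejection fraction in percent: 100 * (V_D - V_S) / V_D."""
    if v_diastole <= 0:
        raise ValueError("diastolic volume must be positive")
    return 100.0 * (v_diastole - v_systole) / v_diastole


# Hypothetical volumes in millilitres, for illustration only.
print(ejection_fraction(v_diastole=100.0, v_systole=40.0))
```

Guarding against a non-positive diastolic volume avoids a division by zero when a predicted volume is degenerate.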
File descriptions
Each case has an associated directory of DICOM files. The exact number of images will
differ from case to case, varying in the number of slices, the views which are
captured, or the number of frames in the time sequences.
The main view for assessing ventricle size is the short axis stack, which contains images
taken in a plane perpendicular to the long axis of the left ventricle. These have the
prefix "sax_" in the competition dataset. Most cases also have alternative views, which
you should feel free to incorporate into your methodology.
The structure is as follows:
train.zip - the train set directory; contains cases for which you are given the associated systolic and diastolic volumes.
validate.zip - the validation set directory, used for the leaderboard in stage one of the competition. You should predict the volumes for these cases during stage one.
test.zip - the test set, used for the leaderboard in stage two of the competition (a.k.a. the final standings). You should predict the volumes for these cases during stage two. This file will not be released until the second stage.
train.csv - contains the systolic and diastolic volumes for the cases in the training set.
sample_submission_validate.csv - a sample submission file in the correct format for stage one.
sample_submission_test.csv - a sample submission file in the correct format for stage two. This file will not be released until the second stage.
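One way to work with the files listed above is to load the training volumes into a dictionary and to collect the short axis directories for a case. The sketch below makes two assumptions that the page does not confirm: that train.csv has a header row with columns named `Id`, `Systole`, and `Diastole`, and that the "sax_" directories may sit at any depth below a case directory (hence the recursive search). The helper names are hypothetical.

```python
import csv
import io
from pathlib import Path


def load_volumes(csv_text):
    """Map case id -> (systolic volume, diastolic volume).

    Assumes header columns named Id, Systole, Diastole (not confirmed by the page).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Id"]: (float(row["Systole"]), float(row["Diastole"]))
        for row in reader
    }


def short_axis_dirs(case_dir):
    """Return all directories below case_dir whose names start with 'sax_'."""
    return sorted(p for p in Path(case_dir).rglob("sax_*") if p.is_dir())


# Illustrative content only; these numbers are not from the dataset.
example_csv = "Id,Systole,Diastole\n1,50.0,120.0\n"
print(load_volumes(example_csv))
```

With the real file, the text would come from something like `Path("train.csv").read_text()`.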
DICOM
The DICOM standard is complex and there are a number of different tools to work
with DICOM files. You may find the following resources helpful for managing the
competition data:
The lite version of OsiriX is useful for viewing images on OS X
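Before handing a file to a DICOM tool, a cheap sanity check can flag files that are obviously not DICOM. In the DICOM file format, a 128-byte preamble is followed by the four bytes "DICM". The sketch below checks only that marker using the standard library; the function name is hypothetical, and it does not parse any image data. Files written without the preamble would fail this check even if a dedicated DICOM library could still read them.

```python
def looks_like_dicom(path):
    """Return True if the file has a 128-byte preamble followed by b'DICM'."""
    with open(path, "rb") as f:
        header = f.read(132)
    # A short read means the file cannot contain the preamble and marker.
    return len(header) == 132 and header[128:132] == b"DICM"
```

This check is a filter, not a validator: a True result only means the marker is present.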
FAQ
We will add to this section as relevant common questions arise.
How do I know where the left ventricle is? How do I compute its volume?
Watch this video for a primer on the anatomy and process used by clinicians:
I see more than one series at the same slice location. How should we deal with
those cases?
Generally, a slice location is repeated if there is an artifact on the images. You can use
either slice but the odds are that the last slice at a given slice location is the best the
technologist could acquire.
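The suggestion to prefer the last series at a repeated slice location can be sketched as follows. It assumes the series are supplied in acquisition order as (series name, slice location) pairs and that locations are rounded to one decimal place before comparison; the function name, the rounding, and the example values are illustrative choices, not part of the competition materials.

```python
def last_series_per_location(series, ndigits=1):
    """Keep the last series seen at each slice location.

    series: iterable of (series_name, slice_location) in acquisition order.
    Returns a dict mapping rounded slice location -> series_name.
    """
    chosen = {}
    for name, location in series:
        # Later entries overwrite earlier ones at the same location.
        chosen[round(location, ndigits)] = name
    return chosen


# Hypothetical series names and locations, for illustration only.
acquired = [("sax_5", -12.5), ("sax_6", -2.5), ("sax_7", -12.5)]
print(last_series_per_location(acquired))
```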
Some MRI images are not consistent (in size, shape, or structure). What should
we do about these?
We have opted to include as many cases as possible in this dataset. As this is real data
from many sources, it is bound to have some amount of unwanted variability. You
should do your best to handle these files. Since this is a two-stage competition and the
test set may have unseen abnormalities, we recommend including some form of error
catching as you write your code.
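One simple form of the error catching recommended above is to wrap the per-case work in a try/except and substitute a fallback prediction when a case fails, so that a single malformed case does not abort the whole run. This is a sketch under stated assumptions: `predict_case` stands for whatever per-case routine you write, and the fallback value is a placeholder you would choose yourself (for example, from training-set statistics).

```python
import logging


def predict_all(case_ids, predict_case, fallback):
    """Run predict_case on every case, using fallback when a case raises."""
    results = {}
    for case_id in case_ids:
        try:
            results[case_id] = predict_case(case_id)
        except Exception:
            # Log the traceback and keep going with the fallback value.
            logging.exception("case %s failed; using fallback", case_id)
            results[case_id] = fallback
    return results
```

Logging the failure rather than silently swallowing it keeps a record of which cases need a closer look.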
Citation
The data for the Data Science Bowl is available for research and academic pursuits.
Please cite as Data Science Bowl Cardiac Challenge Data.