
Data Table Schema

Hourly BART ridership for 2011-2015. Contains the trip count between each entry/exit
pair of BART stations for each hour over the five-year period. Note: station entry/exit
pairs with zero riders have been removed; therefore, any missing entry/exit pair for a
given hour can be assumed to be zero.
~50 million rows & 6 columns. Source: BART (Bulk).
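Because zero-rider pairs are dropped from the table, a dense hourly matrix has to be rebuilt by treating absent pairs as zero. A minimal sketch, assuming the rows for a single hour have already been loaded as (entry, exit, count) tuples. The helper name `dense_counts`, the station codes, and the counts are illustrative placeholders, not values from the dataset.

```python
from itertools import product

def dense_counts(rows, stations):
    """Return {(entry, exit): trip_count} for every ordered station pair,
    filling pairs that are absent from `rows` with 0."""
    observed = {(enter, exit_): count for enter, exit_, count in rows}
    return {pair: observed.get(pair, 0) for pair in product(stations, repeat=2)}

# Placeholder data for one hour (not real ridership figures).
stations = ["AAA", "BBB", "CCC"]
rows = [("AAA", "BBB", 12), ("CCC", "AAA", 3)]

counts = dense_counts(rows, stations)
print(counts[("AAA", "BBB")])  # pair present in the rows
print(counts[("BBB", "AAA")])  # pair missing from the rows, filled with 0
```

Filling zeros only at analysis time keeps the stored table sparse, which matters at roughly 50 million rows.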

Field Type Description
day STRING Format of %Y-%m-%d
hour INTEGER Ranging from 0-23.
enter_abbr STRING BART station (abbreviated) - passengers enter.
exit_abbr STRING BART station (abbreviated) - passengers exit.
trip_count INTEGER Total passenger trips for given hour + stations.
datetime STRING Format of %Y-%m-%d %H:%M:%S
day_of_week INTEGER Day of the week as an integer, where Monday is 0 and Sunday is 6.
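The `day`, `hour`, `datetime`, and `day_of_week` fields are redundant with one another, so a row can be sanity-checked by parsing `datetime` with the stated format. The sketch below relies on Python's `datetime.weekday()`, which uses the same Monday = 0 to Sunday = 6 convention. The helper name `row_is_consistent` and the sample values are assumptions for illustration.

```python
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # format given in the schema

def row_is_consistent(day, hour, datetime_str, day_of_week):
    """Check that day, hour and day_of_week agree with the datetime string."""
    parsed = datetime.strptime(datetime_str, DATETIME_FORMAT)
    return (
        parsed.strftime("%Y-%m-%d") == day
        and parsed.hour == hour
        and parsed.weekday() == day_of_week  # Monday is 0, Sunday is 6
    )

# Illustrative row: 2011-01-03 fell on a Monday.
ok = row_is_consistent("2011-01-03", 8, "2011-01-03 08:00:00", 0)
bad = row_is_consistent("2011-01-03", 8, "2011-01-03 08:00:00", 6)
```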

Important details on each of the 45 BART stations, including address, lat/lon
coordinates, and zip code.
45 rows & 9 columns. Source: BART (API).

Field Type Description
name STRING BART station full name.
abbr STRING BART station name (abbreviated).
latitude FLOAT
longitude FLOAT
address STRING
county STRING
state STRING
zipcode INTEGER Geographic postal region.

Price of a one-way trip between each pair of BART stations. With 45 BART stations,
there are 990 unordered station pairs (45 x 44 / 2). Each of the 990 station pairs
has its own regular fare as well as a discounted fare for select customers. Note: the
fare between two stations is the same in both directions. The fare when the exit
station is the same as the entry station can be assumed to be $0.
990 rows & 4 columns. Source: BART (API).

Field Type Description
station1 STRING BART station name (abbreviated).
station2 STRING BART station name (abbreviated).
fare FLOAT One-way fare between station1 and station2.
discount_fare FLOAT Discount one-way fare between station1 and station2. Available only to certain demographic groups.
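Since each pair is stored once and fares are symmetric, a lookup has to ignore the order of the two stations and handle the same-station case separately. One possible sketch keys the fares by an unordered pair. The helper names `build_fare_index` and `lookup_fare`, the station codes, and the fare amounts are made-up placeholders.

```python
from math import comb

# 45 stations give C(45, 2) unordered pairs, matching the 990 rows.
assert comb(45, 2) == 990

def build_fare_index(rows):
    """Index fare rows by an order-independent station pair."""
    return {frozenset((r["station1"], r["station2"])): r["fare"] for r in rows}

def lookup_fare(index, enter_abbr, exit_abbr):
    """Return the one-way fare; same-station trips are assumed to cost 0."""
    if enter_abbr == exit_abbr:
        return 0.0
    return index[frozenset((enter_abbr, exit_abbr))]

# Placeholder rows (not real BART fares).
fare_rows = [
    {"station1": "AAA", "station2": "BBB", "fare": 1.5, "discount_fare": 0.5},
    {"station1": "AAA", "station2": "CCC", "fare": 2.5, "discount_fare": 1.0},
]
fares = build_fare_index(fare_rows)
```

With this index, `lookup_fare(fares, "BBB", "AAA")` and `lookup_fare(fares, "AAA", "BBB")` resolve to the same row.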

Demographic data for zip codes that contain BART stations, sourced from the
2011-2014 US Census, including population, income, education level, and Gini
coefficients. US Census data is generally self-reported.
148 rows & 16 columns. Source: US Census.

Field Type Description
zipcode INTEGER Geographic postal region.
totalpop INTEGER Total population.
some_hs INTEGER # persons reporting completed some high school.
some_college INTEGER # persons reporting HS degree and completed some college.
bach_degree INTEGER # persons reporting completed bachelor's degree.
grad_degree INTEGER # persons reporting completed advanced degree.
pov_below_100 INTEGER # persons reporting income at or below 100% of poverty level.
pov_100_150 INTEGER # persons reporting income between 100% and 150% of poverty level.
pov_150_plus INTEGER
yes_public_assistance INTEGER # persons reporting they are on public assistance programs.
no_public_assistance INTEGER # persons reporting they are not on public assistance programs.
gini FLOAT Measure of economic inequality.
median_household_income FLOAT
Per_capita_income FLOAT

Hourly historical weather for each BART station for the entire time period in the ridership
table, including temperature, precipitation, and more.
~2 million rows & 16 columns. Source: Dark Sky.

Field Type Description
station STRING BART station name (abbreviated).
latitude FLOAT
longitude FLOAT
time STRING Format of %Y-%m-%d %H:%M:%S.
apparentTemperature FLOAT
cloudCover FLOAT
dewPoint FLOAT
humidity FLOAT
precipIntensity FLOAT
precipProbability FLOAT
precipType STRING
pressure FLOAT
summary STRING
temperature FLOAT
visibility FLOAT
windBearing FLOAT
windSpeed FLOAT
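The weather table shares a station abbreviation and an hourly timestamp format with the ridership table, so the two can in principle be joined on (station, timestamp). The sketch below attaches the entry station's temperature to each trip row. It assumes that weather `time` values fall exactly on the hour and use the same time zone as the ridership `datetime` field, neither of which the schema states. The helper name `attach_weather` and all sample values are placeholders.

```python
def attach_weather(trips, weather):
    """Left-join trip rows to weather rows on (entry station, timestamp).

    Trips with no matching weather row get temperature None.
    """
    by_key = {(w["station"], w["time"]): w for w in weather}
    joined = []
    for trip in trips:
        match = by_key.get((trip["enter_abbr"], trip["datetime"]))
        temperature = match["temperature"] if match else None
        joined.append({**trip, "temperature": temperature})
    return joined

# Placeholder rows (not real observations).
trips = [
    {"enter_abbr": "AAA", "exit_abbr": "BBB",
     "datetime": "2011-01-03 08:00:00", "trip_count": 12},
    {"enter_abbr": "CCC", "exit_abbr": "AAA",
     "datetime": "2011-01-03 09:00:00", "trip_count": 3},
]
weather = [
    {"station": "AAA", "time": "2011-01-03 08:00:00", "temperature": 50.0},
]
joined = attach_weather(trips, weather)
```

Whether to join on the entry or the exit station depends on the question being asked; the entry station is used here only as an example.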

Monthly data from Zillow that represent overall real estate value, organized by zip code.
Specifically, monthly values from the Zillow Home Value Index (ZHVI) for all homes
(single family residence & condo/co-op).
35 rows & 253 columns. Source: Zillow.

Field Type Description
zipcode INTEGER Geographic postal region.
CountyName STRING
Zillow Home Value Index (ZHVI) FLOAT Column for each month+year spanning April 1996 to September 2016.
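With one column per month, the table is in wide form; most time-series work is easier after reshaping it to one row per (zipcode, month). A minimal sketch follows, assuming the monthly columns are named like "1996-04". The schema does not give the actual column names, so that pattern, the helper names `melt_zhvi` and `months_inclusive`, and the sample values are all assumptions.

```python
import re

# Assumed naming of the monthly columns, e.g. "1996-04" (not confirmed by the schema).
MONTH_COLUMN = re.compile(r"^\d{4}-\d{2}$")

def melt_zhvi(rows):
    """Reshape wide rows (one column per month) into long rows."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if MONTH_COLUMN.match(column):
                long_rows.append(
                    {"zipcode": row["zipcode"], "month": column, "zhvi": value}
                )
    return long_rows

def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of calendar months from start to end, counting both ends."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

# Placeholder row (not real ZHVI values).
wide = [{"zipcode": 94000, "CountyName": "Example",
         "1996-04": 100000.0, "1996-05": 101000.0}]
long_rows = melt_zhvi(wide)
n_months = months_inclusive(1996, 4, 2016, 9)
```

Columns that do not match the assumed month pattern, such as `zipcode` and `CountyName`, are left out of the value rows.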