timetable
This example shows how to analyze bicycle traffic patterns from sensor data using the timetable
data container to organize and preprocess time-stamped data. The data come from sensors on Broadway Street in Cambridge, MA. The City of Cambridge provides public access to the full data set at the Cambridge Open Data site.
This example shows how to perform a variety of data cleaning, munging, and preprocessing tasks such as removing missing values and synchronizing time-stamped data with different timesteps. In addition, data exploration is highlighted including visualizations and grouped calculations using the timetable
data container to:
Explore daily bicycle traffic
Compare bicycle traffic to local weather conditions
Analyze bicycle traffic on various days of the week and times of day
Import a sample of the bicycle traffic data from a comma-separated text file. The readtable
function returns the data in a table. Display the first eight rows using the head
function.
bikeTbl = readtable('BicycleCounts.csv');
head(bikeTbl)
ans=8×5 table
Timestamp Day Total Westbound Eastbound
___________________ _____________ _____ _________ _________
2015-06-24 00:00:00 {'Wednesday'} 13 9 4
2015-06-24 01:00:00 {'Wednesday'} 3 3 0
2015-06-24 02:00:00 {'Wednesday'} 1 1 0
2015-06-24 03:00:00 {'Wednesday'} 1 1 0
2015-06-24 04:00:00 {'Wednesday'} 1 1 0
2015-06-24 05:00:00 {'Wednesday'} 7 3 4
2015-06-24 06:00:00 {'Wednesday'} 36 6 30
2015-06-24 07:00:00 {'Wednesday'} 141 13 128
The data have timestamps, so it is convenient to use a timetable to store and analyze the data. A timetable is similar to a table, but includes timestamps that are associated with the rows of data. The timestamps, or row times, are represented by datetime
or duration
values. datetime
and duration
are the recommended data types for representing points in time or elapsed times, respectively.
Convert bikeTbl
into a timetable using the table2timetable
function. You must use a conversion function because readtable
returns a table. table2timetable
converts the first datetime
or duration
variable in the table into the row times of a timetable. The row times are metadata that label the rows. However, when you display a timetable, the row times and timetable variables are displayed in a similar fashion. Note that the table has five variables whereas the timetable has four.
bikeData = table2timetable(bikeTbl); head(bikeData)
ans=8×4 timetable
Timestamp Day Total Westbound Eastbound
___________________ _____________ _____ _________ _________
2015-06-24 00:00:00 {'Wednesday'} 13 9 4
2015-06-24 01:00:00 {'Wednesday'} 3 3 0
2015-06-24 02:00:00 {'Wednesday'} 1 1 0
2015-06-24 03:00:00 {'Wednesday'} 1 1 0
2015-06-24 04:00:00 {'Wednesday'} 1 1 0
2015-06-24 05:00:00 {'Wednesday'} 7 3 4
2015-06-24 06:00:00 {'Wednesday'} 36 6 30
2015-06-24 07:00:00 {'Wednesday'} 141 13 128
whos bikeTbl bikeData
Name Size Bytes Class Attributes bikeData 9387x4 1412426 timetable bikeTbl 9387x5 1487736 table
Convert the Day
variable to categorical. The categorical data type is designed for data that consists of a finite set of discrete values, such as the names of the days of the week. List the categories so they display in day order. Use dot subscripting to access variables by name.
bikeData.Day = categorical(bikeData.Day,{'Sunday','Monday','Tuesday',... 'Wednesday','Thursday','Friday','Saturday'});
In a timetable, the times are treated separately from the data variables. Access the Properties
of the timetable to show that the row times are the first dimension of the timetable, and the variables are the second dimension. The DimensionNames
property shows the names of the two dimensions, while the VariableNames
property shows the names of the variables along the second dimension.
bikeData.Properties
ans = TimetableProperties with properties: Description: '' UserData: [] DimensionNames: {'Timestamp' 'Variables'} VariableNames: {'Day' 'Total' 'Westbound' 'Eastbound'} VariableDescriptions: {} VariableUnits: {} VariableContinuity: [] RowTimes: [9387x1 datetime] StartTime: 2015-06-24 00:00:00 SampleRate: NaN TimeStep: NaN CustomProperties: No custom properties are set. Use addprop and rmprop to modify CustomProperties.
By default, table2timetable
assigned Timestamp
as the first dimension name when it converted the table to a timetable, since this was the variable name from the original table. You can change the names of the dimensions, and other timetable metadata, through the Properties
.
Change the names of the dimensions to Time
and Data
.
bikeData.Properties.DimensionNames = {'Time' 'Data'}; bikeData.Properties
ans = TimetableProperties with properties: Description: '' UserData: [] DimensionNames: {'Time' 'Data'} VariableNames: {'Day' 'Total' 'Westbound' 'Eastbound'} VariableDescriptions: {} VariableUnits: {} VariableContinuity: [] RowTimes: [9387x1 datetime] StartTime: 2015-06-24 00:00:00 SampleRate: NaN TimeStep: NaN CustomProperties: No custom properties are set. Use addprop and rmprop to modify CustomProperties.
Display the first eight rows of the timetable.
head(bikeData)
ans=8×4 timetable
Time Day Total Westbound Eastbound
___________________ _________ _____ _________ _________
2015-06-24 00:00:00 Wednesday 13 9 4
2015-06-24 01:00:00 Wednesday 3 3 0
2015-06-24 02:00:00 Wednesday 1 1 0
2015-06-24 03:00:00 Wednesday 1 1 0
2015-06-24 04:00:00 Wednesday 1 1 0
2015-06-24 05:00:00 Wednesday 7 3 4
2015-06-24 06:00:00 Wednesday 36 6 30
2015-06-24 07:00:00 Wednesday 141 13 128
Determine the number of days that elapsed between the latest and earliest row times. The variables can be accessed by dot notation when referencing variables one at a time.
elapsedTime = max(bikeData.Time) - min(bikeData.Time)
elapsedTime = duration
9383:30:00
elapsedTime.Format = 'd'
elapsedTime = duration
390.98 days
To examine the typical bicycle counts on a given day, calculate the means for the total number of bikes, and the numbers travelling westbound and eastbound.
Return the numeric data as a matrix by indexing into the contents of bikeData
using curly braces. Display the first eight rows. Use standard table subscripting to access multiple variables.
counts = bikeData{:,2:end}; counts(1:8,:)
ans = 8×3
13 9 4
3 3 0
1 1 0
1 1 0
1 1 0
7 3 4
36 6 30
141 13 128
Since the mean is appropriate for only the numeric data, you can use the vartype
function to select the numeric variables. vartype
can be more convenient than manually indexing into a table or timetable to select variables. Calculate the means and omit NaN
values.
counts = bikeData{:,vartype('numeric')}; mean(counts,'omitnan')
ans = 1×3
49.8860 24.2002 25.6857
To determine how many people bicycle during a holiday, examine the data on the 4th of July holiday. Index into the timetable by row times for July 4, 2015. When you index on row times, you must match times exactly. You can specify the time indices as datetime
or duration
values, or as character vectors that can be converted to dates and times. You can specify multiple times as an array.
Index into bikeData
with specific dates and times to extract data for July 4, 2015. If you specify the date only, then the time is assumed to be midnight, or 00:00:00.
bikeData('2015-07-04',:)
ans=1×4 timetable
Time Day Total Westbound Eastbound
___________________ ________ _____ _________ _________
2015-07-04 00:00:00 Saturday 8 7 1
d = {'2015-07-04 08:00:00','2015-07-04 09:00:00'}; bikeData(d,:)
ans=2×4 timetable
Time Day Total Westbound Eastbound
___________________ ________ _____ _________ _________
2015-07-04 08:00:00 Saturday 15 3 12
2015-07-04 09:00:00 Saturday 21 4 17
It would be tedious to use this strategy to extract the entire day. You can also specify ranges of time without indexing on specific times. To create a time range subscript as a helper, use the timerange
function.
Subscript into the timetable using a time range for the entire day of July 4, 2015. Specify the start time as midnight on July 4, and the end time as midnight on July 5. By default, timerange
covers all times starting with the start time and up to, but not including, the end time. Plot the bicycle counts over the course of the day.
tr = timerange('2015-07-04','2015-07-05'); jul4 = bikeData(tr,'Total'); head(jul4)
ans=8×1 timetable
Time Total
___________________ _____
2015-07-04 00:00:00 8
2015-07-04 01:00:00 13
2015-07-04 02:00:00 4
2015-07-04 03:00:00 1
2015-07-04 04:00:00 0
2015-07-04 05:00:00 1
2015-07-04 06:00:00 8
2015-07-04 07:00:00 16
bar(jul4.Time,jul4.Total) ylabel('Bicycle Counts') title('Bicycle Counts on July 4, 2015')
From the plot, there is more volume throughout the day, leveling off in the afternoon. Because many businesses are closed, the plot does not show typical traffic during commute hours. Spikes later in the evening can be attributed to celebrations with fireworks, which occur after dark. To examine these trends more closely, the data should be compared to data for typical days.
Compare the data for July 4 to data for the rest of the month of July.
jul = bikeData(timerange('2015-07-01','2015-08-01'),:); plot(jul.Time,jul.Total) hold on plot(jul4.Time,jul4.Total) ylabel('Total Count') title('Bicycle Counts in July') hold off legend('Bicycle Count','July 4 Bicycle Count')
The plot shows variations that can be attributed to traffic differences between weekdays and weekends. The traffic patterns for July 4 and 5 are consistent with the pattern for weekend traffic. July 5 is a Monday but is often observed as a holiday. These trends can be examined more closely with further preprocessing and analysis.
timetable
Time-stamped data sets are often messy and may contain anomalies or errors. Timetables are well suited for resolving anomalies and errors.
A timetable does not have to have its row times in any particular order. It can contain rows that are not sorted by their row times. A timetable can also contain multiple rows with the same row time, though the rows can have different data values. Even when row times are sorted and unique, they can differ by time steps of different sizes. A timetable can even contain NaT
or NaN
values to indicate missing row times.
The timetable
data type provides a number of different ways to resolve missing, duplicate, or nonuniform times. You can also resample or aggregate data to create a regular timetable. When a timetable is regular, it has row times that are sorted and unique, and have a uniform or evenly spaced time step between them.
To find missing row times, use ismissing
.
To remove missing times and data, use rmmissing
.
To sort a timetable by its row times, use sortrows
.
To make a timetable with unique and sorted row times, use unique
and retime
.
To make a regular timetable, specify a uniformly spaced time vector and use retime
.
Determine if the timetable is sorted. A timetable is sorted if its row times are listed in ascending order.
issorted(bikeData)
ans = logical
0
Sort the timetable. The sortrows
function sorts the rows by their row times, from earliest to latest time. If there are rows with duplicate row times, then sortrows
copies all the duplicates to the output.
bikeData = sortrows(bikeData); issorted(bikeData)
ans = logical
1
A timetable can have missing data indicators in its variables or its row times. For example, you can indicate missing numeric values as NaN
s, and missing datetime values as NaT
s. You can assign, find, remove, and fill missing values with the standardizeMissing
, ismissing
, rmmissing
, and fillmissing
functions, respectively.
Find and count the missing values in the timetable variables. In this example, missing values indicate circumstances when no data were collected.
missData = ismissing(bikeData); sum(missData)
ans = 1×4
1 3 3 3
The output from ismissing
is a logical
matrix, the same size as the table, identifying missing data values as true. Display any rows which have missing data indicators.
idx = any(missData,2); bikeData(idx,:)
ans=3×4 timetable
Time Day Total Westbound Eastbound
___________________ ___________ _____ _________ _________
2015-08-03 00:00:00 Monday NaN NaN NaN
2015-08-03 01:00:00 Monday NaN NaN NaN
NaT <undefined> NaN NaN NaN
ismissing(bikeData)
finds missing data in the timetable variables only, not the times. To find missing row times, call ismissing
on the row times.
missTimes = ismissing(bikeData.Time); bikeData(missTimes,:)
ans=2×4 timetable
Time Day Total Westbound Eastbound
____ ___________ _____ _________ _________
NaT <undefined> NaN NaN NaN
NaT Friday 6 3 3
In this example, missing times or data values indicate measurement errors and can be excluded. Remove rows of the table containing missing data values and missing row times using rmmissing
.
bikeData = rmmissing(bikeData); sum(ismissing(bikeData))
ans = 1×4
0 0 0 0
sum(ismissing(bikeData.Time))
ans = 0
Determine if there are duplicate times and/or duplicate rows of data. You might want to exclude exact duplicates, as these can also be considered measurement errors. Identify duplicate times by finding where the difference between the sorted times is exactly zero.
idx = diff(bikeData.Time) == 0; dup = bikeData.Time(idx)
dup = 3x1 datetime
2015-08-21 00:00:00
2015-11-19 23:00:00
2015-11-19 23:00:00
Three times are repeated and November 19, 2015 is repeated twice. Examine the data associated with the repeated times.
bikeData(dup(1),:)
ans=2×4 timetable
Time Day Total Westbound Eastbound
___________________ ______ _____ _________ _________
2015-08-21 00:00:00 Friday 14 9 5
2015-08-21 00:00:00 Friday 11 7 4
bikeData(dup(2),:)
ans=3×4 timetable
Time Day Total Westbound Eastbound
___________________ ________ _____ _________ _________
2015-11-19 23:00:00 Thursday 17 15 2
2015-11-19 23:00:00 Thursday 17 15 2
2015-11-19 23:00:00 Thursday 17 15 2
The first has duplicated times but non-duplicate data, whereas the others are entirely duplicated. Timetable rows are considered duplicates when they contain identical row times and identical data values across the rows. You can use unique
to remove duplicate rows in the timetable. The unique
function also sorts the rows by their row times.
bikeData = unique(bikeData);
The rows with duplicate times but non-duplicate data require some interpretation. Examine the data around those times.
d = dup(1) + hours(-2:2); bikeData(d,:)
ans=5×4 timetable
Time Day Total Westbound Eastbound
___________________ ________ _____ _________ _________
2015-08-20 22:00:00 Thursday 40 30 10
2015-08-20 23:00:00 Thursday 25 18 7
2015-08-21 00:00:00 Friday 11 7 4
2015-08-21 00:00:00 Friday 14 9 5
2015-08-21 02:00:00 Friday 6 5 1
In this case, the duplicate time may have been mistaken since the data and surrounding times are consistent. Though it appears to represent 01:00:00, it is uncertain what time this should have been. The data can be accumulated to account for the data at both time points.
sum(bikeData{dup(1),2:end})
ans = 1×3
25 16 9
This is only one case which can be done manually. However, for many rows, the retime
function can perform this calculation. Accumulate the data for the unique times using the sum
function to aggregate. The sum is appropriate for numeric data but not the categorical data in the timetable. Use vartype
to identify the numeric variables.
vt = vartype('numeric'); t = unique(bikeData.Time); numData = retime(bikeData(:,vt),t,'sum'); head(numData)
ans=8×3 timetable
Time Total Westbound Eastbound
___________________ _____ _________ _________
2015-06-24 00:00:00 13 9 4
2015-06-24 01:00:00 3 3 0
2015-06-24 02:00:00 1 1 0
2015-06-24 03:00:00 1 1 0
2015-06-24 04:00:00 1 1 0
2015-06-24 05:00:00 7 3 4
2015-06-24 06:00:00 36 6 30
2015-06-24 07:00:00 141 13 128
You cannot sum the categorical data, but since one label represents the whole day, take the first value on each day. You can perform the retime
operation again with the same time vector and concatenate the timetables together.
vc = vartype('categorical'); catData = retime(bikeData(:,vc),t,'firstvalue'); bikeData = [catData,numData]; bikeData(d,:)
ans=4×4 timetable
Time Day Total Westbound Eastbound
___________________ ________ _____ _________ _________
2015-08-20 22:00:00 Thursday 40 30 10
2015-08-20 23:00:00 Thursday 25 18 7
2015-08-21 00:00:00 Friday 25 16 9
2015-08-21 02:00:00 Friday 6 5 1
The data appear to have a uniform time step of one hour. To determine if this is true for all the row times in the timetable, use the isregular
function. isregular
returns true
for sorted, evenly-spaced times (monotonically increasing), with no duplicate or missing times (NaT
or NaN
).
isregular(bikeData)
ans = logical
0
The output of 0
, or false
, indicates that the times in the timetable are not evenly spaced. Explore the time interval in more detail.
dt = diff(bikeData.Time); [min(dt); max(dt)]
ans = 2x1 duration
00:30:00
03:00:00
To put the timetable on a regular time interval, use retime
or synchronize
and specify the time interval of interest.
Determine the counts per day using the retime
function. Accumulate the count data for each day using the sum
method. This is appropriate for numeric data but not the categorical data in the timetable. Use vartype
to identify the variables by data type.
dayCountNum = retime(bikeData(:,vt),'daily','sum'); head(dayCountNum)
ans=8×3 timetable
Time Total Westbound Eastbound
___________________ _____ _________ _________
2015-06-24 00:00:00 2141 1141 1000
2015-06-25 00:00:00 2106 1123 983
2015-06-26 00:00:00 1748 970 778
2015-06-27 00:00:00 695 346 349
2015-06-28 00:00:00 153 83 70
2015-06-29 00:00:00 1841 978 863
2015-06-30 00:00:00 2170 1145 1025
2015-07-01 00:00:00 997 544 453
As above, you can perform the retime
operation again to represent the categorical data using an appropriate method and concatenate the timetables together.
dayCountCat = retime(bikeData(:,vc),'daily','firstvalue'); dayCount = [dayCountCat,dayCountNum]; head(dayCount)
ans=8×4 timetable
Time Day Total Westbound Eastbound
___________________ _________ _____ _________ _________
2015-06-24 00:00:00 Wednesday 2141 1141 1000
2015-06-25 00:00:00 Thursday 2106 1123 983
2015-06-26 00:00:00 Friday 1748 970 778
2015-06-27 00:00:00 Saturday 695 346 349
2015-06-28 00:00:00 Sunday 153 83 70
2015-06-29 00:00:00 Monday 1841 978 863
2015-06-30 00:00:00 Tuesday 2170 1145 1025
2015-07-01 00:00:00 Wednesday 997 544 453
Examine the effect of weather on cycling behavior by comparing the bicycle count with weather data. Load the weather timetable which includes historical weather data from Boston, MA including storm events.
load BostonWeatherData
head(weatherData)
ans=8×3 timetable
Time TemperatureF Humidity Events
___________ ____________ ________ ____________
01-Jul-2015 72 78 Thunderstorm
02-Jul-2015 72 60 None
03-Jul-2015 70 56 None
04-Jul-2015 67 75 None
05-Jul-2015 72 67 None
06-Jul-2015 74 69 None
07-Jul-2015 75 77 Rain
08-Jul-2015 79 68 Rain
To summarize the times and variables in the timetable, use the summary
function.
summary(weatherData)
RowTimes: Time: 383x1 datetime Values: Min 01-Jul-2015 Median 08-Jan-2016 Max 17-Jul-2016 TimeStep 24:00:00 Variables: TemperatureF: 383x1 double Values: Min 2 Median 55 Max 85 Humidity: 383x1 double Values: Min 29 Median 64 Max 97 Events: 383x1 categorical Values: Fog 7 Hail 1 Rain 108 Rain-Snow 4 Snow 18 Thunderstorm 12 None 233
Combine the bicycle data with the weather data to a common time vector using synchronize
. You can resample or aggregate timetable data using any of the methods documented on the reference page for the synchronize
function.
Synchronize the data from both timetables to a common time vector, constructed from the intersection of their individual daily time vectors.
data = synchronize(dayCount,weatherData,'intersection');
head(data)
ans=8×7 timetable
Time Day Total Westbound Eastbound TemperatureF Humidity Events
___________________ _________ _____ _________ _________ ____________ ________ ____________
2015-07-01 00:00:00 Wednesday 997 544 453 72 78 Thunderstorm
2015-07-02 00:00:00 Thursday 1943 1033 910 72 60 None
2015-07-03 00:00:00 Friday 870 454 416 70 56 None
2015-07-04 00:00:00 Saturday 669 328 341 67 75 None
2015-07-05 00:00:00 Sunday 702 407 295 72 67 None
2015-07-06 00:00:00 Monday 1900 1029 871 74 69 None
2015-07-07 00:00:00 Tuesday 2106 1140 966 75 77 Rain
2015-07-08 00:00:00 Wednesday 1855 984 871 79 68 Rain
Compare bicycle traffic counts and outdoor temperature on separate y axes to examine the trends. Remove the weekends from the data for visualization.
idx = ~isweekend(data.Time); weekdayData = data(idx,{'TemperatureF','Total'}); figure yyaxis left plot(weekdayData.Time, weekdayData.Total) ylabel('Bicycle Count') yyaxis right plot(weekdayData.Time,weekdayData.TemperatureF) ylabel('Temperature (\circ F)') title('Bicycle Counts and Temperature Over Time') xlim([min(data.Time) max(data.Time)])
The plot shows that the traffic and weather data might follow similar trends. Zoom in on the plot.
xlim([datetime('2015-11-01'),datetime('2016-05-01')])
The trends are similar, indicating that fewer people cycle on colder days.
Examine the data based on different intervals such as day of the week and time of day. Determine the total counts per day using varfun
to perform grouped calculations on variables. Specify the sum
function with a function handle and the grouping variable and preferred output type using name-value pairs.
byDay = varfun(@sum,bikeData,'GroupingVariables','Day',... 'OutputFormat','table')
byDay=7×5 table
Day GroupCount sum_Total sum_Westbound sum_Eastbound
_________ __________ _________ _____________ _____________
Sunday 1344 25315 12471 12844
Monday 1343 79991 39219 40772
Tuesday 1320 81480 39695 41785
Wednesday 1344 86853 41726 45127
Thursday 1344 87516 42682 44834
Friday 1342 76643 36926 39717
Saturday 1343 30292 14343 15949
figure bar(byDay{:,{'sum_Westbound','sum_Eastbound'}}) legend({'Westbound','Eastbound'},'Location','eastoutside') xticklabels({'Sun','Mon','Tue','Wed','Thu','Fri','Sat'}) title('Bicycle Count by Day of Week')
The bar plot indicates that traffic is heavier on weekdays. Also, there is a difference in the Eastbound and Westbound directions. This might indicate that people tend to take different routes when entering and leaving the city. Another possibility is that some people enter on one day and return on another day.
Determine the hour of day and use varfun
for calculations by group.
bikeData.HrOfDay = hour(bikeData.Time); byHr = varfun(@mean,bikeData(:,{'Westbound','Eastbound','HrOfDay'}),... 'GroupingVariables','HrOfDay','OutputFormat','table'); head(byHr)
ans=8×4 table
HrOfDay GroupCount mean_Westbound mean_Eastbound
_______ __________ ______________ ______________
0 389 5.4396 1.7686
1 389 2.7712 0.87147
2 391 1.8696 0.58312
3 391 0.7468 0.289
4 391 0.52685 1.0026
5 391 0.70588 4.7494
6 391 3.1228 22.097
7 391 9.1176 63.54
bar(byHr{:,{'mean_Westbound','mean_Eastbound'}}) legend('Westbound','Eastbound','Location','eastoutside') xlabel('Hour of Day') ylabel('Bicycle Count') title('Mean Bicycle Count by Hour of Day')
There are traffic spikes at the typical commute hours, around 9:00 a.m. and 5:00 p.m. Also, the trends between the Eastbound and Westbound directions are different. In general, the Westbound direction is toward residential areas surrounding the Cambridge area and toward the Universities. The Eastbound direction is toward Boston.
The traffic is heavier later in the day in the Westbound direction compared to the Eastbound direction. This might indicate university schedules and traffic due to restaurants in the area. Examine the trend by day of week as well as hour of day.
byHrDay = varfun(@sum,bikeData,'GroupingVariables',{'HrOfDay','Day'},... 'OutputFormat','table'); head(byHrDay)
ans=8×6 table
HrOfDay Day GroupCount sum_Total sum_Westbound sum_Eastbound
_______ _________ __________ _________ _____________ _____________
0 Sunday 56 473 345 128
0 Monday 55 202 145 57
0 Tuesday 55 297 213 84
0 Wednesday 56 374 286 88
0 Thursday 56 436 324 112
0 Friday 55 442 348 94
0 Saturday 56 580 455 125
1 Sunday 56 333 259 74
To arrange the timetable so that the days of the week are the variables, use the unstack
function.
hrAndDayWeek = unstack(byHrDay(:,{'HrOfDay','Day','sum_Total'}),'sum_Total','Day'); head(hrAndDayWeek)
ans=8×8 table
HrOfDay Sunday Monday Tuesday Wednesday Thursday Friday Saturday
_______ ______ ______ _______ _________ ________ ______ ________
0 473 202 297 374 436 442 580
1 333 81 147 168 173 183 332
2 198 77 68 93 128 141 254
3 86 41 43 44 50 61 80
4 51 81 117 101 108 80 60
5 105 353 407 419 381 340 128
6 275 1750 1867 2066 1927 1625 351
7 553 5355 5515 5818 5731 4733 704
ribbon(hrAndDayWeek.HrOfDay,hrAndDayWeek{:,2:end}) ylim([0 24]) xlim([0 8]) xticks(1:7) xticklabels({'Sun','Mon','Tue','Wed','Thu','Fri','Sat'}) ylabel('Hour') title('Bicycle Count by Hour and Day of Week')
There are similar trends for the regular work days of Monday through Friday, with peaks at rush hour and traffic tapering off in the evening. Friday has less volume, though the overall trend is similar to the other work days. The trends for Saturday and Sunday are similar to each other, without rush hour peaks and with more volume later in the day. The late evening trends are also similar for Monday through Friday, with less volume on Friday.
To examine the overall time of day trends, split up the data by rush hour times. It is possible to use different times of day or units of time using the discretize
function. For example, separate the data into groups for AM
, AMRush
, Day
, PMRush
, PM
. Then use varfun
to calculate the mean by group.
bikeData.HrLabel = discretize(bikeData.HrOfDay,[0,6,10,15,19,24],'categorical',... {'AM','RushAM','Day','RushPM','PM'}); byHrBin = varfun(@mean,bikeData(:,{'Total','HrLabel'}),'GroupingVariables','HrLabel',... 'OutputFormat','table')
byHrBin=5×3 table
HrLabel GroupCount mean_Total
_______ __________ __________
AM 2342 3.5508
RushAM 1564 94.893
Day 1955 45.612
RushPM 1564 98.066
PM 1955 35.198
bar(byHrBin.mean_Total)
cats = categories(byHrBin.HrLabel);
xticklabels(cats)
title('Mean Bicycle Count During Rush Hours')
In general, there is about twice as much traffic in this area during the evening and morning rush hours compared to other times of the day. There is very little traffic in this area in the early morning, but there is still significant traffic in the evening and late evening, comparable to the day outside of the morning and evening rush hours.
datetime
| head
| retime
| rmmissing
| sortrows
| summary
| table2timetable
| timerange
| timetable
| unstack
| varfun