For the week of your birthday in 2016, read in the pedestrian counts for all the sensors in Melbourne, using code like this:
myweek <- walk_melb(ymd("2016-10-31"), ymd("2016-11-06")) # Monday through Sunday
rwalkr
package? EARO WANG
myweek <- myweek %>% mutate(day = wday(Date, label=TRUE, abbr=TRUE, week_start = 1))
qv <- myweek %>%
filter(Sensor == "QV Market-Peel St") %>%
group_by(day) %>%
summarise(n=sum(Count, na.rm=T))
qv
# A tibble: 7 x 2
day n
<ord> <int>
1 Mon 2085
2 Tue 1630
3 Wed 2336
4 Thu 3258
5 Fri 3156
6 Sat 3347
7 Sun 3105
qv %>% filter(n==max(n))
# A tibble: 1 x 2
day n
<ord> <int>
1 Sat 3347
QV Market-Peel St
has a big peak in the middle of the weekend days. The week days are mixed. Monday and Wednesday have triple peaks, which look like commuter and lunch time traffic. These patterns are not there on Tuesday, Thursday or Friday. Tuesday has the smallest number of people walking by. Thursday and Friday appear to have the middle of the day peak, but also possibly the commuter traffic as well, which makes for a difficult pattern detection. The opening hours for QV market are Tues, Thu-Sun, and it looks like the Thu-Sun are busy market days, as well as commuting, but Tues doesn’t have a large number of people attending.
melb <- get_map(location=c(144.9631, -37.8136), zoom=14)
loc <- pull_sensor()
loc <- loc %>%
filter(Sensor %in% myweek$Sensor)
# sens_count <- myweek %>% group_by(Sensor) %>% summarise(n=sum(Count, na.rm=TRUE)) %>%
# left_join(loc, by="Sensor")
sens_count <- myweek %>% group_by(Sensor) %>% summarise(n=sum(Count, na.rm=TRUE)) %>%
left_join(loc, by="Sensor")
ggmap(melb) + geom_point(data=sens_count, aes(x=Longitude, y=Latitude, colour=n), size=4, alpha=0.7) +
scale_colour_distiller(palette="YlOrRd")
The corridor along Flinders to Swanston to Bourke is the busiest pedestrian traffic.
# A tibble: 2 x 1
Sensor
<chr>
1 Chinatown-Lt Bourke St (South)
2 Lonsdale St (South)
For me there are two missing sensors, but this might not be the case for everyone.
myweek <- myweek %>% mutate(weekend = ifelse(day %in% c("Sat", "Sun"), TRUE, FALSE)) %>%
mutate(lCount = log10(Count+1))
flinders <- myweek %>% filter(Sensor == "Flinders St-Swanston St (West)")
mod <- lm(lCount~factor(Time)*weekend, data=flinders)
#tidy(mod)
glance(mod)
r.squared adj.r.squared sigma statistic p.value df logLik
1 0.9196537 0.8881847 0.1639844 29.22415 3.02108e-47 48 93.62326
AIC BIC deviance df.residual
1 -89.24651 63.82772 3.226908 120
fit <- augment(mod, flinders)
fit <- fit %>% mutate(.raw_fitted = 10^(.fitted)-1)
Using an interaction term allows weekends to have a different pattern to week days.
ggplot() +
geom_line(data=fit, aes(x=Time, y=Count, group=Date), colour="hotpink", alpha=0.9) +
geom_line(data=fit, aes(x=Time, y=.raw_fitted), size=1.5) + facet_wrap(~weekend)
fit %>% filter((!weekend), Time == 17) %>% select(.raw_fitted) %>% distinct()
.raw_fitted
1 4970.749
With time treated as a factor there is flexibility in the type of pattern in counts that is captured by the model.
One point is reserved for easy to compile, spell-checked, nicely turned in work.