If you commute in Metro Manila, you've probably driven past an accident scene at least once this week. Maybe you slowed down, looked, and kept going. I used to do the same until I started wondering: is this random, or are there patterns? Do accidents cluster on certain roads, at certain hours, on certain days?
Turns out, the MMDA keeps records. I got my hands on a dataset of over 17,000 traffic incidents from 2018 through 2020, and the patterns were clearer than I expected.
What I Built
A time-and-location analysis of Metro Manila traffic accidents. The core output is a set of heatmap-style visualizations showing incident frequency by hour of day, day of week, and road. I also broke down the data by vehicle type and incident severity to see who's most at risk.
Why This Project
Road safety data in the Philippines doesn't get much public attention. The MMDA releases numbers, but they rarely make it into formats that regular people can understand or act on. I thought that if I could turn raw incident logs into clear visual patterns, it might be useful — for commuters, for urban planners, or at least for satisfying my own curiosity about which roads to be extra careful on.
How I Put It Together
The dataset needed serious cleaning. Timestamps were in inconsistent formats. Road names were spelled differently across records — "EDSA" versus "Epifanio de los Santos" versus "E. de los Santos Ave." I standardized road names using a mapping dictionary and parsed timestamps into proper datetime objects.
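A minimal sketch of that cleaning step, not the project's actual code. The alias table only covers the three EDSA spellings mentioned above, and the two timestamp formats are assumptions, since the formats in the real dataset aren't listed here. The names `ROAD_ALIASES`, `TIMESTAMP_FORMATS`, `standardize_road`, and `parse_timestamp` are made up for illustration.

```python
from datetime import datetime

# Hypothetical alias table: lowercased variant spelling -> canonical road name.
ROAD_ALIASES = {
    "edsa": "EDSA",
    "epifanio de los santos": "EDSA",
    "e. de los santos ave.": "EDSA",
}

# Assumed timestamp layouts; the real dataset may use different ones.
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M", "%m/%d/%Y %I:%M %p")


def standardize_road(raw_name):
    """Return the canonical road name, or the cleaned input if no alias matches."""
    cleaned = " ".join(raw_name.split())  # trim and collapse stray whitespace
    return ROAD_ALIASES.get(cleaned.lower(), cleaned)


def parse_timestamp(raw_value):
    """Try each assumed format in turn; return None if none of them fit."""
    text = raw_value.strip()
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None
```

Returning `None` for unparseable timestamps keeps bad rows visible so they can be counted and inspected instead of silently dropped.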
The hour-by-day heatmap was the most technically interesting piece. I grouped incidents by hour (0-23) and day_of_week (Monday-Sunday), counted occurrences in each cell, and plotted the result with seaborn.heatmap(). The visual instantly showed which time slots are the worst — no statistics degree required to read it.
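The grouping and plotting steps could be sketched roughly like this. It assumes the timestamps are already parsed into `datetime` objects; the function names, the row/column orientation, and the `"Reds"` colormap are illustrative choices, not necessarily what the original charts used.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def hour_by_day_counts(timestamps):
    """Return a 7x24 grid: rows are Monday..Sunday, columns are hours 0..23."""
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    return [[counts[(day, hour)] for hour in range(24)] for day in range(7)]


def plot_heatmap(grid):
    """Draw the grid with seaborn.heatmap(); requires seaborn and matplotlib."""
    # Imported here so the counting step above works without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(grid, xticklabels=list(range(24)),
                     yticklabels=DAYS, cmap="Reds")
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Day of week")
    plt.tight_layout()
    plt.show()
```

Cells with no incidents come out as 0 because a `Counter` returns 0 for missing keys, so quiet time slots still appear in the grid.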
Vehicle type analysis required some grouping too. The raw data had dozens of vehicle categories that I consolidated into broader types: motorcycle, private car, truck/bus, public utility vehicle, and other. This made the proportional analysis much more readable.
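As a rough illustration of that consolidation, the sketch below maps raw labels to the five broader types and computes each type's share of incidents. The raw labels in `VEHICLE_GROUPS` are hypothetical placeholders; the dataset's actual categories aren't listed in this post.

```python
from collections import Counter

# Hypothetical raw labels (lowercased) -> consolidated vehicle type.
VEHICLE_GROUPS = {
    "motorcycle": "motorcycle",
    "mc": "motorcycle",
    "car": "private car",
    "suv": "private car",
    "truck": "truck/bus",
    "bus": "truck/bus",
    "jeepney": "public utility vehicle",
    "taxi": "public utility vehicle",
}


def consolidate(raw_type):
    """Map a raw label to its broader type; anything unmapped becomes 'other'."""
    return VEHICLE_GROUPS.get(raw_type.strip().lower(), "other")


def group_shares(raw_types):
    """Return each consolidated type's fraction of all listed incidents."""
    counts = Counter(consolidate(t) for t in raw_types)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}
```

Sending unmapped labels to "other" means a new spelling never breaks the analysis, though it's worth checking how large that bucket gets.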
What the Data Showed
EDSA dominated everything. It wasn't even close. The country's busiest highway accounted for a disproportionate share of total incidents, which makes sense given its volume, but the concentration was still striking.
Friday evenings were the worst time slot by a clear margin. The 5PM to 9PM window on Fridays consistently had more incidents than any other period. My guess is it's a combination of rush hour congestion, end-of-week fatigue, and — let's be honest — some people heading out for the weekend after a drink or two.
The vehicle breakdown surprised me. Motorcycles were overrepresented in the incident data relative to their share of total vehicles on the road. They showed up in collisions, sideswipes, and single-vehicle accidents at rates that should concern anyone who rides one.
- EDSA had the highest concentration of incidents by a wide margin
- Friday evenings (5-9PM) were the single most dangerous time slot
- Motorcycles were involved in a disproportionate number of incidents
- Weekday mornings (7-9AM) had a secondary peak, but much smaller than evenings
- 2020 showed a sharp dip during lockdown months, then a quick rebound
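The lockdown dip in the last point is the kind of thing a simple per-month count can surface. A small sketch, assuming parsed `datetime` timestamps and using a hypothetical helper name:

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps):
    """Count incidents per (year, month), returned in chronological order."""
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    return dict(sorted(counts.items()))
```

Months with no recorded incidents are simply absent from the result, so they would need to be filled in with zeros before plotting a continuous timeline.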
Looking Back
This is the kind of analysis that could actually be useful if it reached the right people. Traffic engineering decisions should be informed by this data — which intersections need better signaling, where motorcycle lanes might reduce collisions, what enforcement should look like on Friday nights.
The 2020 lockdown data was an accidental natural experiment. When traffic volume dropped drastically during the strictest quarantine months, incidents fell sharply. When volume returned, so did the accidents, almost immediately. That link between volume and incidents seems obvious, but having the data to show it makes the case for congestion reduction stronger than any opinion piece could.