What I Learned Integrating a Messy Real-World API
The NTA's GTFS feeds are public, documented, and routinely inconsistent. Building NavEire taught me more about production API integration than any tutorial ever could.
The API Looked Fine in the Docs
The National Transport Authority publishes two sets of data for Irish public transport: a static GTFS feed with schedules and stop information, and a GTFS-Realtime feed with live vehicle positions, trip updates, and service alerts. Both are public. Both are documented. I assumed the integration would be straightforward.
It was not. The first real problem appeared within hours of pulling live data: the stop IDs in the static feed and the stop IDs in the real-time feed do not match. The same physical bus stop has two different identifiers depending on which feed you're reading. No warning in the docs. Just silent mismatches that produce wrong departure times if you don't catch them.
Reconciliation Before Everything Else
The fix was a coordinate-matching pipeline that builds a stop_id_map table at import time. For every stop in the real-time feed, find the nearest stop in the static feed within a tight radius and record the mapping. It works for the majority of stops, but the edge cases consumed more time than any user-facing feature I built.
Stops that moved slightly between feed versions. Stops that exist in one feed but not the other. Stops where two candidates fall within the matching radius and you have to decide which one is right. Each edge case required a rule, and each rule required testing against the full dataset to make sure it didn't break something else. This is the work that doesn't show up in screenshots but determines whether the product is actually correct.
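To make the matching step concrete, here is a minimal sketch of the idea rather than NavEire's actual code. The Stop type, the function names, the 30 metre radius, and the 5 metre ambiguity margin are all assumptions for illustration. Each real-time stop is compared against every static stop, and the result is split into three buckets: confident matches, stops with no candidate in range, and stops where two candidates are too close to call.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    stop_id: str
    lat: float
    lon: float


def haversine_m(a: Stop, b: Stop) -> float:
    """Great-circle distance between two stops, in metres."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def build_stop_id_map(realtime_stops, static_stops, radius_m=30.0, margin_m=5.0):
    """Map real-time stop IDs to static stop IDs by proximity.

    radius_m and margin_m are illustrative values, not taken from the article.
    Returns (mapping, unmatched_ids, ambiguous_ids).
    """
    mapping, unmatched, ambiguous = {}, [], []
    for rt in realtime_stops:
        # Brute-force scan, sorted nearest first; fine for a sketch,
        # a spatial index would be the next step for a national feed.
        in_range = sorted(
            (dist, st.stop_id)
            for st in static_stops
            if (dist := haversine_m(rt, st)) <= radius_m
        )
        if not in_range:
            unmatched.append(rt.stop_id)
        elif len(in_range) > 1 and in_range[1][0] - in_range[0][0] < margin_m:
            # Two candidates are nearly equidistant: leave it for a manual rule.
            ambiguous.append(rt.stop_id)
        else:
            mapping[rt.stop_id] = in_range[0][1]
    return mapping, unmatched, ambiguous
```

Returning the unmatched and ambiguous IDs alongside the mapping keeps the edge cases visible: each new rule can be re-run against the full dataset, and the size of those two lists shows whether it helped or broke something else.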
Real-Time Data That Is Sometimes Neither
The real-time feed has another property worth knowing about: it goes stale. The NTA's GTFS-RT endpoint updates on a short polling interval, but there are periods, usually late at night or during network issues, where the feed stops updating entirely. If you naively display that data, users see departure times that are minutes or hours out of date with no indication anything is wrong.
NavEire handles this with a cache TTL and a staleness indicator. If the last successful real-time fetch is older than a threshold, the app surfaces a banner telling users the data might be delayed and falls back to the static schedule. That fallback path required as much engineering as the happy path. But users on a rainy night checking when the last bus is coming deserve to know whether the data is live.
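The staleness check can be sketched in a few lines. This is an illustration under assumptions, not the app's real implementation: the RealtimeCache class, the departures_for_display function, and the 90 second threshold are all hypothetical. The caller passes in the current time so the logic is easy to test.

```python
from dataclasses import dataclass
from typing import List, Optional

STALE_AFTER_SECONDS = 90.0  # assumed threshold, not a value from the article


@dataclass
class RealtimeCache:
    """Holds the most recent successful real-time fetch."""
    last_success: Optional[float] = None
    departures: Optional[List[str]] = None

    def store(self, departures: List[str], now: float) -> None:
        # Call this only after a fetch has succeeded and parsed cleanly.
        self.departures = departures
        self.last_success = now


def departures_for_display(cache, static_departures, now, stale_after=STALE_AFTER_SECONDS):
    """Return live data if it is fresh, else the static schedule plus a stale flag."""
    fresh = (
        cache.last_success is not None
        and now - cache.last_success <= stale_after
    )
    if fresh:
        return {"source": "realtime", "stale": False, "departures": cache.departures}
    # No fetch yet, or the last one is too old: fall back and tell the user.
    return {"source": "static", "stale": True, "departures": static_departures}
```

In a running service, now would come from a monotonic clock such as time.monotonic(), so that wall-clock adjustments cannot make old data look fresh. The stale flag is what would drive the banner.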
Versioning and the Cold Start Problem
The static GTFS feed is not versioned in a strict sense. The NTA publishes updates whenever schedule changes happen, and the new feed can differ significantly from the previous one: stops renamed, routes changed, shapes adjusted. Building a system that could ingest a new feed without manual intervention took some care.
The pipeline I landed on runs in CI. When the NTA publishes a new GTFS zip, a scheduled workflow downloads it, imports it into SQLite, compresses the resulting database, and publishes it as a GitHub Release. The production server downloads the latest release on cold start. The whole boot sequence takes about thirty seconds; a health endpoint reports when the server is ready, and the frontend shows a startup banner until then. No manual deploys, no stale data sitting in production.
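The cold-start half of that pipeline might look roughly like the sketch below. Several things here are assumptions: gzip as the compression format, the BootState, prepare_database, and health names, and the injected fetch_release callable that stands in for whatever downloads the release asset. The point is the shape: unpack to a temporary file, swap it into place atomically, and let the health check report "starting" until that has happened.

```python
import gzip
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


class BootState:
    """Tracks whether the database is in place yet."""
    def __init__(self) -> None:
        self.ready = False
        self.error: Optional[str] = None


def prepare_database(fetch_release: Callable[[], bytes], dest: Path, state: BootState) -> None:
    """Download, decompress, and atomically install the SQLite file."""
    try:
        raw = gzip.decompress(fetch_release())
        # Write next to the destination so os.replace stays on one filesystem.
        fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(raw)
        os.replace(tmp_name, dest)  # readers never see a half-written file
        state.ready = True
    except Exception as exc:  # broad on purpose: any failure means "not ready"
        state.error = str(exc)


def health(state: BootState) -> Tuple[int, Dict[str, str]]:
    """Status code and body a health endpoint could return."""
    if state.ready:
        return 200, {"status": "ready"}
    if state.error:
        return 503, {"status": "failed", "detail": state.error}
    return 503, {"status": "starting"}
```

A frontend polling this endpoint could show the startup banner for as long as it sees a 503 with status "starting", and a different message if the status is "failed".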
What This Taught Me About API Integration
Documentation describes the intended contract. Production data describes the actual one. Those two things are never identical, and the gap between them is where real integration work happens. Any time you take on an API integration, budget time for what the docs don't mention: format inconsistencies, missing fields, rate limits that only appear at scale, and edge cases in the data that only surface with real users.
The other lesson is to build the unhappy path first. Timeouts, stale data, partial responses, and downstream failures are not edge cases to handle later. They are the conditions your integration will run in for a significant percentage of its lifetime. Designing for graceful degradation from the start is cheaper than retrofitting it after users have already complained.