Case Study · Transport & Operations

Citi Bike usage analysis built for operational decisions

An end-to-end Python and Streamlit project that turned a year of NYC Citi Bike trip data, around 30 million rides, into an interactive operational dashboard. It covers daily demand against weather, the hourly usage rhythm, the busiest stations, and a geospatial map of the highest-volume routes, showing an operations team where and when to focus bikes.

Python · pandas · Streamlit · Kepler.gl · Plotly · NOAA Weather Data
Project overview

From raw trip records to operational priorities

The project was built around a practical operational question: where, and at what times, does Citi Bike run into bike availability problems across New York, and which stations and routes are worth focusing on first?

Business problem

A bike share operator running around 30 million annual trips cannot make fleet and rebalancing decisions from a monthly report. Customers complain that bikes are not available at certain times and places. The goal was to convert a full year of trip records into clear signals of demand by weather, hour, station, and route that support rebalancing, station planning, and service decisions.

Analytical goal: move beyond raw trip counts and build an interactive view of when and where demand actually concentrates, so operational effort can follow it.

Final output

The final deliverable was a five-page Streamlit dashboard fed by a Python data pipeline: cleaned trip data, a NOAA weather merge, and aggregated station and route tables feeding the charts and map.

The dashboard helps stakeholders see the relationship between weather and demand, the daily usage rhythm, the busiest stations by season, and the high-volume corridors that matter most for rebalancing and expansion.

Annual trips: 30M (full year of NYC Citi Bike trip records for 2022)
Missing-station rows resolved: 69,835 (cleaned during data preparation before analysis)
Weather source: NOAA (daily average temperature merged into the trips via API)
Dashboard pages: 5 (weather, hourly, stations, interactive map, recommendations)

Python-first workflow

How the analysis was built

The Python work converted raw, messy trip records into a clean, dashboard-ready dataset enriched with weather, then aggregated it into the station and route tables that drive the charts and the map.

01 Retrieve: Pulled the 2022 Citi Bike trip data and collected NOAA daily temperature for New York through the weather API, then structured both for analysis.
02 Clean and transform: Handled missing values, duplicates, and data types, resolving 69,835 rows with missing station information so the trip counts could be trusted.
03 Weather merge: Joined NOAA daily average temperature onto the trips by date, enabling the analysis of how demand responds to weather across the year.
04 Feature engineering: Created new variables for trip counts, time of day, season, and start-to-end station pairs, the building blocks for every view in the dashboard.
05 Aggregate: Grouped trips by station and by station pair with groupby to surface the busiest stations, the daily and hourly demand curve, and route volumes.
06 Geospatial map: Integrated start and end station coordinates and built a Kepler.gl map of aggregated trips, filtered to station pairs with 750 or more trips to cut noise.
07 Dashboard: Brought everything together in a multi-page Streamlit app with Plotly charts and season slicers, then deployed it live on Streamlit Cloud.
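
The cleaning and feature engineering steps can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`ride_id`, `started_at`, `start_station_name`, `end_station_name`), the meteorological season boundaries, and the choice to drop rows with missing stations are all assumptions, since the write-up says those rows were "resolved" without saying how.

```python
import pandas as pd

# Toy stand-in for raw trip records; column names are assumptions.
raw = pd.DataFrame({
    "ride_id": ["a1", "a2", "a2", "a3", "a4"],
    "started_at": [
        "2022-01-15 08:05:00", "2022-07-04 17:30:00", "2022-07-04 17:30:00",
        "2022-10-09 23:10:00", "2022-04-02 12:00:00",
    ],
    "start_station_name": [
        "W 21 St & 6 Ave", "West St & Chambers St", "West St & Chambers St",
        None, "Broadway & W 58 St",
    ],
    "end_station_name": [
        "Broadway & W 58 St", "W 21 St & 6 Ave", "W 21 St & 6 Ave",
        "1 Ave & E 68 St", "W 21 St & 6 Ave",
    ],
})


def season_of(month: int) -> str:
    """Map a month number to a meteorological season (assumed definition)."""
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "autumn"


def clean_and_enrich(trips: pd.DataFrame) -> pd.DataFrame:
    """Deduplicate, drop rows without stations, and add time and route features."""
    out = trips.drop_duplicates(subset="ride_id")
    # Assumption: rows with missing station names are dropped, not imputed.
    out = out.dropna(subset=["start_station_name", "end_station_name"]).copy()
    out["started_at"] = pd.to_datetime(out["started_at"])
    out["date"] = out["started_at"].dt.normalize()  # midnight timestamp per day
    out["hour"] = out["started_at"].dt.hour
    out["season"] = out["started_at"].dt.month.map(season_of)
    out["route"] = out["start_station_name"] + " -> " + out["end_station_name"]
    return out


trips = clean_and_enrich(raw)
```

Keeping `date`, `hour`, `season`, and `route` as plain columns means each later view is a single `groupby` on one of them.
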
The method

Demand by weather, hour, station, and route

Rather than a single headline number, the analysis looks at demand from four angles that an operations team can act on: how it moves with weather, how it moves through the day, where it concentrates, and which routes carry it.

How the analysis reads demand

Each view answers a different operational question, and the geospatial map is filtered so it shows recurring high-volume routes rather than one-off journeys.

01 Daily trips are plotted against daily temperature to expose the seasonal demand pattern.
02 Trips by hour of day reveal the commuter rhythm, with a clear late afternoon peak.
03 Stations are ranked by total trip volume, with a season filter to compare across the year.
04 Station pairs are mapped and filtered to 750 or more trips to highlight the corridors that matter.

Weather and demand: Tracks temperature

Daily bike trips follow temperature closely, rising into spring, peaking June to August, and dropping to their lowest in winter.
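
One way to put a number on "follow temperature closely" is to count trips per day, join the daily temperature, and compute a correlation. The sketch below uses made-up toy values; the column names `date`, `trips`, and `avg_temp` and the use of degrees Celsius are assumptions, and the Pearson correlation is an illustrative check rather than something the dashboard is stated to report.

```python
import pandas as pd

# Toy trips: one row per ride, reduced to its start date (values are made up).
trips = pd.DataFrame({
    "date": pd.to_datetime(
        ["2022-01-10"] * 2 + ["2022-04-10"] * 4
        + ["2022-07-10"] * 6 + ["2022-12-10"] * 1
    )
})

# Toy daily weather table; avg_temp is assumed to be in degrees Celsius.
weather = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-10", "2022-04-10", "2022-07-10", "2022-12-10"]),
    "avg_temp": [-2.0, 12.0, 27.0, 1.0],
})

# Count rides per day, then attach that day's temperature.
daily_trips = trips.groupby("date").size().reset_index(name="trips")
daily = daily_trips.merge(weather, on="date", how="left")

# Pearson correlation between daily ride count and daily temperature.
corr = daily["trips"].corr(daily["avg_temp"])
```

On real data the same `daily` table can feed a dual-axis line chart of trips and temperature over the year.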

Daily rhythm: Late afternoon peak

Usage is lowest overnight, climbs from around 06:00, and peaks in the late afternoon and early evening, 16:00 to 18:00.
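
The hourly curve behind this view can be sketched as a count of rides per start hour. The timestamps below are toy values, and treating the ride start time as the hour of use is an assumption.

```python
import pandas as pd

# Toy ride start times (made up for illustration).
started_at = pd.to_datetime(pd.Series([
    "2022-06-01 03:15", "2022-06-01 08:05", "2022-06-01 08:40",
    "2022-06-01 17:10", "2022-06-01 17:20", "2022-06-01 17:55",
    "2022-06-01 18:30", "2022-06-01 18:45",
]))

# Rides per hour of day, with every hour 0..23 present even if it had no rides.
hourly = started_at.dt.hour.value_counts().reindex(range(24), fill_value=0)

# Hour with the most rides.
peak_hour = int(hourly.idxmax())
```

Reindexing to all 24 hours keeps quiet overnight hours on the chart as zeros instead of gaps.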

Where demand sits: Central Manhattan

The busiest start stations cluster in central Manhattan and commuter-heavy zones, led by W 21 St & 6 Ave.

By design: Routes over single trips

The map keeps only station pairs with 750 or more trips, so the view shows real corridors, not isolated rides.
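
A minimal sketch of that filter, assuming one row per ride with `start_station_name` and `end_station_name` columns (assumed names) and a hypothetical helper called `busiest_routes`:

```python
import pandas as pd


def busiest_routes(trips: pd.DataFrame, min_trips: int = 750) -> pd.DataFrame:
    """Count rides per directed station pair and keep pairs at or above min_trips."""
    routes = (
        trips.groupby(["start_station_name", "end_station_name"])
        .size()
        .reset_index(name="trips")
    )
    kept = routes[routes["trips"] >= min_trips]
    return kept.sort_values("trips", ascending=False).reset_index(drop=True)


# Toy data with a lowered threshold so the effect is visible.
toy = pd.DataFrame({
    "start_station_name": ["A"] * 3 + ["B"] * 2 + ["C"],
    "end_station_name": ["B"] * 3 + ["A"] * 2 + ["A"],
})
top = busiest_routes(toy, min_trips=2)
```

In this sketch pairs are directed, so A to B and B to A are counted separately. Whether the original map merges the two directions is not stated.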

Figure: Streamlit NYC Citi Bike operational analysis dashboard preview.
Network breakdown

The network at a glance

Headline cuts of the 2022 Citi Bike network, from the busiest stations to the daily and hourly demand shape.

Top stations

Busiest start stations by trips.

  • W 21 St & 6 Ave: 1,461
  • West St & Chambers St: 1,309
  • Broadway & W 58 St: 1,296
  • 1 Ave & E 68 St: 1,239

Daily rhythm

Trips by time of day.

  • Overnight: Lowest
  • Morning rise: From 06:00
  • Peak window: 16:00 to 18:00

Seasonal demand

How usage moves with weather.

  • Peak months: Jun to Aug
  • Lowest: Winter
  • Driver: Temperature

Routes

Geospatial map focus.

  • Filter: 750+ trips
  • Concentration: Manhattan
  • Pattern: Commuter

Streamlit dashboard

Dashboard pages built around stakeholder questions

The app moves from an introduction into weather, hourly, station, map, and recommendation detail, each page answering a specific operational question.

01

Weather and Bike Trips

Daily trips plotted against daily temperature, showing the strong seasonal relationship between weather and demand.

02

Bike Trips by Hours

The daily usage curve, with demand lowest overnight, rising from 06:00, and peaking in the late afternoon.

03

Most Popular Stations

The top 20 start stations by trip volume, with a season filter so demand can be compared across the year.
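
The ranking behind this page can be sketched as a filtered count. The helper name `top_stations` and the `season` and `start_station_name` columns are assumptions; in a Streamlit app the `season` argument would typically come from a widget such as `st.selectbox`, which is left out here so the example runs on its own.

```python
from typing import Optional

import pandas as pd


def top_stations(trips: pd.DataFrame, season: Optional[str] = None, n: int = 20) -> pd.DataFrame:
    """Rank start stations by ride count, optionally restricted to one season."""
    subset = trips if season is None else trips[trips["season"] == season]
    return (
        subset["start_station_name"]
        .value_counts()
        .head(n)
        .rename_axis("start_station_name")
        .reset_index(name="trips")
    )


# Toy data: station letters and seasons are made up.
toy = pd.DataFrame({
    "start_station_name": ["X", "X", "X", "Y", "Y", "Z", "Y", "Y", "Y", "X"],
    "season": ["summer"] * 6 + ["winter"] * 4,
})

summer_top2 = top_stations(toy, season="summer", n=2)
winter_top1 = top_stations(toy, season="winter", n=1)
overall = top_stations(toy)
```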

04

Interactive Map with Bike Trips

A Kepler.gl map of aggregated trips, filtered to station pairs with 750 or more trips to surface the busiest corridors.

05

Recommendations

Practical actions on seasonal scaling and waterfront station planning, drawn from the patterns in the data.

Key findings

What the data made clear

The dashboard turns a year of trip records into a small number of patterns an operations team can actually act on.

Demand tracks temperature

Daily bike trips follow temperature closely across the year, rising into spring, peaking June to August, and falling to their lowest in winter. Temperature is a clear driver of demand, which makes seasonality something the operation can plan around.

Usage follows a commuter rhythm

Trips are lowest overnight, climb sharply from around 06:00, and peak in the late afternoon and early evening, 16:00 to 18:00. Weekday usage runs above weekends, pointing to commuter-driven demand.

Demand concentrates in central Manhattan

The busiest start stations cluster in central Manhattan and commuter-heavy zones, led by W 21 St & 6 Ave. Some stations also show an imbalance between departures and arrivals, a signal for rebalancing.
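
The departure and arrival imbalance can be sketched as a per-station net flow. The helper name `station_imbalance` and the column names are assumptions, and the station letters are toy values.

```python
import pandas as pd


def station_imbalance(trips: pd.DataFrame) -> pd.DataFrame:
    """Return departures, arrivals, and net outflow (departures - arrivals) per station."""
    departures = trips["start_station_name"].value_counts()
    arrivals = trips["end_station_name"].value_counts()
    # Align on the union of station names; a station missing on one side gets 0.
    flow = pd.DataFrame({"departures": departures, "arrivals": arrivals})
    flow = flow.fillna(0).astype(int)
    flow["net_outflow"] = flow["departures"] - flow["arrivals"]
    return flow.sort_values("net_outflow", ascending=False)


toy = pd.DataFrame({
    "start_station_name": ["A", "A", "A", "B"],
    "end_station_name": ["B", "B", "C", "A"],
})
flow = station_imbalance(toy)
```

A large positive `net_outflow` means a station tends to empty out, and a large negative value means it tends to fill up, so both ends of the ranking are candidates for rebalancing.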

A few corridors carry the load

Filtering the map to station pairs with 750 or more trips shows demand running in dense, recurring corridors along Manhattan’s waterfront and its north-south routes, rather than spread evenly across the network.

Method and confidence

Built on real data, with an honest scope

An operational dashboard only holds up if the data is cleaned properly and the conclusions stay within what the data can actually show. Both were built in from the start.

Conclusions you can defend

The findings rest on cleaned, real-world trip data enriched with independent weather data, and the scope is stated plainly.

Real data, cleaned: a full year of Citi Bike trips, with 69,835 missing-station rows resolved before any analysis.
Independent weather merge: NOAA daily temperature joined by date, so the seasonal pattern is supported by an outside source rather than asserted.
Honest scope: this is a public portfolio case study, not an internal Citi Bike engagement, and the recommendations are framed as directional, not as costed plans.
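
A date join like the one above is easy to check for silent problems. The sketch below is one possible guard, not necessarily what the project did: `validate="many_to_one"` makes pandas raise an error if the weather table has duplicate dates, and `indicator=True` flags trips whose date has no weather row. Column names and values are assumed toy data.

```python
import pandas as pd

# Toy trips (one row per ride) and a weather table that lacks 2022-03-03.
trips = pd.DataFrame({
    "date": pd.to_datetime(["2022-03-01", "2022-03-01", "2022-03-02", "2022-03-03"])
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2022-03-01", "2022-03-02"]),
    "avg_temp": [4.5, 6.0],
})

merged = trips.merge(
    weather,
    on="date",
    how="left",              # keep every trip, even without weather
    validate="many_to_one",  # raise if a date appears twice in weather
    indicator=True,          # adds a _merge column describing each row's match
)

# Trips that found no matching weather row.
unmatched = merged[merged["_merge"] == "left_only"]
```

A left join keeps the trip count unchanged, so comparing `len(merged)` with `len(trips)` is a quick sanity check after the merge.
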
Python data pipeline

Cleaning, the NOAA weather merge, feature engineering, and station and route aggregation, all in pandas.

Geospatial trip map

A Kepler.gl map of aggregated station to station trips, filtered to 750 or more trips to surface the busiest routes.

Five-page Streamlit dashboard

Weather, hourly usage, top stations, the interactive map, and recommendations, deployed live on Streamlit Cloud.

Plotly charts and slicers

Daily trips against temperature, the hourly demand curve, and a station ranking with a season filter for drill-down.

Recommendations

How an operations team could use the dashboard

Scale bike availability seasonally: because demand tracks temperature so closely, Citi Bike could reduce its active fleet and rebalancing effort by roughly 30 to 40% from November to April, lowering operating cost during a predictable low-demand window.
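
A figure like this can be sanity-checked against the daily trip table by comparing average daily demand in the November to April window with the rest of the year. The write-up does not say how the 30 to 40% range was derived, so the sketch below is only one plausible check. The helper name `off_season_gap` is hypothetical and the numbers are made up.

```python
import pandas as pd


def off_season_gap(daily: pd.DataFrame, off_months=(11, 12, 1, 2, 3, 4)) -> float:
    """Fraction by which mean daily trips in off_months fall below the other months."""
    is_off = daily["date"].dt.month.isin(list(off_months))
    off_mean = daily.loc[is_off, "trips"].mean()
    on_mean = daily.loc[~is_off, "trips"].mean()
    return 1 - off_mean / on_mean


# Toy daily counts (made up): two winter days and two summer days.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-15", "2022-02-15", "2022-07-15", "2022-08-15"]),
    "trips": [50, 70, 100, 100],
})
gap = off_season_gap(daily)
```

A demand gap is not the same as a safe fleet reduction, which also depends on peak-hour coverage and maintenance cycles, so a result like this is best read as directional.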

Plan stations around real corridors: new stations and rebalancing effort along the waterfront should follow the dense, recurring high-volume routes the map highlights, rather than being spaced evenly.

Match staffing to the daily peak: concentrate availability and redistribution around the late afternoon peak, 16:00 to 18:00, when demand is highest.

Stay in scope: capital decisions and exact station counts were left out as decisions for the network operator, with the analysis pointing the direction rather than costing the build.

Tools & deliverables

What this project demonstrates

The project demonstrates working with real-world API data, data cleaning at scale, joining external datasets, feature engineering and aggregation in pandas, geospatial visualization with Kepler.gl, and the ability to turn raw trip records into an interactive, deployed dashboard with clear operational recommendations.

Python · Jupyter · pandas · NumPy · Matplotlib · Plotly · Kepler.gl · Streamlit · NOAA Weather Data
