Data Mining Datasets

  • SHMetro Dataset: a large-scale dataset for metro ridership prediction [Download]

    • 288 metro stations in Shanghai, China.
    • A total of 811.8 million transaction records (7/01/2016-9/30/2016).
    • For each station, inflow and outflow are measured every 15 minutes.

    Related Work: Physical-Virtual Collaboration Modeling for Intra-and Inter-Station Metro Ridership Prediction
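
    The 15-minute inflow/outflow measurement described above can be illustrated with a minimal sketch. It assumes raw transactions arrive as (station_id, timestamp, direction) tuples; that record layout, the station IDs, and the function names are hypothetical and are not the dataset's actual schema.

```python
from collections import defaultdict
from datetime import datetime

BIN_MINUTES = 15  # interval length stated in the dataset description


def bin_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute interval."""
    return ts.replace(minute=ts.minute - ts.minute % BIN_MINUTES,
                      second=0, microsecond=0)


def aggregate_flows(records):
    """Count entries and exits per (station, interval).

    `records` is an iterable of (station_id, timestamp, direction) tuples,
    where direction is "in" or "out". This layout is an assumption made
    for illustration only.
    Returns {(station_id, interval_start): [inflow, outflow]}.
    """
    flows = defaultdict(lambda: [0, 0])
    for station_id, ts, direction in records:
        idx = 0 if direction == "in" else 1
        flows[(station_id, bin_start(ts))][idx] += 1
    return dict(flows)


# Toy records, invented for illustration; not taken from the dataset.
records = [
    ("S001", datetime(2016, 7, 1, 8, 3), "in"),
    ("S001", datetime(2016, 7, 1, 8, 14), "in"),
    ("S001", datetime(2016, 7, 1, 8, 16), "out"),
    ("S002", datetime(2016, 7, 1, 8, 29), "in"),
]
flows = aggregate_flows(records)
```

    Each key pairs a station with the start of a 15-minute interval, and each value holds that interval's inflow and outflow counts. The released files may already be aggregated this way, in which case no such step is needed.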

  • HZMetro Dataset: a large-scale dataset for metro ridership prediction [Download]

    • 80 metro stations in Hangzhou, China.
    • Daily ridership of 2.35 million (01/01/2019-01/25/2019).
    • For each station, inflow and outflow are measured every 15 minutes.

    Related Work: Physical-Virtual Collaboration Modeling for Intra-and Inter-Station Metro Ridership Prediction

  • NYC-TOD Dataset: a large-scale dataset for taxi origin-destination demand prediction [Download]

    • We divided New York City into a 15×5 grid map.
    • A total of 132 million taxi trip records in 2014.
    • We measured the taxi demand between every pair of regions in each 30-minute interval.

    Related Work: Contextualized Spatial-Temporal Network for Taxi Origin-Destination Demand Prediction
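
    As a rough illustration of the origin-destination measurement above, the sketch below maps pickup and dropoff coordinates onto a 15×5 grid and counts trips per (time slot, origin region, destination region). The bounding box, the trip tuple layout, and the function names are assumptions made for this example; the dataset's actual grid boundaries are not stated here.

```python
from collections import Counter

ROWS, COLS = 15, 5        # grid size from the dataset description
SLOT_SECONDS = 30 * 60    # 30-minute intervals

# Hypothetical bounding box (lat_min, lat_max, lon_min, lon_max);
# chosen for illustration, not taken from the dataset.
BBOX = (40.70, 40.85, -74.02, -73.92)


def region_of(lat, lon):
    """Return the region index (0..ROWS*COLS-1), or None if outside BBOX."""
    lat_min, lat_max, lon_min, lon_max = BBOX
    if not (lat_min <= lat < lat_max and lon_min <= lon < lon_max):
        return None
    row = int((lat - lat_min) / (lat_max - lat_min) * ROWS)
    col = int((lon - lon_min) / (lon_max - lon_min) * COLS)
    return row * COLS + col


def od_demand(trips):
    """Count trips per (slot, origin_region, destination_region).

    `trips` is an iterable of tuples
    (pickup_seconds, pickup_lat, pickup_lon, dropoff_lat, dropoff_lon),
    where pickup_seconds is measured from the start of the period.
    Trips with an endpoint outside the grid are skipped.
    """
    demand = Counter()
    for t, plat, plon, dlat, dlon in trips:
        origin, dest = region_of(plat, plon), region_of(dlat, dlon)
        if origin is None or dest is None:
            continue
        demand[(int(t // SLOT_SECONDS), origin, dest)] += 1
    return demand


# Toy trips, invented for illustration; not taken from the dataset.
trips = [
    (600, 40.705, -74.01, 40.845, -73.93),
    (1200, 40.705, -74.01, 40.845, -73.93),
    (1900, 40.845, -73.93, 40.705, -74.01),
    (100, 41.0, -73.95, 40.705, -74.01),   # pickup outside the grid
]
demand = od_demand(trips)
```

    Summing the counts over destinations gives per-region pickup demand, and summing over origins gives per-region dropoff demand.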

  • TaxiNYC Dataset: a large-scale dataset for taxi pickup/dropoff prediction [Download]

    • New York City is divided into a 15×7 grid map.
    • A total of 132 million taxi trip records in 2014.
    • For each region, the taxi pickup/dropoff demand is measured every 30 minutes.

    Related Work: Dynamic Spatial-Temporal Representation Learning for Traffic Flow Prediction

Computer Vision Datasets

  • SYSU16K Landmark Dataset: a large-scale dataset for facial landmark localization in the wild [Download]

    • It contains 7317 images with 16K faces collected from the Internet.
    • Each face is accurately annotated with 72 landmarks.
    • The faces in this dataset vary in pose, expression, illumination, and resolution, and may be severely occluded.

    Related Work: Facial Landmark Machines: A Backbone-Branches Architecture with Progressive Representation Learning