Data source:
This dataset was collected by the Stanford SNAP group using Gowalla’s public API. A detailed description and the raw data are available at https://snap.stanford.edu/data/loc-Gowalla.html.
Data preparation:
The data were cleaned in two steps: (1) check-in records less than 30 minutes apart and located within a 500 m × 500 m area were merged into a single record at their center of mass, to remove duplicate check-ins; and (2) inactive users with fewer than 10 friends or fewer than 5 check-in locations were filtered out, to avoid unreliable statistics from incomplete observations. After cleaning, 26,647 users remain, with 2,698,029 check-in records from Feb. 2009 to Oct. 2010, forming an undirected network with 254,823 edges.
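The two cleaning steps above could be sketched roughly as follows. This is an illustrative reconstruction, not the original cleaning script. It makes several assumptions: merging is done per user over time-sorted records, a group is grown greedily while each new record is within 30 minutes of the previous one and the group's bounding box stays within 500 m on each side, the merged record keeps the earliest timestamp, and degrees are converted to metres with a flat approximation. The names merge_checkins and is_active are hypothetical.

```python
import math

M_PER_DEG = 111_320.0   # approximate metres per degree of latitude (assumption)
MAX_GAP_S = 30 * 60     # 30-minute threshold from the description
MAX_SIDE_M = 500.0      # side of the 500 m x 500 m area


def _fits_box(points):
    """True if all (t, lat, lon) points fit in a 500 m x 500 m bounding box."""
    lats = [p[1] for p in points]
    lons = [p[2] for p in points]
    mean_lat = sum(lats) / len(lats)
    height = (max(lats) - min(lats)) * M_PER_DEG
    width = (max(lons) - min(lons)) * M_PER_DEG * math.cos(math.radians(mean_lat))
    return height <= MAX_SIDE_M and width <= MAX_SIDE_M


def _centroid(group):
    """Collapse a group to (earliest time, mean lat, mean lon)."""
    n = len(group)
    return (group[0][0],
            sum(p[1] for p in group) / n,
            sum(p[2] for p in group) / n)


def merge_checkins(records):
    """Merge near-duplicate check-ins of ONE user.

    records: iterable of (t_seconds, lat, lon) tuples.
    """
    merged, group = [], []
    for rec in sorted(records):
        close_in_time = bool(group) and rec[0] - group[-1][0] < MAX_GAP_S
        if close_in_time and _fits_box(group + [rec]):
            group.append(rec)
            continue
        if group:
            merged.append(_centroid(group))
        group = [rec]
    if group:
        merged.append(_centroid(group))
    return merged


def is_active(n_friends, n_locations, min_friends=10, min_locations=5):
    """Keep users with at least 10 friends and at least 5 check-in locations."""
    return n_friends >= min_friends and n_locations >= min_locations
```

Other reasonable readings of the merging rule exist (for example, a fixed spatial grid instead of a moving bounding box), so counts produced by this sketch need not match the figures reported above.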
Data format:
Gowalla_sample_users: [user ID] [number of friends] [number of check-in locations]
Gowalla_sample_edges: [user1 ID] [user2 ID]
Gowalla_sample_checkins: [user ID] [check-in time] [latitude] [longitude]
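As a reading aid, the three record layouts above could be parsed along these lines. This is a minimal sketch that assumes whitespace-separated fields, which the description does not state. The check-in time is kept as a plain string because its exact format is not specified here, and the parser treats everything between the user ID and the last two fields as the time so that a timestamp containing a space still works. The type and function names are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class User(NamedTuple):
    user_id: str
    n_friends: int
    n_locations: int


class CheckIn(NamedTuple):
    user_id: str
    time: str      # kept as text; the timestamp format is not assumed
    lat: float
    lon: float


def parse_users(lines: Iterable[str]) -> List[User]:
    """[user ID] [number of friends] [number of check-in locations]"""
    users = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        users.append(User(parts[0], int(parts[1]), int(parts[2])))
    return users


def parse_edges(lines: Iterable[str]) -> Set[Tuple[str, str]]:
    """[user1 ID] [user2 ID]; the network is undirected, so each pair is
    stored once in sorted order."""
    edges = set()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        a, b = sorted(parts[:2])
        edges.add((a, b))
    return edges


def parse_checkins(lines: Iterable[str]) -> List[CheckIn]:
    """[user ID] [check-in time] [latitude] [longitude]"""
    checkins = []
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue
        time = " ".join(parts[1:-2])
        checkins.append(CheckIn(parts[0], time, float(parts[-2]), float(parts[-1])))
    return checkins
```

Each function accepts any iterable of lines, so an open file object can be passed directly, for example parse_users(open(path)) with path pointing at the users file.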