reddit / data

Posts: 10,600
Comments: 100,000
Authors: 14,796

Downloads

Schema

posts.csv

post_id, subreddit, date_str, year_month, day_of_week, hour, author, title, selftext, selftext_len, score, num_comments, url, has_image, flair, over_18

comments.csv

id, subreddit, date_str, year_month, day_of_week, hour, author, body, body_len, score, is_top_level, post_id, parent_comment_id

Use post_id to group by thread. Use parent_comment_id to reconstruct reply chains.
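
The grouping and chain reconstruction described above can be sketched with the Python standard library. This is a minimal illustration, not part of the dataset's tooling. It assumes that parent_comment_id holds the id of another row in comments.csv, and that top-level comments have an empty parent_comment_id (or one that matches no comment id). The function names load_comments, group_by_thread, and reply_chain are hypothetical.

```python
import csv
from collections import defaultdict


def load_comments(path):
    """Read comments.csv into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def group_by_thread(comments):
    """Map each post_id to the list of comments in that thread."""
    threads = defaultdict(list)
    for c in comments:
        threads[c["post_id"]].append(c)
    return dict(threads)


def reply_chain(comment_id, by_id):
    """Follow parent_comment_id links from a comment up to its root.

    Returns comment ids ordered from the top-level comment down to
    comment_id. Assumption: a top-level comment has an empty
    parent_comment_id, or one that is not present in by_id.
    """
    chain = []
    seen = set()  # guards against accidental cycles in the data
    current = comment_id
    while current and current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("parent_comment_id") or ""
    return chain[::-1]
```

To use it, load the file with load_comments("comments.csv"), build a lookup such as by_id = {c["id"]: c for c in comments}, and then call group_by_thread(comments) or reply_chain(some_id, by_id). If parent_comment_id turns out to carry a prefix or to point at the post for top-level comments, the stop condition in reply_chain would need adjusting.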

authors.csv

author, item_count, avg_score, total_score, avg_body_len, source