Full Reddit Submission Corpus now available for 2006 thru August 2015
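Reddit is a social news site. Its full corpus of submissions from 2006 through August 2015 is now available for download. The compressed size is 42,674,151,378 bytes (about 42.7 GB).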
https://www.reddit.com/r/datasets/comments/3mg812/full_reddit_submission_corpus_now_available_2006/