Sunbelt is an app that collects and aggregates text data from sources such as Reddit, Twitter, MediaWiki, and Common Crawl so that the data can be integrated into machine learning and analytics pipelines. The arrival of advanced, high-availability machine learning language models offered as a service has multiplied both the interest in natural language data and its potential use cases, and Sunbelt aims to give organizations quick access to large natural language datasets.
How Sunbelt works
sun·belt
noun: a strip of territory receiving a higher amount of sunshine than its surrounding territory
Sunbelt offers a deeper view into a subset of historical Reddit data than any other platform available. Other services, such as Pushshift and Reveddit, either store posts and comments as they appear immediately after being posted or give users a new way to see live data on Reddit. Sunbelt instead stores information about how posts and comments have changed over time. This is where Sunbelt gets its name: the sun makes things visible, and the landscape of currently available data sources is shrouded in darkness compared to what Sunbelt offers.
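
To make the idea of storing how a post or comment changes over time more concrete, here is a minimal sketch in Python. It is not Sunbelt's actual schema or code. The names Snapshot, History, record, and body_edits, and the fields observed_at, body, and score, are all hypothetical and chosen only for illustration. The sketch assumes an item is observed repeatedly and a new snapshot is kept only when something about it has changed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One observation of a post or comment (hypothetical fields)."""
    observed_at: int  # Unix timestamp of when the item was observed
    body: str         # text of the item at that moment
    score: int        # score of the item at that moment


@dataclass
class History:
    """All distinct observed states of a single item, oldest first."""
    item_id: str
    snapshots: list = field(default_factory=list)

    def record(self, snap: Snapshot) -> bool:
        """Store snap only if body or score differs from the latest state.

        Returns True when a new snapshot was stored, False otherwise.
        """
        if self.snapshots:
            last = self.snapshots[-1]
            if (last.body, last.score) == (snap.body, snap.score):
                return False
        self.snapshots.append(snap)
        return True

    def body_edits(self) -> list:
        """Return (before, after) pairs where the text itself changed."""
        pairs = zip(self.snapshots, self.snapshots[1:])
        return [(a, b) for a, b in pairs if a.body != b.body]


# Usage: observe the same comment four times.
history = History("t1_example")
history.record(Snapshot(100, "hello", 1))      # first observation, stored
history.record(Snapshot(200, "hello", 1))      # unchanged, skipped
history.record(Snapshot(300, "hello", 5))      # score changed, stored
history.record(Snapshot(400, "[deleted]", 5))  # body changed, stored
```

A store that captures an item only once would keep just the first of these states. Keeping every distinct state is what allows questions such as "when was this comment edited or removed?" to be answered later.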