Reddit Ingest

A distributed server/client setup for ingesting all of Reddit.

  • Scales to multiple clients.
  • Supports Reddit authentication.
  • Tolerates clients losing state or going offline.

# ...install PostgreSQL...
pip install -r requirements.txt
# ...modify example yamls...
mv batcher_config.example.yaml batcher_config.yaml
mv fetcher_config.example.yaml fetcher_config.yaml
mkdir logs
bash compile_proto.sh

# Run one instance of batcher.py:
python batcher.py

# And several instances of fetcher.py:
python fetcher.py
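
To illustrate how a single batcher can feed several fetchers and tolerate one of them going offline, here is a minimal in-memory sketch of lease-based batch assignment. It is not the code in batcher.py: the `Batcher` class, its method names, the integer ID ranges, and the lease timeout are all assumptions made for illustration, and the real project keeps its state in PostgreSQL rather than in memory.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Batcher:
    """Hypothetical batcher: hands out ID ranges and re-issues expired ones."""
    batch_size: int = 100
    lease_seconds: float = 60.0
    next_start: int = 0
    leases: dict = field(default_factory=dict)  # (start, end) -> lease deadline
    retry: list = field(default_factory=list)   # expired ranges awaiting re-issue

    def request_batch(self, now=None):
        """Called by a fetcher that wants work; returns a (start, end) range."""
        now = time.monotonic() if now is None else now
        # Reclaim ranges whose fetcher never reported back in time.
        for rng, deadline in list(self.leases.items()):
            if deadline <= now:
                del self.leases[rng]
                self.retry.append(rng)
        if self.retry:
            rng = self.retry.pop(0)
        else:
            rng = (self.next_start, self.next_start + self.batch_size)
            self.next_start += self.batch_size
        self.leases[rng] = now + self.lease_seconds
        return rng

    def complete_batch(self, rng):
        """Called by a fetcher when a range is done; unknown ranges are harmless."""
        self.leases.pop(rng, None)


batcher = Batcher(batch_size=10, lease_seconds=5)
first = batcher.request_batch(now=0)    # fetcher A takes a range, then goes offline
second = batcher.request_batch(now=1)   # fetcher B takes the next range
batcher.complete_batch(second)          # fetcher B reports success
reissued = batcher.request_batch(now=6) # A's lease has expired, so its range is handed out again
```

Because a fetcher holds only a lease, it can lose all local state without losing work: the range it was holding simply goes back into the queue once the deadline passes.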

Getting a refresh token: https://praw.readthedocs.io/en/stable/tutorials/refresh_token.html#obtaining-refresh-tokens
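
Once you have a refresh token, PRAW can authenticate with it directly. The sketch below shows one way to pass it to `praw.Reddit`; the environment variable names, the default user agent, and the helper functions are hypothetical, and this project may instead read these values from fetcher_config.yaml.

```python
import os


def reddit_kwargs(env=os.environ):
    """Collect the keyword arguments praw.Reddit accepts for refresh-token auth.

    The environment variable names are assumptions made for this example.
    """
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "refresh_token": env["REDDIT_REFRESH_TOKEN"],
        "user_agent": env.get("REDDIT_USER_AGENT", "reddit-ingest-example/0.1"),
    }


def make_reddit(env=os.environ):
    """Build an authenticated praw.Reddit instance from the values above."""
    import praw  # imported lazily so reddit_kwargs works without PRAW installed
    return praw.Reddit(**reddit_kwargs(env))
```

Calling `make_reddit()` with the three credentials set in the environment returns a client that refreshes its access token as needed, so no password has to be stored.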