# Reddit Ingest
Distributed server/client setup for ingesting all of Reddit.
- Scales to multiple clients.
- Supports Reddit authentication.
- Tolerant of clients losing state or going offline.
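
One common way to get this kind of fault tolerance is to hand out work under a time-limited lease, so that a batch held by a client that disappears returns to the queue. The sketch below illustrates that idea only. It is not the code in `batcher.py`, and `LeaseBatcher`, `request_batch`, `submit`, and `lease_seconds` are hypothetical names chosen for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class Lease:
    client_id: str
    expires_at: float


@dataclass
class LeaseBatcher:
    """Hypothetical in-memory allocator; NOT the repository's batcher.py."""

    pending: List[int]                           # batch ids not yet handed out
    lease_seconds: float = 60.0                  # assumed timeout, tune as needed
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    leases: Dict[int, Lease] = field(default_factory=dict)
    done: Set[int] = field(default_factory=set)

    def _reclaim_expired(self) -> None:
        # Return batches whose lease has run out to the pending queue.
        now = self.clock()
        for batch_id, lease in list(self.leases.items()):
            if lease.expires_at <= now:
                del self.leases[batch_id]
                self.pending.append(batch_id)

    def request_batch(self, client_id: str) -> Optional[int]:
        # Hand the next pending batch to a client, or None if nothing is free.
        self._reclaim_expired()
        if not self.pending:
            return None
        batch_id = self.pending.pop(0)
        self.leases[batch_id] = Lease(client_id, self.clock() + self.lease_seconds)
        return batch_id

    def submit(self, client_id: str, batch_id: int) -> bool:
        # Accept a result only from the client that currently holds the lease.
        lease = self.leases.get(batch_id)
        if lease is None or lease.client_id != client_id:
            return False
        del self.leases[batch_id]
        self.done.add(batch_id)
        return True
```

Because the allocator tracks leases itself, a client needs no persistent state in this sketch: after a restart it simply asks for another batch.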
```bash
# ...install PostgreSQL...
pip install -r requirements.txt
# ...modify example yamls...
mv batcher_config.example.yaml batcher_config.yaml
mv fetcher_config.example.yaml fetcher_config.yaml
mkdir logs
bash compile_proto.sh
# Run one instance of batcher.py:
python batcher.py
# And several instances of fetcher.py:
python fetcher.py
```
Getting a Reddit refresh token (see the PRAW tutorial):
https://praw.readthedocs.io/en/stable/tutorials/refresh_token.html#obtaining-refresh-tokens
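
For orientation, the first step of that flow is sending the user to Reddit's OAuth2 authorization page with `duration=permanent`, which is what makes Reddit issue a refresh token. The sketch below only builds that URL using the standard library. `build_authorize_url` is a hypothetical helper, and the client ID, redirect URI, and scopes are placeholders; the linked tutorial does the full exchange through PRAW.

```python
import secrets
from typing import Iterable, Optional
from urllib.parse import urlencode

# Reddit's OAuth2 authorization endpoint.
AUTHORIZE_URL = "https://www.reddit.com/api/v1/authorize"


def build_authorize_url(
    client_id: str,
    redirect_uri: str,
    scopes: Iterable[str],
    state: Optional[str] = None,
) -> str:
    """Hypothetical helper: build the URL a user visits to grant access."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        # Random state lets the callback detect forged responses.
        "state": state or secrets.token_urlsafe(16),
        "redirect_uri": redirect_uri,
        # "permanent" requests a refresh token, not just a short-lived token.
        "duration": "permanent",
        "scope": " ".join(scopes),
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values; substitute your own app's settings.
    print(build_authorize_url("YOUR_CLIENT_ID", "http://localhost:8080", ["read"]))
```

After the user approves, Reddit redirects to the redirect URI with a `code` parameter, which is then exchanged for the refresh token.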