README

You need docker-compose installed.

  • If you do not want to use IPv6, disable it under the networks: section of the docker-compose file by commenting out the IPv6 subnet, gateway, and enable lines.

Before you start:

You should have a Pastebin Pro API membership, and you will need to whitelist your IP address. I have had success with both IPv4 and IPv6 addresses. Together these give you access to the scraping API: https://pastebin.com/doc_scraping_api

  1. Create a .pastebin_creds file that contains the following environment variables:

  • pastebin_api_key
  • pastebin_username
  • pastebin_password

This stores the credentials in a file that is .gitignored and allows the application to scrape paste data.
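
As a rough illustration of what the credentials file might look like, the sketch below assumes it uses plain KEY=value lines, the format docker-compose accepts for env files. The README does not state the exact format, so treat that as an assumption. The helper names (`parse_env_text`, `missing_keys`) and the placeholder values are hypothetical and not part of pastebinner.

```python
# Hypothetical helper, not part of pastebinner: checks that a credentials
# file defines the three variables listed above.
# Assumption: the file uses KEY=value lines, as docker-compose env files do.

REQUIRED_KEYS = ("pastebin_api_key", "pastebin_username", "pastebin_password")


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text):
    """Return the required keys that are absent or have an empty value."""
    values = parse_env_text(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder contents for .pastebin_creds; replace with your own values.
SAMPLE = """\
pastebin_api_key=your-api-key
pastebin_username=your-username
pastebin_password=your-password
"""
```

Calling `missing_keys(open(".pastebin_creds").read())` before starting the containers returns an empty list when all three variables are set, and otherwise names the ones that still need a value.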

To use:

docker-compose up

This will create the following containers and services:

  • pastebinner-rails
  • pastebinner-elasticsearch
  • pastebinner-redis
  • pastebinner-kibana
  • pastebinner-sidekiq

Interacting:

You can access the Kibana search interface at https://localhost:5601. On your first visit you will need to create the pastes index pattern.

The application should then scrape public pastes every minute. Paste keys are stored in Redis, so a paste that has already been seen is not retrieved again and no duplicates are sent to Elasticsearch.

To view the status of jobs, visit the Sidekiq dashboard at http://localhost:3000/sidekiq. To view the status of the worker job, check the Sidekiq logs with:

docker-compose logs pastebinner-sidekiq
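
The duplicate check described above can be sketched roughly as follows. This is not the application's actual code (pastebinner is a Rails and Sidekiq app); it is a minimal Python illustration that uses an in-memory set as a stand-in for Redis. The class and function names are hypothetical, and the assumption that each scraped item carries a `key` field is mine, not the README's.

```python
# Illustrative sketch only: an in-memory stand-in for the Redis store of
# paste keys. All names here are hypothetical, not pastebinner's own.


class SeenKeys:
    """Remembers which paste keys have already been processed."""

    def __init__(self):
        self._keys = set()

    def add_if_new(self, key):
        """Record the key; return True only the first time it is seen."""
        if key in self._keys:
            return False
        self._keys.add(key)
        return True


def select_new_pastes(listing, seen):
    """Keep only the pastes whose key has not been recorded before.

    `listing` is assumed to be a list of dicts with a "key" entry.
    """
    return [paste for paste in listing if seen.add_if_new(paste["key"])]


# Two consecutive scrape runs that overlap on key "b".
seen = SeenKeys()
first_run = select_new_pastes([{"key": "a"}, {"key": "b"}], seen)
second_run = select_new_pastes([{"key": "b"}, {"key": "c"}], seen)
```

Here `first_run` holds pastes `a` and `b`, while `second_run` holds only `c`, since `b` was already recorded. With a real Redis set the same test-and-record step can be a single `SADD` call, which returns 1 only when the member was newly added. Whether pastebinner stores its keys in a set or as individual keys is not stated in this README.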