
data_importer

This is a Rails/Postgres application that serves JSON data from the following data sources:

  • CVEs
  • CPEs
  • CNA security advisories
  • GHSA (GitHub security advisories)
  • GitHub repositories that track public exploits for CVEs
  • GitHub API data for a list of GitHub usernames

See the HTTP API section below for the specific endpoints that can be queried over HTTP.

Supported data models:

Initial Setup

Environment files

Create a file named credentials.env containing the environment variables needed to log in to the APIs:

# Twitter stuff doesn't work right now.
# twitter_bearer_token=
# twitter_api_key=
# twitter_access_token_secret=
# twitter_access_token=
# twitter_api_key_secret=

github_api_token=
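
To sanity-check a credentials.env file before building, the KEY=value format above can be parsed with a few lines of Ruby. This is only a sketch: `parse_env` is a hypothetical helper, not part of this repository, and how the containers actually load the file is not shown in this README.

```ruby
# Hypothetical helper (not part of data_importer): parse KEY=value lines
# from a credentials.env-style string, skipping blank lines and # comments.
def parse_env(text)
  text.each_line.with_object({}) do |line, vars|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    # Split on the first "=" only, so values may themselves contain "=".
    key, value = line.split("=", 2)
    vars[key.strip] = value.to_s.strip
  end
end

sample = <<~ENV
  # twitter_bearer_token=
  github_api_token=example-token
ENV

vars = parse_env(sample)
```

With the sample above, `vars` contains only the `github_api_token` entry, because the commented-out Twitter line is skipped.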

Build container

docker-compose build

Database creation and seeding initial data

docker-compose run web rake db:create
docker-compose run web rake db:migrate
docker-compose run web rake db:seed

Running faktory

# Launch containers
docker-compose up -d

Visit http://localhost:7420 in a web browser to open the Faktory web UI.

Scheduling import jobs

A default crontab.yaml is provided with a reasonable schedule. It uses faktory_cron to schedule importer worker jobs and ship them to Faktory.

Launch Pry console

docker-compose run web rails console

HTTP API

For now, the API is unauthenticated and served over localhost:3000 until I add some basic token auth. All response data is rendered as JSON.
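
A minimal client sketch using only the Ruby standard library, assuming the API is reachable at http://localhost:3000 as stated above. `BASE_URL`, `endpoint_uri`, and `fetch_json` are hypothetical names chosen for illustration, and the shape of the returned JSON is not specified in this README.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed from the README; adjust if your port mapping differs.
BASE_URL = "http://localhost:3000"

# Hypothetical helper: build the full URI for an API path such as "/cves".
def endpoint_uri(path)
  URI.join(BASE_URL, path)
end

# Hypothetical helper: GET a path and parse the JSON response body.
# Raises if the server does not answer with a 2xx status.
def fetch_json(path)
  response = Net::HTTP.get_response(endpoint_uri(path))
  raise "request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body)
end
```

With the containers running, a call such as `fetch_json("/cves/years/2021")` would return the parsed response for that route.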

Cves

  get "/cves", to: "cves#index"
  get "/cves/:cve_id", to: "cves#show"
  get "/cves/years/:year", to: "cves#show_year"
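
CVE IDs embed a year (CVE-YYYY-NNNN, with four or more digits in the sequence number), which is handy when choosing a value for the :year route on the client side. The sketch below only illustrates that ID format. `cve_year` is a hypothetical helper, and whether the app itself derives the year from the ID or from another field is not stated here.

```ruby
# Standard CVE ID shape: "CVE-", a 4-digit year, then 4 or more digits.
CVE_ID_PATTERN = /\ACVE-(\d{4})-\d{4,}\z/

# Hypothetical helper: return the year embedded in a CVE ID as an Integer,
# or nil when the string is not a well-formed CVE ID.
def cve_year(cve_id)
  match = CVE_ID_PATTERN.match(cve_id)
  match && match[1].to_i
end
```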

Cpes

  get "/cpes", to: "cpes#index"
  get "/cpes/:id", to: "cpes#show"

Cnas

  get "/cnas", to: "cnas#index"
  get "/cnas/:id", to: "cnas#show"
  get "/cnas/cna/:cna_id", to: "cnas#show_for_cna"

GithubAdvisories

  get "/github_advisories", to: "github_advisories#index"
  get "/github_advisories/:ghsa_id", to: "github_advisories#show"

GithubUsers

Create a text file named ./data/github_usernames.txt with one username per line. A seed task reads this file, calls the GitHub API for each user, and stores the returned data in the DB. The API calls use the following GraphQL endpoints:

  • User. Note: the following keys are returned: github_id, login, name, avatar_url, bio, bio_html, location
  • RepositoryInfo. Note: an array of the user's public repositories is returned.

  get "/github_users", to: "github_users#index"
  get "/github_users/:username", to: "github_users#show"
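
The one-username-per-line file format described above can be read as in the following sketch. `parse_usernames` is a hypothetical helper and not the repository's actual seed task. Trimming whitespace, skipping blank lines, and dropping duplicates are assumptions about what a tolerant reader might do.

```ruby
# Hypothetical helper (not the actual seed task): turn the contents of
# github_usernames.txt into a clean list of usernames.
def parse_usernames(text)
  text.each_line.map(&:strip).reject(&:empty?).uniq
end

sample = "octocat\n\n  example-user  \noctocat\n"
usernames = parse_usernames(sample)
```

Against the real file, this would be called as `parse_usernames(File.read("./data/github_usernames.txt"))`.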

GithubPocs

  get "/github_pocs", to: "github_pocs#index"
  get "/github_pocs/:id", to: "github_pocs#show"
  get "/github_pocs/cve/:cve_id", to: "github_pocs#show_for_cve"
  get "/github_pocs/years/:year", to: "github_pocs#show_year"

InthewildCveExploits

  get "/inthewild_cve_exploits", to: "inthewild_cve_exploits#index"
  get "/inthewild_cve_exploits/:cve_id", to: "inthewild_cve_exploits#show"

TrickestPocCves

  get "/trickest_poc_cves", to: "trickest_poc_cves#index"
  get "/trickest_poc_cves/:id", to: "trickest_poc_cves#show"
  get "/trickest_poc_cves/cve/:cve_id", to: "trickest_poc_cves#show_for_cve"
  get "/trickest_poc_cves/years/:year", to: "trickest_poc_cves#show_year"

CvemonCves

  get "/cvemon_cves", to: "cvemon_cves#index"
  get "/cvemon_cves/:id", to: "cvemon_cves#show"
  get "/cvemon_cves/cve/:cve_id", to: "cvemon_cves#show_for_cve"
  get "/cvemon_cves/years/:year", to: "cvemon_cves#show_year"