# data_importer

This is a Rails/Postgres application that serves JSON data from the following data sources:
- CVEs
- CPEs
- CNA security advisories
- GHSA GitHub security advisories
- GitHub repositories that track public exploits for CVEs
- GitHub API data for a list of GitHub usernames

Check the HTTP API section below for the specific endpoints that can be queried over HTTP.

## Supported data models:
- `Cve` data from the [cve_list](https://github.com/CVEProject/cvelist) GitHub repo.
- `Cpe` data from [nvd](https://nvd.nist.gov/products/cpe) in 2.2 format.
- `Cna` data from [mitre](https://raw.githubusercontent.com/CVEProject/cve-website/dev/src/assets/data/CNAsList.json).
- `GithubPoc` data from the [nomi-sec](https://github.com/nomi-sec/PoC-in-GitHub) GitHub repo.
- `GithubAdvisory` data from the [github_advisories_database](https://github.com/github/advisory-database/) GitHub repo.
- `GithubUser` data from the [github_graphql_api](https://docs.github.com/en/graphql).
- `InthewildCveExploit` data from the [inthewild.io](https://inthewild.io/api/exploited) exploited feed.
- `TrickestPocCve` data from the [trickest](https://github.com/trickest/cve) GitHub repo.
- `CvemonCve` data from the [ARPSyndicate](https://raw.githubusercontent.com/ARPSyndicate/cvemon/main/data.json) GitHub repo.

## Initial Setup

### Environment files
Create the following file to hold the environment variables needed to log in to the APIs:

`credentials.env`
```
# Twitter support doesn't work right now.
# twitter_bearer_token=
# twitter_api_key=
# twitter_access_token_secret=
# twitter_access_token=
# twitter_api_key_secret=

github_api_token=
```

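Before building the containers, it can help to confirm that `credentials.env` actually has a value for `github_api_token`. The following is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helper names, and the parser assumes plain `KEY=VALUE` lines with `#` comments, as in the file above.

```ruby
# Hypothetical helper (not part of data_importer): parse KEY=VALUE lines
# from the text of an env file, skipping blank lines and # comments.
def parse_env(text)
  text.each_line.with_object({}) do |line, vars|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    vars[key.strip] = value.to_s.strip
  end
end

# Return the names of required keys that are missing or have no value.
# Assumption: github_api_token is the only key currently required.
def missing_keys(vars, required = ["github_api_token"])
  required.select { |key| vars[key].to_s.empty? }
end
```

For a real check, pass the file contents, for example `missing_keys(parse_env(File.read("credentials.env")))`; an empty array means every required key has a value.
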
### Build container
`docker-compose build`

### Database creation and seeding initial data
```
docker-compose run web rake db:create
docker-compose run web rake db:migrate
docker-compose run web rake db:seed
```

### Running faktory
```
# Launch containers
docker-compose up -d
```
Visit http://localhost:7420 in a web browser for the Faktory web UI.

### Scheduling import jobs
A default `crontab.yaml` is provided with a reasonable schedule. It uses [faktory_cron](https://github.com/cdrx/faktory_cron) to schedule importer worker jobs and ship them to Faktory.

### Launch Pry console
`docker-compose run web rails console`

### HTTP API
For now, the API is unauthenticated and served over localhost:3000; basic token auth is planned. All responses are rendered as JSON.

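As an illustration of querying the API from Ruby, here is a minimal sketch using only the standard library. The helper names `endpoint_uri` and `fetch_json` are hypothetical, and the base URL assumes the app is reachable at the localhost:3000 address mentioned above.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: the web container is reachable at this address.
BASE_URL = "http://localhost:3000"

# Build the full URI for an API path such as "/cves".
def endpoint_uri(path, base_url = BASE_URL)
  URI.join(base_url, path)
end

# Fetch a path and parse the JSON body.
def fetch_json(path, base_url = BASE_URL)
  response = Net::HTTP.get_response(endpoint_uri(path, base_url))
  response.value # raises an HTTP error unless the status is 2xx
  JSON.parse(response.body)
end

# Example (requires the containers to be running):
#   cves = fetch_json("/cves")
```

The same helper works for any of the routes listed in the subsections below, since they are all plain GET requests that return JSON.
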
#### Cves
```
get "/cves", to: "cves#index"
get "/cves/:cve_id", to: "cves#show"
get "/cves/years/:year", to: "cves#show_year"
```

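The routes above can be turned into request paths with a small helper. This is only a sketch under two assumptions that the routes themselves do not confirm: that `:cve_id` takes the full identifier (for example `CVE-2021-44228`) and that `:year` is a four-digit year. The names `cve_path` and `cve_year_path` are hypothetical.

```ruby
# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_ID_PATTERN = /\ACVE-(\d{4})-\d{4,}\z/

# Path for a single CVE; assumes :cve_id takes the full ID string.
def cve_path(cve_id)
  raise ArgumentError, "invalid CVE ID: #{cve_id}" unless CVE_ID_PATTERN.match?(cve_id)

  "/cves/#{cve_id}"
end

# Path listing the CVEs for the year embedded in a CVE ID;
# assumes :year is that four-digit year.
def cve_year_path(cve_id)
  match = CVE_ID_PATTERN.match(cve_id)
  raise ArgumentError, "invalid CVE ID: #{cve_id}" unless match

  "/cves/years/#{match[1]}"
end

puts cve_path("CVE-2021-44228")      # prints /cves/CVE-2021-44228
puts cve_year_path("CVE-2021-44228") # prints /cves/years/2021
```
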
#### Cpes
```
get "/cpes", to: "cpes#index"
get "/cpes/:id", to: "cpes#show"
```

#### Cnas
```
get "/cnas", to: "cnas#index"
get "/cnas/:id", to: "cnas#show"
get "/cnas/cna/:cna_id", to: "cnas#show_for_cna"
```

#### GithubAdvisories
```
get "/github_advisories", to: "github_advisories#index"
get "/github_advisories/:ghsa_id", to: "github_advisories#show"
```

#### GithubUsers
Create a text file named `./data/github_usernames.txt` with one username per line.

A seed task reads this file, calls the GitHub API for each user, and stores the returned data in the database. The calls use the following GraphQL objects and interfaces:
- [User](https://docs.github.com/en/graphql/reference/objects#user). Note: the following keys are returned: `github_id`, `login`, `name`, `avatar_url`, `bio`, `bio_html`, `location`.
- [RepositoryInfo](https://docs.github.com/en/graphql/reference/interfaces#repositoryinfo). Note: an array of the user's public repositories is returned.
```
get "/github_users", to: "github_users#index"
get "/github_users/:username", to: "github_users#show"
```

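To make the seed step described above more concrete, the sketch below shows one way to read the usernames file and build a GraphQL request body for a single user. It is an illustration, not the project's actual seed task: `parse_usernames`, `USER_QUERY`, and `user_query_payload` are hypothetical names, the repository fields and the page size of 50 are arbitrary choices, and mapping the GraphQL `id` field to the stored `github_id` key is an assumption.

```ruby
require "json"

# Read one username per line, ignoring blank lines and surrounding whitespace.
def parse_usernames(text)
  text.each_line.map(&:strip).reject(&:empty?)
end

# GraphQL document requesting User fields plus the user's public repositories.
# Assumption: these fields correspond to the keys listed above.
USER_QUERY = <<~GRAPHQL
  query($login: String!) {
    user(login: $login) {
      id
      login
      name
      avatarUrl
      bio
      bioHTML
      location
      repositories(first: 50, privacy: PUBLIC) {
        nodes { name description url }
      }
    }
  }
GRAPHQL

# JSON body to POST to https://api.github.com/graphql, sent with an
# "Authorization: bearer <github_api_token>" header.
def user_query_payload(login)
  JSON.generate(query: USER_QUERY, variables: { login: login })
end
```

A caller would loop over `parse_usernames(File.read("./data/github_usernames.txt"))` and POST one payload per username, using the `github_api_token` value from `credentials.env`.
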
#### GithubPocs
```
get "/github_pocs", to: "github_pocs#index"
get "/github_pocs/:id", to: "github_pocs#show"
get "/github_pocs/cve/:cve_id", to: "github_pocs#show_for_cve"
get "/github_pocs/years/:year", to: "github_pocs#show_year"
```

#### InthewildCveExploits
```
get "/inthewild_cve_exploits", to: "inthewild_cve_exploits#index"
get "/inthewild_cve_exploits/:cve_id", to: "inthewild_cve_exploits#show"
```

#### TrickestPocCves
```
get "/trickest_poc_cves", to: "trickest_poc_cves#index"
get "/trickest_poc_cves/:id", to: "trickest_poc_cves#show"
get "/trickest_poc_cves/cve/:cve_id", to: "trickest_poc_cves#show_for_cve"
get "/trickest_poc_cves/years/:year", to: "trickest_poc_cves#show_year"
```

#### CvemonCves
```
get "/cvemon_cves", to: "cvemon_cves#index"
get "/cvemon_cves/:id", to: "cvemon_cves#show"
get "/cvemon_cves/cve/:cve_id", to: "cvemon_cves#show_for_cve"
get "/cvemon_cves/years/:year", to: "cvemon_cves#show_year"
```