Backend optimizations and improvements, TAXII interop filters #175

Open
wants to merge 3 commits into
base: master
19 changes: 6 additions & 13 deletions .github/workflows/python-ci-tests.yml
@@ -9,32 +9,25 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
python-version: [3.7, 3.8, 3.9, '3.10']

name: Python ${{ matrix.python-version }} Build
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3.3.0
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4.5.0
with:
python-version: ${{ matrix.python-version }}
- name: Start MongoDB
uses: supercharge/[email protected]
with:
mongodb-version: 4.0
uses: supercharge/[email protected]
- name: Install and update essential dependencies
run: |
pip install -U pip setuptools
pip install tox-gh-actions
pip install codecov
- name: Create test user
run: |
mongo admin --eval 'db.createUser({user:"travis",pwd:"test",roles:[{role:"root",db:"admin"}]});'
- name: Test with Tox
run: |
tox
run: tox
- name: Upload coverage information to Codecov
uses: codecov/codecov-action@v1
uses: codecov/codecov-action@v3.1.1
with:
fail_ci_if_error: true # optional (default = false)
verbose: true # optional (default = false)
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -1,18 +1,18 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.4.0
rev: v4.4.0
hooks:
- id: trailing-whitespace
- id: check-merge-conflict
- repo: https://github.com/PyCQA/flake8
rev: 3.8.4
rev: 6.0.0
hooks:
- id: flake8
name: Check project styling
args:
- --max-line-length=160
- repo: https://github.com/PyCQA/isort
rev: 5.7.0
rev: 5.12.0
hooks:
- id: isort
name: Sort python imports (shows diff)
43 changes: 21 additions & 22 deletions README.rst
@@ -113,23 +113,26 @@ The <config_file> contains:

To use the Memory back-end plug, include the following in the <config-file>:

.. code-block:: json
.. code-block:: text

{
"backend": {
"module_class": "MemoryBackend",
"filename": "<path to json file with initial data>"
"filename": <path to json file with initial data>,
"interop_requirements": true/false # the TAXII interop document has some additional requirements
}
}

To use the Mongo DB back-end plug, include the following in the <config-file>:

.. code-block:: json
.. code-block:: text

{
"backend": {
"module_class": "MongoBackend",
"uri": "<Mongo DB server url> # e.g., 'mongodb://localhost:27017/'"
"uri": <Mongo DB server url>, # e.g., 'mongodb://localhost:27017/'
"filename": <path to json file with initial data>,
"interop_requirements": true/false # the TAXII interop document has some additional requirements
}
}
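
Because the blocks above are now marked ``text`` (they carry placeholders and ``#`` comments that are not valid JSON), it may help to see one way of producing a real config file. The following is a minimal sketch, not part of the PR: ``build_backend_config`` is a hypothetical helper, and the file path and URI passed to it are placeholder values.

```python
import json


def build_backend_config(module_class, filename, uri=None, interop_requirements=False):
    """Build the "backend" section described above as a plain dict.

    Hypothetical helper for illustration only; the key names are taken
    from the README text, everything else is an assumption.
    """
    backend = {
        "module_class": module_class,
        "filename": filename,
        "interop_requirements": interop_requirements,
    }
    if uri is not None:
        # Only the Mongo DB back-end takes a server URI.
        backend["uri"] = uri
    return {"backend": backend}


# Placeholder values; substitute your own path and server URL.
config = build_backend_config(
    "MongoBackend",
    "/path/to/initial_data.json",
    uri="mongodb://localhost:27017/",
)

# json.dumps emits strict JSON: quoted strings, lowercase booleans, no comments.
config_text = json.dumps(config, indent=4)
print(config_text)
```

Writing ``config_text`` to a file gives a config that a strict JSON parser accepts, which the annotated ``text`` blocks above would not.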

@@ -138,13 +141,16 @@ To use the Mongo DB back-end plug, include the following in the <config-file>:
A description of the Mongo DB structure expected by the mongo db backend code is
described in `the documentation <https://medallion.readthedocs.io/en/latest/mongodb_schema.html>`_.

The ``interop_requirements`` option will enforce additional requirements from
the TAXII 2.1 Interoperability specification. It defaults to ``false``.

As required by the TAXII specification, *medallion* supports HTTP Basic
authorization. However, the user names and passwords are currently stored in
the <config_file> in plain text.

Here is an example:

.. code-block:: json
.. code-block:: text

{
"users": {
@@ -161,43 +167,38 @@ Authorization could be enhanced by changing the method "decorated" using

Configs may also contain a "taxii" section as well, as shown below:

.. code-block:: json
.. code-block:: text

{
"taxii": {
"max_page_size": 100
"interop_requirements": true
}
}

All TAXII servers require a config, though if any of the sections specified above
are missing, they will be filled with default values.

The ``interop_requirements`` option will enforce additional requireemnts from
the TAXII 2.1 Interoperability specification. It defaults to ``false``.

We welcome contributions for other back-end plugins.

Docker
------

We also provide a Docker image to make it easier to run *medallion*
We also provide a Docker image to make it easier to run *medallion* with the MongoDB backend. Use the --build argument
if the code has changed.

.. code-block:: bash

$ docker build . -t medallion -f docker_utils/Dockerfile
$ docker-compose up [--build]

If operating behind a proxy, add the following option (replacing `<proxy>` with
your proxy location and port): ``--build-arg https_proxy=<proxy>``.
This uses the information in docker-compose.yml to create a Docker container with medallion, Mongo DB, and mongo-express.

Then run the image
If operating behind a proxy, add the following to the medallion:build section of docker-compose.yml:

.. code-block:: bash
.. code-block:: text

$ docker run --rm -p 5000:5000 -v <directory>:/var/taxii medallion
HTTPS_PROXY: <proxy>

Replace ``<directory>`` with the full path to the directory containing your
medallion configuration.
replacing <proxy> with your proxy location and port.

Governance
----------
@@ -249,10 +250,8 @@ additional or substitute Maintainers, per `consensus agreements <https://www.oas
Current Maintainers of this TC Open Repository
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- `Chris Lenk <mailto:[email protected]>`__; GitHub ID: https://github.com/clenk/; WWW: `MITRE Corporation <https://www.mitre.org/>`__
- `Rich Piazza <mailto:[email protected]>`__; GitHub ID: https://github.com/rpiazza/; WWW: `MITRE Corporation <https://www.mitre.org/>`__
- `Zach Rush <mailto:[email protected]>`__; GitHub ID: https://github.com/zrush-mitre/; WWW: `MITRE Corporation <https://www.mitre.org/>`__
- `Jason Keirstead <mailto:[email protected]>`__; GitHub ID: https://github.com/JasonKeirstead; WWW: `IBM <http://www.ibm.com/>`__
- `Duncan Sparrell <mailto:[email protected]>`__; GitHub ID: https://github.com/sparrell; WWW: `sFractal <http://sfractal.com/>`__

About OASIS TC Open Repositories
--------------------------------
3 changes: 3 additions & 0 deletions conftest.py
@@ -0,0 +1,3 @@

def pytest_addoption(parser):
parser.addoption("--backends", action="store", default="memory,mongo")
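
The new ``--backends`` option defaults to ``memory,mongo``. This hunk does not show how the test suite consumes it. As a minimal sketch, assuming the value is a comma-separated list of backend names, a hypothetical helper ``parse_backends`` (not part of the PR) might split and validate it like this:

```python
# Hypothetical helper, not part of the PR: turns the --backends option value
# (for example "memory,mongo") into a list of backend names.
KNOWN_BACKENDS = ("memory", "mongo")  # assumed from the option's default value


def parse_backends(option_value):
    """Split a comma-separated backend list, skipping blanks and duplicates."""
    selected = []
    for name in option_value.split(","):
        name = name.strip().lower()
        if not name or name in selected:
            continue
        if name not in KNOWN_BACKENDS:
            raise ValueError(f"unknown backend: {name!r}")
        selected.append(name)
    return selected


print(parse_backends("memory,mongo"))
```

Inside a fixture, the raw string would typically be read with ``request.config.getoption("--backends")`` before being passed to a helper of this kind.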
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -67,7 +67,7 @@
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
language = "en"

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -97,7 +97,7 @@
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
96 changes: 56 additions & 40 deletions docs/mongodb_schema.rst
@@ -4,30 +4,41 @@ Design of the TAXII Server Mongo DB Schema for *medallion*

As *medallion* is a prototype TAXII server implementation, the schema design for a Mongo DB is relatively straightforward.

Each Mongo database contains one or more collections. The term "collection" in Mongo DBs is similar to the concept of a table in a relational database. Collections contain "documents", similar to records.
Each Mongo database contains one or more collections. The term "collection" in Mongo DBs is similar to the concept of a table in a relational database. Collections contain "documents", somewhat analogous to table rows.

It is unfortunate that the term "collection" is also used to signify something unrelated in the TAXII specification. We will use the phrase "taxii collection" to distinguish them.

An instance of this schema can be populated via the file test/data/initialize_mongodb.py. This instance will be used for examples below.
You can initialize the database with content by specifying a JSON file in the backend section of the medallion configuration. The JSON file containing TAXII server content must have a particular structure. Refer to medallion/test/data/default_data.json for an example of the required structure.

Utilities to initialize your own Mongo DB can be found in test/generic_initialize_mongodb.py.
An example configuration:

.. code-block:: json

{
"backend": {
"module_class": "MongoBackend",
"uri": "<Mongo DB server url, e.g. mongodb://localhost:27017/>",
"filename": "<path to json file with initial data>"
}
}

.. important::
To avoid accidentally deleting data, the Mongo backend will check whether the database appears to have already been initialized. If so, it will not change anything. To override the safety check and always reinitialize the database, add another backend setting: ``"clear_db": true``.

The discovery database
----------------------

Basic metadata contained in the mongo database named **discovery_database**.
Basic metadata is contained in the mongo database named **discovery_database**. The discovery_database contains two collections:

The discovery_database contains two collections:

**discovery_information**. It should only contain only one "document", which is the discovery information that would be returned from the Discovery endpoint. Here is the document from the example database.
**discovery_information** should contain only one "document", which is the discovery information that would be returned from the Discovery endpoint. Here is the document from the example database.

.. code-block:: json

{
"title": "Some TAXII Server",
"description": "This TAXII Server contains a listing of",
"contact": "string containing contact information",
"default": "http://localhost:5000/api2/",
"default": "http://localhost:5000/trustgroup1/",
"api_roots": [
"http://localhost:5000/api1/",
"http://localhost:5000/api2/",
@@ -45,7 +56,7 @@ Here is a document from the example database:
"title": "Malware Research Group",
"description": "A trust group setup for malware researchers",
"versions": [
"taxii-2.0"
"application/taxii+json;version=2.1"
],
"max_content_length": 9765625,
"_url": "http://localhost:5000/trustgroup1/",
@@ -55,7 +66,8 @@ Here is a document from the example database:
The api root databases
----------------------

Each api root is contained in a separate Mongo DB database. It has four collections: **status**, **objects**, **manifests**, and **collections**. To support multiple taxii collections, any document in the **objects** and **manifests** contains an extra property, "collection_id", to link it to the taxii collection that it is contained in. Because "_collection_id" property is not part of the TAXII specification, it will be stripped by *medallion* before any document is returned to the client.
Each api root is contained in a separate Mongo DB database. It has three collections: **status**, **objects**,
and **collections**.

A document from the **collections** collection:

@@ -72,22 +84,31 @@ A document from the **collections** collection:
]
}

Because the STIX objects and the manifest entries correspond one-to-one, the manifest is stored with the object. It keeps all information about an object in one place and avoids the complexity and overhead of needing to join documents. Also, timestamps are stored as numbers due to the millisecond precision limitation of the Mongo built-in ``Date`` type. These documents are converted to proper STIX or TAXII JSON format as needed.
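
To illustrate the numeric timestamp storage, here is a minimal sketch of converting between a STIX timestamp string and a float of seconds since the Unix epoch. The function names are hypothetical and this is not necessarily how *medallion* performs the conversion; it assumes the input string has a fractional-seconds part and a trailing ``Z``.

```python
from datetime import datetime, timezone

# Assumed input/output layout: fractional seconds followed by "Z" (UTC).
STIX_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def stix_timestamp_to_float(timestamp):
    """Parse a STIX timestamp string into seconds since the Unix epoch."""
    moment = datetime.strptime(timestamp, STIX_FORMAT).replace(tzinfo=timezone.utc)
    return moment.timestamp()


def float_to_stix_timestamp(seconds):
    """Format seconds since the Unix epoch as a UTC timestamp string.

    The output carries six fractional digits (microseconds), which is finer
    than the millisecond resolution of the Mongo ``Date`` type.
    """
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime(STIX_FORMAT)


stored = stix_timestamp_to_float("2017-01-27T13:49:53.997Z")
print(stored, float_to_stix_timestamp(stored))
```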

A document from the **objects** collection:

.. code-block:: json

{
"created": "2014-05-08T09:00:00.000Z",
"id": "indicator--a932fcc6-e032-176c-126f-cb970a5a1ade",
"labels": [
"file-hash-watchlist"
"created": 1485524993.997,
"description": "Poison Ivy",
"id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
"is_family": true,
"malware_types": [
"remote-access-trojan"
],
"modified": "2014-05-08T09:00:00.000Z",
"name": "File hash for Poison Ivy variant",
"pattern": "[file:hashes.'SHA-256' = 'ef537f25c895bfa782526529a9b63d97aa631564d5d789c2b765448c8635fb6c']",
"type": "indicator",
"valid_from": "2014-05-08T09:00:00.000000Z",
"_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116"
"modified": 1485524993.997,
"name": "Poison Ivy",
"spec_version": "2.1",
"type": "malware",
"_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116",
"_manifest": {
"date_added": 1485524999.997,
"id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
"media_type": "application/stix+json;version=2.1",
"version": 1485524993.997
}
}
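
Since the manifest entry and ``_collection_id`` live inside the stored document, they have to be separated from the STIX properties before anything is returned to a client. A minimal sketch of that split follows, under the assumption that every internal property starts with an underscore (as ``_collection_id`` and ``_manifest`` do above, and as Mongo's own ``_id`` does). ``split_document`` is a hypothetical name, and timestamp conversion is deliberately left out.

```python
def split_document(document):
    """Separate a stored document into (stix_object, manifest_entry).

    Hypothetical helper for illustration: properties whose names start with
    an underscore are treated as internal and dropped from the STIX object.
    """
    stix_object = {
        key: value for key, value in document.items() if not key.startswith("_")
    }
    # Copy the embedded manifest so callers cannot mutate the stored document.
    manifest_entry = dict(document.get("_manifest", {}))
    return stix_object, manifest_entry


stored_document = {
    "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
    "type": "malware",
    "name": "Poison Ivy",
    "_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116",
    "_manifest": {
        "id": "malware--c0931cc6-c75e-47e5-9036-78fabc95d4ec",
        "media_type": "application/stix+json;version=2.1",
    },
}

stix_object, manifest_entry = split_document(stored_document)
print(sorted(stix_object))
print(manifest_entry["media_type"])
```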

A document from the **status** collection:
@@ -97,38 +118,33 @@ A document from the **status** collection:
{
"id": "2d086da7-4bdc-4f91-900e-d77486753710",
"status": "pending",
"request_timestamp": "2016-11-02T12:34:34.12345Z",
"request_timestamp": "2016-11-02T12:34:34.123456Z",
"total_count": 4,
"success_count": 1,
"successes": [
"indicator--a932fcc6-e032-176c-126f-cb970a5a1ade"
{
"id": "indicator--cd981c25-8042-4166-8945-51178443bdac",
"version": "2014-05-08T09:00:00.000Z",
"message": "Successfully added object to collection '91a7b528-80eb-42ed-a74d-c6fbd5a26116'."
}
],
"failure_count": 1,
"failures": [
{
"id": "malware--664fa29d-bf65-4f28-a667-bdb76f29ec98",
"version": "2015-05-08T09:00:00.000Z",
"message": "Unable to process object"
}
],
"pending_count": 2,
"pendings": [
"indicator--252c7c11-daf2-42bd-843b-be65edca9f61",
"relationship--045585ad-a22f-4333-af33-bfd503a683b5"
{
"id": "indicator--252c7c11-daf2-42bd-843b-be65edca9f61",
"version": "2016-08-08T09:00:00.000Z"
},
{
"id": "relationship--045585ad-a22f-4333-af33-bfd503a683b5",
"version": "2016-06-08T09:00:00.000Z"
}
]
}
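
In the status document above, the three per-outcome counts add up to ``total_count`` and each count matches the length of its list. The sketch below checks those two properties. It assumes they are meant to hold in general, which the example suggests but this page does not state, and ``check_status_counts`` is a hypothetical helper.

```python
# Maps each count property to the list property it is assumed to describe.
COUNT_TO_LIST = {
    "success_count": "successes",
    "failure_count": "failures",
    "pending_count": "pendings",
}


def check_status_counts(status):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    for count_key, list_key in COUNT_TO_LIST.items():
        # A missing list is skipped rather than reported.
        if list_key in status and len(status[list_key]) != status.get(count_key, 0):
            problems.append(f"{count_key} does not match len({list_key})")
    counted = sum(status.get(count_key, 0) for count_key in COUNT_TO_LIST)
    if counted != status.get("total_count", 0):
        problems.append("counts do not add up to total_count")
    return problems


example_status = {
    "total_count": 4,
    "success_count": 1,
    "successes": [{"id": "indicator--cd981c25-8042-4166-8945-51178443bdac"}],
    "failure_count": 1,
    "failures": [{"id": "malware--664fa29d-bf65-4f28-a667-bdb76f29ec98"}],
    "pending_count": 2,
    "pendings": [
        {"id": "indicator--252c7c11-daf2-42bd-843b-be65edca9f61"},
        {"id": "relationship--045585ad-a22f-4333-af33-bfd503a683b5"},
    ],
}

print(check_status_counts(example_status))
```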

A document from the **manifest** collection:

.. code-block:: json

{
"id": "indicator--a932fcc6-e032-176c-126f-cb970a5a1ade",
"date_added": "2016-11-01T10:29:05Z",
"versions": [
"2014-05-08T09:00:00.000Z"
],
"media_types": [
"application/vnd.oasis.stix+json; version=2.0"
],
"_collection_id": "91a7b528-80eb-42ed-a74d-c6fbd5a26116"
}