
[Design] Filtered Data Exhaust API

Sowmya N Dixit edited this page Aug 25, 2020 · 1 revision

Introduction

This wiki describes the design of the filtered data exhaust, get job, and list jobs APIs.

The following APIs handle job execution for data products: the Data Exhaust API, the Get Job API, and the Job List API.

The portal calls these APIs and passes the filter parameters. Each request is first authorized using the consumer ID and channel ID supplied in the request headers.

Data Exhaust API

The Data Exhaust API submits a job that creates a data file, which can be downloaded once the job is complete. The API returns a request_id that the caller uses to retrieve the job status periodically. The request_id is generated by hashing the client_key and the filter criteria. Therefore, if a given client submits the same request twice, the same request_id is returned, and the response contains the output file details if the job has already completed.
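The idempotent request_id described above can be sketched as follows. This is a minimal illustration, not the service's implementation: the choice of SHA-1, the `|` separator, and the canonical JSON serialization are all assumptions (the sample IDs on this page are 40 hex characters, which matches the length of a SHA-1 digest, but the actual algorithm and hashed fields are not stated here). `make_request_id` is a hypothetical helper name.

```python
import hashlib
import json


def make_request_id(client_key: str, filter_criteria: dict) -> str:
    """Derive a deterministic request ID from the client key and filter.

    Assumption: SHA-1 over the client key plus a canonical JSON rendering
    of the filter. The real service may hash different fields.
    """
    # sort_keys makes the serialization independent of dict key order,
    # so logically identical filters always hash to the same value.
    canonical = json.dumps(filter_criteria, sort_keys=True, separators=(",", ":"))
    payload = f"{client_key}|{canonical}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# The same filter with keys in a different order yields the same ID.
filter_a = {"start_date": "2016-11-01", "end_date": "2016-11-20", "tags": ["t1"]}
filter_b = {"tags": ["t1"], "end_date": "2016-11-20", "start_date": "2016-11-01"}
same_id = make_request_id("dev-portal", filter_a) == make_request_id("dev-portal", filter_b)
```

Because the ID depends only on the client key and the filter, resubmitting an identical request maps to the existing job rather than creating a new one.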

API

POST - /data/v3/dataset/request/submit

Request Parameters

  • request headers
    • X-Consumer-ID
    • X-Channel-ID
  • params.client_key - ID of the client, for example the partner ID. Mandatory
  • request.filter.start_date - Start date of the filtered data exhaust. Mandatory
  • request.filter.end_date - End date of the filtered data exhaust. Mandatory
  • request.filter.tags - Tags to filter by. Mandatory
  • request.filter.events - List of event types to filter and return. Optional
  • request.output_format - Output format of the data: csv or json. Default value is json. Optional
{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:40:40+05:30",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341", // unique request message id, UUID
        "client_key": "<id>" // ID of the requestor. In partner case the partner id
    },
    "request": {
        "filter": {
            "start_date": String, // Start date of the data exhaust
            "end_date": String, // End date of the data exhaust
            "tags": Array[String], // tag id for a partner
            "events": Array[String] // list of event types to return.
        },
        "output_format": String // json, csv. Default value is json.
    }
}

If the events filter is blank or empty, all events are returned.
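As an illustration of the request structure above, the following sketch assembles a submit request body in Python. `build_submit_request` is a hypothetical helper, and the client-side validation it performs is an assumption based on the mandatory and optional markers in the parameter list; the server's actual validation rules are not described here.

```python
import uuid
from datetime import datetime, timezone
from typing import List, Optional


def build_submit_request(
    client_key: str,
    start_date: str,
    end_date: str,
    tags: List[str],
    events: Optional[List[str]] = None,
    output_format: str = "json",
) -> dict:
    """Build the JSON body for POST /data/v3/dataset/request/submit."""
    # Mandatory fields according to the parameter list.
    if not client_key or not start_date or not end_date or not tags:
        raise ValueError("client_key, start_date, end_date and tags are mandatory")
    if output_format not in ("json", "csv"):
        raise ValueError("output_format must be 'json' or 'csv'")

    filter_criteria = {"start_date": start_date, "end_date": end_date, "tags": list(tags)}
    # Omitting "events" (or passing an empty list) requests all events.
    if events:
        filter_criteria["events"] = list(events)

    return {
        "id": "ekstep.analytics.dataset.request.submit",
        "ver": "1.0",
        "ts": datetime.now(timezone.utc).isoformat(),
        "params": {"msgid": str(uuid.uuid4()), "client_key": client_key},
        "request": {"filter": filter_criteria, "output_format": output_format},
    }


body = build_submit_request(
    "dev-portal", "2016-11-01", "2016-11-20",
    ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"],
)
```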

Response:

{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "resmsgid": "3f03da60-1e24-3d41-aa7b-1daf91c36431",
        "client_key": "<id>",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": JobStatusResponse
}

JobStatusResponse

JobStatusResponse = {
    "request_id": String, // Request Id generated by Data out API
    "status": String, // SUBMITTED, PROCESSING, COMPLETED or FAILED (the existing status if the request was already made)
    "last_updated": DateTime, // Date Time in epoch format
    "request_data": {}, // data passed in the API
    "output": { // Job output
        "location": String, // Location of the file in S3
        "file_size": Long,  // file size
        "dt_file_created": Date, // Epoch timestamp on when the file is created
        "dt_first_event": Date, // Date on which the first event seen in the file
        "dt_last_event": Date, // Date on which the last event seen in the file
        "dt_expiration": Date // Date when the file is going to be deleted.
    },
    "job_stats": { // Job specific stats
        "dt_job_submitted": DateTime, // DateTime when the job is submitted
        "dt_job_processing": DateTime, // DateTime when the job is picked up for processing
        "dt_job_completed": DateTime, // DateTime when the job is complete
        "input_events": Int, // Total input events processed
        "output_events": Int, // Total output events produced
        "latency": Int, // Latency in seconds from the time the job is submitted before picked up for processing
        "execution_time": Int // Total time taken for processing excluding latency
    },
    "attempts": Int // Number of times this request has been processed
}

The following is the list of status values:

  1. SUBMITTED - Submitted for processing
  2. PROCESSING - Picked up for processing
  3. COMPLETED - Job completed successfully
  4. FAILED - Job failed
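A client that polls for status would typically stop once a job reaches COMPLETED or FAILED. The sketch below shows one way to do that. Treating those two statuses as terminal is an inference from the list above, and `wait_for_job`, the polling interval, and the poll limit are hypothetical choices. The status lookup is injected as a callable so the example makes no assumptions about the HTTP client.

```python
import time
from typing import Callable

# Assumption: COMPLETED and FAILED are final; SUBMITTED and PROCESSING are not.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 30.0,
    max_polls: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or max_polls is exhausted.

    fetch_status must return a JobStatusResponse-like dict with a "status" key.
    """
    for attempt in range(max_polls):
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Do not sleep after the final unsuccessful poll.
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still not finished after {max_polls} polls")


# Usage with canned responses instead of a real API call.
responses = iter([{"status": "SUBMITTED"}, {"status": "PROCESSING"}, {"status": "COMPLETED"}])
final_job = wait_for_job(lambda: next(responses), sleep=lambda _seconds: None)
```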

Examples:

Sample 1:

Request:

{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:40:40+05:30",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341", // unique request message id, UUID
        "client_key": "dev-portal" // ID of the requestor. In partner case the partner id
    },
    "request": {
        "filter": {
            "start_date": "2016-11-01",
            "end_date": "2016-11-20",
            "tags": ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"]
        }
    }
}

Response:

{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "resmsgid": "3f03da60-1e24-3d41-aa7b-1daf91c36431",
        "client_key": "dev-portal",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": {
        "request_id": "6a54bfa283de43a89086e69e2efdc9eb6750493d", // Request Id generated by Data out API
        "status": "SUBMITTED",
        "last_updated": 1479890492, // Date Time in epoch format
        "request_data": {
            "filter": {
                "start_date": "2016-11-01",
                "end_date": "2016-11-20",
                "tags": ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"]
            }
        },
        "output": {},
        "job_stats": { // Job specific stats
            "dt_job_submitted": 1479890492
        },
        "attempts": 0
    }
}

Sample 2:

Request:

{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:40:40+05:30",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341", // unique request message id, UUID
        "client_key": "dev-portal" // ID of the requestor. In partner case the partner id
    },
    "request": {
        "filter": {
            "start_date": "2016-11-01",
            "end_date": "2016-11-20",
            "tags": ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"],
            "events": ["OE_ASSESS", "OE_ITEM_RESPONSE", "OE_LEVEL_SET"] // events specific to assessment
        }
    }
}

Response:

{
    "id": "ekstep.analytics.dataset.request.submit",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "msgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "resmsgid": "3f03da60-1e24-3d41-aa7b-1daf91c36431",
        "client_key": "dev-portal",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": {
        "request_id": "6a54bfa283de43a89086e69e2efdc9eb6750493d", // Request Id generated by Data out API
        "status": "COMPLETED",
        "last_updated": 1479890492, // Date Time in epoch format
        "request_data": {
            "filter": {
                "start_date": "2016-11-01",
                "end_date": "2016-11-20",
                "tags": ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"],
                "events": ["OE_ASSESS", "OE_ITEM_RESPONSE", "OE_LEVEL_SET"]
            }
        },
        "output": {
            "location": "https://s3-ap-southeast-1.amazonaws.com/ekstep-public/ecar_files/1454996593287.zip",
            "file_size": 4535,
            "dt_file_created": "1479886892", // Epoch timestamp on when the file is created
            "dt_first_event": "1479886894", // Date on which the first event seen in the file
            "dt_last_event": "1479886895",
            "dt_expiration": "1479886899"
        },
        "job_stats": { // Job specific stats
            "dt_job_submitted": 1479886892,
            "dt_job_processing": 1479886952,
            "dt_job_completed": 1479890132,
            "input_events": 12000,
            "output_events": 1400,
            "latency": 2,
            "execution_time": 53
        },
        "attempts": 1
    }
}

Get Job API

Description:

This API takes the request_id and the client_key and returns a response containing the job status, message, and any other details specific to the job (data). It is used to retrieve the latest status of the job, including its output.

API

GET - /data/v3/dataset/request/read/:client_key/:request_id

Request Parameters

  • request headers
    • X-Consumer-ID
    • X-Channel-ID
  • request_id - Request ID returned in the response when the job execution request was made. Mandatory
  • client_key - Client ID given when the job execution request was made. Mandatory
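The path and headers for this call can be assembled as in the sketch below. `job_read_path` and `auth_headers` are hypothetical helper names; percent-encoding the path segments is a defensive assumption for keys that contain reserved characters, not something this page specifies.

```python
from urllib.parse import quote


def job_read_path(client_key: str, request_id: str) -> str:
    """Build the path for GET /data/v3/dataset/request/read/:client_key/:request_id."""
    if not client_key or not request_id:
        raise ValueError("client_key and request_id are mandatory")
    # safe="" also encodes "/" so a value cannot add extra path segments.
    return "/data/v3/dataset/request/read/{}/{}".format(
        quote(client_key, safe=""), quote(request_id, safe="")
    )


def auth_headers(consumer_id: str, channel_id: str) -> dict:
    """Headers used to authorize the request."""
    return {"X-Consumer-ID": consumer_id, "X-Channel-ID": channel_id}


path = job_read_path("dev-portal", "6a54bfa283de43a89086e69e2efdc9eb6750493d")
headers = auth_headers("<consumer-id>", "<channel-id>")  # placeholder values
```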

Response:

{
    "id": "ekstep.analytics.dataset.request.info",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "resmsgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": JobStatusResponse
}

Examples:

Sample:

Request:

GET - /data/v3/dataset/request/read/dev-portal/6a54bfa283de43a89086e69e2efdc9eb6750493d

Response:

{
    "id": "ekstep.analytics.dataset.request.info",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "resmsgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": {
        "request_id": "6a54bfa283de43a89086e69e2efdc9eb6750493d", // Request Id generated by Data out API
        "status": "SUBMITTED",
        "last_updated": 1479890492, // Date Time in epoch format
        "request_data": {
            "start_date": "2016-11-10",
            "end_date": "2016-11-21",
            "tags": ["dd2dfa50dc8feca1e5303a87b2c6a42db3ebe102"]
        },
        "output": {},
        "job_stats": { // Job specific stats
            "dt_job_submitted": 1479886892
        },
        "attempts": 0
    }
}

Job List API

API

GET - /data/v3/dataset/request/list/:client_key?limit=10

Request Parameters

  • client_key - Client ID given when the job execution request was made. Mandatory
  • limit - Maximum number of jobs to return. Default limit is 100. Optional
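A small sketch of how a client might build the list URL with the optional limit. `job_list_path` is a hypothetical helper; omitting the query string so that the server applies its default of 100, and rejecting non-positive limits on the client side, are assumptions.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def job_list_path(client_key: str, limit: Optional[int] = None) -> str:
    """Build the path for GET /data/v3/dataset/request/list/:client_key."""
    if not client_key:
        raise ValueError("client_key is mandatory")
    path = "/data/v3/dataset/request/list/" + quote(client_key, safe="")
    if limit is not None:
        if limit <= 0:
            raise ValueError("limit must be a positive integer")
        # Only send the parameter when the caller overrides the default.
        path += "?" + urlencode({"limit": limit})
    return path


with_limit = job_list_path("dev-portal", limit=10)
default_limit = job_list_path("dev-portal")
```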

Response:

{
    "id": "ekstep.analytics.dataset.request.list",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "resmsgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": {
        "count": Int,
        "jobs": Array[JobStatusResponse]
    }
}

Examples

Sample:

Request:

GET - /data/v3/dataset/request/list/dev-portal

  • request headers
    • X-Consumer-ID
    • X-Channel-ID

Response:

{
    "id": "ekstep.analytics.dataset.request.list",
    "ver": "1.0",
    "ts": "2016-12-07T12:43:23.890+00:00",
    "params": {
        "resmsgid": "4f04da60-1e24-4d31-aa7b-1daf91c46341",
        "status": "successful"
    },
    "responseCode": "OK",
    "result": {
        "count": 2,
        "jobs": [
            {
                "request_id": "6a54bfa283de43a89086e69e2efdc9eb6750493d",
                "status": "SUBMITTED",
                "last_updated": 1479890492, // Date Time in epoch format
                "request_data": {
                    "start_date": "2016-11-10",
                    "end_date": "2016-11-21",
                    "tags": ["dd2dfa50dc8feca1e5303a87b2c6a42db3ebe102"]
                },
                "output": {},
                "job_stats": { // Job specific stats
                    "dt_job_submitted": 1479886892
                }
            },
            {
                "request_id": "6a54bfa283de43a89086e69e2efdc9eb6750493d",
                "status": "COMPLETED",
                "last_updated": 1479890492, // Date Time in epoch format
                "request_data": {
                    "filter": {
                        "start_date": "2016-11-01",
                        "end_date": "2016-11-20",
                        "tags": ["6da8fa317798fd23e6d30cdb3b7aef10c7e7bef5"],
                        "events": ["OE_ASSESS", "OE_ITEM_RESPONSE", "OE_LEVEL_SET"]
                    }
                },
                "output": {
                    "location": "https://s3-ap-southeast-1.amazonaws.com/ekstep-public/ecar_files/1454996593287.zip",
                    "file_size": 5434,
                    "dt_file_created": "1479886892",
                    "dt_first_event": "1479886894",
                    "dt_last_event": "1479886896",
                    "dt_expiration": "1479886899"
                },
                "job_stats": { // Job specific stats
                    "dt_job_submitted": 1479886892,
                    "dt_job_processing": 1479886952,
                    "dt_job_completed": 1479890132,
                    "input_events": 12000,
                    "output_events": 1400,
                    "latency": 2,
                    "execution_time": 53
                },
                "attempts": 1
            }
        ]
    }
}