Plan is to support (rough examples after the list):
- message-logs (general-purpose, message-based logs) | maybe called unstructured logs - no schema - has message and status - can have arbitrary keys to log
- tech-logs (JSON objects with details that need to be logged) | maybe called structured logs - has schema - has message and status - only schema-matched logs are logged
- metrics (time taken / space used / etc.) | JSON object of performance and stats
- what else?
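For illustration, the three kinds above might look something like this (shapes are assumptions, not a final format):

// message-log (unstructured): message + status, plus any random keys
{"message" : "user signed in", "status" : "INFO", "userId" : "u-42"}
// tech-log (structured): must match the context's schema
{"message" : "cache miss", "status" : "WARN", "component" : "cache"}
// metric: performance/stats object
{"timeTakenMs" : 120, "spaceUsedKb" : 512}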
- Context-less logs go to a default context
- Publisher - The App/Service that is Publishing the Logs
- Context - The Component of the App(Publisher) that is publishing the logs
- Sub-Context - The Sub-Component of the Context Publishing the logs [❌]
Contexts have
- Schema : property - hasSchema [true or false]
- ID : property - id [a short UUID]
- The Publishers (Apps) are Logically Separated - Separate Directories - The Publisher Directory has a Contexts.json file with the details and list of Contexts
- The Contexts have logical separation too - separate JSONs (maybe)
- The JSONs have two parts - schema, logs (an array of JSONs (ndjson))
- No separation for Sub-Contexts - they live inside the Contexts only
- logs and metrics have separate logical boundaries - inside the context dir, which has both logs.json and metrics.json (or ndjson)
--- Publisher#1
|-- ContextsRegistryForPub#1.json
|-- Context#1
|   |-- logs.json
|   |-- metrics.json
|-- Context#2
|   |-- logs.json
|   |-- metrics.json
|-- ...
--- Publisher#2
|-- ContextsRegistryForPub#2.json
|-- Context#1
|   |-- logs.json
|   |-- metrics.json
|-- ...
--- Publisher#3
|-- ContextsRegistryForPub#3.json
|-- ...
--- ...
The PublisherRegistry:
{
  "publisher#1" : {
    "id" : "xyzabc123",
    "path" : "/path/to/pub#1/directory",
    "origin" : "app.something.com:1000",
    "timestamp" : "12345678",
    "namespace" : "publisher#1",
    "description" : "Lorem Ipsum"
  },
  "publisher#2" : {...},
  "publisher#3" : {...}
}
What's included in the request? - context (namespace), description, origin, kind; if logs or metrics use a schema, add the keys "logs" and "metrics" with a JSON object of {"key" : "descriptor"} pairs as the value.
status and timestamp are added by default - no need to add those.
Do we need descriptors for schema?
POST /:publisher/context/new
{
  "context" : "context#1",
  "origin" : "app.something.com:1000/context1",
  "description" : "Lorem Ipsum",
  "kind" : ["log", "metric"],
  "logs" : ["column1", "column2", ...],
  "metrics" : ["column1", "column2", ...]
}
or
{
  "context" : "context#1",
  "origin" : "app.something.com:1000/context1",
  "description" : "Lorem Ipsum",
  "kind" : {
    "logs" : ["column1", "column2", ...],   // or, in case of no schema, "logs" : []
    "metrics" : ["column1", "column2", ...] // or, in case of no schema, "metrics" : []
  }
}
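Mirroring the publisher curl further down, registering a context could look like this (using the second payload shape; "tester" and the values are placeholders):

curl --request POST --url localhost:8080/tester/context/new --header 'content-type: application/json' --data '{"context" : "context#1", "origin" : "app.something.com:1000/context1", "description" : "Lorem Ipsum", "kind" : {"logs" : [], "metrics" : []}}'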
context
---becomes--> namespace
{
  "context#1" : {
    "id" : "123345",
    "timestamp" : "1233",
    "description" : "Lorem Ipsum",
    "namespace" : "context#1",
    "origin" : "app.something.com:1000/context1",
    "kind" : ["log", "metric"],
    "logs" : {
      "usesSchema" : true,
      "schema" : [..., "status" [WARN|INFO|etc], "timestamp"],
      "path" : "path/to/logs"
    },
    "metrics" : {
      "usesSchema" : true,
      "schema" : [..., "timestamp"],
      "path" : "path/to/metrics"
    }
  }
}
{ ... }
{ ... }
{
  "key" : {
    "type" : "string"
  }
}
Matches Type, and other Constraints
Or
{
  "key" : "descriptor",
  "key" : "descriptor",
  ...
}
Matches only keys, if keys are present.
Update won't use descriptors, just key names.
Match keys only, no descriptors.
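A minimal sketch of the keys-only match, assuming the schema is stored as a {key : descriptor} object (the function name is illustrative):

// Keys-only match: the entry passes if every schema key is present in it;
// descriptors are ignored here - they would only matter for type/constraint matching
function matchesKeys(schema, entry) {
    return Object.keys(schema).every(key => key in entry)
}

matchesKeys({ worker: "descriptor" }, { worker: "TEST WrKr", extra: 1 }) // true
matchesKeys({ worker: "descriptor" }, { random: "txt" })                 // false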
- An App needs to register itself (as a Publisher) onto logsmith - PublisherRegistry
- The Context may be created/registered in the ContextsRegistry
- Emit logs/metrics from somewhere (a sketch of this flow follows the list)
  - Check the Context
  - If the Context exists
    - Check if it uses a schema
    - Compare the schema with the data (if usesSchema=true)
    - Else, no schema - register
  - If the Context doesn't exist
    - Register the Context - no schema
    - Register the logs
  - If the Context exists
    - ?
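A rough sketch of that emit flow, assuming a registry keyed by context name (registerContext, registerLog and schemaMatches are assumed helpers):

// Sketch of the emit flow; the registry shape follows the example above
function emitLog(contextsRegistry, contextName, logEntry) {
    const context = contextsRegistry[contextName]
    if (!context) {
        // Context doesn't exist: register it with no schema, then record the log
        registerContext(contextsRegistry, contextName, { logs: { usesSchema: false } })
        return registerLog(contextName, logEntry)
    }
    if (context.logs.usesSchema && !schemaMatches(context.logs.schema, logEntry)) {
        return false // strict schema: mismatched logs are not recorded
    }
    return registerLog(contextName, logEntry)
}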
What is LaaS? - Logging as a Service on Wikipedia
(Ideate Later)
curl --request POST --url localhost:8080/publisher/new --header 'content-type: application/json' --data '{"publisher" : "tester","origin" : "two","description" : "hfbhfbvhfdbvf"}'
A Snippet on How to Write Tests.
Structure:
<Handler-Component> Tests
  Testing Method <method-name>
    Test Case #<N> [ <Success/Failure> Case - <Description> ] [ Should Return <Value> ]
Code:
describe("ContextHandler-Validate Tests", function(){
describe("Testing Method - validateNewContext()", function(){
it("Test Case #1 [ Success Case - Valid Context ] [ Should Return True ]", function(){
expect(validateNewContext(validNewContextData.validNewContextOne, TestContextRegistry)).to.equal(true)
})
it("Test Case #2 [ Failure Case - Context Exists ] [ Should Return False ]", function(){
expect(validateNewContext(invalidNewContextData.invalidNewContextOne, TestContextRegistry)).to.equal(false)
})
it("Test Case #3 [ Failure Case - Kind Value Missing ] [ Should Return False ]", function(){
expect(validateNewContext(invalidNewContextData.invalidNewContextTwo, TestContextRegistry)).to.equal(false)
})
it("Test Case #3 [ Failure Case - Origin Missing ] [ Should Return False ]", function(){
expect(validateNewContext(invalidNewContextData.invalidNewContextThree, TestContextRegistry)).to.equal(false)
})
it("Test Case #3 [ Failure Case - Invalid Origin Value ] [ Should Return False ]", function(){
expect(validateNewContext(invalidNewContextData.invalidNewContextThree, TestContextRegistry)).to.equal(false)
})
})
})
Rules that govern log and metric schemas
Logs:
- timestamp and status are put in by default
- If the above are added in a new context definition, logsmith considers it no-schema, as they are added and populated by the system
- If a log has a schema, it uses strict schema enforcement, i.e. logs with any extra or missing columns are not recorded (sketched after this list)
- A useStrictSchema flag will be introduced in the future to enable/disable this
- If the publisher emits a timestamp, the system-generated timestamp is overridden
- No need to mention timestamp and status in a new schema definition
- status : WARN, INFO, CRITICAL, etc.
- timestamp : when the log is registered
Metrics:
- timestamp is put in by default
- If the above is added in a new context definition, logsmith considers it no-schema, as it is added and populated by the system
- If a metric has a schema, it uses strict schema enforcement, i.e. metrics with any extra or missing columns are not recorded
- A useStrictSchema flag will be introduced in the future to enable/disable this
- If the publisher emits a timestamp, the system-generated timestamp is overridden
- No need to mention timestamp in a new schema definition
- timestamp : when the metric is registered
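A minimal sketch of that strict check, assuming schemas are stored as column arrays as in the registry examples (the function name is illustrative):

// Strict schema: after ignoring the system-populated defaults (timestamp, status),
// the emitted columns must exactly equal the schema columns - no extras, none missing
function passesStrictSchema(schema, entry, defaults = ["timestamp", "status"]) {
    const expected = schema.filter(col => !defaults.includes(col)).sort()
    const emitted = Object.keys(entry).filter(col => !defaults.includes(col)).sort()
    return expected.length === emitted.length &&
        expected.every((col, i) => col === emitted[i])
}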
How should the "Template" LogRegistry look?
{
  "kind" : "logs",
  "hasSchema" : true, // or false
  "schema" : ["col1", "col2", "timestamp", "status"],
  "defaults" : ["timestamp", "status"],
  "logCount" : 69,
  "logs" : [
    {"timestamp" : 123456789, "status" : "warn", "col1" : ..., "col2" : ..., ...},
    {"timestamp" : 123456789, "status" : "warn", "col1" : ..., "col2" : ..., ...},
    ...
  ]
}
POST /:publisher/:context/log
{
  "timestamp" : 1234,
  "status" : "[INFO | WARN | etc.]",
  "col1" : "...",
  ...
}
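Emitting a log from Node against that endpoint might look like this (publisher/context names are placeholders; status and timestamp could be omitted since the system populates them):

const fetch = require("node-fetch")

fetch("http://localhost:8080/tester/testcon2/log", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ status: "WARN", col1: "some value" })
})
    .then(res => res.json())
    .then(body => console.log(body))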
Been thinking about the beauty of Kubernetes resource creation and definition, with all the "kinds" and "specifications" in YAML. Recreating something like that would not only be cool but also quite helpful in terms of usage - a standardized, well-defined structure with the flexibility of the JSON format.
- Creating a platform-agnostic thing
- Replace the different per-resource name fields with "namespace"
- Create a Kubernetes-inspired resource creation in JSON
- A Specification for all configuration for a Namespace
FOR REGISTERING A PUBLISHER
{
  "kind" : "publisher",
  "namespace" : "publisher-name",
  "specifications" : {
    "origin" : "origin.app.com",
    "description" : "Lorem Ipsum"
  }
}
FOR CREATING A CONTEXT
{
  "kind" : "context",
  "namespace" : "context-name",
  "specifications" : {
    "origin" : "origin.app.com",
    "description" : "Lorem Ipsum",
    "kinds" : {
      "logs" : ["col1", "col2"],
      "metrics" : ["col1", "col2"]
    }
  }
}
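One appeal of this format is that a single endpoint could dispatch on "kind"; a hypothetical sketch (registerPublisher/registerContext are assumed helpers, not existing functions):

// Hypothetical single entry point for Kubernetes-style resource documents
function applyResource(doc) {
    switch (doc.kind) {
        case "publisher":
            return registerPublisher(doc.namespace, doc.specifications)
        case "context":
            return registerContext(doc.namespace, doc.specifications)
        default:
            throw new Error(`unknown kind: ${doc.kind}`)
    }
}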
FOR LOG - need more storm in the brain. :)
- [for future] Use predefined log levels - INFO, WARN, etc.
- [for future] Use threads wherever possible to boost performance. Use threaded logging from the client using libraries.
A full-stack app that publishes logs is called a "Publisher". An app may comprise various components - frontend, backend, database, event-processor, etc. - each of which may publish its own logs. Each component has its own "Context" within the Publisher as a logical separation. Each Context publishes its logs to that Context's LogRegistry.
docker exec --interactive --tty logsmith-monitor /bin/sh
This brings up an interactive shell inside the container where you can run commands.
docker exec logsmith-monitor_logsmithmonitor_1 /bin/sh ./scripts/clean-up.sh
- GET /publisher - Publisher list
- GET /:publisher/context - Context list
- GET /:publisher/:context/logs?realtime=true - Realtime log list
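Against the same local listener, the reads above might look like this (publisher/context names are placeholders from the samples in this doc):

curl localhost:8080/publisher
curl localhost:8080/tester/context
curl 'localhost:8080/tester/testcon2/logs?realtime=true'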
- It is good practice to pass a single log object everywhere it is required, instead of defining it in every file.
from SubDirPath import defOne
from SubDirTwo import defTwo

log = Log(config)
defOne(params, log)
defTwo(params, log)
log.warn()

# SubDirPath file
def defOne(param, log):
    log.warn()

# SubDirTwo file
def defTwo(param, log):
    log.info()

Instead of

from SubDirPath import defOne
from SubDirTwo import defTwo

log = Log(config)
defOne(params)
defTwo(params)
log.warn()

# SubDirPath file
def defOne(param):
    log = Log(config)
    log.warn()

# SubDirTwo file
def defTwo(param):
    log = Log(config)
    log.info()
Go to the Example directory and use npm install:
npm install <path-to-project>
// e.g.
npm install ../logsmith
Some thoughts on helper libraries.
- Q:
- Publisher
- publisher.exists
- publisher.invalid
- publisher.success
- publisher.missing
- Context
- context.invalid
- context.success
- context.exists
- context.missing
- Log
- log.success
- log.error
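These scopes could live in one shared constants object that both the library and callers import; a sketch matching the names used in the snippets below (the string values are assumptions):

// Response scope codes, as enumerated above
const MonitorResponse = {
    publisher: {
        exists: "publisher.exists",
        invalid: "publisher.invalid",
        success: "publisher.success",
        missing: "publisher.missing"
    },
    context: {
        invalid: "context.invalid",
        success: "context.success",
        exists: "context.exists",
        missing: "context.missing"
    },
    log: {
        success: "log.success",
        error: "log.error"
    }
}

module.exports = { MonitorResponse }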
const sampleData = {
    listener: "http://localhost:8080",
    publisher: {
        publisher: "testapp1",
        origin: "bend.testapp1.com",
        description: "test App"
    },
    context: {
        context: "testcon2",
        origin: "bend.testcon.con",
        description: "Context",
        kind: {
            logs: []
        }
    },
    log: {
        loglevel: "WARN",
        worker: "TEST WrKr",
        random: "txt"
    }
}

monitorLogRunner(sampleData.listener, sampleData.publisher, sampleData.context, sampleData.log, function (response) {
    console.log(response)
})
function loadInits(monitorLogging, monitorConfigs) {
    if (monitorLogging && monitorConfigs.publisher != undefined && monitorConfigs.context != undefined) {
        createNewPublisher(monitorConfigs.listener, monitorConfigs.publisher, function (res) {
            createNewContext(monitorConfigs.listener, monitorConfigs.publisher, monitorConfigs.context, function (resp) {
                console.log("loaded")
            })
        })
    }
}
if (response.scope == MonitorResponse.publisher.missing) {
    createNewPublisher(listener, publisher, function (response) {
        if (response.scope == MonitorResponse.publisher.success) {
            // publisher created - retry the original log
            monitorLogRunner(listener, monitorConfig, log, function (response) {
                callback(response)
            })
        } else {
            // creation failed - report back instead of calling back twice
            callback(response)
        }
    })
} else if (response.scope == MonitorResponse.context.missing) {
    createNewContext(listener, publisher, context, function (response) {
        if (response.scope == MonitorResponse.context.success) {
            // context created - retry the original log
            monitorLogRunner(listener, monitorConfig, log, function (response) {
                callback(response)
            })
        } else {
            callback(response)
        }
    })
} else {
    if (response.scope == MonitorResponse.log.error) {
        callback(response)
    } else if (response.scope == MonitorResponse.log.success) {
        callback(response)
    }
}
Identified Issues
- Mid-flight crashing
- If the Monitor goes down, for any reason, the logging library also brings the app/program it is logging for down, mid-flight.
- Caused by a FetchError in node-fetch when the server is unreachable.
- This should be worked on - rather than stopping the program, the library should handle it.
- May add a catch block to handle this (see the sketch below).
- Pending issue creation - create an issue and work on it.
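A minimal sketch of what that catch could look like, assuming the library posts logs with node-fetch (the function and endpoint names here are illustrative, not the actual library code):

const fetch = require("node-fetch")

// Hypothetical wrapper: swallow transport errors instead of crashing the host app
function safePublishLog(listener, publisher, context, logBody, callback) {
    fetch(`${listener}/${publisher}/${context}/log`, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(logBody)
    })
        .then(res => res.json())
        .then(response => callback(null, response))
        .catch(err => {
            // FetchError lands here when the Monitor is unreachable;
            // report it to the caller instead of letting it crash mid-flight
            callback(err, null)
        })
}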