Have to start some real work. Will complete tests over lunch.
theangryangel committed Nov 18, 2015
1 parent 9804850 commit 49e751a
Showing 5 changed files with 73 additions and 43 deletions.
32 changes: 16 additions & 16 deletions README.md
@@ -12,8 +12,9 @@ This plugin does not bundle any JDBC jar files, and does expect them to be in a
particular location. Please ensure you read the 4 installation lines below.

## Headlines
- Support for connection pooling added in 0.2.0 [unreleased until #21 is resolved]
- Support for unsafe statement handling (allowing dynamic queries) in 0.2.0 [unreleased until #21 is resolved]
- Support for connection pooling added in 0.2.0 [unreleased until #10 is resolved]
- Support for unsafe statement handling (allowing dynamic queries) in 0.2.0 [unreleased until #10 is resolved]
- Altered exception handling to now count sequential flushes with exceptions thrown in 0.2.0 [untested and unreleased until #10 is resolved]

## Versions
- See master branch for logstash v2+
@@ -31,20 +32,19 @@ particular location. Please ensure you read the 4 installation lines below.

## Configuration options

| Option | Type | Description | Required? |
| ------ | ---- | ----------- | --------- |
| driver_path | String | File path to jar file containing your JDBC driver. This is optional, and all JDBC jars may be placed in $LOGSTASH_HOME/vendor/jar/jdbc instead. | No |
| connection_string | String | JDBC connection URL | Yes |
| username | String | JDBC username - this is optional as it may be included in the connection string, for many drivers | No |
| password | String | JDBC password - this is optional as it may be included in the connection string, for many drivers | No |
| statement | Array | An array of strings representing the SQL statement to run. Index 0 is the SQL statement that is prepared; all other array entries are passed in as parameters (in order). A parameter may either be a property of the event (e.g. "@timestamp", or "host") or a formatted string (e.g. "%{host} - %{message}" or "%{message}"). If a key is passed then it will be automatically converted as required for insertion into SQL. If it's a formatted string then it will be passed in verbatim. | Yes |
| unsafe_statement | Boolean | If true, the statement is evaluated for event fields - this allows you to use dynamic table names, etc. **This is highly dangerous** and you should **not** use this unless you are 100% sure that the field(s) you are passing in are 100% safe. Failure to do so may result in SQL injection. Please be aware that there is also a potential performance penalty as each event must be evaluated and inserted into SQL one at a time, whereas when this is false multiple events are inserted at once. Example statement: [ "insert into %{table_name_field} (column) values(?)", "fieldname" ] | No |
| max_pool_size | Number | Maximum number of connections to open to the SQL server at any 1 time | No |
| connection_timeout | Number | Number of seconds before a SQL connection is closed | No |
| flush_size | Number | Maximum number of entries to buffer before sending to SQL - if this is reached before idle_flush_time | No |
| idle_flush_time | Number | Number of idle seconds before sending data to SQL - even if the flush_size has not yet been reached | No |
| max_repeat_exceptions | Number | Number of times the same exception can repeat before we stop logstash. Set to a value less than 1 if you never want it to stop | No |
| max_repeat_exceptions_time | Number | Maximum number of seconds between exceptions before they're considered "different" exceptions. If you modify idle_flush_time you should consider this value | No |
| Option | Type | Description | Required? | Default |
| ------ | ---- | ----------- | --------- | ------- |
| driver_path | String | File path to jar file containing your JDBC driver. This is optional, and all JDBC jars may be placed in $LOGSTASH_HOME/vendor/jar/jdbc instead. | No | |
| connection_string | String | JDBC connection URL | Yes | |
| username | String | JDBC username - this is optional as it may be included in the connection string, for many drivers | No | |
| password | String | JDBC password - this is optional as it may be included in the connection string, for many drivers | No | |
| statement | Array | An array of strings representing the SQL statement to run. Index 0 is the SQL statement that is prepared; all other array entries are passed in as parameters (in order). A parameter may either be a property of the event (e.g. "@timestamp", or "host") or a formatted string (e.g. "%{host} - %{message}" or "%{message}"). If a key is passed then it will be automatically converted as required for insertion into SQL. If it's a formatted string then it will be passed in verbatim. | Yes | |
| unsafe_statement | Boolean | If true, the statement is evaluated for event fields - this allows you to use dynamic table names, etc. **This is highly dangerous** and you should **not** use this unless you are 100% sure that the field(s) you are passing in are 100% safe. Failure to do so may result in SQL injection. Please be aware that there is also a potential performance penalty as each event must be evaluated and inserted into SQL one at a time, whereas when this is false multiple events are inserted at once. Example statement: [ "insert into %{table_name_field} (column) values(?)", "fieldname" ] | No | False |
| max_pool_size | Number | Maximum number of connections to open to the SQL server at any 1 time | No | 5 |
| connection_timeout | Number | Number of seconds before a SQL connection is closed | No | 2800 |
| flush_size | Number | Maximum number of entries to buffer before sending to SQL - if this is reached before idle_flush_time | No | 1000 |
| idle_flush_time | Number | Number of idle seconds before sending data to SQL - even if the flush_size has not yet been reached | No | 1 |
| max_flush_exceptions | Number | Number of sequential flushes which cause an exception, before we stop logstash. Set to a value less than 1 if you never want it to stop. This should be carefully configured with relation to idle_flush_time if your SQL instance is not highly available. | No | 0 |
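
The relationship between `idle_flush_time` and `max_flush_exceptions` described in the table can be estimated with a few lines of Ruby. This is only a rough sketch: the helper name `tolerated_outage_seconds` is hypothetical, and it assumes one failed flush per `idle_flush_time` interval, which the plugin does not guarantee.

```ruby
# Hypothetical helper, not part of the plugin.
# Assumption: roughly one failed flush happens per idle_flush_time interval.
def tolerated_outage_seconds(idle_flush_time, max_flush_exceptions)
  # Values below 1 mean the plugin retries forever and never stops logstash.
  return Float::INFINITY if max_flush_exceptions < 1

  idle_flush_time * max_flush_exceptions
end

puts tolerated_outage_seconds(1, 200) # idle_flush_time 1, max_flush_exceptions 200
puts tolerated_outage_seconds(1, 0)   # the default of 0 retries indefinitely
```

With the values used in the plugin's own comment (1 and 200), the estimate is about 200 seconds of database downtime before logstash stops.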

## Example configurations
If you have a working sample configuration for a DB that's not listed, pull requests are welcome.
19 changes: 19 additions & 0 deletions lib/logstash-output-jdbc_ring-buffer.rb
@@ -0,0 +1,19 @@
class RingBuffer < Array
  attr_reader :max_size

  def initialize(max_size, enum = nil)
    @max_size = max_size
    enum.each { |e| self << e } if enum
  end

  def <<(el)
    if @max_size.nil? || self.size < @max_size
      super
    else
      self.shift
      self.push(el)
    end
  end

  alias :push :<<
end
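
To illustrate the eviction behaviour of the ring buffer above, here is a minimal, self-contained sketch. It is a simplified restatement rather than the committed file: it calls `super()` in the constructor and checks for a `nil` size before comparing, both of which are choices made for this example.

```ruby
# Simplified restatement of the RingBuffer idea, for illustration only.
class RingBuffer < Array
  attr_reader :max_size

  def initialize(max_size, enum = nil)
    super()
    @max_size = max_size
    enum.each { |e| self << e } if enum
  end

  # Drop the oldest element once the buffer is full, then append.
  def <<(el)
    shift if @max_size && size >= @max_size
    super
  end

  alias_method :push, :<<
end

buf = RingBuffer.new(3)
[1, 2, 3, 4, 5].each { |n| buf << n }
puts buf.inspect # prints [3, 4, 5]
```

Only the three most recent elements survive, which is what lets the plugin keep a bounded record of recent flush failures.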
48 changes: 22 additions & 26 deletions lib/logstash/outputs/jdbc.rb
@@ -4,6 +4,7 @@
require "stud/buffer"
require "java"
require "logstash-output-jdbc_jars"
require "logstash-output-jdbc_ring-buffer"

class LogStash::Outputs::Jdbc < LogStash::Outputs::Base
# Adds buffer support
@@ -58,17 +59,20 @@ class LogStash::Outputs::Jdbc < LogStash::Outputs::Base
# a timely manner.
#
# If you change this value please ensure that you change
# max_repeat_exceptions_time accordingly.
# max_flush_exceptions accordingly.
config :idle_flush_time, :validate => :number, :default => 1

# Maximum number of repeating (sequential) exceptions, before we stop retrying
# Maximum number of sequential flushes which encounter exceptions, before we stop retrying.
# If set to < 1, then it will infinitely retry.
config :max_repeat_exceptions, :validate => :number, :default => 4
#
# You should carefully tune this in relation to idle_flush_time if your SQL server
# is not highly available.
# e.g. if your idle_flush_time is 1, your max_flush_exceptions is 200, and your SQL server takes
# longer than 200 seconds to reboot, then logstash will stop.
config :max_flush_exceptions, :validate => :number, :default => 0

# The max number of seconds since the last exception, before we consider it
# a different cause.
# This value should be carefully considered in respect to idle_flush_time.
config :max_repeat_exceptions_time, :validate => :number, :default => 30
config :max_repeat_exceptions, :obsolete => "This has been replaced by max_flush_exceptions - which behaves slightly differently. Please check the documentation."
config :max_repeat_exceptions_time, :obsolete => "This is no longer required"

public
def register
@@ -85,17 +89,12 @@ def register
@pool.setMaximumPoolSize(@max_pool_size)
@pool.setConnectionTimeout(@connection_timeout)

@exceptions_tracker = RingBuffer.new(@max_flush_exceptions)

if (@flush_size > 1000)
@logger.warn("JDBC - Flush size is set to > 1000")
end

@repeat_exception_count = 0
@last_exception_time = Time.now

if (@max_repeat_exceptions > 0) and ((@idle_flush_time * @max_repeat_exceptions) > @max_repeat_exceptions_time)
@logger.warn("JDBC - max_repeat_exceptions_time is set such that it may still permit a looping exception. You probably changed idle_flush_time. Consider increasing max_repeat_exceptions_time.")
end

buffer_initialize(
:max_items => @flush_size,
:max_interval => @idle_flush_time,
@@ -119,21 +118,14 @@ def flush(events, teardown=false)
end

def on_flush_error(e)
return if @max_repeat_exceptions < 1
return if @max_flush_exceptions < 1

if @last_exception == e.to_s
@repeat_exception_count += 1
else
@repeat_exception_count = 0
end
@exceptions_tracker << e.class

if (@repeat_exception_count >= @max_repeat_exceptions) and (Time.now - @last_exception_time) < @max_repeat_exceptions_time
@logger.error("JDBC - Exception repeated more than the maximum configured", :exception => e, :max_repeat_exceptions => @max_repeat_exceptions, :max_repeat_exceptions_time => @max_repeat_exceptions_time)
if @exceptions_tracker.reject { |i| i.nil? }.count >= @max_flush_exceptions
@logger.error("JDBC - max_flush_exceptions has been reached")
raise e
end

@last_exception_time = Time.now
@last_exception = e.to_s
end

def teardown
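
The new `on_flush_error` logic in the hunk above can be sketched in isolation as follows. `FlushErrorTracker` is a hypothetical stand-in that uses a plain array as the bounded window; it is not the plugin's class. Note that nothing in the hunks shown here clears the tracker after a successful flush, and this sketch does not add that either.

```ruby
# Hypothetical stand-in for the plugin's error tracking; not the plugin's code.
class FlushErrorTracker
  def initialize(max_flush_exceptions)
    @max = max_flush_exceptions
    @seen = [] # bounded window of exception classes
  end

  # Records a failed flush and re-raises once the window is full.
  def on_flush_error(e)
    return if @max < 1 # values below 1 mean "retry forever"

    @seen.shift if @seen.size >= @max
    @seen << e.class
    raise e if @seen.size >= @max
  end
end

tracker = FlushErrorTracker.new(3)
err = RuntimeError.new("connection refused")
2.times { tracker.on_flush_error(err) } # tolerated
begin
  tracker.on_flush_error(err) # third failure re-raises
rescue RuntimeError => stopped
  puts "stopped after 3 failed flushes: #{stopped.message}"
end
```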
@@ -239,7 +231,11 @@ def add_statement_event_params(statement, event)
when false
statement.setBoolean(idx + 1, false)
else
statement.setString(idx + 1, event.sprintf(i))
if event[i].nil? and i =~ /%\{/
statement.setString(idx + 1, event.sprintf(i))
else
statement.setString(idx + 1, nil)
end
end
end
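
The fallback branch added above can be read as: when the parameter names no event field and looks like a format string, render it; otherwise bind SQL NULL. The sketch below mirrors that decision outside Logstash, assuming the surrounding `case` dispatches on the field value. `render` is a simplified, hypothetical stand-in for `event.sprintf`, and the event is a plain Hash.

```ruby
# Simplified stand-in for Logstash's event.sprintf: replaces %{field} tokens.
def render(template, event)
  template.gsub(/%\{([^}]+)\}/) { event[Regexp.last_match(1)].to_s }
end

# Mirrors the fallback branch: returns the string to bind, or nil for SQL NULL.
def fallback_param(param, event)
  if event[param].nil? && param =~ /%\{/
    render(param, event)
  else
    nil
  end
end

event = { "host" => "web01", "message" => "disk full" }
p fallback_param("%{host} - %{message}", event) # prints "web01 - disk full"
p fallback_param("missing_field", event)        # prints nil
```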

4 changes: 3 additions & 1 deletion logstash-output-jdbc.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'logstash-output-jdbc'
s.version = "0.2.0.rc3"
s.version = "0.2.0.rc4"
s.licenses = [ "Apache License (2.0)" ]
s.summary = "This plugin allows you to output to SQL, via JDBC"
s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
@@ -24,4 +24,6 @@ Gem::Specification.new do |s|
s.add_runtime_dependency "logstash-codec-plain"

s.add_development_dependency "logstash-devutils"

s.post_install_message = "logstash-output-jdbc 0.2.0 introduces several new features - please ensure you check the documentation in the README file"
end
13 changes: 13 additions & 0 deletions spec/outputs/jdbc_spec.rb
@@ -0,0 +1,13 @@
require "logstash/devutils/rspec/spec_helper"
require "logstash/outputs/jdbc"
require "stud/temporary"

describe LogStash::Outputs::Jdbc do

  it "should register without errors" do
    plugin = LogStash::Plugin.lookup("output", "jdbc").new({})
    expect { plugin.register }.to_not raise_error

  end

end
