diff --git a/README.md b/README.md
index 94e8882..76978c1 100644
--- a/README.md
+++ b/README.md
@@ -12,8 +12,9 @@ This plugin does not bundle any JDBC jar files, and does expect them to be in a
 particular location. Please ensure you read the 4 installation lines below.
 
 ## Headlines
- - Support for connection pooling added in 0.2.0 [unreleased until #21 is resolved]
- - Support for unsafe statement handling (allowing dynamic queries) in 0.2.0 [unreleased until #21 is resolved]
+ - Support for connection pooling added in 0.2.0 [unreleased until #10 is resolved]
+ - Support for unsafe statement handling (allowing dynamic queries) in 0.2.0 [unreleased until #10 is resolved]
+ - Altered exception handling to now count sequential flushes with exceptions thrown in 0.2.0 [untested and unreleased until #10 is resolved]
 
 ## Versions
  - See master branch for logstash v2+
@@ -31,20 +32,19 @@ particular location. Please ensure you read the 4 installation lines below.
 
 ## Configuration options
 
-| Option | Type | Description | Required? |
-| ------ | ---- | ----------- | --------- |
-| driver_path | String | File path to jar file containing your JDBC driver. This is optional, and all JDBC jars may be placed in $LOGSTASH_HOME/vendor/jar/jdbc instead. | No |
-| connection_string | String | JDBC connection URL | Yes |
-| username | String | JDBC username - this is optional as it may be included in the connection string, for many drivers | No |
-| password | String | JDBC password - this is optional as it may be included in the connection string, for many drivers | No |
-| statement | Array | An array of strings representing the SQL statement to run. Index 0 is the SQL statement that is prepared, all other array entries are passed in as parameters (in order). A parameter may either be a property of the event (i.e. "@timestamp", or "host") or a formatted string (i.e. "%{host} - %{message}" or "%{message}"). If a key is passed then it will be automatically converted as required for insertion into SQL. If it's a formatted string then it will be passed in verbatim. | Yes |
-| unsafe_statement | Boolean | If yes, the statement is evaluated for event fields - this allows you to use dynamic table names, etc. **This is highly dangerous** and you should **not** use this unless you are 100% sure that the field(s) you are passing in are 100% safe. Failure to do so will result in possible SQL injections. Please be aware that there is also a potential performance penalty as each event must be evaluated and inserted into SQL one at a time, where as when this is false multiple events are inserted at once. Example statement: [ "insert into %{table_name_field} (column) values(?)", "fieldname" ] | No |
-| max_pool_size | Number | Maximum number of connections to open to the SQL server at any 1 time | No |
-| connection_timeout | Number | Number of seconds before a SQL connection is closed | No |
-| flush_size | Number | Maximum number of entries to buffer before sending to SQL - if this is reached before idle_flush_time | No |
-| idle_flush_time | Number | Number of idle seconds before sending data to SQL - even if the flush_size has not yet been reached | No |
-| max_repeat_exceptions | Number | Number of times the same exception can repeat before we stop logstash. Set to a value less than 1 if you never want it to stop | No |
-| max_repeat_exceptions_time | Number | Maxium number of seconds between exceptions before they're considered "different" exceptions. If you modify idle_flush_time you should consider this value | No |
+| Option | Type | Description | Required? | Default |
+| ------ | ---- | ----------- | --------- | ------- |
+| driver_path | String | File path to jar file containing your JDBC driver. This is optional, and all JDBC jars may be placed in $LOGSTASH_HOME/vendor/jar/jdbc instead. | No | |
+| connection_string | String | JDBC connection URL | Yes | |
+| username | String | JDBC username - this is optional as it may be included in the connection string, for many drivers | No | |
+| password | String | JDBC password - this is optional as it may be included in the connection string, for many drivers | No | |
+| statement | Array | An array of strings representing the SQL statement to run. Index 0 is the SQL statement that is prepared, all other array entries are passed in as parameters (in order). A parameter may either be a property of the event (i.e. "@timestamp", or "host") or a formatted string (i.e. "%{host} - %{message}" or "%{message}"). If a key is passed then it will be automatically converted as required for insertion into SQL. If it's a formatted string then it will be passed in verbatim. | Yes | |
+| unsafe_statement | Boolean | If yes, the statement is evaluated for event fields - this allows you to use dynamic table names, etc. **This is highly dangerous** and you should **not** use this unless you are 100% sure that the field(s) you are passing in are 100% safe. Failure to do so will result in possible SQL injections. Please be aware that there is also a potential performance penalty as each event must be evaluated and inserted into SQL one at a time, where as when this is false multiple events are inserted at once. Example statement: [ "insert into %{table_name_field} (column) values(?)", "fieldname" ] | No | False |
+| max_pool_size | Number | Maximum number of connections to open to the SQL server at any 1 time | No | 5 |
+| connection_timeout | Number | Number of seconds before a SQL connection is closed | No | 2800 |
+| flush_size | Number | Maximum number of entries to buffer before sending to SQL - if this is reached before idle_flush_time | No | 1000 |
+| idle_flush_time | Number | Number of idle seconds before sending data to SQL - even if the flush_size has not yet been reached | No | 1 |
+| max_flush_exceptions | Number | Number of sequential flushes which cause an exception, before we stop logstash. Set to a value less than 1 if you never want it to stop. This should be carefully configured with relation to idle_flush_time if your SQL instance is not highly available. | No | 0 |
 
 ## Example configurations
 If you have a working sample configuration, for a DB thats not listed, pull requests are welcome.
diff --git a/lib/logstash-output-jdbc_ring-buffer.rb b/lib/logstash-output-jdbc_ring-buffer.rb
new file mode 100644
index 0000000..70075be
--- /dev/null
+++ b/lib/logstash-output-jdbc_ring-buffer.rb
@@ -0,0 +1,19 @@
+class RingBuffer < Array
+  attr_reader :max_size
+
+  def initialize(max_size, enum = nil)
+    @max_size = max_size
+    enum.each { |e| self << e } if enum
+  end
+
+  def <<(el)
+    if self.size < @max_size || @max_size.nil?
+      super
+    else
+      self.shift
+      self.push(el)
+    end
+  end
+
+  alias :push :<<
+end
diff --git a/lib/logstash/outputs/jdbc.rb b/lib/logstash/outputs/jdbc.rb
index 58c29c1..96d2220 100644
--- a/lib/logstash/outputs/jdbc.rb
+++ b/lib/logstash/outputs/jdbc.rb
@@ -4,6 +4,7 @@
 require "stud/buffer"
 require "java"
 require "logstash-output-jdbc_jars"
+require "logstash-output-jdbc_ring-buffer"
 
 class LogStash::Outputs::Jdbc < LogStash::Outputs::Base
   # Adds buffer support
@@ -58,17 +59,20 @@ class LogStash::Outputs::Jdbc < LogStash::Outputs::Base
   # a timely manner.
   #
   # If you change this value please ensure that you change
-  # max_repeat_exceptions_time accordingly.
+  # max_flush_exceptions accordingly.
   config :idle_flush_time, :validate => :number, :default => 1
 
-  # Maximum number of repeating (sequential) exceptions, before we stop retrying
+  # Maximum number of sequential flushes which encounter exceptions, before we stop retrying.
   # If set to < 1, then it will infinitely retry.
-  config :max_repeat_exceptions, :validate => :number, :default => 4
+  #
+  # You should carefully tune this in relation to idle_flush_time if your SQL server
+  # is not highly available.
+  # i.e. If your idle_flush_time is 1, and your max_flush_exceptions is 200, and your SQL server takes
+  # longer than 200 seconds to reboot, then logstash will stop.
+  config :max_flush_exceptions, :validate => :number, :default => 0
 
-  # The max number of seconds since the last exception, before we consider it
-  # a different cause.
-  # This value should be carefully considered in respect to idle_flush_time.
-  config :max_repeat_exceptions_time, :validate => :number, :default => 30
+  config :max_repeat_exceptions, :obsolete => "This has been replaced by max_flush_exceptions - which behaves slightly differently. Please check the documentation."
+  config :max_repeat_exceptions_time, :obsolete => "This is no longer required"
 
   public
   def register
@@ -85,17 +89,12 @@ def register
     @pool.setMaximumPoolSize(@max_pool_size)
     @pool.setConnectionTimeout(@connection_timeout)
 
+    @exceptions_tracker = RingBuffer.new(@max_flush_exceptions)
+
     if (@flush_size > 1000)
       @logger.warn("JDBC - Flush size is set to > 1000")
     end
 
-    @repeat_exception_count = 0
-    @last_exception_time = Time.now
-
-    if (@max_repeat_exceptions > 0) and ((@idle_flush_time * @max_repeat_exceptions) > @max_repeat_exceptions_time)
-      @logger.warn("JDBC - max_repeat_exceptions_time is set such that it may still permit a looping exception. You probably changed idle_flush_time. Considering increasing max_repeat_exceptions_time.")
-    end
-
     buffer_initialize(
       :max_items => @flush_size,
       :max_interval => @idle_flush_time,
@@ -119,21 +118,14 @@ def flush(events, teardown=false)
   end
 
   def on_flush_error(e)
-    return if @max_repeat_exceptions < 1
+    return if @max_flush_exceptions < 1
 
-    if @last_exception == e.to_s
-      @repeat_exception_count += 1
-    else
-      @repeat_exception_count = 0
-    end
+    @exceptions_tracker << e.class
 
-    if (@repeat_exception_count >= @max_repeat_exceptions) and (Time.now - @last_exception_time) < @max_repeat_exceptions_time
-      @logger.error("JDBC - Exception repeated more than the maximum configured", :exception => e, :max_repeat_exceptions => @max_repeat_exceptions, :max_repeat_exceptions_time => @max_repeat_exceptions_time)
+    if @exceptions_tracker.reject { |i| i.nil? }.count >= @max_flush_exceptions
+      @logger.error("JDBC - max_flush_exceptions has been reached")
       raise e
     end
-
-    @last_exception_time = Time.now
-    @last_exception = e.to_s
   end
 
   def teardown
@@ -239,7 +231,11 @@ def add_statement_event_params(statement, event)
       when false
         statement.setBoolean(idx + 1, false)
       else
-        statement.setString(idx + 1, event.sprintf(i))
+        if event[i].nil? and i =~ /%\{/
+          statement.setString(idx + 1, event.sprintf(i))
+        else
+          statement.setString(idx + 1, nil)
+        end
       end
     end
 
diff --git a/logstash-output-jdbc.gemspec b/logstash-output-jdbc.gemspec
index 9867725..fe1b20a 100644
--- a/logstash-output-jdbc.gemspec
+++ b/logstash-output-jdbc.gemspec
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-output-jdbc'
-  s.version = "0.2.0.rc3"
+  s.version = "0.2.0.rc4"
   s.licenses = [ "Apache License (2.0)" ]
   s.summary = "This plugin allows you to output to SQL, via JDBC"
   s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
@@ -24,4 +24,6 @@ Gem::Specification.new do |s|
   s.add_runtime_dependency "logstash-codec-plain"
 
   s.add_development_dependency "logstash-devutils"
+
+  s.post_install_message = "logstash-output-jdbc 0.2.0 introduces several new features - please ensure you check the documentation in the README file"
 end
diff --git a/spec/outputs/jdbc_spec.rb b/spec/outputs/jdbc_spec.rb
new file mode 100644
index 0000000..3e648ba
--- /dev/null
+++ b/spec/outputs/jdbc_spec.rb
@@ -0,0 +1,13 @@
+require "logstash/devutils/rspec/spec_helper"
+require "logstash/outputs/jdbc"
+require "stud/temporary"
+
+describe LogStash::Outputs::Jdbc do
+
+  it "should register without errors" do
+    plugin = LogStash::Plugin.lookup("output", "jdbc").new({})
+    expect { plugin.register }.to_not raise_error
+
+  end
+
+end
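
Review note: the flush-exception counting introduced in this diff can be tried outside Logstash with a small standalone sketch. It is not the plugin's code. `BoundedBuffer` and `FlushErrorTracker` are hypothetical names used only here, the buffer uses composition rather than subclassing `Array`, and the nil check is evaluated before the size comparison on the assumption that comparing an Integer with `nil` would raise.

```ruby
# Hypothetical stand-in for the RingBuffer added in this diff (not the plugin's class).
# Keeps at most max_size items, dropping the oldest one when full.
class BoundedBuffer
  def initialize(max_size)
    @max_size = max_size
    @items = []
  end

  def <<(item)
    # Test for nil first so an unbounded buffer never compares a size with nil.
    @items.shift if !@max_size.nil? && @items.size >= @max_size
    @items << item
    self
  end

  def size
    @items.size
  end

  def to_a
    @items.dup
  end
end

# Hypothetical tracker mirroring the on_flush_error logic shown in the diff:
# record the class of each flush error and report when the limit is reached.
class FlushErrorTracker
  def initialize(max_flush_exceptions)
    @max_flush_exceptions = max_flush_exceptions
    @seen = BoundedBuffer.new(max_flush_exceptions)
  end

  # Returns true once max_flush_exceptions failed flushes have been recorded.
  # A limit below 1 disables the check, as in the diff.
  def record(error)
    return false if @max_flush_exceptions < 1
    @seen << error.class
    @seen.size >= @max_flush_exceptions
  end
end

tracker = FlushErrorTracker.new(3)
results = 3.times.map { tracker.record(RuntimeError.new("db down")) }
puts results.inspect # prints [false, false, true]
```

As in the diff, nothing in this sketch clears the recorded errors after a successful flush, so the count covers failed flushes in total rather than strictly consecutive ones. Whether that matches the intent of "sequential" in the README wording is worth confirming with the author.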