
Add ability to enable/disable pipelines #7989

Open
fbaligand opened this issue Aug 13, 2017 · 16 comments

Comments

@fbaligand
Contributor

In Logstash 6.0, we can add or remove pipelines.
It would be great to be able to disable a pipeline (without removing it and all its configuration), so that we can re-enable it later.
This is particularly useful for pipelines that we don't want always active, but only on demand for a limited time.
Obviously, this feature would be really great if it could be done using a REST API or something like that, so that it doesn't require a Logstash restart.

@jsvd
Member

jsvd commented Aug 13, 2017

This feature request aside, I must point out that multiple pipelines are compatible with dynamic config reloading. So if Logstash is started with -r, adding or removing an entry in pipelines.yml will add (and start) or remove (and stop) a pipeline.
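For illustration, a pipelines.yml along these lines would behave that way under automatic reloading: commenting an entry out stops and removes that pipeline, uncommenting it adds and starts it again. Pipeline ids and paths here are examples, not taken from this thread.

```yaml
# pipelines.yml — with automatic reloading enabled (-r or
# config.reload.automatic), entries can be toggled by (un)commenting.
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/main/*.conf"

# Commented out: this pipeline is stopped and removed on the next reload.
# - pipeline.id: on-demand
#   path.config: "/etc/logstash/conf.d/on-demand/*.conf"
```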

@fbaligand
Contributor Author

So commenting/uncommenting a pipeline config in "pipelines.yml" would disable/enable that pipeline?

@jsvd
Member

jsvd commented Aug 13, 2017

@fbaligand commenting it out will stop the pipeline and delete it internally; uncommenting will add and start it again.

@fbaligand
Contributor Author

Ok, thank you!

@fbaligand
Contributor Author

However, I maintain the feature request, because I'm very interested in being able to enable/disable (or start/stop) a pipeline using a script command.

@robgil

robgil commented Aug 18, 2018

@jsvd it would be great if this was exposed as an API endpoint to stop pipelines. Otherwise it's a CM tool exercise. The use case is more apparent when you're running a large pool of Logstash hosts and want to temporarily disable a pipeline while doing other maintenance, or want to reduce the number of concurrent pipelines to reduce threads or pressure on ES. Making CM changes usually requires additional approval, and if you're managing CM with a VCS, it equates to a production code change.

I concede this is a convenience feature.

@offsnore

offsnore commented Dec 6, 2018

I would not concede that this is a convenience feature: production systems (by best practice, if not policy) would never a) change a configuration file on the fly or b) rename a file on a server to stop/start a service. Not to mention it's wholly unscalable. So... bump?

@offsnore

offsnore commented Dec 6, 2018

Sweet! I usually say to "believe in the roadmap" and assume something like this is on the way (because why not). I'm not familiar with it, but enable/disable through Kibana, or through the 6.6 ES APIs or such, is exactly what is being requested in this issue. I was replying to the earlier comment, I guess, and these alternatives are what we will try in the meantime. An API or feature on the way is enough for me (and my customer)! 👌 Thanks for the quick reply @fbaligand! Is there a GH issue we can track and link here, so we can perhaps close this one?

@fbaligand
Contributor Author

Sorry, I just removed my previous comment as it is totally out of context.
I misunderstood your previous comment.

And so, no, to my knowledge, there is no Kibana or Elasticsearch feature that solves this issue.
Sorry.

@rozling

rozling commented Apr 22, 2019

For us, Pipelines are proving useful for isolating logs based on certain conditions, performing actions and then outputting them in a certain way, while keeping an eye on the monitoring stats.

A pipeline in the real world would have the ability to stop the flow upstream in order to carry out some maintenance. In Elasticsearch, when performing tasks like e.g. mapping updates, it's useful to stop ingestion long enough for the changes to be made, so new indices don't automatically get created.

While we can fiddle with commenting out lines in YAML files, that feels inconsistent with the RESTful way of doing things that the Elastic stack is known for, and as mentioned above it doesn't scale.

FWIW I have the same gripe with vanilla Logstash (i.e. the main pipeline); stopping the entire service when you need to make a mapping update always feels like using a sledgehammer to crack a nut. Maybe I'm doing it wrong, but at least afaik live config reloading isn't suitable for this.

It's especially relevant now, as we have a separate pipeline for migrating to ECS, and we don't want it to interfere with ingestion on the main pipeline.

@fbaligand
Contributor Author

+1000 for @rozling comment!
In my case, I have some pipelines that I activate only for a few hours and then need to disable.
I would love a rest api to enable/disable a pipeline!

@robcowart

robcowart commented Aug 20, 2019

I just came across this issue while trying to determine if this was possible. The use-case is running Logstash and associated configs in a docker container. The container could have up to 35 pipelines enabled. However not every instance of the container needs all of the pipelines (depending on the environment and scale requirements).

If a config option such as pipeline.enable were available, it could be used together with environment variables (another needed feature, requested here) like this...

- pipeline.id: barracuda
  pipeline.enable: ${LS_BARRACUDA_ENABLE}
  pipeline.workers: ${LS_BARRACUDA_WORKERS}
  path.config: "/etc/logstash/pipelines/barracuda/*.conf"

- pipeline.id: forcepoint
  pipeline.enable: ${LS_FORCEPOINT_ENABLE}
  pipeline.workers: ${LS_FORCEPOINT_WORKERS}
  path.config: "/etc/logstash/pipelines/forcepoint/*.conf"

The docker-compose.yml file which starts the container would then be able to control which pipelines are started by defining environment variables. For example...

environment:
  LS_BARRACUDA_ENABLE: false
  LS_BARRACUDA_WORKERS: 2

  LS_FORCEPOINT_ENABLE: true
  LS_FORCEPOINT_WORKERS: 4

This would eliminate the need to produce a container for every possible combination of pipelines.

@PieroValdebenito

PieroValdebenito commented Feb 16, 2021

My workaround was setting up a directory structure with enabled and disabled configuration files (also for pre/prod environments), then using pipelines.yml to point to the enabled or disabled directory depending on an environment variable.


conf/pipelines.yml file

- pipeline.id: bulk-db
  path.config: "bulk/${BULK_INTEGRATION:disabled}/db_to_destiny.conf"
  pipeline.workers: 1

I know it's dirty, but it works.

Notes:

  • The disabled directory has the same file db_to_destiny.conf, but empty.
  • Alternatively, you can use a suffix in pipelines.yml (db_to_destiny.${BULK_INTEGRATION:disabled}.conf) and two files: one for the enabled config and another for the disabled config.
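The screenshots of this layout didn't survive, but based on the pipelines.yml snippet above it might look something like the following sketch (directory and file names are reconstructed, so treat them as an assumption):

```
conf/
├── pipelines.yml
└── bulk/
    ├── enabled/
    │   └── db_to_destiny.conf    # the real pipeline config
    └── disabled/
        └── db_to_destiny.conf    # same file name, but empty
```

Switching is then a matter of exporting BULK_INTEGRATION=enabled (or leaving it unset to fall back to disabled) before starting Logstash.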


@fbaligand
Contributor Author

Hi,

Well, if you use an environment variable to define whether a pipeline is enabled or disabled, this can't be applied hot, without restarting Logstash.

Hot pipeline enable/disable is the feature that I expect in this issue.

@SelAnt

SelAnt commented Apr 20, 2022

Here is a possible solution to dynamically enable/disable any component of a pipeline using built-in Ruby (and an extra pipeline with an HTTP input which acts as a REST API).

In my case there is a main pipeline 'beatsin' which receives events (from Filebeat or others). This pipeline constantly ships events to ES (http://es01:9200) and, optionally, can ship events to another consumer (ES http://es02:9200 in our case).

There is another pipeline 'sandboxswitch' which listens on port 5081 and accepts REST calls with a JSON payload containing a single field 'sandbox_enable' (supported values: 0 and 1).

To enable the optional pipeline:

curl -H "content-type: application/json" -XPUT 'http://127.0.1:5081' -d '{"sandbox_enable":1}'

To disable the optional pipeline:

curl -H "content-type: application/json" -XPUT 'http://127.0.1:5081' -d '{"sandbox_enable":0}'

The result of the latest call to 'sandboxswitch' is stored in the '/usr/share/logstash/data' folder, to restore the latest status after a Logstash restart.

pipelines.yml snippet:

- pipeline.id: beatsin
  queue.type: persisted
  config.string:  |-
    input {
      beats {
        port=>5044
      }
    } 
    filter{
      ruby {
        init => "$sandbox_on = File.exist?('/usr/share/logstash/data/sandbox_off') ? 0 : 1"
        code => 'event.set("sandbox_on", $sandbox_on)'
      }
    }
    output {
      elasticsearch {
        hosts => "http://es01:9200"
        pipeline => "all_logs"
        user=>"elastic"
        password => "iloveelastic"
      }
      if [sandbox_on] and [sandbox_on] == 1 {
        elasticsearch {
          hosts => "http://es02:9200"
          pipeline => "all_logs"
          user=>"elastic"
          password => "iloveelastic"
        }
      }
    }

- pipeline.id: sandboxswitch
  pipeline.workers: 1
  config.string:  |-
    input {
      http {
        host => "0.0.0.0"
        id => "5081"
        port => 5081
      }
    }
    filter{
      ruby {
        code => "if $sandbox_on then
                     new_sandbox_on=event.get('sandbox_enable');
                     if new_sandbox_on then
                         $sandbox_on = new_sandbox_on==1 ? 1 : 0;
                         if File.exist?('/usr/share/logstash/data/sandbox_off') then
                             if $sandbox_on==1 then
                                 File.delete('/usr/share/logstash/data/sandbox_off')
                             end;
                         else
                             if $sandbox_on!=1 then
                                 File.new('/usr/share/logstash/data/sandbox_off','w').close;
                             end;
                         end;
                     end;
                     event.set('sandbox_on', $sandbox_on);
                 end;"
      }
    }
    output {
       stdout {
          codec => "rubydebug"
      }
    }

Known limitation: if the pipeline is already blocked (ES http://es01:9200 is not available), this solution won't unblock already-blocked events; they will wait for http://es01:9200 to come back before being shipped.

@techzilla

techzilla commented Nov 13, 2024

This feature would be an extremely valuable quality-of-life improvement: when working in a Fortune 500 style enterprise environment, config changes are much more restricted than API calls.

I concur with the proposed API,

To enable the optional pipeline:

curl -H "content-type: application/json" -XPUT 'http://127.0.1:5081' -d '{"sandbox_enable":1}'

To disable the optional pipeline:

curl -H "content-type: application/json" -XPUT 'http://127.0.1:5081' -d '{"sandbox_enable":0}'

Disabling via this mechanism would mark the pipeline as inactive; enabling would mark it as active again.

Loading pipelines and enabling them should not be exclusively combined, as they have different intents that just usually coincide.
The default behavior of auto-reload should be to enable a pipeline on first load (if not specified in the pipeline config), to keep it disabled if it was already marked disabled via the API, but to continue applying config changes so that the pipeline picks them up when re-enabled.
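To make those proposed semantics concrete, here is a minimal sketch (not Logstash code; all names are illustrative) of a registry where loading a config and enabling a pipeline are tracked separately, so an API-set disabled flag survives config reloads:

```python
class PipelineRegistry:
    """Illustrative model of the proposed load/enable separation."""

    def __init__(self):
        self.configs = {}   # pipeline id -> latest config
        self.enabled = {}   # pipeline id -> bool, controlled via the API

    def reload(self, pipeline_id, config):
        """Apply a config change picked up by auto-reload."""
        first_load = pipeline_id not in self.configs
        self.configs[pipeline_id] = config  # config always updates
        if first_load:
            # enabled by default on first load
            self.enabled.setdefault(pipeline_id, True)
        # an existing API-set disabled flag survives the reload

    def set_enabled(self, pipeline_id, value):
        """What the hypothetical REST endpoint would call."""
        self.enabled[pipeline_id] = value

    def is_running(self, pipeline_id):
        return self.enabled.get(pipeline_id, False)
```

For example, disabling a pipeline via the API and then editing its config leaves it stopped, but with the new config ready for when it is re-enabled.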
