Occasionally "Operation not permitted" #99

Open
Kain-90 opened this issue Oct 16, 2024 · 4 comments

Comments

@Kain-90

Kain-90 commented Oct 16, 2024

I have read the issue

But my problem is different: with the same parameters, my code execution block sometimes succeeds and sometimes fails with the error "Operation not permitted".

I cannot find any relevant execution logs, and the problem persists after upgrading the version. I have not found a reliable way to reproduce it, but every time it appears it interrupts the workflow execution.

BTW, I'm using a self-hosted deployment, and the version is 0.6.16.

The simplest step for reproduction:

  1. Create the example code execution block
  2. Rapidly click "Run this step" multiple times (about 10 to 30 times)
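
The manual steps above could be approximated with a small script. This is only a sketch: `run_step` is a hypothetical callable standing in for whatever triggers the code execution block in your deployment (for example an HTTP request), and `hammer` is a helper name invented for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def hammer(run_step, attempts=30, workers=10):
    """Call run_step() `attempts` times across `workers` threads and tally outcomes.

    run_step is a hypothetical zero-argument callable that triggers one run
    of the code execution block and raises an exception on failure.
    """
    def one(_):
        try:
            run_step()
            return "ok"
        except Exception as exc:  # record the failure instead of aborting the batch
            return f"{type(exc).__name__}: {exc}"

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(one, range(attempts)))
```

Printing the returned counter shows how many runs succeeded and which error messages occurred, which may help when no execution logs are available.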
@law52525

I have the same problem. It happens intermittently whether I use the newest code or the code of version 0.2.10, and I notice that it is much more likely to appear when there are a lot of concurrent requests.

@18827555809

I have the same problem. It may occur when concurrency is particularly high. I don't know whether it is related to the system kernel.

@18827555809

I have already solved this problem. In concurrent scenarios, a user-mode process may voluntarily release the CPU, which requires the sched_yield system call: when many processes are waiting, the operating system chooses to run high-priority jobs, and low-priority jobs need to yield the CPU. However, we do not grant permission for sched_yield, so whenever a process tries to yield the CPU, an error is reported: bad system call. On my Linux kernel, the syscall number of sched_yield is 24.
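
To illustrate the fix described above, here is a minimal sketch. It assumes the code runner filters system calls through an allowlist of syscall numbers; the allowlist contents and the helper names `blocked_syscalls` and `with_sched_yield` are hypothetical and not taken from the project. The numbers are the x86_64 values; other architectures number syscalls differently.

```python
# x86_64 Linux syscall numbers (other architectures use different values).
SYS_READ = 0
SYS_WRITE = 1
SYS_CLOSE = 3
SYS_SCHED_YIELD = 24
SYS_EXIT_GROUP = 231


def blocked_syscalls(allowlist, needed):
    """Return the syscalls in `needed` that the allowlist would reject."""
    return sorted(set(needed) - set(allowlist))


def with_sched_yield(allowlist):
    """Return a copy of the allowlist that also permits sched_yield."""
    return sorted(set(allowlist) | {SYS_SCHED_YIELD})


# Hypothetical allowlist that omits sched_yield.
base = [SYS_READ, SYS_WRITE, SYS_CLOSE, SYS_EXIT_GROUP]

# A process that writes output and occasionally yields the CPU.
needed = [SYS_WRITE, SYS_SCHED_YIELD]

print(blocked_syscalls(base, needed))                    # sched_yield is rejected
print(blocked_syscalls(with_sched_yield(base), needed))  # nothing is rejected
```

Because a process only yields when the scheduler is under pressure, a missing entry like this would only show up intermittently, which is consistent with the behaviour reported in this thread.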

@Kain-90
Author

Kain-90 commented Dec 20, 2024

It makes sense logically, but in practice, I don't really have any concurrency scenarios since I'm manually clicking the 'start' button, and as you know, it's physically impossible to click multiple times at the exact same moment.
