Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Console] fix console deinit: add ability to unblock linenoise, to fix deinit deadlock (IDFGH-8529) #9983

Conversation

chipweinberger
Copy link
Contributor

@chipweinberger chipweinberger commented Oct 15, 2022

EDIT: NOT MERGING THIS PR, but still needs to be fixed. new approach to fix deinit is needed. see comments.

This solves: #9974, and #9877

Main Changes (to solve the deadlock):

  • Implements: usb_serial_jtag_unblock_reads(); & uart_unblock_reads(); which causes any pending read() to return -1 with EWOULDBLOCK.
  • Implement vRingbufferUnblockRx which is what the above functions are wait on
  • callingesp_console_repl_X_delete() unblocks and safely exits the linenoise thread, before we stop the console. This lets the linenoise thread let go of the _readLock, so that we can deinit.
  • linenoise.c, read operations now return early if EWOULDBLOCK is returned by the read operation. This is necessary for linenoise to exit when we deinit.

The crux of the deadlock fix is here:

esp_console_replc.

static esp_err_t esp_console_repl_usb_serial_jtag_delete(esp_console_repl_t *repl) {
    ....
    // wait for repl thread to stop, if it is running
    if (repl_com->state == CONSOLE_REPL_STATE_START) {
        repl_com->state = CONSOLE_REPL_STATE_REPL_STOP_REQUESTED;
        while (repl_com->state != CONSOLE_REPL_STATE_REPL_TASK_ENDED) {
            usb_serial_jtag_unblock_reads();
            vTaskDelay(20 / portTICK_PERIOD_MS);
        }
    } 
    ...
}

@chipweinberger chipweinberger changed the title [Console] add ability to unblock [Console] add ability to unblock linenoise, to fix deadlock Oct 15, 2022
@espressif-bot espressif-bot added the Status: Opened Issue is new label Oct 15, 2022
@github-actions github-actions bot changed the title [Console] add ability to unblock linenoise, to fix deadlock [Console] add ability to unblock linenoise, to fix deadlock (IDFGH-8529) Oct 15, 2022
@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch 8 times, most recently from 84de283 to 85950d0 Compare October 15, 2022 08:23
@chipweinberger chipweinberger changed the title [Console] add ability to unblock linenoise, to fix deadlock (IDFGH-8529) [Console] fix console deinit: add ability to unblock linenoise, to fix deadlock (IDFGH-8529) Oct 16, 2022
@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch 4 times, most recently from e95ba5f to 7a2067c Compare October 18, 2022 06:20
@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch 6 times, most recently from e1aea77 to 7f5bdf8 Compare October 18, 2022 22:03
@CLAassistant
Copy link

CLAassistant commented Oct 18, 2022

CLA assistant check
All committers have signed the CLA.

@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch 3 times, most recently from fcd7413 to 12ed681 Compare October 18, 2022 22:28
@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch 4 times, most recently from 5885b24 to eac9d32 Compare October 18, 2022 23:00
Copy link
Member

@igrr igrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed all the changes, leaving just a few comments for now.

*
* @param[in] xRingbuffer The ring buffer who's rx will be unblocked.
*/
void vRingbufferUnblockRx(RingbufHandle_t xRingbuffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use vTaskAbortDelay instead of introducing this function?

If usb_serial_jtag.c starts using something other than a ringbuffer internally (e.g. a StreamBuffer), we would have to add similar API for unblocking there; and given that StreamBuffer is upstream FreeRTOS code we would most likely want to avoid that.

An alternative is to change the implementation in linenoise, using a select for reads with an extra eventfd file descriptor for unblocking them. However that also has downsides:

  • Will start linking select and the related support code even into simple applications, the code size might be affected (need to check)
  • Will need to implement select support in usb_serial_jtag and usb_cdcacm VFS drivers.

Copy link
Contributor Author

@chipweinberger chipweinberger Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vTaskAbortDelay is an...interesting idea. First time I'm seeing that function.

Yes I considered select too and avoided it due to complexity. Half of it the fear of me implementing it wrong ;P

I think vTaskAbortDelay sounds like a reasonable simple solution. ESP-IDF is yours to decide 😊

Personally, I find vRingbufferUnblockRx more explicit, targeted, and clean, but understand your concern.

Perhaps we should just convert to vTaskAbortDelay if we ever switch to StreamBuffer?

Edit: There is no guarantee where the Console Task is blocked. Realistically, it's probably blocked on a read, but indiscriminately unblocking tasks is usually a good way to introduce weird behavior. It could be blocked anywhere in user code for instance, and they might not have expected these aborts to happen. Users might have to rewrite a lot of their code to support this behavior. Too complex, IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could probably introduce more code to determine where the console thread is blocked, before calling vTaskAbortDelay. I would need to evaluate that further.

Edit: we could use a flag in linenoise. Something like linenoiseIsWaitingForNextCommand(). I'm not a huge fan of that, versus the other method, but it would work. Let me know what you think.

void xxx_console_deinit(){
   ....
  while(linenoiseIsWaitingForNextCommand()){
        vTaskAbortDelay(consoleThread);
   }
   ....
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this again, select on a pair of file descriptors (stdin and eventfd) is probably the cleanest solution.

This way, we don't end up having to introduce, test and support vRingbufferUnblockRx. Also this solves the problem that blocking might happen elsewhere in the code (e.g. in the console command handler).

I can give some pointers about a select based implementation, if you decide that you can work on that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about vTaskAbortDelay with linenoiseIsWaitingForNextCommand?

I likely don't have time to implement select 😅

* @note application must configure USB-SERIAL-JTAG driver before calling these functions
* With this function, read is blocking and interrupt-driven.
*/
void esp_vfs_usb_serial_jtag_use_driver_for_rx(void);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing this new function, would it instead make more sense to fix the TX-blocking-indefinitely behavior?

This is already done in usb_serial_jtag_tx_char with the TX_FLUSH_TIMEOUT_US check, we exit and drop the character if the flush doesn't happen for too long.

Seems like the same behavior could be achieved in usbjtag_tx_char_via_driver, by passing pdMS_TO_TICKS(TX_FLUSH_TIMEOUT_US / 1000) instead of portMAX_DELAY?

Copy link
Contributor Author

@chipweinberger chipweinberger Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine by me, but I thought blocking forever was the reason the driver exists? If It doesn't block, why use the driver? I suppose it would still add some additional buffering...

If we replace indefinite blocking TX from the driver, it should probably be a configurable delay. Some people might want to block TX, I would think, in order to send large files over serial robustly for example.

void esp_vfs_usb_serial_jtag_set_driver_tx_timeout(int32_t ms), where <0 would be infinite.

I'm not sure if the USB specification guarantees the host will read from your device every X ms. I'd need to look. Also your OS's could be slow to initialize their USB driver, maybe? So, yes, configurable delay seems best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we start using select, then this change will not be necessary. "Blocking forever" will be an okay behavior for the driver, and the timeout will be handled by select.

Copy link
Contributor Author

@chipweinberger chipweinberger Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm don't understand. Where would the select statement be added?

This function is so that ESP_LOGI, printf, etc, do not block. So TX can remain in non-driver mode.

We can continue discussion in a new PR for this change: #10099


3. The behavior between an actual USB-to-serial bridge chip and the USB Serial/JTAG Controller is slightly different if the ESP-IDF application does not listen for incoming bytes. An USB-to-serial bridge chip will just send the bytes to a (not listening) chip, while the USB Serial/JTAG Controller will block until the application reads the bytes. This can lead to a non-responsive looking terminal program.
3. (Prior to ESP-IDF v5.0) The behavior between an actual USB-to-serial bridge chip and the USB Serial/JTAG Controller is slightly different if the ESP-IDF application does not listen for incoming bytes. An USB-to-serial bridge chip will just send the bytes to a (not listening) chip, while the USB Serial/JTAG Controller will block until the application reads the bytes. This can lead to a non-responsive looking terminal program.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simply delete the line; the comment will still be there in the versions of the docs which are affected by the limitation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fancy!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this change into its own PR: #10099

@@ -1084,8 +1099,10 @@ static int linenoiseRaw(char *buf, size_t buflen, const char *prompt) {
}

count = linenoiseEdit(buf, buflen, prompt);
fputc('\n', stdout);
flushWrite();
if (count != -1){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick, space before opening brace

@@ -1095,6 +1112,8 @@ static int linenoiseDumb(char* buf, size_t buflen, const char* prompt) {
size_t count = 0;
while (count < buflen) {
int c = fgetc(stdin);
if (c == -1 && errno == EWOULDBLOCK) {return -1;}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, please move the return into its own line.

(Similar comment for a few lines of diff above. Yes, some of the existing linenoise code uses this style but we would prefer to avoid it for newly written code.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.

* This can be used as a simple way to determine that a serial
* console has definitely been attached at some point
*/
bool usb_serial_jtag_has_ever_received_bytes_since_install();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is defined but not used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I use this in my code! =)

I need a way to detect if a USB terminal is attached, so that I can decide which port to open my terminal on. If the user types a character over USB during boot, I open USB Console, otherwise I open UART Console.

(I also send a 'terminal probe request' over USB Serial during boot, so detection is automatic for most terminals 😊)

It feels a little 'hacky', but also reliable and simple. If you can think of a simpler, more reliable way, let me know.

I like that this solution is less complex than calling read() or initializing linenoise or esp_console.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I use this in my code! =)

That's perfectly valid, but please keep in mind that an API which doesn't have any usages or tests within IDF repository is very likely to get broken or generally unmaintained.

The use case does make sense though; I would suggest splitting it into a separate PR. There we can discuss how to add a test or an example which would use this new API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was considering a separate PR myself. I'll make one!

Copy link
Contributor Author

@chipweinberger chipweinberger Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new PR #10097

@chipweinberger
Copy link
Contributor Author

@igrr hope you have time to respond to the new comments at some point.

@igrr
Copy link
Member

igrr commented Nov 2, 2022

@chipweinberger I'm sorry, could you point which comment you mean? Did I miss something in this PR?

@@ -221,8 +221,8 @@ static void flushWrite(void) {
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read operations now return early with an error if EWOULDBLOCK is returned by the read operation.

@@ -59,20 +59,18 @@ There are several limitations to the USB console feature. These may or may not b

1. If the application accidentally reconfigures the USB peripheral pins, or disables the USB Serial/JTAG Controller, the device will disappear from the system. After fixing the issue in the application, you will need to manually put the {IDF_TARGET_NAME} into download mode by pulling low {IDF_TARGET_BOOT_PIN} and resetting the chip.

2. If the application enters deep sleep mode, USB CDC device will disappear from the system.
2. If the application enters deep sleep mode, the USB Serial/JTAG device will disappear from the system.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throughout this document, change USB CDC -> USB Serial/JTAG, as to not confuse it with the other USB CDC port in the OTG peripheral.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this change into its own PR: #10099

repl_com->state = CONSOLE_REPL_STATE_REPL_STOP_REQUESTED;
while (repl_com->state != CONSOLE_REPL_STATE_REPL_TASK_ENDED) {
usb_serial_jtag_unblock_reads();
vTaskDelay(20 / portTICK_PERIOD_MS);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should switch this to vTaskDelay(1). Obviously a more complex change could replace polling.

*
* @param[in] xRingbuffer The ring buffer who's rx will be unblocked.
*/
void vRingbufferUnblockRx(RingbufHandle_t xRingbuffer);
Copy link
Contributor Author

@chipweinberger chipweinberger Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vTaskAbortDelay is an...interesting idea. First time I'm seeing that function.

Yes I considered select too and avoided it due to complexity. Half of it the fear of me implementing it wrong ;P

I think vTaskAbortDelay sounds like a reasonable simple solution. ESP-IDF is yours to decide 😊

Personally, I find vRingbufferUnblockRx more explicit, targeted, and clean, but understand your concern.

Perhaps we should just convert to vTaskAbortDelay if we ever switch to StreamBuffer?

Edit: There is no guarantee where the Console Task is blocked. Realistically, it's probably blocked on a read, but indiscriminately unblocking tasks is usually a good way to introduce weird behavior. It could be blocked anywhere in user code for instance, and they might not have expected these aborts to happen. Users might have to rewrite a lot of their code to support this behavior. Too complex, IMO.

* @note application must configure USB-SERIAL-JTAG driver before calling these functions
* With this function, read is blocking and interrupt-driven.
*/
void esp_vfs_usb_serial_jtag_use_driver_for_rx(void);
Copy link
Contributor Author

@chipweinberger chipweinberger Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine by me, but I thought blocking forever was the reason the driver exists? If It doesn't block, why use the driver? I suppose it would still add some additional buffering...

If we replace indefinite blocking TX from the driver, it should probably be a configurable delay. Some people might want to block TX, I would think, in order to send large files over serial robustly for example.

void esp_vfs_usb_serial_jtag_set_driver_tx_timeout(int32_t ms), where <0 would be infinite.

I'm not sure if the USB specification guarantees the host will read from your device every X ms. I'd need to look. Also your OS's could be slow to initialize their USB driver, maybe? So, yes, configurable delay seems best.


3. The behavior between an actual USB-to-serial bridge chip and the USB Serial/JTAG Controller is slightly different if the ESP-IDF application does not listen for incoming bytes. An USB-to-serial bridge chip will just send the bytes to a (not listening) chip, while the USB Serial/JTAG Controller will block until the application reads the bytes. This can lead to a non-responsive looking terminal program.
3. (Prior to ESP-IDF v5.0) The behavior between an actual USB-to-serial bridge chip and the USB Serial/JTAG Controller is slightly different if the ESP-IDF application does not listen for incoming bytes. An USB-to-serial bridge chip will just send the bytes to a (not listening) chip, while the USB Serial/JTAG Controller will block until the application reads the bytes. This can lead to a non-responsive looking terminal program.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fancy!

@@ -1095,6 +1112,8 @@ static int linenoiseDumb(char* buf, size_t buflen, const char* prompt) {
size_t count = 0;
while (count < buflen) {
int c = fgetc(stdin);
if (c == -1 && errno == EWOULDBLOCK) {return -1;}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense.

* This can be used as a simple way to determine that a serial
* console has definitely been attached at some point
*/
bool usb_serial_jtag_has_ever_received_bytes_since_install();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. I use this in my code! =)

I need a way to detect if a USB terminal is attached, so that I can decide which port to open my terminal on. If the user types a character over USB during boot, I open USB Console, otherwise I open UART Console.

(I also send a 'terminal probe request' over USB Serial during boot, so detection is automatic for most terminals 😊)

It feels a little 'hacky', but also reliable and simple. If you can think of a simpler, more reliable way, let me know.

I like that this solution is less complex than calling read() or initializing linenoise or esp_console.

*
* @param[in] xRingbuffer The ring buffer who's rx will be unblocked.
*/
void vRingbufferUnblockRx(RingbufHandle_t xRingbuffer);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, we could probably introduce more code to determine where the console thread is blocked, before calling vTaskAbortDelay. I would need to evaluate that further.

Edit: we could use a flag in linenoise. Something like linenoiseIsWaitingForNextCommand(). I'm not a huge fan of that, versus the other method, but it would work. Let me know what you think.

void xxx_console_deinit(){
   ....
  while(linenoiseIsWaitingForNextCommand()){
        vTaskAbortDelay(consoleThread);
   }
   ....
}

@chipweinberger
Copy link
Contributor Author

@igrr my mistake, my comments were still "pending"

@chipweinberger chipweinberger force-pushed the user/chip/console-unblock branch from d8eb79b to 125599e Compare November 2, 2022 21:08
@chipweinberger
Copy link
Contributor Author

chipweinberger commented Nov 8, 2022

@igrr I'm okay denying this Pull Request. Seems like we are losing steam on it.

The last idea I had was this approach:

void xxx_console_deinit(){
   ....
  while(linenoiseIsWaitingForNextCommand()){
        vTaskAbortDelay(consoleThread);
   }
   ....
}

I wont have time to implement a select based solution.

Let me know, otherwise I'll close the PR. If we go forward, I'll open a new PR regardless

@chipweinberger
Copy link
Contributor Author

chipweinberger commented Nov 17, 2022

NOT MERGING THIS PR. Still Unfixed. De-initializing consoles is not possible until this is fixed.

If anyone ever fixes this deadlock, they should open a new PR. Please see my previous comment for how this could be solved. Ivan does not want to alter FreeRTOS.

@chipweinberger chipweinberger deleted the user/chip/console-unblock branch November 18, 2022 23:39
@chipweinberger chipweinberger changed the title [Console] fix console deinit: add ability to unblock linenoise, to fix deadlock (IDFGH-8529) [Console] fix console deinit: add ability to unblock linenoise, to fix deinit deadlock (IDFGH-8529) Dec 9, 2022
@espressif-bot espressif-bot added Resolution: Duplicate This issue or pull request already exists Status: Done Issue is done internally and removed Status: Opened Issue is new labels Jan 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Duplicate This issue or pull request already exists Status: Done Issue is done internally
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants