[Debug Feature] print actual backtrace when Idle WDT fires (IDFGH-10181) #11449
Comments
I see there have been a lot of changes to see: e25cda2 |
For anyone else on v4.4, a simple workaround is to disable This will get you a stacktrace if you really need one. You could also look at the coredump. |
I still have the same problem with v5.1
The backtrace is not informative |
I expected to see that the In the coredump I showed, there is no such information whatsoever. |
Interesting. Looking at the code, I thought there were improvements in v5.1. That's too bad. I've reopened the issue. |
@dizcza could you please check the ELF file SHA256 printed by your app at startup, and compare it to the SHA256 of the ELF file passed to espcoredump.py? Unreasonable backtraces like the one you have posted (e.g. prvCheckTasksWaitingTermination most definitely doesn't call esp_task_wdt_reset) are most often the result of a mismatch between the ELF file used to produce the application binary and the ELF file used when decoding the core dump. |
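The comparison suggested above can be sketched in plain C. This assumes the digest printed in the boot log is a (possibly truncated) hex prefix of the full SHA256 of the ELF file, which you compute separately on the host. The function name `elf_sha_prefix_matches` is hypothetical and not part of ESP-IDF.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true when `printed` (the possibly truncated hex digest copied from
 * the boot log) is a non-empty, case-insensitive prefix of `full` (the
 * complete hex SHA256 of the ELF file computed on the host).
 * Hypothetical helper for illustration; not an ESP-IDF API.
 */
static bool elf_sha_prefix_matches(const char *printed, const char *full)
{
    assert(printed != NULL && full != NULL);
    size_t n = strlen(printed);
    if (n == 0 || n > strlen(full)) {
        return false;   /* nothing to compare, or longer than a full digest */
    }
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char) printed[i]) != tolower((unsigned char) full[i])) {
            return false;
        }
    }
    return true;
}
```

If this returns false for the ELF file you pass to espcoredump.py, the decoded backtrace should not be trusted.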
Indeed, the SHA differs:
The coredump app SHA is taken from the
In both cases, the coredump info is like I've shown above. My coredump procedure is the following. First, I set up the partitions like so:
Then I configured the app to dump backtraces to flash. Upon reset, I read the
The code to read a coredump and write it to an SD card:
void coredump_log_sdcard() {
    coredump_print_summary();
    size_t out_cd_addr = 0;
    size_t out_cd_size = 0;
    esp_err_t err = esp_core_dump_image_get(&out_cd_addr, &out_cd_size);
    bool erase = false;
    if (err != ESP_OK) {
        ESP_LOGI(TAG, "esp_core_dump_image_get: %s", esp_err_to_name(err));
        erase = true;
    } else {
        ESP_LOGI(TAG, "core dump addr: %u", (unsigned) out_cd_addr);
        ESP_LOGI(TAG, "core dump size: %u", (unsigned) out_cd_size);
    }
    const char *label = COREDUMP_PARTITION_NAME;
    const esp_partition_t* partition = esp_partition_find_first(
        ESP_PARTITION_TYPE_DATA, ESP_PARTITION_SUBTYPE_DATA_COREDUMP, label);
    if (partition == NULL) {
        ESP_LOGI(TAG, "partition '%s' not found", label);
        return;
    }
    if (out_cd_size == 0) {
        ESP_LOGW(TAG, "partition '%s' size: %u", label, (unsigned) out_cd_size);
        return;
    }
    if (erase) {
        // erase and exit
        esp_partition_erase_range(partition, 0, out_cd_size);
        return;
    }
    FILE* f = sdcard_record_open_file_once("COREDUMP.BIN", "w");
    if (f == NULL) {
        return;
    }
    char *buffer = (char*) malloc(COREDUMP_BUFF_SIZE);
    if (buffer == NULL) {
        // out of memory: close the file and give up
        fclose(f);
        return;
    }
    size_t offset = 0;
    // copy the core dump image to the SD card in fixed-size chunks
    while (out_cd_size > 0) {
        size_t chunk_size = out_cd_size > COREDUMP_BUFF_SIZE ? COREDUMP_BUFF_SIZE : out_cd_size;
        esp_partition_read(partition, offset, buffer, chunk_size);
        fwrite(buffer, sizeof(char), chunk_size, f);
        offset += chunk_size;
        out_cd_size -= chunk_size;
    }
    fclose(f);
    free(buffer);
}
ESP-IDF v5.1-dev-4528-g420ebd208a
Here is the output in the terminal:
|
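The copy loop in the function above does not check the results of `esp_partition_read` or `fwrite`, so a failed read or a full SD card would silently produce a corrupt COREDUMP.BIN. Below is a minimal host-buildable sketch of the same chunked copy with those checks added. The reader callback type `chunk_reader_t`, the `mem_reader` stand-in, and `copy_in_chunks` are hypothetical names introduced for illustration; on the target, the callback would wrap `esp_partition_read`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader: copies `len` bytes at `offset` into `dst`;
 * returns true on success. */
typedef bool (*chunk_reader_t)(void *ctx, size_t offset, void *dst, size_t len);

/* Stand-in reader that treats `ctx` as a plain memory buffer. */
static bool mem_reader(void *ctx, size_t offset, void *dst, size_t len)
{
    memcpy(dst, (const char *) ctx + offset, len);
    return true;
}

/* Copies `total` bytes from the reader to `out` through `buf`,
 * stopping at the first failed read or short write. */
static bool copy_in_chunks(chunk_reader_t read_chunk, void *ctx, size_t total,
                           FILE *out, void *buf, size_t buf_size)
{
    assert(read_chunk != NULL && out != NULL && buf != NULL && buf_size > 0);
    size_t offset = 0;
    while (offset < total) {
        size_t chunk = total - offset;
        if (chunk > buf_size) {
            chunk = buf_size;
        }
        if (!read_chunk(ctx, offset, buf, chunk)) {
            return false;   /* read failed */
        }
        if (fwrite(buf, 1, chunk, out) != chunk) {
            return false;   /* short write, e.g. the card is full */
        }
        offset += chunk;
    }
    return true;
}
```

Returning `false` on the first failure lets the caller log the error and keep the core dump in flash instead of assuming the file is complete.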
Could you please give it a try to read the core dump directly from flash (with |
Same story. |
Could you try to make a version of the program which you can share, that still reproduces the issue? I just ran the test application — https://github.com/espressif/esp-idf/tree/master/tools/test_apps/system/panic — with the config from sdkconfig.ci.coredump_flash_elf_sha, and ran the |
Change the
static void infinite_loop(void* arg) {
    (void) arg;
    esp_task_wdt_add(NULL);
    ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
    while(1) {
        ;
    }
}
run the
Or is this the expected output, and there is really no way to tell which task has not fed the WDT? I have lots of tasks, each guarded by a WDT... |
The last backtrace you have posted looks reasonable if you are registering TWDT for another task: the Idle task was running (idling) and at some point got interrupted by the TWDT interrupt. In this case it is indeed not possible to say, just from the backtrace, which of the tasks has not fed the watchdog. |
Sadly indeed. |
Well, that was the reason I posted a response here thinking that cases as I described should be more informative. If it's a won't fix, let it be. Or let it mark a feature request - don't know if implementing it is possible. |
We have just discussed this with the team, it is probably possible to include additional information into the core dump to make such cases easier to analyze. Basically, this part of the text currently printed to console
could also be included into the coredump, and extracted by the coredump utility on the host side. In your case, the "following tasks did not reset the watchdog in time" list would include the task which you have registered the task watchdog for. We probably won't implement this immediately, but let's keep this as an open feature request. |
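To illustrate the idea discussed above, here is a minimal plain-C sketch of building the "tasks that did not reset the watchdog" list as a string that could be stored alongside a core dump. Everything here is an assumption made for illustration: the `twdt_entry_t` record and `twdt_collect_stalled` are hypothetical and do not reflect how ESP-IDF tracks subscribed tasks internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for one task subscribed to the task watchdog. */
typedef struct {
    const char *name;   /* task name, e.g. "btApiTask" */
    bool has_reset;     /* true if the task fed the watchdog this period */
} twdt_entry_t;

/*
 * Writes a comma-separated list of the tasks that did not reset the watchdog
 * into `out` and returns how many such tasks were found. Names that do not
 * fit into `out` are left out of the text but are still counted.
 */
static size_t twdt_collect_stalled(const twdt_entry_t *entries, size_t count,
                                   char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    size_t stalled = 0;
    size_t used = 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        if (entries[i].has_reset) {
            continue;
        }
        stalled++;
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used > 0 ? ", " : "", entries[i].name);
        if (n < 0 || (size_t) n >= out_size - used) {
            out[used] = '\0';   /* undo the partial write and skip this name */
            continue;
        }
        used += (size_t) n;
    }
    return stalled;
}
```

With entries for an idle task that has reset and for `btApiTask` that has not, the resulting string names `btApiTask`, which is the information the Idle task backtrace alone cannot provide.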
Nice idea, I like it. All right. |
Is your feature request related to a problem?
ESP-IDF: release/v4.4
Debugging Idle TWDT crashes is harder than it needs to be.
In the example below, we should print the backtrace of btApiTask, but instead we get the backtrace of the idle task (much less useful).
I think we could do this with a task snapshot. See: #9708