Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] esp_backtrace_print() should support printing all tasks (IDFGH-8215) #9708

Closed
chipweinberger opened this issue Sep 3, 2022 · 11 comments
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Feature Request Feature request for IDF

Comments

@chipweinberger
Copy link
Contributor

chipweinberger commented Sep 3, 2022

ESP-IDF v4.4.2

My product has a lot of debug features. I like having debug commands I can call at will.

I just hit a rare task deadlock, but had no ability to print out the backtrace of all tasks.

esp_backtrace_print() only supports printing the current task.

Is there any way to print the backtrace of all apps while the program is running?

I've considered:

vTaskGetInfo returns pxStackBase, but we actually need pxStackTop, which I don't see a way to get.

@chipweinberger chipweinberger added the Type: Feature Request Feature request for IDF label Sep 3, 2022
@chipweinberger chipweinberger changed the title [Question] esp_backtrace_print() only supports printing the current task, but I need to print all tasks [Feature] esp_backtrace_print() only supports printing the current task, but I need to print all tasks Sep 3, 2022
@espressif-bot espressif-bot added the Status: Opened Issue is new label Sep 3, 2022
@github-actions github-actions bot changed the title [Feature] esp_backtrace_print() only supports printing the current task, but I need to print all tasks [Feature] esp_backtrace_print() only supports printing the current task, but I need to print all tasks (IDFGH-8215) Sep 3, 2022
@chipweinberger chipweinberger changed the title [Feature] esp_backtrace_print() only supports printing the current task, but I need to print all tasks (IDFGH-8215) [Feature] esp_backtrace_print() should support printing all tasks (IDFGH-8215) Sep 27, 2022
@o-marshmallow
Copy link
Collaborator

Hi @chipweinberger ,

We don't have any plan to add this feature currently. However, on master, even though it is not meant to be public, you may be able to find what you are looking for with uxTaskGetSnapshotAll. Make sure it is activate from the menuconfig. This function is delcared as:

UBaseType_t uxTaskGetSnapshotAll( TaskSnapshot_t * const pxTaskSnapshotArray, const UBaseType_t uxArrayLength, UBaseType_t * const pxTCBSize );

It will return a snapshot of all the tasks currently in the system, as snapshot consists of a TCB and a top of the stack. When pointing to a halted task, the top of the stack contains an XtExcFrame. That frame lets you get any register, including PC and SP of the pre-empted task. (taking a look at esp_core_dump_get_regs_from_stack may help you too)

Finally, this will let you populate an esp_backtrace_frame_t structure that you could pass to esp_backtrace_print_from_frame.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented Oct 11, 2022

Wow thank you for the detailed response @o-marshmallow.

I'll give this a shot, after I switch to v5.0!

This is great.

@KaeLL
Copy link
Contributor

KaeLL commented Oct 11, 2022

@o-marshmallow and if a PR were to be submitted?

@surfskidude
Copy link

It would be very useful to have this as an easy-to-use public API as I also have a task that is blocking on a mutex and I can't find where it is.

@chipweinberger
Copy link
Contributor Author

chipweinberger commented Mar 13, 2023

That is exactly my use case as well.

I wrote a couple freertos wrappers to detect threads blocked 15+ seconds on a mutex, and print out the stack trace of the owner when it took the mutex, and the deadlocked task backtrace.

@o-marshmallow , @igrr Espressif should add a native version of this tool as a KConfig.
CONFIG_LOG_DEADLOCKED_MUTEXES_THRESHOLD_S

#include <string.h>

#include "freertos/FreeRTOS.h"
#include "freertos/semphr.h"

#include "esp_log.h"
#include "esp_debug_helpers.h"

static const char* TAG = "pd freertos deadlock";

#define STACK_DEPTH 12

typedef struct {
    SemaphoreHandle_t xSemaphore;
    uint32_t backtrace[STACK_DEPTH];
} pd_semphr_debug_data_t;

static int max_semaphores  = 100;

static pd_semphr_debug_data_t* dbg;

static int deadlock_thresh_ms = 15000;

static SemaphoreHandle_t mutex;

static void takeMtx() {
    xSemaphoreTake(mutex, portMAX_DELAY);
}

static void giveMtx() {
    xSemaphoreGive(mutex);
}

static void backtrace_frames(uint32_t* backtrace, uint32_t depth, esp_backtrace_frame_t* frame)
{
    //Initialize stk_frame with first frame of stack
    esp_backtrace_frame_t stk_frame = { 
        .pc = 0,        /* PC of the current frame */
        .sp = 0,        /* SP of the current frame */
        .next_pc  = 0,  /* PC of the current frame's caller */
        .exc_frame = NULL /* Pointer to the full frame data structure, if applicable */
    };
    memcpy(&stk_frame, frame, sizeof(esp_backtrace_frame_t));

    backtrace[0] = esp_cpu_process_stack_pc(stk_frame.pc);

    //Check if first frame is valid
    bool corrupted = !(esp_stack_ptr_is_sane(stk_frame.sp) &&
        (esp_ptr_executable((void *)esp_cpu_process_stack_pc(stk_frame.pc)) ||
        /* Ignore the first corrupted PC in case of InstrFetchProhibited */
        (stk_frame.exc_frame && ((XtExcFrame *)stk_frame.exc_frame)->exccause == EXCCAUSE_INSTR_PROHIBITED)));

    uint32_t n = 1;
    uint32_t i = (depth <= 0) ? INT32_MAX : depth;
    while (i-- > 0 && stk_frame.next_pc != 0 && !corrupted) {
        //Get previous stack frame
        if (!esp_backtrace_get_next_frame(&stk_frame)) {corrupted = true;}
        backtrace[n] = esp_cpu_process_stack_pc(stk_frame.pc);
        n++;
    }

    return;
}

static void backtrace(uint32_t* backtrace, uint32_t len)
{
    esp_backtrace_frame_t start = { 
        .pc = 0,        /* PC of the current frame */
        .sp = 0,        /* SP of the current frame */
        .next_pc  = 0,  /* PC of the current frame's caller */
        .exc_frame = NULL /* Pointer to the full frame data structure, if applicable */
    };
    esp_backtrace_get_start(&(start.pc), &(start.sp), &(start.next_pc));
    backtrace_frames(backtrace, len, &start);
}

static void print_owner(SemaphoreHandle_t xSemaphore)
{
    takeMtx();
    printf("\nowner of semaphore (%p)\n", xSemaphore);
    for (int i = 0; i < max_semaphores; i++) {
        if (dbg[i].xSemaphore == xSemaphore) {
            for (int k = 0; k < STACK_DEPTH; k++) {
                printf("0x%08X\n", dbg[i].backtrace[k]);
            }
        }
    }
    giveMtx();
}

static bool take(SemaphoreHandle_t xSemaphore, TickType_t xTicksToWait, bool recursive)
{
    if (recursive) {
        return xSemaphoreTakeRecursive(xSemaphore, xTicksToWait);
    } else {
        return xSemaphoreTake(xSemaphore, xTicksToWait);
    }
}

static bool xSemaphoreTakeDebug(SemaphoreHandle_t xSemaphore, TickType_t xTicksToWait, bool recursive)
{
    bool rv = false;

    if (xTicksToWait != portMAX_DELAY) {
       
        // note: currently, we only support debugging portMAX_DELAY 
        rv = take(xSemaphore, xTicksToWait, recursive);

    } else { // portMAX_DELAY

        int threshTicks = deadlock_thresh_ms / portTICK_PERIOD_MS;

        bool once = false;
        while (!take(xSemaphore, threshTicks, recursive)) {
            if(!once){
                ESP_LOGW(TAG, "xSemaphoreTake%s %p taking awhile", recursive ? "Recursive" : "", xSemaphore);
                esp_backtrace_print(20); 
                print_owner(xSemaphore);
                once = true;
            }
        }
        if (once) {
            ESP_LOGW(TAG, "xSemaphoreTake%s %p succeeded!", recursive ? "Recursive" : "", xSemaphore);
        }

        rv = true;
    }

    // add debug data
    if (rv) {
        takeMtx();
        for (int i = 0; i < max_semaphores; i++) {
            if (dbg[i].xSemaphore == xSemaphore || dbg[i].xSemaphore == NULL) {
                if (dbg[i].xSemaphore == NULL) {
                    printf("semaphore: alloc mutex dbg data: %i\n", i);
                }
                dbg[i].xSemaphore = xSemaphore;
                backtrace(dbg[i].backtrace, STACK_DEPTH);
                break;
            }
        }
        giveMtx();
    }

    return rv;
}

///////////////////////////////////////////////////
// ██████   ██    ██  ██████   ██       ██   ██████ 
// ██   ██  ██    ██  ██   ██  ██       ██  ██      
// ██████   ██    ██  ██████   ██       ██  ██      
// ██       ██    ██  ██   ██  ██       ██  ██      
// ██        ██████   ██████   ███████  ██   ██████

void esp_semaphore_deadlock_helper_init(int max_semaphores_arg, int deadlock_thresh_ms_arg)
{
    if (dbg == NULL) {
        dbg = (pd_semphr_debug_data_t*) calloc(1, sizeof(pd_semphr_debug_data_t) * max_semaphores_arg);
        mutex = xSemaphoreCreateMutex();
        max_semaphores = max_semaphores_arg;
    }
    deadlock_thresh_ms = deadlock_thresh_ms_arg;
}

// to debug, you must replace your semaphore take calls with this function!
bool xSemaphoreTake2(SemaphoreHandle_t xSemaphore, TickType_t xTicksToWait)
{
   return xSemaphoreTakeDebug(xSemaphore, xTicksToWait, false);
}

// to debug, you must replac your semaphore take calls with this function!
bool xSemaphoreTakeRecursive2(SemaphoreHandle_t xSemaphore, TickType_t xTicksToWait)
{
   return xSemaphoreTakeDebug(xSemaphore, xTicksToWait, true);
}

@redgreenloko
Copy link

Guys, does anyone know how to populate a esp_backtrace_frame_t structure from TaskSnapshot_t? Doesn't seem too obvious.

@redgreenloko
Copy link

redgreenloko commented May 11, 2023

Found out. Maybe it'll be helpful for someone.

void getTaskBacktraceFrame(TaskSnapshot_t* snapshot,esp_backtrace_frame_t* frame)
{
  XtExcFrame* xtexcframe = (XtExcFrame*)snapshot->pxTopOfStack;
  frame->backtrace.pc = xtexcframe->xtexc->pc;
  frame->backtrace.sp = xtexcframe->xtexc->a1;
  frame->backtrace.next_pc = xtexcframe->xtexc->a0;
  frame->backtrace.exc_frame = xtexcframe->xtexc;
}

@chipweinberger
Copy link
Contributor Author

@redgreenloko if you implement this please open a PR!

@chipweinberger
Copy link
Contributor Author

chipweinberger commented Jun 2, 2023

I've implemented it.

esp_debug_helpers.c

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/task_snapshot.h"


// note for ESP32S3, you must alter esp_backtrace_print_from_frame to whitelist 0x400559DD
// as a valid memory address. see: https://github.com/espressif/esp-idf/issues/11512#issuecomment-1566943121
// Otherwise nearly all the backtraces will print as corrupt.
//
//    //Check if first frame is valid
//    bool corrupted = !(esp_stack_ptr_is_sane(stk_frame.sp) &&
//                       (esp_ptr_executable((void *)esp_cpu_process_stack_pc(stk_frame.pc)) ||
//         /*whitelist*/  esp_cpu_process_stack_pc(stk_frame.pc) == 0x400559DD ||         
//                        /* Ignore the first corrupted PC in case of InstrFetchProhibited */
//                       (stk_frame.exc_frame && ((XtExcFrame *)stk_frame.exc_frame)->exccause == EXCCAUSE_INSTR_PROHIBITED)));



esp_err_t IRAM_ATTR esp_backtrace_print_all_tasks(int depth, bool panic)
{
    uint32_t task_count = uxTaskGetNumberOfTasks();

    TaskSnapshot_t* snapshots = (TaskSnapshot_t*) calloc(task_count * sizeof(TaskSnapshot_t), 1);

    // get snapshots
    UBaseType_t tcb_size = 0;
    uint32_t got = uxTaskGetSnapshotAll(snapshots, task_count, &tcb_size);

    uint32_t len = got < task_count ? got : task_count;

    print_str("printing all tasks:\n\n", panic);

    esp_err_t err = ESP_OK;

    for (uint32_t i = 0; i < len; i++) {

        TaskHandle_t handle = (TaskHandle_t) snapshots[i].pxTCB;

        char* name = pcTaskGetName(handle);

        print_str(name ? name : "No Name" , panic);

        XtExcFrame* xtf = (XtExcFrame*)snapshots[i].pxTopOfStack;
        
        esp_backtrace_frame_t frame = {
            .pc = xtf->pc,
            .sp = xtf->a1,
            .next_pc = xtf->a0,
            .exc_frame = xtf,
        };

        esp_err_t nerr = esp_backtrace_print_from_frame(depth, &frame, panic);
        if (nerr != ESP_OK) {
            err = nerr;
        }
    }

    free(snapshots);
    return err;
}

@chipweinberger
Copy link
Contributor Author

PR: #11575

@chipweinberger
Copy link
Contributor Author

@o-marshmallow awesome!! thank you! =)

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Opened Issue is new labels Feb 6, 2024
jnross pushed a commit to junelife/esp-idf that referenced this issue Feb 12, 2024
This commit adds esp_backtrace_print_all_tasks() which prints the backtraces
of all tasks at runtime.

Closes espressif#9708
CLoses espressif#11575

[Omar Chebib: Prevent task switching while printing backtraces of tasks.]
[Omar Chebib: Ensure all task stacks are flushed from register to RAM.]
[Omar Chebib: Removed esp_task_snapshot_to_backtrace_frame() as task snapshot is private API.]
[Omar Chebib: Added test case for esp_backtrace_print_all_tasks().]

Signed-off-by: Omar Chebib <[email protected]>
(cherry picked from commit 3686689)

# Conflicts:
#	components/esp_system/port/arch/xtensa/debug_helpers.c
#	components/esp_system/test_apps/esp_system_unity_tests/main/test_backtrace.c
#	components/freertos/esp_additions/freertos_tasks_c_additions.h
#	components/freertos/esp_additions/include/esp_private/freertos_debug.h
jnross pushed a commit to junelife/esp-idf that referenced this issue Feb 12, 2024
This commit adds esp_backtrace_print_all_tasks() which prints the backtraces
of all tasks at runtime.

Closes espressif#9708
CLoses espressif#11575

[Omar Chebib: Prevent task switching while printing backtraces of tasks.]
[Omar Chebib: Ensure all task stacks are flushed from register to RAM.]
[Omar Chebib: Removed esp_task_snapshot_to_backtrace_frame() as task snapshot is private API.]
[Omar Chebib: Added test case for esp_backtrace_print_all_tasks().]

Signed-off-by: Omar Chebib <[email protected]>
(cherry picked from commit 3686689)

# Conflicts:
#	components/esp_system/port/arch/xtensa/debug_helpers.c
#	components/esp_system/test_apps/esp_system_unity_tests/main/test_backtrace.c
#	components/freertos/esp_additions/freertos_tasks_c_additions.h
#	components/freertos/esp_additions/include/esp_private/freertos_debug.h
jnross pushed a commit to junelife/esp-idf that referenced this issue Feb 12, 2024
This commit adds esp_backtrace_print_all_tasks() which prints the backtraces
of all tasks at runtime.

Closes espressif#9708
CLoses espressif#11575

[Omar Chebib: Prevent task switching while printing backtraces of tasks.]
[Omar Chebib: Ensure all task stacks are flushed from register to RAM.]
[Omar Chebib: Removed esp_task_snapshot_to_backtrace_frame() as task snapshot is private API.]
[Omar Chebib: Added test case for esp_backtrace_print_all_tasks().]

Signed-off-by: Omar Chebib <[email protected]>
(cherry picked from commit 3686689)

# Conflicts:
#	components/esp_system/port/arch/xtensa/debug_helpers.c
#	components/esp_system/test_apps/esp_system_unity_tests/main/test_backtrace.c
#	components/freertos/esp_additions/freertos_tasks_c_additions.h
#	components/freertos/esp_additions/include/esp_private/freertos_debug.h
mykmelez pushed a commit to junelife/esp-idf that referenced this issue Aug 9, 2024
This commit adds esp_backtrace_print_all_tasks() which prints the backtraces
of all tasks at runtime.

Closes espressif#9708
CLoses espressif#11575

[Omar Chebib: Prevent task switching while printing backtraces of tasks.]
[Omar Chebib: Ensure all task stacks are flushed from register to RAM.]
[Omar Chebib: Removed esp_task_snapshot_to_backtrace_frame() as task snapshot is private API.]
[Omar Chebib: Added test case for esp_backtrace_print_all_tasks().]

Signed-off-by: Omar Chebib <[email protected]>
(cherry picked from commit 3686689)

(cherry picked from commit ed19aae)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Feature Request Feature request for IDF
Projects
None yet
Development

No branches or pull requests

6 participants