-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix stat printing condition #584
base: master
Are you sure you want to change the base?
Conversation
There seems to be a little confusion. We made a change to remove the subtraction because the stats profiler was broken with it. |
Shouldn't the parser be fixed then? it broke bsg_print_stat in bsg_manycore. @drichmond |
IIRC, it's not the parser that's borked. The stat file was empty. |
If you remove the subtract then the coordinate will not match and it will never print any stats... |
We found the opposite to be true, and that removing the subtract fixed the problem. It sounds like there's a mismatch in infrastructure between replicant and manyore in terms of how the stat packets are formatted? |
Doesn't it use the same bsg_nonsynth_manycore_testbench.v and bsg_cuda_print_stat_kernel_start/end() in C? |
it does |
I'm combing through the code now. |
I think the divergence happens here:
How does the nbf loader set the __bsg_grp_org_x/y variables? It looks like CUDA sets them to absolute coordinates currently. So they are being set to (16,8) at the moment. |
It uses bsg_set_tile_x_y(). __bsg_grp_org_xy actually comes from the tile-group origin Network CSR. https://github.com/bespoke-silicon-group/bsg_manycore/blob/master/software/bsg_manycore_lib/bsg_set_tile_x_y.c#L31 |
I see. I can fix that in replicant, but will this code work if a tile in pod (x=0,y=1) issues a print_stat request? |
Before you do that, I think we need to spend more time thinking about the correct fix here. I don't think this fix will work when there are multiple pods. Removing the subtract does not completely fix the problem either, because the coordinates in print_stat_tag are only 6-bits wide (currently global x/y are 7-bits). But that's fine, because we can always know who sent the packet by looking at the src x/y of the packet. |
Could we use a bind solution here, like for the SAIF start/end? |
A bind solution would reduce network traffic, improve accuracy of the windows, and reduce the instruction footprint. Maybe we could even bind in and read the tile group IDs from DMEM? Just spitballing here. |
what does saif do? isn't there a special instruction that triggers it? |
I think, as Tommy suggested, we should check against the src x/y coordinates of the packet rather than reading the x/y out of the tag data. So that would mean checking for equality with the global_x_i, global_y_i. |
In retrospect that seems sort of obvious. Why didn't we do that anyway? Is this one more legacy part of F1? |
I really don't know. If I recall, we didn't support the stats_tag feature on F1. |
we didn't, but we supported it in simulation, back when we used the F1 interface. |
@tommydcjung What do you think of a bind solution? Maybe not now, but long term? Are there any issues you foresee? |
bind solution as in using addi x0 won't be able to trigger vcache and router profiler. For saif, the bind solution doesn't work when you are simulating using gate-level netlist. |
Great points! So we would have to modify all three profilers to bind, but even then it won't work if we want to use gate-level netlist. |
No description provided.