-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More parallel+serial discussion #1470
Comments
Separately, for @eisenhauer @philip-davis, what's the reason that SST is built as a separate library? This, together with the fact that it partially uses C sources (so the |
(Again, please forgive lateness in joining discussions. I've unsubscribed from most GitHub notifications to preserve sanity, so I miss things.) . WRT SST's status as a library (and, indirectly, it's representation in C): Mostly, it's for historical/arbitrary reasons. In the early stages, it was easier to play with SST concepts and prototype an implementation outside of ADIOS. We had a pretty good idea of what the interfaces were, both between ADIOS2 and SST, and inside SST between the control plane and the data plane, but we needed a framework in which to exercise things. So, SST was prototyped as a library with a well-defined interface to the test-harness levels and between the control plane and data plane. Philip and I were both most comfortable in C, so that's what we used. (That when we were doing this, ADIOS2 development was mostly focused on the writer side and the reader side had received little attention reinforced the practicality of outside-ADIOS2 development. Writer-side-first development made sense for files, but was unhelpful for a staging system.) When it came time to actually pull SST into ADIOS2, @chuckatkins and I made the decision to leave SST as a library. I don't have strong feelings about that per se. It doesn't matter much if we link objects directly or just link the library. But I imagine that more awkward bit is that SST is in C and can't directly use the wrappers from the main library. That's unrelated to the library thing... |
Thanks @eisenhauer for the insights. You're right, the usage of C causes the bigger hassles. I think it's perfectly reasonable to support C code in adios2, though, even if it means hassles :( Still, before I go there, I guess this can also be considered a guinea pig for my other consideration, that is, is it worth it to have separate code paths for the serial case. I took one file ( So on its face, that doesn't seem worth it -- quite a bit of code added for virtually no gain vs handling this through mpidummy. However, I'd like to ask first whether there might be in point in doing these separate paths for other reasons. You can see that what I ended up with is basically an encode, copy the encoded buffer, decode again. I don't really have an understanding of what's going on at a deeper level, though. If it were possible to eliminate the encode->decode roundtrip, things might look simpler, and maybe more importantly, it could also potentially speed thing up. I suppose that at least the "single reader" case might actually be relevant in real life, so if it's possible to optimize that case, maybe this could all be worth it after all. diff --git a/source/adios2/toolkit/sst/cp/cp_common.c b/source/adios2/toolkit/sst/cp/cp_common.c
index 99cc4375..2e00643c 100644
--- a/source/adios2/toolkit/sst/cp/cp_common.c
+++ b/source/adios2/toolkit/sst/cp/cp_common.c
@@ -1,5 +1,6 @@
#include <ctype.h>
#include <stdarg.h>
+#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@@ -532,9 +533,41 @@ static FMStructDescList combineCpDpFormats(FMStructDescList top,
return CombinedFormats;
}
+static inline bool isSerial(SstStream Stream)
+{
+ return Stream->CohortSize == 1;
+}
+
+static void **CPSerial_consolidateDataToRankZero(SstStream Stream,
+ void *LocalInfo,
+ FFSTypeHandle Type,
+ void **RetDataBlock)
+{
+ FFSBuffer Buf = create_FFSBuffer();
+ int DataSize;
+ char *Buffer =
+ FFSencode(Buf, FMFormat_of_original(Type), LocalInfo, &DataSize);
+
+ char *RecvBuffer = malloc((DataSize + 7) & ~7);
+ memcpy(RecvBuffer, Buffer, DataSize);
+ free_FFSBuffer(Buf);
+
+ FFSContext context = Stream->CPInfo->ffs_c;
+ struct _CP_DP_init_info **Pointers = malloc(sizeof(Pointers[0]));
+ FFSdecode_in_place(context, RecvBuffer, (void **)&Pointers[0]);
+
+ *RetDataBlock = RecvBuffer;
+ return (void **)Pointers;
+}
+
void **CP_consolidateDataToRankZero(SstStream Stream, void *LocalInfo,
FFSTypeHandle Type, void **RetDataBlock)
{
+ if (isSerial(Stream))
+ {
+ return CPSerial_consolidateDataToRankZero(Stream, LocalInfo, Type,
+ RetDataBlock);
+ }
FFSBuffer Buf = create_FFSBuffer();
int DataSize;
int *RecvCounts = NULL;
@@ -609,9 +642,35 @@ void **CP_consolidateDataToRankZero(SstStream Stream, void *LocalInfo,
return (void **)Pointers;
}
+static void *CPSerial_distributeDataFromRankZero(SstStream Stream,
+ void *root_info,
+ FFSTypeHandle Type,
+ void **RetDataBlock)
+{
+ FFSBuffer Buf = create_FFSBuffer();
+ int DataSize;
+ char *tmp =
+ FFSencode(Buf, FMFormat_of_original(Type), root_info, &DataSize);
+ char *Buffer = malloc(DataSize);
+ memcpy(Buffer, tmp, DataSize);
+ free_FFSBuffer(Buf);
+
+ FFSContext context = Stream->CPInfo->ffs_c;
+
+ void *RetVal;
+ FFSdecode_in_place(context, Buffer, &RetVal);
+ *RetDataBlock = Buffer;
+ return RetVal;
+}
+
void *CP_distributeDataFromRankZero(SstStream Stream, void *root_info,
FFSTypeHandle Type, void **RetDataBlock)
{
+ if (isSerial(Stream))
+ {
+ return CPSerial_distributeDataFromRankZero(Stream, root_info, Type,
+ RetDataBlock);
+ }
int DataSize;
char *Buffer;
void *RetVal;
@@ -647,6 +706,12 @@ void *CP_distributeDataFromRankZero(SstStream Stream, void *root_info,
void **CP_consolidateDataToAll(SstStream Stream, void *LocalInfo,
FFSTypeHandle Type, void **RetDataBlock)
{
+ // FIXME, actually consolidateDataToAll is entirely unused
+ if (isSerial(Stream))
+ {
+ return CPSerial_consolidateDataToRankZero(Stream, LocalInfo, Type,
+ RetDataBlock);
+ }
FFSBuffer Buf = create_FFSBuffer();
int DataSize;
int *RecvCounts;``` |
Unfortunately, I think it's hard to optimize out the (seemingly) redundant single-processor encode/decode. The complexity here is that the encode/decode also has a side effect of rearranging the memory layout of any data that is more complex than a flat structure with no pointers. Essentially it consolidates those disparate malloc blocks into a single large block (with internal pointers). (This allows EVpath and message-passing systems built on FFS to efficiently manage incoming complex structured messages as a single entity while simplifying their memory management. This single large block is represented by the RetDataBlock parameter in the routines above.) So, if we got rid of the encode/decode pair, we'd either have to provide similar functionality (likely at a similar cost), or rewrite all the single-processor memory handling as a special case. That's likely a much bigger mess than we want to tackle... |
Thanks @eisenhauer for the clarification. So that leads me to the conclusion that separate code paths aren't the right solution here, so I'll work on making the MPI wrapping thing available to plain old C. |
@bradking just keeping you in the loop on this discussion |
With #1480 merged, I'll close this. |
This is follow-up from #1443 and PR #1466. The PR addresses the fundamentals of the issue, but the work isn't really complete.
Open issues:
The current approach is to replace MPI_* functions by their SMPI_* wrapper equivalents. These wrappers will call through to the real MPI if it's initialized (and not finalized), otherwise it'll do the MPI dummy stuff. I somewhat deliberately only changed selected pieces to use those wrappers, since I'm not too convinced that it's a real clean approach, though I'm torn about that. It could however, be considered to do a global search and replace and use the wrappers everywhere. Some things still wouldn't work (e.g., no easy way to implement
MPI_Send
/MPI_Recv
properly on a single proc). The current state isn't horrible, ie., if someone were to try to use some part which hasn't been converted to SMPI_* wrappers w/o callingMPI_Init
, they'd get MPI complaining about just that, rather than some mysterious subtle problms.What I think would be cleaner, but potentially involve some code duplication, would be to have separate code paths for serial/parallel case where both are supported. e.g., in the serial case,
AggregateMetadata
probably shouldn't even be called. This, however, requires being able to distinguish whether a function is meant to execute serially. I think there is one straightforward way to make that distinction, ie., defineMPI_COMM_NULL
as having the meaning "don't use MPI, this executes serially". That's clearer than the current reliance onMPI_Initialized
/MPI_Finalized
; that status, one should note, might actually change along the way. So it's possible to haveSMPI_Comm_dup
return the communicator itself when called beforeMPI_Init
(because that's what mpidummy) does, then initialize MPI, and then eventually haveSMPI_Comm_free
call the realMPI_Comm_free
, which will rightfully complain that the communicator hadn't beendup
'd in the first place. This actually happens in real life, which is why I didn't change theMPI_Comm_dup
handling inside ofcore::ADIOS
for now.Switching between real MPI and mpidummy based on the communicator (MPI_COMM_NULL vs not) resolves a lot of ambiguity -- it's also cheaper than having to check
MPI_Initialized
andMPI_Finalized
on every MPI call, so I'd be in favor of going that way. That's also a pretty small/localized change.On the bigger issue (wrap everything with SMPI vs separate code paths), I think one might have try and look at it to decide what's better.
The text was updated successfully, but these errors were encountered: