It occurs to me that many public datasets ship with predefined training and testing splits so that results can be compared. We need the ability to supply a --train_dir and --test_dir, or a train.pcap and test.pcap, for this purpose; right now we randomly split whatever data we are given.
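As a rough illustration of the request, here is a minimal sketch of how the two directory flags might be parsed. Only the flag names `--train_dir` and `--test_dir` come from the proposal above; the function name `parse_split_args`, the help text, and the rule that both flags must be given together are assumptions for illustration, not the project's actual CLI.

```python
import argparse


def parse_split_args(argv):
    """Parse the proposed split flags (hypothetical helper, not the real CLI)."""
    parser = argparse.ArgumentParser(description="Predefined train/test split (sketch)")
    parser.add_argument("--train_dir", default=None,
                        help="directory holding the predefined training captures")
    parser.add_argument("--test_dir", default=None,
                        help="directory holding the predefined testing captures")
    args = parser.parse_args(argv)

    # Assumption: a predefined split only makes sense when both sides are given.
    if (args.train_dir is None) != (args.test_dir is None):
        parser.error("--train_dir and --test_dir must be supplied together")

    # When neither flag is set, the caller would keep the current random split.
    args.use_predefined_split = args.train_dir is not None
    return args
```

With neither flag set, `use_predefined_split` is false and the existing random split would still apply.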
Sure, that could work (and I've done it that way in the past).
Alternatively, though this might overload things slightly, it might be easier (for both the user and the implementation) to identify test files via the labeling file…
And perhaps make it smart-ish (though we needn't): say, if only some rows are marked "test", then those are the test files and the rest are train (and likewise, if only some are marked "train" and the rest are left blank, the blank ones are test). But if some are marked "test", some "train", and any are unmarked, we error. (Alternatively, we make it less smart and/or make this column boolean.)
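The "smart-ish" rule above could be sketched roughly as follows. This is only an illustration: the function name `resolve_split`, the filename-to-mark mapping as input, and the choice to raise when no row is marked at all are assumptions, not anything taken from the existing code.

```python
from typing import Dict, List, Optional, Tuple


def resolve_split(marks: Dict[str, Optional[str]]) -> Tuple[List[str], List[str]]:
    """Return (train_files, test_files) from per-file marks.

    `marks` maps a filename to "train", "test", or blank (None or "").
    Hypothetical helper; the real labeling-file format may differ.
    """
    normalized = {name: (mark or "").strip().lower() for name, mark in marks.items()}

    unknown = {m for m in normalized.values() if m not in ("", "train", "test")}
    if unknown:
        raise ValueError(f"unrecognized split marks: {sorted(unknown)}")

    values = set(normalized.values())
    has_train, has_test, has_blank = "train" in values, "test" in values, "" in values

    # Both kinds of marks plus unmarked rows is ambiguous, so error out.
    if has_train and has_test and has_blank:
        raise ValueError("some rows are marked 'train', some 'test', and some are unmarked")
    # Assumption: with no marks at all, the caller should fall back to a random split.
    if not has_train and not has_test:
        raise ValueError("no rows are marked; use the random split instead")

    # Blank rows take whichever role was not marked explicitly.
    default = "test" if has_train and not has_test else "train"
    train = [n for n, m in normalized.items() if (m or default) == "train"]
    test = [n for n, m in normalized.items() if (m or default) == "test"]
    return train, test
```

For example, marking only `a.pcap` as "test" and leaving `b.pcap` blank would put `b.pcap` in the training set. The boolean-column variant would be simpler still, since there would be no ambiguous case to reject.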