Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prio scheduler crate #3167

Open
wants to merge 15 commits into
base: master
Choose a base branch
from
Open

Conversation

flame4
Copy link

@flame4 flame4 commented Oct 15, 2024

Problem

Currently the definition and implementation of PrioGraphScheduler and related components is coupled with banking-stage on solana-core crate, with most of useful api availabilty a pub(crate), so it's hard to reuse.

Summary of Changes

  • Extract PrioGraphScheduler as indepent component to use.
  • The SchedulerController handles a lot of forwarding and decision making logic, which is not related to scheduer itself, so just left in solana_core.
  • Scheduler is coupled with ImmutableDeserializedPacket. To avoid circular dependencies, just raise up a Trait type DeserializableTxPacket and let ImmutableDeserializedPacket impls this trait. It also helps modularity of solana crate.
  • Impls a unified IdGenerator to replace TransactionIdGenerator and BatchIdGenerator, which is quite same.
  • two duplicated things: TARGET_NUM_TRANSACTIONS_PER_BATCH and ReadWriteAccountSet, have no good idea to public them somewhere.
  • other codes, basically speaking, just copy-paste of the scheduler implementations.

@mergify mergify bot requested a review from a team October 15, 2024 10:33
/// DeserializablePacket can be deserialized from a Packet.
///
/// DeserializablePacket will be deserialized as a SanitizedTransaction
/// to be scheduled in transaction stream and scheduler.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract a trait from ImmutableDeserializedPacket itself, and let ImmutableDeserializedPacket implement this to avoid circular dependency

};

/// Wrapper struct to accumulate locks for a batch of transactions.
#[derive(Debug, Default)]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have no idea about where to public this type. Since the usage on scheduler is only here, i just copy one now.

@flame4 flame4 force-pushed the prio-scheduler-crate branch 2 times, most recently from 739519a to b118a73 Compare October 16, 2024 02:52
@flame4 flame4 force-pushed the prio-scheduler-crate branch from b118a73 to 6ab80ef Compare October 16, 2024 02:54
@buffalojoec buffalojoec added the CI Pull Request is ready to enter CI label Oct 25, 2024
@anza-team anza-team removed the CI Pull Request is ready to enter CI label Oct 25, 2024
Copy link

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey guys, sorry for delay here. I have a few preliminary suggestions before further review.

First and foremost, a few useful tips for opening/managing PRs on Agave.

It's much easier to review when you rebase and force-push than do merge commits.

git fetch <anza-origin> master
git rebase <anza-origin>/master
git push -f

You'll also want to make sure your branch passes formatting, clippy, and tests before pushing.

scripts/cargo-fmt.sh
scripts/cargo-clippy.sh
scripts/cargo-clippy-nightly.sh

Typically I just test the crates I've changed.

cargo t -p solana-core -p solana-prio-graph-scheduler

As for this PR itself, can we break this up, maybe into multiple separate PRs? It's pretty difficult to follow all of the changes.

My suggestion would be to create a PR that initializes the new solana-prio-graph-scheduler crate, and then simply move modules from solana-core directly into solana-prio-graph-scheduler without changing anything. Then just update any necessary imports.

With this approach, it seems like you could move all of these in the same PR:

  • scheduler_messages (from banking_stage)
  • batch_id_generator
  • in_flight_tracker
  • scheduler_error
  • scheduler_metrics
  • thread_aware_account_locks
  • transaction_id_generator
  • transaction_priority_id

From there, it'll be much easier to see in future PRs what the changes are - like introducing traits, etc.


I've actually thrown up an example PR on your repo so you guys can see what this would look like.
soonlabs#3

In my opinion, I think you guys should build on a specific branch, using incremental commits like I did in my PR, to demonstrate the decoupled priograph scheduler and the target implementation. It'll be much easier to follow along that way.

In addition, I think that gives us a good canvas to use for discussions about upstreaming a decoupled priograph scheduler to Agave, considering it seemed like discussions were still ongoing about that.

cc @apfitzge

@@ -445,6 +446,7 @@ solana-package-metadata-macro = { path = "sdk/package-metadata-macro", version =
solana-perf = { path = "perf", version = "=2.1.0" }
solana-poh = { path = "poh", version = "=2.1.0" }
solana-poseidon = { path = "poseidon", version = "=2.1.0" }
solana-prio-graph-scheduler = { path = "prio-graph-scheduler", version = "=2.1.0" }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should go below solana-precompile-error (alphabetical order).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I think this was actually me (heh)!

@@ -73,8 +72,7 @@ mod packet_deserializer;
mod packet_filter;
mod packet_receiver;
mod read_write_account_set;
mod scheduler_messages;
mod transaction_scheduler;
mod scheduler_controller;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is scheduler_metrics not included? The validator absolutely needs these metrics hooked up to its scheduler.

It's also no longer included in the module hierarchy here, btw.

Comment on lines +3 to +17
pub struct IdGenerator {
next_id: u64,
}

impl Default for TransactionIdGenerator {
impl Default for IdGenerator {
fn default() -> Self {
Self { next_id: u64::MAX }
}
}

impl TransactionIdGenerator {
pub fn next(&mut self) -> TransactionId {
impl IdGenerator {
pub fn next<T: From<u64>>(&mut self) -> T {
let id = self.next_id;
self.next_id = self.next_id.wrapping_sub(1);
TransactionId::new(id)
T::from(id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's actually a subtle difference between BatchIdGenerator and TransactionIdGenerator, and that's in the implementation of the Default trait for each. The default value for next_id is zero in BatchIdGenerator (from the derive macro), but it's u64::MAX in TransactionIdGenerator.

Since these two structs don't have any external dependencies, can they be left separate? Each can still be generic over From<u64>.

Comment on lines +13 to +29
solana-sdk = { workspace = true }
solana-poh = { workspace = true }
solana-metrics = { workspace = true }
solana-ledger = { workspace = true }
solana-runtime = { workspace = true }
solana-gossip = { workspace = true }
solana-cost-model = { workspace = true }
solana-measure = { workspace = true }

ahash = { workspace = true }
prio-graph = { workspace = true }
thiserror = { workspace = true }
itertools = { workspace = true }
log = { workspace = true }
crossbeam-channel = { workspace = true }
arrayvec = { workspace = true }
min-max-heap = { workspace = true }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sort these alphabetically and group them together.

solana-sanitize = { workspace = true }
solana-short-vec = { workspace = true }
solana-svm-transaction = { workspace = true }
# let dev-context-only-utils works when running this crate.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this comment? It doesn't really make sense.

Comment on lines +40 to +45
solana-prio-graph-scheduler = { path = ".", features = [
"dev-context-only-utils",
] }
solana-sdk = { workspace = true, features = ["dev-context-only-utils"] }

bincode = { workspace = true }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also apha-order dev-dependencies please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants