Feature: High-level network specification #2050

Merged
98 commits merged on May 22, 2024

AdhocMan
Contributor

@AdhocMan AdhocMan commented Nov 28, 2022

This PR implements a high-level network specification as proposed in #418. It does not include support for gap junctions to allow the use of domain decomposition for some distributed network generation.
The general idea is a DSL based on set algebra, which operates on the set of all possible connections, by selecting based on different criteria, such as the distance between cells or lists of labels. By operating on all possible connections, a separate definition of cell populations becomes unnecessary. An example for selecting all inter-cell connections with a certain source and destination label is:
(intersect (inter-cell) (source-label \"detector\") (destination-label \"syn\"))
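
To make the set-algebra idea concrete, here is a minimal sketch that models a selection as a predicate over (source, destination) site pairs. It is not Arbor's implementation; `Site`, `inter_cell`, `source_label`, `destination_label`, `intersect` and `join` are hypothetical Python stand-ins named after the DSL terms.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class Site:
    gid: int    # index of the cell the site belongs to
    label: str  # label of the connection site


# A selection is modelled as a predicate over (source, destination) pairs.
Selection = Callable[[Site, Site], bool]


def inter_cell() -> Selection:
    return lambda src, dest: src.gid != dest.gid


def source_label(label: str) -> Selection:
    return lambda src, dest: src.label == label


def destination_label(label: str) -> Selection:
    return lambda src, dest: dest.label == label


def intersect(*sels: Selection) -> Selection:
    return lambda src, dest: all(s(src, dest) for s in sels)


def join(*sels: Selection) -> Selection:
    return lambda src, dest: any(s(src, dest) for s in sels)


sources = [Site(g, "detector") for g in range(3)]
dests = [Site(g, "syn") for g in range(3)] + [Site(0, "other")]

sel = intersect(inter_cell(), source_label("detector"), destination_label("syn"))
connections = [(s.gid, d.gid) for s, d in product(sources, dests) if sel(s, d)]
```

Because every selection ranges over all possible pairs, no separate population object is needed in this toy model; restricting to a population is just another predicate.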

For parameters such as weight and delay, a value can be defined in the DSL in a similar way with the usual mathematical operations available. An example would be:
(max 0.1 (exp (mul -0.5 (distance))))
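
As a rough illustration, the value expression above corresponds to the following plain Python function. This is a sketch of what the expression computes, not of how Arbor evaluates it; the function name `weight` is made up here.

```python
import math


def weight(distance: float) -> float:
    # Mirrors (max 0.1 (exp (mul -0.5 (distance))))
    return max(0.1, math.exp(-0.5 * distance))


# Exponential decay with distance, clamped from below at 0.1.
samples = {d: weight(d) for d in (0.0, 1.0, 10.0)}
```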

The position of each connection site is calculated by resolving the local position on the cell and applying an isometry, which is provided by a new optional function of the recipe. In contrast to the usage of policies to select a member within a locset, each site is treated individually and can be distinguished by its position.
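
A small sketch of the idea, assuming for simplicity an isometry made of a rotation about the z-axis followed by a translation. The helper `apply_isometry` is hypothetical and is not the Arbor isometry API.

```python
import math


def apply_isometry(local, angle_z, translation):
    """Rotate a local (x, y, z) point about the z-axis, then translate it."""
    x, y, z = local
    c, s = math.cos(angle_z), math.sin(angle_z)
    rotated = (c * x - s * y, s * x + c * y, z)
    return tuple(r + t for r, t in zip(rotated, translation))


# The same local site position on two differently placed cells.
site_on_cell_0 = apply_isometry((10.0, 0.0, 0.0), 0.0, (0.0, 0.0, 0.0))
site_on_cell_1 = apply_isometry((10.0, 0.0, 0.0), math.pi / 2, (100.0, 0.0, 0.0))

# Global distance between the two sites, as a distance-based selection would see it.
site_distance = math.dist(site_on_cell_0, site_on_cell_1)
```

Two sites with identical local coordinates end up at different global positions, which is what lets each site be distinguished individually.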

Internally, some steps have been implemented in an attempt to reduce the overhead of generating connections:

  • Pre-select source and destination sites based on the selection to reduce the sampling space when possible
  • If selection is limited to a maximum distance, use an octree for efficient spatial sampling
  • When using MPI, only instantiate local cells and exchange source sites in a ring communication pattern to overlap communication and sampling. In addition, this reduces memory usage, since only the current and next source sites have to be stored in memory during the exchange process.
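
The ring exchange in the last point can be sketched without MPI by simulating the ranks in a single process. This is only an illustration of the communication pattern under simplifying assumptions (sites are plain gids, `select` is an arbitrary predicate); the function `ring_exchange` is hypothetical.

```python
def ring_exchange(local_sources, local_dests, select):
    """Simulate the ring pattern without MPI.

    local_sources[r] and local_dests[r] hold the sites owned by rank r.
    Returns, per rank, the connections ending on that rank's destinations.
    """
    num_ranks = len(local_sources)
    # Each rank starts with its own sources in its buffer.
    buffers = list(local_sources)
    connections = [[] for _ in range(num_ranks)]

    for _ in range(num_ranks):
        # With non-blocking MPI, sending the current buffer to the next rank
        # would be started here and overlap with the sampling below.
        for rank in range(num_ranks):
            for src in buffers[rank]:
                for dest in local_dests[rank]:
                    if select(src, dest):
                        connections[rank].append((src, dest))
        # Every rank receives the buffer of the previous rank in the ring.
        buffers = [buffers[(rank - 1) % num_ranks] for rank in range(num_ranks)]
    return connections


# Three simulated ranks with two cells each.
sources = [[0, 1], [2, 3], [4, 5]]
dests = [[0, 1], [2, 3], [4, 5]]
result = ring_exchange(sources, dests, lambda s, d: s != d)
```

Each rank only ever holds one foreign buffer at a time (plus the incoming one in a real non-blocking implementation), which is where the memory saving comes from.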

Custom selection and value functions can still be provided by storing the wrapped function in a dictionary with an associated label, which can then be used in the DSL.

Some challenges remain. In particular, how to handle combined explicit connections returned by connections_on and the new way to describe a network. Also, the use of non-blocking MPI is not easily integrated into the current context types, and the dry-run context is not supported so far.

Example

A (trimmed) example in Python, where a ring connection is combined with random connections based on distance:

class recipe(arbor.recipe):
    def cell_isometry(self, gid):
        # place cells with equal distance on a circle
        radius = 500.0 # μm
        angle = 2.0 * math.pi * gid / self.ncells
        return arbor.isometry.translate(radius * math.cos(angle), radius * math.sin(angle), 0)

    def network_description(self):
        seed = 42

        # create a chain
        ring = f"(chain (gid-range 0 {self.ncells}))"
        # connect front and back of chain to form ring
        ring = f"(join {ring} (intersect (source-cell {self.ncells - 1}) (destination-cell 0)))"

        # Create random connections with probability inversely proportional to the distance within a
        # radius
        max_dist = 400.0 # μm
        probability = f"(div (sub {max_dist} (distance)) {max_dist})"
        rand = f"(intersect (random {seed} {probability}) (distance-lt {max_dist}))"

        # combine ring with random selection
        s = f"(join {ring} {rand})"
        # restrict to inter-cell connections and certain source / destination labels
        s = f"(intersect {s} (inter-cell) (source-label \"detector\") (destination-label \"syn\"))"

        # normal distributed weight with mean 0.02 μS, standard deviation 0.01 μS
        # and truncated to [0.005, 0.035]
        w = f"(truncated-normal-distribution {seed} 0.02 0.01 0.005 0.035)"
        # fixed delay
        d = "(scalar 5.0)"  # ms delay

        return arbor.network_description(s, w, d, {})
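
To get a feeling for the numbers in this example, the following sketch evaluates the distance-dependent probability in plain Python. It assumes `ncells = 10` (the example does not state a value) and that every connection site sits at the cell origin, so the site distance equals the distance between cell positions; both are assumptions for illustration only.

```python
import math

ncells = 10       # assumed; not given in the example above
radius = 500.0    # μm, as in cell_isometry above
max_dist = 400.0  # μm, as in network_description above


def position(gid):
    angle = 2.0 * math.pi * gid / ncells
    return (radius * math.cos(angle), radius * math.sin(angle), 0.0)


def probability(src_gid, dest_gid):
    d = math.dist(position(src_gid), position(dest_gid))
    if d >= max_dist:
        # (distance-lt max_dist) excludes these pairs
        return 0.0
    # (div (sub max_dist (distance)) max_dist)
    return (max_dist - d) / max_dist


neighbour = probability(0, 1)  # adjacent cells on the circle
second = probability(0, 2)     # next-nearest cells
opposite = probability(0, 5)   # cells on opposite sides
```

Under these assumptions only adjacent cells are closer than `max_dist`, so the random part of the selection can only add connections between ring neighbours.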

TODO

  • Export function to allow inspection of generated connections
  • Further testing of distributed network generation
  • Documentation

@AdhocMan AdhocMan marked this pull request as ready for review December 13, 2022 10:27
@brenthuisman
Contributor

@thorstenhater thorstenhater self-requested a review December 14, 2022 14:45
@llandsmeer
Contributor

I think the idea of using a high level algebra to define networks is really really nice. However, I do not fully see how the PR in its current form would help with creating 'realistic' neural networks as none of the functions use things like soma-soma distances or placed synapse positions. Eg, if I generate 2000 random morphologically realistic cells with synapses randomly distributed over the axons and dendrites, I would like to connect up those within a 50um range with 10% probability. Not any random cell to any other random cell..

Again, I really like the approach, but I don't yet see who the target audience is. One of the use cases I see is generating large benchmarks, which is useful for Arbor developers looking at the scaling capacity. Or I guess generating small-world etc. networks would also fit this method, making it useful for these network-science-on-neurons papers that look at the effect of topology on dynamics.

Are you planning on adding cell-location dependent network selection methods or is that intrinsically impossible in arbor?

@AdhocMan
Contributor Author

The idea is to provide the building blocks for such cases, without having to extend Arbor to use otherwise unnecessary information like a global cell position. So assuming the user has a way of generating or reading the cell position, a custom selection can be created with the help of a uniform random network value, for example:

class random_distance_selection:
    def __init__(self):
        self.uniform_rand = arbor.network_value.uniform_distribution(42, 0.0, 1.0)

    def __call__(self, src, dest):
        distance = norm(pos(src) - pos(dest))
        if distance > 50:
            return False

        return self.uniform_rand(src, dest) < 0.1

This way, arbor takes care of necessary logic of generating repeatable random numbers for pairs of source and destination, as well as only sampling locally required connections.
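
The property described here, a random number that depends only on the seed and the (source, destination) pair, can be sketched with a hash. The PR itself includes Random123 headers; the SHA-256 based helper below is a stand-in chosen only because it is in the Python standard library, and `uniform_for_pair` is a hypothetical name.

```python
import hashlib
import struct


def uniform_for_pair(seed: int, src: int, dest: int) -> float:
    """Map (seed, src, dest) to a reproducible value in [0, 1)."""
    key = struct.pack("<QQQ", seed, src, dest)
    digest = hashlib.sha256(key).digest()
    # Keep 53 bits so the result is an exactly representable double.
    bits = int.from_bytes(digest[:8], "little") >> 11
    return bits / float(1 << 53)


# The value does not depend on evaluation order or on which rank asks.
first = uniform_for_pair(42, 3, 7)
again = uniform_for_pair(42, 3, 7)
reverse = uniform_for_pair(42, 7, 3)
```

Since no generator state is shared between pairs, each rank can sample only the pairs it needs and still agree with every other rank.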

Contributor

@brenthuisman brenthuisman left a comment

Great work!

A few comments, I think the documentation can use a bit more love.

In terms of selection operators: maybe one or more should be built in, for instance one that takes into account distance (to soma, other synapse sites, synapse distance?). Not sure what the full list would be here, but e.g. being able to specify a maximum distance or make long distance connections less likely would be clearly useful.

doc/concepts/interconnectivity.rst Outdated
doc/concepts/interconnectivity.rst Outdated
doc/concepts/interconnectivity.rst Outdated
doc/concepts/interconnectivity.rst Outdated
doc/concepts/interconnectivity.rst Outdated
arbor/include/arbor/network.hpp Outdated
doc/concepts/interconnectivity.rst Outdated
@thorstenhater
Contributor

Hi @AdhocMan,

sorry for adding to an already long list of comments, but this topic has been near and dear to my heart for a long time.
Thanks for taking such good care of it!

Before I start delving into the details of the code, though, some high-level comments:

  1. I think the docs need some love, especially the motivation and a more prominent first example in that section.
    How about contrasting the 'old' way and the high-level DSL on for example the Brunel connectivity as an appetizer?
  2. I agree with Lennart that these spatial queries are the number one concern of real world network building. They need
    an easy to find explanation. A built-in solution would probably be appreciated by users.
  3. I am unsure about the performance implications. In the old style each gid simply states its incoming connections
    and later resolves labels. That's basically $\mathcal{O}(1)$ per connection. In your proposal I think I see costs of
    $\mathcal{O}(N_\mathrm{cell})$. Is that correct?
    What's potentially worse, though, is that the Python object is called via trampolining from C++. See discussion on
    Performance: Investigate setup times #1850; but the overhead is Bad. Do you have some (scaling) numbers?

NB. 3 doesn't mean your proposal is infeasible or needs more work in this direction. Just that we should be aware of it
and potentially add a note in the docs.

@AdhocMan
Contributor Author

AdhocMan commented Jan 2, 2023

1. I think the docs need some love, especially the motivation and a more prominent first example in that section.
   How about contrasting the 'old' way and the high-level DSL on for example the Brunel connectivity as an appetizer?

2. I agree with Lennart that these spatial queries are the _number one_ concern of real world network building. They need
   an easy to find explanation. A built-in solution would probably be appreciated by users.

3. I am unsure about the performance implications. In the old style each `gid` simply states its incoming connections
   and later resolves labels. That's basically O(1) per connection. In your proposal I think I see costs of
   O(Ncell). Is that correct?
   What's potentially worse, though, is that the Python object is called via trampolining from C++. See discussion on
   [Performance: Investigate setup times #1850](https://github.com/arbor-sim/arbor/issues/1850); but the overhead is Bad. Do you have some (scaling) numbers?

NB. 3 doesn't mean your proposal is infeasible or needs more work in this direction. Just that we should be aware of it and potentially add a note in the docs.

Thanks for the feedback.

In general, this new way of specifying connections is not meant as a replacement of the current way. For simple connection setups, the current style is likely a better fit. So when comparing performance, a fairer comparison would include the work required to know which connections to return. In particular, generating random connections between all cells with a non-uniform probability is inherently an $\mathcal{O}(N_{cell})$ operation per cell.

To address the case of spatial queries, I'm working on an extension of the implementation, that includes support for cell locations. It will not take cell structure into account, to avoid the complexities like cell orientation, local label resolution and locsets containing multiple points. Assuming that limiting connections to a maximum distance is a common use case, one can limit the number of cells that need to be sampled by using a spatial data structure (an octree for example).
This will require to store location data of all cells, so memory could become an issue for very large models, but I don't see a good way of avoiding this.
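
To illustrate how a spatial data structure limits the number of candidates that need to be sampled, here is a sketch that uses a uniform grid instead of an octree. The grid is a simpler technique swapped in for brevity, and `build_grid` and `neighbours_within` are hypothetical helpers, not part of Arbor.

```python
import math
from collections import defaultdict
from itertools import product


def build_grid(points, cell_size):
    """Bucket point indices by the integer grid cell containing them."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[tuple(math.floor(c / cell_size) for c in p)].append(idx)
    return grid


def neighbours_within(points, grid, cell_size, query, max_dist):
    """Return indices of points closer than max_dist to query."""
    reach = math.ceil(max_dist / cell_size)
    centre = tuple(math.floor(c / cell_size) for c in query)
    found = []
    # Only grid cells that can contain a point within max_dist are visited.
    for offset in product(range(-reach, reach + 1), repeat=3):
        key = tuple(c + o for c, o in zip(centre, offset))
        for idx in grid.get(key, ()):
            if math.dist(points[idx], query) < max_dist:
                found.append(idx)
    return sorted(found)


points = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (0.0, 45.0, 0.0), (200.0, 0.0, 0.0)]
grid = build_grid(points, cell_size=50.0)
near = neighbours_within(points, grid, 50.0, query=(0.0, 0.0, 0.0), max_dist=50.0)
```

As noted above, all point locations have to be stored to build such a structure, which is the memory cost of the faster lookup.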

Regarding the performance of call backs to Python:
The idea is to provide functionality for common cases when possible, such that hopefully a custom selection is not necessary. However, the network generation happens before any label resolution and cell instantiation, so it's not possible to provide a selection based on things like cell type without help from the user. If you see a good way of avoiding call backs to Python for such cases, suggestions would be welcome. Otherwise, there could just be a recommendation to use C++ for large custom network generation. For the new spatial feature, I could imagine something similar to the inhomogeneous expressions, allowing the user to specify a probability function based on the distance between cells.

Once spatial extension is implemented, I'll have another go at the documentation.

@brenthuisman brenthuisman linked an issue Jan 11, 2023 that may be closed by this pull request
@thorstenhater
Contributor

thorstenhater commented Jan 11, 2023

Hi @AdhocMan,

I'd like to propose a way that solves the spatial query requirement and buys us much flexibility in the longer
run. I am expecting people to start using advanced connectivity soon, as work is going on concerning large
dynamically-reconnecting networks. That will naturally want to use spatial information.

So, here's the idea: We add a new callback on recipe that looks like this

std::map<std::string, std::any> metadata_on(cell_gid_type gid);

and returns freely configurable metadata. A default set can be implemented and merged with the user-supplied
one (user taking precedence).

Now filters and operations can make use of these fields by selecting them as

(metadata "location" cell)

It's also extensible by the user, who can then add their own callbacks and so on. Others that come to mind
are the presence of a certain synapse type, belonging to a certain (sub-)population, ....

What do you think?

@thorstenhater
Contributor

Here's another design-level bit of feedback: I'd like to isolate the user from the actual gid (and similar low-level bits and bobs) as much as possible.

Rationale for that is that this type of information makes recipes non-composable. Consider two populations -- possibly
even given as their own recipe each -- that we want to wire together. If we define them as ranges over gids, then
order matters and users need to care about which gid is which type/population. If, however, we define population as
some predicate on metadata, this hurdle disappears.

That these elaborate to actual sets of numbers at some point is a given, sure, but the user shouldn't handle those.
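
The composability argument can be illustrated with a toy model in which gids are assigned by position when populations are combined. Everything here (`combine`, `select`, the `"kind"` metadata key) is hypothetical and only sketches the proposal; it is not part of this PR.

```python
def combine(*populations):
    """Concatenate populations; gids are assigned by position."""
    return [meta for pop in populations for meta in pop]


def select(cells, predicate):
    """Return the gids whose metadata satisfies the predicate."""
    return [gid for gid, meta in enumerate(cells) if predicate(meta)]


def is_inhibitory(meta):
    return meta["kind"] == "inh"


excitatory = [{"kind": "exc"} for _ in range(4)]
inhibitory = [{"kind": "inh"} for _ in range(2)]

# The same two populations combined in different orders.
order_a = combine(excitatory, inhibitory)
order_b = combine(inhibitory, excitatory)

gids_a = select(order_a, is_inhibitory)
gids_b = select(order_b, is_inhibitory)
```

A hard-coded gid range would select different cells in the two orderings, while the predicate picks out the same population in both.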

@brenthuisman
Contributor

@AdhocMan How's the feature going? If you're stuck, please let me know, so we can get you going again :)

@AdhocMan
Contributor Author

@AdhocMan How's the feature going? If you're stuck, please let me know, so we can get you going again :)

I don't have as much time as I was hoping for to work on this, but the spatial support is almost done. Mainly some more tests and adjustment of the documentation / examples left to do.

@thorstenhater Sorry, for the late reply to your suggestions.
One of the design choices was to keep the new network generation feature independent of the recipe. This way any potential circular dependency is avoided and the network generation can easily be tested outside of a full simulation, for example to visualize connections first. So adding a meta data function to the recipe would imply major changes to the current design.

Regarding the use of gid when defining populations:
As far as I understand, the user has to know about gid when designing the recipe anyway, so I'm not sure how much of a complication this adds. The focus of this design is not on the population, but on the network description through the network_selection and network_value types, which do not depend on gid and are easily transferable between recipes.

In principle, I like the idea of a meta data map, but I'm concerned about how efficient this would be in terms of performance and memory usage, since it would mean creating several strings and std::any objects for all cells on every rank. Only accessing a cell position through a call back would increase the number of call backs significantly again, since this needs to be accessed when generating connections for each cell. In contrast, having all cell locations supplied in a single container allows for easy construction of an octree, which then can speed up network generation significantly. The spatial feature I'm working on relies on supplying a std::vector of coordinates for every range of cells instead.

@thorstenhater
Contributor

The advantage of the user-defined metadata approach is that it is infinitely extensible. Overheads are in the range
of: 1x typeid for the tag and 1x void*, ie quite bearable, especially if we pull out the common one(s), like spatial
information. So, one could have this:

struct meta {
  size_t gid;
  vec3 pos;
  quaternion q;
  // user-defined, type-erased entries keyed by name (cf. metadata_on above)
  std::map<std::string, std::any> user;
};

We use similar type-erased tags for probes, cell-information, and more. My strong advice is to think about at least
one iteration of this further than this current instance and make an extensible system.

Note that simple position is not enough, as one might have to handle individual locations on the cell (locset),
which then need to be turned into a list of points via place_pwlin and a quaternion.

One of the design choices was to keep the new network generation feature independent of the recipe.

I think this might be a noble goal that raises more issues than it solves. Currently the network is encoded in the
recipe to keep things localised. Why not keep it this way rather than adding another layer that goes against the
grain of the current design?

@w-klijn
Contributor

w-klijn commented Mar 6, 2024

Acknowledging the risk that I will be the reviewer....
hashtag:Bump

@AdhocMan
Contributor Author

This PR has dropped off my radar unfortunately.
I've now merged the current master branch and I think it should be complete.
So a review would be welcome.

@thorstenhater
Contributor

@w-klijn could you give it a read?
I'll take care of the technical bit, so focus on the API from the viewpoint of a domain scientist.

Contributor

@thorstenhater thorstenhater left a comment

A massive undertaking, can't wait to have it in. I went over mostly code and internals, so @w-klijn can give it a look, too.

static network_selection destination_cell(gid_range range);

// Select connections that form a chain, such that source cell "i" is connected to the destination cell "i+1"
static network_selection chain(std::vector<cell_gid_type> gids);
Contributor

For one, shifted_by(-1) subsumes chain_reverse. In general, it adds more power for the user at essentially zero cost to us. No harm in keeping chain and chain_reverse, while adding cyclic_shift and using it to implement the other two.
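
A sketch of the reviewer's point, under the assumption that `chain` connects cell i to cell i+1 and `chain_reverse` connects cell i+1 to cell i. The function `shifted_pairs` is hypothetical; `shifted_by` and `cyclic_shift` are names from the suggestion above, not confirmed API.

```python
def shifted_pairs(gids, shift, cyclic=False):
    """Connect gids[i] -> gids[i + shift]; wrap around if cyclic."""
    n = len(gids)
    pairs = []
    for i in range(n):
        j = i + shift
        if cyclic:
            j %= n
        elif not 0 <= j < n:
            # Without wrap-around, the ends of the list have no partner.
            continue
        pairs.append((gids[i], gids[j]))
    return pairs


gids = [0, 1, 2, 3]
chain = shifted_pairs(gids, 1)
chain_reverse = shifted_pairs(gids, -1)
ring = shifted_pairs(gids, 1, cyclic=True)
```

With a single shift parameter, both chain variants and the closed ring fall out of one function.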

return generate_network_connections(rec_shim, ctx->context, decomp.value());
},
"recipe"_a,
pybind11::arg_v("context", pybind11::none(), "Execution context"),
Contributor

Please don't misuse arg_v for doc strings. The last argument is a string representing the default value if it is not expressible using pybind. See https://pybind11.readthedocs.io/en/stable/advanced/functions.html#default-arguments-revisited

Contributor Author

Should be fixed now

ext/pybind11 Outdated
Contributor

Any reason to bump this? We usually stick to the rule of upgrading dependencies in

  1. separated PRs
  2. if you absolutely must, please bump all references to it: docs, pyproject, setup, ...

Contributor Author

Not sure how this was changed. It's reverted back now.

High Level Network Description
------------------------------

As an alternative to providing a list of connections for each cell in the :ref:`recipe <modelrecipe>`, arbor supports high-level description of a cell network. It is based around a ``network_selection`` type, that represents a selection from the set of all possible connections between cells. A selection can be created based on different criteria, such as source or target label, cell indices and also distance between source and target. Selections can then be combined with other selections through set algebra like expressions. For distance calculations, the location of each connection point on the cell is resolved through the morphology combined with a cell isometry, which describes translation and rotation of the cell.
Contributor

One thing about alternative here, from the code I read that it is additive instead. Is that correct?

Contributor Author

Changed to "additional option"

Contributor

A more general note about the DSL design here. Currently, we are duplicating operators from other places, e.g.:

  • (radius-lt ...) (morph) and (distance-lt) (network), which could be (< (radius ...)) and (< (distance ...)). I know that there are reasons for specialisation here, but it seems awkward and possibly harder to maintain. If specialisation is needed we could contract these particular forms to their optimised variant.
  • (exp ...) and (log ...) already appear in iexpr.

Contributor Author

Adding these operators would be too much for me right now unfortunately.
The parsing of the network DSL is completely separate from the iexpr parsing, so there should be no conflict for duplicate expressions.

arbor/communication/distributed_for_each.hpp
@@ -47,17 +48,18 @@ class ARB_ARBOR_API domain_decomposition {
int gid_domain(cell_gid_type gid) const;
int num_domains() const;
int domain_id() const;
cell_size_type index_on_domain(cell_gid_type gid) const;
Contributor

Possibly expose std::pair<int, cell_size_type> gid_domain(cell_gid_type). This might allow skipping some duplicated calls to the inner function. Second, how about adding a proper type for the returned value?

Contributor Author

This would be a breaking API change. Internally, we never use them together, so there would be no direct benefit.
But if you prefer, I can merge these functions.

if (leaf_d.empty()) return;

min_.fill(std::numeric_limits<double>::max());
max_.fill(-std::numeric_limits<double>::max());
Contributor

Suggested change
- max_.fill(-std::numeric_limits<double>::max());
+ max_.fill(std::numeric_limits<double>::min());

Contributor Author

The min() value is actually the smallest value close to 0. I made the same mistake initially ;)

Contributor

I never knew. That would have tripped me up bad. Yet -max might not be representable.

For floating-point types with denormalization, min() returns the minimum positive normalized value. Note that this behavior may be unexpected, especially when compared to the behavior of min() for integral types. To find the value that has no values less than it, use lowest() (since C++11).

lowest seems to be the right pick here.

Contributor Author

Yes, lowest seems to be the best fit. I've changed it now.
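
The same pitfall exists in Python, where `sys.float_info.min` is also the smallest positive normalized double rather than the most negative value. The sketch below computes a bounding box with the correct initial values; `bounding_box` is a hypothetical helper and not the spatial tree code from this PR.

```python
import sys


def bounding_box(points):
    """Return per-dimension minima and maxima of a list of points."""
    dim = len(points[0])
    lo = [sys.float_info.max] * dim
    # -max is the most negative finite double, the analogue of lowest().
    hi = [-sys.float_info.max] * dim
    for p in points:
        for i, c in enumerate(p):
            lo[i] = min(lo[i], c)
            hi[i] = max(hi[i], c)
    return lo, hi


# All coordinates are negative: starting the maxima at float_info.min
# (a tiny positive number) would leave them unchanged and give a wrong box.
lo, hi = bounding_box([(-3.0, -1.0, -2.0), (-5.0, -4.0, -6.0)])
```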


spatial_tree &operator=(const spatial_tree &) = default;

spatial_tree &operator=(spatial_tree &&t) {
Contributor

I wonder whether std::swapping with t might be better. At least it avoids issues with exceptions.

Contributor Author

Using std::swap for the container type now.

Contributor Author

After another look at it, I think using std::swap doesn't really add a benefit here, since we end up using move assignment anyway. It should be noexcept however.

arbor/util/spatial_tree.hpp
@llandsmeer
Contributor

I just came across this PR in my inbox, looks great! I tried to port the brunel network to it but ran into some minor issues, maybe this is useful:

  • seed is missing in the documentation for random (listed now as (random p:real))
  • it might be useful to list the 'return types' of the function in the documentation, eg. (random p:real) -> network-value, (gid-range ..) -> gid-range
  • there is a space missing (very minor) in the argument count error: No matches for 'if-else' with 3arguments:

Error reporting is a bit sparse in general (what is unknown, which of s, w, d is wrong?), eg I got this:

RuntimeError: error in label description: No matches for 'if-else' with 3arguments: (unknown unknown unknown)
  There are 1 potential candidates:
  Candidate 1  Returns the first network-value if a connection is the given network-selection and the second network-value otherwise. 3 arguments: (sel:network-selection, true_value:network-value, false_value:network_value) at :1:2

for what I thought would be a fine expression:

(if-else
        (source-cell (gid-range 0 400))
        (random 42 0.000125)
        (random 42 0.0005)
)

So the condition gets reported as unknown even though it's the branches that are of the wrong type, the working expression being:

(random
        42
        (if-else
            (source-cell (gid-range 0 400))
            (scalar 0.000125)
            (scalar 0.0005)
        )
)

which was a bit counterintuitive. Maybe an if-else for network selections would be nice.

@AdhocMan
Contributor Author

AdhocMan commented Apr 9, 2024

Thanks @thorstenhater and @llandsmeer for the review.

For one, shifted_by(-1) subsumes chain_reverse. In general, it adds more power for the user at essentially zero cost to us. No harm in keeping chain and chain_reverse, while adding cyclic_shift and using it to implement the other two.

There is no reply field for this comment, so I'm responding here. The gid_range has an optional step size parameter. So I think together with chain_reverse this additional power is already there. For complete control, the option to provide a custom ordering through a list of indices is also there.

@llandsmeer I've fixed the issues you mentioned and improved the error message. It should now include the correct type names. The addition of return types in the documentation would be inconsistent with how we have documented these expressions so far. Since they are grouped by return type though, I think it's not quite necessary.

The example you showed is quite specific, since the distinction between the two selections is just a parameter. The DSL for selections is designed more generally. So one would usually use operations like join and intersect. For your example, this would be something like:

condition = "(source-cell (gid-range 0 400))"
true_sel = "(random 42 0.000125)"
false_sel = "(random 42 0.0005)"
selection = f"(join (intersect {condition} {true_sel}) (intersect (complement {condition}) {false_sel}))"
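
That this join/intersect/complement construction behaves like an if-else on selections can be checked with plain Python sets over a small universe of (source, destination) pairs. The three sets below are arbitrary stand-ins chosen for illustration; they are not the random selections from the example.

```python
from itertools import product

gids = range(6)
universe = set(product(gids, gids))

# Stand-ins for the three selections; any sets of pairs would do.
condition = {(s, d) for s, d in universe if s < 3}
true_sel = {(s, d) for s, d in universe if (s + d) % 2 == 0}
false_sel = {(s, d) for s, d in universe if d == 0}

# (join (intersect condition true_sel) (intersect (complement condition) false_sel))
complement = universe - condition
selection = (condition & true_sel) | (complement & false_sel)

# Pointwise if-else for comparison.
expected = {
    pair
    for pair in universe
    if (pair in true_sel if pair in condition else pair in false_sel)
}
```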

@@ -32,6 +32,19 @@ T constexpr area_circle(T r) {
return pi<T> * square(r);
}

template <typename T, typename U, typename = std::enable_if_t<std::is_unsigned_v<U>>>
Contributor

Suggestion here:

/// Exponentiation for integral exponents. Uses exponentiation by squaring
template <typename T, typename I, typename = std::enable_if_t<std::is_integral_v<I>>>
T constexpr pow(T base, I exp) {
    // 0^0 and 0^-n are undefined
    if (base == 0 && exp <= 0) return NAN;
    // trivial cases
    if (base == 0 || base == 1) return base;
    // negative exp
    if (exp < 0) {
       exp = -exp;
       base = 1/base;
    }
    // log2(exp)
    T acc = 1;
    for (; exp > 0; exp >>= 1) {
         if (exp & 1) {
              // NOTE: Could get cute here and eliminate the branch.
              acc *= base;
              exp -= 1;
         } 
         base *= base;
    }
    return acc;
}
  1. it's iterative instead of recursive
  2. allows more exponents
  3. Simpler test for odd numbers

Contributor Author

@AdhocMan AdhocMan Apr 11, 2024

I like this version in general. But what if T is an integer type? Returning NAN would be a problem.
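
For reference, here is a Python sketch of the iterative exponentiation-by-squaring idea, including one possible policy for the integer case raised here: raise an error instead of returning NaN. This policy is an assumption for illustration, not what was merged, and it is deliberately strict (it rejects every negative exponent for an integer base); `int_pow` is a hypothetical name.

```python
import math


def int_pow(base, exp: int):
    """Raise base to an integer power using exponentiation by squaring."""
    if base == 0 and exp <= 0:
        # 0^0 and 0^-n are treated as undefined, as in the suggestion above.
        if isinstance(base, int):
            raise ValueError("undefined for an integer base")
        return math.nan
    if exp < 0:
        if isinstance(base, int):
            raise ValueError("negative exponent needs a non-integer base")
        base, exp = 1 / base, -exp
    acc = 1
    while exp > 0:
        if exp & 1:
            # Lowest bit set: multiply the current power of base in.
            acc *= base
        base *= base
        exp >>= 1
    return acc


cube = int_pow(3, 5)
quarter = int_pow(2.0, -2)
```

The loop runs once per bit of the exponent, so the number of multiplications grows with log2(exp) rather than with exp.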

#include <arbor/network.hpp>

#include <Random123/threefry.h>
#include <Random123/boxmuller.hpp>
Contributor

Our use of R123 has grown and I have an open issue to use it in schedule, too. (#2243)
Thus, I'll use that opportunity to consolidate and make some thin abstractions.

@w-klijn
Contributor

w-klijn commented Apr 10, 2024

Based on the discussion, I know how complex the challenge is that you have solved here. When reading the example code, it is extremely impressive that you managed to get this into such a terse and succinct description.

The change introduces a completely new language with a complex grammar. The provided example with API does not do the work justice. I would strongly suggest adding a how-to and a more detailed explanation of how to use it, but only after merging, as this functionality is valuable as it is and should not be lost.
Take the effort to add a how-to, a detailed explanation and a guide after the fact. This is an important change with great potential.

Non blocking remarks:

  1. install manual still uses pip (python3 has been default for 90% of the unix world). Not your problem though.

  2. The documentation throws a lot of things to the user. (https://arbor--2050.org.readthedocs.build/en/2050/concepts/interconnectivity.html)
    I would try start with explaining that a new language will be used to define networks. I would suggest to build up to the existing example in smaller steps.

a. Would it be possible to add a smallest trivial example? Connect two neurons together with this language?

b. I would suggest explaining what is expected in the return values (return A.network_description(s, w, d, {})).
The single value outputs are not informative.
(Potentially this is explained elsewhere; if so, provide a link to that location.)

Observation
3. The C++ and python examples have a difference
p95 chain = ... vs c99 ring = ....
The assignment in c++ I think is conceptually soso, as at the first assignment the content of 'ring' is a chain

@AdhocMan
Contributor Author

@w-klijn Thanks for your feedback!
The documentation certainly could be improved.

One important inclusion at the moment are intra-cell connections, so connections with the source and target on the same cell. To avoid generating such connections, one has to always include the (inter-cell) restriction, which might be easily forgotten.
I'm curious if you think intra-cell connections should be allowed or rather be excluded for ease of use?

@llandsmeer
Contributor

Something that I couldn't find in the documentation/api which would be really useful, is obtaining the generated connectivity. How does that happen (if possible)?

@AdhocMan
Contributor Author

You can generate network connections and inspect them by using the function generate_network_connections, which takes a recipe and optionally a distributed context and domain decomposition. It's mentioned briefly in the documentation, but was missing from the api documentation. I've added it now.

Any opinions on keeping the generation of intra-cell connections or excluding them to avoid having them accidentally generated?

Contributor

@thorstenhater thorstenhater left a comment

Hi @AdhocMan,

thanks for undertaking this massive addition to Arbor. There are some smaller kinks
we'll work out over time, starting with a tutorial, but before we get into scope
creep territory let's merge this.

Thanks once again 🎉

@thorstenhater thorstenhater merged commit 689eea3 into arbor-sim:master May 22, 2024
23 of 25 checks passed
@brenthuisman
Contributor

🎉

Development

Successfully merging this pull request may close these issues.

Support high-level network model specification
5 participants