Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .clang-format #599

Merged
merged 3 commits into from
Feb 7, 2024
Merged

Conversation

halvko
Copy link
Contributor

@halvko halvko commented Dec 14, 2023

After a bunch of testing, this Clang Format configuration is quite close to what is done most of the time in this repository. The only thing I've found which it breaks consistently is when an UDT is instanciated inline, the braces are split:

struct bar
{ } foo;

becomes

struct bar
{
} foo;

and (maybe worse cause that's how it's often written in this project)

struct bar
{ }
// comment
foo;

becomes

struct bar
{
}
// comment
foo;

. This along with disallowing the rest of multiple operator seperated items in argument position to flow to the last line if they are neccesary to split giving:

throw invalid_argument("Adiar requires at least "
                       + std::to_string(minimum_memory / 1024 / 1024)
		       + " MiB of memory");

becomes

throw invalid_argument("Adiar requires at least "
                       + std::to_string(minimum_memory / 1024 / 1024) + " MiB of memory");

Except for that, all changes I've observed it making is within "normal" for the codebase. Currently (as of clang format 18) the first problem is unfixable - a PR to fix it had been stuck since 2020, and has since been closed. The second I have not been able to find a solution to.

In the name of not letting the perfect be the enemy of the good, I still propose adopting this. It "fixes" a lot of inconsistent formatting as it is now, and I think it is a nice developer experience to be able to write a piece of code, and have most of the formatting taken care of by a tool.

@halvko
Copy link
Contributor Author

halvko commented Dec 14, 2023

To try it out, checkout this branch, and call

clang-format <file-name> -i

to have it reformat .

@halvko halvko force-pushed the feature/format-config branch from 363dd56 to 9e535d9 Compare December 14, 2023 10:57
After a bunch of testing, this Clang Format configuration is quite close
to what is done _most of the time_ in this repository. The only thing
I've found which it breaks consistently is when an UDT is instanciated
inline, the braces are split:
```
struct bar
{} foo;
```
becomes
```
struct bar
{
} foo;
```
and (maybe worse cause that's how it's often written in this project)
```
struct bar
{}
// comment
foo;
```
becomes
```
struct bar
{
}
// comment
foo;
```
. This along with disallowing the rest of multiple operator seperated
items in argument position to flow to the last line if they are
neccesary to split giving:
```
throw invalid_argument("Adiar requires at least "
                       + std::to_string(minimum_memory / 1024 / 1024) + " MiB of memory");
```
instead of
```
throw invalid_argument("Adiar requires at least "
                       + std::to_string(minimum_memory / 1024 / 1024)
		       + " MiB of memory");
```

Except for that, all changes I've observed it making is within
"normal" for the codebase.
@halvko halvko force-pushed the feature/format-config branch from 9e535d9 to cbe623f Compare December 14, 2023 10:58
@SSoelvsten
Copy link
Owner

SSoelvsten commented Dec 15, 2023

Great work. That would definitely be a great addition - especially when others than me actually want to contribute. I'll play around with it and add some discrepancies below.

@SSoelvsten SSoelvsten added the 📁 build CMake, GitHub Actions etc. label Dec 15, 2023
@SSoelvsten SSoelvsten self-requested a review December 15, 2023 16:33
@SSoelvsten
Copy link
Owner

SSoelvsten commented Dec 15, 2023

Conflicting with Emacs

Overall the GNU style is Emacs-compatible. So, if there are settings in the GNU clang-format file that conflicts with this then the GNU setting would be the safest to use. The exception would be all settings that can be resolved with a 'newline' character (Emacs does not add newlines, but has an "intelligent" indentation). In general, I'm ok with being more GNU-like at the compromise of some parts of the codebase (which I indented manually).

Note, specifically indentation to make binary operations (e.g. + and &&) align have been done by-hand and Emacs disagrees.

Indentation of Member list

-    builder_ptr(const internal::node::uid_type &p,
-                const shared_ptr<const builder_shared> &sp)
-      : uid(p), builder_ref(sp)
-    { }
+    builder_ptr(const internal::node::uid_type& p, const shared_ptr<const builder_shared>& sp)
+        : uid(p), builder_ref(sp)
+    {}

Indentation is here 4 spaces rather than 2.

Comma in Member Initialisation List

In src/adiar/internal/data_structures/levelized_priority_queue.h

-    levelized_priority_queue(tpie::memory_size_type memory_given, size_t max_size,
-                             [[maybe_unused]] statistics::levelized_priority_queue_t &stats)
-      : _max_size(max_size)
-      , _memory_given(memory_given)
-      , _memory_for_buckets(memory_given - _memory_occupied_by_merger - mem_overflow_queue(memory_given))
-      , _memory_occupied_by_overflow(mem_overflow_queue(memory_given))
-      , _overflow_queue(mem_overflow_queue(memory_given), max_size)
+    levelized_priority_queue(tpie::memory_size_type memory_given,
+                             size_t max_size,
+                             [[maybe_unused]] statistics::levelized_priority_queue_t& stats)
+        : _max_size(max_size),
+          _memory_given(memory_given),
+          _memory_for_buckets(memory_given - _memory_occupied_by_merger
+                              - mem_overflow_queue(memory_given)),
+          _memory_occupied_by_overflow(mem_overflow_queue(memory_given)),
+          _overflow_queue(mem_overflow_queue(memory_given), max_size)

Here, Emacs would de-indent all but the first member initialisation by two. This is why, I move the , to the start of the line, because then Emacs nicely aligns everything.

Unexpected 4-space indentations

In src/adiar/internal/algorithms/reduce.h. Is this related with trying to align the statement with return?

       adiar_assert(!arcs.can_pull_internal()
-                   || current_map.old_uid == arcs.peek_internal().target(),
       adiar_assert(!arcs.can_pull_internal()
+                       || current_map.old_uid == arcs.peek_internal().target(),
                    "Mapping forwarded in sync with internal arcs");

And in src/adiar/internal/algorithms/prod2.h. Here, it adds an odd additional indentation.

   using prod_priority_queue_1_t =
-    levelized_node_priority_queue<prod2_request<0>, request_data_first_lt<prod2_request<0>>,
-                                  look_ahead,
-                                  mem_mode,
-                                  2,
-                                  0>;
   using prod_priority_queue_1_t =
+      levelized_node_priority_queue<prod2_request<0>,
+                                    request_data_first_lt<prod2_request<0>>,
+                                    look_ahead,
+                                    mem_mode,
+                                    2,
+                                    0>;

And in src/adiar/internal/data_structures/levelized_priority_queue.h

     ////////////////////////////////////////////////////////////////////////////
     /// \brief Total number of data structures in Levelized Priority Queue.
     ////////////////////////////////////////////////////////////////////////////
     static constexpr size_t data_structures =
-      buckets * sorter_t::data_structures + priority_queue_t::data_structures;
     ////////////////////////////////////////////////////////////////////////////
     /// \brief Total number of data structures in Levelized Priority Queue.
     ////////////////////////////////////////////////////////////////////////////
     static constexpr size_t data_structures =
+        buckets * sorter_t::data_structures + priority_queue_t::data_structures;

Emacs would in all of these cases only indent the line with two spaces.

Struct Initialisation

In src/adiar/statistics.cpp, the indentation on the struct misaligns with Emacs.

-    return {
-      // i/o
-      internal::stats_arc_file,
-      internal::stats_node_file,
-
-      // data structures
-      internal::stats_levelized_priority_queue,
-
-      // top-down sweeps
-      internal::stats_count,
-      internal::stats_equality,
-      internal::stats_intercut,
-      internal::stats_optmin,
-      internal::stats_prod2,
-      stats_prod3,
-      internal::stats_quantify,
-      internal::stats_select,
-
-      // bottom-up sweeps
-      internal::stats_reduce,
-
-      // other algorithms
-      internal::nested_sweeping::stats
-    };
+    return {// i/o
+            internal::stats_arc_file,
+            internal::stats_node_file,
+
+            // data structures
+            internal::stats_levelized_priority_queue,
+
+            // top-down sweeps
+            internal::stats_count,
+            internal::stats_equality,
+            internal::stats_intercut,
+            internal::stats_optmin,
+            internal::stats_prod2,
+            stats_prod3,
+            internal::stats_quantify,
+            internal::stats_select,
+
+            // bottom-up sweeps
+            internal::stats_reduce,
+
+            // other algorithms
+            internal::nested_sweeping::stats};

The same also applies to initialisation lists. For example, terminals_above_bottom[2] in src/adiar/internal/io/node_writer.h has a similar behaviour. The }; should preferably be on the next line (both aesthetically and to decrease the git-diff and decrease merge conflicts) and only be indented by 2 spaces.

Odd indentation for function call

In src/adiar/internal/data_structures/levelized_priority_queue.h.

-    static_assert(ptr_uint64::max_label+1 > ptr_uint64::max_label,
-                  "'ptr_uint64::label_type' should leave a window of at least one above 'max_label'");
+    static_assert(
+        ptr_uint64::max_label + 1 > ptr_uint64::max_label,
+        "'ptr_uint64::label_type' should leave a window of at least one above 'max_label'");

Emacs would in this case even with the 'newline' added indent it up to the ( parentheses.

Possible Compromises

Aligning Operators/Assignment

The following is an example (from src/adiar/internal/algorithms/pred.h) of how Emacs usually aligns things. Personally, I prefer the setting you have set. I just want to note, that if you cannot fix anything else without breaking this, then I am ok with that.

     const size_t aux_available_memory = memory_available()
-      // Input
-      - 2*node_stream<>::memory_usage()
-      // Level checker policy
-      - comp_policy::level_check_t::memory_usage();
     const size_t aux_available_memory = memory_available()
+                                      // Input
+                                      - 2 * node_stream<>::memory_usage()
+                                      // Level checker policy
+                                      - comp_policy::level_check_t::memory_usage();

@SSoelvsten
Copy link
Owner

SSoelvsten commented Dec 15, 2023

Other (Personal Preference)

One-liner Constructors

In src/adiar/builder.h

-    builder() noexcept
-    { }
+    builder() noexcept {}

In src/adiar/exec_policy.h

-    exec_policy(const access &am)
-      : _access_mode(am)
-    { }
+    exec_policy(const access& am) : _access_mode(am) {}

Can the constructors always have line breaks after the arguments and the member initialisation list?

Line Breaks on Binary Operations

In src/adiar/exec_policy.h.

-      return this->memory_mode()  == ep.memory_mode()
-          && this->access_mode()  == ep.access_mode()
-          && this->quantify_alg() == ep.quantify_alg()
-        ;
+      return this->memory_mode() == ep.memory_mode() && this->access_mode() == ep.access_mode()
+          && this->quantify_alg() == ep.quantify_alg();

Similarly, in src/adiar/statistics.cpp

     uintwide total_runs = internal::stats_prod2.trivial_file
-                        + internal::stats_prod2.trivial_terminal
-                        + internal::stats_prod2.ra.runs
-                        + internal::stats_prod2.pq.runs;
     uintwide total_runs = internal::stats_prod2.trivial_file
+                        + internal::stats_prod2.trivial_terminal + internal::stats_prod2.ra.runs
+                        + internal::stats_prod2.pq.runs;

If there is a line break, then I would prefer it breaks all of them. The idea being, that the 'vertical' dimension is the conditions and the horizontal then their content. But, it is fair, if that is not possible.

Nested Ternaries

In src/adiar/internal/algorithms/nested_sweeping.h.

-        const size_t stop_level = terminal_stop_level == outer_pq_t::no_label    ? outer_roots_stop_level
-                                : outer_roots_stop_level == outer_pq_t::no_label ? terminal_stop_level
-                                : std::max(terminal_stop_level, outer_roots_stop_level)
-          ;
+        const size_t stop_level = terminal_stop_level == outer_pq_t::no_label
+                                    ? outer_roots_stop_level
+                                : outer_roots_stop_level == outer_pq_t::no_label
+                                    ? terminal_stop_level
+                                    : std::max(terminal_stop_level, outer_roots_stop_level);

While not a pattern-matching formatting, it is very readable. I like it! But, the indentation should probably only be 2 rather than 4 to be consistent with the rest.

Doubly-braced Initialisation

In src/adiar/internal/node_writer.h, the {{ is because it initializes a std::array (the class is the outer-most {...} with an initializer list (the inner-most {...}.

-      const cuts_t all_arcs_cut = {{
-          noa_overflow_safe ? number_of_arcs - number_of_false - number_of_true : cut::max,
-          noa_overflow_safe ? number_of_arcs - number_of_true                   : cut::max,
-          noa_overflow_safe ? number_of_arcs - number_of_false                  : cut::max,
-          noa_overflow_safe ? number_of_arcs                                    : cut::max
-        }};
+      const cuts_t all_arcs_cut = {
+          {noa_overflow_safe ? number_of_arcs - number_of_false - number_of_true : cut::max,
+           noa_overflow_safe ? number_of_arcs - number_of_true : cut::max,
+           noa_overflow_safe ? number_of_arcs - number_of_false : cut::max,
+           noa_overflow_safe ? number_of_arcs : cut::max}};

Copy link
Owner

@SSoelvsten SSoelvsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have exhausted all observations on its changes (noted above). Overall, it does a great job! Most of the changes makes older code align with styling I've adopted recently but not applied everywhere.

If any of the above can be addressed, then that would be greatly appreciated.

@SSoelvsten
Copy link
Owner

SSoelvsten commented Dec 15, 2023

I have looked into how to integrate clang-format into CMake. I have pushed the changes to do so (no conflicts with the config, so you can easily rebase any fixups.

@SSoelvsten SSoelvsten force-pushed the feature/format-config branch from 510222e to 31f6796 Compare December 16, 2023 00:31
@SSoelvsten SSoelvsten merged commit e3ee8f3 into SSoelvsten:main Feb 7, 2024
42 of 53 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📁 build CMake, GitHub Actions etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants