Provide default pretty-printing at repl #314

Open
mars0i opened this issue May 3, 2017 · 6 comments

mars0i commented May 3, 2017

This is a followup to this conversation in the Numerical Clojure group.

I've been playing around with OCaml lately, using matrices from the Owl numerical library. Owl matrices automatically display themselves at the repl in a form that I find very convenient. (It's not perfect, but the flaws are insignificant. See examples below.)

I've also been struck by the fact that neither of the set implementations in the two main OCaml standard libraries (Batteries and Core.Std) displays the contents of a set at the repl. Unlike with Clojure sets, in OCaml you have to explicitly extract a set's elements as a list in order to see them. (I submitted an issue about this for Batteries, so maybe it will change in the future.)

Both of these experiences have brought home to me how valuable it is to be able to easily see the contents of data structures when you're experimenting in a repl. I've come to feel that the fact that the contents of most Clojure data structures are automatically displayed in readable formats is something that makes a significant contribution to the value of the repl.

So if core.matrix automatically displayed matrices of all implementations in the same (or similar) convenient form(s) at the repl, this would provide a real benefit. At present, some matrix implementations display matrices in an inconvenient form (see Google group discussion), and some implementations, such as aljabr, don't display matrices' contents at all.

Explicitly using pm to display matrix contents is a workaround, but it's not a full solution: it interrupts the exploratory process at a repl. core.matrix matrices should be like other Clojure data structures in this respect. (Well, maybe not precisely. Clojure data structures normally dump their entire contents to the terminal. I don't think that should be the default for large matrices. See Owl example below.)

OCaml Owl example ("#" is the prompt):

# #require "owl";;
# Owl.Mat.uniform_int 4 4;;

   C0 C1 C2 C3
R0 42 85  3 21
R1 37 43  7 58
R2 70 69 90 14
R3 44 96 91  4

- : mat =
# Owl.Mat.uniform 4 4;;

         C0       C1       C2       C3
R0 0.171658 0.263432 0.449864 0.601548
R1 0.264161 0.998538 0.468655 0.331579
R2 0.405765 0.249231 0.351438 0.104658
R3 0.084066  0.23622 0.533686 0.726109

- : mat =
# Owl.Mat.uniform 1000 1000;;

           C0        C1        C2       C3        C4          C995      C996      C997     C998     C999
  R0 0.128934  0.881205 0.0442074 0.248657 0.0247644 ...  0.272193  0.877651  0.754746 0.821726 0.432518
  R1   0.7856  0.574938   0.31031 0.621463  0.218334 ...  0.372466 0.0338453 0.0184676 0.913124 0.216156
  R2 0.121451  0.992533  0.734909 0.442501  0.546771 ...  0.913111  0.768745  0.438555  0.51801 0.865787
  R3 0.117319 0.0247585  0.322614 0.724303  0.706684 ...  0.369522  0.498305  0.747255 0.947916  0.28679
  R4 0.444937  0.627634  0.265099 0.999857  0.812231 ...  0.532128  0.649032  0.975041 0.280382 0.550449
          ...       ...       ...      ...       ... ...       ...       ...       ...      ...      ...
R995 0.989908  0.887221  0.817898 0.756654  0.697814 ...  0.576036  0.356056   0.51701 0.432439 0.405795
R996 0.583019   0.75629  0.498756 0.390057  0.634644 ...  0.189142  0.940204   0.12525 0.186789 0.306112
R997 0.161567  0.134572  0.444717 0.705073  0.318954 ... 0.0109031   0.37995  0.863467 0.850773 0.606321
R998 0.338409  0.957307  0.501154 0.771503  0.778387 ...  0.792213 0.0681751  0.627889 0.662598 0.892282
R999 0.986906  0.252582  0.582292 0.402688 0.0976644 ...  0.716115  0.397157  0.668927 0.023836 0.541406

- : mat =
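As a rough illustration of the abbreviated layout in the 1000 x 1000 example above, here is a minimal sketch in Python. Python is used only as neutral, runnable pseudocode; this is not Owl's or core.matrix's actual printing code. The function name `format_matrix` and the `max_rows` / `max_cols` parameters are hypothetical. The sketch keeps the first and last few rows and columns and marks the skipped middle with `...`.

```python
def format_matrix(rows, max_rows=10, max_cols=10):
    """Render a 2-D list of numbers with R/C labels, eliding the middle when large.

    Hypothetical helper for illustration; limits are assumed to be at least 2.
    """
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0

    def kept(n, limit):
        # Indices to show: all of them, or the first and last limit // 2.
        if n <= limit:
            return list(range(n)), None
        half = max(limit // 2, 1)
        return list(range(half)) + list(range(n - half, n)), half

    def with_gap(cells, gap):
        # Insert an ellipsis cell at the position where indices were skipped.
        return cells if gap is None else cells[:gap] + ["..."] + cells[gap:]

    row_idx, row_gap = kept(n_rows, max_rows)
    col_idx, col_gap = kept(n_cols, max_cols)

    # The first cell of every line is the row label; the first line is the header.
    table = [[""] + with_gap([f"C{j}" for j in col_idx], col_gap)]
    for i in row_idx:
        cells = [f"{rows[i][j]:g}" for j in col_idx]
        table.append([f"R{i}"] + with_gap(cells, col_gap))
    if row_gap is not None:
        # One line of ellipses stands in for all skipped rows.
        table.insert(row_gap + 1, [""] + ["..."] * (len(table[0]) - 1))

    # Right-align every column to its widest cell.
    widths = [max(len(line[k]) for line in table) for k in range(len(table[0]))]
    return "\n".join(
        " ".join(cell.rjust(w) for cell, w in zip(line, widths)) for line in table
    )


if __name__ == "__main__":
    small = [[42, 85, 3, 21], [37, 43, 7, 58], [70, 69, 90, 14], [44, 96, 91, 4]]
    print(format_matrix(small))
```

For a matrix within the limits, nothing is elided and every entry is shown under its `C`/`R` label. Only larger matrices get the ellipsis row and column.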

mars0i (Author) commented May 3, 2017

Later it occurred to me to add a few remarks on abbreviating large matrices. It seemed appropriate for a separate comment.

Although in OCaml, large data structures such as lists are truncated in the utop repl, in Clojure, this doesn't happen with any of the built-in data structures. I know of at least one plugin that will truncate lists with an ellipsis, but I find it annoying. When I evaluate a large list or a map at the repl, I know what I'm doing, and I often want to see what's at the end of it.

Why should matrices be treated differently in Clojure? Shouldn't large matrices be fully displayed if we're going to have a nice way of displaying matrices by default? No, because matrices have a two-dimensional structure. That's their point. Dumping a large matrix into the space of a small terminal window just produces chaotic output that obscures relationships that are essential to a matrix. There's almost no point to it.

All of the common Clojure data structures are either linear (lists, sequences) or unstructured (sets, maps, records). Dumping them as a long line of text that wraps at the edge of the window is perfectly reasonable. Matrices are different. (You might call a map 2D, with key and value as one dimension, but it's such a short dimension that it's appropriate to display a map as if it's a sequence of pairs.)

(What about n-D arrays? How should they be displayed? I'm not sure, but the R language repl's method seems OK: Display a series of 2-D matrices, one on top of the other. In any event, I presume that 2-D matrices are more common than n-D arrays, and that asking for a nice two-dimensional display of something that has three or more dimensions is probably asking for too much. I'm not asking for that.)

mikera (Owner) commented May 13, 2017

I don't think there is any way we can generically display the contents of a matrix / array at the REPL without changes to Clojure itself (unlikely to happen...). The reason for this is that an arbitrary object may implement the core.matrix protocols, and you have no way to guarantee they will all print in the same way. At best, you could establish a few optional conventions that implementations could follow.

Also, I think that it is pretty hard to define a standard way of printing that works for everyone, e.g.:

  • How to handle arrays that contain heterogeneous types (images?)
  • Whether or not to print column names for arrays that are also datasets
  • At what point do you decide arrays are "too large" to print?
  • How do you format unusual / custom element types?

I'm all for improvements to pm though. The advantage of this approach is that it makes the options explicit, so people can control printing based on their use cases.

mars0i (Author) commented May 13, 2017

Thanks. I understand.

Just thinking out loud: Every implementation has to present some way of displaying matrices, although in some cases this may be done via a default that isn't very explicitly chosen. In some cases this display function seems to incorporate the result of toString (e.g. vectorz, ndarray, aljabr), but in others it doesn't (e.g. clatrix). If my arguments above make sense, could it make sense to have recommended guidelines for this function? Ideally, complying with the guideline should be as easy as possible--perhaps by incorporating the result of pm or a pm-like function into the output. The guidelines would have to be silent about the less common cases, I suppose, but for real and complex numbers, there could be a standard recommended format. Yes, one question would be what to do with column labels. I would suggest recommending including them if they exist, or maybe including short versions of them if they are long. An option would be to recommend printing partly numeric column labels for unlabeled matrices, as in the Owl examples. Ideally truncation width and height could be user-definable. I suppose that's a role for a global variable. (This would allow users to set those variables by querying the terminal settings, if desired, but that's not the job of a core.matrix implementation. Users would have to do that on their own.)

I don't know how much sense this idea makes. I also know that adding an implementation isn't necessarily trivial, so there's a cost to adding another quasi-requirement. However, I feel that the benefit is significant. Clojure is a joy to use. I know I'm not the only person who feels that way. I'm suggesting a way to increase the convenience--the joy--of using core.matrix, which is already a pleasure to use.

mars0i (Author) commented May 14, 2017

I don't mean truly global variables. I had in mind something like the dynamic variables in implementations.cljc.
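To make the idea of scoped, user-settable print limits concrete, here is a rough analogue in Python. Python has no direct equivalent of Clojure's dynamic vars and `binding`, so this sketch approximates them with `contextvars` and a context manager. All names here (`PRINT_MAX_ROWS`, `PRINT_MAX_COLS`, `print_limits`, `current_limits`) are hypothetical and not part of core.matrix.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical settings with defaults; a printer would read them on each call.
PRINT_MAX_ROWS = contextvars.ContextVar("print_max_rows", default=10)
PRINT_MAX_COLS = contextvars.ContextVar("print_max_cols", default=10)


@contextmanager
def print_limits(max_rows=None, max_cols=None):
    """Temporarily override the limits, restoring the previous values on exit."""
    tokens = []
    if max_rows is not None:
        tokens.append((PRINT_MAX_ROWS, PRINT_MAX_ROWS.set(max_rows)))
    if max_cols is not None:
        tokens.append((PRINT_MAX_COLS, PRINT_MAX_COLS.set(max_cols)))
    try:
        yield
    finally:
        # Undo in reverse order so nested overrides unwind correctly.
        for var, token in reversed(tokens):
            var.reset(token)


def current_limits():
    """Return the (max_rows, max_cols) pair in effect right now."""
    return PRINT_MAX_ROWS.get(), PRINT_MAX_COLS.get()


if __name__ == "__main__":
    print(current_limits())
    with print_limits(max_rows=4):
        print(current_limits())
    print(current_limits())
```

The point of the design is that the defaults stay untouched outside the `with` block, much as a `binding` form would leave the root value of a dynamic var alone.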

mikera (Owner) commented May 14, 2017

I guess it would be possible to recommend that implementations define a print-method (https://clojuredocs.org/clojure.core/print-method) for their matrix types that delegates to pm, to get consistent printing.


behrica commented Oct 1, 2017

I also have a bit of a problem with printing large matrices or datasets to the repl screen.

For me it is more a problem of the repl: for example, Emacs crashes if I press "return" too quickly and it tries to print a big dataset.
This is really annoying, but maybe not something to be solved here.

I like the behavior of "dplyr" datasets in R a lot.
If an object is of type "tbl_df", it always gets printed very compactly (the first 10 rows, as many columns as fit side by side, and the other columns only as column names).

In my view the printing method of any data structure should not print more than fits on the screen.
I saw the table implementation at https://github.com/cldwalker/table for printing tables.

It tries to optimize the column widths by finding out the terminal width...
I would print a table by cutting it at screen width and screen height (if detectable) and additionally print the shape.

Maybe an additional protocol function "pmos" (print-on-screen) could be envisioned, which implementations would provide and which tries to avoid printing more than fits on the screen.
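One way to picture the "cut at screen width and height, then print the shape" idea is the following Python sketch. It is only an illustration under stated assumptions, not a proposal for the actual protocol: `fit`, `screen_budget` and `shape_line` are hypothetical names, every cell is assumed to have the same fixed width, and three lines are assumed to be reserved for the header, the shape summary and the next prompt. `shutil.get_terminal_size` is the standard-library call that reports the terminal size, with a fallback when it cannot be detected.

```python
import shutil


def fit(columns, lines, cell_width, label_width=4, reserved_lines=3):
    """Return (max_rows, max_cols) of cells that fit in a columns x lines screen."""
    # Each cell costs its own width plus one separating space.
    max_cols = max((columns - label_width) // (cell_width + 1), 1)
    # Keep some lines free for the header, the shape summary, and the prompt.
    max_rows = max(lines - reserved_lines, 1)
    return max_rows, max_cols


def screen_budget(cell_width, fallback=(80, 24)):
    """Like fit(), but measured against the current terminal if detectable."""
    size = shutil.get_terminal_size(fallback=fallback)
    return fit(size.columns, size.lines, cell_width)


def shape_line(n_rows, n_cols):
    """Summary line to print under a truncated matrix."""
    return f"[{n_rows} x {n_cols} matrix]"


if __name__ == "__main__":
    rows, cols = screen_budget(cell_width=8)
    print(f"room for {rows} rows and {cols} columns")
    print(shape_line(1000, 1000))
```

A printer could pass the resulting pair as its row and column limits, so that the truncation adapts to the window instead of relying on fixed constants.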
