builtin: concurrency as builtin functions #224

katcipis · 2017-05-30T21:08:23Z

I thought about modeling our concurrency primitives as functions, instead of syntactic constructions or overloading the rfork call. The idea would be to have two builtin functions:

go()
channel()

The go function will receive a function as a parameter and execute it concurrently, like the go keyword:

go(fn() {
    # Do concurrent stuff
})

The semantics would be fairly similar to Go's (as usual =)).

The trick is in modeling channels as functions, with the channel function returning 3 values: the receive, the send, and the close functions:

receive, send, close <= channel()

The receive and send functions work as a pair: calling receive blocks until send is called, and vice versa.
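
A minimal Go sketch of that shape, assuming an unbuffered channel of strings. The name `newChannel` and the closure names are made up for illustration; nothing here is an existing nash builtin.

```go
package main

import "fmt"

// newChannel is a hypothetical stand-in for the proposed channel() builtin.
// It hides an unbuffered Go channel behind three closures.
func newChannel() (receive func() (string, bool), send func(string), closeCh func()) {
	ch := make(chan string)
	receive = func() (string, bool) {
		v, ok := <-ch // blocks until a send happens or the channel is closed
		return v, ok
	}
	send = func(v string) {
		ch <- v // blocks until a receive happens
	}
	closeCh = func() {
		close(ch)
	}
	return receive, send, closeCh
}

func main() {
	receive, send, closeCh := newChannel()

	// Rough equivalent of go(fn() { ... }): run the sender concurrently.
	go func() {
		send("hello")
		closeCh()
	}()

	v, ok := receive()
	fmt.Println(v, ok)
	_, ok = receive() // the channel is closed by now, so ok is false
	fmt.Println(ok)
}
```

Returning `ok` from receive is one possible answer to the closed-channel question raised in the proposal; it is an assumption of this sketch, not something the proposal defines.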

If the idea makes sense, we need to define how to support buffered channels, and semantics on closed channels. It could be simpler than Go's and not have any kind of select magic, just to enable very simple usages.

One very common use is waiting for N operations to end; it could be something like this (I'll assume integers to make it simpler):

receive, send, close <= channel()
tasks = 10

for i = 0; i < $tasks; i++ {
    go(fn() {
        _, status <= cmd arg
        send($status)
    })
}

for i = 0; i < $tasks; i++ {
    status <= receive()
    print($status)
}

close()

A lot of this code is boilerplate, and further functions could make it even simpler, especially when you just want to exec N concurrent instances of a command and wait for all of them to end; that could be modeled as a helper function in the stdlib.
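
To make the helper idea concrete, here is a small Go sketch of the same N-sends, N-receives pattern. `runAll` is a hypothetical name, and the task function stands in for running a command and returning its exit status.

```go
package main

import "fmt"

// runAll is a hypothetical helper: it runs n copies of task concurrently
// and returns once every one of them has reported a status.
func runAll(n int, task func(i int) int) []int {
	statuses := make(chan int)
	for i := 0; i < n; i++ {
		go func(i int) {
			statuses <- task(i) // exactly one send per task
		}(i)
	}
	results := make([]int, 0, n)
	for i := 0; i < n; i++ {
		results = append(results, <-statuses) // exactly one receive per send
	}
	return results
}

func main() {
	// The task here only fakes an exit status; a real one would run a command.
	results := runAll(10, func(i int) int { return i % 2 })
	fmt.Println(len(results), "tasks finished")
}
```

Because every send is matched by one receive, no close is needed for the caller to know that all tasks have ended. Results arrive in completion order, not in task order.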

This idea is in EXTREME draft phase =)


ppizarro · 2017-05-31T11:38:15Z

Why not "go fn()"? What if I have a function called go?


katcipis · 2017-05-31T13:41:36Z

What about a function called exit? The same thing happens as when you create any name that shadows a builtin name: the builtin is shadowed, because you did it explicitly in your code.

If we introduce a "go" keyword, how would you compile Go code in nash? =P

We could come up with some clever way to eliminate ambiguity, but I'm not sure if it is worth it.

i4ki · 2017-05-31T18:16:07Z

I really enjoyed the idea. But if I'm not mistaken, your code example will block forever because close is being called after the receiver for loop. Close must signal to receivers that no more data will come, so it must be called after the sender loop and before the reading loop (the other for).

Regarding buffered channels, maybe this could be a parameter of the channel() function.

s, r, c <= channel(10)

But I think it would be very handy to have a channel type (instead of backing functions).

Select semantics can be achieved without built-ins only if we add support for non-blocking reads/sends. We can give the channel function the power to create both blocking and non-blocking channels, or we can simply make the send/recv functions take an additional parameter indicating whether the send/recv must block.

For example:

fn getchan() {
    # the returned functions support a parameter indicating whether they must block
    s, r, c <= channel() 
    return [$s, $r, $c]
}

# selectRead will recv from a number of channels. If def is not passed then it will
# block until some receive succeeds, otherwise the def function is invoked and selectRead returns.
fn selectRead(alt, def...) {
    defFn = ""
    if len($def) > 0 {
        defFn <= $def[0]
    }
    val = ""
    # something like select
    # shuffle(alt) # optional
    for {
        for a in $alt {
            # false tells the recv to not block
            val, errcode <= $a["recv"](false) 
            if $errcode == 0 { # receive success
                return $a["fn"]($val)
            }
            if $errcode == 1 { # channel closed
                return false
            }
            # $errcode with any other value indicates
            # that it's not ready, no data...
            if $defFn != "" {
                return $defFn()
            }
            # continue the loop
        }
        sleep 0.001
    }
    unreachable
}

c1 <= getchan()
c2 <= getchan()

# writers
go(fn() { write($c1) })
go(fn() { write($c2) })

# print the results

# Alt structure contains the recv channel funcs and callbacks to be executed
# when data arrives.
alt = [{
    "recv": $c1[1],
    "fn": $print,
}, {
    "recv": $c2[1],
    "fn": $print,
}]

for {
    ok <= selectRead($alt)
    if !$ok {
        break
    }
}

print("Done")

The example expects our new syntax changes.
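
For comparison, the same polling idea can be sketched in Go. Everything named here (`tryRecv`, `alt`, `selectRead`, the status constants) is hypothetical and only mirrors the errcode convention of the example above. One deliberate difference: the default callback runs only after a full pass over all alternatives finds nothing ready.

```go
package main

import (
	"fmt"
	"time"
)

// Status codes mirroring the errcode convention of the nash example.
const (
	recvOK     = 0
	recvClosed = 1
	recvEmpty  = 2
)

// tryRecv is a non-blocking receive built on Go's select with a default case.
func tryRecv(ch <-chan string) (string, int) {
	select {
	case v, ok := <-ch:
		if !ok {
			return "", recvClosed
		}
		return v, recvOK
	default:
		return "", recvEmpty
	}
}

// alt pairs a channel with the callback to run when data arrives.
type alt struct {
	ch <-chan string
	fn func(string)
}

// selectRead polls the alternatives until one yields a value (returns true)
// or one is found closed (returns false). If def is non-nil, it is called
// instead of sleeping when a full pass finds nothing ready.
func selectRead(alts []alt, def func() bool) bool {
	for {
		for _, a := range alts {
			v, code := tryRecv(a.ch)
			switch code {
			case recvOK:
				a.fn(v)
				return true
			case recvClosed:
				return false
			}
		}
		if def != nil {
			return def()
		}
		time.Sleep(time.Millisecond)
	}
}

func main() {
	c1 := make(chan string)
	c2 := make(chan string)
	go func() { c1 <- "from c1"; close(c1) }()
	go func() { c2 <- "from c2" }()

	show := func(v string) { fmt.Println(v) }
	alts := []alt{{c1, show}, {c2, show}}
	for selectRead(alts, nil) {
		// keep reading until some channel is closed
	}
	fmt.Println("Done")
}
```

The loop stops as soon as the closed c1 is observed, so the value from c2 may or may not be printed; that ordering is up to the scheduler.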

That's much like the way select is implemented in Plan 9's C library. Take a look at the Alt structure and documentation:
http://man.cat-v.org/plan_9/2/thread

katcipis · 2017-05-31T18:23:57Z

> But if I'm not mistaken, your code example will block forever because close is being called after the receiver for loop

The example is terrible =P, but terrible as it is, it does not seem to be a case of blocking forever. For every send there is a matching receive call, N to N, so no one will get blocked, unless one of the senders panics/explodes before sending. The close call was added just to show it... it is not really necessary.

I'll take a look at the rest of the ideas soon =)

i4ki · 2017-05-31T18:32:23Z

Ah, got it. My bad. I thought so because of the close at the end.

i4ki · 2017-05-31T18:35:29Z

The example I made is terrible too; it is only meant to illustrate that select may not be required in the first version. Someone can use something like that if they really need it, but it's very ugly.

The Alt in Plan 9 is very, very hard to understand and use. It is a hack to circumvent missing features of C.

ppizarro · 2017-06-03T13:21:09Z

Then I think the problem is a different one. How can I distinguish a built-in name from a shell command?


i4ki · 2017-06-03T14:10:50Z

Command names and functions have different syntaxes.

exit   # parsed as a command with name 'exit'
exit() # parsed as a function call
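
As a rough illustration only, and not nash's actual parser, the distinction can be pictured as a one-token lookahead: an identifier immediately followed by `(` is a call, anything else is a command. The `classify` function and its pattern are assumptions made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// callPattern matches an identifier immediately followed by "(".
// This is a simplification for illustration, not nash's grammar.
var callPattern = regexp.MustCompile(`^\s*[A-Za-z_][A-Za-z0-9_]*\(`)

// classify reports whether a line looks like a function call or a command.
func classify(line string) string {
	if callPattern.MatchString(line) {
		return "function call"
	}
	return "command"
}

func main() {
	for _, line := range []string{"exit", "exit()", "go build", "go(fn() {})"} {
		fmt.Printf("%-12s => %s\n", line, classify(line))
	}
}
```

Under this rule a builtin function named go and the go command can coexist, since `go build` never has a parenthesis right after the name.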

katcipis · 2017-06-03T15:33:13Z

@ppizarro to implement your idea, the principle is the same as what we use today. The only problem is that we would have to look further ahead in the parser and do more aggressive backtracking, which makes the parser considerably more complex. But it is not impossible.

@tiago4orion can give more info or point out if I'm saying something stupid =)

ppizarro · 2017-06-03T22:17:59Z

Ok, we can distinguish functions from commands, but how can we distinguish commands from reserved words?

go func()
go build
go()
true
/bin/true (command)
for
/bin/for (command)


ppizarro · 2017-06-03T22:22:44Z

What do you think about this?

$(command .....)
$(go build)
go func()
go()


i4ki · 2017-06-04T11:47:58Z

@ppizarro

Keywords are emitted early by the lexer, but their meaning is deferred to the parser.
There are two rules for their meaning:

1- If the keyword appears in a command argument, it has no special meaning and is parsed into an unquoted ast.StringExpr.
2- If the keyword appears outside command arguments, it has its own syntax and cannot be used as an identifier for variables or functions.

Examples:

# keyword used in command argument
λ> echo for
for
λ> echo for fn rfork
for fn rfork

# keyword used outside arguments
λ> fn for(a) { }
ERROR: <stdin line 74>:1:3: Unexpected token for. Expected '('

What is an argument?
https://github.com/NeowayLabs/nash/blob/master/parser/parse.go#L1471

Parsers for keywords:
https://github.com/NeowayLabs/nash/blob/master/parser/parse.go#L40

This special syntax for command arguments makes every shell a non-context-free language.

About the syntax change you propose, I don't think it is good because it will make everyday use of the shell in interactive mode harder.

λ> $(ls /etc)
λ> $(go build)
...

We need to balance the syntax between CLI use and scripts. It will never be pleasant for both cases, unfortunately, because of the shell's nature (everything could be an argument value).

Let me know if this is useful, I can give more details.

katcipis · 2017-06-27T23:00:13Z

I was thinking about select as a function too. What about something like this (assuming maps exist):

receive, send, close <= channel()

select({
    "reader": receive,
    "call": fn() {
        data <= receive()
    },
},
{
    "writer": send,
    "call": fn() {
        $send("hi")
    },
},
{
    "default": fn() {
        echo "nothing ready"
    },
},
)

Here the send and receive functions of the channel are used as handles that select can inspect to identify the channel + direction.
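
Go already has a function-shaped select in its standard library, `reflect.Select`, which takes a list of cases that each name a channel and a direction. The sketch below uses it to show the same three kinds of case; the `selectOnce` wrapper is a hypothetical name for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// selectOnce runs one select over three cases: receive from ch (index 0),
// send "hi" to ch (index 1), and default (index 2). It returns the index
// of the chosen case and the received value, if any.
func selectOnce(ch chan string) (int, string) {
	cases := []reflect.SelectCase{
		{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(ch)},
		{Dir: reflect.SelectSend, Chan: reflect.ValueOf(ch), Send: reflect.ValueOf("hi")},
		{Dir: reflect.SelectDefault},
	}
	chosen, recv, ok := reflect.Select(cases)
	if chosen == 0 && ok {
		return chosen, recv.String()
	}
	return chosen, ""
}

func main() {
	ch := make(chan string, 1) // buffered, so the send case can be ready

	i, _ := selectOnce(ch) // buffer empty: only the send can proceed
	fmt.Println("chosen case:", i)

	i, v := selectOnce(ch) // buffer full: only the receive can proceed
	fmt.Println("chosen case:", i, "value:", v)

	i, _ = selectOnce(make(chan string)) // nothing ready: default
	fmt.Println("chosen case:", i)
}
```

The default case is chosen only when no other case can proceed, which is the behavior the "default" entry in the proposal appears to aim for.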

katcipis · 2017-10-16T19:33:08Z

I was thinking about what @tiago4orion said about it being a little hard to implement concurrency, since nothing is ready for all the races that are going to happen. I added this to my frustration of working with languages that even have the possibility of races, because they have the possibility of shared state. I'm aware this is a tradeoff for efficiency in some cases, but I think this is not the case for nash (we don't need to be stupidly slow, but I don't see it as a systems programming language =P).

Having that in mind this was my first draft (the idea is very new and may have obvious holes in it, that is why I'm sharing) of a different approach to concurrency.

We keep the idea of using functions, but what if, when you call the go function, the function passed to it runs completely isolated from anything else? It is a very lightweight concurrent unit but completely isolated from everything else (like Erlang processes).

But unlike Erlang we could still use CSP. For this to make sense, every call to go would have to create a channel to communicate with the concurrent function being executed. I imagined something like this:

chan <= go(fn(chan) {
    # receive stuff
    a <= receive($chan)
})
# send stuff
send($chan, "lala")

# waiting for the goroutine to end means waiting for the channel to be closed
wait($chan)

The wait function could accept varargs, making it very easy to do simple fan-ins.
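
A small Go sketch of this shape, under the assumption that the channel is closed automatically when the function returns. `spawn` and `wait` are hypothetical names standing in for the proposed go() and wait().

```go
package main

import "fmt"

// spawn is a hypothetical stand-in for a go() that returns a channel.
// The channel is closed when fn returns, which is what wait relies on.
func spawn(fn func(ch chan string)) chan string {
	ch := make(chan string)
	go func() {
		defer close(ch) // closing signals that fn has returned
		fn(ch)
	}()
	return ch
}

// wait blocks until every given channel has been closed.
func wait(chans ...chan string) {
	for _, ch := range chans {
		for range ch {
			// discard anything still in flight
		}
	}
}

func main() {
	worker := func(ch chan string) {
		msg := <-ch // receive stuff
		fmt.Println("got", msg)
	}
	a := spawn(worker)
	b := spawn(worker)

	// send stuff
	a <- "lala"
	b <- "lele"

	wait(a, b) // simple fan-in: returns once both workers have ended
	fmt.Println("all done")
}
```

In Go a second close on the same channel panics, so a caller that closes the channel manually would break this sketch; a real design would have to decide what that case means.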

There are some holes... like what happens if you manually close the channel... not sure if using the closed channel to mean that the goroutine has ended is a good idea... I just wanted an easy way to do that, because the main use case that we have in infrastructure building is basically running a lot of crap concurrently and waiting for it to end.

Or we could see the channel closed semantics as a way of knowing that the job is done... not that you can be SURE that the concurrent execution actually ended. But it would be useful to implement logic that guarantees a close if the function ended.

Anyway... there will be some loose ends... does the idea seem interesting enough to be pursued?

i4ki · 2017-10-20T23:27:19Z

I like the idea, but I'll stretch it a little bit =) Why coroutines, then? If there is no sharing, and performance isn't a requirement, then why design concurrency with a thread model (sharing)? We could use processes and message passing over some protocol (rpc, unix, tcp, shm, etc). Rfork in nash currently does exactly that but without concurrency (using unix sockets as message transport):

A = "some value"

rfork u {
    echo $A  # error, A isn't defined
}

That is because with rfork we start a new process that sets up the right Linux namespaces.
If we are going down this path (non-sharing threads), then I think it is a good time to unify nash execution models and have one way of getting concurrent units.

Changing rfork syntax and semantics to be concurrent (maybe also the name) and adding a way to communicate by channels (over a reliable transport) is the most effective and simple approach in my opinion.
What do you think?

katcipis · 2017-10-27T00:56:27Z

I think this is a GREAT idea =), especially because we are not that concerned with performance (the only loss compared to the Go/Erlang model). No sharing will be enforced by the OS this way =D.

Let's start a design doc (like vars) with some of the syntax that we already discussed for rfork =)

katcipis · 2017-11-10T20:57:57Z

Design doc started: https://github.com/NeowayLabs/nash/pull/247

i4ki added the enhancement label May 31, 2017
