Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix some issue that some var use without lock #3

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

iamyh
Copy link

@iamyh iamyh commented May 3, 2018

解决了以下的问题:

  1. 一些变量读写并没有加锁保护
  2. 一些地方出现error,应该return而没有return
  3. 建议用同一的方法打日志,而不是fmt.Printf和自定义方法MyPrint混用(可能还存在其他地方)

一些建议:

  1. node那个struct由于有好几个属性变量,但是node只有一个lock,是否可以按照成员属性变量对于有一个锁,进而提高效率?
  2. 建议对属性变量的修改,涉及到锁保护的,都封装为函数,尽可能很简短的函数。函数里面统一加上mu.lock()和defer mu.Unlock(),否则非常容易出现死锁,尤其在node.go那个文件,里面有些函数比较长的情况。

@arcolife
Copy link
Collaborator

arcolife commented May 4, 2018

let me take a look. Could you please use English for future communication? thanks.

@iamyh
Copy link
Author

iamyh commented May 5, 2018

ok.

@rnx211 rnx211 force-pushed the master branch 4 times, most recently from 5679ff7 to 40c89ec Compare May 15, 2018 12:02
@arcolife
Copy link
Collaborator

arcolife commented May 15, 2018

oh just checked, once in about 5 times, for i in {1..5}; do go build engine.go && ./engine ; done ..this would still error out with:

...
<snip>
...

goroutine 1535 [semacquire]:
sync.runtime_SemacquireMutex(0xc42021447c, 0xc420374500)
        /usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc420214478)
        /usr/local/go/src/sync/mutex.go:134 +0x108
_/workspace/truechain-consensus-core/pbft-core.(*Node).incCommDict(0xc420214400, 0xc420899380, 0x80)
        /workspace/truechain-consensus-core/pbft-core/node.go:690 +0x3a
_/workspace/truechain-consensus-core/pbft-core.(*Node).processCommit(0xc420214400, 0x1, 0x1b, 0x0, 0x2, 0xc420899380, 0x80, 0xffffffffffffffff, 0xc420899400, 0x80, ...)
        /workspace/truechain-consensus-core/pbft-core/node.go:838 +0xa3
_/workspace/truechain-consensus-core/pbft-core.(*Node).ProxyProcessCommit(0xc420214400, 0x1, 0x1b, 0x0, 0x2, 0xc420899380, 0x80, 0xffffffffffffffff, 0xc420899400, 0x80, ...)
        /workspace/truechain-consensus-core/pbft-core/node.go:456 +0x156
reflect.Value.call(0xc42012cc60, 0xc420248010, 0x13, 0x138a9ed, 0x4, 0xc420239f18, 0x3, 0x3, 0xc4202f3e00, 0x0, ...)
        /usr/local/go/src/reflect/value.go:447 +0x969
reflect.Value.Call(0xc42012cc60, 0xc420248010, 0x13, 0xc42077af18, 0x3, 0x3, 0x0, 0x0, 0x0)
        /usr/local/go/src/reflect/value.go:308 +0xa4
net/rpc.(*service).call(0xc420242040, 0xc420244000, 0xc4200182e0, 0xc4200182f0, 0xc42024a100, 0xc4202b1b60, 0x1334280, 0xc4208803c0, 0x199, 0x12f6400, ...)
        /usr/local/go/src/net/rpc/server.go:384 +0x14e
created by net/rpc.(*Server).ServeCodec
        /usr/local/go/src/net/rpc/server.go:480 +0x43a

i'll take a stab at it!
https://github.com/truechain/truechain-consensus-core/blob/devel/pbft-core/node.go#L504-L507

@iamyh
Copy link
Author

iamyh commented May 18, 2018

which branch is it? i can not match the code.In my env,i just run go build engine.go && ./engine, it is ok.

@arcolife
Copy link
Collaborator

@iamyh yea but did you run for i in {1..5}; do go build engine.go && ./engine ; done and see if it fails even once? I've checked out your PR locally and ran this.

@arcolife
Copy link
Collaborator

arcolife commented May 24, 2018

included in 50cfb07 while merging devel.

Please, let's rebase and continue the discusison

@iamyh
Copy link
Author

iamyh commented May 24, 2018

@arcolife hey,when i run go build engine.go and for i in {1..5}; do ./engine>> log 2>&1 ; with master branch,it will also come out 'goroutine stack error'.

when i search log file with keyword goroutine, it show 'concurrent map writes'. it means there is so much bug in node.go,such as

func (nd *Node) newClientRequest(req Request, clientId int) { // TODO: change to single arg and single reply
	if v, ok := nd.active[req.Dig]; ok {

should be locked like:

var v ActiveItem
var ok bool
nd.mu.Lock()
v, ok = nd.active[req.Dig]
nd.mu.Unlock()

there are so many code without lock.please check it carefully.because node only has one lock mu,it will result in performance down,that is why i advise use muti lock and readwrite lock。

@hixichen
Copy link
Contributor

hixichen commented Jun 3, 2018

@iamyh , hi , could you please try with sync.Map to replace the nd.active map.
https://golang.org/pkg/sync.

please make sure your golang version is above 1.9.

@iamyh
Copy link
Author

iamyh commented Jun 3, 2018

@hixichen my go version is go version go1.9.2 darwin/amd64

@arcolife
Copy link
Collaborator

arcolife commented Jun 5, 2018

@iamyh Hi could you please join the invite from https://gitter.im/truechain-net/engg-foss-global
thanks

@arcolife arcolife force-pushed the master branch 3 times, most recently from fe17262 to 238aef5 Compare July 10, 2018 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants