CS GO crash every time it exits #166

Open
AnAkkk opened this issue Apr 18, 2015 · 37 comments

@AnAkkk

AnAkkk commented Apr 18, 2015

Since CS GO was ported to Linux, I've been running it through primusrun and have always had a very annoying issue: it crashes every time I quit the game, which produces a core dump and freezes my laptop for up to 30s every time the game exits.
I can't disable core dumps because I need them to debug crashes in applications I develop.

The backtrace (which doesn't have symbols) is the following:

#0 0xeac140f6 in ?? ()
#1 <signal handler called>
#2 0xf7409546 in pthread_mutex_lock () from /usr/lib32/libpthread.so.0
#3 0xf7180419 in ?? () from /usr/lib32/nvidia/libGL.so.1
#4 0xf715ad0d in ?? () from /usr/lib32/nvidia/libGL.so.1
#5 0xf715b4cc in ?? () from /usr/lib32/nvidia/libGL.so.1
#6 0xf7154f68 in ?? () from /usr/lib32/nvidia/libGL.so.1
#7 0xf7149e24 in glXDestroyPbuffer () from /usr/lib32/nvidia/libGL.so.1
#8 0xf7532b04 in ?? () from /usr/lib32/primus/libGL.so.1
#9 0xf7547c6a in ?? () from /usr/lib32/primus/libGL.so.1
#10 0xf7547c5c in ?? () from /usr/lib32/primus/libGL.so.1
#11 0xf7547c5c in ?? () from /usr/lib32/primus/libGL.so.1
#12 0xf7547d1d in ?? () from /usr/lib32/primus/libGL.so.1
#13 0xf758fc4c in __cxa_finalize () from /usr/lib32/libc.so.6
#14 0xf752cc13 in ?? () from /usr/lib32/primus/libGL.so.1
#15 0xf77af294 in _dl_fini () from /lib/ld-linux.so.2
#16 0xf758f8c3 in __run_exit_handlers () from /usr/lib32/libc.so.6
#17 0xf758f921 in exit () from /usr/lib32/libc.so.6
#18 0xf757965a in __libc_start_main () from /usr/lib32/libc.so.6
#19 0x08048645 in _start ()

@amonakov
Owner

Please also show the console output if there's anything apart from "Segmentation fault. Core dumped."

What is your distribution and libc version?

Please run the game with primusrun env LD_DEBUG=libs LD_DEBUG_OUTPUT=/tmp/ld-debug.txt and provide the resulting /tmp/ld-debug.txt file.

@AnAkkk
Author

AnAkkk commented Apr 19, 2015

There's nothing other than "Segmentation fault".

I'm on ArchLinux 64bit with libc 2.21.

It created 4 /tmp/ld-debug.txt.XXX files with different PIDs. Do you have an email where I can send them?

@amonakov
Owner

My email is username@gmail.com, or you can put them into a GitHub gist.

@AnAkkk
Author

AnAkkk commented Apr 19, 2015

I've sent them.
I've also recompiled lib32-primus with debug symbols; this backtrace might be more helpful:

#0 0xeab620f6 in ?? ()
#1 <signal handler called>
#2 0xf7353546 in pthread_mutex_lock () from /usr/lib32/libpthread.so.0
#3 0xf70ca419 in ?? () from /usr/lib32/nvidia/libGL.so.1
#4 0xf70a4d0d in ?? () from /usr/lib32/nvidia/libGL.so.1
#5 0xf70a54cc in ?? () from /usr/lib32/nvidia/libGL.so.1
#6 0xf709ef68 in ?? () from /usr/lib32/nvidia/libGL.so.1
#7 0xf7093e24 in glXDestroyPbuffer () from /usr/lib32/nvidia/libGL.so.1
#8 0xf747cb34 in DrawableInfo::~DrawableInfo() () from /usr/lib32/primus/libGL.so.1
#9 0xf7491c8a in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) ()
from /usr/lib32/primus/libGL.so.1
#10 0xf7491c7c in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) ()
from /usr/lib32/primus/libGL.so.1
#11 0xf7491c7c in std::_Rb_tree<unsigned long, std::pair<unsigned long const, DrawableInfo>, std::_Select1st<std::pair<unsigned long const, DrawableInfo> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, DrawableInfo> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, DrawableInfo> >*) ()
from /usr/lib32/primus/libGL.so.1
#12 0xf7491d3d in PrimusInfo::~PrimusInfo() () from /usr/lib32/primus/libGL.so.1
#13 0xf74d9c4c in __cxa_finalize () from /usr/lib32/libc.so.6
#14 0xf7476c13 in __do_global_dtors_aux () from /usr/lib32/primus/libGL.so.1
#15 0xf76f9294 in _dl_fini () from /lib/ld-linux.so.2
#16 0xf74d98c3 in __run_exit_handlers () from /usr/lib32/libc.so.6
#17 0xf74d9921 in exit () from /usr/lib32/libc.so.6
#18 0xf74c365a in __libc_start_main () from /usr/lib32/libc.so.6
#19 0x08048645 in _start ()

@amonakov
Owner

From the (privately sent) log it looks like glibc decides to run both the nVidia and Mesa libGL destructors before primus'. I suspect that is not supposed to happen. I'll check whether anything goes wrong in glibc.

@AnAkkk
Author

AnAkkk commented May 11, 2015

Have you had time to look?
I use primusrun with Dota 2 and TF2 too and I don't have the same issue there; it seems to happen only in CS GO.

@AnAkkk
Author

AnAkkk commented Jun 12, 2015

ping
Anything new here? :)

@tpruzina

possibly related:

libGL DSO finalizer and pthreads

When a multithreaded OpenGL application exits, it is possible for libGL's DSO finalizer 
(also known as the destructor, or "_fini") to be called while other threads 
are executing OpenGL code. The finalizer needs to free resources allocated by libGL. 
This can cause problems for threads that are still using these resources. 
Setting the environment variable "__GL_NO_DSO_FINALIZER" to "1" will work around 
this problem by forcing libGL's finalizer to leave its resources in place. 
These resources will still be reclaimed by the operating system when the process exits.
Note that the finalizer is also executed as part of dlclose(3), 
so if you have an application that dlopens(3) and dlcloses(3) libGL repeatedly, 
"__GL_NO_DSO_FINALIZER" will cause libGL to leak resources until the process exits. 
Using this option can improve stability in some multithreaded applications, 
including Java3D applications.

http://us.download.nvidia.com/XFree86/Linux-x86_64/352.09/README/knownissues.html

@AnAkkk
Author

AnAkkk commented Jun 14, 2015

Many thanks, this seems to fix the issue. Should this be included in primus by default?

@presianbg

Same here. Is there any elegant way to solve this?

@amonakov
Owner

__GL_NO_DSO_FINALIZER simply hides the issue, so it's not appropriate to use it.

I now understand the issue: primus cannot expect to be able to invoke functions from a shared library it dlopen'ed from within its own destructors. Since nVidia's constructors run after primus' (due to dlopen), it's actually natural that nVidia's destructors run before primus' (i.e. in reverse order of the constructors). Even though primus still holds a handle to nVidia's dlopen'ed libGL, that handle doesn't "count" when destructors are run at exit.
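
The ordering described above can be reproduced in miniature. The following is only a sketch, not primus code: it models the two libraries' finalizers with plain atexit() handlers (which also run in reverse order of registration) instead of real DSO destructors, and every name in it (exit_order_trace, wrapper_fini, driver_fini, early_cleanup) is hypothetical. A forked child registers the handlers and exits; the parent reads back the order in which they ran.

```cpp
// Sketch only: plain atexit() handlers stand in for library finalizers.
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;         // write end of the pipe back to the parent
static bool driver_alive = false;  // models "the dlopen'ed libGL is still usable"

static void report(const char *msg) {
  ssize_t n = write(report_fd, msg, strlen(msg));
  (void)n;  // best effort; this is only a demo trace
}

// Models the dlopen'ed driver's finalizer.
static void driver_fini() { driver_alive = false; report("driver finalized;"); }

// Models the wrapper's static destructor, which wants to call into the driver.
static void wrapper_fini() {
  report(driver_alive ? "wrapper: driver usable;" : "wrapper: driver already gone;");
}

// Models a cleanup handler registered late, e.g. on first use.
static void early_cleanup() {
  report(driver_alive ? "cleanup: driver usable;" : "cleanup: driver already gone;");
}

// Runs the scenario in a child process and returns the trace it produced.
std::string exit_order_trace(bool register_early_cleanup) {
  int fds[2];
  if (pipe(fds) != 0) return "";
  fflush(nullptr);  // avoid duplicating buffered stdio output in the child
  pid_t pid = fork();
  if (pid < 0) { close(fds[0]); close(fds[1]); return ""; }
  if (pid == 0) {
    close(fds[0]);
    report_fd = fds[1];
    atexit(wrapper_fini);  // wrapper is set up first...
    driver_alive = true;
    atexit(driver_fini);   // ...the driver later, so it is torn down earlier
    if (register_early_cleanup) atexit(early_cleanup);  // registered last, runs first
    exit(0);               // runs the handlers in reverse order of registration
  }
  close(fds[1]);
  std::string out;
  char buf[256];
  ssize_t n;
  while ((n = read(fds[0], buf, sizeof buf)) > 0) out.append(buf, n);
  close(fds[0]);
  waitpid(pid, nullptr, 0);
  return out;
}

void demo() {
  // Without the late handler, the wrapper only gets to run after the driver is gone.
  assert(exit_order_trace(false) == "driver finalized;wrapper: driver already gone;");
}
```

Calling exit_order_trace(true) adds the late-registered handler, which runs while the modelled driver is still usable. Real finalizer ordering between shared objects is decided by the dynamic loader, so this only illustrates the reverse-order principle, not the loader's exact behaviour.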

Can you please test the following patch, ideally on multiple games, not just CS:GO? Sorry for taking so long, and thanks for your patience.

diff --git a/libglfork.cpp b/libglfork.cpp
index 03f514f..bb42f0d 100644
--- a/libglfork.cpp
+++ b/libglfork.cpp
@@ -259,6 +259,22 @@ static struct PrimusInfo {
   }
 } primus;

+static void cleanup()
+{
+  primus.drawables.clear();
+}
+
+static void register_cleanup_1()
+{
+  atexit(cleanup);
+}
+
+static void register_cleanup()
+{
+  static pthread_once_t once = PTHREAD_ONCE_INIT;
+  pthread_once(&once, register_cleanup_1);
+}
+
 // Thread-specific data
 static __thread struct {
   Display *dpy;
@@ -622,11 +638,6 @@ GLXContext glXCreateContextAttribsARB(Display *dpy, GLXFBConfig config, GLXConte
 void glXDestroyContext(Display *dpy, GLXContext ctx)
 {
   primus.contexts.erase(ctx);
-  // kludge: reap background tasks when deleting the last context
-  // otherwise something will deadlock during unloading the library
-  if (primus.contexts.empty())
-    for (DrawablesInfo::iterator i = primus.drawables.begin(); i != primus.drawables.end(); i++)
-      i->second.reap_workers();
   primus.afns.glXDestroyContext(primus.adpy, ctx);
 }

@@ -720,6 +731,7 @@ void glXSwapBuffers(Display *dpy, GLXDrawable drawable)
     di.actx = ctx;
     di.d.spawn_worker(drawable, display_work);
     di.r.spawn_worker(drawable, readback_work);
+    register_cleanup();
   }
   // Readback thread needs a sync object to avoid reading an incomplete frame
   di.sync = primus.afns.glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
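
For readers who want to try the registration pattern from the patch in isolation, here is a minimal standalone sketch. It assumes a plain std::map as a stand-in for primus.drawables, and the names on_first_swap and registrations are hypothetical additions for illustration; only the cleanup / register_cleanup_1 / register_cleanup structure mirrors the diff above.

```cpp
// Sketch only: the once-only atexit() registration used by the patch,
// with a plain std::map standing in for primus.drawables.
#include <cassert>
#include <cstdlib>
#include <map>
#include <pthread.h>

std::map<unsigned long, int> drawables;  // hypothetical stand-in for primus.drawables
int registrations = 0;                   // hypothetical counter, for demonstration only

// Runs from exit() before static destructors of objects that were
// constructed before this handler was registered.
void cleanup() {
  drawables.clear();
}

void register_cleanup_1() {
  ++registrations;
  atexit(cleanup);
}

// Safe to call from any thread, any number of times:
// pthread_once guarantees register_cleanup_1 runs exactly once.
void register_cleanup() {
  static pthread_once_t once = PTHREAD_ONCE_INIT;
  pthread_once(&once, register_cleanup_1);
}

// Hypothetical caller: models the "first swap for a new drawable" path.
void on_first_swap(unsigned long drawable) {
  drawables[drawable] = 1;
  register_cleanup();
}

void demo() {
  on_first_swap(1);
  on_first_swap(2);
  assert(registrations == 1);  // handler registered only once
}
```

Building with g++ -pthread is the safe choice. Because the handler is registered only after the map exists, it is expected to run before the map's own destructor at exit, which is the property the patch appears to rely on.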

@presianbg

@amonakov
Hi,

Thank you for the work you have done. Unfortunately I can't build the patched version of primus under Fedora 22. Can you guide me through it?

Best Regards,
Presian

@AnAkkk
Author

AnAkkk commented Jun 17, 2015

This patch works fine on CS GO, CS 1.6 and TF2.

@presianbg

@anakin1

Hi, any tips on how to build the patched primus on Fedora 22?

@AnAkkk
Author

AnAkkk commented Jun 17, 2015

No, I'm on Arch Linux.

@presianbg

@anakin1

So it should be basically the same. Just point me to the steps you followed to patch it.

Thank you in advance.

@AnAkkk
Author

AnAkkk commented Jun 17, 2015

Well, not really: Arch Linux uses PKGBUILD. I don't know Fedora's build system, but you need to use the patch command anyway:

patch -p1 -i "file"

@presianbg

Yep, I'm familiar with the patching process. It's the patched version that I can't build. But no problem, they should roll it out sooner or later...

Cheers ;)

@gsgatlin

For a patched version for Fedora 22, see:

http://people.engr.ncsu.edu/gsgatlin/primus-1.1.03282015-2.fc22.src.rpm
and
http://people.engr.ncsu.edu/gsgatlin/primus-1.1.03282015-2.fc22.x86_64.rpm

Link to the specfile showing the added patch:

http://fpaste.org/233006/42994143/

This package was built using "mock" on Fedora 21 for Fedora 22.

Hope that helps out your testing of this problem.

@presianbg

@gsgatlin
Thank you :)

@amonakov
With the patched version above, CS:GO is still producing core dumps:
root 3092 71.0 0.1 130084 20488 ? R 15:29 0:00 /usr/lib/systemd/systemd-coredump 2812 1000 1000 11 1434544161 csgo_linux

rpm -qa | grep primus
primus-1.0.07112014-1.fc22.i686
primus-1.1.03282015-2.fc22.x86_64

@amonakov
Owner

If CS:GO is a 32-bit executable, you need the new i686.rpm as well.

@gsgatlin

@presianbg

@amonakov and @gsgatlin

You guys are awesome! It's working 👍 To be honest, I was convinced that Steam games are 64-bit executables.

And again... Big thanks!

@gsgatlin

Just to follow up, I tested the patched primus rpm with Minecraft, FEZ, Cogs, and dolphin-emu and did not see any problems.

@AnAkkk
Author

AnAkkk commented Jun 25, 2015

@amonakov : can you please merge the fix?

@karolherbst
Contributor

Did anybody test it with other games so far? We wouldn't want this fix to break others.

@presianbg

Tested on Plague:Inc, CS:GO, Dota 2 Source 1/2. Everything looks fine to me.

@AnAkkk
Author

AnAkkk commented Nov 15, 2015

Anything new about this?

@AnAkkk
Author

AnAkkk commented Dec 9, 2015

@amonakov ?

@ArchangeGabriel

@amonakov Do you need more testing (I can try SuperTux kart on my setup for instance) or could this be merged?

@AnAkkk
Author

AnAkkk commented Jan 21, 2016

It's been more than 6 months now...and there have been no commits since March 2015, it doesn't look like this project is still maintained :/

@gsgatlin

I added this patch into the fedora rpm package back in July (I think) and it did not seem to cause any problems that I know about.

@karolherbst
Contributor

yeah, no issues on my side either.

@tpruzina

I guess it's time for somebody like @karolherbst to fork it and become the new de facto maintainer.
Or some competent package maintainer from one of the distros.

@karolherbst
Contributor

Nah, I would rather spend time improving nouveau, because PRIME offloading is the superior solution anyway.

@tpruzina

@karolherbst guessed as much by looking at your nouveau commits and mailing list jitter.

@ArchangeGabriel

Yes, but in the meantime, it’s nice to have a temporary working solution. But I agree @karolherbst is doing a great job on reclocking, plus OpenGL going well in mesa lately, I might be able to drop the proprietary driver soon.

@amonakov I’ve seen from your GitHub page that you’re still around. ;) Could you consider merging this and cleaning up the issue tracker a bit? Or, in the event you’re not interested in it anymore, which I can understand, could you consider transferring the repo to the Bumblebee Project organization? Thanks a lot!
