
Does not relay proxify after a while #99

Open
reith opened this issue Mar 4, 2017 · 17 comments

@reith

reith commented Mar 4, 2017

The epoll_wait call stops returning after a while. It's probably something to do with libevent. My setup runs a redsocks instance from an upstart script, configured as follows:

base {
    log_debug = on;
    log_info = on;
    log = stderr;
    daemon = off;
    redirector = iptables;
}

redsocks {
    type = socks5;
    local_ip = 0.0.0.0;
    local_port = 9053;
    ip = 192.168.146.95;
    port = 14695;
    login = "foo";
    password = "bar";
}

This is the last stack trace for the blocking call:

ep_poll+0x24e/0x310
SyS_epoll_wait+0xb4/0xd0
entry_SYSCALL_64_fastpath+0x16/0x75

Attached are the last 300 lines of the log. As you can see, the instance had not been responding for a long time. The redsocks instance is built from ce85086, so it's up to date apart from documentation changes. The kernel is Ubuntu's 4.2.0-34-generic and libevent is installed from Ubuntu's libevent-dev=2.0.21-stable-1ubuntu1 package.

redsocks-last-300.txt
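(For context, with redirector = iptables and local_port = 9053 as in the config above, traffic usually reaches redsocks through a NAT REDIRECT rule set roughly like the sketch below; the chain name and exclusions are illustrative and not taken from this report:

# don't redirect loopback traffic or traffic going to the SOCKS server itself
iptables -t nat -N REDSOCKS
iptables -t nat -A REDSOCKS -d 127.0.0.0/8 -j RETURN
iptables -t nat -A REDSOCKS -d 192.168.146.95 -j RETURN
# everything else goes to the local redsocks listener
iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 9053
iptables -t nat -A OUTPUT -p tcp -j REDSOCKS

This matters for the later discussion of REDIRECT vs. TPROXY in this thread.)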

@aidansteele

I have a similar issue on Amazon Linux (RHEL-derived) that pops up anywhere from 1-10 days after launching redsocks. Please let me know what you need from me in terms of debugging info and I'd be very happy to provide it.

@semigodking

Would you mind giving https://github.com/semigodking/redsocks a try? I would like to take this opportunity to see if my fork has the same issue. Thank you!

@darkk
Owner

darkk commented Mar 7, 2017

Maybe that's some bug in the code that is triggered under connection pressure...
Try increasing ulimit -n (the open-file limit) for the process anyway; the current limit is not enough to handle load spikes. redsocks_conn_max is deduced from ulimit -n, so I see no reason to tune it explicitly.

Also, run kill -USR1 $(pidof redsocks) the next time it gets stuck; it will dump some information to the log. It would be interesting to correlate that with netstat -tan and ss -tein output.
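(Since the reporter's redsocks is started from an upstart job, a minimal sketch of raising the limit, assuming the job file lives at /etc/init/redsocks.conf, would be:

# check the limit the running process actually got
grep 'Max open files' /proc/$(pidof redsocks)/limits

# in /etc/init/redsocks.conf, raise the soft/hard nofile limits before the
# process starts; 65536 is an arbitrary example value
limit nofile 65536 65536

The systemd equivalent would be LimitNOFILE= in the unit file.)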

@aidansteele

@darkk Thanks for such a super-fast response - it's much appreciated. Also your work on redsocks in general is very much appreciated. I'll try all those commands next time it gets stuck and get back to you ASAP. Hopefully in a few days 👍

@semigodking Looks interesting, I'll also have a play around with that.

@reith
Author

reith commented Mar 7, 2017 via email

@aidansteele

@darkk I've got the output of those commands as well as stderr from 17 hours of runtime before failure in this gist: https://gist.github.com/aidansteele/3d3f0ffb1126b8644df0eafa7c7fd285

Let me know what else you need!

@reith
Author

reith commented Mar 8, 2017

Since the instances restarted 11 hours ago, I now see that all five instances with a max fd limit of 1024 are stuck, but the one with a limit of 4096 is up and running.

@reith
Author

reith commented Mar 8, 2017

Also, I see that for the instance with a max fd limit of 4096, the output of lsof -n -p $PID -a -iTCP:9055 is the same as netstat -tan | grep 9055 and roughly the same as the redsocks dump on the USR1 signal. So there is no dead connection whose fd is still held open by redsocks.

@reith
Author

reith commented Mar 8, 2017

Another thing I noticed: for instances with a max fd limit of 1024, redsocks logs conn_max as 128, and for the instance with a limit of 4096 it's 512. If conn_max is the maximum number of clients, why isn't it the max fd limit divided by 2 (one fd for the client-redsocks socket and one for the redsocks-socks_server socket)?

My previous netstat logs showed that I had more than 128 client connections at some points, and file descriptors were not being properly closed.

Attached is the log of a failed instance. At 1488953244.393913 I killed redsocks and another one was spawned by upstart.

redsocks-9050.txt

@darkk
Owner

darkk commented Mar 8, 2017

@reith I see port 9050 and I assume that you're using tor to handle the traffic. If that's true, have you considered using the TransPort feature of the tor daemon? It may be more convenient.

conn_max is calculated here; the rule of thumb is that redsocks needs to reserve some FDs for logs / DNS / signal handling and other library needs, and needs six file descriptors per connection for splice() to work in the worst case. Using splice() reduces the number of memory-to-memory copies and cuts CPU load by a factor of ~30%, which matters on embedded devices. The number of actually used sockets may be lower, but the worst-case scenario is still 6 file descriptors per connection: one for the client, one to the server, and two for each direction of the pipe.

I'll comment the logs a bit later.
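(As a rough cross-check, not the exact formula from the source: the numbers reported in this thread are consistent with dividing the fd limit by eight:

# 1024 / 8 = 128 and 4096 / 8 = 512, matching the conn_max values in reith's logs
echo $(( $(ulimit -n) / 8 ))

That is roughly six fds per connection in the worst case plus headroom for the reserved descriptors mentioned above.)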

@aidansteele

Seems my problem was file descriptor exhaustion. The default limit on Amazon Linux is a laughably low 1024, so redsocks calculates a conn_max of 128 - which we comfortably exceeded during spikes. I've since bumped the fd limit up to 80,000 and added some metrics so I can keep an eye on it. So far connectivity has been rock solid, but the open fd count almost always keeps growing (three lines for three EC2 instances):

[fdcount graph: open fd count over time for the three EC2 instances]

Is this expected behaviour? I calculate the open fds from ls /proc/$(pidof redsocks)/fd | wc -l. It does occasionally decrease, which confuses me even more.
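(One way to tell a genuine leak from long-lived connections is to compare the fd count with the number of TCP sockets the kernel still considers live; port 9053 below is just the example port from the config earlier in the thread:

# descriptors held by the redsocks process
ls /proc/$(pidof redsocks)/fd | wc -l
# TCP sockets on the redirect port as the kernel sees them
ss -tan | grep -c ':9053'

A gap between the two that keeps growing points at leaked descriptors rather than slow clients.)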

@darkk
Owner

darkk commented Mar 10, 2017

@aidansteele The occasional decrease may be caused by SO_KEEPALIVE kicking in; the default keepalive timeouts are quite high, something like 2 hours.
Anyway, it's strange to see that the number of connections almost never decreases; that looks like a socket leak for sure. Is it the latest redsocks version as well? May I have SSH access to one of those server boxes to try to understand the issue better? You can get my key at https://github.com/darkk.keys and write a PGP-encrypted email with ip/port/username to 6691DE6B4CCDC1C176A00D4AE1F2A9807F50FAB2 :)
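(For reference, the keepalive defaults mentioned here can be checked with sysctl:

sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl net.ipv4.tcp_keepalive_probes

On stock Linux these are usually 7200 / 75 / 9, i.e. the first probe goes out only after 2 hours of idle time.)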

@reith
Author

reith commented Mar 10, 2017 via email

@reith
Author

reith commented Mar 10, 2017 via email

@agusdallalba

agusdallalba commented Mar 11, 2017

I found the bug. There's a situation where the connection pressure is relieved but redsocks doesn't resume listening for new connections. See #100 :)

Edit: This of course doesn't solve the socket leak problem.

@wolfwander

wolfwander commented Apr 10, 2017

Darkk

I've been using redsocks for a long while and found a problem related to the iptables conntrack module.
What I see on my servers is that the conntrack module considers a connection CLOSED once a FIN followed by a FIN ACK packet has been seen; any packet after that is considered INVALID.
This seems like a conntrack bug to me, because a connection should only be considered closed when the FIN and FIN ACK packets are followed by another FIN ACK and ACK, or by a RST packet.

So, in my iptables logs, I have a bunch of entries for FIN ACK and RST packets being dropped because conntrack considered them INVALID. As those packets never reach redsocks, it can't close the connection, which remains in CLOSE_WAIT state.

As redsocks relies on the REDIRECT target in the iptables nat table, there is no workaround we can do in iptables to force those packets to be redirected to redsocks.

I think the IP_TRANSPARENT socket option and the TPROXY target in the iptables mangle table would be worth trying to avoid that conntrack issue. I know that Squid has an option to use that instead of nat REDIRECTion.
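(For the record, the usual TPROXY wiring looks roughly like the sketch below; the mark value and port 9053 are illustrative, and redsocks itself would need to open its listener with IP_TRANSPARENT for this to work:

# route marked packets to the local machine
ip rule add fwmark 0x1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

# packets that already belong to a local proxy socket keep the mark
iptables -t mangle -N DIVERT
iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
iptables -t mangle -A DIVERT -j MARK --set-mark 0x1
iptables -t mangle -A DIVERT -j ACCEPT

# steer new TCP flows to the transparent proxy port
iptables -t mangle -A PREROUTING -p tcp -j TPROXY --tproxy-mark 0x1/0x1 --on-port 9053

Unlike REDIRECT, TPROXY does not rewrite the destination address via NAT.)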

Another option is to mimic Apache's behavior of forking processes instead of threads and killing each forked process after it has served N connections. But this could cause unwanted behavior with persistent connections from apps like WhatsApp.

Anyway, redsocks is an excellent proxifier and I thank you all for the effort of bringing it to us.

PS: Sorry for my poor English; it's not my main language and I haven't practiced it in a long time...

@gogoseo

gogoseo commented Mar 30, 2019

Hi,
I'm having a similar issue, probably with redsocks maxing out on connections.
a) How can I check the actual maximum number of connections that redsocks has on Ubuntu?
b) How do I change the "max fd limit" on Ubuntu (I guess this is the way to increase the max connections)?
