There was a problem with prosody, fixed by a restart of prosody

Pirate Praveen
Aug 17 08:59:27 datamanager     error   Unable to write to offline storage ('/var/lib/prosody/poddery%2ecom/offline/praveen.list: Too many open files') for user: praveen@poddery.com 
Aug 17 09:10:23 datamanager     error   Unable to write to offline storage ('/var/lib/prosody/poddery%2ecom/offline/bsc.list: Too many open files') for user: bsc@poddery.com 

We need to dig deeper and find the cause. Is anyone interested?

Pirate Praveen November 27th, 2016 18:12

This keeps recurring; I had to restart again today.

Pirate Praveen November 27th, 2016 18:18

@balasankarchelamat @jayaura @akshay @anisha @fayadfami @tvm @isaagar @mintojoseph can any of you help with this issue?

Minto Joseph November 28th, 2016 01:47

@praveenarimbrathod Are you sure it is not caused by hitting the open file descriptor limit? Have you checked ulimit -n? If it is not fixed by then, I will check when I am back on the 30th.
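
One caveat on that check: ulimit -n run in a login shell reports the limit of that shell, which can differ from the limit the already-running daemon was started with. A minimal sketch, assuming Linux with /proc mounted, that reads both the soft limit and the current descriptor count for a given pid. The function name fd_usage is made up for illustration.

```shell
#!/bin/sh
# Report open descriptors versus the soft limit for one running process.
# Linux-only: relies on /proc/<pid>/fd and /proc/<pid>/limits.
# Reading another user's /proc/<pid>/fd normally requires root.
fd_usage() {
    pid="$1"
    if [ ! -r "/proc/$pid/limits" ]; then
        echo "no readable /proc entry for pid $pid" >&2
        return 1
    fi
    # One entry per open descriptor; $(( )) strips any padding from wc.
    open=$(( $(ls "/proc/$pid/fd" 2>/dev/null | wc -l) ))
    # Line format: "Max open files  <soft>  <hard>  files", so the soft limit is field 4.
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open_fds=$open soft_limit=$limit"
}

# Demo on the current shell. For the daemon, pass its pid instead, for example
# fd_usage "$(pgrep -o -u prosody)"   (assumes it runs as user "prosody")
fd_usage "$$"
```

If open_fds sits close to soft_limit shortly before the errors appear, the limit is being hit; whether that is a leak or just load needs the kind of data discussed further down.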

Pirate Praveen December 1st, 2016 12:38

We have added two cron jobs to log open files and sockets every day.

1 1 * * * lsof -u prosody > /root/debug/"lsof-`date --rfc-3339=date`"
1 1 * * * netstat -taupe > /root/debug/"netstat-`date --rfc-3339=date`"

The next time this happens, we will have better data to analyze and find the root cause.
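
A rough sketch of how one of these daily snapshots could be summarized, in case it helps whoever looks at the data. It assumes lsof's default column layout (COMMAND PID USER FD TYPE ...), so TYPE is the fifth field. The function name lsof_types and the sample rows are made up for illustration, not taken from the server.

```shell
#!/bin/sh
# Count open files per TYPE column in a saved lsof listing, largest first.
# Assumes the default lsof layout, where TYPE is field 5 and line 1 is the header.
lsof_types() {
    awk 'NR > 1 { count[$5]++ } END { for (t in count) print count[t], t }' "$1" | sort -rn
}

# Illustrative sample only (invented rows, not real server data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
lua5.1  1234 prosody    3u  IPv4  11111      0t0  TCP *:5222 (LISTEN)
lua5.1  1234 prosody    4w   REG    8,1     2048  222 /var/log/prosody/prosody.log
lua5.1  1234 prosody    5u  IPv4  11112      0t0  TCP 10.0.0.1:5222->10.0.0.2:40000 (ESTABLISHED)
EOF
lsof_types "$sample"
rm -f "$sample"
```

Against the real files this would be called as lsof_types /root/debug/lsof-YYYY-MM-DD for a given day. Comparing the counts across days should show whether it is regular files (REG) or sockets (IPv4/IPv6) that keep growing.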

Vidyut December 2nd, 2016 02:36

Why are logged out users allowed access to view information like this?

Pirate Praveen December 2nd, 2016 03:53

These issues are technical problems and not sensitive information. We have a private mailing list for the podmins, and sensitive information like passwords is always encrypted. We have created private discussions here in the past when we felt the contents should not be public.

Pirate Praveen December 2nd, 2016 03:55

If you meant the addresses, only our own addresses were copied (both addresses mentioned belong to podmins).

Pirate Praveen December 12th, 2016 10:28

We are hitting the limits again. I was not able to log in some time back, and the log had the same error message. I think there is a problem with the prosody configuration, as it should be using the database and not the file system for offline messages.

Pirate Praveen December 12th, 2016 15:08

@mintojoseph analyzed the situation and figured out the issue using strace -xvtto /tmp/lua.st -p 29152 (29152 being the pid of the prosody process). We have set MAXFDS to a higher value in /etc/default/prosody and also modified /etc/prosody/prosody.cfg.lua to use the SQL backend for offline messages. Hopefully this fixes the issue. We are still monitoring the situation and will update here if there are more problems.
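
For anyone repeating this kind of analysis, here is one way a trace like that could be summarized. The thread does not record what the trace actually showed, so this is only a sketch. It assumes the line format produced by -tt without -f (timestamp, then the syscall), and the function name strace_errors and the sample lines are invented for illustration.

```shell
#!/bin/sh
# Tally failed syscalls in an strace log by syscall name and errno symbol.
# Assumes lines of the form "HH:MM:SS.usec name(args) = -1 ERRNO (text)".
strace_errors() {
    awk '/ = -1 E/ {
        name = $2; sub(/\(.*/, "", name)   # syscall name: text before the first "("
        err = $0; sub(/.* = -1 /, "", err) # keep what follows the return value
        sub(/ .*/, "", err)                # first word is the errno symbol
        count[name " " err]++
    } END { for (k in count) print count[k], k }' "$1" | sort -rn
}

# Illustrative sample only (invented lines, not the real /tmp/lua.st).
sample=$(mktemp)
cat > "$sample" <<'EOF'
12:38:01.000001 open("/var/lib/prosody/example/offline/a.list", O_WRONLY|O_CREAT|O_APPEND, 0666) = -1 EMFILE (Too many open files)
12:38:01.000002 read(7, 0x1f3a2c0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
12:38:01.000003 open("/var/lib/prosody/example/offline/b.list", O_WRONLY|O_CREAT|O_APPEND, 0666) = -1 EMFILE (Too many open files)
12:38:01.000004 close(7) = 0
EOF
strace_errors "$sample"
rm -f "$sample"
```

A large count of EMFILE failures against one syscall would match the "Too many open files" errors in the prosody log and point at where the descriptors run out.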