Bug #6009

[DVR] 100%CPU after running some days

Added by Robert Heel 7 days ago. Updated 4 days ago.

Status: New
Priority: Normal
Assignee: -
Category: PVR / DVR
Target version: -
Start date: 2021-02-21
Due date:
% Done: 0%
Estimated time:
Found in version: multiple, updated up to 4.3-1923~gaaca05cc1
Affected Versions:
Description

I have been using tvheadend for a long time. This issue has happened every now and then, but in the last few weeks it has recurred daily.
I suspect the cause is that we have accumulated over 5000 recordings over time, but I would rather not delete them just to verify that.

System: Ubuntu 20.04.2 LTS
Memory: 32 GB
Disk: 24 TB btrfs raid 5

top -H shows the tvh:dvr thread in first place.

strace on tvheadend repeats these lines:
futex(0x5631f0daed60, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x5631f0c4e028, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1613940659, tv_nsec=0}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT

strace on the tvh:dvr thread repeats this line:
futex(0x5631f3e0adf4, FUTEX_WAKE_PRIVATE, 2147483647) = 0

Killing the tvh:dvr thread or restarting tvheadend solves the issue for a while.

History

#1

Updated by Flole Systems 7 days ago

Update to the latest master, which contains many improvements and fixes.

Then please check with gdb what the thread is doing.

#2

Updated by Robert Heel 5 days ago

The CPU load is high again, so I just checked. I will now update to commit 00b35ec7803388eb08e4835a1df821283ddef4a9 and check again over the next few days.

An endless loop in src/dvr/dvr_rec.c:
1667 while(run) {
(gdb) n
1668 sm = TAILQ_FIRST(&sq->sq_queue);
(gdb) n
1669 if(sm == NULL) {
(gdb) n
1670 tvh_cond_wait(&sq->sq_cond, &sq->sq_mutex);
(gdb) n
1671 continue;
(gdb) n
1667 while(run) {
(gdb) n
1668 sm = TAILQ_FIRST(&sq->sq_queue);
(gdb) n
1669 if(sm == NULL) {
(gdb) n
1670 tvh_cond_wait(&sq->sq_cond, &sq->sq_mutex);
(gdb) n
1671 continue;
(gdb) n

#3

Updated by Flole Systems 5 days ago

Something is apparently waking up the thread. If that happens constantly, you need to find out what exactly is sending those wakeups.

#4

Updated by Robert Heel 4 days ago

How? I have some coding skills, but I'm not familiar with threads...

Shouldn't TAILQ_FIRST return something other than NULL? As long as it returns NULL, the code will keep looping ...

The new version has been running for 22 hours without a hang. I will try to dig deeper next time.
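
On the TAILQ_FIRST question above: with the `<sys/queue.h>` macros, TAILQ_FIRST returns NULL whenever the queue is empty, so a NULL result on its own would just mean "nothing queued right now". A minimal sketch, assuming a made-up `struct msg` and helper names (`queue_is_idle`, `queue_push`, `queue_pop`) that are not taken from tvheadend:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical message type; tvheadend's real streaming message differs. */
struct msg {
  int id;
  TAILQ_ENTRY(msg) link;
};

TAILQ_HEAD(msg_queue, msg);

/* Returns 1 when nothing is queued, i.e. TAILQ_FIRST() yields NULL. */
int queue_is_idle(struct msg_queue *q)
{
  return TAILQ_FIRST(q) == NULL;
}

/* Appends a message at the tail of the queue. */
void queue_push(struct msg_queue *q, struct msg *m)
{
  assert(m != NULL);
  TAILQ_INSERT_TAIL(q, m, link);
}

/* Removes and returns the oldest message, or NULL when the queue is empty. */
struct msg *queue_pop(struct msg_queue *q)
{
  struct msg *m = TAILQ_FIRST(q);
  if (m != NULL)
    TAILQ_REMOVE(q, m, link);
  return m;
}
```

After TAILQ_INIT the queue is idle, and it becomes idle again once every pushed message has been popped. So the NULL check in the trace looks like the normal idle path; the open question would be why the wait after it returns so often.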

#5

Updated by Flole Systems 4 days ago

The tvh_cond_wait should wait until there's a wakeup event. So you need to figure out where that wakeup event is coming from.
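
To illustrate the point about wakeup events, here is a rough sketch of a consumer loop with a similar shape, written against plain pthreads. It assumes tvh_cond_wait behaves like pthread_cond_wait; the `work_queue` struct, the `empty_wakeups` counter and all function names are made up for this example and are not tvheadend code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the DVR queue, not tvheadend's real struct. */
struct work_queue {
  pthread_mutex_t mutex;
  pthread_cond_t  cond;
  int  pending;        /* queued items; 0 plays the role of sm == NULL */
  int  run;            /* cleared to ask the consumer to exit */
  long processed;      /* items the consumer handled */
  long empty_wakeups;  /* wakeups that found nothing to do */
};

/* Consumer: sleeps while idle, similar in shape to the loop in the trace. */
void *consumer(void *arg)
{
  struct work_queue *q = arg;

  pthread_mutex_lock(&q->mutex);
  while (q->run || q->pending > 0) {
    if (q->pending == 0) {
      /* Blocks until signalled (spurious wakeups are also permitted). */
      pthread_cond_wait(&q->cond, &q->mutex);
      if (q->pending == 0 && q->run)
        q->empty_wakeups++;
      continue;
    }
    assert(q->pending > 0);
    q->pending--;
    q->processed++;
  }
  pthread_mutex_unlock(&q->mutex);
  return NULL;
}

/* Producer side: queue one item and wake the consumer. */
void post(struct work_queue *q)
{
  pthread_mutex_lock(&q->mutex);
  q->pending++;
  pthread_cond_signal(&q->cond);
  pthread_mutex_unlock(&q->mutex);
}

/* Ask the consumer to drain the remaining items and exit. */
void stop(struct work_queue *q)
{
  pthread_mutex_lock(&q->mutex);
  q->run = 0;
  pthread_cond_broadcast(&q->cond);
  pthread_mutex_unlock(&q->mutex);
}

/* Starts a consumer, posts `items` units of work, returns how many it handled. */
long run_demo(int items)
{
  struct work_queue q = { .run = 1 };
  pthread_t tid;

  pthread_mutex_init(&q.mutex, NULL);
  pthread_cond_init(&q.cond, NULL);
  if (pthread_create(&tid, NULL, consumer, &q) != 0)
    return -1;
  for (int i = 0; i < items; i++)
    post(&q);
  stop(&q);
  pthread_join(tid, NULL);
  pthread_cond_destroy(&q.cond);
  pthread_mutex_destroy(&q.mutex);
  return q.processed;
}
```

Build with `-pthread`. In a loop like this, a counter such as `empty_wakeups` climbing quickly while nothing is queued would suggest that some code path signals the condition without adding work, which is the kind of thing to look for here.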
