Em Smith wrote:
> It's possible to do it with 4.3, but not easy.
> 
> There are scraper configurations such as data/conf/epggrab/eit/scrape/uk.  These are applied to the epg grabber and can manipulate the epg fields before they are stored and used.
> 
> For example, in the UK some broadcasters prepend "New: " to the title, so the regex strips it from the title before using it.
> 
> Assuming the epg grabber is used, then a config could strip the "odcinek [0-9]+" from the title (using the UK config as a guide on writing the config file), and that should solve the problems.
> 
> Re: episode, yes I noticed that.  I'd have to re-check the code to see how that is extracted since 4.3 does it very differently and stores the fields separately so that it can be passed better to Kodi.
Thanks a lot for confirming how it does this in the code; otherwise it is guesswork, running tests to work out what information it uses and how.
I am using an external XMLTV grabber: the guide is downloaded from my IPTV provider and passed on to tvheadend by a short script. The line below is the most important part:
cat /sharedfolders/Appdata/tvheadend42/scripts/epg.xml | socat - UNIX-CONNECT:/sharedfolders/Appdata/tvheadend42/epggrab/xmltv.sock
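As a rough sketch only, that line could be wrapped in a small function that checks its inputs before calling socat. The function name push_epg and the return codes are my own invention for illustration; the socat call itself is the same as in the line above.

```shell
#!/bin/sh
# Hypothetical helper (name and return codes are illustrative):
# push an XMLTV file into the tvheadend xmltv.sock socket.
push_epg() {
    xml=$1
    sock=$2
    # Refuse to run if the guide file is missing or empty.
    [ -s "$xml" ] || { echo "EPG file missing or empty: $xml" >&2; return 1; }
    # Refuse to run if the socket is not there (e.g. tvheadend is not running).
    [ -S "$sock" ] || { echo "socket not found: $sock" >&2; return 2; }
    # Same transfer as the one-liner above, without the extra cat.
    socat - UNIX-CONNECT:"$sock" < "$xml"
}

# Usage with the paths from this post:
# push_epg /sharedfolders/Appdata/tvheadend42/scripts/epg.xml \
#          /sharedfolders/Appdata/tvheadend42/epggrab/xmltv.sock
```

The checks only guard against an empty download or a missing socket; they do not validate the XML itself.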
So I was thinking of extending my script to remove anything after the comma in the title using a regex. Some titles use abbreviations and some use just numbers, but most use a comma. I am not using DVB EPG for this source, and I will also do the same for existing recordings in the DVR log files. I will have to take a backup and practise on a few to start with.
I will also update the autorec title regex to add $ at the end.
perl -pi -e 's/(^\s*\<title\ lang\=\"pl\"\>.*)(\,.*\<)/$1\</g' epg.xml
perl -pi -e 's/\s*\<\/title\>/\<\/title\>/g' epg.xml              #to remove any spaces in case there were any before comma
Example entry from the EPG XML file:
<programme start="20191128132000 +0100" stop="20191128135000 +0100" channel="HGTV">
    <title lang="pl">Odlotowy ogród 4, odc. 5/6</title>
    <desc lang="pl">W miejscowości Modliczki niedaleko Krakowa mieszka małżeństwo z pięciorgiem dzieci i psem. Pani Kasia jest psychologiem i doradcą zawodowym, a pan Paweł listonoszem. Marzą, by ich ogródek stał się miejscem wymarzonego relaksu, w którym każdy znajdzie swój ulubiony zakątek. Obecnie dostęp do ogrodu jest mocno utrudniony. Bezładnie porozmieszczane są tam piaskownica, trampolina i huśtawka. W dodatku pies skutecznie niszczy trawę. Dominik Strzelec proponuje rodzinie ciekawe rozwiązania</desc>
After further thought I decided to apply the above method per problematic recording, with a separate command for each. I looked in the EPG for titles containing commas and found 1245 over 3 days. They were not on the channels I mainly use, but in the majority the comma was part of the description.
I also found some titles that contain the season number before the comma, e.g. "Odlotowy ogród 4, odc. 5/6", so for each problematic TV show I would have to copy and amend the command below.
perl -pi -e 's/(^\s*\<title\ lang\=\"pl\"\>Odlotowy\ ogród)(.*\<)/$1\</g' epg.xml
I will try renaming all the DVR logs, but that will probably be in two days' time.
Once again, thank you Em and Joe for helping me with this. Please let me know if my approach is not appropriate.