tv_grab_zz_sdjson patches

Added by Robert Cameron over 3 years ago

I'm not sure if I'm the only one, but I've noticed a problem with the tv_grab_zz_sdjson grabbers from the XMLTV project. I personally much prefer Schedules Direct's JSON service API over their older DataDirect API (tv_grab_na_dd), as the newer JSON API provides better descriptions and season/episode information.

However, I noticed a problem with the grabbers when I set the "Broadcast type" of my Autorecs to "New/premiere": Tvheadend was not scheduling these airings to record because they had a previously-shown element in the XMLTV file. (I'm at UTC-0700, so for prime-time shows this often put a show's previously-shown date into yesterday, as XMLTV usually has its date/time set in UTC.) Even after I modified the grabber script to output the previously-shown element in my local time zone, Tvheadend still wouldn't mark some shows for recording, even when the new element was present.
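To illustrate the date shift described above, here is a minimal Python sketch. The airing time, the original air date, and the looks_previously_shown helper are all hypothetical; the comparison is only a naive illustration of why the dates disagree, not Tvheadend's or the grabber's actual logic.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical airing: 8 pm local time on 2017-09-21 in UTC-0700.
local_tz = timezone(timedelta(hours=-7))
local_start = datetime(2017, 9, 21, 20, 0, tzinfo=local_tz)

# XMLTV start times are commonly written in UTC.
utc_start = local_start.astimezone(timezone.utc)

# Schedules Direct gives originalAirDate as a bare date (YYYY-MM-DD).
original_air_date = datetime.strptime("2017-09-21", "%Y-%m-%d").date()


def looks_previously_shown(start, air_date):
    """Naive check: the original air date falls before the airing's calendar date."""
    return air_date < start.date()


print(utc_start.strftime("%Y%m%d%H%M%S %z"))                   # 20170922030000 +0000
print(looks_previously_shown(utc_start, original_air_date))    # True
print(looks_previously_shown(local_start, original_air_date))  # False
```

Seen in UTC, the airing starts on the 22nd, so an original air date of the 21st reads as "yesterday" even though it is the same evening locally.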

Therefore, I tweaked the grabber script so that if a scheduled program has {"new": true} in the schedule response, the previously-shown element is suppressed entirely. (This also makes more sense semantically: if an airing is indeed "new" and has not been aired before, it should not carry a "previously shown" date. In the Schedules Direct JSON API the combination is acceptable, because Schedules Direct reports the date as {"originalAirDate": "YYYY-MM-DD"} on the program response, meaning it relates to the program, not to a particular airing of it.)
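The rule can be sketched in a few lines. This is a Python illustration only (the real grabbers are Perl scripts), and the airing_elements function and its dictionary arguments are hypothetical stand-ins for the schedule and program responses.

```python
from xml.sax.saxutils import quoteattr


def airing_elements(schedule, program):
    """Return the XMLTV child elements for one airing as a list of strings.

    schedule and program are plain dicts standing in for the Schedules
    Direct schedule and program responses (an assumption for this sketch).
    """
    # A "new" airing gets <new /> and never a previously-shown element.
    if schedule.get("new"):
        return ["<new />"]

    # Otherwise fall back to the program-level original air date, if any.
    air_date = program.get("originalAirDate")
    if air_date:
        # Schedules Direct uses YYYY-MM-DD; XMLTV uses YYYYMMDD plus an offset.
        start = air_date.replace("-", "") + " +0000"
        return ["<previously-shown start=%s />" % quoteattr(start)]

    return []


print(airing_elements({"new": True}, {"originalAirDate": "2017-09-21"}))
print(airing_elements({}, {"originalAirDate": "2017-09-21"}))
```

The first call yields only the new element even though an original air date is present; the second yields only previously-shown.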

I also recently switched from tv_grab_zz_sdjson to tv_grab_zz_sdjson_sqlite, as the SQLite version of the grabber is much, MUCH faster and more efficient. However, the _sqlite version suffers from almost the same problem: if a program has an originalAirDate associated with it, the previously-shown element is generated. Whether an airing is "new" is only checked if there is no originalAirDate, which seems a bit backwards. So, I made a similar modification to tv_grab_zz_sdjson_sqlite: if a program is marked as "new" for that airing, the generated XMLTV will have a new element, and no previously-shown element will be generated.

I'm attaching my patches as unified diffs, in case anyone else might find them useful. The tv_grab_zz_sdjson patch is against the shipping 0.5.69 version from the XMLTV project. The tv_grab_zz_sdjson_sqlite patch, though, is against version 1.32 (2017-06-19), which I pulled from the GitHub repo of its author.

Replies (5)

RE: tv_grab_zz_sdjson patches - Added by Robert Cameron over 3 years ago

Here's an update:

After discussing the issue with the author of tv_grab_zz_sdjson_sqlite, he has decided not to modify the source. The reason is that, according to the XMLTV DTD, the <new /> element serves a different purpose than Schedules Direct's {"new": true} in the schedule response. Therefore, in order to keep his grabber as compliant as possible with the DTD (instead of deviating for other use cases), no changes will be made to his source.

Also, on 21 July 2017, he committed a few new changes and bumped the version to 1.33. So, here is a patch against tv_grab_zz_sdjson_sqlite v1.33 that adds a <new /> element if the schedule response indicates {"new": true} and suppresses the <previously-shown> element for that instance of the <programme>:

--- a/tv_grab_zz_sdjson_sqlite    2017-10-03 09:57:08.576347223 -0700
+++ b/tv_grab_zz_sdjson_sqlite    2017-09-22 13:22:00.512171274 -0700
@@ -1777,7 +1777,13 @@

         # XMLTV uses their standardized dates, while Schedules
         # Direct uses YYYY-MM-DD
-         if (defined($programDetails->{'originalAirDate'}))
+         # If "new", then do not generate previously-shown
+         if (defined($scheduleDetails->{'new'}))
+           {
+             my $new = $scheduleDetails->{'new'};
+             $w->emptyTag('new');
+           }
+         elsif (defined($programDetails->{'originalAirDate'}))
              my $originalAirDate = $programDetails->{'originalAirDate'};
              my $offset = ' +0000';

RE: tv_grab_zz_sdjson patches - Added by Em Smith over 3 years ago

Good write up.

However, one thing I don't understand is that the xmltv DTD also has a premiere element (in addition to the new element). The tvheadend code appears to parse either to set the "is new" episode flag.

From the DTD:

this [premiere] element doesn't have a clear meaning, just use it to represent
where 'premiere' would appear in a printed TV listing. You can use
the content of the element to explain exactly what is meant, for
example:

<premiere lang="en">
  First showing on national terrestrial TV
</premiere>

My understanding is that "new" is for the very first episode of the very first series, whereas premiere could be set for the first showing of any episode on your channel.

From what you describe, wouldn't the episodes be a premiere for your channel, but previously-shown in another time zone?

I noticed my sample file has:

<programme ...start="20170918081500 +0000"...>
 <previously-shown start="20170918 +0000"/>
  <premiere>Series Premiere</premiere>

And my movies have:


I don't know how consistently they are applied, since I've never used that option.

Even if your (original) xmltv files have premiere flags consistently set, I noticed that the tvheadend code appears to set "is repeat" if "previously-shown" is present, and the autorec appears to check the repeat flag rather than the is_new flag. Perhaps it could check the is_new flag first and, if that's set, assume new; otherwise check the is_repeat flag.
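The suggested check order could look like the sketch below. It is written in Python for illustration only; Tvheadend's actual autorec code is not shown in this thread, and the function and parameter names here are hypothetical.

```python
def is_new_airing(is_new, is_repeat):
    """Decide whether a "New/premiere" autorec should treat an airing as new.

    The explicit new flag wins; the repeat flag is only consulted when the
    new flag is not set.
    """
    if is_new:
        return True
    return not is_repeat


# An airing carrying both <new /> and previously-shown still counts as new.
print(is_new_airing(is_new=True, is_repeat=True))   # True
print(is_new_airing(is_new=False, is_repeat=True))  # False
```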

RE: tv_grab_zz_sdjson patches - Added by Robert Cameron over 3 years ago

The real problem is that there isn't a good mapping between Schedules Direct's JSON data and XMLTV. For instance, Schedules Direct has an "originalAirDate" that is present on nearly every program response, whereas the "new" tag is on the schedule response. As such, there isn't a direct analogue between previously-shown and originalAirDate. Also, the DTD is a bit ambiguous on what previously-shown means, and seems to almost indicate it is used to reference airings on other channels.

In short, XMLTV is rather an inadequate format for rich EPG data. Unfortunately, it's really the only option. (Unless someone wants to write an EPG module for SD as part of Tvheadend, that way data can properly be mapped to Tvheadend's database. But, because of the nature of SD's accounts/lineups and configuring it, it's not really something easily achieved with the way that TVH's code is structured.)

RE: tv_grab_zz_sdjson patches - Added by Em Smith over 3 years ago

I looked at how mythtv handles it and they ignore the "new" flag from the xmltv file and it looks like they set "new episodes" as being any episode for which originalairdate is within 14 days of the programme start time, and "repeat/unknown" as being a show outside that date range or where there isn't a previously shown date.
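As a rough Python sketch of the rule as described above (this is not taken from the MythTV source; the classify function is hypothetical, and treating "within 14 days" as an absolute difference is an assumption):

```python
from datetime import date, timedelta

# Window from the description above: within 14 days counts as a new episode.
NEW_WINDOW = timedelta(days=14)


def classify(start_date, original_air_date):
    """Classify an airing from its start date and original air date only."""
    if original_air_date is None:
        return "repeat/unknown"
    if abs(start_date - original_air_date) <= NEW_WINDOW:
        return "new"
    return "repeat/unknown"


print(classify(date(2017, 9, 21), date(2017, 9, 18)))  # new
print(classify(date(2017, 9, 21), date(2016, 1, 1)))   # repeat/unknown
print(classify(date(2017, 9, 21), None))               # repeat/unknown
```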

I guess that algorithm is meant to handle cases where a show premieres at 9pm on Thursday, is repeated at 10pm on the +1 channel, then repeated on a catch-up channel on Sunday at 11pm and 12am, and then maybe again a week later on Wednesday. So, depending on which series link you watch, any of them could be considered "new" for you.

Thinking about it, I think the reason SD is giving a previously-shown date in my example for 20170918081500 is that we have long-running soap operas which do not have seasons like US soaps such as Dallas, but instead run several nights a week, 52 weeks a year, so originalairdate would be used to look up metadata.

Given that the date is being given in xmltv files for new showings, and the tvheadend internal repeat flag is only used for xmltv/pyepg, perhaps we should just alter tvheadend behaviour to match mythtv behaviour and ignore the repeat flag from input files and calculate it based on previously shown date?

RE: tv_grab_zz_sdjson patches - Added by Reggie Burnett over 3 years ago

This is really quite frustrating. I keep having to copy in the patched grabber every time I update the container.