I cannot comment on how to generate the Shepherd data.
I am using Shepherd data sourced from
http://www.siliconhill.com.au/shepherd/
If you are lucky enough to live in one of the 8 cities listed, it might be sufficient for your needs.
The free-to-air and Foxtel files are offered as .gz downloads, i.e. compressed in gzip format, so they need to be decompressed before use.
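If you would rather not rely on a separate unzip utility, the decompression can be done with a few lines of Python. This is only a minimal sketch using the standard gzip module; the function name gunzip_to and the paths in the usage line are my own placeholders, not anything provided by Shepherd or XMLTVDB.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_to(src_gz, dest):
    """Decompress the gzip file src_gz and write the result to dest.

    Both arguments are placeholder paths chosen by the caller.
    Returns the destination as a Path.
    """
    src_gz, dest = Path(src_gz), Path(dest)
    # Create the destination folder if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the data so a large guide file is never held fully in memory.
    with gzip.open(src_gz, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest
```

For example, gunzip_to("output.xmltv.gz", "watch/output.xmltv") would write the decompressed guide into a folder called watch (a made-up folder name).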
If you live in a timezone whose offset is not a whole number of hours, you may hit the problem I did: I had to globally replace every occurrence of " +0930" with "+09:30", otherwise all shows in the EPG were wrong by 29 minutes.
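The global replace can be done in any text editor, but it is easy to script. Below is a rough sketch in Python that performs exactly the substitution described above and nothing smarter. The function names and the sample programme line are invented for illustration, and the file helper works on raw bytes so it does not have to guess the file's encoding.

```python
from pathlib import Path


def fix_offset(xmltv_text, old=" +0930", new="+09:30"):
    """Replace every occurrence of the old offset string with the new one."""
    return xmltv_text.replace(old, new)


def fix_offset_in_file(path, old=" +0930", new="+09:30"):
    """Apply the same replacement to a file in place and return the number
    of occurrences that were replaced.

    Works on bytes; the offset strings themselves are plain ASCII.
    """
    path = Path(path)
    data = path.read_bytes()
    old_b, new_b = old.encode("ascii"), new.encode("ascii")
    count = data.count(old_b)
    path.write_bytes(data.replace(old_b, new_b))
    return count


# Made-up sample line, only to show the effect of the replacement.
sample = '<programme start="20240101183000 +0930" stop="20240101190000 +0930">'
print(fix_offset(sample))
```

For other half-hour or quarter-hour zones you would pass your own old and new strings; I have only tried the +0930 case.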
If you need the data enhanced with season/episode information, you can try the XMLTVDB tool, which I run on a schedule on a Windows box.
XMLTVDB takes an XMLTV file and tries to look up each TV show on TVDB.com, using fuzzy matching of episode names, dates, etc. to determine season and episode numbers. It seems to work reasonably well so far.
• Install XMLTVDB from:
http://code.google.com/p/xmltvdb/
but take out any references to the ForTheRecord software from the properties file and configure it to your own watch folder and output destination.
• On a schedule of your choosing:
1. Start up XMLTVDB
2. Download the GZipped XMLTV file
3. Decompress the downloaded output.xmltv.gz into the watch folder
4. Give it roughly 10-15 minutes to generate the enhanced XML file at the output location defined in the XMLTVDB properties file
5. If you live in a timezone which is not in whole hours you might need to globally replace the timezone (e.g. replace " +0930" with "+09:30")
6. Point your XMLTV grabber at the file. I found the easiest way was to upload the file to some personal web space and use tv_grab_url (http://wiki.saihtam.dk/TVHeadend-tv_grab_url), but the more Linux-savvy should be able to work out how to access the file directly.
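For step 4, instead of sleeping for a fixed 10-15 minutes, a scheduled script could poll for the enhanced file. The sketch below is only one way to do it, in Python; wait_for_file and its parameters are hypothetical names of mine, and the default timeout simply mirrors the 15 minute upper estimate above. Note that it only checks that the file exists and is non-empty, so it cannot tell whether XMLTVDB is still writing to it or whether the file is left over from a previous run.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout_s=900.0, poll_s=30.0):
    """Return True once path exists and is non-empty, False on timeout.

    path is wherever your XMLTVDB properties file sends the enhanced XML
    (a placeholder here). The check runs at least once, even if
    timeout_s is 0.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout_s
    while True:
        if target.is_file() and target.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Deleting the old output file before each run, then calling wait_for_file on it, avoids picking up yesterday's result by mistake.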