What’s the connection between these disparate items? I’ll get to that in a minute, but first I’ll talk about SqueezeCenter.
I have been running SqueezeCenter (formerly SqueezeBox) since its earliest incarnation in about 2001, when I bought a single Slimp3 player. The server is based on an open source project that runs on different hardware and operating systems, including the RPi, which means it has stayed alive with the help of a dedicated team of developers. I am very grateful to them. I now run multiple players from a library of 22,000 tracks hosted on a Windows Home Server PC. I originally used it to distribute Napster around the house, but Napster changed their licensing, so I moved to Spotify using the excellent Triode add-in. The RPi is not forgotten either, and there are some excellent software players available for it.
My family are big fans of Melvyn Bragg’s In Our Time on BBC Radio 4. There are over 600 podcasts and the majority are still available on the IOT BBC website. I’ve used Juice to download the podcasts automatically every week for many years. Though the BBC have done some work recently to make episodes easier to find, I wanted a way to index them into SqueezeCenter playlists so that it would be easier to find specific episodes or see the latest ones, which is where the RPi comes in. The podcasts are available as category feeds or as a single feed that has them all. Some of the podcasts exist in multiple categories, and the same podcast can have different file names. What I wanted was to create playlists that list every episode once, sorted by descending broadcast date and also by title. In addition, I wanted separate playlists for categories such as Culture or Science. This should have been easy, but the podcasts have very little tag information. In the end I used the broadcast date extracted from the file name as a unique identifier across all the categories and extracted the title from the MP3 tag information. A couple of podcasts had no MP3 tag information, so I had to put the title tag back in manually.
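The de-duplication idea can be shown in isolation with a small sketch. This is not the script itself, just an illustration under some assumptions: the file names are made up, the date is assumed to appear as a YYYYMMDD token, and the helper names broadcast_date and unique_by_date are hypothetical. Unlike the full script, this version simply skips any file with no recognisable date.

```python
import re
from datetime import datetime


def broadcast_date(filename):
    """Return the first YYYYMMDD token in the file name as a date, or None."""
    for token in re.split(r"[-_.]", filename):
        try:
            return datetime.strptime(token, "%Y%m%d").date()
        except ValueError:
            continue  # not a date token, try the next one
    return None


def unique_by_date(filenames):
    """Keep the first file seen for each broadcast date, newest first."""
    seen = set()
    unique = []
    for name in filenames:
        date = broadcast_date(name)
        if date is not None and date not in seen:
            seen.add(date)
            unique.append((date, name))
    return sorted(unique, reverse=True)


# Hypothetical file names: the same episode appears in two feeds.
episodes = unique_by_date([
    "iot_20111229-0945a.mp3",
    "iot_20120105-0945a.mp3",
    "iotc_20120105-0945b.mp3",
])
for date, name in episodes:
    print(date, name)
```

Because the broadcast date is the key, the second copy of the 5 January episode is dropped even though its file name differs.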
The code is listed below. This was my first Python project, so I’m hoping that as I get more experienced the code will become cleaner.
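#!/usr/bin/python
from os import walk
from os.path import splitext, join
import codecs
from datetime import datetime
from operator import itemgetter
import eyed3
import time
import re
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

# Where the podcasts and playlists are mounted on the Pi
PODCAST_LOCATION = "/mnt/podcasts/"
PLAYLIST_LOCATION = "/mnt/nas/Playlists/"
# Where the same podcasts live on the Windows server
PODCAST_LOCATION_ON_SERVER = "ServerFolders/Podcasts/"
SERVER_DRIVE = "D"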

def createplaylist(filename, files):
    # Write an extended M3U playlist with Windows (CRLF) line endings.
    output = codecs.open(filename, 'w', encoding="utf-8")
    output.write("#CURTRACK 0" + '\r\n')
    output.write("#EXTM3U\r\n")
    for file in files:
        # Map the path as mounted on the Pi to the path on the server.
        localfile = file[0].replace(PODCAST_LOCATION, PODCAST_LOCATION_ON_SERVER)
        quotedfile = quote(localfile)
        output.write("#EXTURL:file:///" + SERVER_DRIVE + ":" + quotedfile + '\r\n')
        output.write("#EXTINF:" + str(file[3]) + "," + file[1] + '\r\n')
        localfile = localfile.replace("/", "\\")
        output.write(SERVER_DRIVE + ":\\" + localfile + "\r\n")
    output.close()

def get_title_time(filename):
    # Read the title and duration from the MP3 tags and build a sort key.
    print(filename)
    audiofile = eyed3.load(filename)
    title = audiofile.tag.title
    title = title.replace("IOT: ", "")
    sorttitle = title.upper()
    sorttitle = sorttitle.replace("THE ", "")
    sorttitle = sorttitle.replace("IN OUR TIME: ", "")
    sorttitle = re.sub(r'IOT.:', '', sorttitle)
    audiotime = audiofile.info.time_secs
    return (title, sorttitle, audiotime)

def getdate(file, format):
    # Try each "-" or "_" separated part of the name against the date format.
    for dashsplitfilename in file.split("-"):
        for underscoresplitfilename in dashsplitfilename.split("_"):
            try:
                transmissiondate = datetime.strptime(underscoresplitfilename, format)
                # Sanity check to reject numbers that only look like dates.
                if transmissiondate.year > 1990 and transmissiondate.year < 2020:
                    return transmissiondate
            except ValueError:
                pass
    return None

def get_transmission_date(file):
    # The feeds use several date styles, so try each in turn.
    transmissiondate = getdate(file, "%Y%m%d")
    if transmissiondate is None:
        transmissiondate = getdate(file, "%d%m%y")
    if transmissiondate is None:
        ext = splitext(file)
        transmissiondate = getdate(ext[0][-10:], "%d %m %Y")
    return transmissiondate

def select_files(root, files, category):
    selected_files = []
    for file in files:
        # do concatenation here to get full path
        full_path = join(root, file)
        ext = splitext(file)
        if ext[1] == ".mp3":
            title, sorttitle, time = get_title_time(full_path)
            transmissiondate = get_transmission_date(ext[0])
            selected_files.append((full_path, title, sorttitle, time, transmissiondate, category))
    return selected_files

def build_recursive_dir_tree(path, category):
    selected_files = []
    for root, dirs, files in walk(path):
        selected_files += select_files(root, files, category)
    return selected_files

def main():
    files = build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time With Melvyn Bragg", "a")
    files += build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time Culture", "c")
    files += build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time History", "h")
    files += build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time Philosophy", "p")
    files += build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time Religion", "r")
    files += build_recursive_dir_tree(PODCAST_LOCATION + "In Our Time Science", "s")
    # Keep only the first file seen for each transmission date.
    uniquedates = set()
    iotlist = [item for item in files if item[4] not in uniquedates and not uniquedates.add(item[4])]
    playlistnames = [("Culture", "c"),
                     ("History", "h"),
                     ("Philosophy", "p"),
                     ("Religion", "r"),
                     ("Science", "s")]
    iot = sorted(iotlist, key=itemgetter(2))
    createplaylist(PLAYLIST_LOCATION + "IOT-All-Sorted By Title.m3u", iot)
    iot = sorted(iotlist, key=itemgetter(4), reverse=True)
    createplaylist(PLAYLIST_LOCATION + "IOT-All-Sorted By Date.m3u", iot)
    for playlistname in playlistnames:
        iotlist = [item for item in files if item[5] == playlistname[1]]
        iot = sorted(iotlist, key=itemgetter(2))
        createplaylist(PLAYLIST_LOCATION + "IOT-" + playlistname[0] + "-Sorted By Title.m3u", iot)
        iot = sorted(iotlist, key=itemgetter(4), reverse=True)
        createplaylist(PLAYLIST_LOCATION + "IOT-" + playlistname[0] + "-Sorted By Date.m3u", iot)

if __name__ == "__main__":
    main()
Code is reproduced “as is”. No liability can be accepted by the author for its use.
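For anyone adapting this to a different setup, the part most likely to need changing is the mapping from the path as the Pi sees it to the path as the Windows server sees it. Here is a stripped-down sketch of one playlist entry. The function name m3u_entry, its default arguments, the example path and the example title are all assumptions made for illustration, and it leaves out the #EXTURL line.

```python
def m3u_entry(pi_path, title, seconds,
              pi_root="/mnt/podcasts/",
              server_root="D:/ServerFolders/Podcasts/"):
    """Build one extended M3U entry that points at the Windows path."""
    # Swap the Pi mount point for the server folder (assumed layout).
    server_path = pi_path.replace(pi_root, server_root, 1)
    # Windows paths use backslashes.
    windows_path = server_path.replace("/", "\\")
    # Extended M3U: an #EXTINF line (duration, title), then the file path.
    return "#EXTINF:%d,%s\r\n%s\r\n" % (seconds, title, windows_path)


# Hypothetical episode, for illustration only.
entry = m3u_entry("/mnt/podcasts/In Our Time Science/iot_20120105.mp3",
                  "Example Episode", 2520)
print(entry)
```

The CRLF line endings are there because the playlist is read on the Windows side.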