How I Split Podcast Files

Update 2011-01-18: The Sansa m250 player finally died, and I now have a newer mp3 player that fast-forwards nicely, so I no longer do this goofy podcast splitting stuff.

Note: This is a “How-I” (works for me) not a “How-to” (do as I say) post.

I do goofy stuff sometimes. For example, I use Linux to download a couple podcasts targeted to Microsoft Windows developers. Specifically, I use µTorrent (that’s the “Micro” symbol so the name is pronounced “MicroTorrent”), a Windows BitTorrent client, running in Wine on Ubuntu to download the .NetRocks and Hanselminutes podcasts. I’ve had no problems running µTorrent in Wine. I got started doing this because my mp3 player was awkward to work with in Windows XP.

When I connected my Sansa m250 mp3 player to a Windows XP box, the driver software XP loaded wanted me to interact with the mp3 player as a media device. It has been a while, and I can’t recall exactly what it did, but I do recall it wanted me to use a media library application (one that would probably try to enforce DRM restrictions) and did not give me direct access to the file system on the player. There is probably a way around that, but I didn’t find it quickly at the time. What I did find was that when I connected the mp3 player to my old PC running Ubuntu it detected it and mounted it as a file system device that I could happily copy mp3 files to as I pleased. Good enough for me.

At first I was using the Azureus BitTorrent client, which is a Java app and runs on Ubuntu, to download the podcasts (and an occasional distro to play with). That application seemed to get more bloated with each release. It started displaying a bunch of flashy stuff and promoting things that you probably shouldn’t be downloading (but it’s okay if you don’t believe in copyright). I read about µTorrent and tried it on a Windows XP PC. It’s a lightweight program that does BitTorrent well without promoting piracy (personally, I do think copyright, with limits, is a good thing). While this worked well for downloading, I didn’t like the extra step of copying files from the PC running Windows to the other running Ubuntu to load them onto my mp3 player. After reading a timely article about Wine (the source of the article escapes me now), I decided to try running µTorrent using Wine. I don’t recall having any problems setting it up, it just worked. I did have to fiddle with my router to set up port forwarding but that’s not related to Wine or Ubuntu, just something you may have to do for BitTorrent to work.

This method of downloading the podcasts works well, but that’s not the end of the story. Occasionally I would be part way through a podcast and, for some reason (maybe I was trying to rewind a little bit within the file but my finger slipped and it went back to the beginning of the file), I would have to fast-forward to where I left off. Hour-long podcasts in a single mp3 file are not easy to fast forward with the Sansa player I have. It doesn’t forward faster the longer you hold the button like some devices do, it just goes at the same (painfully slow for a large file) pace. It seemed like splitting the mp3 files into sections would make that sort of thing easier. Bet there’s an app for that.

A search of the Ubuntu application repository turned up mp3splt. It has a GUI but I only wanted the command line executable which is available in the repository and can be installed from the command line (note that there’s no “i” in mp3splt):

sudo apt-get install mp3splt

After a couple trips to the man page to sort out which command line arguments to use, I had it splitting big mp3 files into sections in smaller mp3 files. That worked for splitting the files but I found that the player didn’t put those files in order when playing back. That’s not acceptable. I probably could just make a playlist file and use that to get the sections to play in order. I wondered if setting the ID3 tags in a way that numbered the tracks would make the player play them in order. Turns out it would. A search for “ID3” in the Ubuntu repository led to id3tool, a simple utility for editing the ID3 tags in mp3 files. I installed it too:

sudo apt-get install id3tool

I wrote a shell script named podsplit.sh to put this splitting apart all together. I use a specific directory to hold the mp3 files I want to split (but I’ll call it a “folder” since that’s the GUI metaphor, and I use the GNOME GUI to move the files around). I manually copy the downloaded mp3 files into the 2Split folder and then open a terminal and run the script. The script creates a sub-folder for each mp3 file that is split. When the script is finished I copy the sub-folders containing the resulting smaller mp3 files to the Sansa mp3 player.

Here’s the shell script:

#!/bin/bash

#------------------------------------------------------------
# podsplit.sh
#
# by Bill Melvin (bogusoft.com)
#
# BASH script for splitting mp3 podcasts into smaller pieces.
# I want to do this because it takes "forever" to fast-
# forward or rewind in a huge mp3 on my Sansa player.
#
# This script requires mp3splt and id3tool.
#
# This script, being a personal-use one-off utility, also 
# assumes some things:
# 1. mp3 files to be split are placed in ~/2Split
# 2. The file names are in the format showname_0001.mp3
#    or showname_0001_morestuff.mp3 where 0001 is the 
#    episode number.
# 
# I'm no nix wiz and I don't write many shell scripts so 
# this script also echoes a bunch of stuff so I can see 
# what's going on. 
#
#------------------------------------------------------------
# [2009-01-18] First version. 
#
# [2009-01-24] Use abbreviated show name for Artist.
#
# [2009-02-12] Changed split time from 3.0 to 5.0.   
#
# [2009-02-16] Use track number instead of end-time in track 
# title.
#
# [2009-02-19] Redirect some output to log file.
#------------------------------------------------------------

split_home=~/2Split
logfn="${split_home}/podsplit-log.txt"

ChangeID3() {
  filepath=$1
  filename=$2

  # Get track number from ID3.
  temp=`id3tool "$filepath" | grep Track: | cut -c9-`
  
  # Zero-pad to length of 3 characters.
  track=`printf "%03d" $temp`
    
  # Extract the name of the show and the episode number from 
  # the file name. This only works if the file naming follows 
  # the convention showname_0001_morestuff.mp3 where 0001 
  # is the episode number. The file name is split into fields 
  # delimited by the underscore character.
  show=`echo $filename | cut -d'_' -f1`
  episode=`echo $filename | cut -d'_' -f2`
  abbr="${show:0:6}"
  album="${abbr}_${episode}"
  title="${abbr}_${episode}_${track}"

  echo "ChangeID3"
  echo "filepath = $filepath" >> $logfn
  echo "filename = $filename" >> $logfn
  echo "show = $show" >> $logfn
  echo "abbr = $abbr" >> $logfn
  echo "episode = $episode" >> $logfn
  echo "album = $album" >> $logfn
  echo "title = $title" >> $logfn
  echo "track = $track" >> $logfn
  echo "BEFORE" >> $logfn
  id3tool "$filepath" >> $logfn
  
  id3tool --set-album="$album" --set-artist="$abbr" --set-title="$title" "$1"
  
  echo "AFTER" >> $logfn
  id3tool "$filepath" >> $logfn
}

SplitMP3() {  
  echo "SplitMP3"
  name1=$1
  echo "name1 = $name1"
  
  # Get file name and extension without directory path.
  name2=${name1#$split_home/}
  echo "name2 = $name2"
  
  # Get just the file name without the extension.
  name3=${name2%.mp3}
  echo "name3 = $name3"

  outdir=$split_home/$name3.split
  echo "Create $outdir"
  mkdir "$outdir"

  mp3splt -a -t 5.0 -d "$outdir" -o @t_@n $1

  for MP3 in $outdir/*.mp3
  do
    ChangeID3 "$MP3" "$name3"
  done   
}

for FN in $split_home/*.mp3
do
  SplitMP3 "$FN"
done

echo "Done."

This is not a flexible script as my folder for splitting files is hard-coded and it assumes a file naming convention for the mp3 files being split. If you’re an experienced shell scripter I’m sure you can do better. I still consider myself a Linux “noob” (and offer proof as well), intermediate in some areas at best. I am posting this because someone else may be trying to solve a similar problem and this can serve as an example of what worked for one person, in one situation, to work around the limitations of one particular mp3 player. Someone less goofy would probably just buy an iPod and use iTunes to handle the podcast files.