Rippin' DVDs

xrayspx's picture
Music: 

Dana Carvey - Choppin' Broccoli

Today in Lattice of Convenience news, here's how to rip DVDs.

I barely understand the mencoder command that is the backbone of this thing, and there are many better ways to do lots of the stuff in this script. In fact I know several of those better ways, and looking at it fresh, I see some redundant stuff that cancels out other stuff. But it runs, and I use it, so here goes.

Ripping DVDs isn't fun. The disk labels are iffy at best: even within a single box set you might go from the Gold Standard "TV Show - S1D1" to "DVD_VIDEO" as a disk label, so it can get kind of ugly. To mitigate that I create an output folder based on the DVD disk label + a timestamp. If you get a run of disks with the same name, at least they're not overwriting each other's files, because the timestamp will shift. I currently have a dvdrip-output directory with the following DVDs in it:
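Here's a minimal sketch of that naming scheme on its own, outside the full script. The function name make_output_name is made up for illustration; the date format and the space-to-underscore swap are the same ones the script uses.

```shell
#!/bin/bash
# Hypothetical helper: build "<LABEL>-<timestamp>" from a disk label.
# Spaces in the label become underscores, like the sed call in the script.
make_output_name() {
    local label="$1"
    # Second argument is optional; default is "now" in the script's format.
    local stamp="${2:-$(date +%m%d%Y%H%M)}"
    printf '%s-%s\n' "${label// /_}" "$stamp"
}

# Fixed timestamp so the result is predictable:
make_output_name "I LOVE LUCY S2 D1" "090520202354"
# prints I_LOVE_LUCY_S2_D1-090520202354
```

The timestamp only goes down to the minute, so two identically labeled disks started within the same minute would still land in the same folder.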

...
DVD_VIDEO-090720202337
DVD_VIDEO-090820201025
DVD_VIDEO-090820201027
DVD_VIDEO-090820201142
I_LOVE_LUCY_S2_D1-090520202354
I_LOVE_LUCY_S2_D3-090620201047
LUCY_S1D1-090520201043
LUCY_S1D2-090520201043
LUCY_S1D3-090520201359
...

Those are all from the same box set. So that's 3 naming conventions from one series. To be fair I think that while it's the same company producing them they probably came as separate "season" boxes rather than one big set. Still. Come on. Jesus.

Another big gotcha I've hit, again mainly with TV series box sets: a single show might exist on the disk as many as THREE times. Once as a "standalone episode", once as "episode with commentary track" and once as part of a massive concatenated file of all the episodes on that disk. In the case of the commentary track, that audio seems to be separate and not something I have access to, so the actual episode rips to exactly the same filesize and you just get two identical files at the end.

So as you're ripping, that's going to triple the rip time.

The way I'm trying to fix that is to rip the first 30 seconds of every Title on the disk, then do a SHA sum on those ripped sample files. When a Title finishes ripping, I drop its clip checksum into a "rippedchecksums" file. When the next Title starts, the first thing it does is check to see if its checksum has already been ripped. If it has, skip it. It seems to catch 100% of repeated Titles, and probably 70% of the "Big Concatenated File" cases will match the sum for Title 1. Saves a shitload of time.

In this case, Title 1 is a standalone episode, and Title 21 is the Big Concatenated File of all the episodes on the disk. Title 21 will be skipped. Since I get about 70 or 80 FPS on my Mac Pro, that probably saved 90 minutes of rip time or so with 3 hours of video on the disk:

763b6035c4bf239b4425fb8f484018387574baca /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/1-sample.avi
59cca1b18759647e13e3e1b6a4facace0520fc06 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/10-sample.avi
125add4181b9dc6eee57c32c07568765b8e4483b /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/11-sample.avi
4daae35d014032964fe57e70e2cc3450f7dac4e5 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/12-sample.avi
a942f31a9ee42c5839772f733b2c666195397ad5 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/13-sample.avi
8c9473a940a9bc685d84e0ac29c66f53efa6667d /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/14-sample.avi
29d2200d8c46ac11417119b4b7179e4b526d99cf /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/15-sample.avi
466860b79bba6d132fcc97d6dc7c0c3a20dd771c /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/16-sample.avi
f4ae11cca0752956c4d6025a8760a260a59fe79b /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/17-sample.avi
00753d529f4bbf4081f647056cf44db7c630c198 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/18-sample.avi
b7f9c9087fed6b00d22de5033c153f9ffb3cd3b1 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/19-sample.avi
14efcb6164f1424b894cc28200ab621ec805ecd0 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/2-sample.avi
6c411c8869f1e6bc9a6ec298ba9b6a5c9eefc9ae /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/20-sample.avi
763b6035c4bf239b4425fb8f484018387574baca /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/21-sample.avi
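If you want to eyeball which Titles collide in a listing like that, a short awk filter will do it. This is a sketch, not part of the script: the function name find_duplicate_samples is made up, and it assumes the paths contain no spaces (true here, since the label gets underscores).

```shell
#!/bin/bash
# Read "checksum path" lines on stdin and report every path whose
# checksum already appeared on an earlier line.
find_duplicate_samples() {
    awk 'seen[$1] { print $2 " duplicates " seen[$1]; next }
         { seen[$1] = $2 }'
}

# Three lines taken from the listing above (Titles 1, 2 and 21):
find_duplicate_samples <<'EOF'
763b6035c4bf239b4425fb8f484018387574baca /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/1-sample.avi
14efcb6164f1424b894cc28200ab621ec805ecd0 /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/2-sample.avi
763b6035c4bf239b4425fb8f484018387574baca /Volumes/Filestore/dvdrip-output/DVD_VIDEO-090720202337/21-sample.avi
EOF
```

That reports 21-sample.avi as a duplicate of 1-sample.avi, which is the Big Concatenated File matching Title 1.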

At the end of it, I still end up with just a directory full of files labeled 1 through whatever.avi. I have to take a few seconds per file to get it to "TV Show - S01E01.avi". But from there FileBot can mass-rename them with episode titles.
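The renaming step could be roughed out like this. It's a sketch under a big assumption: that the files you pass in are already in episode order, which the Title numbers on a disk don't guarantee. The function names episode_name and plan_renames are made up, and it only prints the plan rather than moving anything.

```shell
#!/bin/bash
# Hypothetical helper: format "Show - SxxEyy.avi".
# The 10# prefix forces base 10 so inputs like "08" aren't read as octal.
episode_name() {
    local show="$1" season="$2" episode="$3"
    printf '%s - S%02dE%02d.avi\n' "$show" "$((10#$season))" "$((10#$episode))"
}

# Dry run: print "old -> new" for each file, numbering episodes upward
# from first_episode. Nothing is renamed.
plan_renames() {
    local show="$1" season="$2" episode="$3"
    shift 3
    local f
    for f in "$@"; do
        printf '%s -> %s\n' "$f" "$(dirname "$f")/$(episode_name "$show" "$season" "$episode")"
        episode=$((episode + 1))
    done
}

plan_renames "I Love Lucy" 2 1 /tmp/rips/1.avi /tmp/rips/2.avi
```

Once the printed plan looks right, swapping the printf for an mv does the actual rename, and FileBot can take it from there.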

So here's the full ugliness. You'll want to adjust all the paths. I should have made variables, but I don't care, I maybe have 3 or 4 ripping trays running at a time on various machines, so I don't mind just changing the paths for each host. Works on OSX and Linux, and probably Windows with Cygwin, but I don't care about Windows so I'm not going to test it.


#! /bin/bash

timestamp=`date +%m%d%Y%H%M`

id=$(drutil status |grep -m1 -o '/dev/disk[0-9]*')

if [ -z "$id" ]; then
echo "No Media Inserted"
# Nothing to rip, so stop here instead of carrying on with an empty name
exit 1
fi
# Disk label from the mount point, with spaces turned into underscores
name=`df | grep "$id" |grep -o /Volumes.* | awk -F "Volumes\/" '{print $2}' | sed 's/ /_/g'`
echo $name
dir="$name-$timestamp"
mkdir /Volumes/Filestore/dvdrip-output/$dir

maxtitle=`/Applications/mencoder dvd://100 -o bob | grep "titles on this DVD" | awk '{print $3}'`

for title in {1..100}
do
if [ $title -le $maxtitle ]
then
/Applications/mencoder dvd://$title -alang en -ovc lavc -lavcopts vcodec=mpeg4:vhq:vbitrate="1200" -vf scale -zoom -xy 720 -oac mp3lame -lameopts br=128 -endpos 30 -o /Volumes/Filestore/dvdrip-output/$dir/$title-sample.avi
shasum /Volumes/Filestore/dvdrip-output/$dir/$title-sample.avi > /Volumes/Filestore/dvdrip-output/$dir/$title-checksum
touch /Volumes/Filestore/dvdrip-output/$dir/rippedchecksums.txt
fi
done

cat /Volumes/Filestore/dvdrip-output/$dir/*checksum >> /Volumes/Filestore/dvdrip-output/$dir/allchecksums.txt

for title in {1..100}
do
if [ $title -gt $maxtitle ]
then
chmod -R 775 /Volumes/Filestore/dvdrip-output/$dir
sleep 3
drutil tray eject
exit 0
fi
sum=`cat /Volumes/Filestore/dvdrip-output/$dir/$title-checksum | awk '{print $1}'`
match=`grep "$sum" /Volumes/Filestore/dvdrip-output/$dir/rippedchecksums.txt`
# Quoted so the test still behaves when there is no match (empty string)
if [ -z "$match" ]
then
echo "CURRENTLY RIPPING TITLE #$title"
/Applications/mencoder dvd://$title -alang en -ovc lavc -lavcopts vcodec=mpeg4:vhq:vbitrate="1200" -vf scale -zoom -xy 720 -oac mp3lame -lameopts br=128 -o /Volumes/Filestore/dvdrip-output/$dir/$title.avi
echo $sum >> /Volumes/Filestore/dvdrip-output/$dir/rippedchecksums.txt
rm /Volumes/Filestore/dvdrip-output/$dir/$title-checksum
rm /Volumes/Filestore/dvdrip-output/$dir/$title-sample.avi
fi
done
chmod -R 775 /Volumes/Filestore/dvdrip-output/$dir
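
For anyone who would rather not edit every path by hand, here is a rough idea of what pulling them into variables might look like. The names OUTPUT_ROOT, MENCODER and the three path helpers are all hypothetical; the defaults are just the paths hard-coded above.

```shell
#!/bin/bash
# Hypothetical variables; override them per host from the environment.
OUTPUT_ROOT="${OUTPUT_ROOT:-/Volumes/Filestore/dvdrip-output}"
MENCODER="${MENCODER:-/Applications/mencoder}"

# Path helpers so each file location is spelled out exactly once.
# Arguments: output folder name, Title number.
sample_path()   { printf '%s/%s/%s-sample.avi\n' "$OUTPUT_ROOT" "$1" "$2"; }
checksum_path() { printf '%s/%s/%s-checksum\n' "$OUTPUT_ROOT" "$1" "$2"; }
rip_path()      { printf '%s/%s/%s.avi\n' "$OUTPUT_ROOT" "$1" "$2"; }

sample_path "DVD_VIDEO-090720202337" 21
```

With that in place, a different host only needs something like OUTPUT_ROOT=/some/other/place in its environment, and the mencoder lines would use "$MENCODER" and "$(rip_path "$dir" "$title")" instead of the literal paths.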