Charlie Harvey

Copying videos off Vimeo with bash

A lot of video people are hosting on Vimeo these days. It seems well featured, reliable and fast, and it doesn’t have quite the same psychotically disordered commenting culture as has developed at YouTube. However, though there is good stuff to watch on there, not every uploaded video has a download button. You can see what I mean on The Ballad of the Psychotropic Robots: there is a download link for the game but not for the video. It isn’t obvious how to download it to watch later or on your mobile player. Sure, you could use a Firefox extension to accomplish this task. My favourite is DownloadHelper (despite the design of its website). But I’m a commandline kind of guy, and I like to know exactly what is going on. So I cracked out Bash and got working on a one-liner.

An open file conundrum

I started out by trying to work out what happens when I play a Vimeo file on the interwebs. It seemed that the sensible approach for the (nonfree) Flash player to take would be to simply download a local temporary copy of the file and play that. It turns out that this is pretty much the case, except that it then unlinks the temporary file almost immediately. We can use lsof to verify that this is what is happening. I use +L1 because I am looking for files that are open but unlinked, that is, files whose link count is less than 1. First I look at what files are open already. This output is truncated a bit.

    $ lsof +L1
    COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF NLINK     NODE NAME
    gvfs-gpho  2810 charlie   9u    CHR  189,2      0t0     0  3529916 /dev/bus/usb/001/003 (deleted)
    gnome-ter  2858 charlie  50u    REG  254,1       64     0 13852834 /tmp/vteVY28FW (deleted)
    ghc       26110 charlie   6u    REG  254,1     4096     0 13852819 /tmp/ffipO22hU (deleted)
    firefox-b 29460 charlie  64u    REG  254,1   360448     0 14032951 /tmp/mozilla-media-cache/media_cache (deleted)

Now I fire up iceweasel and play a video.

    $ lsof +L1
    COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF NLINK     NODE NAME
    gvfs-gpho  2810 charlie   9u    CHR  189,2      0t0     0  3529916 /dev/bus/usb/001/003 (deleted)
    gnome-ter  2858 charlie  50u    REG  254,1       64     0 13852834 /tmp/vteVY28FW (deleted)
    ghc       26110 charlie   6u    REG  254,1     4096     0 13852819 /tmp/ffipO22hU (deleted)
    firefox-b 29460 charlie  64u    REG  254,1   360448     0 14032951 /tmp/mozilla-media-cache/media_cache (deleted)
    plugin-co 29511 charlie  16w    REG  254,1    39082     0 13852844 /tmp/FlashXXbA7uge (deleted)

The temporary file is the one in the last line there.
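If you want to see the open-but-unlinked state without a Flash plugin to hand, it can be reproduced in a few lines of shell. This is only a rough sketch under my own assumptions: the scratch file name and descriptor number 3 are made up, it relies on a Linux /proc, and stat -c is the GNU coreutils form.

```shell
# Create a scratch file, hold it open on descriptor 3, then unlink it.
tmp=$(mktemp /tmp/demoXXXXXX)   # hypothetical scratch file, not Flash's
echo "still here" > "$tmp"
exec 3< "$tmp"                  # keep a read descriptor open in this shell
rm -- "$tmp"                    # unlink: the name goes away, the data does not

# The path no longer exists...
if [ -e "$tmp" ]; then path_state="present"; else path_state="gone"; fi

# ...but the inode behind descriptor 3 is still alive, with no links left.
# /proc/self is whichever process does the lookup (here stat, which
# inherits descriptor 3 from the shell).
links=$(stat -L -c '%h' /proc/self/fd/3)

echo "path: $path_state, link count: $links"
exec 3<&-                       # closing the descriptor finally frees the file
```

A link count of 0 on a file that is still open is the condition that lsof +L1 filters for.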

The plot thickens

OK, so now I can see that the open file is there, and I can see that a process called plugin-co… with pid 29511 owns the file. I can find out more about the process by grepping the output of ps ax for the pid thus.

    $ ps ax | grep [2]9511
    29511 ? Sl 0:16 /usr/lib/xulrunner-12.0/plugin-container /usr/lib/flashplugin-nonfree/libflashplayer.so -greomni /usr/lib/xulrunner-12.0/omni.ja 29460 plugin

Incidentally, putting square brackets around the first character of grep's argument stops grep from including its own command in the output. The pattern [2]9511 still matches 29511, but grep's own command line contains the literal text [2]9511, which the pattern does not match. Neat eh?
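The bracket trick is easy to check against canned text instead of a live process list. The lines below are invented stand-ins for ps ax output (the 30001 pid and the pts/0 terminal are hypothetical); the sketch just counts how many lines each pattern matches.

```shell
# A shortened stand-in for the real plugin-container line.
proc_line='29511 ?     Sl 0:16 /usr/lib/xulrunner-12.0/plugin-container'

# Plain pattern: grep's own command line contains "29511", so it matches too.
plain=$(printf '%s\n' "$proc_line" '30001 pts/0 S+ 0:00 grep 29511' \
  | grep -c '29511')

# Bracketed pattern: grep's command line now reads "[2]9511", which the
# regular expression [2]9511 (meaning "29511") does not match.
bracket=$(printf '%s\n' "$proc_line" '30001 pts/0 S+ 0:00 grep [2]9511' \
  | grep -c '[2]9511')

echo "plain pattern matched $plain lines, bracketed pattern matched $bracket"
```

Quoting the pattern, as done here, also stops the shell from treating [2]9511 as a glob if a file called 29511 happens to sit in the current directory.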

OK, well we know that the file is listed as being deleted from the lsof. We'll just make double sure by trying to play it directly with mplayer.

    $ mplayer /tmp/FlashXXbA7uge
    MPlayer SVN-r31918 (C) 2000-2010 MPlayer Team

    Playing /tmp/FlashXXbA7uge.
    File not found: '/tmp/FlashXXbA7uge'
    Failed to open /tmp/FlashXXbA7uge.

    Exiting... (End of file)

Schrödinger's file?

There must be a reference to the file somewhere though! Otherwise:

  1. It wouldn't be possible for it to play
  2. It wouldn't be possible for it to be open

At this point we need to remember that each file that a process has open (on a UNIX or GNU/Linux box) will have a file descriptor. It turns out that I can use the file descriptor to talk to the file. To do this I use the proc filesystem. Recall that proc describes the resources of all the running processes on your box. This includes their file descriptors. So I can list the file descriptor directory for the pid that created that temporary file.

    $ ls -al /proc/29511/fd
    total 0
    dr-x------ 2 charlie charlie  0 Jun 30 12:13 .
    dr-xr-xr-x 7 charlie charlie  0 Jun 30 12:13 ..
    lr-x------ 1 charlie charlie 64 Jun 30 12:13 0 -> /dev/null
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 1 -> /home/charlie/.xsession-errors
    l-wx------ 1 charlie charlie 64 Jun 30 12:13 10 -> socket:[15694700]
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 11 -> pipe:[15694701]
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 12 -> pipe:[15694701]
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 13 -> socket:[15694704]
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 14 -> /home/charlie/.mozilla/firefox/9o92hxts.default/cert8.db
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 15 -> /home/charlie/.mozilla/firefox/9o92hxts.default/key3.db
    l-wx------ 1 charlie charlie 64 Jun 30 12:13 16 -> /tmp/FlashXXbA7uge (deleted)
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 2 -> /home/charlie/.xsession-errors
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 3 -> socket:[15694628]
    lrwx------ 1 charlie charlie 64 Jun 30 12:13 4 -> pipe:[15694697]
    lr-x------ 1 charlie charlie 64 Jun 30 12:13 5 -> pipe:[15694697]
    l-wx------ 1 charlie charlie 64 Jun 30 12:13 6 -> pipe:[15694698]
    lr-x------ 1 charlie charlie 64 Jun 30 12:13 7 -> pipe:[15694698]
    l-wx------ 1 charlie charlie 64 Jun 30 12:13 8 -> anon_inode:[eventpoll]
    lr-x------ 1 charlie charlie 64 Jun 30 12:13 9 -> socket:[15694699]

Some interesting resources there. Note that STDIN (i.e. 0) just reads from /dev/null and that STDOUT (1) and STDERR (2) go to an error log in my home directory. But the one I really care about is 16, which is pointing at my deleted file. Now I can play the video with mplayer rather than that horrible nonfree flashplayer!

    $ mplayer /proc/29511/fd/16
    MPlayer SVN-r31918 (C) 2000-2010 MPlayer Team

    Playing /proc/29511/fd/16.
    libavformat file format detected.
    [flv @ 0x22eef20] Estimating duration from bitrate, this may be inaccurate
    [lavf] stream 0: video (vp6f), -vid 0
    [lavf] stream 1: audio (mp3), -aid 0
    VIDEO: [VP6F] 320x240 0bpp 30.000 fps 345.6 kbps (42.2 kbyte/s)
    Clip info:
     duration: 56
     lasttimestamp: 56
     datasize: 2891345
     metadatacreator: FlixEngineLinux_8.0.5.0 (www.on2.com)
     canSeekToEnd: 1
     videocodecid: 4
     width: 320
     height: 240
     videodatarate: 337
     framerate: 30
     videosize: 2439741
     audiocodecid: 2
     audiodatarate: 63
     audiosize: 451604
    open: No such file or directory
    [MGA] Couldn't open: /dev/mga_vid
    open: No such file or directory
    [MGA] Couldn't open: /dev/mga_vid
    [VO_TDFXFB] This driver only supports the 3Dfx Banshee, Voodoo3 and Voodoo 5.
    s3fb: Couldn't map S3 registers: Operation not permitted
    Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
    [vdpau] Error when calling vdp_device_create_x11: 1
    ==========================================================================
    Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
    Selected video codec: [ffvp6f] vfm: ffmpeg (FFmpeg VP6 Flash)
    ==========================================================================
    ==========================================================================
    Opening audio decoder: [mp3lib] MPEG layer-2, layer-3
    AUDIO: 44100 Hz, 2 ch, s16le, 64.0 kbit/4.54% (ratio: 8000->176400)
    Selected audio codec: [mp3] afm: mp3lib (mp3lib MPEG layer-2, layer-3)
    ==========================================================================
    AO: [alsa] 48000Hz 2ch s16le (2 bytes per sample)
    Starting playback...
    Movie-Aspect is undefined - no prescaling applied.
    VO: [xv] 320x240 => 320x240 Planar YV12
    A: 0.1 V: 0.0 A-V: 0.033 ct: 0.000 0/ 0 ??% ??% ??,?% 0 0
    [VD_FFMPEG] DRI failure.
    A: 4.0 V: 4.0 A-V: 0.000 ct: -0.000 0/ 0 8% 0% 1.3% 0 0

    Exiting... (Quit)
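The same descriptor path works for any reader, not just mplayer. As a small sketch of the idea (Linux-only, with a made-up scratch file and descriptor 4 standing in for the plugin's descriptor 16), here is a deleted file being copied back out through /proc:

```shell
# Stand-in for the Flash temp file: write it, hold it open, unlink it.
tmp=$(mktemp /tmp/demoXXXXXX)          # hypothetical scratch file
printf 'pretend this is video data\n' > "$tmp"
exec 4< "$tmp"                         # descriptor 4 keeps the data alive
rm -- "$tmp"

# Copy the contents back out via the descriptor's entry in /proc.
# /proc/self works here because cp inherits descriptor 4 from the shell;
# for somebody else's process you would use /proc/[pid]/fd/[n] instead.
out=$(mktemp /tmp/recoveredXXXXXX)     # hypothetical destination
cp /proc/self/fd/4 "$out"

recovered=$(cat "$out")
echo "recovered: $recovered"
exec 4<&-                              # close the descriptor
```

Reading another process's descriptors this way normally requires being the same user as that process, or root.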

The one liner

Now, playing videos from file descriptors is a little ungainly; I’d like to copy the data into a convenient regular file. In this case I wrapped up all that we’ve learned into a Bash one-liner. First we need the pid of the flash process. It turns out that in all the experiments I did the string Flash was present in the filename, which makes it possible to grep it out of the rest of the files.

    $ lsof +L1 | grep -i flash
    plugin-co 29511 charlie  16w    REG  254,1  2955068     0 13852844 /tmp/FlashXXbA7uge (deleted)

To extract the pid, I first use tr -s to squish each run of spaces into a single space. Then I can use cut with ' ' as the field delimiter and grab the second field (cut numbers its fields from 1, and the pid comes straight after the command name). I only need one copy of the pid, so I assume that the first line is correct and grab that with head -n1.

    $ lsof +L1 | grep -i flash | tr -s ' ' ' ' | cut -d ' ' -f 2 | head -n1
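The field numbering is easy to check against the sample lsof line from above. The sketch below runs the tr and cut pipeline on that line and, purely as an alternative of my own rather than anything the one-liner uses, does the same job with awk, which splits on runs of whitespace by itself.

```shell
# The lsof line from the article, padding spaces included.
line='plugin-co 29511 charlie   16w    REG  254,1  2955068     0 13852844 /tmp/FlashXXbA7uge (deleted)'

# Squeeze repeated spaces, then take field 2 (cut counts fields from 1).
pid_cut=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2 | head -n1)

# awk equivalent: print the second field of the first line and stop.
pid_awk=$(printf '%s\n' "$line" | awk '{ print $2; exit }')

echo "cut gives $pid_cut, awk gives $pid_awk"
```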

Now that I have my pid, I can use find to track down the file descriptor. The reason for using find is that the same numbered file descriptor is not always used for the temporary file. Find will give me all of the file descriptors pointing to flash files for my process. I backtick my previous code, which gets evaluated first and returns our pid. This is then used as part of the path that find is going to search: /proc/[pid]/fd. We tell find to look for symbolic links where the target has flash anywhere in its name, ignoring case (i.e. /tmp/FlashXXbA7uge).

Now things get a little hairy at this point. Find might bring back more than one file descriptor. So, we will need to use the name of the file descriptor as part of the name to which we're going to copy. As far as I know the only way to do this within an exec is to fork a new Bash. Interested to hear if anyone knows a more ninja way to do this! I invoke bash with {} as an argument. Find’s exec will translate {} into the full filepath of the file descriptor, and because it is the first argument after the command string, bash receives it as $0. The line noise ${0##*/} strips everything up to the last slash, leaving just the file part of the filename. I assign that to a variable x. Finally we tell find to execute the command cp [file] to a new file (vimeo-$x.flv). I want to end up with files like vimeo-16.flv and so on. Here goes.

    $ find /proc/`lsof +L1 | grep -i flash | tr -s ' ' ' ' | cut -d ' ' -f 2 | head -n1`/fd/ -ilname '*flash*' -exec bash -c 'x=${0##*/}; cp $0 vimeo-$x.flv' {} \;
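For anyone who wants to poke at the find and exec part without a Flash plugin running, here is a rough sketch against a scratch directory. The directory layout and file names are invented for illustration, and I have swapped in sh -c with a placeholder in the $0 slot so that the link path arrives as $1, which is a variation on the one-liner rather than a copy of it. The -ilname test needs GNU find.

```shell
# Scratch layout standing in for /proc/[pid]/fd (all names here are made up).
d=$(mktemp -d)
mkdir "$d/fd" "$d/out"
echo "video bytes" > "$d/FlashXXdemo"
echo "not a video" > "$d/other"
ln -s ../FlashXXdemo "$d/fd/16"   # link whose target mentions Flash
ln -s ../other "$d/fd/17"         # link that should be ignored

# -ilname matches against the link target, ignoring case. The first argument
# after the script text ("sh") fills $0, so the link path arrives as $1 and
# the output directory as $2. ${1##*/} strips everything up to the last slash.
find "$d/fd" -ilname '*flash*' \
  -exec sh -c 'cp "$1" "$2/vimeo-${1##*/}.flv"' sh {} "$d/out" \;

ls "$d/out"
```

With this layout the listing should contain only vimeo-16.flv. Quoting "$1" and "$2" keeps the copy working even if a path contains spaces.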

I am now finally able to retrieve the files from the file descriptors and play the videos with mplayer! Observant readers will notice that I’d watched a couple more videos by the time I ran my commands, hence vimeo-17.flv and vimeo-18.flv.

    $ ls *flv
    vimeo-16.flv vimeo-17.flv vimeo-18.flv
    $ file vimeo-16.flv
    vimeo-16.flv: Macromedia Flash Video

