
How to download Apple developer videos

Command-line scripts to smartly download WWDC, Tech Talks and other videos from Apple's developer portal

Videos on Apple’s developer portal are an amazing resource. They are easy to search, have loads of content, and are updated over time with new useful resources. So why would you want to download them locally?

Apart from sometimes being easier to scrub through, depending on the internet connection you have at hand, there is another benefit. Apple is known to sometimes remove older videos when they deem the information in them outdated. But you may find yourself in a situation where you need just that specific older piece of information, maybe to debug an issue on older iOS versions. Or sometimes there is no equivalent newer documentation for the covered material.

The scripts I have been using for a few years are based on another script I can’t find now, tweaked to work properly with the various HTML templates Apple is using. Unless Apple drastically alters their templates, it’s fairly easy to tweak these scripts for new content. They automatically download the HD version of each video and the accompanying PDF of the slides, if one exists. Each created file is named using the session's numerical code and title, say 201 - Title.mp4, so the files sort properly by name.
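
The naming scheme can be sketched as a small helper. This is not part of the scripts below: make_filename is a hypothetical name, the talk title is made up, and replacing "/" with "-" is my own assumption, added because a slash in a title would otherwise be read as a folder separator.

```shell
#!/bin/bash

# Hypothetical helper: build "<session> - <title>.<extension>".
# $1 = session code, $2 = talk title, $3 = file extension
make_filename() {
	# Assumption: swap "/" for "-" so the title cannot be mistaken for a path
	safeTitle=$(printf '%s' "$2" | sed 's|/|-|g')
	printf '%s - %s.%s\n' "$1" "$safeTitle" "$3"
}

# Made-up title, for illustration only
make_filename 201 "Audio/Video Basics" mp4
# prints: 201 - Audio-Video Basics.mp4
```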


WWDC videos are grouped per year and I like to keep them that way. Make a folder called wwdc2018, create a new text file called fetch.sh inside it and paste this into it.

#!/bin/bash

#Setup the environment
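mkdir -p tmp_download
#Stop here if the folder is not usable, otherwise the cleanup at the end runs in the wrong place
cd tmp_download || exit 1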

#Extract IDs
echo "Downloading the index"
wget -q https://developer.apple.com/videos/wwdc2018/ -O index.html
#	find parts of the document where data-released=true, all the way to the first H4 header where title of that talk is 
#	then find lines containing "videos/play/wwdc2018", then remove all chars except session number, then clean duplicated lines
#	< and > are left unescaped, GNU sed would read \< and \> as word boundaries
sed -n '/data-released="true"/,/<h4/p' index.html | grep videos/play/wwdc2018 | sed -e 's/.*wwdc2018\///' -e 's/\/">//' | sed '$!N; /^\(.*\)\n\1$/!P; D' > ../downloadData

rm index.html

#Iterate through the talk IDs
while read -r line
do
	#Download the page with the real download URL and the talk name
	wget -q "https://developer.apple.com/videos/play/wwdc2018/$line/" -O webpage

	#We grab the title of the page then clean it up
	talkName=$(grep "<title" webpage | sed -e 's/.*<title>//' -e 's/ - WWDC 2018.*//')

	#We grep "_hd_" which brings up the download URL, then some cleanup
	#If we wanted SD video, all we would have to do is replace _hd_ with _sd_
	dlURL=$(grep _hd_ webpage | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g')
	pdfURL=$(grep '\.pdf' webpage | grep devstreaming | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g' -e 's/ .*$//')

	rm webpage

	#Is there a video URL?
	if [ -z "$dlURL" ]; then
		echo
	else
		echo "Video $line ($talkName)"
		echo "	url: $dlURL"
		#Great, we download the file
		wget -c "$dlURL" -O "../$line - $talkName.mp4"
	fi

	#Is there a PDF URL?
	if [ -z "$pdfURL" ]; then
		echo
	else
		echo "PDF $line ($talkName)"
		echo "	url: $pdfURL"
		#Great, we download the file
		wget -c "$pdfURL" -O "../$line - $talkName.pdf"
	fi

done < "../downloadData"

#cleanup
cd ..
rm -rf tmp_download
rm downloadData
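
To see what the ID extraction line does without hitting Apple's servers, here is a minimal sketch that runs the same four stages against a sample file. The markup is made up for illustration and only mimics the attributes the patterns look for; the real index page will differ in detail.

```shell
#!/bin/bash

# Made-up sample markup for illustration; Apple's real index page differs in detail.
sampleIndex=$(mktemp "${TMPDIR:-/tmp}/index.XXXXXX")
cat > "$sampleIndex" <<'EOF'
<li data-released="true">
<a href="/videos/play/wwdc2018/201/">
<img src="thumbnail.jpg">
</a>
<a href="/videos/play/wwdc2018/201/">
<h4>First Talk</h4>
</a>
</li>
<li data-released="false">
<a href="/videos/play/wwdc2018/999/">
<h4>Unreleased Talk</h4>
</a>
</li>
<li data-released="true">
<a href="/videos/play/wwdc2018/202/">
<h4>Second Talk</h4>
</a>
</li>
EOF

# 1. keep only the ranges from data-released="true" to the next <h4
# 2. keep the lines that link to a talk
# 3. strip everything except the session number
# 4. drop adjacent duplicate lines
# < and > are left unescaped: GNU sed reads \< and \> as word boundaries.
ids=$(sed -n '/data-released="true"/,/<h4/p' "$sampleIndex" |
	grep videos/play/wwdc2018 |
	sed -e 's/.*wwdc2018\///' -e 's/\/">//' |
	sed '$!N; /^\(.*\)\n\1$/!P; D')
rm "$sampleIndex"

echo "$ids"
# prints:
# 201
# 202
```

The unreleased talk (999) is skipped and the doubled link to 201 collapses into one line. If Apple changes the markup, adjusting the sample and the patterns side by side is a quick way to retest.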

When you want to switch from 2018 to 2019 (or 2017 or whichever), make a new folder, copy this same fetch.sh into it and search & replace 2018 with the appropriate year.
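
The copy and the search & replace can also be done from the command line. A minimal sketch, assuming the year folders sit next to each other and follow the wwdcYYYY naming; clone_fetch_script is a hypothetical helper name, not something the scripts define.

```shell
#!/bin/bash

# Hypothetical helper: copy one year's fetch.sh into a folder for another year,
# replacing every occurrence of the old year with the new one.
# $1 = existing year, $2 = new year (expects wwdc$1/fetch.sh to exist)
clone_fetch_script() {
	fromYear="$1"
	toYear="$2"
	mkdir -p "wwdc$toYear" || return 1
	sed "s/$fromYear/$toYear/g" "wwdc$fromYear/fetch.sh" > "wwdc$toYear/fetch.sh"
}
```

Run from the folder that contains wwdc2018, clone_fetch_script 2018 2019 would leave a ready-to-run wwdc2019/fetch.sh. It replaces every 2018 in the file, exactly like a manual search & replace would.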


The Tech Talks script is almost identical, with an altered URL path and web page title.

#!/bin/bash

#Setup the environment
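mkdir -p tmp_download
#Stop here if the folder is not usable, otherwise the cleanup at the end runs in the wrong place
cd tmp_download || exit 1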

#Extract IDs
echo "Downloading the index"
wget -q https://developer.apple.com/videos/tech-talks/ -O index.html
#	find parts of the document where data-released=true, all the way to the first H4 header where title of that talk is 
#	then find lines containing "videos/play/tech-talks", then remove all chars except session number, then clean duplicated lines
#	< and > are left unescaped, GNU sed would read \< and \> as word boundaries
sed -n '/data-released="true"/,/<h4/p' index.html | grep videos/play/tech-talks | sed -e 's/.*tech-talks\///' -e 's/\/">//' | sed '$!N; /^\(.*\)\n\1$/!P; D' > ../downloadData

rm index.html

#Iterate through the talk IDs
while read -r line
do
	#Download the page with the real download URL and the talk name
	wget -q "https://developer.apple.com/videos/play/tech-talks/$line/" -O webpage

	#We grab the title of the page then clean it up
	talkName=$(grep "<title" webpage | sed -e 's/.*<title>//' -e 's/ - Tech Talks.*//')

	#We grep "_hd" which brings up the download URL, then some cleanup
	#If we wanted SD video, all we would have to do is replace _hd with _sd
	dlURL=$(grep _hd webpage | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g')
	pdfURL=$(grep '\.pdf' webpage | grep devstreaming | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g' -e 's/ .*$//')

	rm webpage

	#Is there a video URL?
	if [ -z "$dlURL" ]; then
		echo
	else
		echo "Video $line ($talkName)"
		echo "	url: $dlURL"
		#Great, we download the file
		wget -c "$dlURL" -O "../$line - $talkName.mp4"
	fi

	#Is there a PDF URL?
	if [ -z "$pdfURL" ]; then
		echo
	else
		echo "PDF $line ($talkName)"
		echo "	url: $pdfURL"
		#Great, we download the file
		wget -c "$pdfURL" -O "../$line - $talkName.pdf"
	fi

done < "../downloadData"

#cleanup
cd ..
rm -rf tmp_download
rm downloadData

The App Store Connect script, again with just slight changes to the URL and web page title.

#!/bin/bash

#Setup the environment
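mkdir -p tmp_download
#Stop here if the folder is not usable, otherwise the cleanup at the end runs in the wrong place
cd tmp_download || exit 1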

#Extract IDs
echo "Downloading the index"
wget -q https://developer.apple.com/videos/app-store-connect/ -O index.html
#	find parts of the document where data-released=true, all the way to the first H4 header where title of that talk is 
#	then find lines containing "videos/play/app-store-connect", then remove all chars except session number, then clean duplicated lines
#	< and > are left unescaped, GNU sed would read \< and \> as word boundaries
sed -n '/data-released="true"/,/<h4/p' index.html | grep videos/play/app-store-connect | sed -e 's/.*app-store-connect\///' -e 's/\/">//' | sed '$!N; /^\(.*\)\n\1$/!P; D' > ../downloadData

rm index.html

#Iterate through the talk IDs
while read -r line
do
	#Download the page with the real download URL and the talk name
	wget -q "https://developer.apple.com/videos/play/app-store-connect/$line/" -O webpage

	#We grab the title of the page then clean it up
	talkName=$(grep "<title" webpage | sed -e 's/.*<title>//' -e 's/ - App Store Connect.*//')

	#We grep "_hd" which brings up the download URL, then some cleanup
	#If we wanted SD video, all we would have to do is replace _hd with _sd
	dlURL=$(grep _hd webpage | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g')
	pdfURL=$(grep '\.pdf' webpage | grep devstreaming | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g' -e 's/ .*$//')

	rm webpage

	#Is there a video URL?
	if [ -z "$dlURL" ]; then
		echo
	else
		echo "Video $line ($talkName)"
		echo "	url: $dlURL"
		#Great, we download the file
		wget -c "$dlURL" -O "../$line - $talkName.mp4"
	fi

	#Is there a PDF URL?
	if [ -z "$pdfURL" ]; then
		echo
	else
		echo "PDF $line ($talkName)"
		echo "	url: $pdfURL"
		#Great, we download the file
		wget -c "$pdfURL" -O "../$line - $talkName.pdf"
	fi

done < "../downloadData"

#cleanup
cd ..
rm -rf tmp_download
rm downloadData
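
The three scripts differ only in the URL path, the title suffix and the HD marker, so the per-talk parsing could be pulled into one function. A minimal sketch, assuming the page layout the scripts rely on; parse_talk_page is a hypothetical name, and the sample page and its example.com URLs are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: pull the title, video URL and PDF URL out of a saved talk page.
# $1 = file with the page HTML
# $2 = title suffix to strip, e.g. "WWDC 2018", "Tech Talks" or "App Store Connect"
# $3 = marker identifying the HD link, e.g. "_hd_" or "_hd"
parse_talk_page() {
	talkName=$(grep "<title" "$1" | sed -e 's/.*<title>//' -e "s/ - $2.*//")
	dlURL=$(grep -e "$3" "$1" | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g')
	pdfURL=$(grep '\.pdf' "$1" | grep devstreaming | sed -e 's/.*href=//' -e 's/>.*//' -e 's/"//g' -e 's/ .*$//')
}

# Made-up sample page; the host name and paths are placeholders.
samplePage=$(mktemp "${TMPDIR:-/tmp}/talkpage.XXXXXX")
cat > "$samplePage" <<'EOF'
<title>Sample Talk - Tech Talks - Videos - Apple Developer</title>
<li class="download"><a href="https://devstreaming-cdn.example.com/videos/sample/sample_hd.mp4?dl=1">HD Video</a></li>
<li class="document"><a href="https://devstreaming-cdn.example.com/videos/sample/sample.pdf" target="_blank">Presentation Slides (PDF)</a></li>
EOF

parse_talk_page "$samplePage" "Tech Talks" "_hd"
rm "$samplePage"

echo "title:  $talkName"
echo "video:  $dlURL"
echo "slides: $pdfURL"
```

With something like this in place, each fetch.sh would only need to set the collection name, the suffix and the marker before entering the loop.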

Usage

Open the desired folder in Terminal and type sh fetch.sh; make sure your computer does not go to sleep. You can stop and restart as you please. You can run these scripts many times; because wget is called with -c, it checks the local files and resumes where it left off. The scripts use wget; the version I have is:

$ wget --version
GNU Wget 1.19.4 built on darwin17.3.0.

I got it through Homebrew, like everything else command-line related.
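
Since the scripts quietly produce nothing when wget is missing, a small pre-flight check can save some head-scratching. A minimal sketch; require_tool is a hypothetical helper, not part of the scripts above.

```shell
#!/bin/bash

# Hypothetical pre-flight check: returns 0 when the named command is on the PATH,
# otherwise prints a note to stderr and returns 1.
require_tool() {
	if command -v "$1" >/dev/null 2>&1; then
		return 0
	fi
	echo "$1 not found on PATH" >&2
	return 1
}

if require_tool wget; then
	echo "wget found: $(command -v wget)"
else
	echo "Install it first, for example with Homebrew: brew install wget"
fi
```

In fetch.sh the same check could sit right after the shebang as require_tool wget || exit 1. On macOS, starting the script as caffeinate -i sh fetch.sh should also keep the machine from idle-sleeping while the downloads run.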

Have fun. This is really useful today, since Apple just updated the Tech Talks section with iPhone Xs and Watch Series 4 content.