Yumi's Blog

Extract URLs for the pictures in a Flickr public album via python

When you host your own website on a third-party cloud, e.g. Heroku, there is a restriction on the size of the data you can upload. For example, Heroku only allows 500MB of storage. This can become a tight constraint if you want to add pictures to your website, as high-resolution pictures nowadays can easily be about 10MB each.

Instead of uploading photos to these clouds together with your .html and .css files, it may be wiser to use an image hosting service such as Flickr or Instagram: you can upload images to the image hosting service, make them public, and then simply link to the photos from your webpage.

In this blog post, I will explore this approach and show how to extract picture URLs from a Flickr public album using python.

I assume that you know the user_id of the owner of the Flickr public album. A user_id is most likely of the form 123456789@N12; for example, my user id is 157237655@N08.

See here to find the user_id of Flickr users.


Step 0: Get an API key to make requests

Before you can make a request with the Flickr API, you'll need an API key (free). Follow the instructions here. When you register an app, you're given a key and a secret.
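Once you have a key, every REST call is just an HTTP GET with the method name, your key, and the other parameters in the query string. As a minimal sketch (with a made-up placeholder key), you can assemble such a request URL with the standard library's urlencode, which also takes care of percent-escaping; flickr.test.echo is a convenient method for a first smoke test since it simply echoes the parameters back:

```python
from urllib.parse import urlencode

API_ROOT = "https://api.flickr.com/services/rest/"
api_key = "0123456789abcdef0123456789abcdef"  # placeholder: substitute your own key

def build_request_url(method, **params):
    # urlencode percent-escapes special characters (e.g. "@" becomes "%40")
    query = urlencode({"method": method, "api_key": api_key,
                       "format": "json", "nojsoncallback": 1, **params})
    return API_ROOT + "?" + query

# flickr.test.echo echoes the parameters back, so it is a cheap way
# to confirm that your key is accepted
print(build_request_url("flickr.test.echo", foo="bar"))
```

This is only a sketch of the request format; the code below builds the same kind of URL by hand with string concatenation.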

Step 1: Get photoset id via flickr.photosets.getList endpoint

To extract photos from Flickr's public albums, I need to know a photoset_id. This is an identifier for the album.

In [1]:
import requests
import json, sys
from personal import flikr_api_key as api_key

def get_requestURL(user_id, endpoint="getList"):
    user_id = user_id.replace("@", "%40")  # percent-escape the "@" in the user_id
    url = ("https://api.flickr.com/services/rest/?method=flickr.photosets." +
           endpoint +
           "&api_key=" + api_key +
           "&user_id=" + user_id +
           "&format=json&nojsoncallback=1")  # return raw JSON rather than a JSONP wrapper
    return url

user_id = "157237655@N08"
url = get_requestURL(user_id, endpoint="getList")
strlist = requests.get(url).content
json_data = json.loads(strlist)
albums = json_data["photosets"]["photoset"]

print("{} albums found for user_id={}".format(len(albums),user_id))
14 albums found for user_id=157237655@N08

Let's look at some of the album titles

In [2]:
photosetids, titles = [], []
for album in albums:
    photosetids.append(album["id"])            # used later to request each album's photos
    titles.append(album["title"]["_content"])
    print("album title={} photoset_id={}".format(album["title"]["_content"], album["id"]))
album title=5/5/2018 Day12 photoset_id=72157666947668397
album title=5/7/2018 Day14 photoset_id=72157695104270601
album title=5/6/2018 Day13 photoset_id=72157668988962728
album title=5/4/2018 Day11 photoset_id=72157695104114951
album title=5/2/2018 Day9 photoset_id=72157696204993104
album title=5/3/2018 Day10 photoset_id=72157696204977184
album title=4/28/2018 Day5 photoset_id=72157693696318772
album title=4/29/2018 Day6 photoset_id=72157695104025101
album title=4/30/2018 Day7 photoset_id=72157666947100257
album title=5/1/2018 Day8 photoset_id=72157695104003731
album title=4/27/2018 Day4 photoset_id=72157693696247802
album title=4/26/2018 Day3 photoset_id=72157695103914511
album title=4/25/2018 Day2 photoset_id=72157666947001577
album title=4/24/2018 Day1 photoset_id=72157668988636988
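Notice that getList does not return the albums in chronological order. Since these album titles happen to start with a date, one way to order them is to parse that date with datetime.strptime; a small sketch, using hypothetical titles in the same "M/D/YYYY DayN" format as above:

```python
from datetime import datetime

# hypothetical titles in the same "M/D/YYYY DayN" format as above
sample_titles = ["5/5/2018 Day12", "4/24/2018 Day1", "4/28/2018 Day5"]

def title_date(title):
    # the date is the first whitespace-separated token of the title
    return datetime.strptime(title.split()[0], "%m/%d/%Y")

print(sorted(sample_titles, key=title_date))
# → ['4/24/2018 Day1', '4/28/2018 Day5', '5/5/2018 Day12']
```

The same key function could be used to sort the (photoset_id, title) pairs together, e.g. with sorted(zip(photosetids, titles), key=lambda p: title_date(p[1])).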

Step 2: For each album, extract information about all the photos.

In order to find the unique URL to each photo, I need to know:

  • farm ID
  • server ID
  • ID
  • secret

Such information is extracted using the flickr.photosets.getPhotos API.

In [3]:
def get_photo_url(farmId, serverId, Id, secret):
    return ("https://farm" + str(farmId) +
            ".staticflickr.com/" + str(serverId) +
            "/" + str(Id) + "_" + secret + ".jpg")

URLs = {}
for photoset_id, title in zip(photosetids, titles): ## for each album
    url = get_requestURL(user_id, endpoint="getPhotos") + "&photoset_id=" + photoset_id
    strlist = requests.get(url).content
    json1_data = json.loads(strlist)
    urls = []
    for pic in json1_data["photoset"]["photo"]: ## for each picture in an album
        urls.append(get_photo_url(pic["farm"],pic['server'], pic["id"], pic["secret"]))
    URLs[photoset_id] = urls
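As an aside, Flickr's static photo URL also accepts an optional size suffix before the extension (e.g. "_t" for a thumbnail, "_m" for small, "_b" for large; no suffix gives the 500px medium size). A variant of get_photo_url taking the suffix as a parameter might look like the sketch below; the farm/server/id/secret values are made up purely for illustration:

```python
def get_photo_url_sized(farmId, serverId, Id, secret, size=None):
    # size: optional Flickr size suffix, e.g. "t" (thumbnail), "m" (small),
    # "b" (large); omitting it returns the default 500px medium image
    suffix = "_" + size if size else ""
    return "https://farm{}.staticflickr.com/{}/{}_{}{}.jpg".format(
        farmId, serverId, Id, secret, suffix)

# made-up farm/server/id/secret values, purely for illustration
print(get_photo_url_sized(5, "4571", "38217907204", "1d97ab3cbd", size="b"))
# → https://farm5.staticflickr.com/4571/38217907204_1d97ab3cbd_b.jpg
```

Requesting a small size is handy for the thumbnail grid below, since it avoids downloading the full-resolution files.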

Finally, let's display the photos extracted from the first 4 albums to check that the code is working.

In [4]:
from IPython.display import Image, display

count = 1
for i, (photoset_id, urls) in enumerate(URLs.items()):
    print("{}, photoset_id={}".format(titles[i],photoset_id))
    for url in urls:
        display(Image(url= url, width=200, height=200))
    count += 1
    if count > 4:
        break
5/5/2018 Day12, photoset_id=72157666947668397
5/7/2018 Day14, photoset_id=72157695104270601
5/6/2018 Day13, photoset_id=72157668988962728
5/4/2018 Day11, photoset_id=72157695104114951