Python Network Programming: Building URLs

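The requests module exposes URL helpers through requests.compat, which re-exports the standard library functions urljoin and urlparse. With them we can build URLs and manipulate their values dynamically: any part of a URL, such as a path segment, can be retrieved programmatically and replaced with a new value to build a new URL.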

Building URLs

The following example uses the urljoin function to resolve relative references against a base URL. Depending on the reference, urljoin moves up the path, replaces the path, or adds a query string or fragment to the base URL.

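from requests.compat import urljoin

base = 'https://stackoverflow.com/questions/3764291'
# '.' resolves to the directory that contains the last path segment
print(urljoin(base, '.'))
# '..' moves up one more level
print(urljoin(base, '..'))
# '...' is not special; it is treated as an ordinary path segment
print(urljoin(base, '...'))
# A reference with a leading slash replaces the whole path
print(urljoin(base, '/3764299/'))
# A reference starting with '?' adds a query string
url_query = urljoin(base, '?vers=1.0')
print(url_query)
# A reference starting with '#' adds a fragment
url_sec = urljoin(url_query, '#section-5.4')
print(url_sec)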

When we run the above program, we get the following output:

https://stackoverflow.com/questions/
https://stackoverflow.com/
https://stackoverflow.com/questions/...
https://stackoverflow.com/3764299/
https://stackoverflow.com/questions/3764291?vers=1.0
https://stackoverflow.com/questions/3764291?vers=1.0#section-5.4
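
One detail the output above hints at is that urljoin treats the last path segment of the base as a file name unless the base ends with a slash. The short sketch below illustrates this. It imports urljoin from the standard library's urllib.parse (the same function that requests.compat re-exports on Python 3), and the example.com URLs are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URLs, used only for illustration.
base_no_slash = 'https://example.com/api/v1'
base_slash = 'https://example.com/api/v1/'

# Without a trailing slash, the last segment ('v1') is replaced.
without_slash = urljoin(base_no_slash, 'users')
# With a trailing slash, the reference is appended below 'v1/'.
with_slash = urljoin(base_slash, 'users')
# A leading slash in the reference replaces the entire path.
absolute_path = urljoin(base_slash, '/users')

print(without_slash)   # https://example.com/api/users
print(with_slash)      # https://example.com/api/v1/users
print(absolute_path)   # https://example.com/users
```

When a base URL is meant to act as a directory, ending it with a slash is usually the safer choice.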

Splitting URLs

A URL can also be split into its components. The urlparse function separates the scheme, network location, path, parameters, query string, and fragment, so a query or a fragment appended to the URL can be read on its own, as shown below.

from requests.compat import urlparse

url1 = 'https://docs.python.org/2/py-modindex.html#cap-f'  # URL with a fragment
url2 = 'https://docs.python.org/2/search.html?q=urlparse'  # URL with a query string
print(urlparse(url1))
print(urlparse(url2))

When we run the above program, we get the following output:

ParseResult(scheme='https', netloc='docs.python.org', path='/2/py-modindex.html', params='', query='', fragment='cap-f')
ParseResult(scheme='https', netloc='docs.python.org', path='/2/search.html', params='', query='q=urlparse', fragment='')
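
The parsed result can also be used to replace parts of a URL and build a new one, as mentioned in the introduction. The following is a minimal sketch of one way to do that with standard library functions from urllib.parse. The replacement path '/3/search.html' and the new query value are assumptions made up for this illustration, not values from the example above.

```python
from urllib.parse import urlparse, urlunparse, parse_qs, urlencode

url = 'https://docs.python.org/2/search.html?q=urlparse'
parts = urlparse(url)

# Each component is available as a named attribute.
scheme = parts.scheme
host = parts.netloc
# parse_qs turns the query string into a dict of lists.
query = parse_qs(parts.query)

# ParseResult is a named tuple, so _replace returns a modified copy.
# The new path and query value are hypothetical, for illustration only.
new_parts = parts._replace(path='/3/search.html',
                           query=urlencode({'q': 'urljoin'}))
new_url = urlunparse(new_parts)

print(scheme, host, query)
print(new_url)   # https://docs.python.org/3/search.html?q=urljoin
```

Using urlencode for the query string takes care of escaping any special characters in the parameter values.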
