HTTP & Web Servers
https://classroom.udacity.com/courses/ud303
Lesson 2: The Web from Python
Python’s http.server
In the exercises in this lesson, you’ll be writing code that runs on your own computer. You’ll need the starter code that you downloaded at the end of the last lesson, which should be in a directory called course-ud303.
Servers and handlers
- the HTTPServer class
  - is built in to the http.server module and is the same for every web service
  - knows how to listen on a port and accept HTTP requests from clients
- the request handler
  - is different for every web service
What Python code needs to do:
- Import http.server, or at least the pieces of it that you need.
- Create a subclass of http.server.BaseHTTPRequestHandler. This is your handler class.
- Define a method on the handler class for each HTTP verb you want to handle.
  - The method for GET requests has to be called do_GET.
  - Inside the method, call built-in methods of the handler class to read the HTTP request and write the response.
- Create an instance of http.server.HTTPServer, giving it your handler class and server information, particularly the port number.
- Call the HTTPServer instance’s serve_forever method.
Exercise: The hello server
/course-ud303/Lesson-2/0_HelloServer
from http.server import HTTPServer, BaseHTTPRequestHandler


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # First, send a 200 OK response.
        self.send_response(200)

        # Then send headers.
        self.send_header('Content-type', 'text/plain; charset=utf-8')
        self.end_headers()

        # Now, write the response body.
        self.wfile.write("Hello, HTTP!\n".encode())


if __name__ == '__main__':
    server_address = ('', 8000)  # Serve on all addresses, port 8000.
    httpd = HTTPServer(server_address, HelloHandler)
    httpd.serve_forever()
Exercise: The echo server
/course-ud303/Lesson-2/1_EchoServer
#!/usr/bin/env python3
#
# This is the solution code for the *echo server*.

from http.server import HTTPServer, BaseHTTPRequestHandler


class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # First, send a 200 OK response.
        self.send_response(200)

        # Then send headers.
        self.send_header('Content-type', 'text/plain; charset=utf-8')
        self.end_headers()

        # Now, write the response body: the request path without its leading "/".
        self.wfile.write(self.path[1:].encode())


if __name__ == '__main__':
    server_address = ('', 8000)  # Serve on all addresses, port 8000.
    httpd = HTTPServer(server_address, EchoHandler)
    httpd.serve_forever()
Queries and quoting
The query part of the URI is the part after the ? mark. Conventionally, query parameters are written as key=value and separated by & signs.
Python library: urllib.parse
https://docs.python.org/3/library/urllib.parse.html
>>> from urllib.parse import urlparse, parse_qs
>>> address = 'https://www.google.com/search?q=gray+squirrel&tbm=isch'
>>> parts = urlparse(address)
>>> print(parts)
ParseResult(scheme='https', netloc='www.google.com', path='/search', params='', query='q=gray+squirrel&tbm=isch', fragment='')
>>> print(parts.query)
q=gray+squirrel&tbm=isch
>>> query = parse_qs(parts.query)
>>> query
{'q': ['gray squirrel'], 'tbm': ['isch']}
URL quoting: urllib.parse.quote
https://docs.python.org/3/library/urllib.parse.html#url-quoting
"Quoting" in this sense doesn’t have to do with quotation marks, the kind you find around Python strings. It means translating a string into a form that doesn’t have any special characters in it, but in a way that can be reversed (unquoted) later.
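As a small illustration of that round trip, here is a sketch using quote, quote_plus, and their inverses from urllib.parse. The sample string is made up for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Made-up sample string containing characters that are special in a URI.
original = "gray squirrel & friends?"

# quote() percent-encodes the special characters.
quoted = quote(original)
print(quoted)        # gray%20squirrel%20%26%20friends%3F

# quote_plus() additionally writes spaces as "+", the form used in query strings.
plus_quoted = quote_plus(original)
print(plus_quoted)   # gray+squirrel+%26+friends%3F

# Both transformations can be reversed to recover the original string.
print(unquote(quoted) == original)            # True
print(unquote_plus(plus_quoted) == original)  # True
```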
Exercise: HTML and forms
If you need a refresher on HTML forms, take a look at the MDN introduction (gentle) or the W3C standard reference (more advanced).
Lesson-2/2_HTMLForms/LoginPage.html
Exercise: Form up for action
Let’s do another example! This HTML form has a pull-down menu with four options.
<!DOCTYPE html>
<title>Search wizardry!</title>
<form action="http://www.google.com/search" method=GET>
  <label>Search term:
    <input type="text" name="q">
  </label>
  <br>
  <label>Corpus:
    <select name="tbm">
      <option selected value="">Regular</option>
      <option value="isch">Images</option>
      <option value="bks">Books</option>
      <option value="nws">News</option>
    </select>
  </label>
  <br>
  <button type="submit">Go go!</button>
</form>
GET and POST
GET
methods are good for search forms and other actions that are intended to look something up or ask the server for a copy of some resource. But GET
is not recommended for actions that are intended to alter or create a resource. For this sort of action, HTTP has a different verb, POST
.
idempotent: an action is idempotent if doing it twice produces the same result as doing it once.
Exercise: Be a server and receive a POST request
Lesson-2/2_HTMLForms/PostForm.html
ncat -l 9999
A server for POST
pip3 install requests
- The do_POST method:
  - read the request body by calling the self.rfile.read method
  - self.rfile.read needs to be told how many bytes to read. The browser sends the length of the request body in the Content-Length header
- Headers
  - the headers are available in the instance variable self.headers
  - keys are case-insensitive
  - the values are strings, so the length needs to be converted to an integer
  - if the body is empty, content-length can be missing. Therefore we’ll use the .get dictionary method to get the header value safely
  - code in do_POST to find the length of the request body and read it:

        length = int(self.headers.get('Content-length', 0))
        data = self.rfile.read(length).decode()
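Putting those pieces together, a do_POST method might look roughly like the sketch below. The handler name PostEchoHandler and the choice to echo the "message" field back as plain text are assumptions made for illustration, not the course's solution code.

```python
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import parse_qs


class PostEchoHandler(BaseHTTPRequestHandler):  # hypothetical name
    def do_POST(self):
        # Content-Length may be missing when the body is empty, so default to 0.
        length = int(self.headers.get('Content-Length', 0))

        # Read exactly that many bytes and decode them into a string.
        data = self.rfile.read(length).decode()

        # Form data arrives URL-encoded; parse_qs maps each field to a list of values.
        fields = parse_qs(data)
        message = fields.get('message', [''])[0]

        # Echo the extracted field back to the client.
        self.send_response(200)
        self.send_header('Content-type', 'text/plain; charset=utf-8')
        self.end_headers()
        self.wfile.write(message.encode())
```

To try it, create HTTPServer(('', 8000), PostEchoHandler) and call its serve_forever method, as in the hello server.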
Exercise: Messageboard
Lesson-2/3_MessageboardPartOne
- Find the length of the POST request data.
- Read the correct amount of request data.
- Extract the “message” field from the request data.
- Run the MessageboardPartOne.py server.
- Open the MessageboardPartOne.html file in your browser and submit it.
- Run the test script test.py with the server running.
Lesson-2/4_MessageboardPartTwo
Post-Redirect-Get (PRG)
- A client POSTs to a server to create or update a resource
- On success, the server replies not with a 200 OK but with a 303 redirect
- The redirect causes the client to GET the created or updated resource.
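The pattern above can be sketched with http.server as follows. This is a minimal illustration under assumed names (PRGHandler, the in-memory list memory, and a form field called "message"). It serves the stored messages as plain text rather than as the exercise's HTML form.

```python
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import parse_qs

# Stored messages; this list lives only as long as the server process.
memory = []


class PRGHandler(BaseHTTPRequestHandler):  # hypothetical name
    def do_POST(self):
        # Read and parse the form data, then store the submitted message.
        length = int(self.headers.get('Content-Length', 0))
        fields = parse_qs(self.rfile.read(length).decode())
        memory.append(fields.get('message', [''])[0])

        # Reply with a 303 redirect to the root page instead of a 200 OK.
        self.send_response(303)
        self.send_header('Location', '/')
        self.end_headers()

    def do_GET(self):
        # The redirected client lands here and sees the updated resource.
        self.send_response(200)
        self.send_header('Content-type', 'text/plain; charset=utf-8')
        self.end_headers()
        self.wfile.write('\n'.join(memory).encode())
```

A browser follows the 303 automatically, so reloading the resulting page repeats the GET rather than resubmitting the form.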
Lesson-2/5_MessageboardPartThree
- In the do_POST method, send a 303 redirect back to the root page (/).
- In the do_GET method, assemble the response data together out of the form template and the stored messages.
- Run the server and test it in your browser.
- Run the tests in test.py with the server running.
Making requests
The requests library is a Python library for sending requests to web servers and interpreting the responses.
pip3 install requests
See the quickstart documentation for requests.
Response objects
When you send a request, you get back a Response object.
- r.text: read the content of the server’s response as a string
- r.content: access the response body as bytes
Handling errors
- accessing a nonexistent site raises a Python exception
- accessing a nonexistent page on a real site gives you an object r where r.status_code is an error code.
Using a JSON API
If you call r.json() on a Response that isn’t made of JSON data, it raises a json.decoder.JSONDecodeError exception. If you want to catch this exception with a try block, you’ll need to import it from the json module.
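A minimal sketch of catching that exception is shown below. To keep it runnable without network access, it uses FakeResponse, a hypothetical stand-in whose json() method behaves like the one described above. The helper name json_or_none is also made up. Newer versions of requests may raise their own exception class here, so check the behavior of your installed version.

```python
import json
from json import JSONDecodeError


def json_or_none(response):
    """Return the decoded JSON body of a Response-like object, or None."""
    try:
        return response.json()
    except JSONDecodeError:
        # The body was not valid JSON (for example, an HTML error page).
        return None


class FakeResponse:
    """Hypothetical stand-in for a requests Response, for illustration only."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


print(json_or_none(FakeResponse('{"name": "Ada"}')))    # {'name': 'Ada'}
print(json_or_none(FakeResponse('<html>oops</html>')))  # None
```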
Exercise: Use JSON with UINames.com
Lesson-2/6_UsingJSON
- Decode the JSON data returned by the GET request.
- Print out the JSON data fields in the specified format.
- Test your code by running UINames.py.
- Run the test script in test.py.
Exercise: The bookmark server
Like the messageboard server, this bookmark server will keep all of its data in memory. This means that it’ll be reset if you restart it.
Your server needs to do three things, depending on what kind of request it receives:
- On a GET request to the / path, it displays an HTML form with two fields. One field is where you put the long URI you want to shorten. The other is where you put the short name you want to use for it. Submitting this form sends a POST to the server.
- On a POST request, the server looks for the two form fields in the request body. If it has those, it first checks the URI with requests.get to make sure that it actually exists (returns a 200).
  - If the URI exists, the server stores a dictionary entry mapping the short name to the long URI, and returns an HTML page with a link to the short version.
  - If the URI doesn’t actually exist, the server returns a 404 error page saying so.
  - If either of the two form fields is missing, the server returns a 400 error page saying so.
- On a GET request to an existing short URI, it looks up the corresponding long URI and serves a redirect to it.
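The third case, redirecting a known short name, could be sketched like this. It covers only that one branch. The handler name ShortNameHandler, the sample entry in memory, and the 404 reply for unknown names are assumptions for illustration; the form page and the POST handling are left out.

```python
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import unquote

# Hypothetical sample data; the real server fills this in from POSTed forms.
memory = {'py': 'https://www.python.org/'}


class ShortNameHandler(BaseHTTPRequestHandler):  # hypothetical name
    def do_GET(self):
        # Strip the leading "/" and undo URL quoting to get the short name.
        name = unquote(self.path[1:])

        if name in memory:
            # Known name: redirect the client to the stored long URI.
            self.send_response(303)
            self.send_header('Location', memory[name])
            self.end_headers()
        else:
            # Unknown name: this sketch answers with a plain-text 404.
            self.send_response(404)
            self.send_header('Content-type', 'text/plain; charset=utf-8')
            self.end_headers()
            self.wfile.write("I don't know '{}'.".format(name).encode())
```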
Checklist
- Write the CheckURI function. This function should take a URI as an argument, and return True if that URI could be successfully fetched, and False if it can’t.
- Write the code inside do_GET that sends a 303 redirect to a known name.
- Write the code inside do_POST that sends a 400 error if the form fields are not present in the POST.
- Write the code inside do_POST that sends a 303 redirect to the form after saving a newly submitted URI.
- Write the code inside do_POST that sends a 404 error if a URI is not successfully checked (i.e. if CheckURI returns False).