Communications between Nginx and Flask via uWSGI

Recently I faced a little project which I think illustrates a good use of the combination of Nginx proxying to uWSGI, which in turn serves a Flask application.

I had a few dozen domains which I wanted to each have a customized landing page, including a simple form to collect visitor email addresses. If the site ever does go live, a quick email blast can be sent to those few inquisitive souls who wished to be alerted.

Setting up a Flask application per domain seemed wasteful in terms of server resource usage-- these are domains that will only ever collect a handful of requests at best. So instead I looked at the problem from the other end of the stack: Nginx is the first port of call for all requests, it knows what domain was requested, and in fact passes that information down the line, where it can be accessed within Flask via request.environ['SERVER_NAME'] (see the Flask docs).
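
As a quick illustration (a minimal sketch, assuming nothing beyond a bare Flask app-- the route and view name are just for demonstration), the hostname Nginx matched is sitting right there in the WSGI environ:

from flask import Flask, request

app = Flask(__name__)

@app.route('/')
def which_domain():
    # Nginx/uWSGI pass the matched hostname through in the WSGI environ
    domain = request.environ.get('SERVER_NAME', 'unknown')
    return 'This request was for: ' + domain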

So one method would be to create a configuration file in the Flask application which would load a different config based on the domain being requested. That seems reasonable, but once implemented it would mean I would have to do the following each time I wanted to add another landing page domain:

  1. Add an Nginx site-configuration for the new domain
  2. Restart Nginx
  3. Add a matching configuration file in the Flask application
  4. Restart uWSGI

So instead, I looked into using the Nginx site-configuration file as the configuration file for the Flask request, which turned out to be simple-- here's an example:

# new_domain.com
server {
    listen       80;
    server_name  .new_domain.com;

    location / { try_files $uri @app; }
    location @app {
        uwsgi_pass unix:/tmp/uwsgi.generic.landing.sock;
        include uwsgi_params;

        uwsgi_param     UWSGI_PYHOME    /var/www/generic_landing/venv;
        uwsgi_param     UWSGI_CHDIR     /var/www/generic_landing;
        uwsgi_param     UWSGI_MODULE    run;
        uwsgi_param     UWSGI_CALLABLE  app;

        uwsgi_param     LANDING_TITLE   "Site Name";
        uwsgi_param     LANDING_EMAIL   "contact@new_domain.com";
    }
}

Then in the Flask application, I can just politely look for those values:

@app.route('/')
def landing_page():
    title = request.environ.get('LANDING_TITLE', 'Default Site')
    email = request.environ.get('LANDING_EMAIL', 'admin@defaultsite.com')
    return render_template('landing_page.html', title=title, email=email)

So now if I ever need to add another landing page domain I can just:

  1. Create new Nginx site-configuration
  2. Restart Nginx

Which saves me two steps and therefore reduces the chance of me making a simple mistake.

Dell 5130CDN 007-340 Error

This error is a bit cryptic when you look at the support documentation on the Dell site; their advice boils down to: re-seat all the drum units and, if that fails, replace them. At least in my case, there was an easy way to get the printer working again which doesn't entail throwing the drum unit away.

While the Dell printer has a waste toner box, it seems that's only for collecting excess at the final stage-- the drum unit itself holds waste toner from the initial stage. If you do a lot of full-page colour printing, this can fill up and cause the 007-340 error.

To remedy that, open the front cover of the printer, then unlock and open the access panel and remove the drum unit (if you don't know which unit is causing the problem, you can apply the same process to each). If you look at the far side of the drum unit, there is an access hatch on the top-- carefully slide that hatch back, then empty most of the contents into the bin. The toner that spills out will look like waste toner (it has a slightly burnt colour to it).

Then simply re-install the drum and you should be golden.

Privacy

We live in interesting times-- Snowden has lifted the curtain and shown us a little of the power our governments have to look through our data streams. Each time they're questioned we're reassured that the data is limited in scope, precisely targeted, and that you have absolutely nothing to be concerned about-- and each time they do this the Guardian releases a little more to expose how false that is.

Sen. Ron Wyden asked Director of National Intelligence James Clapper a simple question: “Does the NSA collect any type of data at all on millions or hundreds of millions of Americans?”. Clapper replied, “No sir … not wittingly.”

Since then, we learned about PRISM, but were reassured that, while, yes, that is actually hundreds of millions of people being spied upon, it's only metadata, so don't worry-- it's harmless, without ever addressing the obvious question: if it's so harmless, why is it being so secretly and eagerly harvested?

When questioned on this flat lie, Clapper responded with "[that question was] not answerable necessarily by a simple yes or no. So I responded in what I thought was the most truthful, or least untruthful, manner by saying, ‘No.’"

Least untruthful really is a phrase to ponder.

And now we have XKeyscore, which seemingly allows anyone to just type in a user identifier (email, IP, etc.) and get everything they've done and are doing, in real time. Let us wait again to hear about how narrowly it's targeted. Here's a clue: it isn't-- everything is stored, because that's how developers and computers handle this kind of problem. You don't want to be the system admin who deleted 50 TB of data because you judged it clean-- it's easier not to! Having more data in a datamining operation isn't a problem, it's the goal, especially when you have government-sized budgets to do the querying.

Talk to your representatives; it's only through law that this kind of program can be curtailed-- the progression of technology makes programs like this continually easier and cheaper. Recording everyone's phone calls seems implausible, until you realize that it is relatively cheap. The Internet Archive, which aims (and largely succeeds) to store a historical record of the entire Internet, estimates the cost at less than thirty million dollars and only five thousand square feet of storage space in a datacenter. The National Intelligence budget for 2014 is going to be 48.2 billion dollars, so that twenty-nine million is a mere 0.06% of the budget.

The government has the tools to create a true panopticon society: they see all our online and phone conversations; if you're in a built-up area, you're probably on hundreds of private or government cameras a day-- and they can scan your house (heat usage, electricity, water, etc.) from afar.

The tools that tie all that data together with your shopping habits, medical needs, credit history, sexual preferences and political leanings are only ever going to get better, and cheaper to build. It seems to be legal to observe someone within their home through a telescopic lens, if they leave a crack in their curtain or a window ajar. Cameras are getting smaller, and drones are too.

We need laws to protect our data, and those who break them, no matter who they are-- private citizens, businesses or government agencies-- need to be punished severely, because the law is the only real thing that can protect us against people misusing the data. People should have gone to jail for lifetimes over the FISA scandal just before and into Obama's first term, but amnesties were granted instead.

And the next time you hear someone say "Well, it doesn't matter if you're not breaking the law", please ask them "So, you'd be fine with the government setting up cameras in your home and bedroom?" When they say "Well, that's different", just ask them to explain why.

Privacy is a right, not something that should be granted.

"All are equal before the law and are entitled without any discrimination to equal protection of the law." If we'd just adhere to that, bring that simple need back into reality, we'd see a sea change in the way we're governed.

Persuading WTForms to Generate Checkboxes

WTForms takes the pain out of handling forms and validation: you define your form, tell it what validation checks the data needs to pass, then WTForms uses its widgets to generate the HTML.

A common problem you'll face is "Multiple select fields suck, they are confusing-- how can I show the user a nice list of checkboxes instead?"

The answer is to tell WTForms to use different widgets to render its HTML. Let's first show how we'd render a simple SelectMultipleField form:

from flask import Flask, render_template_string
from wtforms import SelectMultipleField, Form

app = Flask(__name__)

data = [('value_a','Value A'), ('value_b','Value B'), ('value_c','Value C')]

class ExampleForm(Form):
    example = SelectMultipleField(
        'Pick Things!',
        choices=data
    )

@app.route('/')
def home():
    form = ExampleForm()
    return render_template_string('<form>{{ form.example }}</form>',form=form)

if __name__ == '__main__':
    app.run(debug=True)

Voila, we have a basic, horrible-to-use multiple select field. Now we just need to change how WTForms renders that SelectMultipleField, by telling it to use a couple of different widgets to produce our checkboxes:

from wtforms import widgets

class ExampleForm(Form):
    example = SelectMultipleField(
        'Pick Things!',
        choices=data,
        option_widget=widgets.CheckboxInput(),
        widget=widgets.ListWidget(prefix_label=False)
    )

And that's it-- you'll be rendering checkboxes for each entry in the data, congratulations!
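
As a follow-up, here's a rough sketch of reading the submitted values back (assuming the ExampleForm and app from above; the /picked route and view name are just for illustration)-- a plain wtforms Form takes the posted data as its first argument, and form.example.data comes back as a list of the checked values:

from flask import request

@app.route('/picked', methods=['GET', 'POST'])
def picked():
    # Bind any posted data to the form
    form = ExampleForm(request.form)
    if request.method == 'POST' and form.validate():
        # form.example.data is a list, e.g. ['value_a', 'value_c']
        return 'You picked: {0}'.format(', '.join(form.example.data))
    return render_template_string(
        '<form method="POST">{{ form.example }}<input type="submit"></form>',
        form=form
    )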

Using Flask, Flask-WTF and Boto to Upload Files to S3

Ideally a user would be able to post directly to an S3 storage bucket, which in turn would alert you to the identifier, but this is the ugly version that ends up with the same result-- good enough for images, though the latency will only grow with filesize.

So we are going to create a very simple Flask application that serves up a form to the user, asking for a file. When the file is uploaded to the server, the server sends it off to S3 for long-term storage, and returns to the user the unique filename that was created in the storage bucket.

The HTML template that gets rendered is a very simple form, but because we're using Flask-WTF for handling our form validation and presentation details, we also need to remember the CSRF protection that handily comes with it:

{% with messages = get_flashed_messages() %}
    {% if messages %}
        <ul class=flashes>
            {% for message in messages %}
                <li>{{ message }}</li>
            {% endfor %}
        </ul>
    {% endif %}
{% endwith %}

<form method="POST" enctype="multipart/form-data" action=".">
    {{ form.csrf_token }}
    {{ form.example }}
    <input type="submit" value="Go">
</form>

The {% %} might look a little strange, but all it is doing is saying 'If the webserver has passed me a flashed message, show it'. You can read more about message flashing in the documentation. We're going to use it to send a message back to the user when the S3 upload is complete.

Then the application itself, very bare bones:

from flask import Flask, render_template, flash
from flask.ext.wtf import FileField, Form
from tools import s3_upload

app = Flask(__name__)
app.config.from_object('config')

class UploadForm(Form):
    example = FileField('Example File')

@app.route('/',methods=['POST','GET'])
def upload_page():
    form = UploadForm()
    if form.validate_on_submit():
        output = s3_upload(form.example)
        flash('{src} uploaded to S3 as {dst}'.format(src=form.example.data.filename, dst=output))
    return render_template('example.html',form=form)

if __name__ == '__main__':
    app.run()

Nothing out of the ordinary here. The basic process is to check if the form sent by the user is valid; if it is, call the s3_upload function and pass in the FileField, which includes a FileStorage object holding the uploaded data. The application brings in some configuration values from a config.py file, shown below-- customize it to match your account details.

S3_LOCATION = 'http://your-amazon-site.amazonaws.com/'
S3_KEY = 'YOURAMAZONKEY'
S3_SECRET = 'YOURAMAZONSECRET'
S3_UPLOAD_DIRECTORY = 'what_directory_on_s3'
S3_BUCKET = 's3_bucket_name'

SECRET_KEY = "FLASK_SECRET_KEY"
DEBUG = True
PORT = 5000

So the only bit of code left is the actual S3 upload implementation, and luckily with the boto library it's pretty straightforward.

from uuid import uuid4
import boto
import os.path
from flask import current_app as app
from werkzeug.utils import secure_filename

def s3_upload(source_file,acl='public-read'):
    source_filename = secure_filename(source_file.data.filename)
    source_extension = os.path.splitext(source_filename)[1]

    destination_filename = uuid4().hex + source_extension

    # Connect to S3
    conn = boto.connect_s3(app.config["S3_KEY"], app.config["S3_SECRET"])
    b = conn.get_bucket(app.config["S3_BUCKET"])

    # Upload the File
    sml = b.new_key("/".join([app.config["S3_UPLOAD_DIRECTORY"],destination_filename]))
    # .read() hands over the raw uploaded contents for S3 to store
    sml.set_contents_from_string(source_file.data.read())

    # Set the file's permissions.
    sml.set_acl(acl)

    return destination_filename

By default we are going to give the uploaded file permission to be publicly read; there are a number of other ACL options which you can choose instead.
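
For instance, if you didn't want the file to be publicly visible, you could pass one of the other canned ACLs through the optional argument (a quick sketch using the s3_upload helper above):

# Keep the uploaded object private rather than publicly readable
output = s3_upload(form.example, acl='private')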

That's it in a nutshell, have a look at the full project on Github and drop me a note if you feel I've skimmed over something, or have suggestions to improve the implementation.