Uploading Files

Ah yes, the good old problem of file uploads. The basic idea of file uploads is actually quite simple. It basically works like this:

  • A <form> tag is marked with enctype=multipart/form-data and an <input type=file> is placed in that form.

  • The application accesses the file from the files dictionary on the request object.

  • Use the save() method of the file to save the file permanently somewhere on the filesystem.

A Gentle Introduction

Let’s start with a very basic application that uploads a file to a specific upload folder and displays the file to the user. Let’s look at the bootstrapping code for our application:

    import os
    from flask import Flask, flash, request, redirect, url_for
    from werkzeug.utils import secure_filename

    UPLOAD_FOLDER = '/path/to/the/uploads'
    ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}

    app = Flask(__name__)
    app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

So first we need a couple of imports. Most should be straightforward; werkzeug.utils.secure_filename() is explained a little bit later. The UPLOAD_FOLDER is where we will store the uploaded files and the ALLOWED_EXTENSIONS is the set of allowed file extensions.

Why do we limit the extensions that are allowed? You probably don’t want your users to be able to upload everything there if the server is directly sending out the data to the client. That way you can make sure that users are not able to upload HTML files that would cause XSS problems (see Cross-Site Scripting (XSS)). Also make sure to disallow .php files if the server executes them, but who has PHP installed on their server, right? :)

Next come the functions that check whether an extension is valid, upload the file, and redirect the user to the URL for the uploaded file:

    def allowed_file(filename):
        return '.' in filename and \
               filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

    @app.route('/', methods=['GET', 'POST'])
    def upload_file():
        if request.method == 'POST':
            # Check that the POST request has the file part.
            # Note: flash() stores messages in the session, so
            # app.secret_key must be set for it to work.
            if 'file' not in request.files:
                flash('No file part')
                return redirect(request.url)
            file = request.files['file']
            # If the user does not select a file, the browser
            # submits an empty part without a filename.
            if file.filename == '':
                flash('No selected file')
                return redirect(request.url)
            if file and allowed_file(file.filename):
                filename = secure_filename(file.filename)
                file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
                return redirect(url_for('uploaded_file',
                                        filename=filename))
        return '''
        <!doctype html>
        <title>Upload new File</title>
        <h1>Upload new File</h1>
        <form method=post enctype=multipart/form-data>
          <input type=file name=file>
          <input type=submit value=Upload>
        </form>
        '''
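The extension check can be tried on its own, without Flask. The following is a minimal sketch that repeats allowed_file() and ALLOWED_EXTENSIONS from above; the sample filenames are hypothetical and chosen only to illustrate the behavior.

```python
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}


def allowed_file(filename):
    # Only the text after the last dot is compared, case-insensitively.
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


# Hypothetical filenames, used purely for illustration.
samples = ['report.pdf', 'PHOTO.JPG', 'page.html',
           'archive.tar.gz', 'README', 'evil.php.png']

for name in samples:
    print(f'{name!r}: {allowed_file(name)}')

# 'report.pdf': True
# 'PHOTO.JPG': True
# 'page.html': False
# 'archive.tar.gz': False
# 'README': False
# 'evil.php.png': True
```

Note that the check looks only at the last extension of the name and never at the file's content, so it is a first filter rather than a guarantee about what was uploaded.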

So what does that secure_filename() function actually do? Now the problem is that there is that principle called “never trust user input”. This is also true for the filename of an uploaded file. All submitted form data can be forged, and filenames can be dangerous. For the moment just remember: always use that function to secure a filename before storing it directly on the filesystem.

Information for the Pros

So you’re interested in what that secure_filename() function does and what the problem is if you’re not using it? Just imagine someone sends the following information as the filename to your application:

    filename = "../../../../home/username/.bashrc"

Assuming the number of ../ is correct and you join this with the UPLOAD_FOLDER, the user might be able to modify a file on the server’s filesystem that they should not modify. This does require some knowledge of what the application looks like, but trust me, hackers are patient :)

Now let’s look at how secure_filename() works:

    >>> secure_filename('../../../../home/username/.bashrc')
    'home_username_.bashrc'
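To see why the raw filename is dangerous, here is a small sketch using only the standard library. It joins a filename onto the upload folder and collapses the ../ segments, the way the filesystem would. The helper names resolve_upload_path() and stays_inside_upload_folder() are hypothetical and not part of Flask or Werkzeug. posixpath is used so the result is the same on every platform.

```python
import posixpath

UPLOAD_FOLDER = '/path/to/the/uploads'


def resolve_upload_path(filename):
    """Join filename onto UPLOAD_FOLDER and collapse any '..' segments."""
    return posixpath.normpath(posixpath.join(UPLOAD_FOLDER, filename))


def stays_inside_upload_folder(filename):
    """Return True if the resolved path is still below UPLOAD_FOLDER."""
    resolved = resolve_upload_path(filename)
    return posixpath.commonpath([UPLOAD_FOLDER, resolved]) == UPLOAD_FOLDER


malicious = '../../../../home/username/.bashrc'
sanitized = 'home_username_.bashrc'  # result of secure_filename() shown above

print(resolve_upload_path(malicious))         # /home/username/.bashrc
print(stays_inside_upload_folder(malicious))  # False
print(resolve_upload_path(sanitized))         # /path/to/the/uploads/home_username_.bashrc
print(stays_inside_upload_folder(sanitized))  # True
```

This check works purely on path strings and does not follow symlinks, so treat it as an illustration of the problem rather than a replacement for secure_filename().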

Now one last thing is missing: the serving of the uploaded files. In the upload_file() we redirect the user to url_for('uploaded_file', filename=filename), that is, /uploads/filename. So we write the uploaded_file() function to return the file of that name. As of Flask 0.5 we can use a function that does that for us:

    from flask import send_from_directory

    @app.route('/uploads/<filename>')
    def uploaded_file(filename):
        return send_from_directory(app.config['UPLOAD_FOLDER'],
                                   filename)

Alternatively you can register uploaded_file as a build_only rule and use the SharedDataMiddleware. This also works with older versions of Flask:

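    # In current Werkzeug releases the middleware lives in
    # werkzeug.middleware.shared_data (very old releases exposed it
    # directly as werkzeug.SharedDataMiddleware).
    from werkzeug.middleware.shared_data import SharedDataMiddleware
    app.add_url_rule('/uploads/<filename>', 'uploaded_file',
                     build_only=True)
    app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
        '/uploads': app.config['UPLOAD_FOLDER']
    })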

If you now run the application everything should work as expected.

Improving Uploads

New in version 0.6.

So how exactly does Flask handle uploads? Well, it will store them in the webserver’s memory if the files are reasonably small, otherwise in a temporary location (as returned by tempfile.gettempdir()). But how do you specify the maximum file size after which an upload is aborted? By default Flask will happily accept file uploads to an unlimited amount of memory, but you can limit that by setting the MAX_CONTENT_LENGTH config key:

    from flask import Flask

    app = Flask(__name__)
    # 16 * 1024 * 1024 bytes = 16 megabytes
    app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024

The code above will limit the maximum allowed payload to 16 megabytes. If a larger file is transmitted, Flask will raise a RequestEntityTooLarge exception.
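If you would rather show your own message than the default 413 error page, one option is to register an error handler for that exception. The following is a minimal sketch, assuming a reasonably recent Flask and Werkzeug; the /upload-demo route, the handler name, and the message text are hypothetical and exist only to make the example self-contained.

```python
from flask import Flask, request
from werkzeug.exceptions import RequestEntityTooLarge

app = Flask(__name__)
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16 megabytes


@app.errorhandler(RequestEntityTooLarge)
def handle_too_large(error):
    # Hypothetical message; keep the 413 status so clients can react to it.
    return 'Upload rejected: the file exceeds the size limit.', 413


@app.route('/upload-demo', methods=['POST'])  # hypothetical route
def upload_demo():
    # Accessing request.files makes Flask read the request body, which is
    # where the size limit is enforced.
    file = request.files.get('file')
    return 'received' if file else 'no file'
```

The handler only changes the response body. The limit itself is still enforced through MAX_CONTENT_LENGTH, and the connection reset behavior of the development server described below applies regardless.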

Connection Reset Issue

When using the local development server, you may get a connection reset error instead of a 413 response. You will get the correct status response when running the app with a production WSGI server.

This feature was added in Flask 0.6 but can be achieved in older versions as well by subclassing the request object. For more information on that, consult the Werkzeug documentation on file handling.

Upload Progress Bars

A while ago many developers had the idea to read the incoming file in small chunks and store the upload progress in the database to be able to poll the progress with JavaScript from the client. Long story short: the client asks the server every 5 seconds how much it has transmitted already. Do you realize the irony? The client is asking for something it should already know.

An Easier Solution

Now there are better solutions that work faster and are more reliable. There are JavaScript libraries like jQuery that have form plugins to ease the construction of a progress bar.

Because the common pattern for file uploads exists almost unchanged in all applications dealing with uploads, there is also a Flask extension called Flask-Uploads that implements a full-fledged upload mechanism with whitelisting and blacklisting of extensions and more.