The GridFS plugin

Beginning in uWSGI 1.9.5, a “GridFS” plugin is available. It exports both a request handler and an internal routing function. Its official modifier is ‘25’, and the routing instruction is “gridfs”. The plugin is written in C++.

Requirements and install

To build the plugin you need the libmongoclient headers (and a functioning C++ compiler). On a Debian-like system you can install both as follows:

  apt-get install mongodb-dev g++

A build profile for gridfs is available:

  UWSGI_PROFILE=gridfs make

Or you can build it as a plugin:

  python uwsgiconfig.py --plugin plugins/gridfs

For a fast installation of a monolithic build you can use the network installer:

  curl http://uwsgi.it/install | bash -s gridfs /tmp/uwsgi

This will install a GridFS-enabled uwsgi binary.

Standalone quickstart

This is a standalone config that blindly maps the incoming PATH_INFO to items in the GridFS db named “test”:

  [uwsgi]
  ; you can remove the plugin directive if you are using a uWSGI gridfs monolithic build
  plugin = gridfs
  ; bind to http port 9090
  http-socket = :9090
  ; force modifier1 to 25 (the GridFS plugin)
  http-socket-modifier1 = 25
  ; map gridfs requests to the "test" db
  gridfs-mount = db=test

Assuming you have the myfile.txt file stored in your GridFS as “/myfile.txt”, run the following:

  curl -D /dev/stdout http://localhost:9090/myfile.txt

and you should be able to get it.

The initial slash problem

Generally PATH_INFO is prefixed with a ‘/’. This can cause problems in GridFS path resolution if you are not storing the items with absolute path names. To counteract this, you can make the gridfs plugin skip the initial slash:

  [uwsgi]
  ; you can remove the plugin directive if you are using a uWSGI gridfs monolithic build
  plugin = gridfs
  ; bind to http port 9090
  http-socket = :9090
  ; force modifier1 to 25 (the GridFS plugin)
  http-socket-modifier1 = 25
  ; map gridfs requests to the "test" db, dropping the leading slash
  gridfs-mount = db=test,skip_slash=1

Now instead of searching for /myfile.txt it will search for “myfile.txt”.

Multiple mountpoints (and servers)

You can mount different GridFS databases under different SCRIPT_NAME (or UWSGI_APPID) values. If your web server is able to correctly manage the SCRIPT_NAME variable, you do not need any additional setup (other than --gridfs-mount). Otherwise, don’t forget to add the --manage-script-name option:

  [uwsgi]
  ; you can remove the plugin directive if you are using a uWSGI gridfs monolithic build
  plugin = gridfs
  ; bind to http port 9090
  http-socket = :9090
  ; force modifier1 to 25 (the GridFS plugin)
  http-socket-modifier1 = 25
  ; map gridfs requests to the "test" db
  gridfs-mount = db=test,skip_slash=1
  ; map /foo to db "wolverine" on server 192.168.173.17:4040
  gridfs-mount = mountpoint=/foo,server=192.168.173.17:4040,db=wolverine
  ; map /bar to db "storm" on server 192.168.173.30:4040
  gridfs-mount = mountpoint=/bar,server=192.168.173.30:4040,db=storm
  ; force management of the SCRIPT_NAME variable
  manage-script-name = true

You can then request an item from each mountpoint:

  curl -D /dev/stdout http://localhost:9090/myfile.txt
  curl -D /dev/stdout http://localhost:9090/foo/myfile.txt
  curl -D /dev/stdout http://localhost:9090/bar/myfile.txt

This way each request will map to a different GridFS server.
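
To make the dispatch easier to picture, here is a minimal Python sketch of how a request path could be split into a mountpoint (SCRIPT_NAME) and the remaining path. It is not the plugin's code: the function name split_mountpoint, the MOUNTS table, and the longest-match-on-a-segment-boundary rule are assumptions made for illustration only.

```python
def split_mountpoint(path_info, mountpoints):
    """Return (script_name, remaining_path) for the longest matching mountpoint.

    Assumptions (not taken from the plugin source): a mountpoint matches only
    on a path-segment boundary, and a mount declared without a mountpoint
    (the empty string here) acts as the fallback.
    """
    best = ""
    for mp in mountpoints:
        if mp and (path_info == mp or path_info.startswith(mp + "/")):
            if len(mp) > len(best):
                best = mp
    return best, path_info[len(best):]


# Mountpoint -> database, mirroring the three gridfs-mount lines above.
MOUNTS = {"": "test", "/foo": "wolverine", "/bar": "storm"}

for url_path in ("/myfile.txt", "/foo/myfile.txt", "/bar/myfile.txt"):
    script_name, rest = split_mountpoint(url_path, MOUNTS)
    print(url_path, "->", MOUNTS[script_name], rest)
```

Under these assumptions the three curl requests resolve to the "test", "wolverine" and "storm" databases respectively, each with /myfile.txt as the remaining path (before any skip_slash handling).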

Replica sets

If you are using a replica set, you can use it in your uWSGI config with this syntax: <replica>/server1,server2,serverN…

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  gridfs-mount = server=rs0/ubuntu64.local\,raring64.local\,mrspurr-2.local,db=test

Pay attention to the backslashes: the commas inside the server list must be escaped, because an unescaped comma separates the gridfs-mount options.

Prefixes

As well as removing the initial slash, you may need to prefix each item name:

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  gridfs-mount = server=rs0/ubuntu64.local\,raring64.local\,mrspurr-2.local,db=test,prefix=/foobar___

A request for /test.txt will be mapped to /foobar___/test.txt

while

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  gridfs-mount = server=rs0/ubuntu64.local\,raring64.local\,mrspurr-2.local,db=test,prefix=/foobar___,skip_slash=1

will map to /foobar___test.txt
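
The combined effect of skip_slash and prefix can be summarized with a small Python sketch. It only restates the mapping described in this section; resolve_item_name is a hypothetical helper, not a function of the plugin.

```python
def resolve_item_name(path_info, prefix="", skip_slash=False):
    """Sketch of the PATH_INFO -> GridFS item name mapping described above.

    Hypothetical helper: the parameter names mirror the gridfs-mount options,
    but the real lookup happens inside the C++ plugin.
    """
    name = path_info
    if skip_slash and name.startswith("/"):
        name = name[1:]  # drop only the first slash
    return prefix + name


print(resolve_item_name("/test.txt", skip_slash=True))
print(resolve_item_name("/test.txt", prefix="/foobar___"))
print(resolve_item_name("/test.txt", prefix="/foobar___", skip_slash=True))
```

When an item cannot be found, comparing the name produced by a sketch like this with the names actually stored in GridFS is usually the quickest check.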

MIME types and filenames

By default the MIME type of the file is derived from the filename stored in GridFS. This filename might not match the URI that was actually requested, you may not want to set a Content-Type for your response at all, or you may want to let some other system set it. To disable MIME type generation, add no_mime=1 to the mount options:

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,no_mime=1

If you want your response to set the filename using the original value (the one stored in GridFS), add orig_filename=1:

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,no_mime=1,orig_filename=1

Timeouts

You can set the timeout of the low-level MongoDB operations by adding timeout=N to the options:

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  ; set a 3-second timeout
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,timeout=3

MD5 and ETag headers

GridFS stores an MD5 hash of each file. You can add this information to your response headers as ETag (MD5 in hex format), as Content-MD5 (in Base64), or both. Use etag=1 to add the ETag header and md5=1 to add Content-MD5.

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  ; 3-second timeout, plus the ETag and Content-MD5 headers
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,timeout=3,etag=1,md5=1
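
The two headers are different encodings of the same 16-byte MD5 digest. The Python sketch below shows the relationship, which can be handy when verifying a download on the client side. The helper md5_headers is hypothetical, and the exact formatting the plugin applies to the header values (for example, whether the ETag is quoted) is not covered here.

```python
import base64
import hashlib


def md5_headers(content: bytes) -> dict:
    """Return the MD5 of content in the two encodings described above."""
    digest = hashlib.md5(content).digest()  # raw 16-byte digest
    return {
        "ETag": digest.hex(),  # hexadecimal form
        "Content-MD5": base64.b64encode(digest).decode("ascii"),  # Base64 form
    }


headers = md5_headers(b"hello")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Decoding the Base64 value and printing it as hex yields the ETag value again, so either header is enough to check the integrity of the body.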

Multithreading

The plugin is fully thread-safe, so consider using multiple threads to improve concurrency:

  [uwsgi]
  http-socket = :9090
  http-socket-modifier1 = 25
  ; 3-second timeout, plus the ETag and Content-MD5 headers
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,timeout=3,etag=1,md5=1
  master = true
  processes = 2
  threads = 8

This will spawn 2 processes, monitored by the master, with 8 threads each, for a total of 16 threads.

Combining with Nginx

The Nginx setup is no different from that of the other plugins:

  location / {
      include uwsgi_params;
      uwsgi_pass 127.0.0.1:3031;
      uwsgi_modifier1 25;
  }

Just be sure to set the uwsgi_modifier1 value to ensure all requests get routed to GridFS.

  [uwsgi]
  socket = 127.0.0.1:3031
  gridfs-mount = server=ubuntu64.local,db=test,skip_slash=1,timeout=3,etag=1,md5=1
  master = true
  processes = 2
  threads = 8

The ‘gridfs’ internal routing action

The plugin exports a ‘gridfs’ action that simply returns an item:

  [uwsgi]
  socket = 127.0.0.1:3031
  route = ^/foo/(.+).jpg gridfs:server=192.168.173.17,db=test,itemname=$1.jpg

The options are the same as the request plugin’s, with “itemname” being the only addition. It specifies the name of the object in the GridFS db.
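
To see what ends up in itemname, the route's capture group can be tried out in Python. This is only a sketch: uWSGI evaluates the rule with its own regex engine, and the assumption here is that Python's re module behaves the same way for this simple pattern. The helper item_name_for is hypothetical.

```python
import re

# Same pattern as in the route rule above; note that the dot before "jpg"
# is unescaped, so it matches any single character.
ROUTE = re.compile(r"^/foo/(.+).jpg")


def item_name_for(path_info):
    """Return the item name that "$1.jpg" would expand to, or None if no match."""
    match = ROUTE.search(path_info)
    if match is None:
        return None
    return match.group(1) + ".jpg"


for path in ("/foo/cat.jpg", "/foo/albums/cat.jpg", "/bar/cat.jpg"):
    print(path, "->", item_name_for(path))
```

Because of the unescaped dot, a path such as /foo/catXjpg would also match and map to cat.jpg; writing the pattern as ^/foo/(.+)\.jpg avoids that if it is not wanted.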

Notes

  • If you do not specify a server address, 127.0.0.1:27017 is assumed.
  • The use of the plugin in async modes is not officially supported, but may work.
  • If you cannot tell why a request is not serving your GridFS item, consider adding the --gridfs-debug option. It will print the requested item in the uWSGI logs.