Especially on busy sites with lots of files and images (e.g. user portraits) stored in the content repository, routing every request through AOLserver clutters the backend with requests that are unnecessary if you store the content repository files in the content-repository-content-files directory rather than in the database. Gustaf's patch for background delivery of files goes a long way towards solving the performance issue, but the best approach is not to bother AOLserver with these requests at all.
The easiest way to achieve this is to use NGINX as a mirror on demand, with the object_id acting as the reference. For this you need a single URL through which all files stored in the content repository can be accessed (e.g. /acs-subsite/www/file.vuh), and you use the proxy_store directive (see http://cognovis.de/developer/en/nginx_accelerating_proxy) to mirror the result.
root /var/lib/aolserver/resources/SERVER0;
error_page 404 = /fetch$uri;
}
location /fetch {
internal;
proxy_pass http://127.0.0.1:80;
proxy_store on;
proxy_store_access user:rw group:rw all:r;
proxy_temp_path /var/lib/aolserver/resources/SERVER0/cache;
alias /var/lib/aolserver/resources/SERVER0;
}
This method is fast, reliable and easy to set up (once all pages in OpenACS use /file/$object_id to serve a file instead of calling cr_write_content on their own). But it has a huge drawback:
Security
As you can imagine, a mirror on demand stores the file when the first user accesses it. If that user has permission to see the file, it is mirrored, and all subsequent requests are served without any permission check. This is fine if you don't need permissions on files, or in special parts of the OpenACS installation such as serving portraits (assuming anyone may see the portraits of users). In all other circumstances you have opened a security hole.
One way of preventing this is to keep /file unmirrored and, after the permission check within /file, redirect to a mirrored /file2 that serves the file. With file downloads the risk of URL forging is low, especially if you use some kind of encrypted object_id as the key, so users cannot reach files they are not allowed to see. The exception is a privileged user who copies the redirected URL (the /file2 one instead of the /file one) from his download manager and hands it out. This takes a fair amount of effort, and it might be easier for that user just to hand out his username and password, but it is a security hole nonetheless.
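The effect described above can be reproduced with a toy model. This is a minimal sketch in Python, not NGINX or OpenACS code: the classes Backend and MirrorOnDemand, the users alice and mallory and the object_id 4711 are all made up for illustration.

```python
# Toy model of a mirror-on-demand proxy. All names here are hypothetical.
class Backend:
    """Stands in for AOLserver: checks permission before returning content."""

    def __init__(self, files, acl):
        self.files = files   # object_id -> file content
        self.acl = acl       # object_id -> set of users allowed to read
        self.hits = 0        # number of requests that reached the backend

    def fetch(self, user, object_id):
        self.hits += 1
        if user not in self.acl.get(object_id, set()):
            return 403, b""
        return 200, self.files[object_id]


class MirrorOnDemand:
    """Stands in for the mirroring proxy: serve a stored copy, else ask the backend."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}      # object_id -> mirrored content

    def get(self, user, object_id):
        if object_id in self.store:
            # Served from the mirror: the user is never checked here.
            return 200, self.store[object_id]
        status, body = self.backend.fetch(user, object_id)
        if status == 200:
            self.store[object_id] = body
        return status, body


backend = Backend({"4711": b"secret report"}, {"4711": {"alice"}})
proxy = MirrorOnDemand(backend)

before = proxy.get("mallory", "4711")  # nothing mirrored yet, backend refuses
proxy.get("alice", "4711")             # permitted user triggers the mirror
after = proxy.get("mallory", "4711")   # same request now succeeds

print(before[0], after[0], backend.hits)  # prints: 403 200 2
```

The third request never reaches the backend, which is exactly why the permission check is lost.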
X-Accel-Redirect
To close this hole NGINX provides a mechanism called "X-Accel-Redirect": the backend sends an "X-Accel-Redirect" header in its response, and NGINX ignores the rest of the response and redirects internally to the location named in the header. This happens without the browser's knowledge, so a visitor cannot forge the URL because he cannot reach the internal URL in the first place. Why? Because a location marked as "internal" in NGINX is only reachable from within NGINX or via the backend, never directly from the browser.
To make use of X-Accel-Redirect with OpenACS there are two ways to go. One is to use the same internal caching as above, but the much smarter way is to grant NGINX direct read access to the content-repository-content-files directory. This way AOLserver does not have to serve the file even once. If NGINX runs on the same machine as AOLserver this is easy (and described below); otherwise you have to mount the content-repository-content-files directory via NFS or whichever network file system you prefer.
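To make the request flow concrete, here is a small Python model of the idea. It is only a sketch under assumptions: the functions backend and proxy, the acl and disk dictionaries and the path 1/23/4567 are hypothetical stand-ins, and the real work is of course done by NGINX and AOLserver.

```python
# Toy model of X-Accel-Redirect handling. All names here are hypothetical.
INTERNAL_PREFIX = "/content-repository-content-files/"  # location marked "internal"


def backend(user, uri, acl):
    """Stands in for AOLserver: checks permission, then names the internal URI."""
    object_id = uri.rsplit("/", 1)[-1]
    entry = acl.get(object_id)
    if entry is None or user not in entry["readers"]:
        return 403, {}, b"forbidden"
    return 200, {"X-Accel-Redirect": INTERNAL_PREFIX + entry["path"]}, b""


def proxy(user, uri, acl, disk):
    """Stands in for NGINX as seen from the browser."""
    if uri.startswith(INTERNAL_PREFIX):
        return 404, b""              # internal locations are hidden from clients
    status, headers, body = backend(user, uri, acl)
    target = headers.get("X-Accel-Redirect")
    if target is not None:
        return 200, disk[target]     # backend body is dropped, the file is served
    return status, body


acl = {"4711": {"readers": {"alice"}, "path": "1/23/4567"}}
disk = {INTERNAL_PREFIX + "1/23/4567": b"secret report"}

allowed = proxy("alice", "/file/4711", acl, disk)
denied = proxy("mallory", "/file/4711", acl, disk)
forged = proxy("mallory", INTERNAL_PREFIX + "1/23/4567", acl, disk)

print(allowed[0], denied[0], forged[0])  # prints: 200 403 404
```

Every request passes the permission check in the backend, and guessing the internal URL does not help because it is refused before any file is read.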
To provide NGINX with the location for the CR files, the following directive is enough:
location /content-repository-content-files {
internal;
root /var/lib/aolserver/SERVICE0/;
access_log /var/log/nginx/SERVICE0.content.log;
}
Two things are worth noting. The location is called content-repository-content-files, which, combined with root, resolves to the correct directory. The access_log directive is mainly there so you can monitor the serving of files and check that the changes in OpenACS work.
In OpenACS you need to change file.vuh and image.vuh (or any other place that uses cr_write_content; I prefer to change cr_write_content itself to hand over to /file so view tracking can happen in one central place). Instead of calling cr_write_content you need to look up the filename (Oracle) or content (Postgres) of the revision you want to serve. This value is the path under /var/lib/aolserver/SERVICE0/content-repository-content-files where the file is physically stored.
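How root and the location name combine can be shown in a few lines of Python. This is a sketch only: resolve_with_root is a hypothetical helper that mimics the rule that root has the full request URI appended to it, and the file path 1/23/4567 is invented. The result is normalised for readability.

```python
import posixpath


def resolve_with_root(root, uri):
    """Mimic the root directive: the complete URI is appended to root."""
    return posixpath.normpath(root.rstrip("/") + "/" + uri.lstrip("/"))


resolved = resolve_with_root(
    "/var/lib/aolserver/SERVICE0/",
    "/content-repository-content-files/1/23/4567",
)
print(resolved)
# prints: /var/lib/aolserver/SERVICE0/content-repository-content-files/1/23/4567
```

Because the location name is part of the URI, it ends up in the file system path as well, which is why the location has to carry the same name as the directory.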
Postgres
id = ci.item_id and ci.item_id = :item_id"
Oracle
id = ci.item_id and ci.item_id = :item_id"
Instead of name you could also use title. The name/title is what the browser will offer as the filename when saving the file to the user's hard disk.
In file.vuh, some callers put the desired filename after the slash following the object_id, so you can add the following to set your filename:
if {$anchor ne ""} {
set name $anchor
}
Now come the crucial four lines of code which actually do the redirect and instruct NGINX to serve the file:
ns_set put [ns_conn outputheaders] "X-Accel-Redirect" "/content-repository-content-files/$path"
ReturnHeaders "$mime_type"
ns_write ""
The first one makes sure the browser picks up the correct filename.
The second one is the redirect, which uses the path for the file location. NGINX takes the path, appends it to root and content-repository-content-files (see the location section above) and serves the file located there.
ReturnHeaders returns the two output headers along with the mime_type.
ns_write just finishes the response.
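For illustration, the set of headers that the backend hands to NGINX might look like the output of the following Python sketch. It rests on assumptions: accel_headers is a hypothetical helper, the filename is assumed to travel in a Content-Disposition header with an attachment disposition, and the path and filename values are invented.

```python
def accel_headers(path, name, mime_type):
    """Build the response headers for an X-Accel-Redirect reply (hypothetical helper)."""
    # Escape backslashes and double quotes so the quoted filename stays well formed.
    safe_name = name.replace("\\", "\\\\").replace('"', '\\"')
    return [
        # Assumption: the filename is conveyed via Content-Disposition.
        ("Content-Disposition", 'attachment; filename="%s"' % safe_name),
        # The internal URI that NGINX resolves against its internal location.
        ("X-Accel-Redirect", "/content-repository-content-files/" + path.lstrip("/")),
        ("Content-Type", mime_type),
    ]


headers = accel_headers("1/23/4567", "report.pdf", "application/pdf")
for key, value in headers:
    print("%s: %s" % (key, value))
```

The response body stays empty, since NGINX replaces it with the file named in X-Accel-Redirect.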
