Code

This portion of the site is devoted to software code and other topics of interest for a software engineer.

Running Python With Apache on Windows

Category: Code

Posted:

My day job has a Windows environment and avoids installing other OSes unless absolutely necessary. Windows is great on the desktop, but has many limitations when tasked with running a web site that relies heavily upon open source languages and products. Its limitations are front and center when trying to host a Python web application. Below are some of the gotchas that I’ve encountered and how I addressed them.

The Python GIL

Every Python developer should be familiar with the GIL (Global Interpreter Lock). If you’re not, start reading up on it. TL;DR? Only one thread in a process can execute Python code at a time, so threads of the same process end up blocking each other. By itself, the GIL is an easy beast to tame by spawning more processes instead of threads. On *nix OSes, processes are relatively lightweight and you can fork a process and continue on your merry way. Processes on Windows are heavier, and Windows does not support fork without installing more software. Cygwin and Windows Services for Unix are the two ways of making Windows behave more like *nix for specially compiled programs.
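To make the threads-versus-processes point concrete, here is a minimal sketch using only the standard library. The function names (cpu_bound, run_batch, compare) and the job sizes are my own illustrations, not anything from a real project, and it assumes a standard CPython build with the GIL enabled.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound(n):
    """Pure-Python loop; the running thread holds the GIL while it works."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_batch(executor_cls, jobs, workers=4):
    """Run cpu_bound over every job using the given executor class."""
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, jobs))


def compare(jobs=(2_000_000,) * 4):
    """Print wall-clock time for threads vs. processes on the same jobs."""
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        run_batch(executor_cls, jobs)
        elapsed = time.perf_counter() - start
        print(executor_cls.__name__, round(elapsed, 2), "seconds")
```

Call compare() from inside an if __name__ == "__main__": guard, which Windows needs because it starts worker processes by re-importing the script rather than forking. On a multi-core machine I would expect the process pool to finish noticeably sooner than the thread pool, though the exact numbers depend on the hardware.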

Apache MPM (Multi-Processing Modules)

Apache is a great HTTP server application that provides more functionality than most sites will ever need or use. It is also one of the few options that is designed to run as a service on Windows. The lighter weight options, nginx and lighttpd, are usually better for my needs, but both have issues with running as a Windows service.

Apache supports a few different Multi-Processing Modules; prefork is the default for *nix. On Windows, there is only one MPM, mpm_winnt. This MPM works great and follows the Windows idea of spawning threads when you want work done. Unfortunately, the GIL makes almost any Python website unusably slow when run on Windows with Apache. Most web applications are a bunch of IO (network, database, disk, etc.) with a small amount of CPU, yet all of the request threads still take turns holding the GIL. This basically turns Apache into a single-threaded web server. The prefork MPM does not experience this problem due to its use of processes instead of threads.

I observed that a page that takes about 1 second to generate by itself could take tens of seconds to finish if there were more than 2-3 overlapping requests. Each new request (in a thread) would get roughly equal time and slow down all previous threads. It was possible to block the site for minutes with as few as 10 requests.
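The "roughly equal time" behavior can be approximated with a simple sharing model: while k requests are active, each one progresses at 1/k of full speed. The sketch below is my own simplification (the function name and the model are assumptions, not measurements), and it ignores lock contention and context switching, so the real site behaved worse than this predicts.

```python
def finish_times(arrivals, work=1.0):
    """Return when each request finishes if active requests share one worker equally.

    arrivals: arrival time of each request, in seconds.
    work: seconds a request needs when it runs alone.
    """
    order = sorted(range(len(arrivals)), key=lambda i: arrivals[i])
    remaining = {}                      # request index -> seconds of work left
    finished = [None] * len(arrivals)
    now = 0.0
    nxt = 0                             # position in `order` of the next arrival
    while nxt < len(order) or remaining:
        if not remaining:
            # Nothing is running: idle until the next request arrives.
            now = max(now, arrivals[order[nxt]])
        # Admit every request that has arrived by now.
        while nxt < len(order) and arrivals[order[nxt]] <= now:
            remaining[order[nxt]] = work
            nxt += 1
        k = len(remaining)
        # Advance to whichever comes first: a completion or the next arrival.
        to_finish = min(remaining.values()) * k
        to_arrival = arrivals[order[nxt]] - now if nxt < len(order) else float("inf")
        dt = min(to_finish, to_arrival)
        for i in remaining:
            remaining[i] -= dt / k      # each request gets a 1/k share of dt
        now += dt
        for i in [i for i, left in remaining.items() if left <= 1e-12]:
            finished[i] = now
            del remaining[i]
    return finished
```

With ten requests arriving together, finish_times([0.0] * 10) puts every completion at the 10 second mark, so no visitor sees a page until all of the work is done. That is the best case for this setup; the delays I observed were longer.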

Faking a Python MPM

There is a way of configuring a Windows server so that it can serve a Python web application and avoid the GIL. A multi-process MPM is conceptually the same as a load balancer sitting in front of several web servers. An incoming request is routed to an individual web server to handle.

Apache as a Balancer

Apache can function as a load balancer if another option is not available. Here’s a configuration snippet that will equally balance requests among three Apache instances running on the same machine as the instance acting as the load balancer (reverse proxy). See mod_proxy documentation for the rest of the configuration directives that you will need to fully configure.

<Proxy balancer://cluster>
	BalancerMember http://192.168.0.10:9001 smax=3 max=10 ttl=120 route=www_1
	BalancerMember http://192.168.0.10:9002 smax=3 max=10 ttl=120 route=www_2
	BalancerMember http://192.168.0.10:9003 smax=3 max=10 ttl=120 route=www_3
</Proxy>

ProxyPass / balancer://cluster/
ProxyPassReverse / balancer://cluster/
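For anyone who wants to see the idea without Apache in the way, here is a rough Python sketch of handing requests to three workers in turn. It is only an illustration of equal distribution; it is not how mod_proxy_balancer is implemented, and make_picker is a made-up name. The routes and addresses are copied from the configuration above.

```python
from itertools import cycle

# Route name -> worker URL, mirroring the BalancerMember lines.
BACKENDS = {
    "www_1": "http://192.168.0.10:9001",
    "www_2": "http://192.168.0.10:9002",
    "www_3": "http://192.168.0.10:9003",
}


def make_picker(backends):
    """Return a function that yields (route, url) pairs in strict rotation."""
    routes = cycle(sorted(backends))

    def pick():
        route = next(routes)
        return route, backends[route]

    return pick
```

Each call to the returned pick() function moves on to the next worker and wraps around after the third, so over time every instance gets the same share of requests.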

Gotchas

Faking a server farm on a single machine introduces some new problems that need to be resolved. Thankfully, these are not that difficult once you are aware of them.

Lots-O-Logs

Every request will be logged by the load balancing Apache instance and the proxied instance that does the work. This is not necessarily a bad thing by itself. It is useful to know which Apache instance handled a specific request and also get the aggregate view (load balancer logs). Unless steps are taken, this will double the disk IO and space requirements.

The simple resolution is to disable all logging on the worker instances. This can be accomplished by using CustomLog and a conditional environment variable that is never set.

LogFormat " " empty
# Below will never output anything, but it will create an empty file
CustomLog "D:/logs/carme/apache/access-1.log" empty env=NOTHING_IS_LOGGED

Logging has now been reduced to a normal volume, but you will not know which instance handled the request. To regain that bit of information, you can add %{BALANCER_WORKER_ROUTE}e to the LogFormat of the load balancer. This will include whatever value is set for route= in the above BalancerMember configuration. E.g. www_1, www_2, or www_3.
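Once the route is in the balancer log, a few lines of Python can show how evenly the requests are spread. This sketch assumes %{BALANCER_WORKER_ROUTE}e was added as the last field of the LogFormat; if you put it elsewhere, the field index has to change. The function name is my own.

```python
from collections import Counter


def requests_per_route(lines):
    """Count log lines per route, assuming the route is the last field."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:                      # skip blank lines
            counts[fields[-1]] += 1
    return counts
```

Feed it an open log file, for example requests_per_route(open(path)) where path points at the balancer access log, and compare the counts for www_1, www_2, and www_3.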

Fixing IPs

The instances behind the load balancer, and the application code running on them, will see every request as if it is coming from the load balancing instance. This can be resolved with the Apache module mod_rpaf.
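If installing mod_rpaf is not an option, the same fix can be approximated inside a WSGI application. This is a different technique from mod_rpaf, not a description of how that module works. It relies on the X-Forwarded-For header that mod_proxy adds to proxied requests. The class name and the trusted address list are assumptions for illustration; use whatever address your balancer actually connects from.

```python
# Addresses the balancer may connect from (assumed for this sketch).
TRUSTED_PROXIES = {"192.168.0.10", "127.0.0.1"}


class ForwardedForMiddleware:
    """Restore the client IP from X-Forwarded-For for trusted proxies only."""

    def __init__(self, app, trusted=TRUSTED_PROXIES):
        self.app = app
        self.trusted = set(trusted)

    def __call__(self, environ, start_response):
        forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
        if environ.get("REMOTE_ADDR") in self.trusted and forwarded:
            # The right-most entry is the one appended by our own balancer.
            environ["REMOTE_ADDR"] = forwarded.split(",")[-1].strip()
        return self.app(environ, start_response)
```

Wrap the application object once, as in application = ForwardedForMiddleware(application). Requests that do not come from a trusted address are left untouched, so a visitor cannot spoof an IP by sending the header directly to a worker.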

Auto-enabling Textpattern Plugins

Categories: Code, Projects

Posted:

Upload plugin. View the code and help. Scroll. Click “submit”. Scroll. Click “No” to enable plugin. I’ve repeated this pattern way too often whenever I set up or update a Textpattern-based site. No more.

Textpattern added the ability for plugins to run some code when they are installed: PLUGIN_LIFECYCLE_NOTIFY. This was intended to let a plugin update database schemas, create default forms, or do any other setup work that the plugin only needs to do once.

Going forward, I plan on updating all of my plugins to enable themselves when installed. My first public plugin to gain this benefit is mem_self_register v0.9.9.

Release Notes:

  • added ability to customize new password reset mail message.
  • auto-enabled on install.
  • numerous fixes related to creating email messages.

How to auto-enable a plugin

Set $plugin['flags'].

$plugin['flags'] = PLUGIN_LIFECYCLE_NOTIFY;

Register the callback function.

if (@txpinterface == 'admin')
{
	register_callback('mem_self_auto_enable', 'plugin_lifecycle.mem_self_register', 'installed');
}

Provide the function that sets the status of the plugin in the database.

/** Automatically enable plugin when installed */
function mem_self_auto_enable($event, $step)
{
	$plugin = substr($event, strlen('plugin_lifecycle.'));
	$prefix = 'mem_self_register';
	if (strncmp($plugin, $prefix, strlen($prefix)) == 0)
	{
		safe_update('txp_plugin', "status = 1", "name = '" . doSlash($plugin) . "'");
	}
}

Facebook Decrapifier

Categories: Code, Projects

Posted:

No big surprise, I’m not a fan of the most recent changes to Facebook’s UI. I’m sure they spent hundreds of person-hours discussing, designing and implementing the latest changes, but it only took me 5 minutes with a Greasemonkey script to remove the worst bits of it. I present to you the Facebook Decrapifier. This script will hide the “byline”, random images and ads from all user profiles.

facebook decrapifier

To use this script, you will need to install the Greasemonkey (Firefox) extension and then click this link. I haven’t tested this with any of the Greasemonkey clones for other browsers, but they should work.

Moved from SVN to Hg

Categories: Code, Projects

Posted:

I have migrated my SVN repository over to Hg at http://bitbucket.org/Manfre/txp-plugins. I will leave the SVN repository online for a while, so any links pointing at it don’t die without warning. I’ll eventually put a forced redirect over to bitbucket. I’ve used bitbucket for managing my private plugins and it works really well for me.

A few of the big gains from the switch are an integrated issue tracker and a wiki to build up documentation about using the plugins. A single forum thread for a plugin has never been a good way of reporting and handling bugs.

You can report bugs, request features, download the latest releases and find documentation over on bitbucket. Any help with writing documentation and examples would be greatly appreciated.

C++ dynamic_cast

Categories: Code, Projects

Posted:

I just spent way too much time trying to figure out a work-related problem. A shared piece of code was consistently failing on the same line with a KERNEL32.DLL Exception 0xe0fd7363. The Visual Studio 6 debugger showed the proper information and the object was indeed valid.

derived_type *p = dynamic_cast<derived_type *>(pObj);

Searches on Google and MSDN yielded nothing useful. I finally asked my “Crazy Russian” friend, who enlightened me about RTTI: if you want to use a dynamic cast in C++ code, you need to turn RTTI on for the project.

How To

  1. Open Project Settings and select the project
  2. Select the C/C++ tab
  3. Change Category to “C++ Language”
  4. Check “Enable Run-Time Type Information (RTTI)”
  5. Rebuild All on the project

It makes perfect sense now that I know the answer, but this is something I figured the compiler would do automatically and I wouldn’t have to enable. I’m sure it would have been a few more days before I figured this out. I’m posting this for the “next guy”. Hopefully their searching will be more fruitful.
