Zend Framework Q&A

Posted 15:34, 9/5/2008, in Web

Last week Zend held a Q&A for the Zend Framework, which gave a good insight into the areas they're working on and the direction the framework is heading. Audio is now available (direct mp3 link) - there were a few technical issues (varying microphone levels etc.), but it's an interesting listen.

PHP version compatibility

One thing discussed was ZF's compatibility with previous versions of PHP. The framework is already a little unusual in that it requires PHP 5.x. Most PHP frameworks still work with PHP 4, but requiring PHP 5 lets ZF take advantage of the OO improvements that came with it.

Backward compatibility has been both a blessing and a curse for PHP. The PHP team have done pretty well to ensure that for the most part, new releases of PHP don't break any existing scripts. I have plenty of scripts written back in the days of PHP3 that still work perfectly well.

For this reason, it's been easy for open source projects to ensure their scripts are compatible with older versions, and the majority list PHP 4.3 as their minimum requirement. Unfortunately this has greatly slowed the adoption of new PHP releases - hosts have few reasons to spend the time upgrading if existing scripts that work perfectly well don't use any of the new features.

There were some figures released recently about the adoption rate of PHP5, and they're really quite scary. PHP 5.0 came out in July 2004, almost four years ago, yet the majority of hosts (I think it was about 70%) are still running 4.x, which will no longer be supported after this summer.

It's a vicious circle that will only be broken if new PHP releases contain either massive performance increases to tempt web hosts or some amazing new features to tempt the big open source projects.

ZF's take on this was that they will continue to take advantage of features in new PHP releases where appropriate, but at the same time they can't just update their minimum version requirement whenever a new version of PHP is released, as vendors (Red Hat was given as an example) tend to lag behind the official releases somewhat.

In many ways it seems the development of the framework is driving updates to PHP itself. Some of the features coming in the next release (PHP 5.3), such as namespaces and late static binding, have clear uses in the framework, and the team have said they will take advantage of these.

Performance evaluation

Also discussed was performance of the framework. I agree with the sentiment that hardware is cheap and developer time is not, so spending ages trying to tweak the performance of an application before it becomes a problem can be a waste of time. This is certainly something I've been guilty of in the past. However, I'd also say that in general, performance of a framework should be more of a priority than with a typical web app, since any applications built on top of the framework can only be as fast as the framework itself.

The team are planning a performance review at some point in the future, and they'll be using the same tools developed for testing the performance of PHP itself, so it'll be interesting to see any tweaks made as a result.

Overall the team seem to be looking at the right areas and making the right decisions, and so the future of the framework is looking bright.

Zend Framework in Ubuntu 8.04

Posted 14:42, 27/4/2008, in Web

With the release of Hardy, the Ubuntu repositories now include a package for the Zend Framework, so you can have just one copy of the library on your server that is automatically updated. To install and use this:

sudo apt-get install zend-framework

then add it to the include_path for your app, in a .htaccess file:

php_value include_path '.:/usr/share/php/libzend-framework-php'

You can then require Zend Framework classes as you need them, or use the Zend autoloader to pull them in automatically when they are instantiated.

MySQL - weighting fields in LIKE searches

Posted 13:32, 22/4/2008, in Web

A client asked us today if we could do some work on their search feature to weight the ordering of results if the search term appeared in the title. The site is running a fairly old version of one of our products which uses a very basic LIKE search (the current version of the same system uses a Lucene-based system). The example search term they gave was a 3-letter word, which pretty much rules out MySQL fulltext (words shorter than ft_min_word_len, 4 by default, aren't indexed, and when you start reducing it fulltext searching gets pretty slow), so we came up with a way of weighting standard LIKE searches:

SELECT title, description,
  IF(title LIKE '%who%', 3, 0) +
  IF(description LIKE '%who%', 2, 0) AS weight
FROM `products`
WHERE title LIKE '%who%' OR description LIKE '%who%'
ORDER BY weight DESC

What this does is create an arbitrary 'weight' value based purely on which field(s) the search term appears in. This value is used to order the products.

Yes, it's slow, as it's doing two LIKE searches for every field, for every search term. But in this relatively small dataset (~200 products), it still runs in a fraction of a second so it's okay as a temporary solution.
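To try the weighting idea out away from the live database, here is a minimal sketch in Python with an in-memory SQLite table. It is an illustration rather than the code we deployed: SQLite has no IF(), so CASE WHEN stands in for it, the sample rows are made up, and the weighted_search helper is a hypothetical name. The search term is passed as a bound parameter rather than pasted into the SQL.

```python
import sqlite3


def weighted_search(conn, term):
    """Return (title, description, weight) rows, highest weight first."""
    pattern = f"%{term}%"
    sql = """
        SELECT title, description,
               (CASE WHEN title LIKE :p THEN 3 ELSE 0 END) +
               (CASE WHEN description LIKE :p THEN 2 ELSE 0 END) AS weight
        FROM products
        WHERE title LIKE :p OR description LIKE :p
        ORDER BY weight DESC, title ASC
    """
    return conn.execute(sql, {"p": pattern}).fetchall()


# Made-up sample data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (title TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("Doctor Who box set", "Complete series"),   # title only
        ("Guess Who", "The game of who is who"),     # title and description
        ("Board game", "Find out who did it"),       # description only
        ("Chess set", "Wooden pieces"),              # no match
    ],
)

results = weighted_search(conn, "who")
for title, description, weight in results:
    print(weight, title)
```

Note that a search term containing % or _ would still be treated as a LIKE wildcard here; escaping those is left out to keep the sketch short.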

Interesting uses of the WoW Armoury

Posted 19:15, 8/9/2007, in Web

A while ago Blizzard launched the World of Warcraft Armory - a site which shows detailed information about in-game characters, including their gear, guild, use of talent points, reputation with the various in-game factions and a number of other things. It's basically a pretty Web frontend to part of Blizzard's huge internal game database.

What makes it interesting is that the site makes extensive use of XSLT: view the source of any of the character pages and what you get is raw XML. This makes it very easy for crawlers to grab and parse this data, and a number of sites have sprung up that are doing this in interesting ways:

  • wowrankings - this site adds up the item values of all items each player currently has equipped to give them a total 'score'. Higher score means better gear (generally), so by searching this site you can see lists of the best equipped players on your realm, or the best equipped players of your class (globally).
  • wowjutsu - this site has looked at where each item a player has equipped comes from and used this to build up a list of the bosses each guild must have killed. This data has been used to produce guild rankings, which you can view either globally, by region, or by realm.
  • ArmoryMusings - this guy has set up a blog purely to showcase some of the info he's got from spidering the armoury. In particular, he regularly produces graphs to highlight how the various classes fare in the different arena bands.

On the negative side, the armoury makes pretty heavy use of Ajax, much of it questionable. And rather than using one of the well known Ajax libraries to avoid the headache of browser compatibility issues, it's all ground-up JS. Consequently the site can be horrendously slow, and in busy periods your browser can just sit there hanging for several minutes whilst it tries to make sense of everything that's thrown at it.

OpenID and CMS integration

Posted 21:48, 17/6/2007, in Web

Short of a couple of minor display bugs, I've now finished adding OpenID support to Fabric. Adding support for OpenID logins was pretty easy, and I did this a few weeks ago. But tying this in with the existing user account system took a bit more thought. I'd be interested to know if there are any 'best practices' for this sort of thing, particularly in terms of the options you present to a user when they log in with an OpenID for the first time.

As I see it, when integrating OpenID into a system which already has conventional user accounts, there are two approaches:

  1. Keep your OpenID users completely separate from your normal users, so users logged in with OpenID can perform some basic functions on the frontend, but to do anything on the backend they need to create a proper account and log in with a conventional username/email and password.
  2. Link OpenID logins in with user accounts, so when a user logs in with an OpenID for the first time you create a user account for them, using data from simple registration (sreg) if available.

The first option would be easier to implement, but if users have to create a normal account to do anything substantial then they have a new username and password to remember, and you've lost the main benefit of OpenID. So I went for option two.

The next decision was how transparent to make this account creation step. When a user logs in with an OpenID for the first time, if the data you get back from sreg contains all the info you need to create a user account, this whole process could be transparent. But what if you didn't get any data back from sreg, or what if the user wanted to use a different email address on your site? So, when you log in to this site for the first time with an OpenID, you're presented with an account creation form with the data from sreg pre-filled. Just hit submit if it's all there, or you can change it first. You only have to do it once.
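The pre-fill step can be sketched roughly as follows. This is not Fabric's code: build_signup_form and REQUIRED_FIELDS are hypothetical names, and I'm assuming the account needs only the sreg nickname and email fields. The point is that a missing or partial sreg response still produces a usable form, and the user always gets the chance to edit it.

```python
# Assumption: the fields a new account needs. "nickname" and "email" are
# field names from the Simple Registration extension.
REQUIRED_FIELDS = ("nickname", "email")


def build_signup_form(sreg):
    """Pre-fill the account form from sreg data and list what is missing.

    `sreg` is a dict of whatever the OpenID provider returned. It may be
    empty or None if the provider sent nothing back.
    """
    sreg = sreg or {}
    values = {field: (sreg.get(field) or "").strip() for field in REQUIRED_FIELDS}
    missing = [field for field in REQUIRED_FIELDS if not values[field]]
    return {"values": values, "missing": missing}


# Provider returned a nickname but no email address.
form = build_signup_form({"nickname": "alice"})
print(form["values"], form["missing"])
```

The form is shown even when nothing is missing, so the user can swap in a different email address before the account is created.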

The next problem was what to do with email addresses. When a user creates a conventional account in Fabric, their email address is validated (they have to click on an activation link that is emailed to them). If we get an email address from sreg, can we guarantee that they own this? If not, do we validate this before the user can do anything on the site (thus putting back in the barriers to entry OpenID tries to remove)? Or do we just not validate email addresses in accounts created by OpenID? I'm not sure what the answer is to this one. For now if OpenID is enabled on a site, email addresses are not validated; but this is a less than ideal solution.

So that's the basic functionality in place. At some point I'll have to add a way for users to link more than one OpenID to an account (preferably at login, so they don't end up creating duplicate accounts on the system), but this will require some more thought.

Localising content with GeoIP and Apache

Posted 17:13, 24/5/2007, in Web

Today I came across a page on MaxMind's site which shows some of the things you can do using mod_geoip (the GeoIP Apache module). Using the GeoIP data in applications isn't rocket science anyway, but the Apache module allows you to do some fairly funky stuff all server-side. E.g. say you had a download site with a few mirrors in different parts of the world, you could automatically redirect users to the correct one with something like:

GeoIPEnable On
GeoIPDBFile /path/to/GeoIP.dat

RewriteEngine on
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^US$
RewriteRule ^(.*)$ http://us-mirror.mydomain.com$1 [L]
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} ^GB$
RewriteRule ^(.*)$ http://uk-mirror.mydomain.com$1 [L]

and so on.
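If you'd rather do the same thing in the application, the country code that mod_geoip sets in GEOIP_COUNTRY_CODE can be read from the request environment and looked up in a table. A minimal sketch, assuming a CGI/WSGI-style environ dict; the mirror_url helper is a hypothetical name and the hostnames are the example ones from the rules above:

```python
# Country code -> mirror base URL (example hostnames from the rewrite rules).
MIRRORS = {
    "US": "http://us-mirror.mydomain.com",
    "GB": "http://uk-mirror.mydomain.com",
}


def mirror_url(environ, path):
    """Return the mirror URL to redirect to, or None to serve locally."""
    code = (environ.get("GEOIP_COUNTRY_CODE") or "").upper()
    base = MIRRORS.get(code)
    if base is None:
        return None  # unknown or missing country: no redirect
    if not path.startswith("/"):
        path = "/" + path
    return base + path


print(mirror_url({"GEOIP_COUNTRY_CODE": "GB"}, "/files/app.zip"))
```

Adding a mirror is then a one-line change to the table rather than another RewriteCond/RewriteRule pair.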

MSNbot still overspidering

Posted 19:58, 17/3/2007, in Web

I've been monitoring the traffic to Archivist quite a bit recently. Archivist is a publicly searchable mailing list archive: you subscribe the system's email address to your mailing list and all posts automagically appear on the site (threaded, and searchable).

Because Archivist is basically a text-only site, the search engine robots love it, and the majority of the site's traffic comes from search engine referrals. And because of the archive nature of the site, most of the pages on there never change; so we send appropriate Last-Modified HTTP headers to aid caching and help keep the bandwidth usage down.
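For anyone unfamiliar with how this saves bandwidth: a well-behaved client sends the date back in an If-Modified-Since header, and the server can answer 304 Not Modified with no body. Below is a minimal sketch of that decision in Python. It is not Archivist's actual code, and conditional_response is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime;
    `if_modified_since` is the raw request header value, if any.
    """
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: treat as unconditional
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            return 304, headers  # client copy is current, send no body

    return 200, headers


page_changed = datetime(2007, 3, 17, 19, 58, 0, tzinfo=timezone.utc)
status, headers = conditional_response(page_changed)
repeat_status, _ = conditional_response(page_changed, headers["Last-Modified"])
print(status, repeat_status)
```

A robot that honours the header only pays for the headers on repeat visits, which is why request counts and bandwidth can diverge so much between crawlers.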

Unfortunately, unlike all the other major robots, MSNBot completely ignores these and is constantly indexing the same content over and over again. It doesn't take long to find proof of this:

Screenshot of Archivist’s robot activity

So, over this time period (April '07) MSN has done only about 50% more requests than Googlebot, but has used more than six times the bandwidth. (The number after the + is the number of hits to the robots.txt file, for those who aren't familiar with AWStats.)

At the same time MSN provides just 0.4% of the site's search engine referrals (Google is 97.6%). With numbers like this, it's hard to justify not blocking MSN completely.

Elsewhere

External URLs/articles that may be of interest:

ZF Blog Application tutorial

This is the first part of a multi-part tutorial on creating a blogging app using ZF. It serves as a great introduction for people trying to get started with the framework, and looks at application structure, MVC, templating and ACL.

Drupal in the Enterprise

An interesting look at why the author doesn't feel Drupal is ready for the enterprise. Drupal certainly appears to be one of the better CMSes around, so some of these criticisms seem a little harsh when you compare it to some of its inferior competitors. However many can also be applied to PHP web apps in general.

BBC admits Linux usage figures were off

Recently the BBC said one of the reasons they didn't push for a Linux compatible version of their iPlayer was that their website only received about 600 visits a week from Linux users. They've now admitted this figure is closer to 100,000. Just a bit off then.

Online ad tracking 'opt out' list

Lobbyists in the US are trying to get an 'opt out' list created for online advertising, so users could choose not to have their browsing habits tracked. The irony is, for this to work technically, some system somewhere would have to create a huge list of IP addresses and personal information, which seems to defeat the point.