Monday, May 10. 2004
PHP Optimization Tricks
There are a number of tricks you can use to squeeze the last bit of performance out of your scripts. These tricks won't make your applications much faster, but they can give you that little edge in performance you may be looking for. More importantly, they may give you insight into how PHP's internals work, allowing you to write code that the Zend Engine can execute more efficiently. Please keep in mind that these are not the first optimizations you should perform. There are far easier tricks with a bigger performance payoff; however, once those are exhausted and you don't feel like turning to C, these may be tricks you want to consider. So, without further ado...
1) When working with strings, you often need to check that a string is of a certain length, and you would understandably want to use the strlen() function. This function is pretty quick, since it does not perform any calculation but merely returns the already known length of the string stored in the zval structure (the internal C struct used to store variables in PHP). However, because strlen() is a function, it is still somewhat slow: the function call requires several operations, such as lowercasing the name and a hashtable lookup, followed by the execution of the function itself. In some instances you can improve the speed of your code by using an isset() trick.
Ex.
    if (strlen($foo) < 5) { echo "Foo is too short"; }
vs.
    if (!isset($foo{4})) { echo "Foo is too short"; }

Calling isset() happens to be faster than strlen() because, unlike strlen(), isset() is a language construct and not a function, meaning that its execution does not require function lookups and lowercasing. This means you have virtually no overhead on top of the actual code that determines the string's length.

2) When incrementing or decrementing the value of a variable, $i++ happens to be a tad slower than ++$i. This is something PHP specific and does not apply to other languages, so don't go modifying your C or Java code thinking it'll suddenly become faster; it won't. ++$i happens to be faster in PHP because instead of the 4 opcodes used for $i++ you only need 3. Post-incrementation actually causes the creation of a temporary variable that is then incremented, while pre-incrementation increases the original value directly. This is one of the optimizations that opcode optimizers, such as Zend's PHP optimizer, perform. It is still a good idea to keep it in mind, since not all opcode optimizers perform this optimization and there are plenty of ISPs and servers running without an opcode optimizer.

3) When it comes to printing text to the screen, PHP has so many ways to do it that not many users even know all of them. This tends to result in people using the output methods they are already familiar with from other languages. While this is certainly an understandable approach, it is often not the best one as far as performance is concerned.

print vs echo
Even though both of these output mechanisms are language constructs, if you benchmark the two you will quickly discover that print is slower than echo. The reason is quite simple: print returns a value (always 1), while echo simply prints the text and nothing more. Since in most cases (I haven't seen one yet) this return value is not necessary and is almost never used, it is pointless and simply adds unnecessary overhead.

printf
Using printf() is slow for a multitude of reasons, and I would strongly discourage its usage unless you absolutely need the functionality this function offers. Unlike print and echo, printf() is a function, with the associated function execution overhead. Moreover, printf() is designed to support various formatting schemes that for the most part are not needed in a language that is typeless and will automatically do the necessary type conversions. To handle formatting, printf() needs to scan the specified string for special formatting codes that are to be replaced with variables. As you can probably imagine, that is quite slow and rather inefficient.

heredoc
This output method comes to PHP from Perl, and like most features adopted from other languages it's not very friendly as far as performance is concerned. While this method allows you to easily output large chunks of text while preserving things like newlines, and even allows for variable handling inside the text block, it is quite slow and there are better ways to do that. Performance-wise it is just marginally faster than printf(), yet it does not offer nearly as much functionality.

?> <?
When you need to output a large or even a medium-sized static bit of text, it is faster and simpler to put it outside of PHP. This makes PHP's parser effectively skip over this bit of text and output it as is, without any overhead. You should be careful, however, not to use this for many small strings in between PHP code, as multiple context switches between PHP and plain text will eat away at the performance gained by not having PHP print the text via one of its functions or constructs.

4) Many scripts tend to rely on regular expressions to validate the input specified by the user. While validating input is a superb idea, doing so via regular expressions can be quite slow. In many cases the process of validation merely involves checking the source string against a certain character list such as A-Z or 0-9, etc. Instead of using a regex, in many instances you can use the ctype extension (enabled by default since PHP 4.2.0) to do the same. The ctype extension offers a series of function wrappers around C's is*() functions, which check whether a particular character is within a certain range. Unlike the C functions, which can only work on one character at a time, the PHP functions can operate on entire strings and are far faster than the equivalent regular expressions.

Ex.
    preg_match('!^[0-9]+\z!', $foo);
vs
    ctype_digit($foo);

5) Another common operation in PHP scripts is array searching. This process can be quite slow, as regular search mechanisms such as in_array() or a manual implementation work by iterating through the entire array. This can be quite a performance hit if you are searching through a large array or need to perform the searches frequently. So what can you do? Well, you can use a trick that relies upon the way the Zend Engine stores array data. Internally, arrays are stored in hash tables, where the array element's key is the hash table key used to find the data, and the result is the value associated with that key. Since hashtable lookups are quite fast, you can simplify array searching by making the data you intend to search through the key of the array; then searching for the data is as simple as $value = isset($foo[$bar]) ? $foo[$bar] : NULL;. This searching mechanism is way faster than manual array iteration, even though having string keys may be more memory intensive than using simple numeric keys.

Ex.
    $keys = array("apples", "oranges", "mangoes", "tomatoes", "pickles");
    if (in_array('mangoes', $keys)) { ... }
vs
    $keys = array("apples" => 1, "oranges" => 1, "mangoes" => 1, "tomatoes" => 1, "pickles" => 1);
    if (isset($keys['mangoes'])) { ... }

The bottom search mechanism is roughly 3 times faster.

If you know of any additional optimization tricks, let me know.

Comments
Just a small correction: print always returns 1, not the length of the text printed. (http://www.faqts.com/knowledge_base/view.phtml/aid/1/fid/40)
But echo is still faster because it has no return value.
Thanks for the update, I've corrected the comment.
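For anyone who wants to see the return value for themselves, here is a tiny sketch. The variable name $result is only for illustration.

```php
<?php
// print is a language construct that can be used as an expression.
// It always evaluates to 1, regardless of what was printed.
$result = print "hello\n";
var_dump($result);

// echo is a statement and yields no value, so something like
// "$x = echo 'hi';" is a parse error and is left out of this file.
```

Because print yields a value it can appear inside a larger expression, which is the one thing echo cannot do.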
I have doubts about topic 5), the hash table efficiency. Could you tell me where I can read more specific information about how PHP works with arrays internally?
There is no documentation on the matter; if you doubt the results, run your own benchmarks.
Could you rewrite topic 5 (hash table efficiency) please? Perhaps in simpler terms cos I've read it twice and still could not understand.
I'm still quite new to PHP, but I feel I've learnt a lot from this blog entry of yours. If it's too troublesome then it's okay, I'd understand.
IMO you forgot to mention why post-increment is useful. Of course it has to save the current value of the variable, because that value can be assigned to another variable before the increment.
However, I haven't needed this since my C days (how time goes by).
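To make the point about the saved value concrete, here is a minimal sketch of the two forms side by side. The variable names are illustrative only.

```php
<?php
// Post-increment: the expression yields the old value, then $i is incremented.
$i = 5;
$a = $i++;   // $a receives the value $i had before the increment

// Pre-increment: $j is incremented first, and the expression yields the new value.
$j = 5;
$b = ++$j;   // $b receives the value $j has after the increment

echo "post: a=$a i=$i\n";
echo "pre:  b=$b j=$j\n";
```

When the value of the expression is not used at all, as in the third part of a typical for loop, the two forms behave the same and only the cost differs.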
Nice write-up...
I think you should add a little something about single vs. double quotes. -Philippe
topic 5:
What is faster: if (isset($keys['mangoes'])) { ... } or if (array_key_exists('mangoes', $keys)) { ... }?
if (isset($keys['mangoes'])) { ... }
would be faster than if (array_key_exists('mangoes', $keys)) because #1 you don't have the function overhead and #2 you don't have as much argument parsing overhead.
Not necessarily -- the lookup times don't diverge until you've got a very considerable amount of data in your array.
Imagine having 2-3 entries in your array. In this case, it could take more time to hash the values and perform the lookup than it would to perform a simple linear search. O(n) vs. O(1) on average.
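Speed aside, the two checks discussed in this thread are not interchangeable when a key holds NULL. A minimal sketch, using a made-up $keys array:

```php
<?php
$keys = array('apples' => 1, 'mangoes' => null);

var_dump(isset($keys['apples']));              // key exists and the value is not null
var_dump(isset($keys['mangoes']));             // key exists, but isset() treats null as "not set"
var_dump(array_key_exists('mangoes', $keys));  // only looks at the key, ignores the value
var_dump(array_key_exists('pickles', $keys));  // key is absent
```

So if you use array keys as a lookup set, store a non-null value such as 1 or true, as the examples in the article do.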
6) single quotes are faster than double quotes
7) if (42 == $foo) is faster than if ($foo == 42)
ad 7) tried it. They're both taking the same time.
ad 8) http://www.oreilly.com/catalog/regex/ This book explains regex and why a POSIX regex is slower.
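Since regular expressions came up: when comparing a regex with ctype_digit() from tip 4, keep in mind that an unanchored pattern such as ![0-9]+! only tests whether the string contains a digit somewhere. A like-for-like comparison needs anchors. The sketch below uses made-up inputs and collects the three results per input in a hypothetical $rows array.

```php
<?php
$inputs = array('12345', '12a45', '', "123\n");
$rows = array();

foreach ($inputs as $foo) {
    $unanchored = preg_match('![0-9]+!', $foo) === 1;     // true if any run of digits occurs
    $anchored   = preg_match('!^[0-9]+\z!', $foo) === 1;  // true only if the whole string is digits
    $ctype      = ctype_digit($foo);                      // true only for a non-empty all-digit string
    $rows[]     = array($foo, $unanchored, $anchored, $ctype);
}

foreach ($rows as $row) {
    echo var_export($row[0], true), ': ',
         var_export($row[1], true), ' ',
         var_export($row[2], true), ' ',
         var_export($row[3], true), "\n";
}
```

The anchored pattern and ctype_digit() agree on all four inputs, while the unanchored pattern also accepts '12a45'.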
As long as people are adding things like ' vs ", here are a few more:
* ++$i is faster than $i++
* true is faster than TRUE
This is because when looking for constants PHP first does a hash lookup for the name as written. Since these names are always stored lowercased, using the lowercase form means you avoid needing 2 hash lookups.
What about creating arrays dynamically? Is there a faster method than $array[$item->Key] = $item to implement a (keyed) set?
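One way to build such a lookup set from an existing list is array_flip(). This is only a sketch with illustrative data, and it makes no claim about being faster than assigning the keys one by one; benchmark it against your own code.

```php
<?php
// Illustrative data; $list stands in for whatever values you need to look up.
$list = array('apples', 'oranges', 'mangoes', 'oranges');

// array_flip() swaps values and keys, so each distinct value becomes a key.
// The duplicate 'oranges' collapses into a single key.
$set = array_flip($list);

// Adding one more member by hand, in the style of the question above.
$set['pickles'] = true;

var_dump(isset($set['mangoes']));   // member
var_dump(isset($set['tomatoes']));  // not a member
var_dump(count($set));              // number of distinct members
```

array_flip() only works when the values are integers or strings, since those are the only valid key types.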
I did some checking on for, while and do-while.
you can see the result at: http://cosminb.blogspot.com/2004/09/performance-tweaking-for-vs-while-vs.html
7) if (42 == $foo) is faster than if ($foo == 42)
This recommendation doesn't come from speed. It's to make sure you don't make a stupid typo that ends up as: if ($foo = 42). Using the (value == $variable) style, you'll get an error message instead of an if statement that is always true.
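A short sketch of the typo being described, with made-up variables $foo and $bar. The literal-first version of the typo cannot be included as live code, because it stops the whole file from parsing.

```php
<?php
$foo = 7;

// Typo: a single "=" assigns 42 to $foo, so the condition is always true.
if ($foo = 42) {
    echo "branch taken, \$foo is now $foo\n";
}

// With the literal first, the same typo reads "if (42 = $foo)".
// That is a parse error, so it is left commented out to keep this file runnable.
// if (42 = $foo) { ... }

// The correctly typed literal-first comparison behaves like any other ==.
$bar = 7;
if (42 == $bar) {
    echo "not reached\n";
} else {
    echo "\$bar is still $bar\n";
}
```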
This site is fantastic for easy to assimilate tips, but the idiot spammers abusing this comment system are just that... idiot spammers.
Respect to the PHP programmers... make sure to benchmark when testing:
    $timeparts = explode(" ", microtime());
    $starttime = $timeparts[1] . substr($timeparts[0], 1);
    // --insert code here to benchmark--
    $timeparts = explode(" ", microtime());
    $endtime = $timeparts[1] . substr($timeparts[0], 1);
    print "" . $buildtime . "Time to execute: " . bcsub($endtime, $starttime, 6) . " seconds!\n";
I have been using that within my functions and page builds; it's helped fine-tune every ounce of speed. Don't forget useful tools like ob_start() and pear::Cache when needed... they can help a lot as well.
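On PHP 5 and later the same measurement can be written more compactly, because microtime(true) returns the time as a float. The helper below is a hypothetical sketch, not code from the comment above: the name time_it() and the choice of 100000 iterations are assumptions, and the closure syntax needs PHP 5.3 or newer.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns the elapsed
// wall-clock time in seconds as a float.
function time_it($fn, $iterations)
{
    $start = microtime(true);   // float seconds since the Unix epoch
    for ($n = 0; $n < $iterations; ++$n) {
        $fn();
    }
    return microtime(true) - $start;
}

$foo = 'abc';
$elapsed = time_it(function () use ($foo) {
    return strlen($foo) < 5;    // the code under test goes here
}, 100000);

echo 'Time to execute: ' . number_format($elapsed, 6) . " seconds\n";
```

Repeating the code under test many times matters for tricks like the ones in this post, since a single run is far too short to measure reliably.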
re: "Since in most cases (haven't seen one yet) this status is not necessary and is almost never used it is pointless and simply adds unnecessary overhead."
I've found it convenient to use print while debugging:
    if ($a == $b) ... elseif ($a == $c) ... //etc
becomes (for debugging only):
    if ($a == $b && print('b')) ... elseif ($a == $c && print('c')) ... //etc
echo doesn't work here, but print does.
S
I found using 1 (for TRUE) and 0 (for FALSE) to be considerably faster than using TRUE and FALSE. I thought that this would be worth mentioning.
Another useful optimization technique for PHP would be changing how you use 'for' loops. Most people use the obligatory (assuming $j is an array):
    for ($i = 0; $i < count($j); $i++) { ... }
There are two obvious performance hits here. The first, as mentioned in your blog, is the post-increment. You can either use ++$i or change it to '$i = $i + 1'. The second is the count() call in the middle of the 'for' loop, which should be moved out to avoid multiple calls to count(). So, the above example could easily be written in a new fashion:
    for ($i = 0, $k = count($j); $i < $k; $i = $i + 1) { ... }
I have found that in my code writing I am able to shave a few microseconds off its execution; on bigger, more robust applications, those microseconds have added up to whole seconds! Thank you Ilia for even more tips for my code writing experiences!
The count($j) doesn't necessarily perform the function call to count every time through the loop.
I can't quote authoritatively here because it has been a while, but I believe that count is interpreted and replaced with the value before the loop runs. If in doubt, feel free to benchmark it. And if you shave microseconds off, there is a good chance you are experiencing anomalies. Also, I would say as a matter of good programming practice that this probably isn't the place to put a count anyway, but I do it because it cuts down on variable assignment while still being readable.
In PHP, a count() inside the control block of a for loop will be executed on EVERY loop iteration.
It's even faster if you eliminate the call to count() AND the explicit use of the counter by using a foreach loop in place of the for loop.
As a spinoff of your example: foreach ($j as $i) { ... } However, a change of variable names is recommended, as both $i and $j typically refer to local counting variables.
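The three loop styles from this thread can be compared with a small sketch. The $size closure is a hypothetical stand-in for count() that records how often it is called; it is there only to make the repeated evaluation visible, and it needs PHP 5.3 or newer.

```php
<?php
$j = array('a', 'b', 'c', 'd');

// Hypothetical wrapper around count() that counts its own invocations.
$calls = 0;
$size = function (array $items) use (&$calls) {
    $calls++;
    return count($items);
};

// 1) Size expression in the condition: evaluated before every iteration,
//    including the final check that ends the loop.
for ($i = 0; $i < $size($j); ++$i) {
    // loop body
}
$callsInCondition = $calls;

// 2) Size hoisted into the initialiser: evaluated once.
$calls = 0;
for ($i = 0, $k = $size($j); $i < $k; ++$i) {
    // loop body
}
$callsHoisted = $calls;

// 3) foreach needs neither a counter nor a size.
$seen = array();
foreach ($j as $item) {
    $seen[] = $item;
}

echo "calls with size in condition: $callsInCondition\n";
echo "calls with size hoisted: $callsHoisted\n";
```

For the four-element array this reports five calls for the first loop and one for the second, which is the behaviour described above for count() in the control block.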
I haven't tried these tricks yet, but I found them very interesting. I will try all of them to improve the performance of my PHP code.
thanks
Very interesting tricks, but they're too complex for me. I will analyze this later, but the idea of improving the performance of PHP code is good. Thank you.
Thanks for the tips...
One thing I noticed: you repeatedly used the word "then" when you should have used "than". For example:
    "Calling isset() happens to be faster then strlen()"
    "$i++ happens to be a tad slower then ++$i."
There were about 5 instances of this in your write-up.
Nice and extremely interesting. However, sometimes readability suffers. Remember, the majority of the lifecycle of software is about maintenance.
I mean, this is very clear:
    if (strlen($foo) < 5) { echo "Foo is too short"; }
Yet the purpose of the following takes more time for a human to understand:
    if (!isset($foo{4})) { echo "Foo is too short"; }
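The readability concern is easy to demonstrate, because the offset is simple to get wrong. String offsets start at 0, so the isset() form that matches strlen($foo) < 5 tests offset 4, the fifth character. The sketch below wraps both checks in hypothetical helper functions purely for comparison (a function call would of course defeat the speed trick itself), and uses square brackets because the curly-brace offset syntax was removed in PHP 8.0.

```php
<?php
// Hypothetical helpers for comparing the two checks side by side.
function too_short_strlen($foo)
{
    return strlen($foo) < 5;
}

function too_short_isset($foo)
{
    return !isset($foo[4]);   // offset 4 is the fifth character
}

foreach (array('', 'abcd', 'abcde', 'abcdef') as $s) {
    echo str_pad("'$s'", 10),
         var_export(too_short_strlen($s), true), ' ',
         var_export(too_short_isset($s), true), "\n";
}
```

Both helpers agree on every input here; testing offset 5 instead would report the five-character string as too short.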
In regards to the strings being used as keys to speed and facilitate the search of arrays, there can be an issue.
For example, strings that contain quotes, accented characters in languages such as French, or spaces, hyphens or underscores. What would be the best way to proceed from there? Store the text in the values instead of the keys? Thanks.
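As far as the array itself is concerned, any string can serve as a key, including the characters asked about here. The keys in this sketch are made up for illustration. Keys are compared byte for byte, so the lookup string must use the same case and the same encoding as the stored key.

```php
<?php
// Made-up keys covering quotes, accents, spaces, hyphens and underscores.
$keys = array(
    'say "hi"'       => 1,
    "l'été"          => 1,
    'pomme de terre' => 1,
    'semi-dry'       => 1,
    'snake_case'     => 1,
);

var_dump(isset($keys["l'été"]));
var_dump(isset($keys['pomme de terre']));
var_dump(isset($keys['Semi-Dry']));   // keys are case-sensitive

// One caveat: a string that looks like a decimal integer is stored as an int key.
$keys['42'] = 1;
var_dump(array_key_exists(42, $keys));
```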
Well, here is another optimization of sorts: when writing your code, make sure that the conditions of your if-statements are true in most cases. If execution has to jump to the false branch, it will lose some speed, since it has to look for that branch before it can execute its code.
Just a thought. I admit this is not specific to PHP per se; it applies to most current programming languages.
First I want to thank you for this tutorial; it was very useful!
And I have one question: Do you know if it is better to use single quotes ('') instead of double quotes ("") in PHP?
Single quotes are interpreted faster. Double-quoted strings are parsed; single-quoted strings are not.
"example text number: $num" The complete string will be parsed, and things like variables and special characters will be replaced. The difference in speed only applies at compile time, which means that if it is used in a loop there'll hardly be any difference.
This is wrong and unsupported by any consistent benchmarks. Even though "" are interpreted, the opcodes generated for "" and '' are the same.
I know what you are saying, but in practice I have personally seen a huge difference. I converted almost the entire PHP-Nuke CMS over to using single quotes, including all the language defines (yes, I'd like to change those to a language array), and with no other performance "tweaks", everyone who is using this particular release is saying it's much faster.
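Whatever the speed difference turns out to be on your setup, the two quote styles do not produce the same string when the text contains variables or escape sequences. A minimal sketch, reusing the $num example from this thread:

```php
<?php
$num = 7;

$single = 'example text number: $num\n';           // no variable or \n processing
$double = "example text number: $num\n";           // $num and \n are both expanded
$concat = 'example text number: ' . $num . "\n";   // same result built by concatenation

var_dump($single === $double);   // different strings
var_dump($double === $concat);   // identical strings
echo $single, "\n";
echo $double;
```

So a blanket search-and-replace from double to single quotes is only safe for strings that contain neither variables nor escapes.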
foreach is the fastest way to loop over an array in PHP5, whereas fetching array_keys first and then doing a for loop is the fastest way to loop an array.
Thank you Thank you Thank you
I'll be checking back here often, please keep the main text up to date. I'd love to get some shell scripts together for use in SVN to check and replace code with equivalent but quicker code, and also to strip out comments (purely for the live site). Again, thank you.
I don't believe the cat is faster than the ball. This is the test that I did:
1) I wrote 2 PHP files through Bash, a file with a simple array and a file with an associative array, this way:
    ## t1.php
    echo "" >> t1.php
    ## t2.php
    echo "" >> t2.php
2) Then I used this command to measure the elapsed time:
    time php t1.php && time php t2.php
3) And this is the result:
    1
    real 0m0.033s
    user 0m0.020s
    sys 0m0.016s
    1
    real 0m0.034s
    user 0m0.020s
    sys 0m0.012s
So, is this right?