Wednesday, June 6. 2012
Slides from PHP-GTA User Grouping Meeting
Posted by Ilia Alshanetsky in PHP, Talks at 08:36
Comments (4) Trackbacks (0)
The slides from my presentation at PHP-GTA about the Hidden Features of PHP are now available online and can be downloaded here: http://ilia.ws/files/php-toronto_hidden_features.pdf. Hopefully everyone in attendance learned something new about PHP, and many thanks to all the people whose questions helped make the discussion that much more interesting.
Sunday, June 3. 2012
Database connection fallback with PDO
Posted by Ilia Alshanetsky in PHP at 12:09
Comments (7) Trackbacks (0)
For our database connections we use PDO at work, and we've extended the class in PHP to offer some other convenience functionality and wrappers. One of the things I wanted to do recently was to allow the constructor of the PDO class to fail over to our backup database connection pool in the event the primary was not available. The idea was to do something along the lines of:
PHP:
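<?php
class DB extends PDO {
    public function __construct($dsn, $login, $pass, $backup_dsn) {
        try {
            parent::__construct($dsn, $login, $pass);
        } catch (Exception $e) {
            // Intended fallback; does not work, because PDO has already
            // destroyed the object by the time the exception is caught.
            parent::__construct($backup_dsn, $login, $pass);
        }
    }
}
?>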
Essentially the code would call PDO's own constructor; if that failed, an exception would be raised and caught by the exception handler, which would then attempt to connect to the backup database connection pool. Unfortunately this simple solution does not work, because when PDO's constructor fails to connect, it destroys the object. This means any attempt to use or access the object fails; even $this is equal to NULL, effectively making the 2nd constructor call pointless. While this behaviour makes sense in most cases, in a fallback scenario such as the one illustrated above it is undesirable.
To address this limitation I've written a small patch (http://ilia.ws/patch/pdo.txt) which introduces a PDO::ATTR_KEEP_CLASS_CONN_FAILURE configuration option that can be passed to PDO's constructor, telling it to keep the object alive after a failed attempt to connect to the database, allowing re-connection to be attempted. With this patch in place, the above code can be implemented as per the example below.
PHP:
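<?php
class DB extends PDO {
    public function __construct($dsn, $login, $pass, $backup_dsn) {
        try {
            // Requires the patch above: keep the object alive if the connection fails.
            parent::__construct($dsn, $login, $pass, array(PDO::ATTR_KEEP_CLASS_CONN_FAILURE => 1));
        } catch (Exception $e) {
            // The object still exists, so a 2nd connection attempt is possible.
            parent::__construct($backup_dsn, $login, $pass);
        }
    }
}
?>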
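Without the patch, a similar fail-over can be approximated outside the constructor, since a fresh object is created for every attempt. The following is only a sketch under that assumption, not the approach taken in this post: connect_with_fallback() and its optional $connect parameter are hypothetical names introduced for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of PDO or of the patch above): try each DSN
 * in order and return the first connection that succeeds.
 */
function connect_with_fallback(array $dsns, $login, $pass, $connect = null)
{
    if ($connect === null) {
        // Default connector: build a brand new PDO object per attempt, so a
        // failed attempt never leaves a destroyed object behind.
        $connect = function ($dsn, $login, $pass) {
            return new PDO($dsn, $login, $pass);
        };
    }

    $last = null;
    foreach ($dsns as $dsn) {
        try {
            return call_user_func($connect, $dsn, $login, $pass);
        } catch (Exception $e) {
            $last = $e; // remember the failure and move on to the next DSN
        }
    }

    if ($last === null) {
        throw new InvalidArgumentException('No DSN supplied');
    }
    throw $last; // every DSN failed: surface the most recent error
}
```

Calling connect_with_fallback(array($dsn, $backup_dsn), $login, $pass) would then return a connection to the first pool that accepts it. The trade-off is that the fallback logic lives in a helper rather than inside the constructor of the extended class.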
Saturday, March 3. 2012
Performance Analysis of isset() vs array_key_exists()
Posted by Ilia Alshanetsky in PHP at 09:23
Comments (19) Trackbacks (0)
At Confoo I had an interesting conversation with Guilherme Blanco regarding the fact that in Doctrine 2 they had a performance issue due to the usage of array_key_exists() and how it was significantly slower than isset(). His anecdotal example was that doing isset() took 0.5 seconds, while array_key_exists() for the same operation took 5 seconds! That seemed wrong, given that array_key_exists() simply does a hash table look-up to determine if the array key exists, and the only speed difference would be due to the fact that isset() is a language construct and does not have the overhead of function & parameter processing. So, I've decided to do a quick benchmark using a 5,000 element array.
PHP:
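<?php
// Build a 5,000 element array keyed by dictionary words.
$arr = array();
$i = 0; // initialise the counter to avoid an undefined variable notice
$fp = fopen("/usr/share/dict/words", "r");
while ($i < 5000 && ($w = fgets($fp))) {
    $arr[trim($w)] = ++$i;
}
fclose($fp);

// Time 100,000 isset() look-ups.
$s = microtime(1);
for ($i = 0; $i < 100000; $i++) {
    isset($arr['abracadabra']);
}
$e = microtime(1);
echo "Isset: ".($e - $s)."\n";

// Time 100,000 array_key_exists() look-ups of the same key.
$s = microtime(1);
for ($i = 0; $i < 100000; $i++) {
    array_key_exists('abracadabra', $arr);
}
$e = microtime(1);
echo "array_key_exists: ".($e - $s)."\n";
?>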
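Speed aside, the two calls are not interchangeable when a key exists but holds NULL. A minimal illustration of that behavioural difference, using a made-up $row array:

```php
<?php
// Made-up sample data: 'middle' exists as a key, but its value is NULL.
$row = array('name' => 'Ilia', 'middle' => null);

var_dump(isset($row['middle']));              // bool(false): key exists, value is NULL
var_dump(array_key_exists('middle', $row));   // bool(true):  only the key is checked
var_dump(isset($row['missing']));             // bool(false): no such key
var_dump(array_key_exists('missing', $row));  // bool(false): no such key
```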
The above benchmark, executed on PHP 5.4, shows that while isset() is 2.5 times faster, taking a mere 0.0219 seconds vs the 0.0549 seconds taken by array_key_exists(), both operations are extremely quick and take a fraction of a second even when doing 100k operations. Any "optimization" between the calls is purely within the super-micro-micro optimization realm and is not going to make any application measurably faster.
The bottom line is that if your application does not need to distinguish between an array key that does not exist and one whose value happens to be NULL, you should use isset() because it happens to be a little faster. However, if a NULL array key value and a non-existent array key are something you need to differentiate between, use array_key_exists() and don't worry about the performance: the function is extremely fast, even in the case of Doctrine 2, which apparently may make as many as 50,000 calls to the function per request in some cases.
Friday, March 2. 2012
Introduction to PHP 5.4 Slides
Posted by Ilia Alshanetsky in PHP, Talks at 04:59
Comments (4) Trackbacks (0)
My slides from the Confoo presentation on PHP 5.4 are up and can be viewed/downloaded from here: http://ilia.ws/files/confoo_php54.pdf. I look forward to everyone's feedback either on this blog or via Joind.in. And in case you didn't know, PHP 5.4 was released yesterday!
[Update March 3, 2012] Based on a great suggestion from Rasmus, I've updated the charset slide to clarify that the change introduced in 5.4 relates to the default charset used by the internal entities functions (htmlspecialchars, htmlentities, etc.), and that updating the default_charset INI setting is one of the changes you may need to make to account for it.
Wednesday, February 29. 2012
Introduction to PostgreSQL
Posted by Ilia Alshanetsky in PHP, Talks at 14:44
Comment (1) Trackbacks (0)
The slides from my "Introduction to PostgreSQL" talk at Confoo are now available for view/download and can be found here: http://ilia.ws/files/confoo_pgsql.pdf. Hopefully it will make people more interested in PostgreSQL, which is a great database system, and encourage them to take it into consideration when making their database platform decisions. For me personally, it was quite interesting, as it is one of the rare chances I get to speak about something that is not directly related to PHP, although I did sneak in a few PHP specific slides.
I would very much appreciate feedback from all who attended the talk, and any suggestions on how to make it better are always welcome. Please send me your comments via this blog or via Joind.in.
A big thanks to Bruce Momjian from Enterprise DB, who gave me some really good suggestions on improving the slides (already reflected in the PDF), and to Christophe Pettus from PostgreSQL Experts, Inc., whose original PostgreSQL Intro talk had inspired mine.
Wednesday, December 7. 2011
PHP's Output Buffering
While profiling our application I came across some rather strange memory usage by the ob_start() function. We do use ob_start() quite a bit to defer output of data, which is a common thing in many applications. What was unusual is that 16 calls to ob_start() ended up chewing through almost 700kb of memory; given that the data being buffered rarely exceeds 1-2kb, this was quite unexpected. I started looking at the C code of the ob_start() function and found this interesting bit of code inside php_start_ob_buffer():
initial_size = 40*1024;
block_size = 10*1024;
This directs PHP to pre-allocate 40kb of data for each ob_start() call and, when this proves to be insufficient, to increase it by 10kb each time. Ouch! PHP does allow you to say how much memory ob_start() can use, via the 2nd parameter to the function.
However, if you exceed that size, PHP will promptly flush the captured data to screen, which means that unless you are really good at predicting your buffer sizes or vastly overestimate them, there is a risk that the data will be dumped to screen by PHP if you use this option. Since I am not really good at guessing, I've decided to make a small, backwards compatible tweak to PHP's code that allows specification of custom buffer sizes, but still allows the buffer to grow if the initial size proves to be insufficient, ensuring that the data can be safely buffered.
This functionality is implemented through a change (see patch below) to the 1st parameter of the ob_start() function, which normally is used to provide the callback function. With the patch in place, the parameter can be a number, which defines the desired buffer size. With the patch, ob_start(1024) means that a 1kb buffer should be used and, when it is exceeded, PHP keeps allocating 1kb at a time to allow additional data to be stored. This solution does mean you cannot use custom, resizable buffer sizes with a callback function; however, it does provide a backwards (PHP API wise) compatible way of implementing the functionality in PHP 5.2 and 5.3. Here is a simple before & after example:
PHP:
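<?php
// Before (stock PHP): the 2nd parameter limits the buffer to 1024 bytes,
// so output beyond that size is flushed instead of being kept.
ob_start(null, 1024);
echo str_repeat("a", 1500);
var_dump(strlen(ob_get_clean()));
?>
PHP:
<?php
// After (patched PHP): start with a 1024 byte buffer that grows as needed.
ob_start(1024);
echo str_repeat("a", 1500);
var_dump(strlen(ob_get_clean()));
?>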
CODE:
Index: main/output.c
===================================================================
--- main/output.c (revision 320624)
+++ main/output.c (working copy)
@@ -155,10 +155,14 @@
         initial_size = (chunk_size*3/2);
         block_size = chunk_size/2;
     } else {
-        initial_size = 40*1024;
-        block_size = 10*1024;
+        if (output_handler && Z_TYPE_P(output_handler) == IS_LONG && !chunk_size) {
+            initial_size = block_size = Z_LVAL_P(output_handler);
+        } else {
+            initial_size = 40*1024;
+            block_size = 10*1024;
+        }
     }
-    return php_ob_init(initial_size, block_size, output_handler, chunk_size, erase TSRMLS_CC);
+    return php_ob_init(initial_size, block_size,
+        (output_handler && Z_TYPE_P(output_handler) != IS_LONG ? output_handler : NULL),
+        chunk_size, erase TSRMLS_CC);
 }
 /* }}} */
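To see what output buffers cost on a particular PHP build, the memory delta around a batch of ob_start() calls can be measured. This is only a rough sketch: measure_ob_memory() is a hypothetical helper written for illustration, and the figure it reports depends on the PHP version and build, so it may not match the numbers quoted in this post.

```php
<?php
/**
 * Hypothetical helper: report roughly how many bytes $count nested
 * ob_start() calls add to the script's memory usage.
 */
function measure_ob_memory($count)
{
    $baseLevel = ob_get_level();
    $before = memory_get_usage();

    for ($i = 0; $i < $count; $i++) {
        ob_start();
    }
    $after = memory_get_usage();

    // Discard only the buffers opened here, restoring the caller's state.
    while (ob_get_level() > $baseLevel) {
        ob_end_clean();
    }

    return $after - $before;
}

$bytes = measure_ob_memory(16);
printf("16 ob_start() calls used about %d bytes\n", $bytes);
```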
Tuesday, October 18. 2011
"Under the Hood" Slides
Posted by Ilia Alshanetsky in PHP, Talks at 19:22
Comment (1) Trackbacks (0)
My slides for my "Under the Hood" talk at ZendCon are now online and can be downloaded here.
Thanks to all the attendees, especially those who left feedback at Joind.In.
Wednesday, August 3. 2011
New PHP-Excel (0.9.5) was just released!
Posted by Ilia Alshanetsky in PHP at 12:49
Comments (6) Trackbacks (0)
I've just released a new version of the php-excel extension that exposes the new functionality offered by libxl 3.2.0.
The new functionality in this release includes the following:
- ExcelSheet::setPrintFit(int wPages, int hPages) that fits sheet width and sheet height to wPages and hPages respectively
- ExcelSheet::getPrintFit() that returns whether the fit to page option is enabled, and if so to what width & height
- ExcelSheet::getNamedRange(string name) that gets the named range coordinates by name, returns false if the range is not found
- ExcelSheet::getIndexRange(int index) that gets the named range coordinates by index, returns false if the range is not found
- ExcelSheet::namedRangeSize() that returns the number of named ranges in the sheet
- ExcelSheet::getVerPageBreak(int index) that returns the column with the vertical page break at position index
- ExcelSheet::getVerPageBreakSize() that returns the number of vertical page breaks in the sheet
- ExcelSheet::getHorPageBreak(int index) that returns the row with the horizontal page break at position index
- ExcelSheet::getHorPageBreakSize() that returns the number of horizontal page breaks in the sheet
- ExcelSheet::getPictureInfo(int index) that returns information about a workbook picture at position index in the worksheet
- ExcelSheet::getNumPictures() that returns the number of pictures in this worksheet
- ExcelBook::biffVersion() that returns the BIFF version of the binary file (used for xls format only)
- ExcelBook::getRefR1C1() that returns whether the R1C1 reference mode is active
- ExcelBook::setRefR1C1(bool active) that sets the R1C1 reference mode
- ExcelBook::getPicture(int picture_index) that returns a picture at position index
- ExcelBook::getNumPictures() that returns the number of pictures in this workbook
- ExcelSheet ExcelBook::insertSheet(int index, string name [, ExcelSheet sh]) that inserts a new sheet into this book at position index and returns the sheet handle. If the ExcelSheet parameter is missing, a new sheet will be created.
The source code & tar balls can be found at https://github.com/iliaal/php_excel
Wednesday, June 1. 2011
IPC: Hidden Features of PHP Slides
Posted by Ilia Alshanetsky in PHP, Talks at 01:44
Comment (1) Trackbacks (0)
The slides from my talk on the Hidden Features of PHP are now available and can be downloaded from here: ipc_2011_hidden_features.pdf
Thanks to all the people who attended the talk; I am really looking forward to your feedback via Joind.in.
Tuesday, May 31. 2011
Memcached, the Better Memcache Interface Slides
Posted by Ilia Alshanetsky in PHP, Talks at 22:58
Comments (7) Trackbacks (0)
The slides from my talk on the Memcached extension are now available and can be downloaded from here: ipc_Memcached_2011.pdf
Thanks to all the people who attended the talk; I am really looking forward to your feedback via Joind.in.
Friday, March 11. 2011
ConFoo - Hidden PHP Features
Posted by Ilia Alshanetsky in PHP, Talks at 12:48
Comments (5) Trackbacks (0)
My slides for the "Hidden PHP Features" talk at ConFoo are now available at http://ilia.ws/files/confoo_2011_hidden_features.pdf.
If you were at the talk, please give me your feedback/suggestions at: http://joind.in/2905.
Thursday, March 10. 2011
Confoo - APC & Memcached Talk
Posted by Ilia Alshanetsky in PHP, Talks at 05:44
Comments (0) Trackbacks (0)
Thanks to all the people who attended my talk on APC and Memcached yesterday at Confoo. The slides are now available to download at: http://ilia.ws/files/confoo_APC_MEM2011.pdf
If you have any feedback or comments, they would be much appreciated via Joind.In (http://joind.in/2806).
Tuesday, January 18. 2011
PHP Excel Extension 0.9.1
The 0.9.1 version of the Excel extension was released and is now available for download. This is mostly a bug fix release, with a number of contributions by Rob Gagnon. The 2 main fixes relate to the detection of custom formatted numeric fields, which were incorrectly detected as dates, and to the readRow()/readCol() methods, which had a bug when the 2nd and 3rd parameters were supplied, causing the last row/column not to be read. Additionally, a getSheetByName() method was introduced that allows locating a sheet by its name in either case sensitive or insensitive form.
GitHub: http://github.com/iliaal/php_excel/
Source: http://github.com/downloads/iliaal/php_excel/php-excel-0.9.1.tar.bz2
Wednesday, December 22. 2010
ISP Popularity by Domain Count
Posted by Ilia Alshanetsky in PHP, Stuff at 12:06
Comments (2) Trackback (1)
The past two information gathering runs showed that GoDaddy is the world’s largest ISP, but I was curious who else falls into the “Top ISP” category as determined by consumer shopping habits. To do this I’ve used my resolved IP database of 124 million domains and an ISP database from MaxMind.
The results are pretty interesting, and they clearly show that a small number of ISPs are definitely doing something right, which is causing many consumers to vote with their dollars in those ISPs’ favor. As usual the information is shown in graph form. To filter the data down to just the large providers I’ve set a minimum of 100,000 domains, leaving me with just 122 ISPs. The image below shows the break-down of the Top 25. If you click on it, you will be able to expand the chart to 50, 100 or the entire list in the form of a bar chart, or explore the pie chart that includes %s.
Since we already know GoDaddy is #1, I will skip over them and focus on the next largest 3. At #2, we have The Planet, a Texas based data-center provider with 6.1 million domains (4.91% of total). I should mention that The Planet had recently merged with Softlayer Technologies, #8 on our list, which would further expand their prominence by another 1.8 million domains. At #3 we have 1&1 Internet, a German ISP with 5 million domains (4.04%). And finally at #4 we have Internap, an Atlanta based ISP with 2.88 million domains (2.32%).
In total, the 25 largest ISPs represent 52.3% of all the domains, which is quite impressive since overall there are 31,173 distinct ISPs. Even within the top 25 things are not entirely even: only the first 15 can claim over 1 million domains. Expanding the ISP list to the Top 50 expands the domain coverage to 59.67%, not a whole lot. However, once the entire list of ISPs with >100k domains is considered, we are looking at 68.9% representation. This means that 2/3 of the entire internet is effectively managed by only 122 companies. Kinda scary, actually.
In an effort to validate the popularity of the ISPs, I’ve also decided to look at the IP address breakdown (how many domains per IP) and which ISPs the Top IPs belong to. This should identify the ISPs whose business is predominantly the result of parked domains.
The results for the Top 25 IP addresses are shown in the table below.
When it comes to IP addresses, GoDaddy is by far the most frugal, which probably relates to the fact that they do a lot of domain parking. Both #1 and #2 on the list belong to GoDaddy, representing 12.9 million and 6 million domains respectively. I should mention that #4 with 1.9 million domains and #6 with another 1.1 million domains also belong to GoDaddy. Based on this information it would seem that GoDaddy’s prominence in the ISP space is predominantly based on parked domains, which take up 22 million of the 26.4 million being hosted with them. This can probably be attributed to GoDaddy’s extremely aggressive pricing when it comes to buying domains.
The 3rd most popular IP address belongs to Oversee.net, with just over 2 million domains. Given that Oversee.net only has 2.57 million domains under management, it would seem that most of them are parked domains, the same situation as with GoDaddy. A couple of other big IPs are worth noting. One is 64.95.64.197, which belongs to Internap, with 1.31 million domains resolving to it. Given that Internap hosts 2.88 million domains, it seems just shy of 1/2 of their domain base is parked. Another “winner” is 216.21.239.197, which belongs to Register.com, with 669k domains, representing nearly the entirety of the 703k domains at that ISP. Unsurprisingly Network Solutions is in the same boat: of the 1.32 million domains under management, 1.15 million are parked on 3 IP addresses.
On a general note, the 124 million domains are hosted on just 5.39 million IP addresses, which raises interesting questions about IPv4 exhaustion; clearly domain hosting is not the reason. So what the heck are people using those public IPs for? If they are being used by Internet Providers to provide internet to consumers, one would think that more extensive use of NAT would certainly alleviate the IPv4 shortage for quite some time.
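The per-IP breakdown described here boils down to counting how many domains resolve to each address. A minimal sketch of that counting step, assuming the resolved data is available as a domain => IP array; the sample domains and documentation-range addresses below are made up and are not taken from the actual data set.

```php
<?php
// Made-up sample data: domain => resolved IP (documentation-range addresses).
$resolved = array(
    'example-a.test' => '192.0.2.10',
    'example-b.test' => '192.0.2.10',
    'example-c.test' => '192.0.2.10',
    'example-d.test' => '198.51.100.7',
    'example-e.test' => '198.51.100.7',
    'example-f.test' => '203.0.113.5',
);

// Count how many domains resolve to each IP, busiest address first.
$perIp = array_count_values($resolved);
arsort($perIp);

// Print each IP with its domain count and share of all resolved domains.
$total = count($resolved);
foreach ($perIp as $ip => $count) {
    printf("%-14s %d domains (%.1f%%)\n", $ip, $count, 100 * $count / $total);
}
```

An address with an unusually high count is a candidate for domain parking, which is the heuristic used in the analysis.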
There are some 81,164 IPs that have over 100 domains resolving to them; combined, their total domain count reaches 87.4 million, 70% of all registered domains. Since it is safe to assume that in most cases virtual hosting won’t have more than 200 sites per IP, then based on the gathered data, 81.4 million domains are parked on 37,865 IPs. This in turn means that 2/3 (65.4%) of all domains are not actually in use. That number is very similar to the % of domains hosted by the world’s Top ISPs, so perhaps things are not so bad after all and the non-parked domains are more evenly distributed than the initial numbers might suggest.
Tuesday, December 21. 2010
Domain Distribution by City
Posted by Ilia Alshanetsky in PHP, Stuff at 08:04
Comments (4) Trackback (1)
As part of my on-going domain informatics coverage, I am now publishing some additional information that I’ve been able to gather over the last few days.
I am making available two additional geographic charts that break down the domain distribution by top world cities. The first chart, a preview of which can be seen below (click to see the full, browse-able/zoomable version), shows the Top 150 cities by domain distribution. These cities represent a total of 91.3% of some 102 million domains that could be resolved to a city level.
The most popular city in the world, with 26.4 million domains calling it home, is Scottsdale, Arizona in the United States. This is not entirely surprising, given that it is the hometown of GoDaddy, the world’s largest domain provider/hosting company. It also means that Arizona’s win in the US state domain count contest, with 26.7 million domains in our previous round of statistics, is almost entirely due to GoDaddy. The 2nd largest city is San Francisco, with a respectable 14.3 million domains, which represents over 1/2 of the 24.3 million domains hosted in California. And the 3rd place goes to Houston, Texas with just shy of 6.1 million domains.
Overall there are only 15 cities in the world that can claim to host over 1 million domains, and all of them are found in the US. The first non-US city is actually Toronto, Canada at #19 with 850 thousand domains, closely followed by Beijing, China with 845 thousand domains. The smallest city on the Top 150 list is Oklahoma City (capital of Oklahoma, US) with just 43.4 thousand domains. That seems tiny compared to the beginning of the list, but on a global country scale it would actually make it into the Top 50 at #49, with almost twice as many domains as Iran (23.4 thousand domains), which would now be displaced to 50th place.
To give a slightly better view of domain concentration, I am publishing a dynamic cluster map that groups cities together by geographic proximity. Zooming in on the map will break the larger clusters down into smaller ones, eventually resolving all the way to the individual cities.
Due to the better visualization of markers on this map, I’ve expanded the city list to the Top 400 cities, which takes us down to as few as 8,600 domains and provides a slightly more worldly view. Click on the map below to see the dynamic version.