Search engines have long been a good helper for people trying to find sensitive information or vulnerabilities on the web. When you have a few billion documents indexed, it is inevitable that some things that should remain private inadvertently end up in public directories and get indexed. Then it's just a matter of writing a sufficiently creative search query to find that data.
There are even sites that aggregate "interesting" search queries designed to quickly locate sensitive data such as
from "Johny" that has queries to find everything from old vulnerable software to credit card numbers, etc...
There have also been attempts to identify things like SQL injection and XSS by locating sites that collect common forms of input and then checking to see whether said input is validated. A good example of this can be found on
, who used Google to generate statistics to identify the frequency of SQL injections.
But this approach does not really show you the full extent of the exploit; it just indicates the presence of SQL injection, which can then be explored further, mostly through trial and error. Well, no more, thanks to Google Code Search, which as the name suggests indexes publicly available source code. This means that now you can not only easily find exploits, but also get the full context of the code, allowing for much nastier exploitation. Let's give it a shot.
To start things off, let's look at our common friend, XSS (Cross Site Scripting):
lang:php (echo|print).*\$_(GET|POST|COOKIE|REQUEST)
The search above will find all instances of code where PHP input variables collected from untrusted sources such as GET/POST/COOKIE/REQUEST are output to the screen via echo or print. A fairly good trawl, about 17,500 results. Mind you, not every site is vulnerable, since they could be filtering the data and keeping it within the super-global; however, a brief spot check shows that more than half of the code bits found do not actually do that.
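For contrast, here is a minimal sketch of what the non-vulnerable variant could look like. The helper name e() is my own invention (not a PHP built-in), and the request value is simulated so the snippet runs from the command line; UTF-8 output is an assumption.

```php
<?php
// Hypothetical helper (not part of PHP): escape a value before it goes into HTML.
function e(string $value): string
{
    // ENT_QUOTES escapes both single and double quotes; UTF-8 is assumed here.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Simulated hostile request data so the sketch is self-contained.
$_GET['name'] = '<script>alert(1)</script>';

// prints: Hello, &lt;script&gt;alert(1)&lt;/script&gt;
echo 'Hello, ', e($_GET['name'] ?? ''), "\n";
```

Note that a line like this still matches the search query, which is exactly why the raw result count overstates the number of real holes.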
Next, let's take a glance at another old favorite, SQL injection:
lang:php query\(.*\$_(GET|POST|COOKIE|REQUEST).*\)
In this instance we look for the same unsafe input sources used as parameters to SQL queries, which are often executed via functions ending with "query". Of course you could search for something more specific like mysql_query() to focus on MySQL users. Another modification could be to take commonly used database wrappers like PEAR DB and ADODB and look for their query execution methods. However, even this simplistic search shows about 3,000 results.
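One common way to avoid this class of bug is to bind user input as a parameter instead of concatenating it into the SQL string. The sketch below uses PDO with an in-memory SQLite database purely so it is self-contained (it assumes the pdo_sqlite extension is available); the users table and the find_users_by_name() function are made up for illustration.

```php
<?php
// In-memory SQLite database, used only to keep the sketch runnable on its own.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hypothetical lookup: the user-supplied value is bound, never pasted into the SQL.
function find_users_by_name(PDO $db, string $name): array
{
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Simulated injection attempt; it is treated as a literal (non-matching) name.
$_GET['name'] = "alice' OR '1'='1";
$rows = find_users_by_name($db, $_GET['name']);

// prints: 0 row(s) matched
echo count($rows), " row(s) matched\n";
```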
Perhaps one of the more dangerous security exploits is remote code execution, let's check it out:
lang:php (include|require)\s*(\(|\s).*\$_(GET|POST|COOKIE|REQUEST)
Wow!!! I don't know whether to be scared or impressed by the fact that there are nearly 14,000 results for what amounts to a remote shell in dozens of pieces of software. I can only hope that people running this code have disabled allow_url_fopen, otherwise they had better do it quick. The only silver lining is that it would appear far fewer people are willing to trust eval blindly, and searches for eval([user_input]) do not reveal a significant number of results:
lang:php \s+eval\s*\(\s*\$_(GET|POST|COOKIE|REQUEST) only 4 entries were found.
We could of course also try searching for preg_replace() with /e flag exploits, but that would require a far trickier regex than I want to write.
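Going back to the include/require case, one simple way out is to never let the request choose a file path directly, and instead map a small set of known names to files. This is only a sketch: the page names, the paths and the resolve_page() function are all hypothetical.

```php
<?php
// Hypothetical map of page names to files; names and paths are illustrative only.
const ALLOWED_PAGES = [
    'home'  => 'pages/home.php',
    'about' => 'pages/about.php',
];

// Unknown names fall back to a fixed default instead of reaching include.
function resolve_page(string $requested): string
{
    return ALLOWED_PAGES[$requested] ?? ALLOWED_PAGES['home'];
}

// Simulated hostile request pointing at a remote file.
$_GET['page'] = 'http://evil.example/shell.txt';

// prints: pages/home.php
echo resolve_page($_GET['page'] ?? 'home'), "\n";

// A real script would now do: include resolve_page($_GET['page'] ?? 'home');
```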
(Additional "fun" queries)
One more common mistake is to include user input inside header() calls, thereby allowing header injection, cache poisoning and other fun attacks. Let's check how frequent those problems are. First I did a search for code where user input appears inside redirect headers, probably one of the most common instances where injection is possible.
lang:php header\s*\("Location:.*\$_(GET|POST|COOKIE|REQUEST).*\)
Not bad, 2,000 hits. Fortunately, sending \n is no longer possible with new versions of PHP, which does reduce the amount of damage that can be caused: you can no longer inject arbitrary headers into most of the code found. However, there are plenty of gems such as this:
PHP:
<?php
header("Location: " . $_POST['referrer']);
header("Location: {$_GET['return']}");
?>
These allow you to send the user to any page of your choosing and, if their session id happens to be in the URL, happily extract it from the referrer string.
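A rough sketch of how such a redirect could be tightened up: accept only site-local paths and fall back to a default for anything else. The safe_redirect_target() function is hypothetical, and whether "local paths only" is the right policy depends on the application.

```php
<?php
// Hypothetical helper: accept only site-local paths as redirect targets.
function safe_redirect_target(string $target, string $default = '/'): string
{
    $isLocalPath = strncmp($target, '/', 1) === 0   // must start with "/"
        && strncmp($target, '//', 2) !== 0          // but not "//host", which leaves the site
        && strpos($target, '\\') === false          // no backslashes
        && !preg_match('/[\x00-\x1f]/', $target);   // no control characters (CR, LF, ...)

    return $isLocalPath ? $target : $default;
}

// Simulated hostile request.
$_GET['return'] = 'http://evil.example/phish';
$location = safe_redirect_target($_GET['return'] ?? '/');

// prints: Location: /
echo "Location: $location\n"; // a real script would call header() here instead
```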
If we drop the Location limiter and focus on any user input within header(), as can be seen in this search query:
lang:php header\s*\(.*\$_(GET|POST|COOKIE|REQUEST).*\)
We get a fairly impressive 11,000+ result set, where you can find plenty of instances where totally arbitrary headers can be injected by a hostile user.
There are also plenty of developers who seem to forget that passing user input to command line execution functions is a bad idea, especially when the command itself can be provided by the user:
lang:php (system|popen|shell_exec|exec)\s*\(\$_(GET|POST|COOKIE|REQUEST).*\) thankfully it does not appear to be a common case.
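If user input really has to reach the command line, a safer pattern is to hard-code the program and pass the input only as a quoted argument. In this sketch the build_command() function and the ls example are my own; escapeshellarg() is the real PHP function, and its exact quoting differs between platforms.

```php
<?php
// Hypothetical example: the program is fixed in code, only its argument is user-supplied.
function build_command(string $userArg): string
{
    // escapeshellarg() quotes the value so the shell treats it as a single argument;
    // the "--" tells ls that no further options follow.
    return 'ls -l -- ' . escapeshellarg($userArg);
}

// Simulated hostile request trying to chain a second command.
$_GET['dir'] = 'uploads; rm -rf /';

// Only printed here; a real script would pass the string to exec() or similar.
echo build_command($_GET['dir']), "\n";
```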
There are also plenty of other things we could search for, like use of $_SERVER values such as PHP_SELF, PATH_INFO, HTTP_USER_AGENT, QUERY_STRING and many others. I am sure you can craft your own Google queries to find those. You can also search for old style input sources such as HTTP_GET_VARS and the like, so the number of interesting queries is endless. And let's not forget about $_FILES, which can also be misused if passed around without proper validation.
So what does it all mean?
Well, for one, developers now have another tool (if grep is not to your liking) to examine their code for common mistakes, which hopefully translates to safer code for all. It also means that it is now easier for others to locate common mistakes in indexed code, so if you are unsure about its security worthiness you may want to think twice before putting your code online, or at least before putting it somewhere visible to search engines.
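You don't need a search engine to run the same checks on your own code. Below is a rough sketch that applies a few of the queries from this post to a string of source code with preg_match(). The labels, the scan_source() function and the sample code are made up for illustration, and like the searches themselves it flags candidates, not confirmed holes.

```php
<?php
// Patterns taken from the search queries in this post; the labels are my own.
const PATTERNS = [
    'xss'     => '/(echo|print).*\$_(GET|POST|COOKIE|REQUEST)/',
    'sql'     => '/query\(.*\$_(GET|POST|COOKIE|REQUEST).*\)/',
    'include' => '/(include|require)\s*(\(|\s).*\$_(GET|POST|COOKIE|REQUEST)/',
    'header'  => '/header\s*\(.*\$_(GET|POST|COOKIE|REQUEST).*\)/',
];

// Hypothetical helper: returns [line number => [labels]] for every suspicious line.
function scan_source(string $source): array
{
    $hits = [];
    foreach (preg_split('/\R/', $source) as $i => $line) {
        foreach (PATTERNS as $label => $regex) {
            if (preg_match($regex, $line) === 1) {
                $hits[$i + 1][] = $label; // report 1-based line numbers
            }
        }
    }
    return $hits;
}

// Made-up sample input containing three suspicious lines and one harmless one.
$sample = <<<'CODE'
<?php
echo $_GET['name'];
$res = mysql_query("SELECT * FROM t WHERE id = " . $_POST['id']);
include $_GET['page'] . '.php';
echo "static text";
CODE;

print_r(scan_source($sample));
```

For the sample above this flags line 2 as xss, line 3 as sql and line 4 as include, and leaves the static echo alone.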
I can only hope that the presence of this tool will make developers pay closer attention to the security of their applications, because with Google (and eventually probably Yahoo as well) on their case there is nowhere to hide.