Guide to PHP SecurityQuicksearchCalendar
|
Sunday, January 22. 2006mysql_real_escape_string() versus Prepared Statements
Chris has written a compelling piece about how the use of addslashes() for string escaping in MySQL queries can lead to SQL injection through the abuse of multibyte character sets. In his example he relies on addslashes() to convert an invalid multibyte sequence into a valid one, which also has an embedded ' that is not escaped. And in an ironic twist, the function intended to protect against SQL injection is used to actually trigger it.
The problem demonstrated, actually goes a bit further, which even makes the prescribed escaping mechanism, mysql_real_escape_string() prone to the same kind of issues affecting addslashes(). The main advantage of the mysql_real_escape_string() over addslashes() lies in the fact that it takes character set into account and thus is able to determine how to properly escape the data. For example, if GBK character set is being used, it will not convert an invalid multibyte sequence 0xbf27 (¿’) into 0xbf5c27 (¿\’ or in GBK a single valid multibyte character followed by a single quote). To determine the proper escaping methodology mysql_real_escape_string() needs to know the character set used, which is normally retrieved from the database connection cursor. Herein lies the “trick”. In MySQL there are two ways to change the character set, you can do it by changing in MySQL configuration file (my.cnf) by doing: CODE: [client] default-character-set=GBK Or you can change on a per-connection basis, which is a common practice done by people without admin level access to the server via the following query: CODE: SET CHARACTER SET 'GBK' The problem with the latter, is that while it most certainly modified the charset it didn’t let the escaping facilities know about it. Which means that mysql_real_escape_string() still works on the basis of the default charset, which if set to latin1 (common default) will make the function work in a manner identical to addslashes() for our purposes. Another word, 0xbf27 will be converted to 0xbf5c27 facilitating the SQL injection. Here is a brief proof of concept. PHP: So what can you do? The solution is to use prepared statements, which are supported by nearly all PHP database extensions with the notable exceptions of MySQL (ext/mysql) and SQLite2 (ext/sqlite). So, to be on the safe side, I'd recommend using the PDO interface to talks with those databases or in the case of MySQL using the newer MySQLi (ext/mysqli) extension. Those interfaces provide prepared statement support, which allows for separation between query structure and the query parameters. It should be noted that while PDO does emulated prepared statements for older versions of MySQL that do not support them natively, emulation is still prone to the same kind of issues demonstrated here and in Chris’ article. Therefore for security reasons you should definitely consider upgrading to a more modern version of MySQL and SQLite (SQLite 3). Trackbacks
mysql_real_escape_string() vs addslashes() vs Prepared Statements
Interesting posts by Chris and Ilia on their respective blogs. In The addslashes() Versus mysql_real_escape_string() Debate Chris starts discussing the Google XSS exploit and then goes on to explaining about character set encoding and how one particular character set allows...
Weblog: Jacques Marneweck's Blog
Tracked: Jan 22, 15:20
[ZF-1541] Character set option for Db adapters
Andrey did not provide any information about the problem witht this type of sql injections and while it is hard to find anything on that i provide a source: http://ilia.ws/archives/103-mysql_real_escape_string-versus-Prepared-Statements.html
Weblog: JIRA: Zend Framework
Tracked: Feb 25, 11:00 Comments
Display comments as
(Linear | Threaded)
So you're saying the mysql_real_escape_string() isn't 100% secure either?
Crikey, if that's true, then I'm willing to bet A LOT of scripts are "vulnerable" to this problem. Is there a fix that doesn't involve perpared statements? Perhaps a function that checks for this problem, and filters it? My charset/encoding knowledge is a bit limited, so I'd very much appreciate an answer. Thanks!
As Ilia points out, it only applies to situations where the script actually modifies the charset, for instance using SET CHARACTER SET. Personally, I've never used this functionality and if you haven't either you're fine.
That is precisely what the example demonstrates. The bottom line while the problem is serious, it would only affect people changing character sets from single-byte encodings to multibyte ones. As long as you stay away from multibyte encodings, with the exception of UTF8 you should be safe.
Ah, I see, thanks for the clarification, was a bit worried there.
Interesting article Ilia, thanks.
This sentence isn't clear to me though: "As long as you stay away from multibyte encodings, with the exception of UTF8 you should be safe" From what I understand from this sentence is: - as long as you do not use multibyte encodings, you should be safe - except when you use UTF8 But,... is utf 8 not pretty standard? When am I safe using addslashes and when with mysql_real_escape_string?
I think Ilia means to say that UTF-8 is also safe from this, despite being a multi-byte character set.
I finished my post with a note: "This type of attack is possible with any character encoding where there is a valid multi-byte character that ends in 0x5c." 0x5c is 92 in decimal, which is clearly in the lower 128. Because UTF-8 supports representing these characters with a single byte, this particular problem has already been addressed. Check out the table listed here: http://en.wikipedia.org/wiki/UTF-8 If you look at each of the formats in the UTF-8 column, you'll notice that the multi-byte formats all force every byte except the first follows the format 10xxxxxx. In other words, every subsequent byte has a value that is 128 or greater in decimal. I'm not sure if that clears up anything, but I hope so.
Just as we validate request data, the same should be done for database data. In the above example, we know a given username should only match one row by setting it as a primary key. So, only add username to query. First check that only one row is returned. Then check that the returned password matches the posted.
Why so complicated way ? Make md5 from both values and compare with md5 in DB. And they can put anything into those fields. For user login it's fine and working for me quite good.
I think that might differ a little bit. My technique has been to allow a regular username as I use that for printing out who the user is for an identity. As for logging in I use a hash only for the password unless there is a login by cookie which uses a salted hash value for the username and password (2 different complex hashes).
But as for all data it should always be checked, as in, check information even twice... One on the way in and one on the way out. Say your database server somehow was exposed... This is why there is always double checks. Application hardening should be a must.
Mine allow printable username too.
Maybe i describe it too simple. On login, user data are retrieved from db using md5 from login name and password in where clausule. Both values are stored id db as md5 hashes. If user is found, hash from several information is made and stored with timestamp into db. On each request, hash recreated and compared with one stored in db. If they don't match or user was inactive, he is logged out User data are stored in session (not pass/login name). Session id is regenerated for each request. For me the way you do check does not make sense. You retrieve row from db where name=$_POST['name'] and than compare $_POST['pass'] with value from db. Whats difference between doing both checks in query ? If database is compromised, they already have all informations, so there is no difference whether you compare password in DB or application.
Is input lenght limitation cannot protect us from sql injection?
"chr(0xbf)chr(0x27)' OR 1=1" string lenght is 26 .. we can fix it, example 20 .. Is there any other security problem from CSRF or something ? Thank You...
Limiting the input length may make the exploitation more difficult, but certainly not impossible.
bug found and fixed:
http://dev.mysql.com/doc/refman/5.0/en/news-5-0-22.html anyhow, be sure to know the difference between SET NAMES 'charset_name' SET CHARACTER SET charset_name http://dev.mysql.com/doc/refman/4.1/en/charset-connection.html
I never have any trouble with quotes in strings. I always use:
$data = str_replace("'", '`', $data); as soon as I get the data from an untrusted source. Even if I forget to change it back later, it still looks something like a quote mark. If it is a trusted source of mine, the string does not have a quote in it - period. Over the years it certainly seems that the data processing industry could have fixed this problem.
A project I was hired to help fortify is implemented on a web tier built with Linux, Apache, and PHP 5.2.4. It talks to a database tier using MS SQL Server. Is there an equivalent to mysql_real_escape_string for ms sql server? Being a hired hand on this project I don't have the option of changing the database tier to MySQL.
The code that binds the web tier to the application and database tiers is well encapsulated with the code that validates user-input data in one place and the database calls in another. Given that these two chunks of functionality are nicely encapsulated I am free to explore a new paradigm such as the prepared statement model. I must admit I have no experience with that paradigm but I am a fast learner TIA. I welcome your thoughts. Here's to a productive and secure 2008! -Rich
Since PHP 5.2.3, it is possible to use mysql_set_charset() which is respected by mysql_real_escape_string().
|
ArchivesCategoriesSyndicate This BlogBlog Administration |
|||||||||||||||||||||||||||||||||||||||||||||||||










Comments