Hidden Features Of Perl, PHP, Javascript, C, C++, C#, Java, Ruby, Python, And Others [Collection Of Incredibly Useful Lists]
Sunday, June 21st, 2009
Updated: June 23rd, 2009
Introduction
StackOverflow is an amazing site for coding questions. It was created by Joel Spolsky of joelonsoftware.com, Jeff Atwood of codinghorror.com, and some other incredibly smart guys who truly care about user experience. I have been a total fan of SO since it went mainstream and it's now a borderline addiction (you can see my StackOverflow badge on the right sidebar).
The Story
Update 6/21/09 9:30pm PST: This server is currently under very heavy load (10-200), even with caching plugins enabled. Please bear with me as I try to resolve the situation.
Feel free to bookmark this page and return to it later when the fires have been put out.
Update 6/21/09 11:37pm PST: I think I've got the situation under control now. The load is between 0 and 3 now and pages load relatively fast. I will be posting about the getting redditted/delicioused experience later.
Update 6/23/09 12:06am PST: Added jQuery, Greasemonkey, Ruby on Rails, and Objective-C, broke databases into their own section, and sorted everything alphabetically.
Update 6/23/09 7:10pm PST: Added Scala, Lua, TCL, F#, Regex, and HTTP.
So, one day someone at StackOverflow started a "Hidden features of" post about a famous language (I don't feel like finding out which one was first exactly), and it turned out to be so popular that other posts in the same series started popping up.
Such questions were quickly turned into community wikis, for the purposes of harvesting and organizing information coming from the best developers on the planet and voted by users of the site. There are literally hundreds of answers, sorted by votes.
Hidden Features Of
Programming Languages
Hidden features of ASP.NET
Hidden features of C
Hidden features of C++
Hidden features of C#
Hidden features of D
Hidden features of Delphi
Hidden Features of F#
Hidden features of Java
Hidden features of JavaScript
Hidden features of Haskell
Hidden features of Lua
Hidden features of Objective-C
Hidden features of Perl
Hidden features of PHP
Hidden features of Python
Hidden features of Ruby
Hidden Features of Ruby on Rails
Hidden features of Scala
Hidden Features of TCL
Hidden features of VB.Net
Databases
Hidden features of MySQL
Hidden features of Oracle
Hidden features of PostgreSQL
Hidden features of SQL Server
Other
Hidden features of Bash - also see my bash cheatsheet.
Hidden features of CSS
Hidden features of Eclipse
Hidden features of Greasemonkey
Hidden features of HTML
Hidden features of HTTP
Hidden features of jQuery
Hidden features of mod_rewrite
Hidden Features of RegEx
Hidden features of Visual Studio (2005-2008)
I will try to maintain this list, adding new languages that join the series as I find them. Now go learn something new!
[MySQL] Deleting/Updating Rows Common To 2 Tables - Speed And Slave Lag Considerations
Monday, May 11th, 2009
Introduction
A question I recently saw on Stack Overflow titled Faster way to delete matching [database] rows? prompted me to organize my thoughts and observations on the subject and quickly jot them down here.
Here is the brief description of the task: say, you have 2 MySQL tables a and b. The tables contain the same type of data, for example log entries. Now you want to delete all or a subset of the entries in table a that exist in table b.
Solutions Suggested By Others
DELETE FROM a WHERE EXISTS (SELECT b.id FROM b WHERE b.id = a.id);
DELETE a FROM a INNER JOIN b on a.id=b.id;
DELETE FROM a WHERE id IN (SELECT id FROM b)
The Problem With Suggested Solutions
Solutions above are all fine if the tables are quite small and the SELECT/JOIN is fast. However, in large scale situations with replication, these queries could potentially lock up the tables from writes and severely aggravate slave lag because, as I mentioned in the MySQL Slave Lag (Delay) Explained And 7 Ways To Battle It post, replication is single-threaded.
Thus, if a single UPDATE/DELETE query takes a considerable amount of time, when it propagates to the slaves, they will be stuck executing it and doing nothing else, lagging behind more and more.
My Thoughts And Solution
I have personally dealt with having to delete many rows from one table that exist in another and in my experience it's best to do the following, especially if you expect lots of rows to be deleted. This technique most importantly will improve replication slave lag.
So, here it is: do a SELECT first, as a separate query, remembering the IDs returned in your script/application, then continue on deleting in batches (say, 50,000 rows at a time). This will achieve the following:
- each one of the delete statements will not lock the table for too long, thus not letting replication lag get out of control. It is especially important if you rely on your replication to provide you relatively up-to-date data. The benefit of using batches is that if you find that each DELETE query still takes too long, you can adjust it to be smaller without touching any DB structures.
- another benefit of using a separate SELECT is that the SELECT itself might take a long time to run, especially if it can't for whatever reason use the best DB indexes. If the SELECT is inner to a DELETE, when the whole statement migrates to the slaves, it will have to do the SELECT all over again, potentially lagging the slaves because of how long that SELECT will take. If you use a separate SELECT query, this problem goes away, as all you're passing to the DELETE query is a list of IDs.
Do you have another opinion or see a fault with my logic? Feel free to share in the comments.
P.S. One thing to be careful about is, of course, potential edits to the table between the times the SELECT finishes and DELETEs start. I will let you handle such details by using transactions and/or logic pertinent to your application.
MySQL Indexing Considerations Of Implementing A Priority Field In Your Application
Wednesday, March 18th, 2009
Updated: June 9th, 2009
Introduction
If you, like me, are building or thinking of implementing a MySQL-powered application that has any need for prioritizing selecting certain data over other data, this article is for you.
Example
As a real world example, consider a queue-like video processing system. Your application receives new videos and processes them. The volume of incoming videos can at times be higher than the processing rate because the process is CPU bound, so occasionally a pretty long queue may form. You will try to process them as fast as you can but…
…suddenly, you need to process a video somewhere in the middle of the queue or an important video enters and needs immediate attention. What do you do?
An obvious solution is implementing a simple priority system where each item has a numeric priority field. Now you can sort first by priority from highest to lowest and then by id within the highest priority. Important and urgent items get a their priority changed to something higher and get processed first. There is only one problem.
Problem
The problem is pretty serious – let’s take a look at the SELECT statement. Before selecting, I’ve added 19 random rows to have some data to work on.
SELECT * FROM queue ORDER BY priority DESC, id LIMIT 1;
What kind of index would you put on this table to speed up this query? You do want to add a proper index, don’t you? DO YOU? Ok, good.
Here’s what happens without any indexes:
mysql> EXPLAIN SELECT * FROM queue ORDER BY priority DESC, id LIMIT 1; +----+-------------+-------+------+---------------+------+---------+------+------+----------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+----------------+ | 1 | SIMPLE | queue | ALL | NULL | NULL | NULL | NULL | 19 | Using filesort | +----+-------------+-------+------+---------------+------+---------+------+------+----------------+ 1 row in set (0.00 sec)
Using filesort, ugh, of course, due to sorting without an index.
Let’s see, how about a combined index on (priority, id)?
mysql> ALTER TABLE `queue` ADD INDEX `priority_id`(`priority`, `id`); Query OK, 19 rows affected (0.05 sec) Records: 19 Duplicates: 0 Warnings: 0 mysql> EXPLAIN SELECT * FROM queue ORDER BY priority DESC, id LIMIT 1; +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ | 1 | SIMPLE | queue | index | NULL | priority_id | 5 | NULL | 19 | Using index; Using filesort | +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ 1 row in set (0.00 sec)
Better because an index is being used but not very good because filesort is still present. “Of course!”, you slap yourself on the forehead. The first ORDER BY uses a DESCENDING order, and our key is in ASCENDING order.
So, let’s add the proper key with the right ordering instead.
mysql> ALTER TABLE `queue` DROP INDEX `priority_id`; Query OK, 19 rows affected (0.05 sec) Records: 19 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE `queue` ADD INDEX `priority_id`(`priority` DESC, `id`); Query OK, 19 rows affected (0.06 sec) Records: 19 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT * FROM queue ORDER BY priority DESC, id LIMIT 1; +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ | 1 | SIMPLE | queue | index | NULL | priority_id | 5 | NULL | 19 | Using index; Using filesort | +----+-------------+-------+-------+---------------+-------------+---------+------+------+-----------------------------+ 1 row in set (0.00 sec)
What the deuce? This is the same result as with the previous index. Time to dig up the documentation.
Here is what the MySQL manual has to say under the ORDER BY optimization section:
MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that match the WHERE clause … if you mix ASC and DESC:
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
Moreover, to confuse the user even more, the index creation command accepts the DESC instruction, without actually honoring it, as specified in the CREATE INDEX section:
An index_col_name specification can end with ASC or DESC. These keywords are allowed for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.
So, after so many years MySQL still doesn’t support such basic functionality – you are either stuck with a query that uses filesort or have to look for a workaround.
Solution
Since it’s not possible to mix order directions, the solution is then to change the meaning of the priority column to match your needs. Thus, in the new approach priority 1 is higher than priority 10, and the application logic needs to accommodate to that. If you caught this while the application is still young, the code may be easy to change, but otherwise it could be a major pain in the butt.
Conclusion
The moral here is: plan your queries ahead and don’t mix and match DESC and ASC ordering as MySQL will not be able to use an index to resolve it. Do it even sooner if you’re putting lots and lots of data into your tables.
Swapping Column Values in MySQL
Tuesday, February 17th, 2009
Updated: June 9th, 2009
Today I had to swap 2 columns in one of my MySQL tables. The task, which seems easily accomplishable by a temp variable, proved to be a bit harder to complete. But only just a bit.
Here are my findings:
-
The
UPDATE swap_test SET x=y, y=x;
approach doesn't work, as it'll just set both values to y.
PostgreSQL seems to handle this query differently, as it apparently uses the old values throughout the whole query. [Reference] -
Here's a method that uses a temporary variable. Thanks to Antony from the comments for the "IS NOT NULL" tweak. Without it, the query works unpredictably. See the table schema at the end of the post. This method doesn't swap the values if one of them is NULL. Use method #3 that doesn't have this limitation.
UPDATE swap_test SET x=y, y=@temp WHERE (@temp:=x) IS NOT NULL;
The parentheses around @temp:=x are critical. Omitting them will cause data corruption.1 2 3
mysql> UPDATE swap_test SET x=y, y=@temp WHERE (@temp:=x) IS NOT NULL; Query OK, 3 rows affected Rows matched: 3 Changed: 3 Warnings: 0
-
This method was offered by Dipin in the comments. I think it’s the most elegant and clean solution. It works with both NULL and non-NULL values.
UPDATE swap_test SET x=(@temp:=x), x = y, y = @temp;
-
Another approach I came up with that seems to work:
UPDATE swaptest s1, swaptest s2 SET s1.x=s1.y, s1.y=s2.x WHERE s1.id=s2.id;
1 2 3
mysql> update swap_test s1, swap_test s2 set s1.x=s1.y, s1.y=s2.x where s1.id=s2.id; Query OK, 3 rows affected Rows matched: 3 Changed: 3 Warnings: 0
Essentially, the 1st table is the one getting updated and the 2nd one is used to pull the old data from.
Test schema used:
1 2 3 4 5 6 7 8 9 10 | CREATE TABLE `swap_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`x` varchar(255) DEFAULT NULL,
`y` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO `swap_test` VALUES ('1', 'a', '10');
INSERT INTO `swap_test` VALUES ('2', NULL, '20');
INSERT INTO `swap_test` VALUES ('3', 'c', NULL); |
Do you have a better approach? If so, please share in the comments.
Some references:
- http://stackoverflow.com/questions/37649/swapping-column-values-in-mysql/ – some discussion on various methods, which eventually prompted me to start this post
- http://www.marcworrell.com/article-3026-en.html – discussion on the 2nd approach, which doesn’t work
How To Fight Clickjacking (Using The Recent Twitter Hijacking As An Example)
Thursday, February 12th, 2009
Updated: June 9th, 2009
Introduction
Clickjacking is a malicious technique of tricking web users into revealing confidential information or taking control of their computer while clicking on seemingly innocuous web pages. A vulnerability across a variety of browsers and platforms, a clickjacking takes the form of embedded code or script that can execute without the user's knowledge, such as clicking on a button that appears to perform another function (credit: Wikipedia).
Clickjacking is hard to combat. From a technical standpoint, the attack is executed using a combination of CSS and iFrames, which are both harmless web technologies, and relies mostly on tricking users by means of social engineering. Additionally, the only server side technique against clickjacking known to me is “frame breaking”, which would cause a legitemate site to break out of any iFrames it may be embedded in. This is not always the desired behavior and is generally frowned upon.
Generic Example
In laymen’s terms, clickjacking means that it is quite possible for websites to trick you into, for example, clicking a button to show a cute kitty while in reality prompting a deletion of all your hotmail email. A malicious site uses an iFrame (which essentially allows embedding sites within other sites) with hotmail loaded inside and hidden using CSS (which is a web language for styling HTML elements). A button named “Show Me The Next Awwww Kitty” is then placed by the malicious site and positioned below the iFrame layer (manipulated by CSS, yet again). However, because the iFrame is hidden, it looks like the “Aww” button is all you’re clicking. Wrong!
Latest Example: Twitter
This morning a new, though harmless, epidemic hit twitter. Hundreds and thousands of messages saying “Don’t Click: http://tinyurl.com/amgzs6” started showing up. Clicking the link shows a simple page with 1 button:
Clicking (which I of course did) uses clickjacking to repost the message to your own twitter account. Take a look yourself: http://search.twitter.com/search?q=don%27t+click.
All of these are a result of an experiment by some French guys to mess around with twitter and show the effects of clickjacking. Thank you for that, French guys. Creating awareness via the most social platform on the web is the best thing they could do for us.
Fight Clickjacking
James Padolsey recently wrote an excellent blog post about clickjacking and mentioned Twitter specifically. Because clickjacking relies mostly on social hacking (i.e. tricking people into clicking malicious links and buttons), Twitter is nothing but a perfect platform. James gives some nice background info and code examples. He correctly highlights, as I did earlier, that clickjacking is not a software bug – it’s a malicious technique exploiting harmless technologies.
So how does one fight clickjacking?
At this point the most reliable way is to use Firefox and the NoScript extension. NoScript provides a simple, yet amazingly effective feature, called ClearClick. From their site:
“…it's enabled by default, protecting NoScript users from Clickjacking everywhere: it even remains active if you switch NoScript in the less safe Allow scripts globally mode. How does it work? Clickjacking hides or displaces or partially covers something you wouldn't want to click, if you could see it in its original context. ClearClick does the opposite: whenever you click a plugin object or a framed page, it takes a screenshot of it alone and opaque (i.e. an image of it with no transparencies and no overlaying objects), then compares it with a screenshot of the parent page as you can see it. If the two images differ, a clickjacking attack is probably happening and NoScript raises a "ClearClick warning", showing you the contextualized and "clear" object you were about to click, so you can evaluate by yourself if that was really something you wanted to do.”
Did ClearClick work in the earlier twitter attack? Sure did! After I clicked the “Don’t click” button Noscript promptly popped up a warning showing the hidden iFrame (since the original malicious page has been removed, I found another similar page from the same author for screenshot purposes).
So, even if you don’t want to enable NoScript globally, install it anyway, just for ClearClick.
That about covers what I had to say about clickjacking. Stay safe, folks!

(2 rating, 2 votes)


beer planet is Artem Russakovskii's blog. Artem is a software engineer at