My Message close
GAME JOBS
Latest Blogs
spacer View All     Post     RSS spacer
 
May 21, 2013
 
An Object Of Lust
 
Gamasutra Blog Guidelines - Updated and open for discussion [9]
 
Postmortem: ROBLOX Mobile
 
Fingle marketing effort and numbers [1]
 
Next-Gen Xbox: What Microsoft Needs To Reveal On 21st May [13]
spacer
Latest Jobs
spacer View All     Post a Job     RSS spacer
 
May 21, 2013
 
Blizzard Entertainment
Senior Software Engineer, User Interface
 
Blizzard Entertainment
Senior Technical Artist
 
Blizzard Entertainment
3D Environment Artist, Senior
 
Blizzard Entertainment
Dungeon Texture Artist
 
Blizzard Entertainment
3D Character Artist, Lead
 
Hidden Variable Studios
Senior Designer
spacer
Latest Press Releases
spacer View All     RSS spacer
 
May 21, 2013
 
Command Rommel’s
Panzers in Battle
Academy!
 
Peter Molyneux\'s 22cans
Partners with DeNA to...
 
\"The Cold War Era isn\'t
over, it\'s just...
 
Astro Empires Celebrates
7 Years
 
Mortal Bacon: The Dragon
Pig, New Ultimate Boss
in...
spacer
About
spacer Editor-In-Chief:
Kris Graft
Blog Director:
Christian Nutt
Senior Contributing Editor:
Brandon Sheffield
News Editors:
Mike Rose, Kris Ligman
Editors-At-Large:
Leigh Alexander, Chris Morris
Advertising:
Jennifer Sulik
Recruitment:
Gina Gross
Education:
Gillian Crowley
 
Contact Gamasutra
 
Report a Problem
 
Submit News
 
Comment Guidelines
Sponsor

 
In-depth: Smoothsort vs. Timsort
In-depth: Smoothsort vs. Timsort
 

June 19, 2012   |   By Jaewon Jung

Comments Post A Comment

More: Console/PC, Programming





[In this reprinted #altdevblogaday in-depth piece, Crytek's technical lead Jaewon Jung examines Smoothsort and Timsort, offering benchmark results from his tests with the two sorting algorithms.]

I recently found out an interesting sorting algorithm called Smoothsort, originally created by the legendary Dijkstra. Again, my favorite algorithm guy, Keith Schwarz has a detailed explanation about it here[2] with his C++ implementation[4].

Another sorting algorithm I met recently through my Twitter stream is Timsort, which is known to be used in Python(>=2.3) and Java SE 7. Interestingly, both are supposedly faster when part of the sequence is already sorted.

These two seemed to me holy grails among comparison sorts. Naturally a thought came to me to study these and benchmark them with STL sorting algorithms.

Smoothsort

As you can see here[1], this algorithm shows a wonderful asymptotic performance. O(nlogn) worst-case time complexity and O(1) space usage! Except not being a stable sorter, it looked almost like a silver-bullet sort.

But, the algorithm is definitely a tough kind to understand. It took some time to wrap my head around it even with Keith's kind explanation. As Keith explains, it can be viewed as a variant of heap sort with a special heap called Leonardo heap, which again is based on something called Leonardo number, a close cousin of the well-known Fibonacci number.

Anyway please refer to Keith's explanation for the ins and outs of this algorithm.

Timsort

In contrast to the Smoothsort, which is mathematically sound and deep in theories (as Keith mentions, it's hard for one(at least, for Keith and me) to imagine how Dijkstra originally came up with it), Timsort is rather a product of observations of real-word data and clever tricks to exploit common patterns found in them.

Compared to Smoothsort, Timsort has a worse space usage of O(n), but it has a benefit of being a stable-sorter. Unfortunately, this algorithm is also far from simple. In my opinion, the best way to come to a good understanding of this is to read this and this[5] in that order. The latter is the original explanation of the creator, Tim Peters.

In a nutshell, Timsort is an adaptive combination of merge sort and insertion sort. I cound find a decent C++ implementation of it here[7].

Benchmark

Now, fun time. To see how these new kids stand their grounds against battle-tested standard sorting algorithms in STL, I adopted some benchmark code from the Timsort implementation above and modified it a little. You can find the main code here.

Without further ado, here is the result:


  • Data size: 100,000
  • Data type: int or double
  • Unit: milliseconds
  • Partially sorted #0: each subarray of size 10 in 100,000 sequence is sorted
  • Partially sorted #1: each subarray of size 1000 in 100,000 sequence is sorted
  • The original implementation of Keith's uses std:bitset. 'Smoothsort(raw bit)' is a modified one that uses raw bit operations instead
  • Test hardware: Intel Core i7 920 / 6GB RAM
As you can see, both Timsort and Smoothsort didn't cut the mustard. Smoothsort is worse than STL sorts in all cases (even with std:bitset replaced with raw bit operations).

Timsort shows a trade-off. In random or almost random sequences, not as good as STL ones, but in partially sorted sequences like 'Reversed', 'Sorted' and 'Partially sorted #1′, it shows impressive speed-up.

Admittedly, apart from replacing an obvious culprit like std::bitset, I didn't try any thorough optimization for each. So if you can see/suggest any optimization opportunities for both that I missed, please leave a comment.

std::sort/std::stable_sort

I was somewhat impressed with STL sorters at this point and found out I hadn't known much about their internal implementations except the foggiest idea that it would be a variant of quick sort.

So I digged into the source(STL in VS2012 RC, specifically). As you know, there are two, one which doesn't maintain the relative order of records with equal keys(unstable) and the other which keeps it(stable).

std::sort is basically a quick sort with a median of medians algorithm to decide a partition pivot. Once the sequence becomes small enough(< 32), it switches to an insertion sort. Or if the recursion becomes too deep(> 1.5log2(N)), then it switches to a heap sort.

std::stable_sort is not a quick sort. It's a sort(pun intended) of a merge sort. First, it insertion-sorts each chunk of size 32 and merge them hierarchically. Initially, it tries to get a temp storage of the half size of the original sequence and use it when merging.

If the allocation fails, then it tries a smaller size, but this means more recursions and merges, so slower, of course. In the sense that it is a combo of merge sort and insertion sort, one can say it's similar to Timsort in essence, although the latter has much more complex tricks up its sleeve.

Codes for std::sort/std::stable_sort are relatively easy to follow(especially in comparison with understanding Smoothsort or Timsort), so I strongly recommend to take stock of them, if you haven't done before.

Conclusion
  • Asymptotic performance is not a whole story at all. The constant factor matters (an obvious thing, but still worth repeating)
  • Timsort can be a viable alternative when data are not so random
  • Smoothsort, though mathematically clever, doesn't cut the mustard
  • std::sort/std::stable_sort is pretty good in most of the cases
  • For small data, insertion sort is very good. So it's a good strategy to mix it with other algorithms to devise a good hybrid sorting algorithm
References
  1. http://en.wikipedia.org/wiki/Sorting_algorithm
  2. http://www.keithschwarz.com/smoothsort/
  3. http://en.wikipedia.org/wiki/Smoothsort
  4. http://www.keithschwarz.com/interesting/code/?dir=smoothsort
  5. http://svn.python.org/projects/python/trunk/Objects/listsort.txt
  6. http://en.wikipedia.org/wiki/Timsort
  7. https://github.com/gfx/cpp-TimSort
[This piece was reprinted from #AltDevBlogADay, a shared blog initiative started by @mike_acton devoted to giving game developers of all disciplines a place to motivate each other to write regularly about their personal game development passions.]
 
 
Top Stories

image
Market's ready for new consoles, but old-gen surprisingly viable
image
The next Xbox: What Microsoft needs to reveal this week
image
Four ways next-gen consoles could fail, according to Riccitiello
image
How developers mess up immersion (you might be doing it wrong)


   
 
Comments


none
 
Comment:
 




 
UBM Tech