Once in a lifetime: 100 books in a year

(This is my last blog post for 2009 – thanks to everyone who reads my blog and takes part in the SQL community – hope you have a Happy and Prosperous New Year!) 

Every so often you have to challenge yourself with a goal that actually stretches your abilities and tests your stamina. At the start of 2009 I set myself the goal of reading 50 books during the year. By the time January was over I'd already read 18 books so I upped the goal to 100, thinking it would be easily achievable. How wrong I was!

I'm very proud that I stuck with this through the year and met my goal, finishing the 100th book on December 29th. I deliberately chose the final book to be Scotland: The Story of a Nation, by Magnus Magnusson. He used to be the host of the UK quiz show Mastermind (that I loved as a teenager), and his catch phrase was "I've started, so I'll finish!". An appropriate statement on my undertaking this year!

If you held a gun to my head and forced me to pick from all the books I've read this year, my #1 favorite book is Cormac McCarthy's The Road (no, I haven't seen the movie). Incredibly powerful, haunting, and ultimately sad book – I get a lump in my throat just thinking about the story and its ending. If you only read one book next year, read that one.

Overall, it was an excellent experience and I recommend that everyone try something similar at some point in their lives. Many people have expressed an interest in seeing the complete list plus my favorites for the year, so this blog post is my summary for you all (and a neat way of getting closure for me too). It's divided into three parts: data, top-10, and the complete list.

I hope you enjoy reading this as much as I've enjoyed putting it together, and it inspires you to try some of these books, or even to set yourself a reading goal next year. Do let me know what you think. And I'll leave you with the saying that's governed 2009 for me:

In omnibus requiem quaesivi, et nusquam inveni nisi in angulo cum libro!

Analysis of What I Read

I read a total of 39674 pages, or about 109 pages on average every day, and a book every 3.65 days. Of course some days I didn't read anything and some days I read 500 pages, depending on what I was doing. You might ask – how the hell did you make the time for that with everything else you do? Well, I flew 138000 miles during the year and spent quite a few days sitting by pools in hot places adjusting to time zones before teaching classes, mostly in India (2 trips) and Thailand (4 trips) – that's a lot of time right there. I also made time when at home, reading pages here and there while cooking, taking a break from work, in bed, etc. It also helps that I love reading, and I read quickly (I don't speed-read or skip sections; every word is read and digested).

Several people through the year pooh-poohed my goal, saying I must only be reading small books, or 'fast reads'. No. I picked a general range from my library (I've got 900+ books – I don't like electronic readers, and one of my favorite pastimes is buying books). Here are two charts: the first shows the number of pages in each book, in order that I read them; the second shows the proportion of books in each genre I read.

 

The average book length was 397 pages, and as you can see, I'm a huge history buff, so 42% of all books were either hard history, or historical fiction. Make fun of me for producing charts if you want, I don't care :-)
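For anyone who wants to check the numbers quoted above, here is a minimal sketch of the arithmetic. The page and book totals come from this post; the 365-day year is my assumption (2009 was not a leap year), and the variable names are just illustrative.

```python
# Totals quoted in the post; DAYS_IN_2009 is an assumption (non-leap year).
TOTAL_PAGES = 39674
BOOKS_READ = 100
DAYS_IN_2009 = 365

pages_per_day = TOTAL_PAGES / DAYS_IN_2009        # average pages read per day
days_per_book = DAYS_IN_2009 / BOOKS_READ         # average days spent per book
average_book_length = TOTAL_PAGES / BOOKS_READ    # average pages per book

print(f"Pages per day:  {pages_per_day:.1f}")
print(f"Days per book:  {days_per_book:.2f}")
print(f"Average length: {average_book_length:.1f} pages")
```

Rounded to whole pages, these agree with the figures in the text.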

The Top-10 

Now on to the top-10. I tried very hard to get down to 10 and couldn't – so you get my top-11. It's just impossible for me to order them so I'll present them in the order I read them, along with a little picture of the cover, and my mini-review from my Facebook page (I always post a little review when I finish a book, the first 3 I read before Facebook got its evil claws into my psyche). The fact that only one hard history book is on the list does not imply that the others I read weren't good – they almost all were excellent, but just not as hugely entertaining or enthralling as the fiction I read.

  #25 The Kite Runner; Khaled Hosseini; 400pp; Fiction; February 14th (From what I remember: My first exposure to life in Afghanistan. Follows the life of a kite-flying boy and his friend in Kabul as it's torn apart by conflict between the Taliban and other warlords. Very well written and highly recommended, as is the sequel A Thousand Splendid Suns that became my #38.)

  #29 The Road; Cormac McCarthy; 287pp; Fiction; February 26th (From what I remember: Follows a father and son heading west through post-apocalyptic USA and their encounters with other survivors. As I said above, my favorite book of the year. Incredibly powerful – a masterpiece. Go read it.)

  #37 Riding the Iron Rooster: By Train Through China; Paul Theroux; 480pp; Travel; March 30th (From what I remember: I love Theroux's travel writing – his knack for portraying people he meets and irreverent appraisals of places he passes through. In this book he explores the Chinese rail network. I'd love to follow in his footsteps on my next trip to China – been twice, but didn't take any trains.)

  #51 The White Tiger; Aravind Adiga; 304pp; Fiction; July 1st (Winner of last year's Man Booker Prize. Excellent story about a driver in India – I can really relate to it after our two trips to India earlier this year being driven around the streets. Quick read – recommended.)

  #59 The Enchantress of Florence; Salman Rushdie; 368pp; Historical Fiction; July 25th (My first Salman Rushdie book turned out to be a real page-turner. Excellent story, steeped in 16th Century history of the Mughals and Florence (a real favorite city of mine – planning a week-long trip next year). Richly told story, great twist at the end. Looking forward to reading a bunch more of his, on a UPS truck towards me already :-) Highly recommended!)

  #64 Shadow of the Silk Road; Colin Thubron; 400pp; Travel; August 6th (Terrific account of following the 4000-mile Silk Road from Xian to Antioch. Central Asia really seems to be "a paradise or hell of mingled ethnicities" with borders that don't really divide the peoples of the area. Very strongly recommended – although the book has engendered some serious wanderlust in this reader!)

  #78 The Name of the Rose; Umberto Eco; 552pp; Historical Fiction; September 21st (I love Eco's works (this is my 4th of his) – they're hugely involved and heavy going to read, with long sections of complex prose. His stories are always involved and erudite, and this is no exception – a murder-mystery set in an early 14th century Italian monastery, amidst the Imperial vs. Papal backed theological struggles of the time. Unfortunately I'd seen the movie so knew the end, but the book was excellent – lots of pithy, syllogistic discussion. Highly recommended, but not for the casual reader.)

  #80 The Elegance of the Hedgehog; Muriel Barbery; 336pp; Fiction; September 29th (Translated from French, this wonderful book concerns a concierge of an upper-class apartment building in Paris. She's low-born but very intelligent, which she hides from the vacuous residents of the building. The other major character is a 12-yr old girl, also hyper-intelligent, but unhappy and suicidal, with startling insights on life. Life changes for them both. Beautiful book, highly recommended.)

  #85 Sea of Poppies; Amitav Ghosh; 560pp; Historical Fiction; October 19th (I've got a real thing going for writers portraying life in India right now. This book follows the stories of a bunch of people around the time of the Opium Wars, who are linked into the trade in India – both Indians and Westerners. Various calamities befall the Indians and they end up on a schooner, the Ibis, heading down to Mauritius. A very compelling story, expertly told and I'll be picking up some more of his novels from Amazon. Highly recommended.)

  #93 The Meaning of Night: A Confession; Michael Cox; 720pp; Historical Fiction; December 7th (Superlative story telling! Been reading this (long) one for a few months on and off. Compelling tale of a man discovering his true origins and trying to win back what is his, with twists and turns along the way – written as a confession from the point of view of the man himself. Dark and brooding, mixed in with life in England in the 1850s. Highly recommended.)

  #100 Scotland: The Story of a Nation; Magnus Magnusson; 752pp; History; December 29th (I deliberately chose my goal-meeting final book of the year to be Magnusson's magnum opus: his 700pp work on the history of Scotland. Extraordinarily well-written and comprehensively researched, I strongly recommend this book to anyone with Scottish roots.)

The Complete List

And now, for completeness, here's the entire list of all 100 books I read, with links to Amazon.com so you can explore further.

  1. Mademoiselle Boleyn; Robin Maxwell; 355pp; Historical Fiction; January 2nd
  2. Ghostwalk; Rebecca Stott; 368pp; Fiction; January 7th
  3. The Old Patagonian Express: By Train Through The Americas; Paul Theroux; 404pp; Travel; January 8th
  4. Persian Fire: The First World Empire and the Battle for the West; Tom Holland; 464pp; History; January 9th
  5. Eternity; Greg Bear; 416pp; Science Fiction; January 10th
  6. Queen Isabella: Treachery, Adultery, and Murder in Medieval England; Alison Weir; 512pp; History; January 11th
  7. Our Dumb World: The Onion's Atlas of Planet Earth; The Onion; 256pp; Humor; January 13th
  8. Dead Reckoning: Tales of the Great Explorers 1800-1900; Helen Whybrow (Editor); 576pp; Travel; January 15th
  9. If You Liked School, You'll Love Work; Irvine Welsh; 320pp; Fiction; January 17th
  10. The Professor and the Madman: A Tale of Murder, Insanity, and the Making of the O.E.D.; Simon Winchester; 288pp; History; January 18th
  11. Isaac Newton; James Gleick; 288pp; History; January 20th
  12. Twilight (The Twilight Saga, Book 1); Stephenie Meyer; 544pp; Fiction; January 21st
  13. New Moon (The Twilight Saga, Book 2); Stephenie Meyer; 608pp; Fiction; January 22nd
  14. Brunelleschi's Dome: How a Renaissance Genius Reinvented Architecture; Ross King; 208pp; History; January 24th
  15. Eclipse (The Twilight Saga, Book 3); Stephenie Meyer; 640pp; Fiction; January 25th
  16. Breaking Dawn (The Twilight Saga, Book 4); Stephenie Meyer; 756pp; Fiction; January 27th
  17. The Secret Diary of Anne Boleyn; Robin Maxwell; 281pp; Historical Fiction; January 29th
  18. Brideshead Revisited; Evelyn Waugh; 368pp; Fiction; January 31st
  19. Signora da Vinci; Robin Maxwell; 448pp; Historical Fiction; February 1st
  20. Chasm City; Alastair Reynolds; 640pp; Science Fiction; February 6th
  21. A Short History of Byzantium; John Julius Norwich; 496pp; History; February 9th
  22. Eleanor of Aquitaine: A Life; Alison Weir; 441pp; History; February 11th
  23. To The Tower Born; Robin Maxwell; 320pp; Historical Fiction; February 12th
  24. Virgin: Prelude to the Throne; Robin Maxwell; 243pp; Historical Fiction; February 12th
  25. The Kite Runner; Khaled Hosseini; 400pp; Fiction; February 14th
  26. Hellboy Library Edition, Vol. 1: Seed of Destruction and Wake the Devil; Mike Mignola; 278pp; Comics; February 21st
  27. The Year 1000: What Life Was Like at the Turn of the First Millennium; Robert Lacey; 240pp; History; February 22nd
  28. Accelerando; Charles Stross; 415pp; Science Fiction; February 23rd
  29. The Road; Cormac McCarthy; 287pp; Fiction; February 26th
  30. The Last Apocalypse: Europe at the Year 1000 A.D.; James Reston Jr.; 336pp; History; February 28th
  31. The First Crusade: A New History: The Roots of Conflict between Christianity and Islam; Thomas Asbridge; 448pp; History; March 1st
  32. Hellboy Library Edition, Vol. 2: The Chained Coffin, The Right Hand of Doom, and Others; Mike Mignola; 278pp; Comics; March 3rd
  33. Lighthousekeeping; Jeanette Winterson; 252pp; Fiction; March 4th
  34. Marvel 1602; Neil Gaiman; 248pp; Comics; March 8th
  35. Eternals; Neil Gaiman; 256pp; Comics; March 8th
  36. The Absolute Sandman, Volume 4; Neil Gaiman; 608pp; Comics; March 13th
  37. Riding the Iron Rooster: By Train Through China; Paul Theroux; 480pp; Travel; March 30th
  38. A Thousand Splendid Suns; Khaled Hosseini; 432pp; Fiction; April 1st
  39. The Complete Memoirs of George Sherston; Siegfried Sassoon; 656pp; History; April 12th
  40. A Tale of Two Cities; Charles Dickens; 544pp; Historical Fiction; April 21st
  41. Michelangelo and the Pope's Ceiling; Ross King; 384pp; History; April 28th
  42. Augustus: The Life of Rome's First Emperor; Anthony Everitt; 432pp; History; April 29th
  43. Genghis Khan and the Making of the Modern World; Jack Weatherford; 352pp; History; May 9th
  44. The Killer Book of Serial Killers; Tom Philbin; 352pp; History; May 13th
  45. Why We Suck; Denis Leary; 240pp; Non-Fiction; May 23rd
  46. Holy Terrors: Gargoyles on Medieval Buildings; Janetta Rebold Benton; 140pp; History; May 28th
  47. Knights Templar: The Essential History; Stephen Howarth; 321pp; History; May 31st
  48. Dark Star Safari: Overland from Cairo to Capetown; Paul Theroux; 496pp; Travel; June 17th
  49. The Thirteenth Tale; Diane Setterfield; 432pp; Fiction; June 27th
  50. A Short History of Nearly Everything; Bill Bryson; 560pp; History; June 30th
  51. The White Tiger; Aravind Adiga; 304pp; Fiction; July 1st
  52. Anil's Ghost; Michael Ondaatje; 307pp; Fiction; July 2nd
  53. Time Bandit; Andy Hillstrand; 240pp; Non-Fiction; July 3rd
  54. The Sea; John Banville; 195pp; Fiction; July 4th
  55. The Glassblower of Murano; Marina Fiorato; 368pp; Historical Fiction; July 5th
  56. The Book of Unholy Mischief; Elle Newmark; 384pp; Historical Fiction; July 8th
  57. The Bookseller of Kabul; Asne Seierstad; 320pp; Non-Fiction; July 11th
  58. The Forge of Christendom: The End of Days and the Epic Rise of the West; Tom Holland; 512pp; History; July 18th
  59. The Enchantress of Florence; Salman Rushdie; 368pp; Historical Fiction; July 25th
  60. The Curious Incident of the Dog in the Night-Time; Mark Haddon; 226pp; Fiction; July 28th
  61. The Gathering; Anne Enright; 260pp; Fiction; July 29th
  62. Saving Fish From Drowning; Amy Tan; 528pp; Fiction; August 1st
  63. The Story of Tibet: Conversations with the Dalai Lama; Thomas Laird; 496pp; History; August 3rd
  64. Shadow of the Silk Road; Colin Thubron; 400pp; Travel; August 6th
  65. The Brief Wondrous Life of Oscar Wao; Junot Diaz; 352pp; Fiction; August 8th
  66. The Cloud Forest; Peter Matthiessen; 320pp; Travel; August 12th
  67. Redemption Ark; Alastair Reynolds; 656pp; Science Fiction; August 17th
  68. Year of Wonders; Geraldine Brooks; 336pp; Historical Fiction; August 18th
  69. Orpheus Rising; Bateman; 480pp; Fiction; August 20th
  70. The Catholic Church through the Ages: A History; John Vidmar; 384pp; History; August 24th
  71. Edward the Confessor; Frank Barlow; 408pp; History; August 28th
  72. The Places In Between; Rory Stewart; 320pp; Travel; August 30th
  73. The Forever War; Dexter Filkins; 384pp; Non-Fiction; September 5th
  74. Mogadishu! Heroism and Tragedy; Kent DeLong; 144pp; Non-Fiction; September 10th
  75. The Temporal Void; Peter F. Hamilton; 736pp; Science Fiction; September 14th
  76. One Hundred Years of Solitude; Gabriel Garcia Marquez; 448pp; Fiction; September 15th
  77. Slaughterhouse Five; Kurt Vonnegut; 288pp; Fiction; September 17th
  78. The Name of the Rose; Umberto Eco; 552pp; Historical Fiction; September 21st
  79. People of the Book; Geraldine Brooks; 400pp; Historical Fiction; September 27th
  80. The Elegance of the Hedgehog; Muriel Barbery; 336pp; Fiction; September 29th
  81. The Lost Heart of Asia; Colin Thubron; 400pp; Travel; October 2nd
  82. So Young, Brave, and Handsome; Leif Enger; 272pp; Fiction; October 4th
  83. The Mysterious Flame of Queen Loana; Umberto Eco; 480pp; Fiction; October 7th
  84. A Time of Gifts: On Foot To Constantinople; Patrick Leigh Fermor; 344pp; Travel; October 10th
  85. Sea of Poppies; Amitav Ghosh; 560pp; Historical Fiction; October 19th
  86. A Conspiracy of Paper; David Liss; 480pp; Historical Fiction; October 23rd
  87. Breakfast of Champions; Kurt Vonnegut; 303pp; Fiction; October 27th
  88. Dogs of God: Columbus, the Inquisition, and the Defeat of the Moors; James Reston Jr; 400pp; History; October 27th
  89. A Spectacle of Corruption; David Liss; 396pp; Historical Fiction; November 5th
  90. Absolution Gap; Alastair Reynolds; 704pp; Science Fiction; November 15th
  91. Spawn Collection, Volume 4; Todd McFarlane; 480pp; Comics; November 27th
  92. Parallel Worlds; Michio Kaku; 448pp; Non-Fiction; December 1st
  93. The Meaning of Night: A Confession; Michael Cox; 720pp; Historical Fiction; December 7th
  94. Fine Just the Way It Is: Wyoming Stories 3; Annie Proulx; 240pp; Fiction; December 8th
  95. The Shadow Lines; Amitav Ghosh; 256pp; Fiction; December 10th
  96. The Lemon Table; Julian Barnes; 256pp; Fiction; December 16th
  97. Jackson Pollock; Leonhard Emmerling; 96pp; Non-Fiction; December 21st
  98. Fire and Steam: How the Railways Transformed Britain; Christian Wolmar; 384pp; History; December 22nd
  99. The Bedford Hours: A Medieval Masterpiece; Eberhard Konig; 144pp; History; December 23rd
  100. Scotland: The Story of a Nation; Magnus Magnusson; 752pp; History; December 29th

So you want to write a Storage Engine…

Earlier today there was a thread on Twitter asking what degrees and academic backgrounds the people who work on SQL Server have. I volunteered to put together a reading list for those wanting to know more of the theory behind a relational database management system, rather than just how to use one.

Here I present a reading list that will take you from how to program well up to how to architect multi-threaded database servers. I’ve read all of these at some point between finishing my CS/EE degree in Edinburgh in 1994 and stopping dev work in 2005, and they’re sitting on my bookshelf as I type this. They’re all the best books I could find on the subject at the time, and they’re all absolutely excellent. I’ve included Amazon.com links to the most up-to-date editions (because I’m nice like that :-).

Programming

Underneath the RDBMS

Concepts

RDBMS architecture

You should also check out the ACM Special Interest Group on Management of Data (SIGMOD), and the VLDB Conference – these are the premier academic conferences to do with database management systems.

This should keep you busy. Happy reading!

 

What can cause log reads and other transaction log questions

Earlier today there was a question on SQL Server Central where someone wanted to know what could be causing so many reads on their transaction log. I was asked to chime in by fellow MVP Jonathan Kehayias (who also sent me some questions that I've answered in this post – thanks Jon!), so I did, with a list of everything I could think of. I thought it would make for a good post, so here it is, with a few more things I remembered while writing the post.

Before I start, if you're not comfortable talking about log records and transaction log architecture, see my TechNet Magazine article on Understanding Logging and Recovery, which explains everything clearly, including how having too many VLFs can affect operations on the log that have to scan VLFs.

Each of these things can cause reads of the log:

  • Transaction rollback: when a transaction has to roll back (either because you say ROLLBACK TRAN or something goes wrong and SQL Server aborts the transaction), the log records describing what happened in the transaction have to be read so that their effects can be removed from the database. This is explained in the TechNet Magazine article. Note that it doesn't matter whether you're using explicit transactions (i.e. BEGIN TRAN) or not: SQL Server always starts a transaction for you (called an implicit transaction) so that it can put a boundary on what needs to be rolled back in case of a failure.
  • Crash recovery: crash recovery must read the transaction log to figure out what to do with all the log records in the active portion of the log (all the way back to the earlier of the most recent checkpoint or the start of the oldest active transaction). The log is read twice – once going forward from that oldest point (called the REDO phase) and then going backwards (called the UNDO phase). Again, this is explained in great depth in the article.
  • Creating a database snapshot: a database snapshot is a point-in-time view of a database. What's more, it's a transactionally consistent point-in-time view of a database – which means that, essentially, crash recovery must be run on the real database to create the transactionally consistent view. The crash recovery is run into the database snapshot, the real database isn't affected – apart from having all the active transaction log read so that crash recovery can run.
  • Running DBCC CHECKDB: creates a database snapshot by default on 2005 onwards, and runs the consistency checks on the snapshot. See above. There's a much more detailed description, including how this worked in 2000, in the first part of the 10-page blog post CHECKDB From Every Angle: Complete description of all CHECKDB stages.
  • Transaction log backups: this one's kind of obvious. A transaction log backup contains all the transaction log records generated since the last log backup finished (or since the log backup chain was established). To back up the log it has to read it. What's not so obvious is that a log backup will also scan through all the VLFs in the log to see if any active ones can be made inactive (called clearing or truncating the log – both misnomers as nothing is cleared and nothing is truncated). See my TechNet Magazine article on Understanding SQL Server Backups and the blog post Importance of proper transaction log size management.
  • Any kind of data backup: (full/differential backup of a file/filegroup/database). Yup – data backups always include transaction log – so the backup can be restored and give you a transactionally consistent view of the database. See Debunking a couple of myths around full database backups and More on how much transaction log a full backup includes for details if you don't believe me.
  • Transactional replication: transactional replication works by harvesting committed transactions from the transaction log of the publication database (and then sending them to the subscriber(s) via the distribution database – beyond the scope of this post). This is done by the Log Reader Agent job, running from the Distributor. It needs to read all the log records generated in the publication database, even if they're nothing to do with the publications. More log equals more reads. My whitepaper on combining database mirroring and transactional replication in 2008 has more details on this stuff, as does Books Online.
  • Change data capture (in 2008): CDC uses the transactional replication log reader agent to harvest changes from the transaction log. See above. This means that CDC can cause the log to not be able to clear properly, just like transactional replication or database mirroring – see my blog post Search Engine Q&A #1: Running out of transaction log space for more details. Note that I didn't say Change Tracking – it uses a totally different mechanism – see my TechNet Magazine article on Tracking Changes in Your Enterprise Database for more details.
  • Database mirroring: DBM works by sending physical log records from the principal to the mirror database. If the mirroring session drops out of the SYNCHRONIZED state, then the log records won't be able to be read from memory and the mirroring subsystem will have to get them from disk – causing log reads. This can happen if you're running asynchronous mirroring (where you're specifically allowing for this), or if something went wrong while running synchronous mirroring (e.g. the network link between the principal and mirror dropped out, and a witness wasn't configured or the principal could still see the witness – again, beyond the scope of this post). Regardless, this is called having a SEND queue on the principal.
  • Restoring a backup: whenever backups are restored, even if you've said WITH NORECOVERY, the REDO portion of recovery is run for each restore, which reads the log.
  • Restoring a log backup using WITH STANDBY: in this case, you've essentially said you'd like recovery to run, but not to affect the transaction log itself. Running recovery has to read the log. For more info on using WITH RECOVERY, NORECOVERY, or STANDBY, see my latest TechNet Magazine article on Recovering from Disasters Using Backups, which explains how restores work.
  • A checkpoint, in the SIMPLE recovery mode only: see my blog post How do checkpoints work and what gets logged for a description of what checkpoints are and what they do. In the SIMPLE recovery mode, checkpoints are responsible for clearing the log (described with links above) so must read through all the VLFs to see which can be marked inactive.
  • When processing a DML trigger (on 2000): (thanks to Clay Lenhart for the comment that reminded me of this). In SQL Server 2000, the before and after tables that you can process in a DML trigger body are actually found from looking at the log records generated by the operation that caused the trigger to fire. My dev team changed this in 2005 to store the before and after tables using the version store, giving a big perf boost to DML trigger processing. 
  • Manually looking in the log (with DBCC LOG or the table-valued function fn_dblog): this one's pretty obvious.
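To make the "log is read twice" point from the crash recovery bullet concrete, here is a toy sketch in Python. It is emphatically not how SQL Server implements recovery: the `LogRecord` class, the `recover` function, and the in-memory dictionary standing in for the database are all hypothetical, and real recovery does far more work. It only shows a forward REDO pass followed by a backward UNDO pass, each of which has to read the log.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical, heavily simplified log record."""
    lsn: int
    txn: str
    kind: str        # "update" or "commit"
    key: str = ""
    before: int = 0  # value before the change (needed for UNDO)
    after: int = 0   # value after the change (needed for REDO)


def recover(data, log):
    """Toy two-pass recovery. Returns (data, number of log records read)."""
    reads = 0
    committed = set()

    # REDO: read forward, reapplying every logged change.
    for rec in log:
        reads += 1
        if rec.kind == "update":
            data[rec.key] = rec.after
        elif rec.kind == "commit":
            committed.add(rec.txn)

    # UNDO: read backward, reversing changes made by uncommitted transactions.
    for rec in reversed(log):
        reads += 1
        if rec.kind == "update" and rec.txn not in committed:
            data[rec.key] = rec.before

    return data, reads


log = [
    LogRecord(1, "T1", "update", "a", before=0, after=10),
    LogRecord(2, "T2", "update", "b", before=0, after=20),
    LogRecord(3, "T1", "commit"),
    # Simulated crash here: T2 never committed.
]
recovered, reads = recover({"a": 0, "b": 0}, log)
print(recovered, reads)
```

In this toy model the committed change from T1 survives, the uncommitted change from T2 is reversed, and every log record is read once per pass. A plain transaction rollback is like the UNDO pass restricted to a single transaction.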

Phew – a lot of things can cause log reads, the trick is knowing which one it is!
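The log clearing mentioned in the log backup and checkpoint items above can be pictured with another toy sketch. The `VLF` class, the `clear_log` function, and the single `oldest_needed_lsn` cutoff are all assumptions made for illustration; the real rules for what keeps a VLF active are more involved. The point is only that every VLF has to be examined, and that "clearing" flips a flag rather than erasing anything.

```python
from dataclasses import dataclass


@dataclass
class VLF:
    """Hypothetical stand-in for a virtual log file."""
    first_lsn: int
    last_lsn: int
    active: bool = True


def clear_log(vlfs, oldest_needed_lsn):
    """Scan every VLF and mark as inactive those whose records are all
    older than oldest_needed_lsn. Returns the number of VLFs examined."""
    examined = 0
    for vlf in vlfs:
        examined += 1
        if vlf.active and vlf.last_lsn < oldest_needed_lsn:
            vlf.active = False  # marked reusable; the contents are not erased
    return examined


vlfs = [VLF(1, 100), VLF(101, 200), VLF(201, 300), VLF(301, 400)]
examined = clear_log(vlfs, oldest_needed_lsn=250)
print(examined, [v.active for v in vlfs])
```

Because the scan touches every VLF regardless of how many can be marked inactive, a log with a very large number of VLFs makes this kind of operation proportionally more expensive.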

As you can see, there could be a lot of activity reading from your log as well as writing to it, which could cause an IO bottleneck. Make sure that the IO subsystem on which you place the log file (note: you don't get ANY performance benefit from having multiple log files) can handle the read and write workload the log demands. That generally means RAID 1 or RAID 10 with a bunch of spindles to spread the IOs out (note/warning/achtung: that's a big generalization – don't reply with a comment saying it's wrong because you've seen something different – different scenarios have different demands), and a proper RAID configuration (64k multiple for a stripe size, NTFS allocation unit size, volume partition alignment).
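As a rough illustration of the configuration checklist above, here is a small Python sketch. The function names and the specific checks are my own reading of "64k multiple" and "partition alignment" (stripe size divisible by 64 KB, partition starting offset divisible by both the stripe size and the NTFS allocation unit size), so treat them as assumptions rather than an official rule set.

```python
KB = 1024


def is_multiple(value_bytes, unit_bytes):
    """True if value_bytes is an exact multiple of unit_bytes."""
    return value_bytes % unit_bytes == 0


def check_log_volume(stripe_size, allocation_unit, partition_offset):
    """Return a dict of simple alignment checks (all sizes in bytes)."""
    return {
        "stripe_is_64k_multiple": is_multiple(stripe_size, 64 * KB),
        "offset_aligned_to_stripe": is_multiple(partition_offset, stripe_size),
        "offset_aligned_to_allocation_unit": is_multiple(partition_offset, allocation_unit),
    }


# A 1 MB starting offset with 64 KB stripe and allocation unit sizes.
aligned = check_log_volume(64 * KB, 64 * KB, 1024 * KB)

# A 63-sector (63 * 512 bytes) starting offset, which is not a 64 KB multiple.
misaligned = check_log_volume(64 * KB, 64 * KB, 63 * 512)

print(aligned)
print(misaligned)
```

The first configuration passes every check, while the 63-sector offset fails both alignment checks, which is the kind of mismatch that partition alignment is meant to avoid.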