{"id":678,"date":"2010-06-22T06:45:00","date_gmt":"2010-06-22T06:45:00","guid":{"rendered":"\/blogs\/paul\/post\/Benchmarking-do-multiple-data-files-make-a-difference.aspx"},"modified":"2013-01-02T00:29:19","modified_gmt":"2013-01-02T08:29:19","slug":"benchmarking-do-multiple-data-files-make-a-difference","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/","title":{"rendered":"Benchmarking: do multiple data files make a difference?"},"content":{"rendered":"<p>\n<font face=\"verdana,geneva\" size=\"2\">Many times I&#39;m asked whether having multiple data files can lead to an improvement in performance. The answer, as with all things SQL (except concerning auto-shrink) is a big, fat &quot;it depends.&quot; It depends on what you&#39;re using the database for, and the layout of the files on the IO subsystem, and the IO subsystem capabilities. I&#39;ve heard examples of &quot;yes&quot; and I&#39;ve heard examples of &quot;no.&quot;<\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Just for kicks, I put some of my <\/font><a href=\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-hardware-setup\/\"><font face=\"verdana,geneva\" size=\"2\">test hardware<\/font><\/a><font face=\"verdana,geneva\" size=\"2\"> to use to do some experimentation. (You can get to all my other <\/font><a href=\"https:\/\/www.sqlskills.com\/blogs\/paul\/category\/benchmarking\/\"><font face=\"verdana,geneva\" size=\"2\">Benchmarking<\/font><\/a><font face=\"verdana,geneva\" size=\"2\"> posts using this link.) <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">My setup for this series of tests is: <\/font>\n<\/p>\n<ul>\n<li><font face=\"verdana,geneva\" size=\"2\">Log file pre-sized to 8GB (to avoid log growth) on 8 x 1TB 7.2k SATA RAID-10 array, one iSCSI NIC, 128 KB stripe size<\/font><\/li>\n<li>\n<div>\n\t<font face=\"verdana,geneva\" size=\"2\">160GB database,&nbsp;variously setup as (all in the PRIMARY filegroup):&nbsp;<\/font>\n\t<\/div>\n<ul>\n<li>\n<div>\n\t\t<font face=\"verdana,geneva\" size=\"2\">1 x 160GB file <\/font>\n\t\t<\/div>\n<\/li>\n<li>\n<div>\n\t\t<font face=\"verdana,geneva\" size=\"2\">2 x&nbsp;80GB files <\/font>\n\t\t<\/div>\n<\/li>\n<li>\n<div>\n\t\t<font face=\"verdana,geneva\" size=\"2\">4 x 40GB files <\/font>\n\t\t<\/div>\n<\/li>\n<li>\n<div>\n\t\t<font face=\"verdana,geneva\" size=\"2\">8 x 20GB files <\/font>\n\t\t<\/div>\n<\/li>\n<li>\n<div>\n\t\t<font face=\"verdana,geneva\" size=\"2\">16 x 10GB files <\/font>\n\t\t<\/div>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n<div>\n\t<font face=\"verdana,geneva\" size=\"2\">&nbsp;16 connections inserting 100\/16GB each, no other activity, all code executing on the server, no data transfer from clients <\/font>\n\t<\/div>\n<\/li>\n<\/ul>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Each test was run 5 times and then the time-for-test calculated as the average of the 5 test runs, so the two tests together represent 50 test runs. Luckily I wrote a test harness that will tear down and setup the database automatically each time in the different configurations, so just double click a cmd file and then a day or so later I get an email saying the test has finished. Great when we&#39;re traveling! <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Here are the test results: <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg\" alt=\"\" width=\"596\" height=\"424\" \/> <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">As you can see, it&#39;s pretty clear that with both test setups, having more data files definitely does produce a performance improvement, but only up to a point. <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Test 1: Data files on 8 x 300GB 15k SCSI RAID-10 array, two iSCSI NICs, 128KB stripe size <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Test 2: Data files&nbsp;round-robin between two 4&nbsp;x 300GB 15k SCSI RAID-10 array, one iSCSI NIC each, 128KB stripe size <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">In both cases, the performance increases up to eight data files, and then begins to decrease again with sixteen data files. The single data file case was bound to be slower on the SCSI array with fewer drives, and we see that in the results (left-most result&nbsp;in red). <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">In the best case, the eight-file case on two arrays was just over 6% faster than the single-file case on the single array. Hardly earth-shattering, but still a non-trivial gain. <\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Where&#39;s the gain coming from? I ran wait stats analysis for a few test variations &#8211; for example, between the eight data files test and the single data file test using two arrays, the cumulative wait stats were almost identical &#8211; 38\/39% PAGELATCH_EX, 19\/21% PAGELATCH_SH, 12\/13% WRITELOG. The gain is mostly coming from the IO subsystem, but the SCSI arrays are still overloaded, as I showed in plenty of the previous benchmarking tests.<\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Now, this is a very contrived test, with a single operation in my workload &#8211; it&#39;s definitely&nbsp;NOT representative of a mixed-operation OLTP workload. However, I <u>did<\/u> see a gain from having multiple data files &#8211; and I believe I would have seen more gain had the SCSI array(s) not been pretty much maxed out already.<\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">I&#39;ve heard plenty of anecdotal evidence that adding a few more data files for user databases can lead to performance improvements, but&nbsp;your mileage is definitely going to vary. I&#39;d be very interested to hear your observations in production as comments to this post (but please keep the comments constructive &#8211; don&#39;t give me a laundry-list of tests\/settings you want me to try, or rant about real-life vs. lab tests).<\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">Enjoy!<\/font>\n<\/p>\n<p>\n<font face=\"verdana,geneva\" size=\"2\">PS The next post on SSDs is forthcoming &#8211; just finishing up the (extensive) tests &#8211; and also the post on how you all have your log files configured, from the survey I did a while ago. Thanks for being patient! <\/font><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many times I&#39;m asked whether having multiple data files can lead to an improvement in performance. The answer, as with all things SQL (except concerning auto-shrink) is a big, fat &quot;it depends.&quot; It depends on what you&#39;re using the database for, and the layout of the files on the IO subsystem, and the IO subsystem [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15,53,66],"tags":[],"class_list":["post-678","post","type-post","status-publish","format-standard","hentry","category-benchmarking","category-io-subsystems","category-performance-tuning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Benchmarking: do multiple data files make a difference? - Paul S. Randal<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Benchmarking: do multiple data files make a difference? - Paul S. Randal\" \/>\n<meta property=\"og:description\" content=\"Many times I&#039;m asked whether having multiple data files can lead to an improvement in performance. The answer, as with all things SQL (except concerning auto-shrink) is a big, fat &quot;it depends.&quot; It depends on what you&#039;re using the database for, and the layout of the files on the IO subsystem, and the IO subsystem [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/\" \/>\n<meta property=\"og:site_name\" content=\"Paul S. Randal\" \/>\n<meta property=\"article:published_time\" content=\"2010-06-22T06:45:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2013-01-02T08:29:19+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg\" \/>\n<meta name=\"author\" content=\"Paul Randal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Paul Randal\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/\",\"name\":\"Benchmarking: do multiple data files make a difference? - Paul S. Randal\",\"isPartOf\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg\",\"datePublished\":\"2010-06-22T06:45:00+00:00\",\"dateModified\":\"2013-01-02T08:29:19+00:00\",\"author\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg\",\"contentUrl\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Benchmarking: do multiple data files make a difference?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#website\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/\",\"name\":\"Paul S. Randal\",\"description\":\"In Recovery...\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce\",\"name\":\"Paul Randal\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g\",\"caption\":\"Paul Randal\"},\"sameAs\":[\"http:\/\/3.209.169.194\/blogs\/paul\"],\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/author\/paul\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Benchmarking: do multiple data files make a difference? - Paul S. Randal","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/","og_locale":"en_US","og_type":"article","og_title":"Benchmarking: do multiple data files make a difference? - Paul S. Randal","og_description":"Many times I&#39;m asked whether having multiple data files can lead to an improvement in performance. The answer, as with all things SQL (except concerning auto-shrink) is a big, fat &quot;it depends.&quot; It depends on what you&#39;re using the database for, and the layout of the files on the IO subsystem, and the IO subsystem [&hellip;]","og_url":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/","og_site_name":"Paul S. Randal","article_published_time":"2010-06-22T06:45:00+00:00","article_modified_time":"2013-01-02T08:29:19+00:00","og_image":[{"url":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg","type":"","width":"","height":""}],"author":"Paul Randal","twitter_misc":{"Written by":"Paul Randal","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/","url":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/","name":"Benchmarking: do multiple data files make a difference? - Paul S. Randal","isPartOf":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage"},"image":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage"},"thumbnailUrl":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg","datePublished":"2010-06-22T06:45:00+00:00","dateModified":"2013-01-02T08:29:19+00:00","author":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce"},"breadcrumb":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#primaryimage","url":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg","contentUrl":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-content\/uploads\/2010\/6\/multidatafiles.jpg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/benchmarking-do-multiple-data-files-make-a-difference\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sqlskills.com\/blogs\/paul\/"},{"@type":"ListItem","position":2,"name":"Benchmarking: do multiple data files make a difference?"}]},{"@type":"WebSite","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#website","url":"https:\/\/www.sqlskills.com\/blogs\/paul\/","name":"Paul S. Randal","description":"In Recovery...","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sqlskills.com\/blogs\/paul\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce","name":"Paul Randal","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g","caption":"Paul Randal"},"sameAs":["http:\/\/3.209.169.194\/blogs\/paul"],"url":"https:\/\/www.sqlskills.com\/blogs\/paul\/author\/paul\/"}]}},"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts\/678","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/comments?post=678"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts\/678\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/media?parent=678"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/categories?post=678"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/tags?post=678"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}