{"id":301,"date":"2011-03-06T00:50:56","date_gmt":"2011-03-06T00:50:56","guid":{"rendered":"http:\/\/www.themissingdocs.net\/wordpress\/?p=301"},"modified":"2011-03-06T00:50:56","modified_gmt":"2011-03-06T00:50:56","slug":"buffer-of-doom","status":"publish","type":"post","link":"https:\/\/www.themissingdocs.net\/?p=301","title":{"rendered":"Buffer of Doom"},"content":{"rendered":"<p>A simple pattern that I have implemented myself, have seen in many places, and that often results in problems is the following.<\/p>\n<p>We need a buffer to hold some data before we process it, so we allocate an array large enough to hold the data.\u00a0 To avoid continual garbage we keep this array and reuse it.\u00a0 If the data we need to buffer is too large, we throw away the current buffer and make a new larger one.\u00a0 We might even get fancy and ensure that the new buffer is larger than the current one by a minimum percentage margin, so that we don&#8217;t suffer from mass garbage from a 1,2,3,4,5,6 attack, where each piece of data is slightly larger than the last and forces a new allocation every time.<\/p>\n<p>Where the problem comes in is the spike.\u00a0 A single extraordinarily large piece of data is sent.\u00a0 In response we successfully allocate a huge buffer, maybe taking up almost half of the entire available memory space.\u00a0 Because of our design we never let this go.\u00a0 Now if we are multi-threaded and have multiple processing queues, each with its own buffer, a well-crafted scenario will take up almost all available memory with these buffers alone.\u00a0 Then, rather than failing with an OOM exception somewhere we can catch it, the runtime itself throws OOM and kills the process.<\/p>\n<p>Solution? 
Well, we could simply outlaw all large data sets, but if you have multiple processing queues the threshold you have to set can be very low, to the point where normal processing is no longer feasible.\u00a0 Another idea is a bit of a compromise: have a high maximum and a lower &#8216;spike threshold&#8217;.\u00a0 Any data set over the spike threshold is considered an anomaly, and the buffer allocated for it is not kept for reuse.<\/p>\n<p>A bit more complicated is the reverse extension of the minimum margin of growth.\u00a0 If more than &#8216;k&#8217; buffer uses in a row are smaller than the current size reduced by a percentage &#8216;p&#8217;, throw the current buffer away and use a smaller buffer.\u00a0 The question in my mind here is whether k is a constant, or a number which decreases as the buffer gets larger.\u00a0 I think the latter probably results in the best behaviour, but some theoretical analysis for random and worst case scenarios would be in order.<\/p>\n<p>And the reason I brought this topic up at all?\u00a0 System.Data.SqlClient.\u00a0 Large string values to or from the database are internally handled using a buffer of doom.\u00a0 There is one buffer per database connection.\u00a0 You have to be careful with large data or you end up allocating almost all available memory to database connection buffers&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A simple pattern that I have implemented myself, have seen in many places, and that often results in problems is the following. 
We need a buffer to hold some data before we process it, so we allocate an array large enough to hold the data.\u00a0 To avoid continual garbage we keep this array and reuse it.\u00a0 If the &hellip; <a href=\"https:\/\/www.themissingdocs.net\/?p=301\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Buffer of Doom<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-301","post","type-post","status-publish","format-standard","hentry","category-net-stuff"],"_links":{"self":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/301","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=301"}],"version-history":[{"count":0,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=\/wp\/v2\/posts\/301\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=301"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=301"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.themissingdocs.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=301"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}