Web dev

ASP best practices: be very careful

This is something of a follow-up to my RS.GetRows considered harmful post. I did not find a definitive solution - that is likely impossible without Oracle properly fixing their OleDB provider - but it gave me a good incentive for this post.

Ok, let’s get to it.

When one is working with ASP as the server-side technology, there are a few important things to always keep in mind:

  1. ASP means VBScript or JScript, with the first being far more popular (and most likely the better choice)

  2. VBScript, as the name implies, is a scripting language, meaning it is not pre-compiled. Each time an .asp page is opened, the ASP pre-processor validates the page code and then runs through all of it again to execute it

  3. If you use ASP includes (include file or include virtual), there is an additional zeroth step: all includes are combined into one file, which is then processed

  4. If you use Server.Execute instead of common includes, the child files are neither included nor processed when the page is rendered. Only if the page actually calls them during execution are they treated as separate pages (requests)

  5. If you use regular includes, the combined ASP page can be very long, and its pre-processing time (not execution time) can be several times higher than for a page using Server.Execute, thus making it slower and leading to a lower pages/sec ratio.

  6. If you use regular includes, the entire page is treated as one file, meaning that local vars and functions from one included file are available in all the other included files.

  7. If you use Server.Execute, then each of those files is its own world. If you define local vars/functions in the master file - the one that Server.Executes the other files - those vars/functions are not available in the child files. Only the global, intrinsic objects are available - Request, Response, etc. (see the sketch after this list)

  8. Getting anything from or putting anything into Session will create a copy of that var/array/object in memory

  9. ASP arrays are allocated in heap memory
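
Here is a minimal sketch of the scoping difference from points 6 and 7 (the file names, the variable and the PrintGreeting Sub are all hypothetical):

```vbscript
<!-- master.asp -->
<!-- #include file="helpers.asp" -->
<%
Dim userName                 ' includes are merged into one file first,
userName = "jdoe"            ' so helpers.asp sees userName and vice versa
PrintGreeting userName       ' a Sub defined in helpers.asp is callable here

Server.Execute "child.asp"   ' child.asp is its own world: it cannot see
                             ' userName or PrintGreeting, only the intrinsic
                             ' objects (Request, Response, Session, ...)
%>
```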

This is mostly common, sensible stuff, listed here only because it plays an important role further on. ASP is executed on IIS, which in today’s world usually means IIS6 on Windows Server 2003. Ok, here are some important facts on how IIS works:

  1. IIS, on default settings, can execute 25 independent requests per CPU (on Windows Server 2003 Standard - I assume it is similar for other versions; see the snippet after this list)

  2. When IIS gets CPU time to execute a particular request, it does not let go until the whole request is completed or it encounters an execution branch. That means: if it encounters Server.Execute, it will stop, add the server-executed page (file) to the requests queue and release the CPU. The original (master) request will wait until the child request is executed
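
For reference, that per-CPU limit is the AspProcessorThreadMax metabase property; assuming a default IIS6 install (paths may differ on your machine), it can be inspected or changed with adsutil.vbs:

```
cscript %SystemDrive%\Inetpub\AdminScripts\adsutil.vbs get w3svc/AspProcessorThreadMax
cscript %SystemDrive%\Inetpub\AdminScripts\adsutil.vbs set w3svc/AspProcessorThreadMax 25
```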

Finally, let me list some of the ASP best practices (found all over the ASP sites) that are important for this rant.

  1. Do not use sessions - among other things, they are thread-locking, which kills good server-farm scaling

  2. Do not use Recordsets for processing - read your data, call RS.GetRows and kill the recordset. GetRows creates an array, which is the fastest structure you can use in ASP and uses the least amount of memory (a sketch follows this list)

  3. Keep pages as light as possible to shorten processing time
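
A minimal sketch of practice 2 (the connection string and query are placeholders):

```vbscript
Dim conn, rs, rows, i
Set conn = Server.CreateObject("ADODB.Connection")
conn.Open "Provider=OraOLEDB.Oracle;Data Source=...;"   ' placeholder

Set rs = conn.Execute("SELECT id, name FROM customers")
If Not rs.EOF Then rows = rs.GetRows()   ' 2-D array: rows(column, row)

rs.Close : Set rs = Nothing              ' kill the recordset right away
conn.Close : Set conn = Nothing

If IsArray(rows) Then                    ' process the cheap array instead
    For i = 0 To UBound(rows, 2)
        Response.Write rows(1, i) & "<br>"
    Next
End If
```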

Enough for the introduction.

How would you code with all this in mind?

One thinks: ok, light pages. That means I need to modularize my code and separate it into pieces that can be server-executed. Even though this leads to more requests, they should execute quickly. Hmm… it is rarely possible to completely separate code - I will certainly need some page-wide vars and/or functions in both the master file and the child files. But they are two separate worlds, so I should use Session to transfer them. But wait - they say sessions are bad… ah, to hell with that - it is overblown anyway. Plus, I need a session to keep the user context somehow. Keeping sessions.

Ok, that’s solved. Recordsets and GetRows… this is repeated so many times and in so many places that it must be God-given truth. OK, going exclusively with that.

Man, you can’t imagine how wrong it turned out to be.

Problems…

Let us see what happens in real life. I have, for business and maintenance reasons, a rather large ASP/ADO wrapping library of constants and functions. This library is used exclusively for all database operations. It amounts to 300 kB or so - quite a lot to process. Hence, I will do all my db reading in the master file, and the resulting arrays will be transferred over Session into the child files. That way, the library does not have to be included in each child file, and the ASP engine has to process it only once per page execution instead of several times. Plus, it’s way cleaner to maintain.

Ok, now let’s go step by step through the execution of a page that has 4 server-executed child files.

  1. The master file is called and enters the requests queue

  2. When its turn comes, all the data is read and prepared, and I have one or two arrays in memory

  3. It encounters the first child-file request

  4. I do: Session("some_array_name") = some_array

  5. The child-file request enters the queue and the master-file request waits - at this moment I have two copies of some_array: one in the master file as a local var and the other in Session

  6. The child request comes around and gets executed

  7. I first get the stuff from Session, among it: some_local_array_var = Session("some_array_name"). At this moment I have three copies of some_array: in the master and child files, plus in Session.

  8. When the child request finishes, the local array var is killed and I again have two copies

  9. This is repeated for the remaining 3 child files

  10. At the end of the master file, the local array is removed anyway, and since I’m a good boy who does his housekeeping, I manually remove the array from Session using Session.Contents.Remove("some_array_name")

So, what did I just have: 1 (local array in the master file) + 1 (copy in Session) + 4 (one local copy per child file) = 6 arrays created, meaning memory is allocated 6 times for one and the same array. It starts to smell…
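
In code, the whole dance looks something like this (the file names and the GetCustomersAsArray helper are hypothetical):

```vbscript
<!-- master.asp -->
<%
Dim some_array
some_array = GetCustomersAsArray()        ' copy 1: local var in the master
Session("some_array_name") = some_array   ' copy 2: in Session
Server.Execute "child1.asp"               ' master waits while the child runs
' ... Server.Execute for child2..child4: copies 3-6, one per child ...
Session.Contents.Remove "some_array_name" ' housekeeping at the very end
%>

<!-- child1.asp -->
<%
Dim some_local_array_var
some_local_array_var = Session("some_array_name")  ' the child's own copy
' ... render from some_local_array_var; the copy dies with this request ...
%>
```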

It gets worse if the child files change the array - then you need to get the child-localized copy back into Session: more allocations. If you then need to manipulate that same array in the master file in between child files, that means several more allocations of the very same array.

Simply put, you will end up with a very high percentage of heap-memory fragmentation. I’ve seen numbers like 98%. Yes, that horrible. Remember, this is not .NET - there is no garbage collector, no memory clean-up in plain old ASP/VBScript. Memory fragmentation means that, at some point, when you try to read some fairly large recordset and transfer it into an array, you might end up with no free memory block large enough to accommodate that array == “cannot allocate memory” message coming from the OleDB provider == blank page (if I have good error handling) or an error message displayed to the customer (if I’m lousy).
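
The “good error handling” case amounts to something like this around every GetRows call (a sketch; the exact error text depends on the provider):

```vbscript
On Error Resume Next
rows = rs.GetRows()
If Err.Number <> 0 Then
    ' typically "cannot allocate memory", bubbling up from the OleDB provider
    Response.Clear
    Response.Write "Something went wrong - please try again in a minute."
    Response.End
End If
On Error Goto 0
```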

Nice.

You may think that this will not happen quickly and that you can get away with a scheduled IIS restart every day at 4 am. And this indeed works; I heard the rumor that Microsoft did exactly the same thing on their own servers until they moved to ASP.NET. On Windows Server 2003, one way to schedule the restart is the built-in at command (a plain Scheduled Task running iisreset works just as well):
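
```
at 04:00 /every:M,T,W,Th,F,S,Su "iisreset"
```

But the restart trick only works if each server does not get hit too much - where “too much” is a variable thing. Why?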

Imagine that you have 300 people accessing your web server, sending requests for various pages on it. That is 300 master-file requests + 1200 child-file requests. IIS can execute 25 independent requests per CPU. I won’t do a step-by-step scenario (way too long), but just think: you will have several array copies across various page requests, all sitting in memory while the child-file requests wait in the queue. If the child files have their own child-file requests (nothing unusual in large web apps), their local array copies will also wait. Your queue becomes crowded; page execution time rises, the site becomes slower, and memory is fragmented and consumed at frightening speed.

Not very nice.

You want to have more visitors on your site. But the more of them that come, the larger this problem becomes. Thus you need a larger and larger server farm to keep the requests/server ratio low. Not very efficient.

…and (kind of) solution

What I did, after all this analysis and days of testing and performance monitoring (from which these conclusions came), was to choose the lesser of two evils: I ditched Server.Execute and went back to regular includes, with everything processed as one big page.

Can you guess the results? I was astonished.

Even though the master files became huge, page request time dropped 20-100x, on a generic Dell server with two hyper-threaded Intel Xeons, under semi-high customer load. In the Server.Execute variant, as soon as I had 150+ requests == 100 executing and 50+ constantly in the queue, master-file execution time went high, and at 500+ in the queue it went ballistic (like 140 s!). Plus, memory usage kept rising until it ate up all available memory. All 2 GB of it.

In the include variant, request execution time stayed low (2 s at most in my test case) and memory usage reached a certain level and lingered there. If it weren’t for one little problem, it would be perfect.

And that problem is, of course, memory fragmentation. Sure, it was much lower, but it still gets you at some point. I suspect this is related to the Oracle OleDB provider rather than ADO (a wild guess, based on no problems being reported by people using GetRows with MS SQL Server), but that is the only option I have - the MS OleDB provider for Oracle does not work in my case (don’t ask why, I have no idea).

However, it was not all in vain - in the real client case, this solution was good enough to hold out for a day (even more in some cases). A restart each day is a viable option then, and I still have efficient code - because recordset processing really is much slower than array processing; that much is certainly true.