Expertise Bug Leads to Major Database Design Blunder

Often, we hear software developers (myself included) say, “yeah … we can definitely do that!” or “yeah … you have come to the right place; anything you want, we can build it for you!” Then, just as often, we hear stories of software projects failing here and there. And I began to wonder how things could be worked out properly so that problems, or in the worst case, failures, can be avoided.

What happened to me recently could probably be one of the reasons why software projects fail. For the past several months, I have been working on a system that concerns people’s health. The system is intended for use by doctors, dentists, physical therapists, nurses, or any practice or profession that deals with people’s health. When done, the system will handle quite an extensive amount of data gathered from a good number of processes and sources, generated every minute of the day.

For the past several months, there was a good amount of communication between my group and the potential users of the system. A lot of time was spent on requirements discovery and gathering, and on analysis, and everybody subjected a lot of items to questioning just to sort things out clearly so they would come out fine. A near-functional prototype had been in place for several months, and it looked like everybody was happy with everyone’s progress, including the state of the system. Beta testing was conducted, with six major beta builds in the last few months, and it seemed not a thing was amiss. In fact, it was so nearly done that the potential users were eager to have the system deployed and pioneered as soon as possible (that would have been like last month). I was into the finishing touches, more or less in the process of putting the icing on the cake.

Then … it was KABLAAAAAAMMMMMM!!! As the project’s database designer, I thought of something that hurled everything back to the drawing board. I had missed identifying one piece of information that should have been in the database design from the very start. With the whole set of people contributing to make this happen, this one thing never showed any sign of being thought out. The medical people involved in the project never thought of the item. The software development people never had a clue. I never had a clue. And I have contemplated deeply, trying to analyze how I could miss something this important. The people who played the analyst role came up short in thinking about this (I am one of them). But I, being the database designer, blame myself for missing something so important.

After gathering myself, I came to the conclusion that the only way I would have easily identified or come across such a piece of information is probably if I were a practicing doctor who was, at the same time, a database designer with lots of databases and experience tucked under his belt. I caught the missing piece through a totally unrelated event, one not even related to what I was doing.

To cut the story short, what happened thereafter was that one table was added to the database structure, with one new column that serves as a reference for 70% of the other tables. With the change, 70% of the SQL code was rewritten, 50% of the critical UI was revamped, a lot of time was lost, and I gained a lot of sleepless nights.

The moral of my story, though completely lacking in juicy technical details as I cannot divulge those for fear of legal ramifications and of ridicule (what, am I crazy? hahahha), is that software development and technical expertise can only bring us so far; domain expertise is clearly a desirable attribute to have, especially if you are a database designer. The funny reality, though, is that I can’t picture myself becoming both a doctor and a database person just to eliminate this problem in the future, and this is probably the reason why so many software projects have failed. And I can still hear myself saying, “yeah … of course, anything you want, I can build it for you!”.

The system I talked about here is almost done and is looking really good! And I don’t think I have missed anything else that would screw up my day. How about you? Being considered an expert, do you say you can do anything that is asked of you? 😛

When Do Applications Really Start To Slow Down

In this blog, I wanted to simulate a typical problem that projects, big and small alike, encounter when developers have no idea how their queries perform. There are projects where developers give no regard to the performance of what they are doing. Some just code without actually knowing when their code will start to explode in their faces.

Quite often, software development projects that rely on outsourced manpower experience euphoria right after they stamp their project as 100% done, only to run into problems after the coders responsible for many of those problems are gone and off somewhere, ready to start on another mess. Most likely, those coders never get told about the problems they left behind. Some, perhaps, were lucky enough to have been retained and were still around when the problems started to manifest.

The Experiment

So when exactly do our applications start to show some sluggishness?

Here is the result:

                        Dev/Testing   After     After     After
                        Stage         15 days   1 month   6 months

_sales_office (rows)    9235          29202     59877     337076
_sales (rows)           119290        378058    776146    4376480
Qry1                    1             4         8         48
Qry2                    <1            <1        <1        4
Qry3                    <1            <1        <1        <1
Qry4                    <1            <1        <1        <1
Qry5                    2             6         13        73
Qry6                    2             5         11        66
Qry7                    <1            <1        <1        <1
Qry8                    <1            <1        <1        2
Qry9                    <1            <1        <1        5

NOTE: results are in seconds.

To answer this question, I conducted an experiment with a very simple 2-table application containing very few columns. Only the primary and foreign keys were indexed upon creation; the rest of the columns were not indexed. The application records the quantities of cameras sold by every salesman from a fixed number of regions. I then built a small utility that would inject into the tables the quantities sold daily. I had the records inserted randomly to be as realistic as possible.

I then prepared at least 8 very simple queries to run against the database as it gets populated, so I could see how the code performs. The queries are a mix of things that anyone often encounters in a project that has manifested performance problems.

I then prepared the database to be filled with data simulating the typical stages of the application’s lifecycle, starting from when things are all very fast (usually the development and testing stage). I prepared 4 stages so I could see how things progress in terms of processing speed.
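The injection utility itself isn’t shown in the post. Below is a minimal sketch of how such a data-injection and timing harness could look, using Python and SQLite purely for illustration (the actual experiment ran on SQL Server; the schema follows the queries shown later in this post, while the helper names and row volumes are my own scaled-down assumptions):

```python
import random
import sqlite3
import time

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two-table schema: only primary/foreign keys are indexed, as in the experiment.
cur.execute("""CREATE TABLE _sales_office (
    sales_office_pk INTEGER PRIMARY KEY,
    datesold TEXT, region TEXT, salesman TEXT)""")
cur.execute("""CREATE TABLE _sales (
    sales_pk INTEGER PRIMARY KEY,
    sales_office_pk INTEGER REFERENCES _sales_office,
    brand TEXT, quantity_sold INTEGER)""")
cur.execute("CREATE INDEX ix_sales_fk ON _sales (sales_office_pk)")

def inject_day(day, offices=20, sales_per_office=5):
    """Insert one day's worth of randomly generated rows."""
    for _ in range(offices):
        cur.execute(
            "INSERT INTO _sales_office (datesold, region, salesman) VALUES (?, ?, ?)",
            (f"2009-01-{day:02d}", random.choice("NSEW"),
             f"salesman{random.randrange(50)}"))
        office_pk = cur.lastrowid
        for _ in range(sales_per_office):
            cur.execute(
                "INSERT INTO _sales (sales_office_pk, brand, quantity_sold) "
                "VALUES (?, ?, ?)",
                (office_pk, random.choice(["Nikon", "Canon", "Sony"]),
                 random.randrange(1, 10)))

def timed(sql):
    """Run a query, discard the rows, return elapsed seconds."""
    start = time.perf_counter()
    cur.execute(sql).fetchall()
    return time.perf_counter() - start

for day in range(1, 16):          # simulate 15 days of operation
    inject_day(day)
conn.commit()

rows = cur.execute("SELECT COUNT(*) FROM _sales").fetchone()[0]
elapsed = timed("SELECT * FROM _sales T1 JOIN _sales_office T2 "
                "ON T1.sales_office_pk = T2.sales_office_pk")
print(rows, round(elapsed, 3))
```

Scaling up the per-day volumes, or looping for 180 days instead of 15, is how a harness like this would reproduce the kind of growth shown in the table above.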

I started with the dev/testing stage. In this stage, I wanted to know how fast things are, assuming most projects don’t conduct load testing. As you can see in the results, all the queries run very fast, even the most absurd ones like queries #5 and #6.

Then I emptied the database to signify that the next records inserted would all be production records. I wanted to know the speed of things after 15 days. This is usually when everybody tends to be happy, knowing that their brand new system is 100% running, and running very FAST. After 15 days, as you can see in the results, things are almost as fast as when they were under development. Even the queries from hell (#5 and #6) are more than acceptable. Once in a while, I still hear stories of management asking their developers for a list ten thousand items long. In this example, it’s a 300k-record list. That looooong! Nevertheless, after 15 days, everybody is expected to be wearing their happy faces.

I also wanted to know how things are after a month of operation. Usually, at this stage, top management celebrates the success of their newly implemented system as things go smooth and pretty. Then they would probably let go of some key contract personnel, under the impression that their brand new system is running fine. And of course, they want to cut expenses. No matter how absurd the query, a 700k-record dataset still comes back in 13 seconds. Who could complain at that volume and speed?

6 Months Later

Fast forward to 180 days: everybody is wondering what hit them, and the blamestorming game has started. Those left in the project are probably overwhelmed by complaints from non-techies about why things got so slow. The non-techies are probably wondering how this could happen when everything was just bought and done 6 months ago. Everything was so brand new.

The Queries

The following were the queries used in the experiment, so you have an idea of how things perform over time. The queries are very simple. #5 and #6 were created to make things obvious, as there are still people who are not aware that those kinds of queries are just plain absurd no matter how much management needs them (e.g. a report containing a list of everything).

--Q1
SELECT * FROM _sales

--Q2
SELECT * FROM _sales_office

--Q3
SELECT COUNT(*) FROM _sales

--Q4
SELECT COUNT(*) FROM _sales_office

--Q5
SELECT * FROM _sales T1 JOIN _sales_office T2 ON T1.sales_office_pk = T2.sales_office_pk

--Q6
SELECT
  T2.datesold, T2.region, T2.salesman, T1.brand, T1.quantity_sold
FROM
  _sales T1
  JOIN _sales_office T2 ON T1.sales_office_pk = T2.sales_office_pk

--Q7 GENERATE YESTERDAY'S REPORT
SELECT
  T2.datesold, T2.region, T2.salesman, T1.brand, T1.quantity_sold
FROM
  _sales T1
  JOIN _sales_office T2 ON T1.sales_office_pk = T2.sales_office_pk
WHERE T2.datesold = (SELECT MAX(datesold) FROM _sales_office)

--Q8 GENERATE AGGREGATED REPORT
SELECT
  T2.datesold, T2.region, T2.salesman, SUM(T1.quantity_sold) AS 'qty_sold'
FROM
  _sales T1
  JOIN _sales_office T2 ON T1.sales_office_pk = T2.sales_office_pk
WHERE
  T2.datesold = (SELECT MAX(datesold) FROM _sales_office)
GROUP BY
  T2.datesold, T2.region, T2.salesman


Conclusion

There is a lot everyone can learn from this experiment. It is very important that someone in any project knows what they are doing and what they are up against. This is essential especially now that databases are taken for granted and much of the focus is given to the application side of things. Someone has to know their databases.

**************************************************
Toto Gamboa is a consultant specializing in databases, Microsoft SQL Server, and software development, operating in the Philippines. He is currently a member and one of the leaders of the Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter, and is one of Microsoft’s MVPs for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Fire and Forget Coding

Ever wonder when you will stop frequently checking out old code? Revisiting for what reasons, you may ask? Just short trips to fix small bugs, refactor for performance, make small expansions to accommodate new requirements, and so on. You visit old code to do all this stuff, for reasons other than code appreciation. If you are this kind of coder, shame on you!

This piece of writing will be debatable to most developers. I am not saying we should never go back to code we have already checked in. But at the very least, we should minimize going back.

By checking out old code and doing some modifications (be they intentional or accidental), you are likely to:

  • Introduce a bug
  • Break a build
  • Cost your employer money for retesting (be thankful that automated testing is in nowadays)
  • Force others to check out objects that depend on your code

To test how you fare at creating bugs, rate yourself on how many bugs you introduce for every 100 coding tasks assigned to you. I always want to stay below 1%. In most cases, it is achievable, unless you are lazy and don’t have a clue. The next question, then, is how one achieves a bug-creation rate below 1%.

Here is how I usually achieve this and it’s pretty simple:

  1. Start from the smallest dependencies, if possible.
  2. Understand the problem and the coding task very carefully.
  3. Create testing procedures. Be sure you cover all possibilities.
  4. Code.
  5. Test the code using your test procedures.
  6. You may review and repeat steps 2 to 5 before checking in code. Never, ever check in code if you are not 100% sure you have done steps 2 to 5. If you check in code without that 100% assurance, you have just started a problem that will show itself sooner or later. (Some programmers have the habit of checking in untested code, perhaps because they are too slow or too lazy to beat the check-in deadline. Others are simply not qualified to write code and thus fail on steps 2 to 5.)
  7. Check in code and never look back.
  8. If you pass step 7, you can move on to a new coding task.

At this point, there are several possible outcomes I might encounter:

  1. If there is a bug in my code, then I can say that I probably failed at one of steps 2 to 5, and that I have not done my homework well.
  2. What if a needed piece of functionality was missed? In most cases, if I had done a good job on step 2, this new functionality would have shown up on my radar and raised some alarms. If it did not, probably I was no good, or too lazy. In any case, I can go for a new version of the object and leave the old code intact for future deprecation (especially if the code is quite complex and the needed adjustments are significant). Or, if I am pretty sure the changes are trivial and I can guarantee step 6, then it is logical to check out the code again and update it. Either way, I already failed earlier.

In most cases, if I have been doing a good job on steps 2 to 5, there is a slim chance, or none at all, that I will need to reopen and change the code. I love to call this “fire and forget” coding, and it has saved me a lot of headaches since I started programming for a living. This isn’t perfection. It is just one way of making things better.

Responsible Software Architecture

Lately, I have had an overload of information regarding “software architecture”. I have been doing software for a decade or more, and I have seen enough to know what works out there and what is just a passing fad. Today, developers, programmers, or whatever you want to call them, are enamored of discussing anything about software architecture, with a vigor and gusto I have not seen before.

In 2000, when I was doing some software architecture work as an external consultant on a really large project, the technical project manager cautioned me not to “over-architect”. It occurred to me that she had probably experienced something that made her say that. I saw the wisdom of that cautious remark and have instilled it into every design work I have done since. To this day, I think I have been a responsible individual in doing software architecture work.

I recently asked that technical project manager why the warning. She said something to this effect:

– The first time you piece together the super design of your mammoth killer-app, you make all the mistakes of a first timer.
– The second time you piece them all together, you tend to over-compensate on the shortcomings of the first.
– The third time you piece them together, you now have the confidence of having a near perfect architecture.

Perhaps this was the reason why there is a saying in the software development space that “you can’t make it right until you reach version 3.0”. Or was I referring to Microsoft apps of the past?

The big question for us right now is: how many of us have managed to stay on long enough to reach version 3.0?

Pundits have advocated that this can be avoided and things done right on the first try; thus, we now see everywhere guides to the best patterns and practices, in various colors and flavors. And you see heated debates on almost everything, scrutinizing each approach and questioning why one particular practice is best to implement while others are not. Some worked, some failed. Some were misapplied, some were near perfect. But one thing is conclusive … none of these is a SILVER BULLET that will solve all our woes in software architecture and the business of software design.

What is very noticeable nowadays is that developers tend to believe almost everything they read out there. They are enamored of the fact that they can articulate and converse proficiently on any technical subject at hand, and they bask in the feeling of knowing the differences between subjects. However, I also happen to notice that these same software architects and designers build their software around the capabilities and limitations of whichever technology architecture they believe in (or perhaps the one easiest for them), instead of first understanding the requirements and adopting something that fits what is needed within the limits of what they know. Often, they blame the lack of support from their preferred architecture when they encounter something that is really hard to accomplish.

In almost all cases, too, software architects and designers seem to spend too much time architecting a solution, some without regard to known constraints such as time and financial resources, or experience (or the lack of it) with the chosen solution. Some tend to apply the complexities of a multi-tiered architecture to even the simplest of apps, just to be known among peers as architecture savvy.

So I ask, what do others do? Do they understand the requirements first, then find and apply appropriate patterns and practices that can deliver the requirements and expectations of users? Or do they instead try to fit the user requirements to their preferred architecture? Do they pick several of the best out there and find ways and means to glue things together into something deliverable?

As for me, I tend to absorb the requirements first and know my constraints (time, manpower, logistics, technical capability, experience, etc.). Then I immerse myself in what the software will be like in its final form. Once I get a clear picture of it, that is when I start doing technical work. This involves picking the right architecture to adopt. It is not always necessary for me to check on what is new. If something I have known and tested to work fits, given all the scenarios and what-ifs, I pick that architecture regardless of how relevant, or how obsolete, it has become. Sometimes I see people who have a knack for choosing whatever is new and untested, as long as it is popular. If I find my stock of architectures insufficient for the work at hand, that is when I start investigating other possible options. This can be really hard sometimes, since in most cases you don’t have the benefit of experience. If I am not careful and responsible in choosing the right stuff, the consequences could be unimaginable.

By and large, I think having too much architecture can be as bad as having none or too little. I also strongly believe that software designers and architects should focus responsibly on knowing how much architecture a particular piece of software needs, not on producing the most technically elegant or advanced architecture for any software just to prove we are architecture-savvy.

Since there is really no silver bullet when it comes to software architecture, and since we will forever be avid learners of what is new and cool out there, forever adapting to changes as dictated by our users, I am espousing that we be responsible in selecting and applying what we think is the right architecture.

Avoid LIKE Like A Plague

What is the best thing about the LIKE operator in T-SQL? It is very convenient for doing a bit of pattern searching.

Many of us use LIKE for this reason alone. For example, running the following query

SELECT first_name
FROM _name
WHERE first_name LIKE '%electra%'

on a table of several thousand rows would easily give you a result in the blink of an eye (assuming a non-clustered index was created on the first_name column).

But let us try to go deeper as to how SQL Server executes the LIKE queries:

-- Contains search
SELECT first_name
FROM _name
WHERE first_name LIKE '%electra%'

Execution Plan Used = Index Scan

-- Ends with, where the last character can be any character
SELECT first_name
FROM _name
WHERE first_name LIKE '%ann_'

Execution Plan Used = Index Scan

-- Begins with search
SELECT first_name
FROM _name
WHERE first_name LIKE 'electra%'

Execution Plan Used = Index Seek

We know that every time SQL Server executes a query, its Query Optimizer tries its best to determine the best execution plan it can. It will always attempt to use an Index Seek wherever and whenever possible.

In the 1st and 2nd examples, SQL Server failed to find an appropriate index to use, so it opted for an Index Scan. In an Index Scan, SQL Server reads every row in an index to determine whether it qualifies for your search criteria. If you are searching for just 3 rows within a million-row table, and those records are the 8th, 9th, and 10th rows, SQL Server will still have to scan up to the millionth record of the index before it can return all the rows you need. That is quite unacceptable to most of us. However, an Index Scan isn’t bad in itself; there are times when an Index Scan is faster than an Index Seek. Every time you see in your execution plan that SQL Server is using an Index Scan, look at the number of rows it is processing. If it is just a few thousand rows, you are fine, and the Index Scan can be really fast.

In the 3rd example, SQL Server determined that it could use an index effectively, via an Index Seek. An Index Seek, as we know, means that SQL Server’s Query Optimizer was able to find a useful index to locate the records we are searching for. If we are searching for 3 records in a million-row table, SQL Server will return the rows in the blink of an eye.
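This seek-versus-scan behavior is not unique to SQL Server, and you can reproduce it in a self-contained way. Here is a small sketch using SQLite from Python (my own illustration; SQL Server’s optimizer details differ, but the leading-wildcard effect is the same: a prefix pattern can become an index range seek, while a contains pattern cannot):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# case_sensitive_like lets SQLite's optimizer rewrite a prefix LIKE
# into an index range search on a default-collation TEXT column.
cur.execute("PRAGMA case_sensitive_like = ON")
cur.execute("CREATE TABLE _name (first_name TEXT)")
cur.execute("CREATE INDEX ix_first_name ON _name (first_name)")
cur.executemany("INSERT INTO _name VALUES (?)",
                [("electra",), ("ann",), ("joann",), ("electrica",)])

def plan(sql):
    """Return the EXPLAIN QUERY PLAN detail text for a query."""
    return cur.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]

seek_plan = plan("SELECT first_name FROM _name WHERE first_name LIKE 'electra%'")
scan_plan = plan("SELECT first_name FROM _name WHERE first_name LIKE '%electra%'")
print(seek_plan)  # the prefix search can use the index as a range search
print(scan_plan)  # the contains search must examine every index entry
```

Running this shows a SEARCH (range seek) plan for the `'electra%'` pattern and a SCAN plan for `'%electra%'`, mirroring the Index Seek versus Index Scan plans above.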

So how would this information help us?

Often, when we develop apps, we tend to use the LIKE operator a lot. During this stage, we seldom find performance problems related to LIKE because, typically, our tables are not yet populated with large amounts of data. In most cases, we have only a few hundred rows or fewer in our tables during development. Perhaps we are thinking that since we created an index, we are OK.

We will only notice the performance problems associated with LIKE when the tables get populated heavily with real data.

I hope this helps us understand when and how to use the LIKE operator. You may also want to explore SQL Server’s Full-Text Search.

How Often Do We Change Data Types

A colleague of mine raised some intriguing points about why SQL Server’s user-defined datatypes (UDDs) can’t be edited: you do the “editing” by dropping the datatype and recreating a new one. His position was that he avoided using UDDs simply because of this limited mechanism. This led me to another question.

The question really is: how often do we need to change the definition of our data? In its lifetime, I would say it should not change at all once our understanding of its nature is final. If a data definition changes from time to time, I would say there is a possible problem with our foresight or our understanding of the nature of that data entity. If that is the case, it is not a good candidate for a UDD. Worse, if we are not sure of a data entity’s definition, we have to get back to the drawing board and try to understand its nature once again until we are sure of its definition. This problem also reflects a lack of resolve in determining the true nature of certain data entities.

However, I do recognize that by the time we really need to change the definition of a data entity, the reason for doing so is likely very significant and justifiable, as it will have a great impact on both our database and our application. This issue brings us back to the question of how clear the pictures in our minds are.

Here is my case, with a few examples of how I deal with situations when understanding and defining data types.

Scenario #1: How stable can we define a data type?

BirthDate as DATETIME
EmailAddress as NVARCHAR(320)

Scenario #2: Questionable Candidates for UDD Creation

CustomerID as CHAR(5)

Scenario #3: The Lack of Understanding Example

Age as INTEGER (assuming in years)

Scenario #4: Things Change Over Time

PhoneNumber as CHAR(7)

The scenarios above are the usual things I encounter when one defines the type of a certain data.

In scenario #1, how often do we need to change these datatypes? If we have these kinds of data and a clear understanding of their very nature, defining UDDs for them would be sound. Thus, EXEC sp_addtype BirthDateType, datetime, 'NULL' would almost guarantee that we will NOT encounter problems in the future or be required to edit its definition.

Scenario #2 is a classic example of a lack of foresight or anticipation. A good DB designer could easily spot this. In the example, it is just a matter of time before we move to CHAR(10) or CHAR(15), or better yet CHAR(25). I personally declare IDs as VARCHAR(255), as I am not in the habit of making ID types one of my system keys. I’d rather have IDs as VARCHAR(255), indexed with a UNIQUE constraint, making them candidates for user-consumable keys. As for me, I would not be inclined to create a UDD for this situation.

Scenario #3 is also a classic example of how we fail to understand the very nature of certain data. For most of us, without batting an eyelash, it is almost automatic to declare Age (in years) as INTEGER, only to find out later that we need to store values like 10.25 for 10 years and 3 months.
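To make scenario #3 concrete, here is a small sketch (my own illustration, not from the original scenario) of deriving a fractional age like 10.25 from a birthdate, which is exactly the kind of value an INTEGER column cannot hold:

```python
from datetime import date

def age_in_years(birthdate: date, today: date) -> float:
    """Fractional age in years: whole completed years, plus completed
    months of the current year of life expressed as twelfths."""
    years = today.year - birthdate.year
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1  # this year's birthday hasn't happened yet
    months = (today.year - birthdate.year) * 12 + today.month - birthdate.month
    if today.day < birthdate.day:
        months -= 1  # the current month isn't complete yet
    return years + (months - years * 12) / 12

# 10 years and 3 months old:
print(age_in_years(date(1998, 1, 1), date(2008, 4, 1)))  # → 10.25
```

Whether months, days, or some other granularity matters is precisely the kind of question that should be settled before freezing the column’s type.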

Scenario #4 forces us to change. For decades, the Philippines had 6- or 7-digit local phone numbers. We could probably have saved a few bytes by declaring the column CHAR(7). Now we are seeing 8-digit phone numbers, and we need to adapt.

If we had defined UDDs for all the samples above, scenarios #3 and #4 would probably force us to EDIT our UDDs. With EXEC sp_addtype PhoneNumberType, CHAR(7), 'NULL', we have defined a UDD for #4. Dropping and creating a new UDD to support CHAR(8) could create a major headache in our app: we would need to revisit our stored procedures and character-validation code, the formats and pictures we defined, the CONSTRAINTs we declared, and other stuff in both the database and the apps, and check everything related to PhoneNumberType. How do I avoid this? Should I need to support CHAR(8), I would definitely NOT do an “EDIT” by dropping and recreating my UDD. Instead, I’ll create a new one with

EXEC sp_addtype PhoneNumberType2, CHAR(8), 'NULL'

and subsequently create new ‘phone’-type columns defined as PhoneNumberType2. Eventually, I will have to deprecate PhoneNumberType in favor of PhoneNumberType2, but that is another story.

This way, we have zero impact on our apps and databases and have avoided possible headaches.

When A Problem Becomes An Opportunity

I was following a very interesting SQL Server database-related thread in one of the forums I am in. It is about a certain unexplainable problem one database developer encountered that forced him to call it “weird”. The thread has been running for almost a week now, and no one has been able to come up with anything that helps him solve the problem. The problem was a sort of ‘vanishing table structure’. Indeed … it sounds weird.

I was so interested in the problem that I had to check on it every time. However, just a while ago, I was a bit dumbfounded to see that he had succumbed to the problem and decided to reinstall everything without knowing what really triggered it.

These are the circumstances I take advantage of most. However, I understand that he might have needed to act on it as soon as possible (perhaps the server is in production or doing some critical task), and thus had no choice but to decide drastically and promptly and have everything reinstalled.

I have a different attitude when things like this happen. I look at it as both a problem and an opportunity: a huge opportunity to learn. Circumstances like these are not usually replicated and taught by experts and gurus, nor do they come out during experiments in the lab. I always grab the opportunity to learn as much as I can from the problem and try to find out what caused it. Now, all he can do is speculate about what really happened. Was it a software bug? Was it a setup or configuration problem? Was it due to user ignorance? Or perhaps carelessness? Now we will never know what really went wrong.

I would encourage everyone to get to the bottom of things before resorting to a desperate solution. The value that can be taken out of such a situation is priceless, and this is how you graduate from being just an ordinary or average user or developer.

Again … he may have been forced to act, and perhaps he didn’t have the time and resources to prolong the problem. And that’s OK.