10 things to review after finishing your data model
When I finish modeling my database, I like to just let it sit a couple days, then spend some time just reviewing it to check for inconsistencies. The kind of inconsistencies you never notice when you’re up to your eyeballs in details, but drive you up the wall after 5 years of maintenance. When I finished my database design a few days ago, I jotted a few things down to remember to do. But when I transcribed them into my bug tracker, where I manage all my tasks, 3 things turned into 5, then 7 and I realized if I could just add a few more, I’d have an infamous ‘Top 10’ list. Anyway here it is. It’s not exactly comprehensive, but it’s a start. 10 things to review after finishing your data model: Naming consistencies Column default consistencies Identity / auto-number technique exists Constraint consistencies Foreign Key relationships exist Indexes on Foreign Key columns Abbreviations are consistent Abbreviations are documented Data type and size consistencies. (For example; TableA.UserName is nvarchar(N), so TableB.UserName should be nvarchar(N) as well, not nvarchar(N±X)) Review all requirements again to reconfirm everything was... read moreHow to enforce a foreign key constraint against multiple tables
I am building a web app with Ben Alabaster, and one of the requirements is for the user to be able to flag items for moderators. So the user can flag entity A, entity B, entity C, etc… So I created a single flag table. Which I then tried to tie it to the entity tables, hoping for something like Where all the foreign key relationships were from [flag].[entity_id] to [EntityX].[id] Then when I wanted the top 10 flags from a particular entity (B in this case), I could run a query like select top 20 e.[name], count(*) "count" from entityB as e left join flag as f on f.entity_id = e.id where f.entity_type='B' group by e.[name] order by count(*) desc Unfortunately, if you were to create the above table relationship, and run the following inserts insert into EntityA( id, name) values (1, 'EntityA'); insert into EntityB( id, name) values (2, 'EntityB'); insert into EntityC( id, name) values (3, 'EntityC'); insert into EntityD( id, name) values (4, 'EntityD'); The following statement insert into flag(entity_id, flag_reason) values(5, 'Testing without a valid FK value.'); would fail as expected, as expected, with the following error. “The INSERT statement conflicted with the FOREIGN KEY constraint “FK_flag_EntityA”. The conflict occurred in database “test”, table “dbo.EntityA”, column ‘id’.” But insert into flag(entity_id, flag_reason) values(1, 'Testing the FK to entity A.'); would also fail, which was undesired, with the following error: “The INSERT statement conflicted with the FOREIGN KEY constraint “FK_flag_EntityB”. The conflict occurred in database “test”, table “dbo.EntityB”, column ‘id’.” + So, my options with regards to referential integrity are : Ditch the referential integrity, which I am vehemently opposed to. ++ Create multiple flag tables, each with the exact same schema, but a different Foreign Key relationship, which just seems wrong. Managing referential integrity via triggers. While I’m not a big fan of triggers, the ‘Managing referential integrity via triggers.’ option seems like the only tolerable one. So I added the [entity_type] column to my flag table. Removed the relationships And wrote the following trigger to manage the foreign key relationship. -- ============================================= -- Description: maintain referential integrity on -- a column which is a FK for different tables -- ============================================= CREATE TRIGGER flag_entity_id_fk ON flag AFTER INSERT,UPDATE AS BEGIN declare @entity_type char(1); declare @entity_id int; declare @cnt int; -- SET NOCOUNT ON added to prevent extra result sets from -- interfering with SELECT statements. SET NOCOUNT ON; -- get info select @entity_type=entity_type, @entity_id=entity_id,... read moreHow to hook up NUnit as an option in the ASP.NET MVC application wizard
Ben Alabaster and I have started a project, lets call it Project X for now (original eh? ;-). Anyway, we have decided to blog about its development. I will be posting something in the next few days, and Ben just published his first post “How do I hook my version of nUnit into the ASP.NET MVC template?“. Here’s a blurb: If you’ve been looking for a way to integrate nUnit into your ASP.NET MVC 1.0 template – that is, when you create a new ASP.NET MVC application and it asks you if you’d like to create a test project, nUnit shows up in the list along with the usual Visual Studio Unit Test option. There are a number of longwinded ways of doing things. There’s also a relatively simple way touted on the Visual Web Developer Team Blog which I’ll spare you the headache of running it and finding the same problems I did…. Ben’s post can be found on his blog... read moreInvestigating the relationship between estimation accuracy and task size
Yesterday on StackOverflow Johannes Hansen asked What is the acceptable upper limit of time allocated to a single development task? I answered with If you track your estimate/actual history, you can probably plot hours by accuracy and figure out exactly what number is appropriate for your team. My advice sounded so good I thought I’d try it myself. So I opened bug tracker where I keep track of my probable and actual times and exported my closed bugs to Excel. I cleaned up a bit, by removing any rows with either a 0 probable or actual time, then created a chart. Now when I conceived of this idea, I was expecting something like Well I wasn’t expecting the plots to be that dense, or to accelerate above 200% so fast, but let’s just say, that general look would have been pleasing to my eye. Here’s what I got instead. Now, I’ve got to say, is NOT what I was expecting at all. You can kind of see a very dense block under 4 hours and 100%, but doesn’t tell us very much with regards to the relationship between estimation accuracy and size of the tasks. So, I then threw a Linear Regression Trendline on the chart hoping it would illuminate an ascending trend. Instead it contradicted my assumptions by declining, suggesting the larger the task, the more accurate I am … which isn’t true at all. Maybe it’s the outliers. Maybe it’s the weird changes outside of normality causing it to look so horrible. So I sorted the data by the accuracy percentage, dropped the top and bottom 5 percent, redrew the chart and got this. Still obvious relationship between the estimated task size and estimation accuracy. But at least my trendline is no longer declining. By flat lining, it’s now suggesting there is no relationship between estimation accuracy and task size. … hmmm … bugs are included in my data. I wonder if that could be having an effect? I’ve been estimating approximate times bugs will take to resolve for my manager. Most of these bugs have been estimated before even investigating the cause, so that’s not really the same as estimating a defined task. What if I remove them? I went back to my original data dump, removed all bugs, tickets, and questions so I was left with only new tasks and changes. I again removed the bottom & top 5% and recharted. Well, I’ve finally got... read moreA very simple Pair Programming IP Rights Agreement
Once in 1998, I sat down with my manager (the only manager I’ve ever had who could program), and we banged out some code for about 2 days. It was a very fast paced synergistic activity where one idea fed another and at the end of 2 days our initial idea morphed into something completely different and a heck of a lot better. Well, tonight 11 years later, I’ve convinced my colleague Ben Alabaster to come over and pair program. I don’t know how it will go, I’ve got high hopes, but I am confident at the end of the night both Ben and myself will be a little better as programmers, and might have even started something worth finishing. But two things I do know: 1) if we come up with something good, we’re both going to want to use it. And 2) if we ever get to the point of needing an agreement outlining our IP rights, it will be too late to draft one. So, Ben & I threw together some basic rules yesterday. Frankly, I’m surprised I couldn’t find any on the net already, maybe I over think this stuff more than most people, or perhaps it’s because I just didn’t look that hard. So here’s what we agreed to: Each of us, individually, is free to use any programming concept shared, discovered, or created. Each of us, individually, is free to use anything we cocreate as part of a larger project with a significant amount of additional functionality. This can be a personal project, business project, or consulting project. Each of us must agree to release any code or binaries either as a commercial product or open source. Each of us will share any credit and/or financial profits equally. I’d love to hear other people’s perspective and comments about this. Copyright © John MacIntyre 2009, All rights... read more11 Personal Programming Assumptions That Were Incorrect
Today I got side tracked and spent an unreasonable amount of time on StackOverflow.com. One of the questions I was looking at was What is your longest-held programming assumption that turned out to be incorrect?
Many of the answers immediately resonated with me, like Instantsoup’s answer That people knew what they wanted and JohnFx’s awesome answer about comparing his knowledge to the collective knowledge of all other programmers. Other answers reflected a poor initial understanding of the language or technology, many of these I was fortunate enough to not relate to.
As you can imagine, I immediately started coming up with my own answers, so I continued reading to make sure they weren’t already there. But as I read, I came up with more initial assumptions which proved to be false. I thought I’d pick out the best, and answer with that one, but realized I had a whole blog post!
So without further ado; here is my list of assumptions about programming and the industry which proved to be incorrect: …
read more6 simple steps to a stress free database change deployment
… Deployments can be a real headache at the best of times, but especially when schema updates to a production database are involved. Don’t get me wrong, you usually have a backup to fall back on, but how long will that take to restore? … Really, you don’t want to resort to the restore, have the database offline for that long, or have your name associated to it. So gradually I evolved a process which has kept me sane and confident when deploying schema changes to production servers, even on large, sensitive, and active databases….
read moreHow To Get The Most Frequently Used Column Values
Whenever I import external data, integrate to another database, or am new to a project, I need to get familiar with the database. The table schemas, relational integrity, and constraints are the first thing I look at and take me a long way, but soon I need to know what the data looks like.
In an ideal world, relational integrity and database constraints would define control this, and all I’d really need to do is look at those. But the reality is, in 15 years of working in this industry, most of the databases I’ve worked on, that I didn’t design, have barely used constraints and some haven’t even used relation integrity fully!
The need to get a good feel of the data is even more prevalent when working with dirty data, or when refactoring poorly written applications to ensure any refactoring doesn’t introduce other issues. I will usually wind up writing the following query repeatedly:
….