Tuesday, June 28, 2011

Picking the right laptop

Another post on laptops, and I hope it will be my last for a while. Earlier this year I decided to pick up a MacBook Air to use when traveling. It's the lightest full laptop on the market and the size is a dream. The problems began quickly. First, I had to spend some time installing Boot Camp and setting up Windows 7 on it with all my software, as most macOS programs are useless for what I do. The bigger problem was the glossy screen: after just 30 minutes of use, I had severe eye strain and couldn't see well for the rest of the day. After a couple of weeks I put it in the closet, and a quick Bing search revealed that others had noted the exact same problem. A friend borrowed it for a few weeks and had no issues, and the Apple store said the machine was perfect, so my guess is that my eyes are just a little sensitive. I resold it and moved back to my second choice, a Lenovo ThinkPad.

I finally decided on a ThinkPad T420s. It's a full machine with a 14" screen that weighs only 1 lb more than the MBA and has all the features of a full laptop (DVD-ROM drive, removable battery, anti-glare screen, great screen resolution). Crucial provided me an additional 4 GB of RAM, so now I'm running an 8 GB machine that's a dream to carry and can natively run all the programs that I need. The Lenovo cost the same as the MBA, so that's a double win. To anyone contemplating a thin, light 14" laptop, I highly recommend checking out the Lenovo lineup.

Monday, June 27, 2011

Types of Data

This is a question I get asked quite a bit: "What are the different types of data that I can collect in my data warehouse?" Always an interesting topic, so I'll start by saying there are three basic types of data: empirical, derived, and anecdotal.

The basic case for data warehousing starts with empirical data. This is data that is collected directly, e.g. "I sold 10 widgets this week". Most data warehouses are built on this type of data, because it is a "fact" in the everyday sense: something recorded as true. Don't confuse that with a fact in dimensional modeling, though; a dimension attribute such as an address is also empirical in nature.

A second type of data is derived. This is data that is created from another type of data. An example of derived data is "I sold 10 widgets this week for $1 each, therefore my total sales are $10 for the week". Any computation that produces a bigger picture from other data yields derived data. Think of aggregations as derived data.
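
As a minimal sketch of the widget example, assuming a hypothetical `Sale` record (the class and its field names are made up for illustration, not taken from any particular warehouse), the empirical rows are stored as collected and the weekly total is computed from them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One empirical record: something that was actually observed and recorded."""
    product: str
    quantity: int
    unit_price: float  # dollars


def total_sales(sales):
    """Derived data: computed from empirical records, never collected directly."""
    return sum(s.quantity * s.unit_price for s in sales)


# "I sold 10 widgets this week for $1 each"
week = [Sale("widget", 10, 1.00)]
weekly_total = total_sales(week)
print(weekly_total)  # 10.0
```

The point of keeping the two apart is that the derived total can always be recomputed from the empirical rows, but not the other way around.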

The third type of data, and one that is less common, is anecdotal. This is data that is observed or believed but without any scientific basis. Anecdotal data often has applications in business. Say a salesman is selling widgets to a retailer, which we shall call Mega-lo-mart, and he learns through discussion with the Mega-lo-mart manager that they don't intend to buy widgets this year. The anecdotal data would be the salesman's observation that "Mega-lo-mart doesn't intend to buy widgets this year because they aren't selling well". There is no scientific evidence that this is true, but think of the business case: a salesman is wasting time trying to sell to someone who will not buy the widget. Thus, there is a case that anecdotal evidence could be used in a data warehouse application, as long as it's documented as such, to help drive decisions.
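
One way to keep anecdotal data "documented as such" is to carry the data type and its source along with every record. The sketch below is only an illustration under that assumption; the `DataType` enum, the `Observation` class, and the source strings are hypothetical names, not part of any real warehouse schema:

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    EMPIRICAL = "empirical"
    DERIVED = "derived"
    ANECDOTAL = "anecdotal"


@dataclass(frozen=True)
class Observation:
    statement: str
    data_type: DataType
    source: str  # where the statement came from, so users can judge it


observations = [
    Observation("Sold 10 widgets this week",
                DataType.EMPIRICAL, "sales system (hypothetical)"),
    Observation("Mega-lo-mart doesn't intend to buy widgets this year",
                DataType.ANECDOTAL, "salesman's discussion with the store manager"),
]


def anecdotal_only(records):
    """Return just the records flagged as anecdotal."""
    return [r for r in records if r.data_type is DataType.ANECDOTAL]


for obs in anecdotal_only(observations):
    print(f"[{obs.data_type.value}] {obs.statement} (source: {obs.source})")
```

With the type stored on each record, a report can show the salesman's observation next to the hard numbers while still making clear which is which.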

I find these data types fascinating, especially the anecdotal one. Sometimes it's difficult to determine which type a particular piece of data is, based on the way it was collected. That's the job of the architect (typically a data modeler), who analyzes the data, works with business users to determine its applicability, and builds a dimensional model that contains all three types of data to present in a business intelligence application.