Tag Archives: #testing

ExcludeFromCodeCoverage considered harmful

Metrics are only useful if they help you improve. Code coverage KPIs are most often circumvented. Liberal use of the ExcludeFromCodeCoverage attribute is to be avoided.
Abuse of the ExcludeFromCodeCoverage attribute

I once worked on a project that had a mandatory code coverage target. If your commit didn’t maintain the overall coverage ratio of 75% then it was rejected. The reasoning came from good intentions: high code coverage is good, therefore we will mandate that it must be high. It had some unfortunate side effects though. The first unintended consequence was that developers only wrote enough tests to keep the value above the target instead of considering how they needed to test their code. The second consequence was the proliferation of [ExcludeFromCodeCoverage] attributes adorning all sorts of classes.

The attribute was originally designed for generated code but more recently I’ve seen it applied where the code is hard to test or too simple to test. Ultimately this hides code from testing metrics. Your code coverage metric is no longer accurate nor useful.

Yeah, we are at 85% code coverage ignoring the code we excluded because it was hard to test.

My immediate response to this is “How come it isn’t 100% then?”

I would much rather have a lower, but accurate, code coverage metric so I consider the use of this attribute harmful on any code that isn’t generated.

I wish code analysis tools like SonarQube or the Roslyn analyzers treated this attribute like SuppressMessageAttribute - the Justification property should be filled out when applied. Better still, just fix the issue and avoid the need to use either of them.
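As a rough sketch of what that could look like: from .NET 5 onwards ExcludeFromCodeCoverageAttribute has its own Justification property, so a team could enforce the rule itself with a small reflection check. The class names and the CoverageExclusionAudit helper below are hypothetical, made up for illustration; they are not part of any analyzer.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;
using System.Reflection;

// Hypothetical generated class: the one case where exclusion is reasonable.
[ExcludeFromCodeCoverage(Justification = "Produced by a code generator and regenerated on every build.")]
public sealed class GeneratedCustomerDto
{
    public string Name { get; set; } = "";
}

// Hypothetical class excluded with no stated reason: the case to flag.
[ExcludeFromCodeCoverage]
public sealed class HardToTestService
{
}

public static class CoverageExclusionAudit
{
    // Returns true when the type is excluded from coverage without a justification.
    public static bool IsUnjustifiedExclusion(Type type)
    {
        var attribute = type.GetCustomAttribute<ExcludeFromCodeCoverageAttribute>();
        return attribute is not null && string.IsNullOrWhiteSpace(attribute.Justification);
    }
}
```

A unit test could run this check over every type in an assembly and fail the build when it finds an unexplained exclusion. It only enforces that a reason was written down, not that the reason is a good one.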

Why should I test properties?

The consensus on Should you Unit Test simple properties? seems pretty much for testing and you can use the examples here to test yours. I prefer property based testing though because it can find edge cases you didn’t think of. There is a great intro at Property-Based Testing with C# using FsCheck. Your code will effectively look like:

[Property]
public Property Set_Then_Get_Returns_Same(string exampleValue)
{
    var target = new ClassYouWantToTest();
    target.PropertyToTest = exampleValue;
    return (target.PropertyToTest == exampleValue).ToProperty();
}

FsCheck will generate a bunch of random values to try this test with, so these 4 lines of code result in hundreds of unique tests for this property including, for this example: blank strings, null values, very short, very long, non-printing and accented characters, etc.
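To make the idea concrete without relying on any library, here is a hand-rolled sketch of what a property-based round-trip check does. It is not how FsCheck is implemented (FsCheck also shrinks failing inputs to a minimal counterexample, which this does not), and Customer, SampleStrings and FindCounterexamples are hypothetical names used only for illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class under test, standing in for ClassYouWantToTest.
public sealed class Customer
{
    public string? Name { get; set; }
}

public static class RoundTripCheck
{
    // Yields six fixed edge cases first, then randomCount random strings.
    public static IEnumerable<string?> SampleStrings(int randomCount, int seed)
    {
        yield return null;
        yield return "";
        yield return " ";
        yield return "\t\u0000";              // non-printing characters
        yield return "\u00e9\u00f1\u00fc";    // accented characters
        yield return new string('x', 10_000); // very long value

        var random = new Random(seed);
        for (var i = 0; i < randomCount; i++)
        {
            // Random length below 50, characters drawn from U+0020 to U+02FE.
            var length = random.Next(0, 50);
            var chars = Enumerable.Range(0, length)
                .Select(_ => (char)random.Next(0x20, 0x2FF))
                .ToArray();
            yield return new string(chars);
        }
    }

    // Returns every input for which set-then-get does not return the same value.
    public static List<string?> FindCounterexamples(int randomCount = 100, int seed = 42)
    {
        return SampleStrings(randomCount, seed)
            .Where(value =>
            {
                var target = new Customer { Name = value };
                return target.Name != value;
            })
            .ToList();
    }
}
```

An empty list from FindCounterexamples means the property held for every sampled value. A setter that trimmed or normalised its input would show up immediately as a non-empty list.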

Why should I test … something else?

I will come back and add more examples as I encounter them.

Summary

Code coverage metrics are only useful if they help improve the software quality.

Mandatory targets can lead to harmful practices such as excluding code from testing or writing superficial tests.

ExcludeFromCodeCoverage attributes should be treated in the same way as SuppressMessage - they hide warnings that should really be fixed.

This entry was posted in code and tagged #metrics #testing #csharp on Feb 17, 2024 .
Interesting Links #4
These seem to get longer and longer. A whole pile of links for you.

Management and Organisational Behaviour

How Serving Is Your Leadership? - Who is working for who here?

Be a Manager - “The only reason there’s so many awful managers is that good people like you refuse to do the job.”

I’m the Boss! Why Should I Care If You Like Me? - Because your team will be more productive… Here are some pointers.

Software Development

Technical debt 101 - Do you think you know what technical debt is and how to tackle it? Even so I’m sure this article has more you can discover and learn. A must read.

Heisenberg Developers - So true. In fact this hits a little close to home since we use JIRA, the bug tracking tool mentioned in the article.

What is Defensive Coding? - Many think that defensive coding is just making sure you handle errors correctly but that is a small part of the process.

Need to Learn More about the Work You’re Doing? Spike It! - So you are an agile shop, your boss is demanding some story estimates and you have no idea how complex the piece of work is because it’s completely new. What do you do?

Software Development with Feature Toggles - Don’t branch, toggle instead.

Agile practices roundup - here are a number of articles I’ve found useful recently:

4 Reasons to Include Developers in Story Writing

2 Times to Play Planning Poker and 1 Time Not To

Benefits of Pair Programming

Scheduling Pairing Sessions

Inside a Pairing Session

How to review a merge commit - Phil dives into the misunderstood world of merge commits and reviews. Also see this list of things to look out for during code reviews.

Functional Programming

Don’t Be Scared Of Functional Programming - A good introduction to functional programming concepts using JavaScript as the demonstration language.

Seamlessly integrating T-SQL and F# in the same code - The latest version of FSharp.Data allows you to write syntax checked SQL directly in your F# source and it executes as fast as Dapper.

Railway Oriented Programming - This is a functional technique but I’ve recently been using it in C# when I needed to process many items in a sequence, any of which could fail and I want to collect all the errors up for reporting back to ops. It is harder to do in C# since there are no discriminated unions but a custom wrapper class is enough.

Erlang and code style - A different language this time, Erlang. How easy is programming when you don’t have to code defensively and crashing is the preferred way of handling errors?

Twenty six low-risk ways to use F# at work - Some great ways to get into F# programming without risking your current project.

A proposal for a new C# syntax - A lovely way to look at writing C# using a familiar but lighter weight syntax. C# 6 has some of these features planned but this goes further. Do check out the link at the end of the final proposal.

Excel-DNA: Three Stories - Integrating F# into Excel - a data analyst’s dream…

Data Warehousing

Signs your Data Warehouse is Heading for the Boneyard - Some interesting things to look out for if you hold the purse strings to a data warehouse project. How many have you seen before?

The 3 Big Lies of Data - I’ve heard these three lies over and over from business users and technology vendors alike. Who is kidding who?

Six things I wish we had known about scaling - Not specifically about data warehouses but these are all issues we see on a regular basis.

Why Hadoop Only Solves a Third of the Growing Pains for Big Data - You can’t just go and install a Hadoop cluster. There is more to it than that.

Microsoft Azure Machine Learning - Finally it looks like we can have a simple way of doing cloud scale data mining.

Data Visualization

5 Tips to Good Vizzin’ - So many visualizations break these rules.

Five indicators you aren’t using Tableau to its full potential - I’ve seen a few of these recently - tables anyone?

Create a default Tableau Template - Should save some time when you have a pile of dashboards to create.

Building a Tableau Center of Excellence - It is so easy to misunderstand Tableau which is not helped by a very effective sales team. This article has some great advice for introducing Tableau into your organisation.

Beginner’s guide to R: Painless data visualization - Some simple R data visualization tips.

Visualizing Data with D3 - If you need complete control over your visualization then D3 is just what you need. It can be pretty low-level but it’s easy to produce some amazing stuff with a bit of JavaScript programming.

Testing

I Don’t Have Time for Unit Testing - I’ve recently been guilty of this myself so I like to keep a reminder around - you will go faster if you write tests.

Property Based Testing with FsCheck - FsCheck is a fantastic tool primarily used in testing F# code but there is no reason it can’t be used with C# too. It generates automated test cases to explore test boundaries. I love the concise nature of F# test code too especially with proper sentences for test names.

Analysis Services

I’ve collected a lot of useful links for Analysis Services, both tabular and multidimensional:

DAX Patterns website - This website is my go-to resource for writing DAX calculations. These two are particularly useful:

How to handle fact tables with different granularities

Implement Budget Allocation in DAX

Using Tabular Models in a Large-scale Commercial Solution - Experiences of SSAS tabular in a large solution. Some tips, tricks and things to avoid.

Also:

Experiences & Tabular Tips

Row and Column (Cell) based security in SSAS Tabular Model

How connections will hurt your Tabular Workload

Listing Active Queries with PowerShell

How to build your own SSAS Resource Governor with PowerShell

5 Tools for Understanding BISM Storage

How to Automate SSAS Cube Partitioning in SSIS

How to turn off/on bitmap indexes in SSAS

Context Aware And Customised Drillthrough
This entry was posted in reference and tagged #business-intelligence #data-visualization #data-warehouse #excel #functional-programming #tableau #testing on Jul 9, 2014 .
Interesting Links #2

January was a long month so I’ve got quite a list for you. I may consider doing these more often if readers think there are too many items for a single list.

Governance

Self-Service Business Intelligence Governance - Essential reading/watching for anyone planning to deliver self-service business intelligence.

Five Stages of Data Grief - we’ve all been through this, “If you don’t think you have a quality problem with your data you haven’t looked at it yet”.

Functional Programming

Maybe that shouldn’t be settable - Bringing some of the F# Option type goodness into a C# world.

Software Process

Five Tips to Get Your Organisation Releasing Software Frequently - my team score well on these but culturally I can see some being quite difficult to implement, particularly around the devops style organisation of teams.

Pairing vs. Code Review: Comparing Developer Cultures - pros and cons for each style of quality culture. Which, if any, is best?

Is Agile BI Really a Better Mousetrap? - A great article on the benefits of agile BI. This really appeals due to its use of development process business intelligence - measure and optimise just like we preach to our customers.

Using Vertical Slicing and Estimation to make Business Decisions at Adobe - A good look at the release planning process at Adobe with some nice techniques discussed.

Personal Development

Of Orcs and Software Craftsmanship - Best quote of the month if you are a parent: “These are the types of error messages that make debugging a software like debugging a 2 month old baby.”

Yak Shaving Defined - Sometimes it feels like this all day long in software.

Organisational Behaviour

Performance Reviews Are Not Useful; Feedback Is - Personally I think performance reviews are something that human resources departments mandate; feedback is something that leaders give.

If Managers Don’t Give Performance Reviews, What Happens? - Well, as it turns out, a lot of good things start to happen.

Top 10 ways to ensure your best people will quit - some common mistakes; how many have you come across?

Testing and Test Driven Development

These next three links are related and if you read the first you should also read the second and third.

The Failures of “Intro to TDD” - Justin Searls rips into the current way of teaching test driven development.

The Domain Discontinuity - Bob Martin responds comprehensively but ends with why the issue is not about test driven development but wider issues such as architecture and domain design.

Commentary on ‘Roman Numerals Kata with Commentary’ - Ultimately you must understand your domain before trying to do test driven development.

Databases

Default Configuration of SQL Server - Like most software, out of the box SQL is configured for the most general case and may need extra tuning for specific workloads. Thomas gives a simple set of extra configuration changes and reasons why. Also love the quote “If you are working in a bank, they may not apply to you.”

Data Visualization

Announcing Power BI for Office 365 - In case you missed it, all the fancy new BI capabilities in the Microsoft cloud are publicly available now. Shame we are stuck using corporate infrastructure.

Famous Movie Quotes as Charts - A fun look at communication in chart form.

Ten Tips and Tricks for New Tableau users - A rather nausea inducing format but useful tips for making great Tableau dashboards.

Power Tools for Tableau - Desperate for some sort of an API with Tableau? This may be the answer.

Statistics and Data Analysis

Revolution Analytics - Want to run ‘R’ statistics against your Hadoop data? This seems to be the way to do it…

Learn R interactively with the swirl package - It looks like R is going to be an important tool for us so anything that makes it easier to learn is a bonus.

Learn Data Science Online with DataCamp - Similarly, learning data science online and interactively.

Analysis of Health Inspection Data using F# - Another great example of using F# (and D3) to analyse data quickly and easily.

Big Data

Big Data: The organizational challenge - Some interesting stats comparing companies with the best analytic capabilities vs. those that don’t.

Update on Stinger: the view from a Microsoft Committer - Stinger is the Hortonworks initiative for faster SQL queries against Hadoop. This article describes some of the recent performance gains.

How To Install Hadoop on Windows with HDP 2.0 - Get Hadoop running on Windows with a minimum of fuss. However, our local Hadoop expert recommends you only do this at home; in the enterprise just set up a proper development cluster.

How To Use Microsoft Excel to Visualize Hadoop Data - Tutorial for visualizing Hadoop data in Excel/PowerView, this one is for stock quotes.

How to Visualize Website Clickstream Data - Another Hadoop tutorial this time on web click-stream data.

50+ Open Source Tools for Big Data - I think one of the problems with open source is that it is littered with cute names that do little to describe software function, so here is a useful list to help you distinguish the likes of Orient, Flock, Storm and others.

Building your own web analytics system using Big Data tools - Should you build these things yourself? What are the choices? Are there any risks?

Master Data the noun in Big Data sentences - I often talk about master data and spend more time worrying about dimension design than facts. It is useful to see how this applies to big data too.

You don’t have big data… - With all this talk of big data it is worth remembering that most use cases do not qualify as big. Most likely you have ‘hot data’.

This entry was posted in reference and tagged #apache-hadoop #domain-driven-design #microsoft-excel #tableau #test-driven-development #testing on Feb 18, 2014 .
Interesting Links #1
Since I manage to read so much on the train I think readers will find some of the articles useful so I plan on listing up the best ones each month.

Business Intelligence

Design Tip #162 Leverage Data Visualization Tools, But Avoid Anarchy - This month’s Kimball Group design tip and incredibly timely considering how we are using Tableau at work. I think we should make it required reading for all business users of Tableau.

Databases

The Baker’s Dozen: 13 Differences Between Analysis Services OLAP and Tabular - An in-depth look at the functional and usage differences between the two flavours of Analysis Services.

Clustered Indexes vs. Heaps - Not a lot of people know that… Thomas Kejser goes in-depth on clustered index performance relative to heaps for both OLTP and OLAP workloads. I bet there is something for everyone to learn in this article.

Indexing a PK GUID in SQL Server 2012 - Again Thomas debunks some myths about GUID keys and scalability in OLTP systems.

Code

Complete Guide to Lazy Loading in C# - The Lazy type in C# 4.0 is a useful tool for performance optimising applications. This article describes its use and various threading options.

F#, Deedle and Computational Investing - Another example of how concise F# is; stock correlation charts in under 75 lines of code.

Testing

Patterns of Effective Test Setup - A set of techniques for avoiding complex unit test setup code; ensuring test clarity and reducing brittleness. This is just the start really and libraries such as AutoFixture can help even more once you have the basics right.

Unit Testing SQL Server OLAP Cubes Using C# - Not really unit testing by most definitions, more like regression testing. In many ways similar to what we do at work with some interesting additions.

Test SAML with #Tableau Server on the cheap - If you end up having to configure and test Tableau SAML this will help.

Development Process

Workflows of Refactoring - A great little slide deck from Martin Fowler about the various refactoring workflows (hint: it is to never refactor and add functionality at the same time).

When Should You Refactor - Everyone remembers the conversation with their manager: “We really need to spend some time refactoring; can we add some time in the schedule?”. This article contrasts this “Big Bang Refactor” with a far better “Incremental Refactor”.

The Value of Persistent Chat in Incident Management, Support and Business Continuity - We have talked about persistent chat a lot in our sprint retrospectives. This is a bit salesy but good points on the value it brings.

When is it a Good Idea to write Bad Code? - Discusses the trade-offs you make when introducing technical debt into the code base.

How to Run a Successful Open Source Project - Good advice for all successful projects, not just open source ones.

Personal Development

Best development book I’ve read, has no code in it - Looks like one of those must read books for those that take their career seriously. Love the quotes “If you’re worried that your current job is rotting your brain, it probably is.” and “Expose Your Ignorance. Tomorrow I need to look stupider and feel better about it. This staying quiet and trying to guess what’s going on isn’t working so well.”

Don’t Get Me Started: The Steam Drill - Learn to recognise when your skills are out of date and need refreshing to stay competitive.

Uber-Architects: The Building Metaphor Is Dead - Not what you think. The role of the architect is changing for the better.

Organisational Behaviour

The Open-Office Trap - New Yorker article rounding up all the research done on open space workplace productivity. Some interesting results among the expected ones.

Can-Do vs. Can’t-Do Culture - “The trouble with innovation is that truly innovative ideas often look like bad ideas at the time.” Next time you are thinking why something won’t work, take a moment to consider if you are stopping innovation.

Don’t interrupt developers - Absolutely nails why you should not interrupt developers.

Are Your Programmers Working Hard, Or Are They Lazy? - “the appearance of hard work is often an indication of failure” - a must read for both developers and managers.
This entry was posted in reference and tagged #business-intelligence #code #indexes #organisational-behaviour #personal-development #software-process #sql-server #tableau #testing on Jan 15, 2014 .
A behaviour driven testing framework for batch processing systems
Recently I’ve been working on a testing framework to support testing of batch systems such as data warehouses.

The framework is called ‘posh-gwen’ due to the three behaviour driven methods Given, When, and thEN. The first version is on GitHub at: https://github.com/jsnape/posh-gwen. Comments, suggestions and pull requests are welcome.

So why should you care about using this framework?

It is difficult to test batch systems using modern test frameworks such as SpecFlow or FitNesse because of the simple rule that good tests should be isolated from one another. All these frameworks run tests in sequence:

do something

check something

clean up

move on to the next test

For this to be successful each test has to run very fast. Most batch processing systems are optimised for bulk processing of data. They may take tens of seconds to run end to end even with a single row of data so running hundreds of tests independently can take hours.

This framework is designed to break the rule of sequential test execution. All tests are run in parallel by phase.

The best way to test batch processing is for a known input data set to contain many test cases. The batch is run, loading all data at once. Finally, a number of queries are executed against the resulting system. So for example a data warehouse might load a number of source files using an ETL framework such as SQL Server Integration Services. Once loaded, the data warehouse can be queried to check that expected values exist in the final system.
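The phase ordering described above can be sketched in a few lines. posh-gwen itself is written in PowerShell and this is not its API; BatchTest and PhasedRunner are hypothetical names in C#, used only to show every Given running first, the batch running once, and every Then running last.

```csharp
using System;
using System.Collections.Generic;

// One test case split into its setup (Given) and verification (Then) steps.
public sealed record BatchTest(string Name, Action Given, Func<bool> Then);

public static class PhasedRunner
{
    // Runs every Given, then the batch exactly once, then every Then.
    public static Dictionary<string, bool> Run(IReadOnlyList<BatchTest> tests, Action runBatch)
    {
        foreach (var test in tests)
        {
            test.Given();   // phase 1: stage input data for all tests
        }

        runBatch();         // phase 2: a single, slow end-to-end batch run

        var results = new Dictionary<string, bool>();
        foreach (var test in tests)
        {
            results[test.Name] = test.Then();   // phase 3: verify each test's own data
        }

        return results;
    }
}
```

The cost of the slow batch run is paid once however many tests are registered, which is the whole point of breaking the sequential rule.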

It is still important to make sure that each test is isolated from others or else changes in one might cause a number of others to fail or become invalid.

We can do this for batch processing by data isolation - that is, carving up data domains so that only a single test uses data from each domain. Verification of the results then forces the query to execute against that test-specific sub-domain.

There are a number of suitable domains to use but any with high cardinality are best:

Dates - each day is a single test (or blocks of days, weeks, years etc. for those tests that need to span days).

Transaction identifiers - use a map of IDs to test cases or in the case of strings prefix the transaction id with the test case number.

Business keys - for entities such as customer or product there is usually an ID field used as the business key; use the same methods as transaction identifiers.

Custom attributes - if none of the above will work then you might consider adding an extra attribute to the source data which is passed through the batch system. Obviously this is not a preferred solution since you will have to change your system.

Combinations of the above - sometimes depending on where you need to validate you might need multiple solutions.
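As an illustration of the first two domains, here is a minimal sketch that gives each test case its own calendar day and its own transaction ID prefix. The base date, the T0000- prefix format and all type and method names are assumptions made for this example, not something posh-gwen prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified row as it might appear after the batch has loaded it.
public sealed record Transaction(string Id, DateTime Date, decimal Amount);

public static class TestDomain
{
    // Arbitrary base date chosen for this sketch.
    private static readonly DateTime BaseDate = new DateTime(2000, 1, 1);

    // Each test case owns exactly one calendar day.
    public static DateTime DateFor(int testCase) => BaseDate.AddDays(testCase);

    // String identifiers are prefixed with the owning test case number.
    public static string TransactionId(int testCase, string localId) =>
        $"T{testCase:D4}-{localId}";

    // Verification only sees rows that belong to the test's own sub-domain.
    public static IEnumerable<Transaction> RowsFor(int testCase, IEnumerable<Transaction> loaded) =>
        loaded.Where(row =>
            row.Date.Date == DateFor(testCase) &&
            row.Id.StartsWith($"T{testCase:D4}-", StringComparison.Ordinal));
}
```

Test 7 would create its input rows with TestDomain.TransactionId(7, "A") and TestDomain.DateFor(7), then assert only against TestDomain.RowsFor(7, loadedRows), so changing the data for any other test cannot affect its result.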

Go try it out and let me know how it goes. I plan on adding more features over the coming months.
This entry was posted in software-testing and tagged #batch #batch-processing #batch-systems #bdd #fitnesse #github #powershell #testing on Mar 29, 2013 .
