1. How to write a good bug report

    Many companies have different names for what they store in their bug tracking system. I’ve heard them called tickets, cases, defects, issues, or tasks. Here at B-Line Medical we call them cases, and that’s how I’ll refer to them in this post. Regardless of their name, it’s important that they’re clear and well written. I’ve covered this in an earlier blog post, but I wanted to go in to more detail on what we consider to be a well-written case.

    I equate good cases to a recipe. Anyone should be able to read it, understand it, and work from it. A good case can be returned to months after it was created, and will still be clear and understandable. Cases should never rely on information passed through an email, conversation or IM. All relevant information should be documented in the case, and it should stand on it’s own.

    Writing a good case isn’t easy, but it’s important. So, how do we do it?

    Language

    First and foremost, your case should be written with clear language. Capitalize, punctuate, spell check and use proper grammar. This is a no-brainer. If your case is hard to read, it may cause information to get lost or passed over. You don’t want this. Be succinct, summarize your points and don’t use unnecessarily flowery language. You want to get straight to the case’s relevant information. Don’t lose the point of the post while doing this, though. You want to have the right balance of short and meaningful.

    Structure

    Your case should be broken down in to sections, and each of those sections should play a different role. If you’ve got places where redundant information needs to be entered, it will lead to problems. Don’t duplicate effort, and don’t create situations that are prone to error. You want a workflow for creating and reading cases that allows the most information to be passed on with the least amount of effort.

    Title

    The title of your case has a higher visibility than any other portion of the case. It’s going to be seen in searches, listings, and overviews. It’s going to be included in email notifications about cases, and summaries in patch notes. Developers, QA, support people, and anyone else looking at their own queue will see case titles first. Don’t take this lightly. A poorly-written case title can make one problem look less important, or look just like another problem. A good title can be the difference between a case getting the proper attention, and it getting pushed off to a backlog.  Essentially, the title is your case “at a glance.”

    So in order to make it meaningful, you need to do a few things. The case title should be specific to the problem at hand. It should contain a unique portion of the error, specific to the case, so that it is easily searchable. “Login button not working” or “Error logging in” are bad titles. They’re not descriptive, not unique to the problem at hand, and don’t use details from the actual error. Searching for this specific case would be incredibly difficult If you’ve had a hundred, or a thousand, cases about problems logging in.

    Where possible and relevant, add portions of a stack trace to the case title. “Null ref exception at XYZ after clicking the login button” is a much more descriptive title. It gives indication of what the problem is, and can be searched on. It will be descriptive enough for a summary or a case listing.

    Some situations call for a naming convention for case titles. This is not appropriate for all cases, but will help out in certain situations. If there’s a new feature that’s been implemented, for example. It helps to have a reference to a feature when there’s going to be a group of cases. This makes assigning them to the right person easier, and makes overviews of case titles group easily. For example, cases created while testing ModuleA could be prefixed with a “ModuleA -> Case Title.”

    While you can infer the area of your product that this references by reading the case title, that requires actually reading and processing each title individually. Having a convention will cause you to naturally group these cases together without actually needing to read the titles. This will allow you to process related cases together, rather than jumping around between different areas. You might note most bug trackers have an area dropdown to indicate the part of the product each case is referencing, and my suggestion for case names breaks the rule about redundant information. This is a good point, and why discretion needs to be used when naming cases. Putting an area can denote a subset or a superset of a particular area, giving you more granularity in your case groupings. For example, if you have an area for your permissions screen, but you have a module that adds something to that screen, you could set the area to permissions, and add the module name to the title. Or vice-versa, depending on how you’ve structured things. This way, the case gets as specific as possible, with the least amount of duplication and clutter.

    Summary

    The summary is where a lot of good cases fall apart. This is the place to expand on what you started with the title. It should be short, only a few sentences. You want to provide an overview of what is going on, not a novel. This is the place to denote context to the rest of the information you’re going to provide. It should provide a quick overview of what happened, without listing out each step.  You don’t need to explain everything in this section.

    It should be towards the top, if not the top part of the case, and should be the first thing someone sees and reads when they open a case. Just like the tittle, the body of the case is usually indexed for searching. It’s important to use specific terms and language unique to the case, as it will assist searching.

    If you’re unsure how to make the summary clear, think about how you would explain the case to another person on the team. You should be able to clearly convey the meaning of the case in 10-15 seconds by reading this section out loud to a coworker. If it takes longer than that, or if it’s not clear, you most likely need to revamp your summary.

    Steps to reproduce

    Quite easily the most important section of any average case. Any person should be able to follow these steps, and produce identical results to what the case is describing.

    • This section should be a bulleted or numbered list.
    • Use short, descriptive sentences for each step.
    • Don’t omit common steps. You can combine multiple common steps in the same bullet to save space.

    Expected vs Actual Results

    This section should be very succinct, and differs slightly from others. It’s a method of internal quality control. This can help weed out cases that aren’t actually bugs, by highlighting what the person creating the case is expecting to see. If this expected description differs from actual requirements, you’ll save development time by catching it before code is changed. This helps keep everyone on the same page when it comes to expected functionality. If there’s any confusion as to what is supposed to happen, this can spark discussion, and help clarify requirements.

    Stack trace/error information

    Actual data is crucial. Logs, screenshots, databases, or anything related to your application that can be used in debugging should go here. We used to only ask for logs, but found that including various other types of data has increased case visibility and readability immensely.

    Screenshots in even the most simple cases can help visualize what can take a paragraph to describe.

    We ask that all cases that have a stack trace, that it be included in the body of the case itself. Inline stack traces, copy and pasted within a formatted block, will get indexed for searching. This means I can plug part of a stack trace in to a search, and find similar cases. When logs are attached as files, they aren’t indexed and this isn’t possible. As well, a developer who is looking at the case can see a snippit of what’s happening without loading another application. The goal here is to maximize the ease of finding cases, as well as minimizing the time to fix.

    Databases, disk image snapshots, process dumps, Wireshark captures, or other raw data can be provided to help reproduce a specific application or machine state that can be difficult or impossible to set up. Some bugs are data, environment, or network dependent, and loading up the exact setup that was being used to test it will make it easier to reproduce. These additions can help speed up the time to fix a case by a significant margin.

    Include the version that this was discovered. Some bug trackers have a field for “Milestone”. I tend to see the milestone field as what version the bug will be fixed, not the version it was discovered. You want to make sure that both are clearly denoted.

    Supplemental Information

    Finally, I like having a section at the bottom for the leftover information. Not all cases need this, but it’s crucial to some.

    • History and Context are important.
      • If there is a story behind this bug, make sure to tell it.
      • If this is for an angry client, this can be noted here.
    • If you need a particular setup or machine, this is the place to say why.

     

    These aren’t hard and fast rules, more like guidelines. Some of them might not apply to your shop, and you might have things to add. The important thing is to establish a consistent format that your people can follow, and then enforcing it. Don’t let laziness be an excuse for bad bugs.

    Read More
  2. Advice to new engineers

    I had the opportunity to represent the company I work for at an engineering networking event at the University of Maryland today catered to young engineering students of all disciplines. The basic idea was to be available for students to ask questions they don’t normally get to ask of working professionals such as “what’s the day to day like?” [lots of coffee, followed by coding all day], “what advice would you give to someone looking to get into xyz field”, etc.

    Personally I had a great time being there since as an alum I felt like I could relate to their specific college experience. In this post, I wanted to share a couple of the main points that came up today during my informal discussions with the students.

    Don’t be afraid of problems

    I really wanted to stress this to the people I talked to today. You can’t anticipate every problem you will face in the technical world, and the only real way to succeed in a career is to accept that. The trick is, though, is to know just enough to be able to find the information you want. If you can’t find the info you want, ask someone! Unlike school, group work is encouraged. On top of that, the things you learn in school won’t prepare you for all the real world things you will encounter. All a good education really gives you is the toolset to help you find the information you need.

    Not being afraid of problems means you won’t freeze and give up when you’re faced with what seems like an insurmountable issue. Break things down into smaller sets; do some research. Eventually you’ll find a solution, or at least be more informed as to why you can’t solve a certain problem and hopefully have learned from it.

    Remember, nobody knows the answer to everything, and if they say they do they are lying.

    Work with people you like

    Almost 5/7 of the week (and sometimes more) is spent with a bunch of people at work. If you don’t like who you work with that’s a problem. I think recent graduates don’t realize that at an interview the interviewer should be selling themselves to the candidate just as much as the candidate to the interviewer. It has to be a good match, both professionally and personally. If you come out of an interview and feel like you just talked to the weirdest most uncomfortable person ever, don’t work there! It’s natural to be afraid of saying no to a job that was offered, especially when you are starting out. But, if you can afford to, it’s good to be picky. The people you work with can make all the difference between a place you consider a “job” and a place where you get to practice your hobby all day long and get paid for it.

    On top of that, don’t work at a place where you won’t feel challenged. If you can find a mentor at that place that’s even better, because guided growth (especially in the beginning of a career) is invaluable.

    Also, don’t worry about any stigma of jumping ship early. Leaving a job after a year isn’t a bad thing if it’s not a right fit. Find somewhere else to work. Engineering is a field that is in demand right now, but it’s also extremely competitive and constantly changing. The only way to be competitive is to always be learning.

    Interests matter

    For me, when I’m conducting interviews, what really sets people apart is their level of enthusiasm and interest. You can be the best engineer in the world, but if you don’t care about what you work on, or your field, you won’t do a good job. Being enthusiastic about the field you are in is important. If you care about what you do, whether its computer engineering, or biological engineering, or whatever, you should have personal projects you can show. Even just showing you’ve gone above and beyond basic classwork and done research or internships in an area goes a long way.

    I don’t think it matters how big or small the personal projects are, what matters is you spent the time independently to do them. People frequently suggest contributing to open source projects, and that’s great, if you have time. But if not, small personal projects also show interest and a real drive to learn and do better.

    Engineering is fun as hell

    I could spend all day doling out advice, but I only had about an hour with the students. In the end, while sometimes the engineering field can be wrought with roadblocks, if you can get past them it’s super fun and gratifying to build stuff that works. Sometimes as a student it’s hard to see how all the pieces fit, but they do, and if you persevere through it a career in engineering can be extremely satisfying.

    Read More
  3. Import Firefox Keywords into Chrome as Search Engines

    I just switched from Firefox to Chrome. Of course the first thing I did was import all my bookmarks. The second thing I did was try to use them, via their Firefox keyword. None of them worked. After a quick web search I found a lot of complaints about the inability to import Firefox keywords into Chrome but surprisingly no solution. I poked around a bit and found both browsers store their bookmarks, keywords, and search engine data in SQLite databases. This makes import straightforward. Here’s how.

    Step 1: Download SQLite Command Line Interface

    If you don’t already have sqlite3.exe on your computer, then download it here from SQLite.org. The file you want is Precompiled Binaries for Windows or whichever is appropriate for your OS.

    Once you’ve downloaded it, you can either copy the executable inside to a directory in your path, such as C:\Windows\, or use the full path when launching it in the command line, below.

    Step 2: Close browsers

    The next step is to close both Firefox and Chrome. Both browsers maintain a lock on the database while open, so in order to read and write from the databases you’ll need to close both browsers.

    To keep these instructions available, open it in Internet Explorer (yuck) or another browser if you have one installed (Opera, Safari). Alternately, you can do it the old-fashioned way and print it on paper.

    Step 3: Locate Firefox’s bookmarks file

    Firefox stores its bookmarks in a file called places.sqlite. This is the root of your profiles directory. The exact location will vary by OS and version. On my Windows 7 box, it’s at C:\Users\sam\AppData\Roaming\Mozilla\Firefox\Profiles\x0wk0n99.default\places.sqlite.

    You don’t need to do anything with the file for now, just copy it’s location to a text file. Don’t just copy to the clipboard, copy to a text file since we’ll need to manipulate it slightly.

    Step 4: Locate Chrome’s Search Engines file

    Chrome stores its search engine information in a file called Web Data, no extension. Again, the exact location may vary. Mine is at

    C:\Users\sam\AppData\Local\Google\Chrome\User Data\Profile 1\Web Data
    

    Step 5: Open a Command Prompt

    Start > cmd
    

    Step 6: Open Chrome’s Search Engines in SQLite

    Using the above location, open Chrome’s Search Engines file in sqlite3.exe.

    sqlite3 "C:\Users\sam\AppData\Local\Google\Chrome\User Data\Profile 1\Web Data"
    

    Remember that if you didn’t copy sqlite3.exe to your path, you’ll need to enter the full path above.

    Once you launch SQLite, you’ll get the SQLite prompt:

    SQLite version 3.7.16 2013-03-18 11:39:23
    Enter ".help" for instructions
    Enter SQL statements terminated with a ";"
    sqlite>

    Step 7: Attach Firefox’s bookmarks database

    Now that we’ve opened Chrome’s database, we have easy access to its data. We also need to attach Firefox’s database so we can read from it. This is where copying the path to a text file helps. SQLite only understands paths when using the / character, even in Windows. So replace all \ with / and then copy the path to the clipboard.

    Run this command in SQLite, replacing the path with yours.

    ATTACH 'C:/Users/sam/AppData/Roaming/Mozilla/Firefox/Profiles/x0wk0n99.default/places.sqlite' AS f;
    

    Don’t miss the part at the end, AS f; is important, including the semicolon.

    Step 8: Import

    Now we’re ready to run the import. The SQL code used here will be the same for everybody.

    INSERT INTO
    	keywords (
    		short_name,
    		keyword,
    		favicon_url,
    		url,
    		safe_for_autoreplace,
    		originating_url,
    		date_created,
    		usage_count,
    		input_encodings,
    		show_in_default_list,
    		suggest_url )
    SELECT
    	b.title, 	-- short_name
    	k.keyword, 	-- keyword
    	substr(p.url, 1, instr(substr(p.url,8),'/') + 7) || 'favicon.ico',
    				-- favicon_url
    	replace(p.url, '%s', '{searchTerms}'),
    				-- url
    	1, 			-- safe_for_autoreplace
    	null, 		-- originating_url
    	0, 			-- date_created
    	0,	 		-- usage_count
    	'UTF-8', 	-- input_encodings
    	1, 			-- show_in_default_list
    	null		-- suggest_url
    
    FROM
    	f.moz_keywords k
    			INNER JOIN
    	f.moz_bookmarks b
    			ON
    		k.id = b.keyword_id
    			INNER JOIN
    	f.moz_places p
    			ON
    		b.fk = p.id
    ORDER BY
    	k.keyword;
    

    Done

    Now close the SQLite command line window and you’re done. Open Chrome and notice all the imported search engines.

    Read More
  4. Thread Synchronization With Aspects

    Aspect-oriented programming is an interesting way to decouple common method level logic into localized methods that can be applied on build. For C#, PostSharp is a great tool that does the heavy lifting of the MSIL rewrites to inject itself in and around your methods based on method tagging with attributes. PostSharp’s offerings are split up into free aspects and pro aspects so it makes diving into aspect-oriented programming easy since you can get a lot done with their free offerings.

    One of their free aspects, the method interception aspect, lets you control how a method gets invoked. Using this capability, my general idea was to expose some sort of lock and wrap the method invocation automatically in lock statement using a shared object. This way, we can manage thread synchronization using aspects.

    Managing thread synchronization with aspects isn’t a new idea: the PostSharp site already has an example of thread synchronization. However, they are using a pro feature aspect that allows them to auto-implement a new interface for tagged classes. For the purposes of my example, we can do the same thing without using the pro feature and simultaneously add a little extra functionality.

    There are two things I wanted to accomplish. One was to simplify local method locking (basically what the PostSharp example solves), and the second was to facilitate locking of objects across multiple files and namespace boundaries. You can imagine a situation where you have two or more singletons who work on a shared resource. These objects need some sort of shared lock reference to synchronize on, which means you need to expose the synchronized object between all the classes. Not only does this tie classes together, but it can also get messy and error-prone as your application grows.

    First, I’ve defined an interface that exposes a basic lock. Implementing the interface is optional as you’ll see later.

    public interface IAspectLock
    {
        object Lock { get; }
    }
    

    Next we have the actual aspect we’ll be tagging methods with.

    [Serializable]
    public class Synchronize : MethodInterceptionAspect
    {
        private static readonly object FlyweightLock = new object();
    
        private static readonly Dictionary<string, object> LocksByName = new Dictionary<string, object>();
    
        private String LockName { get; set; }
    
        /// <summary>
        /// Constructor when using a shared lock by name
        /// </summary>
        /// <param name="lockName"></param>
        public Synchronize(String lockName)
        {
            LockName = lockName;
        }
    
        /// <summary>
        /// Constructor for when an object implements IAspectLock
        /// </summary>
        public Synchronize()
        {
    
        }
    
        public override void OnInvoke(MethodInterceptionArgs args)
        {
            object locker;
    
            if (String.IsNullOrEmpty(LockName))
            {
                var aspectLockObject = args.Instance as IAspectLock;
    
                if (aspectLockObject != null)
                {
                    locker = aspectLockObject.Lock;
                }
                else
                {
                    throw new Exception(String.Format("Method {0} didn't define a lock name nor implement IAspectLock", args.Method.Name));
                }
            }
            else
            {
                lock (FlyweightLock)
                {
                    if (!LocksByName.TryGetValue(LockName, out locker))
                    {
                        locker = new object();
                        LocksByName[LockName] = locker;
                    }
                }
            }
    
            lock (locker)
            {
                args.Proceed();
            }
        }
    }
    

    The attribute can either take a string representing the name of the global lock we want to use, or, if none is provided, we can test to see if the instance implements our special interface and use its lock. When an object implements IAspectLock the code path is simple: get the lock from the object and use it on the method.

    The second code path, when you use global lock name, lets you lock across the entire application without having to tie classes together, keeping things clean and decoupled.

    For the scenario where a global lock name was defined, I used a static dictionary to keep track of the locks and corresponding reference objects to lock on based on name. This way I can maximize throughput by using a flyweight container: lock first on the dictionary just to get the lock I want, then lock on the value retrieved. The locking of the dictionary will always be fast and shouldn’t be contended for that often. Uncontested locks are tested for using spinlock semantics so they are usually extremely quick. Once you have the actual lock you want to use for this function, you can call args.Proceed() which will actually invoke the tagged method.

    Just to be sure this all works, I wrote a unit test to make sure the attribute worked as expected. The test spawns 10,000 threads which will each loop 100,000 times and increment the _syncTest integer. The idea is to introduce a race condition. Given enough threads and enough work, some of those threads won’t get the updated value of the integer and won’t actually increment it. For example, at some point both threads may think _syncTest is 134, and both will increment to 135. If it was synchronized, the value, after two increments, should be 136. Since race conditions are timing-dependent we want to make the unit test stressful to try and maximize the probability that this would happen. Theoretically, we could run this test and never get the race condition we’re expecting, since that’s by definition a race condition (non-deterministic results). However, on my machine, I was able to consistently reproduce the expected failure conditions.

    private int _syncTest = 0;
    private const int ThreadCount = 10000;
    private const int IterationCount = 100000;
    
    [Test]
    public void TestSynchro()
    {
        var threads = new List<Thread>();
        for (int i = 0; i < ThreadCount; i++)
        {
            threads.Add(ThreadUtil.Start("SyncTester" + i, SynchroMethod));
        }
    
        threads.ForEach(t=>t.Join());
    
        Assert.True(_syncTest == ThreadCount * IterationCount,
                String.Format("Expected synchronized value to be {0} but was {1}", ThreadCount * IterationCount, _syncTest));
    }
    
    [GlobalSynchronize("SynchroMethodTest")]
    private void SynchroMethod()
    {
        for (int i = 0; i < IterationCount; i++)
        {
            _syncTest++;
        }
    }
    

    When the method doesn’t have the attribute we get an NUnit failure like

      Expected synchornized value to be 1000000000 but was 630198141
      Expected: True
      But was:  False
    
       at NUnit.Framework.Assert.That(Object actual, IResolveConstraint expression, String message, Object[] args)
       at NUnit.Framework.Assert.True(Boolean condition, String message)
       at AspectTests.TestSynchro() in AspectTests.cs: line 35
    

    Showing the race condition that we expected did happen (the value will change each time). When we have the method synchronized, our test passes.

    Read More
  5. IxD 2013: Rhythm, Flow, and Style

    Today Carlo and I listened to a 45 minute presentation by Peter Stahl called “Rhythm, Flow and Style” which discussed designing flow and rhythm into applications. Peter started with the observation that the world is full of rhythms, but not just what people think of (namely music). Rhythms exist in every part of life, chopping vegetables, walking outside, involving in conversations, filling out forms, navigating a website, etc. All of them involve actions, pauses, and repetitions. According to Stahl there are two types of rhythm: iterative rhythm, which is rhythm of the user (engaging with the application) and there is also motivic rhythm, which is rhythm within the application. But to get rhythm you need flow.

    Flow, according to Stahl’s presentation has several dimensions:

    • Known goals, with known progress
    • Perceived balance of challenge and skill
    • Sense of control
    • Focused concentration
    • Loss of self conciouslyness: becoming one with the activity
    • Time distortion
    • Self rewarding experience

    and it’s possible to induce it by offering clear goals, achievements, progressive challenges, progress tracking, and obvious next steps. The ultimate intent is to keep the user engaged and challenged, where they can easily lose themselves in the application. You can see these kinds of theories being applied to sites like StackOverflow, LinkedIn, or OKCupid. They have percentages indicating progress, helpful hints on how to get to the next step, progressively more advanced information, etc. Flow engages the user and lets the application open up, giving it a sense of movement over time.

    Flow, however, isn’t always just about content and progress. It’s also a matter of visual effects and transitions. Transition effects can influence the perception of an application: fast and jarring can give a sense of precision and automation, whereas slow, more gentle fades, induces a sense of calmness.

    On top of rhythm and flow is style, which is used to direct the flow and rhythm. Stahl called it rhythmic style and it is affected by visual frequency, speed, size and distance, and special effects. There are lots of different kinds of rhythmic style choices:

    • Dazzling or engaging?
    • Zippy or comfortable?
    • Dramatic or responsive?
    • Single-use or long-term?

    But the choice of style within visual components depends on content branding and audience.

    Stahl compared Samsung vs Vizio (though both have changed their sites since the slides shown in presentation). Samsung, at the time, used gentle fade transitions, whereas Vizio used a paper swipe effect. He argued that Samsung’s felt more human but Vizio’s was reminiscent of a copy machine: inhuman but precise. Both have advantages and with small motivic rhythm choices you can induce different emotions in the user.

    Of course though, flow rests with the user, so Stahl stressed to make sure to include the user in user interaction testing. Stahl showed a four way split image showing two images of the user using the application, as well as the screen being viewed (the last screen I think was a description of the test):

    stahlPresentation

    Finally, Stahl ended with a resonating quote that I think captures the design goals of any application:

    It’s all about the choregraphy of people’s attention. Attention is like water. It flows. It’s liquid. You create channels to divert it, and you hope that it flows the right way”.

    I really enjoyed his presentation. Here are Stahls slides https://www.box.com/s/ms8ssh4nc3zl6mx0n38w.

    Read More
  6. IxD 2013 – Production ready CSS workshop

    Carlo and I are in Toronto for Ixd 2013 for the week hoping to pick up some interesting info on interaction and design. We were painfully reminded of the need for continual design improvement within the first hour of stepping into Canada. We rented a Hyundai Veloster and for 30 minutes couldn’t figure out how to start the car. For the uninitiated, you have to hold the brake pedal down while pushing the start button.

    velocsterStarter

    Today the conference started and we attended a workshop called “Sitting in the drivers seat: Designing production level CSS”. Carlo’s been doing html and css for a long time and was interested to hear what the speaker would say, whereas I am less experienced and wanted to see if I could pick up any design tips. We were pleasantly surprised when Jack Moffett started and stressed that we should be focusing on blurring the lines between designer and developer. He quoted Jared Spool who said that designers should stop asking “do I need to code” but instead should ask “will learning code help me be a better designer?“. Moffett stressed using developer oriented tools such as version control–svn or git–wikis to track requirements, diff tools, bug tracking software, some sort of IDE for live editing of html and css, and of course the standard Firebug, Chrome, and even Internet Explorer developer toolsets. Using production tools that developers also use integrates the application design and development processes. With wiki’s and version control you get historical visibility about changes and decisions. This helps designers know when developers want changes, and helps developers know when designers want changes. Conversations and issues can be tracked in bug trackers like JIRA or FogBugz and gives everyone a sense of direction.

    Moffett said that a benefit to following developer practices and learning at least some basic html and css structures is that designers can design with a practical implementation in mind. Understanding the limitations of html and css also helps the designer avoid implementation pitfalls and minimizes the amount of design, implement, tweak rounds during application development. Just like an architect should know the structure and the needs of the building crew, if the design is impracticable it doesn’t matter how pretty it is.

    After the tool chain introduction Moffett led everyone through several examples of creating reusable CSS stressing seperating content from skin. The content should be able to stay independent, while skin elements (such as font, padding, etc) can be toggled with css. The example that I think the workshop most enjoyed was taking this:

    amazonRow.

    and refactoring it to this:

    amazonCol.

    By removing inline css and a few minor content structure tweaks. Now we can toggle between row and column layout using the same html, just with different css. I was a little surprised that Amazon’s site was used for the example – I expected at first that Moffett would show it as an example of what to do, not what to NOT do.

    Moffett also worked through an example leveraging CSS inheritance to control visible state. By doing it this way you can avoid direct css style manipulations in javascript and use css classes to define visual behaviors. For example, assume we want to control visiblity of a div that contains some photos lets set up the html and css like so.

    Photos aren’t visible:

    <div>
      <div class="photos">
        // some photos
      </div>
    </div>
    

    Photos are visible:

    <div class="showPhotos">
      <div class="photos">
        // some photos
      </div>
    </div>
    

    The css for this example would look like this:

    .photos {
       display:none;
    }
    
    .showPhotos .photos{
       display:block;
    }
    

    By default the photos div isn’t visible, however if we added the showPhotos class to the parent, the parents display property will override the photos display setting it to block. By removing the showPhotos class we can hide the element since the photos display element goes back to the default. This way you can control visibility and state by toggling items at the parent level instead of the child level. Carlo also brought up in the seminar that by doing it this way you are setting it up for using css transitions and animations since you can now control state information just with css.

    While it may sound trivial to UI developers or designers, it’s always good to ground yourself in how to make something simple and reusable. Even the best developers need reminders.

    Read More
  7. K-Means Step by Step in F#

    Recently we had a discussion on clustering techniques at one of our weekly tech talks and the k-means clustering algorithm came up. K-means is considered one of the simplest unsupervised machine learning techniques and I thought it would be cool to try my hand at an F# implementation of it.

    In short, k-means is a way to put data into groups based on distance between nearest neighbors. Technically it works with any dimension but for this post I’ll stick with 1-d data to make things easy.

    K-means background

    Imagine you have some 1-dimensional data like

    1, 2, 5, 14, 17, 19, 20
    

    And you want to group this into two groups. Pretty easily you can see that 1, 2, and 5 go into one group, and 14, 17, 19, 20 should go in another group. But why? For this small data set we intuitively figured out that 1, 2, and 5 are close together, but at the same time far away from 14, 17, 19, and 20, which are also close together. Really what we did was we calculated the centroid of each group and assigned the values to the centroid. A centroid is a fancy way of saying center point of a group. It’s calculated by taking the average of a group.

    For example, if we have the group

    1, 2, 5
    

    What is its centroid? The answer is

    (1 + 2 + 5) / 3 = 2.666
    

    So with k-means you end up with k centroids and a bunch of data grouped to that centroid. It’s grouped to that centroid because that data is closest to that centroid vs any other centroid. To calculate the centroids and grouping k-means uses an iterative process:

    • Pick k random points. So if you are doing a k=2 clustering, just pick 2 random points from your data set. These are going to be our initial centroids. It doesn’t matter if it’s right, or even if the points are the same. The algorithm is going to fine tune things for us. Technically this is called the Forgy Method of initialization. There are a bunch of other ways to initialize your centroids, but Forgy seems to be pretty common.
    • For every point, calculate its distance from the centroid. So lets say we picked points 1 and 2 as our centroids. The distances then look like this:
           Δ1      Δ2
      1    0        1
      2    1        0
      5    4        3
      14   13       12
      17   16       15
      19   18       17
      20   19       18
      
    • Now what we do is assign each data point to its nearest centroid. So, obviously, point 1 is closest to the initial arbitrary point 1 centroid. And you can see that basically everything else is closer to the arbitrary point 2 centroid. So we’re left with a grouping like this now:
      data around centroid 1: 1
      data around centroid 2: 2, 5, 14, 17, 19, 20
      
    • Now, the next step is to calculate new centroids based on these groups. For this example, we can take the average of all the points together to find the new center. So the new centroids become 1 and 12.83. Now we go back to step 2 using the new centroids. If you keep track of the current centroid and the previous centroid, then you can stop the k-means calculation when centroids don’t change between iterations. At this point everything has converged and you’ve got your k clusters

    Unfortunately k-means doesn’t always converge, so you can add a convergence delta (i.e. if the centroid between the current and previous iterations hasn’t really changed much, so it’s within some acceptable range) or an iteration limit (only try to cluster 100 times) so you can set bounds on the clustering.

    Implementation

    The first thing to do for my implementation is to define a structure representing a single data point. I created a DataPoint class that takes a float list which represents the data points dimensions. So, a 1-dimesional data point would have a float list of one element. A 2-d data point has one of two points (x and y), etc. To give credit where credit is due, I took the dimensional array representation from another F# implementation.

    type DataPoint(input:float list) =
        member this.Data = input
        member this.Dimensions = List.length input
        override x.Equals(yobj) =
            match yobj with
            | :? DataPoint as y -> (x.Data = y.Data)
            | _ -> false
        static member Distance (d1:DataPoint) (d2:DataPoint) =
                                        if List.length d1.Data <> List.length d2.Data then
                                            0.0
                                        else
                                            let sums = List.fold2(fun acc d1Item d2Item ->
                                                                         acc + (d2Item - d1Item)**2.0
                                                                  ) 0.0 d1.Data d2.Data
                                            sqrt(sums)
    

    It has its data, the dimensions, an equality overload (so I can compare data points by their data and not by reference), as well as a static helper method to calculate the euclidean distance between two n-dimensional points. Here I’m using the fold2 method which can fold two lists simultaneously.

    Alias data types

    For fun, I wanted to use F#’s data aliasing feature, so I created some helper types to make dealing with tuples and other data pairing easier. I’ve defined the following

    type Centroid = DataPoint
    
    type Cluster = Centroid * DataPoint List
    
    type Distance = float
    
    type DistanceAroundCentroid = DataPoint * Centroid * Distance
    
    type Clusters = Cluster List
    
    type ClustersAndOldCentroids = Clusters * Centroid list
    

    The only unfortunate thing is you need to explicitly mention aliased types for the type inference to work properly. The reason is because F# will prefer native types to typed aliases (why choose Distance over float unless you are told to), but it would’ve been nice if it had some way to prefer local type aliases in a module.

    Centroid assignment

    This next major step is to take the current raw data list and what the current centroids are and cluster them. A cluster is a centroid and its associated data points. To make things clearer, I created a helper function distFromCentroid which takes a point, a centroid, and returns a tuple of point * centroid * distance.

    let private distFromCentroid pt centroid : DistanceAroundCentroid = (pt, centroid, (DataPoint.Distance centroid pt))
    

    The bulk of the k-means work is in assignCentroids. First, for every data point, we calculate its distance from the current centroid. Then we select the centroid which is closest to our data points and create a list of point * centroid tuples. Once we have a list of point * centroid tuples, we want to group this list by the second value in the tuple: the centroid. Here, I’m leveraging the pipe operator to do the grouping and mapping. First, we group the lists by the centroid. This gives us a list of centroid * (datapoint * centroid) items. But this isn’t quite right yet. We want to end up with a list of centroid * dataPoint list so I passed this temporary list through some other minor filtering and formatting.

    (*
        Takes the input list and the current group of centroids.
        Calculates the distance of each point from each centroid
        then assigns the data point to the centroid that is closest.
    *)
    let private assignCentroids (rawDataList:DataPoint list) (currentCentroids:Clusters) =
            let dataPointsWithCentroid =
                rawDataList
                    |> List.map(fun pt ->
                                        let currentPointByEachCentroid = List.map(fun (centroid, _) -> distFromCentroid pt centroid) currentCentroids
                                        let (_, nearestCentroid, _) = List.minBy(fun (_, _, distance) -> distance) currentPointByEachCentroid
                                        (pt, nearestCentroid)
                                )
    
            (*
                |> Group all the data points by their centroid
                |> Select centroid * dataPointList
                |> For each previous centroid, calculate a new centroid based on the aggregated list
            *)
    
            List.toSeq dataPointsWithCentroid
                |> Seq.groupBy (fun (_, centr) -> centr)
                |> Seq.map(fun (centr, dataPointsWithCentroid) ->
                                    let dataPointList = Seq.toList (Seq.map(fun (pt, _) -> pt) dataPointsWithCentroid)
                                    (centr, dataPointList)
                          )
                |> Seq.map(fun (cent, dataPointList) ->
                                    let newCent = calculateCentroidForPts dataPointList
                                    (newCent, dataPointList)
                          )
                |> Seq.toList
    
    

    Calculating new centroids

    Since we’ve grouped data points by centroid, we now need to calculate what the new centroid for that list is. This is where I employ a disclaimer. For my implementation of n-dimensional centroids I averaged each dimension together to get a new value. I’m not totally sure this is actually valid for arbitrary geometrical shapes (given by n-dimensions), but if it’s not at least this is the only method that needs to be updated.

    First I create an empty n-dimension array of all zeros, initialized by the dimensions of the first point in the list (all the dimensions should be the same so it doesn’t matter which point we choose). This is the seed for list for the next fold. I’m not actually going to fold the value into a single element; I’m going to fold all indexes of the data point list together into a new array by adding each dimension together.

    For example, if I have two points of two dimensions [1, 2] and [3, 4] I add 1 and 3 together to make the first dimension, then 2 and 4 together to make the second dimension. This gives me an intermediate vector of [4,6]. Then I can average each dimension by the number of data points used, so in our example we used two data points, so the new centroid would be [2, 3]. Since a centroid is the same as a data point (even though one that doesn’t exist in our original raw set), we can return the new centroid as a data point. Having everything treated as data points makes the calculations easier.

    (*
        take each float list representing an n-dimesional space for every data point
        and average each dimension together.  I.e. element 1 for each point gets averaged together
        and element 2 gets averaged together.  This new float list is the new nDimesional centroid
    *)
    
    let private calculateCentroidForPts (dataPointList:DataPoint List) =
        let firstElem = List.head dataPointList
        let nDimesionalEmptyList = List.init firstElem.Dimensions (fun i-> 0.0)
        let addedDimensions = List.fold(fun acc (dataPoint:DataPoint) -> addListElements acc dataPoint.Data (+))
                                    nDimesionalEmptyList
                                    dataPointList
    
        // scale the nDimensional sums by the size
        let newCentroid = List.map(fun pt -> pt/((float)(List.length dataPointList))) addedDimensions
        new DataPoint(newCentroid)
    

    Here is the function to aggregate list elements together

    (*  goes through two lists and adds each element together
        i.e. index 1 with index 1, index 2 with index2
        and returns a new list of the items aggregated
    *)
    
    let private addListElements list1 list2 operator =
        let rec addListElements' list1 list2 accumulator operator =
            if List.length list1 = 0 then
                accumulator
            else
                let head1 = List.head list1
                let head2 = List.head list2
                addListElements' (List.tail list1) (List.tail list2) ((operator head1 head2)::accumulator) operator
        addListElements' list1 list2 [] operator
    

    I’m using a hidden accumulator so that the function will be tail recursive.

    Cluster runner

    The above code only does one single cluster iteration though . What we need is a clustering runner that does the clustering iterations. This runner will be responsible for determining when to stop the clustering. There are four functions here:

    • getClusters and getOldCentroids are helper methods to pull data out of the tuples (fst and snd are built in functions)
    • cluster is a basic method that will cluster as many times as necessary until the centroids stop changing
    • clusterWithIterationLimit is the bulk method that takes in the raw data, the clustering amount (k), the max number of clustering iterations, and a convergence delta to avoid infinite loops

    For each iteration create a new set of clusters, then test to see if if the centroids are the same. If they are, we’ve converged and we can end. If not, see if the clusters are “close enough” (the convergence delta). The final base case is if we’ve clustered the max iterations then return.

    (*
        Start with the input source and the clustering amount
        but also take in a max iteration limit as well as a converge
        delta.
    
        continue to cluster until the centroids stop changing or
        the iteration limit is reached OR the converge delta is achieved
    
        Returns a new "Clusters" type representing the centroid
        and all the data points associated with it
    *)
    
    let private getClusters (cluster:ClustersAndOldCentroids) = fst cluster
    let private getOldCentroids (cluster:ClustersAndOldCentroids) = snd cluster
    
    let clusterWithIterationLimit data k limit delta =
        let initialClusters:Clusters = initialCentroids (List.toArray data) k
                                        |> Seq.toList
                                        |> List.map (fun i -> (i,[]))
    
        let rec cluster' (clustersWithOldCentroids:ClustersAndOldCentroids) count =
            if count = 0 then
                getClusters clustersWithOldCentroids
            else
                let newClusters = assignCentroids data (getClusters clustersWithOldCentroids)
                let previousCentroids = getOldCentroids clustersWithOldCentroids
                let newCentroids = extractCentroidsFromClusters newClusters
    
                // clusters didnt change
                if listsEqual newCentroids previousCentroids then
                    newClusters
                else if centroidsDeltaMinimzed previousCentroids newCentroids delta then
                    newClusters
                else
                    cluster' (newClusters, newCentroids) (count - 1)
    
        cluster' (initialClusters, extractCentroidsFromClusters initialClusters) limit
    
    (*
        Start with the input source and the clustering amount.
        continue to cluster until the centroids stop changing
        Returns a new "Clusters" type representing the centroid
        and all the data points associated with it
    *)
    
    let cluster data k = clusterWithIterationLimit data k Int32.MaxValue 0.0
    

    Demo

    Using the sample data from the beginning of this post, let’s run it through my k-means clusterer:

    module KDataTest
    
    open System
    open KMeans
    
    let kClusterValue = 2
    
    let sampleData : KMeans.DataPoint list = [new DataPoint([1.0]);
                                              new DataPoint([2.0]);
                                              new DataPoint([5.0]);
                                              new DataPoint([14.0]);
                                              new DataPoint([17.0]);
                                              new DataPoint([19.0]);
                                              new DataPoint([20.0])]
    
    KMeans.cluster sampleData kClusterValue
        |> Seq.iter(KMeans.displayClusterInfo)
    
    Console.ReadKey() |> ignore
    
    

    And the output is

    Centroid [2.66666666666667], with data points:
    [1], [2], [5]
    
    Centroid [17.5], with data points:
    [14], [17], [19], [20]
    

    Full source available at our github

    Read More
  8. Tracing computation expressions

    F# has a novel syntax feature called computation expressions, which lets you build complex monadic expressions with minimal syntax. Commonly shied away from, a monad is simple: it’s a function whose input is some state. A monad usually manipulates the state and returns a new state (which can be handed off to another monad).

    Monad’s are maybe best known from Haskell, but they exist in scheme, ML, clojure, scala, and even show up in C# and other imperative languages.

    While the computation expression syntax is cool, short of the maybe monad, and the baked-in async keyword, I wasn’t sure what I could do with this. Thankfully, I found an interesting post by Luis Diego Fallas who posted an F# code sample leveraging computation expressions to read binary formatted files. However, if you are like me, and trying to better understand the power of computation expressions, tracing through these samples can be difficult. It’s not because they are poorly written, but mostly because they are jumping into more complex usages.

    To clarify what’s going on with computation expressions, I wanted to show what is passing through the monad. Computation expressions (also known as monads or workflows) can be tricky because you are overriding the language syntax. So what is a “let!” statement in one workflow is not the same as in another workflow. On top of that statements themselves can return a value, not just have a left hand side assignment. If you feel your brain start to hurt, that’s OK. It will get better.

    An example

    Let’s start with a small sample to see how information passes through the workflow. This makes it easier to understand what computation expressions are doing:

    open System
    
    type State =
        | Current of int * State
        | Terminated
    
    type Builder() =
        member this.Bind(value:int, rest:int->State) =
            State.Current((value, rest(value)))
    
        member this.Return(returnValue:int) = State.Current(returnValue, State.Terminated)
    
    let builder = new Builder()
    
    let build _ = builder{
                            let! x = 1
                            let! y = 2
                            return 3
                    }
    
    let stateChain = build()
    
    let rec formatState (chain:State) =
                match chain with
                    | State.Terminated -> Console.WriteLine()
                    | State.Current(i, next) -> Console.WriteLine("State: {0}", i)
                                                formatState next
    
    formatState stateChain
    
    Console.ReadKey() |> ignore
    

    Which prints out

    State: 1
    State: 2
    State: 3
    

    Desugaring

    Before we trace through what’s happening, lets desugar the expression. If you aren’t familiar with the bang syntax, a let! statement will compile into an execution on the builders Bind function and a return will compile into an execution on the builders Return function. Let’s take the original expression:

    let build _ = builder{
                            let! x = 1
                            let! y = 2
                            return 3
                    }
    

    And show the same thing but without the syntactic sugar.

    let desugared =
        builder.Bind(1, fun next ->
                            let x = next // next = 1 since we took the input value of the bind,
                                         // and passed it to the next lambda
    
                            let returnedState =
                                builder.Bind(2, fun next2 ->
                                                    let y = next2 // next2 is the value 2 here
    
                                                    let terminatedState = builder.Return(3)
    
                                                    //terminated state is now
                                                    //State.Current(3, State.Terminated)
    
                                                    terminatedState
                                                )
    
                            // the state here is
                            // State.Current(2, State.Current(3, State.Terminated))
                            returnedState
                        )
    

    The important thing to understand here is how the let! statement is deconstructed. The right hand side is the value input to the bind. Everything below the let! statement (and including the left hand side of the assignment), is inside of the bind lambda (passed to the rest argument of the bind function). The first thing the lambda does is actually apply the left hand side assignment with the input from the bind. In this case, the first let! statement assigns x = 1. Then it executes the remaining function, being the other let! and return statements.

    Tracing it through

    In the original sample, I’ve defined a discriminated union called State representing the current state. This union has two types. One, called Current, contains an integer as well as a link to the next state in the form of a tuple. The other, Terminated, is a value that we can use as a terminator for our state link. The example is really just for demonstration, since by being able to capture the state it’s easier to understand how computation expressions work; it gives us a sense of where the monad has been.

    Let’s take this one step at a time, with an even simpler example based on the above code. It’s important to understand the deconstruction. The compiler will translate our computation expressions into invocations on the builder.

    builderDeconstruction1

    To desguar it we take the left hand side and everything after

    builderDeconstruction2

    And move it to a lambda. This lambda is what is going to be passed as the rest argument to the builder’s bind function

    builderDeconstruction3

    The right hand side is going to be applied to the value argument of the bind function

    builderDeconstruction4

    Go back and look at how we’ve defined the bind function:

    member this.Bind(value:int, rest:int->State) =
            State.Current((value, rest(value)))
    

    Here the value argument is 1. The second argument, rest, is the lambda. The lambda is going to have to return a State union since State.Current expects an integer, State tuple.

    When we execute the lambda inside the bind we pass the value (1) to the lambda, but we’ve also captured the current value. This means that this bind is going to return:

    State.Current(1, rest(1))
    

    So here we apply the value to the function

    builderDeconstruction5

    This is where the left hand side statement (x) now gets assigned.

    builderDeconstruction6

    Now what about

    builder.Return(2)
    

    Remember we defined the return function to return

    member this.Return(returnValue:int) = State.Current(returnValue, State.Terminated)
    

    So with our simplified example this will return

    State.Current(2, State.Terminated)
    

    The previous lambda now returns that same value, since that’s the last line of the statement. So we’re back now to the original bind function:

    member this.Bind(value:int, rest:int->State) =
            State.Current((1, rest(1)))
    

    But rest(1) returns State.Current(2, State.Terminated). Our final builders return value, in this example, is

    State.Current(1, State.Current(2, State.Terminated))
    

    All the computation builder syntax is doing is just sugaring our statements up to give us these broken up functions.

    Back to the original sample. We added a second let! statement in there:

    let build _ = builder{
                            let! x = 1
                            let! y = 2
                            return 3
                    }
    

    Now, hopefully, you should be able to see how the final return from the computation expression is a state object representing what happened in the monad:

    State.Current(1, State.Current(2, State.Current(3, State.Terminated)))
    

    Combine and Yield

    Computation expressions have more than just let! and return statements though. Once you get used to tracing through and thinking about the computation builder, it becomes easier to start writing workflows. Just for kicks, I wanted to see if I could write a computation expression to evaluate a basic arithmetic expression. Here I’m using partial functions and the Combine property of the builder to build out the expression. If you wanted to, you can even use computation expressions within computation expressions. There’s nothing keeping you from doing that.

    In general, there is a bunch of reserved syntax that maps to specific builder functions. The msdn on computation syntax has these all defined.

    open System
    
    type BuildTest() =
        member this.Combine(currentStatement, value) = currentStatement(value)
        member this.Return(value) = value
        member this.Yield(item) = item
        member this.Delay(item) = item()
    
    let builder = new BuildTest()
    
    type math() =
        member this.add x y = x + y
        member this.mult x y = x * y
    
    let m = new math()
    
    let build _ = builder{
                    yield m.mult 2 // 2 * (1 + (2 + (3 + 0))
                    yield m.add 1  // 1 + (2 + (3 + 0)
                    yield m.add 2  // 2 + (3 + 0)
                    yield m.add 3  // (3 + 0)
                    return 0       // 0
                }
    
    let run = build()
    
    let monader = printf "%s" ("got " + run.ToString())
    
    Console.ReadKey() |> ignore
    

    This snippet evaluates to 12.

    You can even add precedence by evaluating a computation expression within the computation expressions

    builder{
        yield m.mult 2 // 2 * (1 + (8 + 0))
        yield m.add 1  // 1 + (8 + 0)
    
        let parenth = builder{
                            yield m.mult 4 // 4 * 2
                            return 2  // 2
                        }
    
        yield m.add parenth // 8 + 0
    
        return 0       // 0
    }
    

    Which evaluates to 18.

    Tracing Combine and Yield

    Just like before, there’s a bunch of magic going on here, so it’s easier if you follow along with the desugared version of the original arithmetic expression below.

    let desugared = builder.Delay(
                    fun () -> builder.Combine(builder.Yield(m.mult 2),
                        builder.Delay(
                            fun() -> builder.Combine(builder.Yield(m.add 1),
                                    builder.Delay(
                                        fun() -> builder.Combine(builder.Yield(m.add 2),
                                                builder.Delay(
                                                    fun() -> builder.Combine(builder.Yield(m.add 3),
                                                        builder.Delay(
                                                            fun() -> builder.Return(0))))))))))
    

    Each monadic function is wrapped in a Delay, which promptly executes it. Look at the builder’s delay declaration – it takes a function and executes it.

    In our builder, the Yield just returns the same value. It doesn’t do much but we needed to implement it to use the computation expression syntax.

    What we pass to the delay is an anonymous function that has a combine statement. Combines take two things and produce a third. Here, we are passing the current partial function as the first argument (via the yield), and the value we want to use to evaluate this partial function as the second argument. However, the second argument isn’t actually evaluated till the end. The combine will then apply the second argument (an integer) to the first argument (a partial function that takes an integer).

    For the basic arithmetic example, the final delay function returns 0. You can think of this as a “seed.” If you think of it like a fold operation, this is very similar. When we finally return the seed, we bubble each evaluated expression back up the stack (starting with 0), so read the desugared version from the bottom up. In this way, we are executing the current curried statement with the previous statements evaluated value in the Combine method of the builder. Not the most practical application, but I thought it was a fun exercise.

    If you are confused why this example’s desguaring contains the Delay method and the original example didn’t, it’s because the sugaring happens differently depending which builder constructs you use.

    Under the hood

    When you decompile a computation expression, each monadic function gets compiled into it’s own class representing a monad. In our arithmetic operation example, this is a Combine, Yield, Delay trio. It’s not easy to read since the function names have been mangled, but you can see the general pattern here (formatting added).

    [Serializable]
    internal class run\u004089 : FSharpFunc<Unit, int>
    {
        internal run\u004089()
        {
        }
    
        public override int Invoke(Unit unitVar0)
        {
            return ExpresionsTest.builder.Combine<int, int>(
            ExpresionsTest.builder.Yield<FSharpFunc<int, int>>(
                (FSharpFunc<int, int>) new ExpresionsTest.run\u004089\u002D1(2, ExpresionsTest.m)),
                    ExpresionsTest.builder.Delay<int>(
                        (FSharpFunc<Unit, int>) new ExpresionsTest.run\u004091\u002D2()));
        }
    }
    

    This decompliation represents the following sub-block.

    builder.Combine(
        builder.Yield(m.add 2), builder.Delay( (*next function*) )
    )
    

    Notice in the decompiled block that the class name is run\u004089 and the executable expression is compiled into an Invoke that returns an int. The decompiled assembly will actually be littered with these classes with mangled names, following a naming format of the target variable name (run) and an identifier (\u004089). You can always decompile the computation expression to get a sense for how it’s been desugared.

    Conclusion

    I said in the beginning that monads are simple, but I’ll admit that I lied. Monads are tricky, there’s no denying that. Maybe that’s why there is no shortage of blog posts trying to explain the monad over and over again. But, in the end, once you wrap your head around it, I think computation expression syntax is a cool way of using the concept of a monad by decoupling what something is defined to do, vs how it’s actually executed.

    I highly suggest running the examples and actually stepping through them bit by bit if you are having trouble following what is happening. Being able to see a desugared version of the code and using a debugger to inspect locals while stepping through examples makes it a lot clearer to see whats happening.

    More reading

    If you are curious here are some links to further reading explaining monads and computation expressions (such as the F# wikibook). Don Syme also has a few posts explaining things really well that are definitely worth checking out.

    Read More
  9. Debug ‘Could not load file or assembly or one of its dependencies’ error with Process Monitor

    Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly ‘NativeLibWrapper.dll’ or one of its dependencies. The specified module could not be found.  at MainApp.Program.Main(String[] args)

    The error above is possibly one of the most searched .Net exceptions on Google. “Could not load file or assembly” exceptions occur when Windows is not able to load a dll needed by the calling code. There are numerous reasons why the loader can fail to load the required dlls, such as a missing dependency, calling code that does not have permission to read the needed dll, etc.

    Dependency Walker and Fusion Log Viewer are common tools used to troubleshoot missing dependency problems. Dependency Walker statically resolves all the dlls needed by a native PE file and flags missing dependencies, while Fusion Log Viewer catches assembly binding problems in managed code during runtime. Both are great tools, and do their respective jobs well, but they fall short with applications that dynamically load native dlls or contain both native and managed assemblies. In this blog post I’ll discuss how we can use Process Monitor to do exploratory debugging on “Could not load file or assembly” problems, as well as how this process can be automated.

    The Problem

    Process Monitor is an easy to use real-time event monitoring tool for Windows that shows you file system, registry, network, process and thread activity log. To demonstrate how we can use Process Monitor for troubleshooting, I created a managed application called MainApp.exe. MainApp references a managed wrapper library called NativeWrapper.dll, which references a native C++ dll called NativeLib.dll. (You can download the complete solution from github here). To simulate a missing dependency, let’s delete NativeLib.dll. Executing MainApp.exe now will result in the following error:

    >MainApp.exe

    Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly ‘NativeLibWrapper.dll’ or one of its dependencies. The specified module could not be found. at MainApp.Program.Main(String[] args)

    Notice that the exception didn’t mention the actual missing dependency, it only pointed out that one of the NativeLibWrapper.dll dependencies is missing, but which one? It’s easy to figure out the problem in this toy application due to it’s small scope. However, in real world applications where each assembly can have dozens of dependencies scattered around the system, this can become a nightmare. As I mentioned above, Dependency Walker does not work with managed assemblies, so it won’t be helpful.  NativeLib.dll is an unmanaged dll, so Fusion Log Viewer won’t detect it. Lets see how we can troubleshoot this problem using Process Monitor.

    Run Process Monitor and monitor MainApp.exe

    RunProcessMonitor

    • Run Process Monitor and start capturing events.
    • Add a Process Name filter for MainApp.exe to limit the displayed events to those related to MainApp only (This keeps the logs from becoming unmanageably large).
    • Run MainApp.exe after the capture begins
    • Once you get the FileNotFoundException, switch to Process Monitor and stop capturing events (File > Capture Events).

    Manual Log analysis using Process Monitor

    Now we can analyze the logs. The exception that we’re looking at is a file not found exception, which is almost certainly related to a file system event. So, we know we can filter out everything other than file system events. You can do this by turning the process, registry and network buttons off in the toolbar. The process monitor window should now look something like this:

    ProcMonLog

    Let’s trim the irrelevant events:

    1. We are trying to find problems related to a dll, so exclude all events that are not related to dll files.

    2. Since we are interested in finding out whether the file was successfully loaded or not, let’s exclude events where the result is not one of the following: NAME NOT FOUND, PATH NOT FOUND, ACCESS DENIED, FILE LOCKED WITH ONLY READERS and SUCCESS. Process Monitor does not allow you to add OR filters so you will have to add exclude filters for all the results not in the above list.

    3. CreateFile and CreateFileMapping operations are triggered when Windows tries to load a dll, so exclude all other operations. I had to exclude events where the operation was BUFFER OVERFLOW, FILE LOCKED WITH ONLY READERS and Query*. Depending upon your process, you might have to exclude a few other operations as well.

     The following snapshot displays all the filters I had to apply for the above to take effect.

    FilterEvents

    Now it’s time for all the hard work to pay off. Scroll down to the bottom of the log and try to find the file that the loader was not able to find (i.e. there is not an event with Result = SUCCESS for it). It should be fairly easy to spot, since most of the time it will be the entry with tons of contiguous NAME NOT FOUND results. For my demo app, visual inspection clearly reveals that the loader was not able to find NativeLib.dll.

    ManualInspection

    Automatic Log Analysis using Dependz

     Now, this explorative process is a lot of fun for the first few times, but it loses its charm pretty quickly. To make it less tedious I automated the analysis part of the process. Rather than manually filtering and inspecting the logs, I wrote a small tool called Dependz to analyze the log file and list all the unresolved dependencies. You can download Dependz’s source here and binary here.

     Dependz will need a Process Monitor log file (in xml format) along with the name of the application to analyze as inputs, as shown below:

    DependzUsage

    Let’s analyze the NativeLib.dll problem with Dependz this time. Go to Process Monitor and reset all the filters we just applied (Ctrl + L > Reset) then save the log in xml format.

    SaveLog

    Run Dependz.exe, with the required inputs, as shown below:

    DependzOutput

    Hooray! Dependz figured out that NativeLib.dll was missing and also listed all the paths probed by the loader while trying to find it. Besides NativeLib.dll, a few other dlls like rpcss.dll and mscorrc.dll were not found, but, I think these libraries are optional and didn’t cause this problem. I usually work my way from the bottom to the top of the log. I’ve found that the last logged dll is almost always the cause of the problem.

    Permissions related problems

    Dependz can also help you figure out dependency errors because of insufficient permissions. Assume that a user without read permissions to NativeLib.dll runs MainApp.exe, and MainApp.exe crashes with the following exception:

    >MainApp.exe

    Unhandled Exception: System.IO.FileLoadException: Could not load file or assembly
    at MainApp.Program.Main(String[] args)

    This exception is not helpful, as expected. Fire Process Monitor, capture logs and run Dependz with it.

    DependzAccessDenied

    This worked! Dependz figured out that MainApp.exe was not able to load NativeLib.dll because access was denied to the file and also revealed the actual path from where MainApp.exe was trying to load NativeLib.dll.

    Dependz is a work in progress. As I encounter more scenarios where I can automate Process Monitor log analysis, I’ll add more functionality. If you have any particular troubleshooting scenarios that you frequently repeat with Process Monitor, please leave a comment and I’ll try to help.

    Read More
  10. Let’s talk about functions

    We recently had an internal discussion on functions and our coding standards. Coding standards are important to have, it becomes tough to read code otherwise. Everyone has preferences and if no standards are set, it could result in a mishmash of coding styles as team members change or add code fitting their tastes. There were a lot of thoughts from each developer and I decided to discuss some of the points in this blog post.

    Let’s start with the length of a function.

    Short Functions

    A function should never be longer than a single screen. Any function that exceeds this length is probably doing way too much. Let’s go out on a limb and think of code as a story. Stories are comprised of chapters, which are made of paragraphs, which are made of sentences, which are made of words (see literate programming). When you read a story the paragraphs are nicely spaced apart, each paragraph conveys a central idea, and each sentence within that paragraph is related. When too many sentences are strung together in a paragraph, it becomes less likely that they are all tightly related. When writing, you then make them separate paragraphs. This makes it easy to follow.

    I think functions are the same way, they should be clearly separated and related statements grouped appropriately.

    For example, here’s a chunk of sample code from an open source project.

          private static void CheckEndPoint( int width, int height, IntPoint start, IntPoint end )
            {
                if ( end.X >= width )
                {
                    int newEndX = width - 1;
    
                    double c = (double) ( newEndX - start.X ) / ( end.X - start.X );
    
                    end.Y = (int) ( start.Y + c * ( end.Y - start.Y ) );
                    end.X = newEndX;
                }
    
                if ( end.Y >= height )
                {
                    int newEndY = height - 1;
    
                    double c = (double) ( newEndY - start.Y ) / ( end.Y - start.Y );
    
                    end.X = (int) ( start.X + c * ( end.X - start.X ) );
                    end.Y = newEndY;
                }
    
                if ( end.X < 0 )
                {
                    double c = (double) ( 0 - start.X ) / ( end.X - start.X );
    
                    end.Y = (int) ( start.Y + c * ( end.Y - start.Y ) );
                    end.X = 0;
                }
    
                if ( end.Y < 0 )
                {
                    double c = (double) ( 0 - start.Y ) / ( end.Y - start.Y );
    
                    end.X = (int) ( start.X + c * ( end.X - start.X ) );
                    end.Y = 0;
                }
            }
    

    Here’s a quick re-factoring of the code where each check is moved into separate functions.

            private static void FixEndPoint(int width, int height, IntPoint start, IntPoint end)
            {
                LimitMaxWidth(end, width, start);
                LimitMaxHeight(end, start, height);
                EnsurePostiveWidth(end, start);
                EnsurePostiveHeight(end, start);
            }
    
            private static void EnsurePostiveHeight(IntPoint end, IntPoint start)
            {
                if (end.Y < 0)
                {
                    double c = (double)(0 - start.Y) / (end.Y - start.Y);
    
                    end.X = (int)(start.X + c * (end.X - start.X));
                    end.Y = 0;
                }
            }
    
            private static void EnsurePostiveWidth(IntPoint end, IntPoint start)
            {
                if (end.X < 0) 
                {
                    double c = (double)(0 - start.X) / (end.X - start.X);                 
                    end.Y = (int)(start.Y + c * (end.Y - start.Y));
                    end.X = 0;             
                }         
            }         
            
            private static void LimitMaxHeight(IntPoint end, IntPoint start, int height) 
            {             
                if (end.Y >= height) {
                    int newEndY = height - 1;
    
                    double c = (double)(newEndY - start.Y) / (end.Y - start.Y);
    
                    end.X = (int)(start.X + c * (end.X - start.X));
                    end.Y = newEndY;
                }
            }
    
            private static void LimitMaxWidth(IntPoint end, int width, IntPoint start)
            {
                if (end.X >= width)
                {
                    int newEndX = width - 1;
    
                    double c = (double)(newEndX - start.X) / (end.X - start.X);
    
                    end.Y = (int)(start.Y + c * (end.Y - start.Y));
                    end.X = newEndX;
                }
            }
    

    In comparison, I think that the refactored CheckEndPoint() function is now a lot less jarring at first glance; each check is abstracted into its own function. Since each check is really doing something different, they are different concerns and should be separated.

    Avoid unnecessary side effects

    Ideally, a function should be short and do one thing. It’s a good idea to not have unnecessary side effects, especially if those side effects are not clearly stated by the function name. In object oriented programming, it’s hard to avoid side effects. However, unless the side effects are tied directly to the function, they should be separated. If that’s not possible, then refactor and name the function so that the side effect is clear.

    In the original code above, each check not only checks the boundaries, but also updates the end point if the boundaries are exceeded. However, the function name CheckEndPoint() does not imply that the function updates anything. When refactoring, you should try to be more explicit.

            private static void LimitMaxHeight(IntPoint end, IntPoint start)
            {
                if (end.Y < 0)
                {
                    double c = (double)(0 - start.Y) / (end.Y - start.Y);
    
                    end.X = (int)(start.X + c * (end.X - start.X));
                    end.Y = 0;
                }
            }
    

    In the refactored code the side effect is clearly stated in the function name. Additionally, it’s best to have succinct function names rather than have names that resemble a long sentence. It might also be a good idea to standardize naming conventions such as “Update”, “Ensure”, “Check”, or “Verify” so that team members recognize these keywords. If ten developers use ten different synonyms for “Update”, it can cause confusion as expectations (of encapsulated code) inevitably develop based on function names.

    Lambdas

    Lambdas are great but can quickly become unmanageable if misused. Take the following example:

                statesMap.ForEach(kvp =>
                {
                    var list = new List<string>();
                    kvp.Key.Actions.OfType<DoAction>().
                        ForEach(a =>
                        {
                            if (a.HasParameter("check"))
                            {
                                string checkRef = a.GetParameters().Find(p => p.Name == "check").Value;
                                list.Add(checkName);
                            }
                        });
    
                    string str = "";
    
                    kvp.Key.Actions.OfType<WaitAction>().
                        ForEach(a =>
                        {
                            if (a.HasParameter("test"))
                            {
                                str = a.GetParameters().Find(p => p.Name == "hasValue").Value;
                                list.Add(str);
                            }
                        });
    
                    newStates.Add(new StateDto { Parameters = list });
                }); 
    

    There are a couple of problems in this example, one is the variable ‘kvp’. I think it’s fairly intuitive that this is an abbreviation of key-value pair, but does it really tell you what that key-value pair is? We definitely have more information at our disposal so why not be more explicit and name the variable what it really is? When using lambdas there’s a tendency to use obscure variable names that are not meaningful because lambdas are usually terse and inline. This should be avoided and meaningful variable names should be used instead.

    The second issue is that the lambda is doing too much; it should be a function. In addition, that function should probably be broken up into smaller functions.

    Consider the following refactoring.

            statesMap.ForEach(actionsPair => CreateAddState(newStates, actionsPair.Key.Actions));
    
            private void CreateAddState(List<StateDto> newStates, List<Action> actions)
            {
                var list = new List<string>();
                actions.OfType<DoAction>().ForEach(action => FindCheckParameter(action, list));
                actions.OfType<WaitAction>().ForEach(action => FindHasValueParameter(action, list));            
    
                newStates.Add(new StateDto { Parameters = list });
            }
    
            private void FindCheckParameter(Action action, List<string> parameterList)
            {
                if (action.HasParameter("check")) {
    
                    string checkRef = action.GetParameters().Find(parameter => parameter.Name == "check").Value;
                    parameterList.Add(checkName);
                }               
            }
    
            private void FindHasValueParameter(Action action, List<string> parameterList)
            {
                if (action.HasParameter("hasValue")) {
    
                    string valueParam = action.GetParameters().Find(parameter => parameter.Name == "hasValue").Value;
                    parameterList.Add(valueParam);
                }            
            }
    

    In the refactored code, each lambda is now a function with an appropriate name. The variable names are no longer single characters (e.g. p, a), making them much more identifiable. The code is much easier to follow as there is no longer the single blob of statements.

    Minimize Levels of Abstraction

    Another consideration is the level of abstraction within functions. If at all possible, you should try to avoid functions which have statements that are not at the same level of abstraction. When the level of abstraction between statements are great, it becomes extremely hard to follow and debug as our mental model of the code becomes muddled by the layers. It’s not always possible to minimize levels of abstraction within functions but I would consider it a high priority for maintainable code.

    Avoid this, if possible.

    public void Depth1() {
        Depth2();
    }
    
    public void Depth2() {
        Depth3();
    }
    
    public void Depth3() {
        ...
    }
    

    Aim for this.

    public void DoTask() {
        Depth1();
        Depth2();
        Depth3();
    }
    

    Function parameters

    When dealing with function parameters, an important concern should be to avoid boolean traps because it leads to misleading or confusing code. Consider the following statement.

    displayWindow(true);
    

    When reading this statement, you don’t have any contextual clues as to what “true” means with regards to the function. You would need to refer to the function implementation to determine what this statement really means. Instead, if the function was refactored to the following.

    displayWindow(WindowStyle.Border);
    displayWindow(WindowStyle.NoBorder);
    

    Now, it’s clear what the function is doing. The use of an enumeration clearly conveys the intent of each method call. If you are thinking about adding a boolean to a function consider if it would not make more sense to have a different function with a meaningful name, use a typed object as a parameter, or possibly an enumeration as shown.

    We should avoid too many parameters in a function. If a function has too many parameters then the parameters should probably be wrapped in a object and the single object passed to the function instead.

    White space

    White space is important. You don’t display a phone number as 1234567890, you display it as 123-456-7890. Humans can’t form a mental model as easily if things are cluttered.

    Take the following.

                    var textContainer:Canvas = Canvas(answers.addChild(new Canvas()));
                    textContainer.visible = false;
                    textContainer.includeInLayout = false;
                    var text:SuperTextArea = new SuperTextArea();
                    text.percentWidth = 100;
                    text.maxHeight = 200;
                    text.scaleToContent = true;
                    text.setStyle("left", 18);
                    text.setStyle("right", 1);
                    text.editable = data.editable;
                    text.selectable = data.editable;
    

    Now, compare it with the following refactored code.

                    var textContainer:Canvas = Canvas(answers.addChild(new Canvas()));
    
                    textContainer.visible = false;
                    textContainer.includeInLayout = false;
    
                    var text:SuperTextArea = new SuperTextArea();
    
                    text.percentWidth = 100;
                    text.maxHeight = 200;
    
                    text.scaleToContent = true;
                    text.setStyle("left", 18);
                    text.setStyle("right", 1);
    
                    text.editable = data.editable;
                    text.selectable = data.editable;
    

    I think the refactored code is much easier on the eyes because of the clearer separation between the different tasks being done by each group of statements. I think the additional white space is like punctuation in sentences, it allows you to rest (your eyes) and absorb before moving on to the next thought. This will make it much easier for everybody who reads it in the future.

    Due Diligence

    The points above are pretty common sense, but it’s difficult to follow them perfectly. That is why, as a team, it’s important to develop a culture where everybody agrees to make an effort to either correct poorly written code or make a (bug) case so that it’s documented to be fixed.

    I understand that it can be difficult to work if we’re often distracted by making cases while working on a task, but in the long term it makes everything easier.

    Standards

    It’s important to establish coding standards and have all team members buy in. There are many different opinions on a team, but it’s important to agree and adhere to conventions even if you might not agree with everything.

    This is a vast topic and I’m only barely scratching the surface; you could write books on this stuff. In fact, many people have. Here one’s that I recommend: Clean Code by Robert Martin.

    Read More