Functional VGL List Class

25 June 2019

The scripting language used inside of SampleManager, VGL, has basic multidimensional array datatypes for fixed-length and variable-length arrays. In addition to the usual array accessor, there are some routines in the core STD_ARRAY library to do some basic manipulation of arrays. Most of the names are self-explanatory:

array_copy
array_element_exists
array_get_dimensions
array_insert_slice
array_remove_slice
array_sort
array_complex_sort (for multidimensional arrays)

Over time, using modern languages like C# or JavaScript, I've become accustomed to having a more fully-featured Array datatype, so I set out to create something to make my life a little easier in VGL. I implemented a 1-dimensional LIST class, using the JavaScript Array object as a model for which actions I implemented.

API Summary

Here's a summary of the LIST API:

Create a new list

JOIN LIBRARY lib_list

DECLARE list

lib_list_define_list_class()
CREATE OBJECT LIST_CLASS, list

As with many VGL class-oriented libraries, a routine to define the class needs to be called before the class itself is available for instantiation.

Add elements to the list

                       { list contents                              }
list.append(1)         { [1]                                        }
list.append(2)         { [1  2]                                     }
list.push(3)           { [1  2  3]                                  }
list.unshift(4)        { [4  1  2  3]                               }

The append and push actions are identical, adding a new element to the end of the list. The unshift action adds a new element to the beginning of the list.

Access elements of the list by index

                       { list contains: [4  1  2  3]                }
list.get(3)            { action returns: 2                          }
list.set(3, 5)         { list contains: [4  1  5  3]                }

Removing elements of the list

                       { list contents                              }
                       { [4  1  5  3]                               }
list.remove(1)         { [1  5  3]                                  }
list.pop()             { [1  5]      pop() returns 3                }
list.shift()           { [5]         shift() returns 1              }

Corresponding to unshift and push, shift and pop remove and return the first and last elements of the list, respectively. The remove action returns the list object rather than the element that was removed.

Order elements of the list

                       { list contents                              }
                       { [1  2  3  4]                               }
list.reverse()         { [4  3  2  1]                               }

There is a reverse action to reverse the order of the elements in the list, but no sort action.

Slice and splice

                       { list contains: [a  b  c  d  e]             }
list.slice(2, 4)       { action returns: [b  c]                     }
list.slice(-3, EMPTY)  { action returns: [c  d  e]                  }
list.slice(EMPTY, -2)  { action returns: [a  b  c]                  }

The slice action is used to copy a range of elements into a new list without modifying the original list.

                       { list contains: [a  b  c  d  e  f  g]       }
list.splice(2, 4)      { action returns: [b  c  d  e]               }
                       { list now contains: [a  f  g]               }

The splice action is used to return a range of elements from the original list, removing them from the original list. This differs from the JavaScript Array.splice method in that it cannot be used to insert or replace existing elements.
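
For comparison, here is how the JavaScript model behaves. This is plain JavaScript rather than VGL, and only a sketch of the Array methods that the LIST actions are modeled on. Note that JavaScript indexes from 0, and that its splice takes a start index and a delete count.

```javascript
// JavaScript arrays are 0-based; the VGL LIST described above is 1-based
const letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];

// slice copies a range without touching the original (end index is exclusive)
const copied = letters.slice(1, 3);       // ['b', 'c']
const lastThree = letters.slice(-3);      // ['e', 'f', 'g']

// splice removes a range from the original and returns the removed elements
const removed = letters.splice(1, 4);     // ['b', 'c', 'd', 'e']
// letters is now ['a', 'f', 'g']

// unlike the LIST splice action, Array.splice can also insert or replace
letters.splice(1, 0, 'x', 'y');           // letters: ['a', 'x', 'y', 'f', 'g']
letters.splice(0, 1, 'z');                // letters: ['z', 'x', 'y', 'f', 'g']
```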

Filter

                       { list contains: [a  b  b  c  d  a]          }
list.distinct()        { action returns: [a  b  c  d]               }

The distinct action returns a new list with all of the distinct elements in the original list. Elements appear in the returned list in the order in which they first appear in the original list. The original list is not modified.
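
The same first-occurrence ordering can be sketched in JavaScript. The distinct helper below is a hypothetical illustration of the behavior described above, not the VGL implementation in lib_list.

```javascript
// hypothetical helper illustrating the behavior; not part of lib_list
function distinct(items) {
    // keep an element only at the position where it first occurs
    return items.filter((item, index) => items.indexOf(item) === index);
}

const original = ['a', 'b', 'b', 'c', 'd', 'a'];
const unique = distinct(original);

console.log(unique);           // [ 'a', 'b', 'c', 'd' ]
console.log(original.length);  // 6 (the original is not modified)
```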

Bounds

                       { list contains: [a  b  c  a  d]             }
list.inBounds(0)       { action returns: FALSE                      }
list.inBounds(3)       { action returns: TRUE                       }
list.inBounds(6)       { action returns: FALSE                      }

The inBounds action returns TRUE if the given index is contained within the list. Remember, list indexes are 1-based in order to be consistent with VGL array indexes.

                       { list contains: [a  b  c  a  d]             }
list.length            { statement value: 5                         }

There is also a length property containing the number of elements in the list. It should not be modified from outside the class actions, but it can be read.

Contents

                       { list contains: [a  b  c  a  d]             }
list.indexOf(a)        { action returns: 1                          }
list.indexOf(c)        { action returns: 3                          }
list.indexOf(e)        { action returns: 0                          }

list.lastIndexOf(a)    { action returns: 4                          }
list.lastIndexOf(c)    { action returns: 3                          }

list.includes(a)       { action returns: TRUE                       }
list.includes(e)       { action returns: FALSE                      }

The indexOf and lastIndexOf actions return the index of the first or last (respectively) occurrence of a given element in the list. If the element is not found in the list, the action returns 0. The includes action returns TRUE if the given element is included in the list; it is implemented by returning TRUE if indexOf is greater than 0.
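
Because the list is 1-based, 0 is free to act as the "not found" value, where JavaScript uses -1. A rough JavaScript sketch of that convention follows; the helper names are hypothetical and this is not the VGL source.

```javascript
// hypothetical 1-based wrappers around the 0-based JavaScript methods
const indexOf1 = (items, value) => items.indexOf(value) + 1;
const lastIndexOf1 = (items, value) => items.lastIndexOf(value) + 1;

// includes is defined in terms of indexOf, as described above
const includes1 = (items, value) => indexOf1(items, value) > 0;

const items = ['a', 'b', 'c', 'a', 'd'];
console.log(indexOf1(items, 'a'));      // 1
console.log(indexOf1(items, 'e'));      // 0 (not found)
console.log(lastIndexOf1(items, 'a'));  // 4
console.log(includes1(items, 'e'));     // false
```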

Chaining

                       { list contains: [1  2  3  4]                }
list.push(5).unshift(6).remove(2).pop()
                       { action returns: 5                          }
                       { list now contains [6  2  3  4]             }

Many actions that result in the list being modified will return a reference to the modified list. This allows multiple actions to be chained together in the same statement.
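
Chaining works because each modifying action hands back the list itself. Here is a minimal JavaScript sketch of the idea; the ChainableList class is hypothetical and only mirrors the four actions used in the example above.

```javascript
// hypothetical class showing the pattern; not the VGL LIST_CLASS
class ChainableList {
    constructor(items = []) {
        this.items = [...items];
    }

    // modifying actions return the list so calls can be chained
    push(value) { this.items.push(value); return this; }
    unshift(value) { this.items.unshift(value); return this; }

    // index is 1-based, matching the list described above
    remove(index) { this.items.splice(index - 1, 1); return this; }

    // pop returns the removed element, so it ends the chain
    pop() { return this.items.pop(); }
}

const list = new ChainableList([1, 2, 3, 4]);
const popped = list.push(5).unshift(6).remove(2).pop();

console.log(popped);      // 5
console.log(list.items);  // [ 6, 2, 3, 4 ]
```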

Output

                       { list contains: [1  2  3]                   }
list.join(", ")        { action returns: "1, 2, 3"                  }
list.join("|")         { action returns: "1|2|3"                    }
list.join(" and ")     { action returns: "1 and 2 and 3"            }
list.toString()        { action returns: "[1,2,3]"                  }

The join action returns a string containing the list elements delimited by the specified string. The toString action returns a human-readable string containing the contents of the list suitable for logging or debugging purposes.

GitHub Repository

I have created a GitHub repository for this library and released it under the MIT License, which makes it easy to use in your own projects. If you have any problems with the library or suggestions on how it could be extended or improved, please create an issue or submit a pull request so that they can be tracked effectively.

If you find this useful and end up using it in a project, I'd love to know about it.

Text Reports III: Filling the Page

6 November 2018

Continuing to build on the first and second installments, let's try to reduce the wrapping used by expanding to fill the page width.

We'll continue to use the same data as before, but with the Moisture Content and Volatile Matter columns removed to give us room to expand. Thus we start with this:

                               Gross           
                               Caloric         
                    Fixed      Value at        
                    Carbon by  Constant        
Sample              Difference Volume   Ash    
Name    Date        wt. %      J/g      wt. %  
------- ----------- ---------- -------- -------
X24-03  01-Nov-2018 15.62      19985    0.25   
X24-02  31-Oct-2018 16.01      20004    0.23   
X24-01  30-Oct-2018 15.89      19996    0.24   

The output above only uses 47 columns of text, so you can see there's lots of room to grow.

How Wide?

We need to define how many columns of text can fit on a page. The code blocks that I use on this site comfortably fit about 80 columns—at least in my browser—so we'll use that. We'll set the page width next to where we set the minimum column width.

// page width and minimum column width
const minimumColumnWidth = 7;
const pageWidth = 80;

Let's Get Organized

Our approach to filling the space will be to increase the width of the most-wrapped header columns in an effort to reduce the wrapping. We define most-wrapped as the column header that has the greatest height.

To do this, let's first reorganize our code a bit to make it more modular. First, we'll make a word-wrap function based on the wrapping code we've used previously. Note that it returns an array of lines rather than a single string with newlines.

// wrap text to a specific length
function wrapText(str, len) {

    // split the string into words and use Array.reduce() to condense it into
    // lines that are all less than or equal to the target length
    return str.split(/\s+/).reduce((accumulator, current) => {

        // is this the first word?
        if (accumulator.length == 0) {

            // start a new line with the first word
            return [current];
        } else {

            // get the last line
            const lastLine = accumulator.pop();

            // add the next word to the last line
            const testLine = lastLine + ' ' + current;

            // is this line less than or equal to the target length?
            if (testLine.length <= len) {

                // add the line to the list
                return accumulator.concat(testLine);
            } else {

                // otherwise add the unmodified line to the list and start a
                // new line with the next word
                return accumulator.concat(lastLine, current);
            }
        }
    }, []);
}
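
To see the wrapping in action, here is a small usage sketch. It uses a loop-based restatement of wrapText (intended to behave the same as the reduce version above) so that the snippet runs on its own. Note that a word longer than the target length is never broken; it simply gets its own line.

```javascript
// loop-based restatement of wrapText so this snippet is self-contained
function wrapText(str, len) {
    const lines = [];
    for (const word of str.split(/\s+/)) {
        const last = lines.length - 1;

        // append to the current line if the result still fits
        if (last >= 0 && (lines[last] + ' ' + word).length <= len) {
            lines[last] += ' ' + word;
        } else {
            // otherwise start a new line with this word
            lines.push(word);
        }
    }
    return lines;
}

const header = 'Gross Caloric Value at Constant Volume';

console.log(wrapText(header, 8));
// [ 'Gross', 'Caloric', 'Value at', 'Constant', 'Volume' ]

console.log(wrapText(header, 19));
// [ 'Gross Caloric Value', 'at Constant Volume' ]

console.log(wrapText('Difference', 7));
// [ 'Difference' ]
```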

Next, let's move the unit handling into its own function. It reads the units from the first data row of the grid, strips them out of every data value, and returns the list of units. This is updated a little bit from the previous installment in order to modify the grid in place.

// extract the units from the grid and strip them from the data cells
function extractUnits(grid) {

    // check each cell in the first data row of the grid
    const units = grid[1].map(str => {

        // compare with the regular expression
        const matches = str.match(/\d+(\.\d+)?\s+(.+)/);

        // was there a match?
        if (matches !== null) {

            // return the units
            return matches[2];
        } else {

            // otherwise return false
            return false;
        }
    });

    // remove the units from the grid
    grid.forEach((row, i) => {

        // is this a data row?
        if (i > 0) {

            // if we have a unit, remove it from the cell
            grid[i] = row.map((x, j) => units[j] ? x.split(/\s/)[0] : x);
        }
    });

    // return the units; the grid is already modified
    return units;
}
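
As a quick check of the in-place behavior, here is a usage sketch with a tiny two-column grid. The function body is a condensed copy of the one above so the snippet is self-contained; the sample grid is made up from the data used in this series.

```javascript
// condensed copy of extractUnits from above
function extractUnits(grid) {
    // read the units from the first data row
    const units = grid[1].map(str => {
        const matches = str.match(/\d+(\.\d+)?\s+(.+)/);
        return matches !== null ? matches[2] : false;
    });

    // strip the units from every data row, in place
    grid.forEach((row, i) => {
        if (i > 0) {
            grid[i] = row.map((x, j) => units[j] ? x.split(/\s/)[0] : x);
        }
    });
    return units;
}

const grid = [
    ['Date', 'Ash'],
    ['01-Nov-2018', '0.25 wt. %'],
    ['31-Oct-2018', '0.23 wt. %'],
];
const units = extractUnits(grid);

console.log(units);    // [ false, 'wt. %' ]
console.log(grid[1]);  // [ '01-Nov-2018', '0.25' ]
console.log(grid[0]);  // [ 'Date', 'Ash' ] (header row untouched)
```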

Finally, let's take the code which renders the grid into text and put that into its own function. As an input, we'll give it the main grid data, the units, and the desired column widths.

// render the grid into text
function gridToLines(grid, units, columnWidths) {

    // wrap the column headers based on the calculated widths
    const headers = columnWidths.map((w, i) =>
        wrapText(grid[0][i], w).concat(units[i] ? units[i] : []));

    // how many header lines do we need?
    const headerHeight = Math.max(...headers.map(x => x.length));

    // pad our headers with blank lines so the content is bottom-aligned
    headers.forEach(h => {
        h.unshift(...new Array(headerHeight - h.length).fill(''));
    });

    // format as lines
    return new Array(headerHeight)
        .fill('').map((h, i) =>
            columnWidths.map((w, j) => headers[j][i].padEnd(w)).join(' '))
        .concat(columnWidths.map(w => ''.padEnd(w, '-')).join(' '))
        .concat(...grid.slice(1).map(row =>
            columnWidths.map((w, j) => row[j].padEnd(w)).join(' ')));
}

This Wide

Now that we have our existing code a little better organized, let's move into the new stuff. We want to incrementally widen the tallest column (as defined by header height) until its header is one line shorter, for as long as the table still fits on the page. Let's start by writing a function which, given a string and a target height, will tell us the smallest wrapping width to achieve the target height.

// find the minimum width to wrap text to a target height
function calcWidth(str, targetHeight = 0) {

    // start with the minimum width possible without breaking words
    let width = Math.max(...str.split(/\s+/).map(w => w.length));

    // calculation for the height
    const heightCalc = (str, width) => wrapText(str, width).length;

    // increase the width until we reach the target height
    while (targetHeight > 0 && heightCalc(str, width) > targetHeight) {
        width++;
    }

    // return the width used to hit the target height
    return width;
}

Note that calcWidth() takes targetHeight as an optional parameter with a default value of zero, in which case it returns the narrowest width that avoids breaking any word. We'll reuse this function later to calculate the minimum possible width of a column.
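
Here is what the function returns for the longest header in our data at a few target heights. This is a self-contained sketch: both functions are condensed restatements of the versions above, intended to behave the same way.

```javascript
// condensed restatement of wrapText
function wrapText(str, len) {
    const lines = [];
    for (const word of str.split(/\s+/)) {
        const last = lines.length - 1;
        if (last >= 0 && (lines[last] + ' ' + word).length <= len) {
            lines[last] += ' ' + word;
        } else {
            lines.push(word);
        }
    }
    return lines;
}

// condensed restatement of calcWidth
function calcWidth(str, targetHeight = 0) {
    // start at the length of the longest word
    let width = Math.max(...str.split(/\s+/).map(w => w.length));

    // widen one column at a time until the text fits the target height
    while (targetHeight > 0 && wrapText(str, width).length > targetHeight) {
        width++;
    }
    return width;
}

const header = 'Gross Caloric Value at Constant Volume';

console.log(calcWidth(header));     // 8  (longest word, "Constant")
console.log(calcWidth(header, 4));  // 13
console.log(calcWidth(header, 2));  // 19
console.log(calcWidth(header, 1));  // 38 (the whole header on one line)
```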

Now we get to the meat—how do we expand things to fill the page width? The basic algorithm is this:

1. Find the "tallest" column header.
2. Expand it so that it's one row shorter.
3. If we're still narrower than the page width, repeat from the top.

Here's the code integrated in a function to calculate the column widths.

// find the optimal column widths
function calcWidths(grid, units, minimumColumnWidth, pageWidth) {

    // calculate the minimum column widths
    let columnWidths = new Array(grid[0].length)
        .fill(minimumColumnWidth)
        .map((width, i) => Math.max(...grid.map((row, j) =>
            j === 0 ? calcWidth(row[i]) : row[i].length).concat(width)));

    // iterate until we fill the page
    let newWidths = columnWidths.slice(0);
    do {

        // use the new widths
        columnWidths = newWidths;

        // find the tallest column
        let tallest = columnWidths
            .map((w, i) => ({ i, h: wrapText(grid[0][i], w).length }))
            .sort((a, b) => a.i - b.i)
            .sort((a, b) => b.h - a.h)[0];

        // if our tallest column has a height of 1, bail on the loop
        if (tallest.h <= 1) break;

        // make a copy of the column widths
        newWidths = columnWidths.slice(0);

        // update the width of the tallest column
        newWidths[tallest.i] = calcWidth(grid[0][tallest.i], tallest.h - 1);

        // repeat if we're still under the target width
    } while (newWidths.reduce((a, c) => a + c + 1, -1) <= pageWidth);

    // return the column widths
    return columnWidths;
}

It's not an optimal algorithm, but it gets us close enough to be functional. Perhaps we'll improve on it in a later iteration.
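
The loop condition is worth a second look: the reduce adds one separator space per column and starts at -1 so that the last column doesn't get a trailing separator. A small sketch of that calculation follows; the totalWidth name is mine, not part of the code above.

```javascript
// rendered width of a table: the column widths plus one space between columns
const totalWidth = widths => widths.reduce((a, c) => a + c + 1, -1);

const pageWidth = 80;

// five columns of these widths need 74 characters plus 4 separators
const accepted = [11, 11, 26, 19, 7];
console.log(totalWidth(accepted));               // 78
console.log(totalWidth(accepted) <= pageWidth);  // true

// widening the fourth column to 38 would overflow the page, so the
// loop would stop and keep the previous widths
const rejected = [11, 11, 26, 38, 7];
console.log(totalWidth(rejected));               // 97
console.log(totalWidth(rejected) <= pageWidth);  // false
```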

Put It Together

We've refactored the code into a bunch of functions, so let's put it all together. Here's the main code that calls the functions above:

// page width and minimum column width
const minimumColumnWidth = 7;
const pageWidth = 80;

// populate the grid from our input data
const grid = [Object.keys(input.data[0])];
input.data.forEach(row => {
    grid.push(grid[0].map(key => row[key]));
});

// extract the units from the grid
const units = extractUnits(grid);

// calculate the column widths
const columnWidths = calcWidths(grid, units, minimumColumnWidth, pageWidth);

// print out the grid
console.log(gridToLines(grid, units, columnWidths).join('\n'));

Output

With a specified page width of 80 columns, this is the output that we get:

                                                   Gross Caloric Value        
                        Fixed Carbon by Difference at Constant Volume  Ash    
Sample Name Date        wt. %                      J/g                 wt. %  
----------- ----------- -------------------------- ------------------- -------
X24-03      01-Nov-2018 15.62                      19985               0.25   
X24-02      31-Oct-2018 16.01                      20004               0.23   
X24-01      30-Oct-2018 15.89                      19996               0.24   

You can download the complete code here.

Next Steps

In the next iteration, we'll handle wrapping of the entire table if the columns are too wide to fit in the width of the page.

Text Reports II: Extract Units

5 November 2018

Building on the first installment, let's improve the formatting by pulling the units out of the rows and putting them in the header instead. We'll use the same data as before.

Define Pattern

Using a regular expression, we can easily determine if a cell contains a numeric value with a unit and then extract both parts. Let's create a function that takes the cell contents as an input and returns the unit (or false if no units are found).

// extract the units from a string
function extractUnits(str) {

    // compare with the regular expression
    const matches = str.match(/\d+(\.\d+)?\s+(.+)/);

    // was there a match?
    if (matches !== null) {

        // return the units
        return matches[2];
    } else {

        // otherwise return false
        return false;
    }
}
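
It helps to see which cells the pattern accepts. The sketch below runs the same regular expression against a few values; unitOf is a hypothetical shorthand for the function above. Because the pattern is not anchored, any cell with a number followed by whitespace counts as having units. That is fine for our data, but the anchored variant at the end (my assumption, not part of the original program) would be stricter.

```javascript
// same pattern as extractUnits above
const pattern = /\d+(\.\d+)?\s+(.+)/;

// hypothetical shorthand for the function above
const unitOf = str => {
    const m = str.match(pattern);
    return m !== null ? m[2] : false;
};

console.log(unitOf('4.85 wt. %'));     // wt. %
console.log(unitOf('19985 J/g'));      // J/g
console.log(unitOf('01-Nov-2018'));    // false
console.log(unitOf('X24-03'));         // false

// an unanchored pattern also matches a number in the middle of text
console.log(unitOf('Batch 7 final'));  // final

// assumed stricter variant: the whole cell must be "number, space, units"
const anchored = /^\d+(\.\d+)?\s+(.+)$/;
console.log('Batch 7 final'.match(anchored));   // null
console.log('4.85 wt. %'.match(anchored)[2]);   // wt. %
```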

Extract Units

Now that we have a function that will parse the string and pull out the units if there are any, let's create an array that contains the units for each column (based on the first row). We are making the assumption that the units will be the same in every cell in a given column, and that the first data row exists with no blank values.

// make a list of units based on the first row of data
const units = grid[1].map(value => extractUnits(value));

Now that we've determined what the units are, we need to remove the units from the grid.

// remove the units from the grid
grid = grid.map((row, i) => {

    // is this the header row?
    if (i == 0) {

        // don't change anything
        return row;
    } else {

        // if we have a unit, remove it from the cell
        return row.map((x, j) => units[j] ? x.split(/\s/)[0] : x);
    }
});

Column Widths

With the units moving to the header, we need to make sure the column widths are all wide enough. Let's update the width calculation to make sure that the units are included when we're looking at the header row words. We consider the whole unit string as a "word" in this case as we do not want to break in the middle of the unit string.

// calculate the column widths
let columnWidths = new Array(grid[0].length)
    .fill(minimumColumnWidth)
    .map((width, i) => Math.max(...grid.map((row, j) => {
        
        // is this the first row?
        if (j === 0) {
            // find the width of the longest word (including the units)
            return Math.max(...row[i].split(/\b/)                      // words
                                     .concat(units[i] ? units[i] : []) // units
                                     .map(w => w.length));             // lengths
        } else {
            // find the width of the whole contents
            return row[i].length;
        }
    }).concat(width)));

Header

Now that we've extracted the units and taken them into consideration when sizing the columns, let's update the code to print the units as part of the column headers. The units will come out on the bottom line of the header.

// wrap the column headers based on the calculated widths
let headers = columnWidths.map((w, i) => {
    return grid[0][i].split(/\s+/).reduce((accumulator, current) => {
        if (accumulator.length == 0) {
            return [current];
        } else {
            const lastLine = accumulator.pop();
            const testLine = lastLine + ' ' + current;
            if (testLine.length <= w) {
                return accumulator.concat(testLine);
            } else {
                return accumulator.concat(lastLine, current);
            }
        }
    }, []).concat(units[i] ? units[i] : []); // add units
});

Output

With our changes, the output now looks like this:

                                                 Gross           
                                                 Caloric         
                                      Fixed      Value at        
                    Moisture Volatile Carbon by  Constant        
Sample              Content  Matter   Difference Volume   Ash    
Name    Date        wt. %    wt. %    wt. %      J/g      wt. %  
------- ----------- -------- -------- ---------- -------- -------
X24-03  01-Nov-2018 4.85     79.29    15.62      19985    0.25   
X24-02  31-Oct-2018 4.52     80.91    16.01      20004    0.23   
X24-01  30-Oct-2018 4.68     80.03    15.89      19996    0.24   

You can download the complete code here.

Next Steps

In the next iteration, we'll improve the layout by expanding to fill the page width.

Column Formatting for Text Reports

1 November 2018

Here's the scenario: we have a system that outputs plain-text reports with data formatted into a table. Our raw data comes in JSON format; we'll use this as the input to our program:

{
    "data": [
        {
            "Sample Name": "X24-03",
            "Date": "01-Nov-2018",
            "Moisture Content": "4.85 wt. %",
            "Volatile Matter": "79.29 wt. %",
            "Fixed Carbon by Difference": "15.62 wt. %",
            "Gross Caloric Value at Constant Volume": "19985 J/g",
            "Ash": "0.25 wt. %"
        },
        {
            "Sample Name": "X24-02",
            "Date": "31-Oct-2018",
            "Moisture Content": "4.52 wt. %",
            "Volatile Matter": "80.91 wt. %",
            "Fixed Carbon by Difference": "16.01 wt. %",
            "Gross Caloric Value at Constant Volume": "20004 J/g",
            "Ash": "0.23 wt. %"
        },
        {
            "Sample Name": "X24-01",
            "Date": "30-Oct-2018",
            "Moisture Content": "4.68 wt. %",
            "Volatile Matter": "80.03 wt. %",
            "Fixed Carbon by Difference": "15.89 wt. %",
            "Gross Caloric Value at Constant Volume": "19996 J/g",
            "Ash": "0.24 wt. %"
        }
    ]
}

Create Grid

Using the built-in JSON.parse() function, we take the above JSON and interpret it as a JavaScript object called input. Next, we need to transform it into a simple 2-dimensional array of strings that we'll refer to as our grid. The first row of the grid is made up of the column names, and each row after that is data.

// initialize our grid with the first row as the keys of the input data objects
let grid = [Object.keys(input.data[0])];

// load the rest of the rows into the grid
input.data.forEach(row => {
    grid.push(grid[0].map(key => row[key]));
});

Minimum Column Widths

Let's continue by calculating the minimum column width for each column in the grid. We'll say that we don't want any columns narrower than 7 characters wide. We also don't want to wrap any of the actual data values, so we'll break down the column headers (in grid[0]) into words but not the rest of the rows.

// minimum column width
const minimumColumnWidth = 7;

// calculate the column widths
let columnWidths = new Array(grid[0].length)
    .fill(minimumColumnWidth)
    .map((width, i) => Math.max(...grid.map((row, j) => {
        
        // is this the first row?
        if (j === 0) {
            // find the width of the longest word
            return Math.max(...row[i].split(/\b/).map(w => w.length));
        } else {
            // find the width of the whole contents
            return row[i].length;
        }
    }).concat(width)));
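
To make the rule concrete, here is the same calculation for a single column, pulled out into two small helpers. The helper names are mine and this is only a sketch. Splitting on \b keeps the spaces as separate pieces, which is harmless here because we only care about the longest piece.

```javascript
const minimumColumnWidth = 7;

// length of the longest word in a header
const longestWord = header =>
    Math.max(...header.split(/\b/).map(w => w.length));

// a column must fit its longest header word, every data cell in full,
// and the minimum width
const columnWidth = (header, cells) =>
    Math.max(minimumColumnWidth, longestWord(header), ...cells.map(c => c.length));

console.log('Sample Name'.split(/\b/));
// [ 'Sample', ' ', 'Name' ]

console.log(longestWord('Fixed Carbon by Difference'));  // 10
console.log(columnWidth('Sample Name', ['X24-03']));     // 7
console.log(columnWidth('Ash', ['0.25 wt. %']));         // 10
```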

Print Column Headers

Great, now that we have the width of each column, we can continue with outputting the column header lines. Because of the wrapping, some column headers will take more lines than others, so we'll take care to pad the header lines such that the column headers are aligned to the bottom.

// wrap the column headers based on the calculated widths
let headers = columnWidths.map((w, i) => {
    return grid[0][i].split(/\s+/).reduce((accumulator, current) => {
        if (accumulator.length == 0) {
            return [current];
        } else {
            const lastLine = accumulator.pop();
            const testLine = lastLine + ' ' + current;
            if (testLine.length <= w) {
                return accumulator.concat(testLine);
            } else {
                return accumulator.concat(lastLine, current);
            }
        }
    }, []);
});

// how many header lines do we need?
const headerHeight = Math.max(...headers.map(x => x.length));

// pad our headers with blank lines so the content is bottom-aligned
headers = headers.map(h =>
    new Array(headerHeight - h.length).fill('').concat(...h));

// create the headers
let lines = new Array(headerHeight).fill('').map((h, i) => 
    columnWidths.map((w, j) => headers[j][i].padEnd(w)).join(' '));

// create the separator lines
lines.push(columnWidths.map(w => ''.padEnd(w, '-')).join(' '));
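
The bottom-alignment step is easier to follow in isolation. In this sketch the headers are already wrapped, and the widths are typed in by hand rather than calculated.

```javascript
// three headers that have already been wrapped to different heights
const headers = [
    ['Sample', 'Name'],
    ['Date'],
    ['Fixed', 'Carbon by', 'Difference'],
];
const widths = [7, 11, 10];

// the tallest header decides how many header lines we print
const headerHeight = Math.max(...headers.map(h => h.length));

// prepend blank lines so every header ends on the bottom line
const aligned = headers.map(h =>
    new Array(headerHeight - h.length).fill('').concat(h));

console.log(aligned);
// [ [ '', 'Sample', 'Name' ],
//   [ '', '', 'Date' ],
//   [ 'Fixed', 'Carbon by', 'Difference' ] ]

// build each output line by padding every cell to its column width
const lines = aligned[0].map((_, i) =>
    widths.map((w, j) => aligned[j][i].padEnd(w)).join(' '));

console.log(lines.join('\n'));
```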

Print Data Rows

Printing the data rows is a bit simpler as we do not do any wrapping, although we still pad the end of each cell with spaces using String.padEnd() to help everything line up correctly.

// compose each data line
grid.forEach((row, i) => {

    // skip the first line, we already have the headers
    if (i > 0) {
        lines.push(columnWidths.map((w, j) => row[j].padEnd(w)).join(' '));
    }

});

// print out the lines
console.log(lines.join('\n'));

Output

Here's what the program outputs to the console:

                                                       Gross               
                                                       Caloric             
                                           Fixed       Value at            
Sample              Moisture   Volatile    Carbon by   Constant            
Name    Date        Content    Matter      Difference  Volume    Ash       
------- ----------- ---------- ----------- ----------- --------- ----------
X24-03  01-Nov-2018 4.85 wt. % 79.29 wt. % 15.62 wt. % 19985 J/g 0.25 wt. %
X24-02  31-Oct-2018 4.52 wt. % 80.91 wt. % 16.01 wt. % 20004 J/g 0.23 wt. %
X24-01  30-Oct-2018 4.68 wt. % 80.03 wt. % 15.89 wt. % 19996 J/g 0.24 wt. %

You can download the complete code here.

Next Steps

In the next installment, we'll update the program to make the output more readable by extracting the units out of the cells and putting them in the column headers.

Removing Facebook Tracking Params

22 October 2018

Facebook recently began adding an fbclid parameter to external links. Using the Neat URL Firefox add-on or the Neat URL Chrome extension, you can easily remove this and other similar tracking parameters.

1. Install the Neat URL Firefox add-on or the Neat URL Chrome extension, depending on which browser you are using. I have only tried the Firefox add-on as I am not a regular Chrome user.
2. Go to the add-on preferences by right-clicking on the ?_ icon and selecting Preferences. This will bring up the Firefox Add-ons Manager with the Neat URL preferences page open.
3. Scroll down a bit to the Blocked parameters box. It should already be prepopulated with a lot of parameters. At the time of this writing, fbclid is not in there by default, but I won't be surprised when the author adds it.
4. If it's not in there already, add fbclid to the list. I added it in the middle with the other fb_* parameters.
5. Make sure you hit the Save preferences button at the bottom of the page.

Once you've completed the above steps, you can test by going to https://ianc.blog?fbclid=foo. The extension modifies the request before it is sent to the server, so you should see the address bar show https://ianc.blog right away. You're all set!
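
If you're curious what removing a parameter amounts to, here is a rough JavaScript sketch using the standard URL API. The stripParams helper is hypothetical and is not how Neat URL itself is implemented; it only illustrates the effect on the address.

```javascript
// hypothetical helper; illustrates the effect, not the extension's code
function stripParams(href, blocked) {
    const url = new URL(href);

    // delete each blocked parameter from the query string
    for (const name of blocked) {
        url.searchParams.delete(name);
    }
    return url.href;
}

console.log(stripParams('https://ianc.blog?fbclid=foo', ['fbclid']));
// https://ianc.blog/

console.log(stripParams('https://example.com/page?id=7&fbclid=abc', ['fbclid']));
// https://example.com/page?id=7
```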

Ian Cooper
