A piggy bank of commands, fixes, succinct reviews, some mini articles and technical opinions from a (mostly) Perl developer.

How to do logging and monitoring

A Logging Standard

Format

Use JSON

  • Human readable(ish)
  • Machine readable
  • Abundant tooling
    • jq
    • Document stores like Elasticsearch
    • almost every programming language

Common Data Structures

Interoperability is enhanced by adopting consistent data structures (detailed below).
Include Provenance IDs.

Provenance IDs

What is a Provenance ID?

  • Provide a Universally Unique ID (but not a UUID)
  • Provide a calling context
    • Clear relationship between a parent ID and its children.

ID Generation

If a provenance ID is provided with a request, use it as the parent of all work.
If no ID is found, generate a new ID from your service's Base ID and use it.

Adoption patterns

As services add support, we will be able to make connections between systems, but no system will fail to track traffic while waiting for upstream services to implement tracking.

Web Service Implementation

HTTP/HTTPS services need to check for the X-Bean-ProvenanceId header.
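
A minimal sketch of that check, in Python. The function name `resolve_provenance_id`, the plain dictionary of headers, and the use of a UUID as the unique element are assumptions made for illustration; only the header name and the "reuse it or mint a new one from the Base ID" rule come from this proposal.

```python
import uuid

HEADER = "X-Bean-ProvenanceId"


def resolve_provenance_id(headers, base_id):
    """Return the caller's provenance ID, or mint a new root ID."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == HEADER.lower() and value:
            return value  # becomes the parent of all work for this request
    # No upstream ID was supplied: start a new one from this service's Base ID.
    # (Assumption: a UUID is used as the unique element.)
    return f"{base_id}.{uuid.uuid4()}"
```

A service would call this once per incoming request and hand the result to everything that logs or makes onward calls.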

Structure

  • Base ID 
    • Any alpha-numeric string
    • Something like "SuperProductTypeCGI"
    • One scheme that works well is "Product-Service" like "ABC-Superprod"
  • Unique Element
    • Any string that provides uniqueness
    • A UUID is convenient; base64 encoding can keep it shorter.
    • Any other string works too, as long as it is unique across all systems running with the same base ID.
  • Request Path
    • Each child request adds a segment with a monotonically increasing value.
    • So the first child gets any natural number.  Typically 1
    • Any subsequent children must get a larger number. Typically the previous value, incremented by one.
    • Local and remote children can have separate or shared counters.
    • New items are appended as a comma delimited list.
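
The request path rules can be sketched in Python as below. The class name `RequestPath` is hypothetical, the counter is shared between local and remote children (one of the two options allowed above), and the "r" suffix for remote requests is taken from the grammar in the next section.

```python
import itertools


class RequestPath:
    """Hands out child provenance IDs for one parent ID."""

    def __init__(self, parent_id):
        self.parent_id = parent_id
        # One counter shared by local and remote children, starting at 1.
        self._counter = itertools.count(1)

    def child(self, remote=False):
        segment = f"{next(self._counter)}{'r' if remote else ''}"
        # The first segment follows ":"; later segments are comma-separated.
        separator = "," if ":" in self.parent_id else ":"
        return f"{self.parent_id}{separator}{segment}"
```

For example, a request that arrives as `ABC-Superprod.u1` would label its first local child `ABC-Superprod.u1:1` and a following remote call `ABC-Superprod.u1:2r`.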

ABNF Grammar

provenance-id = base-id "." unique-element request-path

base-id = ALPHA *safe-character

unique-element = *safe-character

request-path = ":" request-list
request-path =/ ""

request-list = local-request request-tail
request-list =/ remote-request request-tail

request-tail = "," request-list
request-tail =/ ""

local-request = number

remote-request = number "r"

number = non-zero-digit *DIGIT

non-zero-digit = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"

safe-character = ALPHA / DIGIT / safe-symbol

; Any printing, non-alphanumeric character that isn't "," ":" or whitespace
safe-symbol =  "!" / DQUOTE / "#" / "$" / "%" / "&" / "'" / "(" / ")" / "*" / "+"
safe-symbol =/ "-" / "." 
safe-symbol =/ "/"
safe-symbol =/ ";" / "<" / "=" / ">" / "?" / "@"
safe-symbol =/ "[" / "\" / "]" / "^" / "_" / "`"
safe-symbol =/ "{" / "|" / "}" / "~"
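
A rough Python validator for IDs of this shape, as a sketch rather than a reference implementation. It assumes that a "safe" character is any printable ASCII character other than "," and ":", that the request path is a colon followed by comma-separated numbers with an optional "r" suffix for remote requests, and that the base ID ends at the first "." (the grammar allows "." inside a base ID, so this split is a simplification). The function name `parse_provenance_id` is hypothetical.

```python
import re

# Assumption: "safe" means printable ASCII other than "," and ":".
SAFE = r"[\x21-\x2B\x2D-\x39\x3B-\x7E]"
NUMBER = r"[1-9][0-9]*"
PROVENANCE_RE = re.compile(
    rf"(?P<base>[A-Za-z]{SAFE}*?)\."                # base-id, split at the first "."
    rf"(?P<unique>{SAFE}*)"                         # unique-element
    rf"(?::(?P<path>{NUMBER}r?(?:,{NUMBER}r?)*))?"  # optional request-path
)


def parse_provenance_id(text):
    """Return the parts of a provenance ID, or None if it is malformed."""
    match = PROVENANCE_RE.fullmatch(text)
    if match is None:
        return None
    segments = match["path"].split(",") if match["path"] else []
    return {
        "base_id": match["base"],
        "unique_element": match["unique"],
        # Each request is a (number, is_remote) pair.
        "requests": [(int(s.rstrip("r")), s.endswith("r")) for s in segments],
    }
```

The check is purely syntactic: it does not verify that the numbers in the path increase.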

Example

Base ID:  ABC-Superprod
Unique Element: 8ed56f13-bd6c-4f47-a8eb-94604e07e6ac
  • New request from customer: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac
  • Internal call to client config: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac:3
  • Client config looks at User config: ABC-Superprod.8ed56f13-bd6c-4f47-a8eb-94604e07e6ac:3,12r

Log Entry Objects

Log Item Types

Log Entry

[ Log Header, Log Body ]

Log Header

[   ISO8601-Time with milliseconds in UTC,
    Log Type Name, 
    hostname,
    program name,
    process ID, 
    thread ID,
    Provenance ID,
    Context name
]

Log Body

{   Log Type Name => Log Type Data,
    file => name of file where log entry generated,
    line => line number of log entry,
    method => name of method/function that generated the entry,
}

Event

Type Name: EVENT
{    message => A string describing the event
}
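
To make the shapes above concrete, here is a small Python sketch that builds one EVENT entry as a `[header, body]` pair ready for JSON encoding. The function names, the argument list, and the use of `sys._getframe` (CPython-specific) to find the caller are assumptions for illustration only.

```python
import json
import os
import socket
import sys
import threading
from datetime import datetime, timezone


def make_event_entry(message, provenance_id, context):
    """Build a [header, body] EVENT log entry as plain Python data."""
    caller = sys._getframe(1)  # the frame that called this function (CPython)
    header = [
        datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "EVENT",
        socket.gethostname(),
        os.path.basename(sys.argv[0]) if sys.argv else "",
        os.getpid(),
        threading.get_ident(),
        provenance_id,
        context,
    ]
    body = {
        "EVENT": {"message": message},
        "file": caller.f_code.co_filename,
        "line": caller.f_lineno,
        "method": caller.f_code.co_name,
    }
    return [header, body]


def demo():
    entry = make_event_entry("cache warmed", "ABC-Superprod.u1:1", "startup")
    return json.dumps(entry)  # one JSON document per log entry
```

Writing one such JSON document per line gives a file that jq and similar tools can read directly.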

Unit Of Work

Type Name: UOW
{   start    => ISO8601 date time stamp with milliseconds
    end      => ISO8601 date time stamp with milliseconds
    duration => duration in milliseconds, integer value
    result   => result type identifier
    metrics  => dictionary, keys are strings values are metric objects
    values   => dictionary, keys are strings, values are strings
}

Metric

{    units => string
     count => integer, the number of times the metric was incremented
     total => sum of all values assigned to the metric
}   

Result Types

NORMAL -  successful completion of work
INVALID - Unit of Work terminated improperly
FAILURE - Unit of Work could not be completed, for example, the requested file does not exist.
EXCEPTION - Unit of Work generated an exception and could not be completed.
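
One way to collect these fields is a context manager, sketched here in Python. The class name `UnitOfWork` and its methods are hypothetical; the sketch sets NORMAL or EXCEPTION automatically and leaves INVALID and FAILURE for the caller to assign to `result`.

```python
import time
from datetime import datetime, timezone


def _now():
    return datetime.now(timezone.utc)


class UnitOfWork:
    """Context manager that collects the UOW fields described above."""

    def __init__(self):
        self.metrics = {}
        self.values = {}
        self.result = None  # caller may set "FAILURE" or "INVALID" explicitly

    def __enter__(self):
        self.start = _now()
        self._t0 = time.monotonic()
        return self

    def add_metric(self, name, value, units):
        metric = self.metrics.setdefault(
            name, {"units": units, "count": 0, "total": 0}
        )
        metric["count"] += 1      # number of times the metric was incremented
        metric["total"] += value  # sum of all values assigned to it

    def __exit__(self, exc_type, exc, tb):
        self.end = _now()
        self.duration = int((time.monotonic() - self._t0) * 1000)
        if self.result is None:
            self.result = "EXCEPTION" if exc_type else "NORMAL"
        return False  # never swallow the exception

    def to_dict(self):
        def stamp(t):
            return t.isoformat(timespec="milliseconds")

        return {
            "start": stamp(self.start), "end": stamp(self.end),
            "duration": self.duration, "result": self.result,
            "metrics": self.metrics, "values": self.values,
        }
```

Typical use is `with UnitOfWork() as uow: ...`, followed by logging `uow.to_dict()` as the UOW body.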

What Next?

Early to Mid-November

  • Review Object definitions
  • Prepare JSON Schema for object definitions
  • Prepare enhanced demo.
  • Meet with Tech Teams (US and UK) and cover proposal.

The Future

  • Collate feedback from Tech Teams.
  • Finalize object and document all data structures
  • Adapt Log::Work to work with existing BB logging facilities
  • Work with Operations to ensure that their tools can digest generated logs
  • Work with other teams to get adoption of the standard
  • Create Provenance ID tooling in Node.  
    • Should client side code be able to generate Provenance IDs?
      • Lean strongly towards "no" - risk of duplicate IDs being injected maliciously.  
      • Instead use session IDs as a correlation ID for UI interaction.
      • Extending tracking would be nice, but need a way to do it safely.
  • Set up proxies/firewalls to strip provenance ID headers from outgoing requests/responses.
    • May need exceptions to handle cases where data round trips through customer servers/services.

See Also

^ Thanks to Hypno-Mark for the section above ^

Types of logging events

0. Requests & responses

Code lives at framework level, should be tracked in ElasticSearch ONLY. 
For all apps.

1. Unexpected exception

Caught by framework, i.e. application crashes, should be tracked in Sentry
Fail-safe catch-all, very useful

2. Expected fatal exception

Programmed in deliberately, "should never happen, but just in case", should be tracked in Sentry
Quite rare, but always some action required by devops.

3. Expected non-fatal, actionable error

Programmed in deliberately, "action required by devops", should be tracked in Sentry

4. Expected non-fatal non-actionable error

Programmed in deliberately, "no action required", should NOT go to Sentry, only to ElasticSearch.
3 & 4 are often confused with each other, leading to an alerts system polluted with non-actionable events.

5. Expected WARN or INFO level events

Also desirable to track these. Programmed in deliberately, should NOT go to Sentry, only to ElasticSearch.
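
The routing above can be written down as a small lookup table. This Python sketch encodes only the destinations named in the list; the enum and function names are hypothetical, and whether Sentry-bound events should also be copied to ElasticSearch is left open because the list does not say.

```python
from enum import Enum


class EventKind(Enum):
    REQUEST_RESPONSE = 0
    UNEXPECTED_EXCEPTION = 1
    EXPECTED_FATAL = 2
    NON_FATAL_ACTIONABLE = 3
    NON_FATAL_NON_ACTIONABLE = 4
    WARN_OR_INFO = 5


# Only the destinations named in the list above are encoded here.
DESTINATION = {
    EventKind.REQUEST_RESPONSE: "elasticsearch",
    EventKind.UNEXPECTED_EXCEPTION: "sentry",
    EventKind.EXPECTED_FATAL: "sentry",
    EventKind.NON_FATAL_ACTIONABLE: "sentry",
    EventKind.NON_FATAL_NON_ACTIONABLE: "elasticsearch",
    EventKind.WARN_OR_INFO: "elasticsearch",
}


def destination(kind):
    """Where an event of this kind should be tracked."""
    return DESTINATION[kind]


def needs_action(kind):
    """True when someone is expected to act on the event."""
    return destination(kind) == "sentry"
```

Forcing every log call to name its kind makes the 3-versus-4 decision explicit at the point where the error is raised.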

Systems

Which systems contain data about events?
  • Grafana - metrics only (i.e. a number not a string, so no error messages here).
  • EKK stack (ElasticSearch, Amazon Kinesis, and Kibana) - receives ndjson files.
  • Sentry - supposed to handle urgent events that require an action, but may be polluted over time with non-actionable events.
  • S3 - some applications may send a complete decoded copy of the request/response to an S3 bucket for future reference

^ Will's work above ^

Reserved keywords - Do Not Use these in your code

Do not use these names, you will get confusing errors...

Perl:

  • sub import (Perl calls a package's import method automatically when the module is loaded with use)


Mojolicious:

  • stash('data', ...) (data is one of the stash values that Mojolicious reserves for rendering)



Notes on "The Principles of Clean Architecture" by Uncle Bob Martin


Shows Rails app structure on slide.  Why do I know this is a Rails app?  Why doesn't it tell me what the application does?  What's important about the app is hidden by the structure of the framework.  Why is the web (merely a delivery mechanism) central to the application structure?  Back in the 90s we thought the web changed everything, but it didn't.  We were duped into thinking that the web was an architecturally significant part of an application.  Shows blueprints of a library and a church.  The use is clear from the structure.

Suggests that Architecture is about Intent.  It's not about tools or libraries, those are details that should be hidden.

Recalls Ivar Jacobson's classic book Object-Oriented Software Engineering: A Use Case Driven Approach.  Use Cases are like User Stories.  Best kept simple and high level.

Application specific rules (use cases) may be represented with "interactor" objects.
Cross application rules are represented in "entities".

Interface/Boundary objects/classes (act like Java interfaces) provide a way for external systems to present requests.

The request model is passed through the boundary into the interactor.  Interactors process the request, control entities, gather results and emit a response object through the interface to the delivery system.

Looks like MVC but is NOT because MVC is misunderstood.

MVC is oldest pattern (predates patterns).  MVC is not an architecture.

Defines MVC as originally described in the 70s.  Model object hides a very small business rule.  There is a controller that is connected to an input device like a keyboard or mouse.  Controller sends input to model.  The view registers with the model, which sends state updates to the view, to trigger updates.  MVC was done in the *small*.  You had an MVC for *a button*.

Now we have MVC for whole pages etc, because:  What happens to any kind of good idea in software is that people know that it's a good idea and they also know that their ideas are good.  And so they figure that their ideas are that idea...and they completely lose the original idea.

Problem in app scale:  Poor boundaries.  Not clear what is view, controller or model logic.  Unless you keep extremely careful discipline the concerns will get mixed.

Talks about Plug In pattern.

Dependencies flow only one direction between a plug in and the application.

Changes in a plug in cannot impact the core application.

Protect your business rules/core application by treating UI and data storage as plug ins.

Things that change frequently should be plugins.

Talks about Model View Presenter.  Presenter gets data from app across plugin boundary.  Feeds data to view model.  View renders view model and updates presenter with UX events.

What about the Database?  We put it central to a lot of our architecture.  Because Oracle fooled us all.

The DB is a storage detail and should be abstracted away.

Use a boundary class that handles every query you might want to make.  The DB code should all live across the plug in boundary.  All SQL, etc generated in the plug in.

This architecture helps with testing, because you can easily remove the DB.
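
A tiny Python sketch of that boundary, not taken from the talk: all class names are hypothetical. The use case depends only on the `PageStore` interface, so a test (or an early version of the product) can plug in the in-memory store, and a database-backed store can be added later without touching the interactor.

```python
from typing import Protocol


class PageStore(Protocol):
    """Boundary: every query the use cases need, and nothing else."""

    def load(self, name: str) -> str: ...

    def save(self, name: str, text: str) -> None: ...


class InMemoryPageStore:
    """Plug-in that keeps pages in RAM; handy for tests."""

    def __init__(self):
        self._pages = {}

    def load(self, name):
        return self._pages.get(name, "")

    def save(self, name, text):
        self._pages[name] = text


class AppendToPage:
    """Interactor: a use case that knows nothing about storage details."""

    def __init__(self, store: PageStore):
        self._store = store

    def run(self, name, extra):
        text = self._store.load(name) + extra
        self._store.save(name, text)
        return text
```

The dependency points one way only: the store plug-ins know the boundary, and the interactor never imports a plug-in.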

Case study:  Deferring a feature out of existence.

Worked on a project called FitNesse (a wiki that shows test results).  They were going to build it on MySQL, but they decided to wait, worked on Wiki text to HTML, and used a mock DB.  Next they needed to do links and navigation, and so thought they were going to do DB storage.  Instead they just used a mock class that stored pages in RAM.  Then they needed to do start up and shut down, and so they were going to add DB support.  Instead they made their mock object serialize the hash to the file system.  That worked well enough, so they eventually shipped the product without database support.

Eventually they added MySQL support for a customer, but even they gave up on the DB.

Lessons:

*A good architecture allows major decisions to be deferred.*

*A good architecture maximizes the number of decisions NOT made.*

*Achieve that by using a plugin model.*

Use cases form the center of the application.

UI, data storage, frameworks, etc are all external to the use cases.

Remember:  Framework authors will screw you.  You make a tremendous commitment to a framework when you adopt it.  The framework author makes NO commitment to you.

Use frameworks because they are powerful and can make you productive.

But don't follow the examples, because the framework author is going to be comfortable with too much coupling to their framework.

Frameworks are great, but keep them at arm's length.

-- Notes by Mr Swayne.

Use cpm to manage your Perl dependencies

cpm is much faster than cpanm because it makes HTTP requests in parallel and caches modules between runs. Its output is also much clearer than cpanm's.

# Define the local lib directory
unset PERL5LIB; unset PERL_LOCAL_LIB_ROOT; eval $(perl -Mlocal::lib=./local)

# Install cpm using cpanm
cpanm -nf -L local App::cpm # 35 modules

# Then manually install any problematic modules (as necessary) that have been moved in/out of core
cpm install Module::Build
cpm install Module::Pluggable
cpm install Archive::Extract
cpm install CGI::Cookie
cpm install CGI

# Finally fetch all the CPAN and other dependencies
export DARKPAN=https://username:password@mydarkpan.example.com
cpm install --resolver 02packages,$DARKPAN --resolver metadb

Note: --resolver is experimental. The standard argument is --mirror

Thanks HypnoMark

# Another form
cpm install --show-build-log-on-failure --resolver metadb --resolver 02packages,$PINTO_MIRROR --feature=client

# s/PINTO_MIRROR/DARKPAN/

Python basics

Most basic: Use the latest Python 3 (or higher)

Use multiple lines in a one-liner with \n:
echo -e "import sys\nsys.exit('Error')" | python3



Coercions in Perl

Lately I've been using Moo instead of Moose, and I came across Type::Tiny Coercions, but I found the documentation difficult to follow.

My colleagues figured out a working example:

"You need an enum type that allows only 'yes' and 'no'.  It's also important that you order your coercions with the most specific first.  (Bool before Str)"

"The arguments inside the `plus_coercions( ... )` are basically `$FromType, $coercion_sub`"

Thanks Nelo and Mark S.

On which table should the foreign key be defined?

For simple relational databases the foreign key is usually defined on one table only:
  • For join tables (linking tables), put the foreign keys on the join table itself.
  • For lookup tables, don't put the foreign key on them, put it on the other (main) table.
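
A small illustration using Python's built-in sqlite3 module. The table names (status, author, book, book_author) are made up for the example: the lookup table carries no foreign key, the main table points at the lookup table, and the join table holds a key to each side.

```python
import sqlite3

SCHEMA = """
-- Lookup table: no foreign key here.
CREATE TABLE status (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Main table: holds the foreign key to the lookup table.
CREATE TABLE book (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    status_id INTEGER NOT NULL REFERENCES status(id)
);
-- Join table: holds a foreign key to each side of the relationship.
CREATE TABLE book_author (
    book_id   INTEGER NOT NULL REFERENCES book(id),
    author_id INTEGER NOT NULL REFERENCES author(id),
    PRIMARY KEY (book_id, author_id)
);
"""


def open_db():
    """Create an in-memory database with the example schema."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
    db.executescript(SCHEMA)
    return db
```

With this in place, inserting a book whose status_id has no matching row in status raises sqlite3.IntegrityError.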

Flat file vs database

Flat file
  • Easy to set up, only have to consider local file permissions
  • Easy to implement in an ad-hoc way, ideal for a prototype
    • Plain text for very simple things like a list
    • JSON/YAML/Perl for complex data structures
  • Doesn't work when the app is load balanced across multiple servers
  • Not automatically backed up
  • Amazon S3 is relatively expensive
  • Have to make your own model
Database
  • Requires up-front schema design - more work, but forces you to consider design of data 
  • You can put business logic in the ResultSet models
  • Requires an instance to be provisioned
  • Easy cross-referencing of data
  • Works when the app is load balanced across multiple servers
  • Backup-as-a-service (i.e. replication)
  • Amazon RDS is cheaper than S3
  • Get the model for free with ORM
Conclusion

For production services that have redundancy (load balanced), always use a database unless the overhead of setting one up for the first time is considered too high for the business.

Database schema management systems

A list:

Some icons for webpages

A list:

CSS/Layout basics

CSS/styling ideas for developers with little front-end design experience:

  • use Bootstrap
  • get someone with experience to do the graphical design, then implement it using CSS
  • turn a list into a grid of links with jQuery
  • If you want to build something from scratch, the basic principles as I have come to understand them are:
    • (1) decide on an element type, e.g.
      • <span> for in-line elements (in a horizontal line),
      • <div> for a vertical line of elements,
      • <ul> for a list, etc.
    • (2) fiddle with the CSS attributes like: float, clear, overflow, border, padding to make it look like you want.
    • (3) use JavaScript if anything needs to appear/disappear based on user interaction.
      • But I usually start with a framework layout like Bootstrap, add pre-made JavaScript libraries for widgets and only do the above steps for any little tweaks needed.
  • The current state of the art is the "CSS Grid" layout (for 2D grids).
  • Enlightenment and inspiration: CSS Zen Garden
    • examples of the same content transformed with CSS to appear totally different
  • Mozilla Developer Network (MDN) has a good CSS reference
  • CSS-Tricks has useful guides
  • QuirksMode tells you which browsers support which features.

Mojolicious basics

How to do common stuff.

Run an app:

MOJO_USERAGENT_DEBUG=1 perl -I lib ~/path/to/morbo --verbose --watch lib --watch local bin/app.pl

View existing routes:

perl -I lib bin/app.pl routes

Find the code that defines the routes:

grep -r '$r->get' .
grep -r '$r->post' .

Install database:

perl -I lib bin/app.pl dbic_migration --action=install

Write a script that uses the app config, etc:

See Mojolicious::Command

Debug Test::Mojo:

print $t->tx->res->body; # See also guide to debugging

Catch unexpected exceptions:

Dump out 2nd-level routes:
my @routes = sort map { $_->to_string } map { @{ $_->children } }
    grep { $_->name eq 'distro' || $_->name eq 'candidates' }
    @{ $t->app->routes->children };

Best Data::Dumper configuration for debugging Perl

This:

  local $Data::Dumper::Indent=0;
  local $Data::Dumper::Varname='';
  local $Data::Dumper::Terse=1;
  local $Data::Dumper::Pair='=>';
  local $Data::Dumper::Sortkeys=1;
  local $Data::Dumper::Quotekeys=0;

(source)

Or:

use Mojo::Util 'dumper';

The best way to embed code snippets in your blogger posts

Use gist.github.com

That is all.

It's very easy, and it's hosted by GitHub.
I like git.

Guide to debugging Mojolicious applications

When MOJO_MODE=development or is not set, you should see exceptions rendered in the browser. But if you don't, try creating a templates/exception.html.ep template that contains this: %= $exception

You can also create error templates for different environments:
  • templates/exception.production.html.ep
  • templates/exception.development.html.ep

Mojo provides useful hooks that can be used to dump out debugging info:

How to use local::lib to install Perl modules locally

First:

cd project_dir
eval $( perl -Mlocal::lib=local )

Install dependencies:

cpanm -L local --installdeps .

Or individual modules:

cpanm -L local Some::Module

Wrap those magic data structures with a class

You understand the dangers of using magic numbers in your code. And magic strings are another face of the same issue. When you use a lot of strings that have special meanings, it makes your code smell bad, i.e. it's an indication of low quality. These are not user messages or log messages, but rather fixed strings that, if mistyped, will break the system's functionality. But it's not just the basic variable types that are magic. Arrays and hashes can also be magic, in the worst possible way.

If you find yourself using a hash in a lot of different places, this can be thought of as a magic hash. It may be a simple hash or it may contain many nested arrays and other sub-hashes. Every operation on the hash has to be done in exactly the right way or it won't work. It's very easy to perform an operation wrong and get unexpected results that won't be detected immediately. You're writing a significant amount of code to read and write the data within the hash, and to catch errors, and you're likely to be creating bugs too. The more code you write, the more bugs you create. This similar, duplicated and boring boilerplate code is spread all over the application wherever the hash is used. It's also likely that you will need to use magic strings for the hash keys, with all the problems they bring. Even if you use an array at the top level, there may be hashes within it.

The solution is to put a class around the hash, so that you only need to write the hash manipulation code once, and can thoroughly unit test it. All the code related to this data is encapsulated in one place. All calling code will interact with the class interface instead of the hash directly. This is an example of object-oriented development, where objects are passed around and manipulated instead of raw data structures.

P.S. Even without using a class, replacing magic strings with constants would be a serious improvement. Maintainers will be unable to accidentally get a string wrong without seeing an error message that makes it very obvious what is wrong. It's a more foolproof way to develop.
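
The same idea sketched in Python, where the "hash" is a dictionary. The `Basket` class and its keys are invented for the example; the point is that the key names and the nesting are known in exactly one place.

```python
class Basket:
    """Wraps the raw basket dictionary so callers never touch its keys."""

    def __init__(self, data=None):
        # The only place that knows the shape: {"items": [{"sku", "qty"}, ...]}
        self._data = data if data is not None else {"items": []}

    def add(self, sku, qty=1):
        if qty < 1:
            raise ValueError("qty must be at least 1")
        for item in self._data["items"]:
            if item["sku"] == sku:
                item["qty"] += qty  # merge with the existing line
                return
        self._data["items"].append({"sku": sku, "qty": qty})

    def quantity(self, sku):
        return sum(i["qty"] for i in self._data["items"] if i["sku"] == sku)

    def to_dict(self):
        """Raw structure, for serialising only."""
        return self._data
```

Callers write `basket.add("A1", 2)` instead of walking the nested structure themselves, and a misspelt method name fails loudly where a misspelt key would silently create a new entry.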

Ruby debugging gems

A list of debugger add-ons for ruby:
  • debug (old, built-in)
    • ruby -rdebug foo.rb
    • # works
    • # but can't view code around current line automatically like Perl's  {{v
  • pry (no step-through)
    • require 'pry' # at the top
    • binding.pry # in code to define a breakpoint
    • ruby foo.rb
  • pry with step through:
  • pry-byebug (requires ruby >= 2.2.0)

Ruby iterator cheat sheet

A list of many useful iterators in Ruby for hashes and arrays:
  • each - Do block for each item. Works on hash or array. Can use 'break'. Returns the original collection.
    • each_index - Same as each but pass the index (not the item) to the block
  • collect (aka map) - Do block for each item. Return array. Same as Perl map.
    • collect! / map! - Same as map, but modify array in place
  • select (aka find_all) - Do block for each item. Return array of the items for which the block is true. Same as Perl grep.
    • select! (aka keep_if) - Same as select but modify array in place
    • reject - Opposite of select. Return the items for which the block is false.
    • reject! (aka delete_if) - Same as reject but modify array in place
  • inject(xyzzy) (aka reduce) - Do block for an accumulator and each item in turn, starting with xyzzy & first item.
  • compact - Return array with nil items removed
    • compact! - Same as compact but modify array in place
  • delete(foo) - Remove all items that are equal to foo
  • clear - Remove all items
  • empty? - Return true if array has no items
  • count - Return number of items
  • length (aka size) - Return number of items
  • flatten! - Modify array to be 1-dimensional
  • shuffle! - Shuffle in place
  • sort! - Sort in place
  • bsearch - Find item using binary search
  • combination(3) - Return all combinations of length 3
  • permutation(4) - Return all permutations of length 4
  • sort_by! - Sort in place using keys in block
  • transpose - swap rows and columns
  • zip - step through two arrays at the same time


(source, source, source)

URLs open in Safari instead of Chrome from Mac OSX Terminal

To set the correct default browser on OSX, go to System Preferences | General | Default web browser.

If the default browser is correct but Terminal is still not behaving correctly, then change the default browser, and change it back again.

TIOA-TIOA FTW

(source)

Detailed Ruby notes

A list of somewhat advanced Ruby topics:
  • Scope of constants, and namespace: link, link.
  • Keyword arguments: link.

How to apply a stash by name in git

1) In git it's easy to retrieve stashes by number like this:

git stash apply stash@{5}

But that number changes as you add more stashes to the list.

2) Instead you could perform some jiggery-pokery in bash to look up the stash by name:

git stash apply $(git stash list | grep "$NAME" | cut -d: -f1)

You can apply multiple stashes as long as they don't overlap.

3) Or you could save the commit and refer to it by tag:

git commit -m'The usual changes to aid debugging'
git tag foo
git reset --hard HEAD^
git cherry-pick -n foo

(-n is --no-commit)

This way if they overlap you can use git's merging/conflict mechanism.

(source)

Adventures in Scala 1.0

Mac OSX:

brew install sbt@1

Try the "Hello, World" example from scala-lang.org:

object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}

Run it:

$ sbt run
[warn] No sbt.version set in project/build.properties, base directory:~alt/learning-scala
[info] Set current project to learning-scala (in build file:~/alt/learning-scala/)
[info] Compiling 1 Scala source to ~/alt/learning-scala/target/scala-2.12/classes ...
[info] Non-compiled module 'compiler-bridge_2.12' for Scala 2.12.4. Compiling...
[info]   Compilation completed in 11.024s.
[info] Done compiling.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:~/.sbt/boot/scala-2.12.4/org.scala-sbt/sbt/1.1.0/protobuf-java-3.3.1.jar) to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[info] Packaging ~/alt/learning-scala/target/scala-2.12/learning-scala_2.12-0.1.0-SNAPSHOT.jar ...
[info] Done packaging.
[info] Running HelloWorld 
Hello, world!
[success] Total time: 16 s, completed 9 Feb 2018, 15:47:03


Well, it worked. But that's a helluva lot of warnings for very little code!

A colleague informs me how to set up another Hello World that's known to work:

sbt new https://github.com/scala/scala-seed.g8
(enter name: hello)
cd hello
sbt run

...but I still get the warnings, although the code does output "hello" as expected.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:~/.sbt/boot/scala-2.12.4/org.scala-sbt/sbt/1.1.0/protobuf-java-3.3.1.jar) to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil

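It seems I've just joined at a time when sbt 1.0 has recently been released (the log above shows sbt 1.1.0 with Scala 2.12.4), and lots of existing code does not exactly make the toolchain happy yet. The "illegal reflective access" warnings are printed by the JVM itself (Java 9 and later) about a library bundled with sbt, not about my code.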

Note: The warning was not displayed the second time I ran the program. Somehow it remembers.

To do:
  • Find out how to avoid the warning.
  • Find out how to repeat the warning!

Why should you validate parameters to all subroutines?

A list of reasons:

  • If you don't catch bad data immediately, it will be propagated onwards and the program may fail in an unexpected way
    • The problem may not even be caught by your tests if the bad data is passed on to an external system and not checked by you


Use MooseX::Params::Validate, or Params::Validate.
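
The principle is the same in any language. As a rough Python illustration (the function `transfer`, its parameters, and the list of currencies are all hypothetical), the checks run before any work is done, so bad data fails at the call site with a clear message instead of travelling onwards:

```python
ALLOWED_CURRENCIES = {"GBP", "USD", "EUR"}  # assumption for the example


def transfer(amount, currency, account_id):
    """Hypothetical function that checks its inputs before doing any work."""
    # bool is a subclass of int in Python, so rule it out explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        raise ValueError(f"amount must be a positive integer, got {amount!r}")
    if currency not in ALLOWED_CURRENCIES:
        raise ValueError(f"unsupported currency: {currency!r}")
    if not isinstance(account_id, str) or not account_id:
        raise ValueError(f"account_id must be a non-empty string, got {account_id!r}")
    # Only validated data reaches this point (and any external system).
    return {"amount": amount, "currency": currency, "account_id": account_id}
```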