A recent kerfuffle over in Yahoo Land has got me thinking about the mis-application of statistics.
http://allthingsd.com/20131108/because-marissa-said-so-yahoos-bristle-at-mayers-new-qpr-ranking-system-and-silent-layoffs/
The assumption is that employee performance follows a bell curve. But you only get a bell curve in your sample of a larger population if your sample is random and the larger population is also a bell curve.
What this means is that if your employees' performance truly follows a bell curve, then your hiring process is no better than a random draw from the applicant pool. In that case you should fire your HR department and replace all those expensive people with a random process that just picks a resume out of a hat. Frankly, I suspect you'd be hard pressed to tell the difference in most organizations.
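The sampling argument above can be checked with a small simulation. This is only a sketch under assumptions that are mine, not Yahoo's: applicant "performance" is drawn from a standard normal distribution, a random hiring process takes 5% of applicants at random, and a screened process keeps the top 5%. The helper names normal_draw and skewness are made up for illustration. Skewness near zero means the sample is still roughly bell shaped.

```ruby
# Box-Muller transform: draw one standard-normal value from a seeded RNG.
def normal_draw(rng)
  Math.sqrt(-2.0 * Math.log(1.0 - rng.rand)) * Math.cos(2.0 * Math::PI * rng.rand)
end

# Sample skewness: close to 0 for a symmetric, bell-shaped sample.
def skewness(values)
  n = values.size.to_f
  mean = values.sum / n
  sd = Math.sqrt(values.sum { |v| (v - mean)**2 } / n)
  values.sum { |v| ((v - mean) / sd)**3 } / n
end

rng = Random.new(2013) # fixed seed so runs are repeatable
applicants = Array.new(100_000) { normal_draw(rng) }

random_hires   = applicants.sample(5_000, random: rng) # resume out of a hat
screened_hires = applicants.max_by(5_000) { |v| v }    # keep only the top 5%

puts format('random hires skewness:   %.2f', skewness(random_hires))
puts format('screened hires skewness: %.2f', skewness(screened_hires))
```

Under these assumptions the random hires stay close to symmetric, while the screened hires are clearly skewed. Any selection process that actually works leaves you with something other than a bell curve.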
Friday, November 8, 2013
Thursday, October 10, 2013
Person Capital and the Government Shutdown
One of the mantras of the information age is
"People are Capital"
In other words, the experience and knowledge of your workforce is a significant capital investment. It is the only capital investment that can walk out the door whenever it gets pissed off enough, and right now all the best people employed either directly or indirectly by the federal government are polishing their resumes and checking just exactly who they know on LinkedIn. I know because that's exactly what I've been doing.
The great lie that has been put out there is that government jobs are cushy and easy and only incompetent people would take them. Well, one way or another I've been employed by the federal government indirectly for every job I've ever had since I stopped waiting tables, cooking or washing dishes in 1987. The reality is that most people in the government work long hours for relatively low wages given their expertise and education. Despite every attempt to belittle it on Fox News, public service is alive and well. Many of the people not getting paychecks are hard-working, competent people who could make more money in the private sector, but choose to work for the federal paycheck because they believe in public service. For me, it matters that what I do advances the knowledge of mankind in some way.
I remember a conversation in high school (1978) about the future and I said my "dream job" would be to work at a "national lab doing physics research". One of my friends who had more exposure to the larger world said "Those are terrible jobs".
Monday, September 16, 2013
Chef Encrypted Data Bags Are A Code Smell.
Chef databags are a centralized key/value store for JSON-based data in the Chef environment. Encrypted databags are exactly what is advertised on the box: the data is encrypted with an external key before it is uploaded to the data store.
Encrypted databags are meant to help solve the problem of dealing with identifier information that you want automatically installed, but that you need to keep private. Examples are database login passwords, the private key of a public key pair and other tokens that allow a server access to additional services.
Encrypted databags provide protection against two kinds of access:
Implicit Access:
This is defined to mean access outside the chef protocol to the underlying data store on your chef server. If you're using a service like Hosted Chef or just practicing general good data hygiene, DB access to the server should only expose "public" data if at all possible.
Explicit Access:
Explicit access goes through the standard chef protocol, either using the ssl key pair created as part of the chef client bootstrap to access data in the chef datastore, or via a knife command using an admin key pair.
The problem with encrypted databags in both use cases is that they only appear to solve the underlying problem by moving it out of the chef workflow. In order to use either protection effectively, you need to create an "out of band" system to manage the shared secret required to access the databag.
In the implicit access protection case, this is likely a worthwhile and manageable cost since the key only needs to be secret from the chef server. However, this only protects against read-only implicit access. If the bad guy has write implicit access to your chef datastore, the game is over on your next client chef run.
The explicit access case is where the real problems arise. In this case, the intention is to prevent some admins/hosts from accessing private data. This however creates an unfortunate side effect in that access to the shared secret becomes an "invisible access control list". Using an encryption key as an authorization object creates problems since you destroy the chain of identity. All you ever know is that "somebody with the key" accessed the data. You need to create a separate tracking and access control system to provide an auditable trail. Encrypted databags don't solve a problem in this case, they just create a whole new set of problems.
Chef Vault is an attempt to get around this problem by creating access control lists using the available private key on each chef client. Without the strong ACL system that either Private or Hosted Chef provide, this is probably the simplest workable solution. Any other solution should implement all of the features of Chef Vault. The most general solution in the explicit access case is ACLs based on the ssl identity of the client. There are many other objects in Chef on which the ACL system of Opscode Chef would provide useful security boundaries.
It's important to remember that more keys is not better security. Encryption keys should be used to provide data integrity, privacy and authentication (i.e. they answer the who questions: who are you? who am I? who sent that message?). They should never be used to answer the what questions (what can I read? what can I write? etc.).
Providing secrets in a scalable and secure fashion is still a largely unsolved problem. But any solution that attempts to use only shared secret encryption will not scale. Public key encryption and rings of shared access seem to be the only workable way forward, and every chef client and admin has a key pair already available. Any scalable solution should be using this existing identity. If you are using encrypted databags to control explicit access, then you are building in scaling, access and audit problems for the future.
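The per-client key idea can be sketched in plain Ruby with the standard openssl library. This is not Chef Vault's code or its storage format, just a minimal illustration of the general technique: encrypt the secret once with a random AES key, then wrap that key separately with each authorized client's RSA public key. The function names seal_secret and open_secret, the envelope layout and the client names are all hypothetical.

```ruby
require 'openssl'

OAEP = OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING

# Encrypt plaintext once with a random AES-256-GCM key, then wrap that key
# for each client. client_public_keys is a hypothetical map of
# client name => OpenSSL::PKey::RSA public key.
def seal_secret(plaintext, client_public_keys)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  data_key = cipher.random_key
  iv = cipher.random_iv
  ciphertext = cipher.update(plaintext) + cipher.final
  wrapped = client_public_keys.transform_values { |pub| pub.public_encrypt(data_key, OAEP) }
  { iv: iv, tag: cipher.auth_tag, ciphertext: ciphertext, wrapped_keys: wrapped }
end

# Recover the plaintext using one named client's private key.
# Clients that are not on the list have no wrapped key and are refused.
def open_secret(envelope, client_name, private_key)
  wrapped = envelope[:wrapped_keys][client_name]
  raise ArgumentError, "#{client_name} is not on the access list" if wrapped.nil?

  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.decrypt
  cipher.key = private_key.private_decrypt(wrapped, OAEP)
  cipher.iv = envelope[:iv]
  cipher.auth_tag = envelope[:tag]
  cipher.update(envelope[:ciphertext]) + cipher.final
end

# Usage: only web01 is granted access to the (made-up) database password.
web_key = OpenSSL::PKey::RSA.new(2048)
envelope = seal_secret('db-password', 'web01' => web_key.public_key)
puts envelope[:wrapped_keys].keys.inspect # the access list is visible
puts open_secret(envelope, 'web01', web_key)
```

The point of the sketch is that the list of wrapped keys names exactly who can read the secret, so the access control list is no longer invisible, and no shared secret has to be distributed out of band.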
Thursday, May 23, 2013
There Are Only Two Ways to Save Money with Software
To paraphrase a Civil War general:
"The only good code is deleted code."
I have recently been in several meetings where the phrase
"We'll save resources by doing things with new software",
has been used as a justification for change.
This is one of the great lies that drives Silicon Valley. Every line of software is a cost liability.
The only way to save money with software is to reduce the number of lines of software you run or to use software to make the problem somebody else's problem. Every advance in computer software engineering has followed these two principles.
Compiled languages follow these principles both by reducing the number of lines of code required to solve a problem and by handing the problem of turning those lines of code into executable instructions to the people who write compilers.
Operating systems follow the same principles; instead of having to manage access to millions of blocks in both memory and disk, you write enough software to make it the user's problem.
The only way to save resources with new software is to delete old software, or to make the things that the old software was responsible for somebody else's problem.
New software can make doing new things cheaper, but that is a completely different problem.
Wednesday, May 22, 2013
How to build ruby 1.9.3 with libyaml in a funky place
The only solution I've found is to hack the location of the libyaml library and include files
into the extconf.rb file in ext/psych
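# Site-specific locations of the libyaml headers and library
header_dir = '/afs/slac/package/ruby/@sys/include'
library_dir = '/afs/slac/package/ruby/@sys/lib'
# Use these as the default include and lib directories for libyaml
dir_config 'libyaml', header_dir, library_dir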