Thursday, October 25, 2012

IT Architects in bonus-driven organisations

Over the years I have observed people getting excited and then disappointed when approaching IT architecture efforts in mid-size (under 500-600 people) bonus-driven organizations. I cannot recall a single instance where a forethought architecture allowed a continuous effort to survive - and continue to extract value from past efforts - for more than three years. Even when the business environment was relatively stable, individuals' need to justify their bonuses drove unnecessary changes into the system. With bonus periods ranging from 3 months to 1 year, there seems to be a misalignment with the timeline of benefits from a thoughtful architecture, which realize over 1 to 5 years.

I have put together some thoughts on the risks IT Architects may encounter in mid-size bonus-driven environments. From this unscientific elaboration, it seems to boil down to the balance of power.

There are organizations that set architects on the business and technology sidelines, with no organizational or financial authority over the development process. The common rhetoric in this case is: "If your architecture suggestions are so good, people will take them up to better their own future." There are a few problems with that:

  • Like almost anyone else, architects are too busy to address improvements to a working system and mostly address broken stuff. Which means that in 99% of cases they are messengers of trouble.
  • More often than not, teams and individuals have vested interests in maintaining the status quo - pride, complacency, inertia, misplaced priorities, and lack of resources all contribute to it. That makes the first point look really negative in their eyes, instead of giving it a positive "let's improve" twist.
  • No matter what "leadership" BS floats around about persuasive communication skills, no sweet or persuasive talk will override people's financial and career incentives in the long run. As architectural changes are mostly about the long run, misaligned incentives either require an architect to possess organizational power or make the architect's job impossible.
  • No organization I know of can achieve reasonable alignment between individuals' incentives and long-term organizational incentives in the context of a bonus system, so prominent in financial companies. That amplifies the adverse effects of the previous points.

So, which environments seem to be receptive and able to extract long-term value from an architect's skills?

Obviously, there are environments that explicitly delegate power to architects. Usually those are larger organizations. The upside is that architectural effort and continuity are possible. The downside is that those organizations are often perceived as bureaucratic, mind-numbing environments.

Secondly, there are situations where technology managers are architects by nature, experience, or training. That is where architects-by-role get the necessary powers implicitly. The upside is that the leader's specific group will benefit from the architectural effort. The downside is that the effort will not necessarily be aligned with the organization as a whole, but rather with that manager's incentives.

Another scenario is when individual technologists possess the talents of an architect. There are tangible benefits for an organization in that. Those developers are usually worth their weight in gold every quarter - you get the idea. They are, however, subject to the same downside as a manager-architect: their incentives impact the architecture too much.

The next scenario is when an executive in a mid-size organization has a strong conviction of the necessity of architectural work and heavy-handedly enforces the power of the architect's word. The downside is that the architect's position is politically very unstable. It is also a hard position to fill with an external architect (versus a home-grown one): an experienced architect will have a hard time taking a promise of organizational support at face value. Only a few mid-size organizations are willing to build the architect's powers into the org chart with real executive weight behind them - exactly because they are so bureaucracy-averse, and that seems to them like a bureaucratic thing to do.

There is one other scenario, which I can only speculate about. Consider an architect who is an agent of business operations, paid by business operations, who interfaces with technologists and, as a customer representative, can demand a certain architecture. That may only work if the technologists are not defensive about someone telling them what to do in such detail. There are two other problems with this scenario: I have never seen it done (or talked to a witness), and it risks degenerating into the "throw-over-the-fence" development model - the one I do not like.

In a flat-compensation organization, where bonuses play a barely noticeable role, some of these concerns are minimal or not applicable.

If I missed a scenario - let me know. I am not saying that an architect's position is a no-go in a bonus-driven organization. I am merely raising awareness of the risks involved, and of the importance of understanding how (and whether) the organizational structure supports the role of architect.


Sunday, September 30, 2012

Cross your "tee"

In dealing with bash scripts we often need to troubleshoot a data pipeline. Many times I have seen the tee utility injected into the pipeline as a means to copy data into a log file. Often those tee calls are left behind - they seem innocuous. It is a pass-through, changes nothing, right?

Not necessarily so. Look at the way we write pipelines:

producer | tee file.log | consumer

The fact that the producer sends data to tee through a pipe means that there is a buffer allocated for the data transfer, most commonly a mere 64 KB (4 x 16 KB buffers) in size. If the receiving process is not fast enough to take data off that buffer and the buffer is full, then the producer process will block on the next write. That is a totally reasonable architecture in many cases. However, if you have a fast producer-consumer pair and inject a slow tee between them, you will pay a performance penalty. Look at this very simple scenario.
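A minimal sketch of the effect (not the benchmark below; slow_passthrough is a made-up stand-in for a slow middle stage such as a tee pointed at a remote mount):

```shell
#!/bin/bash
# The pipeline finishes only as fast as its slowest stage: `seq` could
# complete instantly, but the deliberately slow pass-through paces it,
# and with enough data the producer would also block on a full pipe buffer.
slow_passthrough() {
  while IFS= read -r line; do
    sleep 0.01               # simulate a slow stage (e.g. tee to NFS)
    printf '%s\n' "$line"
  done
}
seq 1 100 | slow_passthrough | wc -l   # prints 100, but after ~1s instead of instantly
```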

Here is what is in the directory:

$ l
total 19531296
-rwxr-x---  1 vlad  staff  -   38B Jun 21 16:30 cat2null*
-rwxr-x---  1 vlad  staff  -   44B Jun 21 16:22 catcat2null*
-rwxr-x---  1 vlad  staff  -   50B Jun 21 16:24 catcatcat2null*
-rwxr-x---  1 vlad  staff  -   60B Jun 21 16:23 catntee*
-rw-r-----  1 vlad  staff  -  9.3G Jun 21 16:21 large_file
-rwxr-x---  1 vlad  staff  -  343B Jun 21 16:21 mk10g*

That is a 10-gigabyte file and a few scripts. The large_file consists of 99-character-long lines.

The cat* scripts have a relatively slow producer, reading data from a local disk (an SSD in my case), and some very fast consumers, which either promptly discard the data into /dev/null or put it through other supposedly fast interim pipes.

Note how scripts with supposedly fast pass-through stages take significantly longer with each extra buffered stdio pipe involved (the cache is primed to create comparable conditions):

$ cat large_file >/dev/null; for i in ./cat*; do echo ""; echo "----- $i -----"; cat $i; echo -n "-----"; time $i; done; echo ""

----- ./cat2null -----
#!/bin/bash
cat large_file >/dev/null
-----
real 0m7.929s
user 0m0.027s
sys 0m3.130s

----- ./catcat2null -----
#!/bin/bash
cat large_file | cat >/dev/null
-----
real 0m8.688s
user 0m0.242s
sys 0m6.435s

----- ./catcatcat2null -----
#!/bin/bash
cat large_file | cat | cat >/dev/null
-----
real 0m8.668s
user 0m0.393s
sys 0m9.941s

----- ./catntee -----
#!/bin/bash
cat large_file | tee /dev/null | cat >/dev/null
-----
real 0m9.399s
user 0m0.639s
sys 0m12.258s

It is very telling. Interim cat filters and a tee to /dev/null are not perceived as expensive, yet they turn out to consume significantly more system time. I am running on a single-CPU, multi-core MacBook Pro, and the scripts do not exhaust the available cores, so the wall (real) time does not grow as fast as the system (sys) time does. That is, if a roughly 9% jump for the first pipe introduced is not a big deal for you. Keep in mind that on a busy system, or with highly concurrent software where the existing cores are tightly scheduled, that sys time will swiftly spill over into real time. And you will not like the slowdown.

I have seen a production process on a twelve-core server speed up eightfold when a logging tee that pointed to an NFS mount was removed. The point is that mindless cruft has a good chance of hurting your system.

Beware. Keep your scripts clean. Remove unnecessary pipe components, especially those that do I/O.
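One pattern that helps keep a troubleshooting tee from being left behind is to make it conditional (a sketch; the DEBUG_LOG convention is my own, not from the scripts above), so the extra pipe stage exists only while you are debugging:

```shell
#!/bin/bash
# Inject tee only when DEBUG_LOG is set; the normal path has no extra
# pipe stage at all. `seq` and `wc -l` stand in for producer | consumer.
run_pipeline() {
  if [ -n "${DEBUG_LOG:-}" ]; then
    seq 1 5 | tee "$DEBUG_LOG" | wc -l
  else
    seq 1 5 | wc -l
  fi
}
run_pipeline                               # fast path: producer | consumer
DEBUG_LOG=/tmp/pipeline.log run_pipeline   # debug path: also copies data to the log
```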

P.S. If you would like to reproduce the test on your system, here is the mk10g script to generate that 10 gigabyte test file:

#!/bin/bash
d="1234567890123456789"
l=${d}0${d}0${d}0${d}0${d}   # a 99-character line
k="${l}\n${l}\n${l}\n${l}\n${l}\n${l}\n${l}\n${l}\n${l}\n${l}\n"   # 10 lines = 1000 bytes

rm -f 1m 1g large_file

for (( i=0 ; i<1000 ; i=i+1 ))
do
  echo -en "${k}" >> 1m
done

for (( i=0 ; i<1000 ; i=i+1 ))
do
  cat 1m >> 1g
done

for (( i=0 ; i<10 ; i=i+1 ))
do
  cat 1g >> large_file
done

rm -f 1m 1g


Monday, April 2, 2012

Should consumer privacy be a regulated issue?

It feels like the recent criticism of Google's unified privacy policy is a misplaced focus. The controversy centers on the company, while it should focus on the practice. There are three new problems with large media companies:

Ecosystems are too big to ignore

Commercial ecosystems on the internet become big and unavoidable by definition of their success. This matters because there are more and more professions whose professional success requires participation in certain ecosystems. In my mind, LinkedIn and Facebook are the front-runners by that criterion. Making people aware of privacy policies and turning them away if they do not agree is a strong-arming policy that serves only the ecosystem operator, not the participating people or society at large. It is a pretend choice, not a real one.
An example is LinkedIn for some types of businesses. Can a technology recruiter survive these days without a LinkedIn contract? I do not think so.

The sender forces you to subscribe to a policy

People often operate as guests of ecosystems. They may have financial or personal needs to attend to content offered by ecosystem participants. There are other drivers as well that deprive visitors of real choice.

All-or-nothing approach - especially in hidden contracts

When an application asks a user to accept its privacy policy, there are privacy-related functions which the user has to take or leave as a package. Even though a privacy setting may affect only a specific function, not the whole application, the user who rejects it will need to abandon the application.
As a matter of fact, Google shows an example of the opposite - one can browse Google Maps on Android with or without GPS turned on.
This is especially a problem when one buys a phone at a carrier shop, where it is broadly advertised as having an application ecosystem and specific applications in it.
A user who buys into the contract after liking the advertised application set is unpleasantly surprised by the terms and conditions (in addition to the privacy policy) of, for example, Google Play. Talk about feeling strong-armed.

So, if all this coercion by companies is so prominent and still goes largely unchecked, is it time to ask for regulation?

Monday, February 20, 2012

Parameters for setTimeout payload (JavaScript)

A step in building an implementation of Conway's Game of Life with my son: making the game turns run on the web page. What is not obvious to him yet is why it is important to avoid global objects, whether fed as parameters to the setTimeout function or just kept around. But it can be done - easily:

var game = function(times, interval){
    game.turn = function(left){
        if (left > 0) {
            console.log("You have " + left + " turns left");
            // any arguments after the delay are passed to the callback,
            // so the remaining count travels as the setTimeout payload
            setTimeout(game.turn, interval, left - 1);
        } else {
            console.log("Game over!");
        }
    };
    game.turn(times);
};

game( 10, 1000 );

Tuesday, January 31, 2012

NodeJS: install oddball tarballs with NPM

It is useful at times to install the head version of a module, or a branch that is not in the npm.js repository. No big deal if the module is hosted on github.com: npm can install from tar files available over HTTP, and GitHub allows simple URL access to tarballs.

The UI process: go to the repository of your choice, click on Code, and select a branch. Otherwise, go to Tags or Downloads on the right side of the page. Right-click on a tarball link of your liking and copy it to the clipboard. Then simply run npm install <paste_here> to install the module.
To enter the URL directly, run

npm install https://github.com/USER/REPO/tarball/BRANCH_TAG_or_SHA
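If you do this often, a tiny helper can assemble the URL (a sketch; gh_tarball is my own made-up function, and caolan/async is just an example repository):

```shell
#!/bin/bash
# Assemble a GitHub tarball URL for npm from a user, a repo, and a ref
# (branch, tag, or SHA); pass the printed URL to `npm install`.
gh_tarball() {
  local user="$1" repo="$2" ref="${3:-master}"
  printf 'https://github.com/%s/%s/tarball/%s\n' "$user" "$repo" "$ref"
}
gh_tarball caolan async v0.1.22
# -> https://github.com/caolan/async/tarball/v0.1.22
```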

Keep in mind that you may royally mess up versioning using this method. Be mindful.