Philosophical aspects of software architecture
This article is aimed at top-level system architects and scientifically minded researchers; however, I hope that even junior coders will find many interesting and useful things in it.
The Matrix is watching
Even the simplest digital system is an interactive simulation system; in fact, it is an extension of our world. It is not like the Matrix movie, but the principle is the same: it is an interactive simulation.
The remote control in your hand that turns your air conditioner on is an interactive simulation system.
You do not believe it? It is primitive; however, it is based on the Turing machine, which just crunches numbers. By the way, what is a number? A number is an abstraction made up by humans to simulate reality. There are no numbers in our universe. The world around us is 100% analogue. We use numbers and booleans to describe the universe surrounding us. A description is always a model of something, and a model is always a simulation, static or dynamic. What, for instance, is a Conan Doyle story about Sherlock Holmes? Was Sherlock real?
No, the character was purely fictional, in other words simulated. Concepts such as "Yes", "No", "Bigger" and "Smaller" are just abstractions. They do not exist in reality. If for any reason humankind disappears, such things as "yes" and "no" will disappear with us, because they exist only inside our brains. We make them up.
The world without objects
Objects do not exist in the universe either. We make objects in order to abstract one part of the universe from another. How about the stars? Do they exist? Of course they do; however, there is no physical boundary between a single star and its star system, the star system is part of a galaxy, and so on. It all depends on the way we look at it. If we deal with a star, we focus on the star object and ignore its surroundings as far as required.
In software development, we use objects because it is the only way to overcome the complexity of our real world. An object is always a model of something real that we deal with.
Of course, we can make objects that are not models of real objects, but they are still models of models, which were derived from reality for a simple reason: reality (the Universe) is the primary source of everything. That is why object-oriented (OO) programming is so important and ubiquitous. Encapsulation and polymorphism are just formal methods of dealing with objects.
Now we can see different paradigms in the software development world, such as the emerging functional programming; previously there was procedural programming, and we hear voices saying that OO programming will soon be gone.
This one, for instance:
http://www.smashcompany.com/technology/object-oriented-programming-is-an-expensive-disaster-which-must-end
Personally, I think that it will never happen, because the object is the basic concept of our world. Remove objects from our set of concepts and everything will disappear. The object is the building block of any virtual (simulated) reality.
The system
A system is a collection of objects that interact with each other. How? Interaction in the IT world always means sending a message to a target and receiving a response (if any).
Sync vs async
Communication patterns
The complete communication pattern is a request-response. Request-response is always synchronous; we have to wait for the response.
When building a system, we have to choose the communication pattern carefully, or rather the patterns, because a complex system usually requires more than one control channel.
Usually it is a carefully chosen combination of sync and async methods.
Let us look at both patterns.
The synchronous approach implies that the caller stops and waits for the end of the execution.
The asynchronous approach, on the contrary, sends the command and continues execution without waiting for the result.
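The difference between the two approaches can be shown in a few lines. This is only a minimal sketch in Python; the names slow_task, run_sync and run_async are made up for the illustration and do not belong to any framework.

```python
import threading
import time


def slow_task(log):
    """Stands in for any remote or long-running operation."""
    time.sleep(0.1)
    log.append("task done")


def run_sync():
    log = []
    slow_task(log)                   # the caller is blocked until the task ends
    log.append("caller continues")
    return log


def run_async():
    log = []
    worker = threading.Thread(target=slow_task, args=(log,))
    worker.start()                   # returns at once, the task runs on its own
    log.append("caller continues")
    worker.join()                    # only so that the demo can inspect the final log
    return log
```

In the sync version the caller's line is logged after the task has finished; in the async version it is logged first, because the caller did not wait.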
Historically, communication became mostly synchronous. Examples: Remoting, WCF, CORBA and others. In their typical use they are all sync. There were apparently two reasons for that: the popularity of the HTTP protocol and the rise of RMI (remote method invocation).
HTTP is a stateless protocol. Maintaining the connection is not required. Open the connection, send the request, immediately receive the reply and close the connection. That was the idea in the early days of the global web. Perhaps at that time it was justified: the systems were very primitive and the "request-response" pattern covered all the needs. Not any more, though.
RMI has also contributed to the sync pattern. The idea was to execute commands remotely in the same manner as they are executed locally. Wow! How convenient: you do not even have to care where the target is, here, in Japan or on the Moon.
Life is more complex
Imagine you are writing a letter to your fiancée asking her to become your wife. You drop the letter into the post box and wait for the reply. Do you stop eating, going to work or brushing your teeth? Unlikely; otherwise your bride risks becoming a widow instead of a wife.
So, it appears that sending the letter is asynchronous. You send and forget?
Not quite. The response, if it comes, will change your life. In other words, it will change the state of the system (you). Well, it looks synchronous again, but what about brushing your teeth?
So we can clearly see that the behaviour of the system (the interaction between you and your bride) cannot be covered by the existing common patterns, and if you are designing a real-time system (I would rather call them real-life systems), you have to stop relying on the plain sync and async patterns. They are simply not sufficient. They cover only a very limited number of cases, yet we keep pushing them instead of thoroughly reviewing them.
The pattern Async_WithConfirmation_and_Timeout covers all the cases, including pure async and pure sync. Just set Confirmation = false and timeout = 0 and the pattern becomes purely async; set timeout = infinity and Confirmation = true and we have the pure sync pattern.
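One possible reading of this pattern is sketched below in Python. The function name send, its parameters and the (confirmed, reply) return value are assumptions made for the illustration; the article itself only names the pattern. A timeout of None plays the role of infinity.

```python
import queue
import threading
import time


def send(handler, message, confirmation=True, timeout=None):
    """Deliver message to handler on its own thread.

    Returns (confirmed, reply). timeout=None means wait forever.
    """
    replies = queue.Queue(maxsize=1)
    worker = threading.Thread(
        target=lambda: replies.put(handler(message)), daemon=True
    )
    worker.start()
    if not confirmation:
        return (False, None)          # pure async: send and forget
    try:
        return (True, replies.get(timeout=timeout))
    except queue.Empty:
        return (False, None)          # no confirmation arrived in time


def slow_echo(message):
    """Hypothetical target that needs 0.2 s to answer."""
    time.sleep(0.2)
    return message.upper()
```

With these assumptions, send(slow_echo, "hi", confirmation=False, timeout=0) behaves as pure async, send(slow_echo, "hi") behaves as pure sync, and anything in between, such as timeout=0.01, gives the caller a chance to switch to another scenario when the confirmation does not arrive.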
Was Frankenstein synchronous?
Imagine we built a Frankenstein, a kind of android, in which everything runs synchronously: his every step corresponds to two heartbeats, and so on.
During the construction, we also created the program that controls our Frankenstein. The Frankenstein is successfully built and released into the nearby town. His real life begins. The first problem this guy experiences is the inability to cross the road, because crossing the road requires changing the ratio between the number of heartbeats and the number of steps. Even if the controlling program is perfect (let us imagine the unimaginable), the physics of the universe we live in will not allow him to follow the program strictly. The macro world is still built of subatomic particles, and they are governed by quantum physics with its Heisenberg uncertainty principle. Even a perfect program will fail eventually, so our system must adapt to the changing world around it.
The connection
Connectionless protocols are becoming less popular because they cannot assess the state of the system they are dealing with, and the state of an object is a fundamental property of reality, not just a software factor. Do not forget that without state (memory) the Turing machine cannot exist.
Client-Server is not good enough?
A typical distributed system is now based on the client-server architecture, where the client communicates with the server synchronously. Intuitively, developers feel that this pattern is not sufficient. Look at this article:
http://www.codeproject.com/Articles/491844/A-Beginners-Guide-to-Duplex-WCF
It is an attempt, in fact a relatively successful one, to compensate for the inherent deficiency of the client-server pattern. I say successful, and that is only partly right: nothing can really compensate for the inherent deficiency of the sync pattern. You cannot turn a steam engine into a space shuttle, or a shuttle into a steam engine. They were simply designed for different purposes.
Timeout
The timeout is the most important element in component design. Why is that?
Because we assume that time flows at the same pace on the other side of the network, or even in the whole universe. It is the only parameter that is available without sending or receiving anything; it is also invariable. That is why it is so universal and so valuable.
The timeout and the probabilities
We wait for a bus at the bus stop. The bus is not coming yet.
What is the probability that the bus will come? Well, it all depends on the period we allow for this probability (or rather a mathematical expectation) to materialize. In other words, it is a function of time. At first the probability grows monotonically, and then it starts to drop sharply. What do we do? We wait for the bus, and within the first 5 minutes we do not even think about catching a taxicab.
However, the situation changes; we become desperate and eventually we are ready to take a taxi.
What do we see? What pattern describes the situation? Actually, the expected probability changes the scenario we are following. So we see that it is not a simple timeout: at every individual moment we have a different scenario. Our software is usually not that smart; however, several different levels of timeouts should be implemented.
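Several levels of timeout can be expressed as a small table of thresholds. The sketch below is in Python; the 5-minute mark comes from the bus example, while the 15-minute mark, the action names and the function choose_action are assumptions made up for the illustration.

```python
# (minutes waited, scenario to follow from that moment on), in ascending order.
# Only the 5-minute threshold comes from the text; the rest is illustrative.
BUS_STOP_STAGES = [
    (0, "keep waiting"),
    (5, "consider a taxi"),
    (15, "take a taxi"),
]


def choose_action(waited_minutes, stages=BUS_STOP_STAGES):
    """Return the scenario that applies after waiting for the given time."""
    action = stages[0][1]
    for threshold, name in stages:
        if waited_minutes >= threshold:
            action = name            # the latest threshold reached wins
    return action
```

The point of the table is that the timeout is not one number that ends the wait, but a sequence of moments at which the component changes its behaviour.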
Francisco Scaramanga and the software development
Rule number one of engineering is: do not reinvent the wheel. Take something that exists and improve it (in software terms, this is inheritance). Well, sounds good. What is the best system in the world? So far, in the universe known to us, we humans are the most sophisticated systems. Copying ourselves in C# or C++ code? What nonsense! Not quite. I suggest taking a closer look at ourselves.
If you remember the James Bond movie "The Man with the Golden Gun", you can possibly recollect the character Francisco Scaramanga, the villain and the man with three nipples. An error of nature occurred and the person had three nipples instead of two.
Our genome (DNA), which is the instruction for building our organism, was broken or somehow misinterpreted during the construction. The most important lesson from this error is that the instruction for building our body is not an instruction at all. It is just a recommendation; otherwise, the third nipple would not fit in. Imagine an airplane construction plant. You have the drawing that shows how to build the plane. Is it possible to build, by some mistake, a plane with one extra wing? Even if this extra wing is built, there is no way it can be fitted onto the plane; you would have to redesign all the other bits and pieces. However, unlike our poor three-wing plane, Francisco Scaramanga was fully functional and almost killed our perfect James Bond. How come? The reason is that when Scaramanga was constructed (let us stick to this generic term), the building blocks of the body tried to adjust to each other. It is a mutual adjustment; it is not construction according to a plan.
The conclusion from that is: the more complex the system is, the less coupling there should be between its blocks. Real-life complex systems are always multithreaded, because without multithreading it is physically impossible to decouple the components of the system, and without decoupling a large system is not functional. Decoupling also means that the synchronization between different blocks is external to the blocks themselves. There should be a system manager that synchronizes all the subsystems of the whole system. Systems must be multithreaded not because of performance issues. The major reason is that they must be built from self-adjusting and self-adapting components.
A system built with one thread is always sequential. If your heart waits for a piece of meat to be digested in the stomach, you are doomed to die.
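The heart and the stomach can be sketched as two blocks, each running in its own thread at its own pace, with the only synchronization coming from outside. This is a minimal Python illustration; the names run_component and run_system, the periods and the duration are all made up for the example.

```python
import threading
import time


def run_component(name, period, stop, counters):
    """One self-contained block that keeps its own rhythm in its own thread."""
    while not stop.is_set():
        counters[name] += 1
        stop.wait(period)   # sleep for one period, or wake early when told to stop


def run_system(duration=0.3):
    """The 'system manager': starts the blocks and synchronizes them from outside."""
    stop = threading.Event()
    counters = {"heart": 0, "stomach": 0}
    blocks = [
        threading.Thread(target=run_component, args=("heart", 0.01, stop, counters)),
        threading.Thread(target=run_component, args=("stomach", 0.1, stop, counters)),
    ]
    for block in blocks:
        block.start()
    time.sleep(duration)    # the system lives for a while
    stop.set()              # the only shared signal comes from the manager
    for block in blocks:
        block.join()
    return counters
```

Neither block knows about the other: the heart keeps beating many times while the stomach completes a few slow cycles, and only the manager decides when both stop.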
Choosing the wife and the software design
What a strange question. What is the connection between software design and choosing a partner? Well, there is one, and it is very fundamental.
The reason why biological objects (humans, for instance) have two genders is simple: two is the minimum and yet sufficient number for spreading genes into the wider population.
There could be not two genders but 3 or even 4. However, simply increasing the number of sexes adds nothing functional to the gene exchange mechanism, so two is optimal. Why do we exchange genetic material at all? Would it not be easier to produce children by recombining the genes internally and then giving birth to the new organism, which later enters natural selection as we all do? What is wrong with that? The major problem with this approach is that 99.9999% of the descendants would consist of total genetic garbage and would not be functional.
Instead, with the sex (or rather binary) approach, the organisms exchange bits and pieces that are already functional. Don't we have our father's eyes and our mother's lips? So we inherit functional blocks, and the blocks get recombined at the moment when the child is conceived.
This is a simplified version of genetics; in reality it is far more complex, but the basic idea is that only functional blocks are used for building the whole organism, and a microscopic part is left to mutations.
In the software world we have the same pattern: we use only blocks that were built a long time ago and have had the time and the opportunity to pass the real-life test. When we build everything from scratch, we simply leave 100% of the design to mutations. A typical mutation kills the organism; only a tiny fraction of mutations are useful, but without mutation new species will never appear. So the practical outcome is that the developer has to reuse existing frameworks and reliable patterns as much as possible. Relying only on your home-made software will kill your product, but you have to leave some room for design from scratch; that is how a new breed of software gets created.
The music of the system development
There are thousands, if not millions, of articles and tips on how to write software.
CodeProject has at least a hundred of them.
Take a look at this one:
http://www.codeproject.com/Articles/539179/Some-practices-to-write-better-Csharp-NET-code
It is the most popular article on software development. In my humble opinion, this article is not about software development at all. Just a simple analogy: there is a piano performer and there is a music composer. The piano performer plays only what was written by the composer, nothing more. What all these articles focus on is how to write the notes, what ink to use, what paper, what handwriting style, but they say absolutely nothing about the music itself. Everybody forgets that it is the music that is played, not the note sheets. We all remember Mozart and Bach not because they wrote heaps of note sheets, but because they created the Music.
In fact, all these articles are not about building software; they are all about writing code. The purpose of this article is to show that writing code and building systems that work belong to parallel but different universes. Let us begin our journey to a parallel universe.
Firstly, software, as was shown above, is merely a reflection of the real world we all live in. This fundamental fact is often overlooked, and when software becomes too artificial, it stops working.
Default settings
Everything in our world is defined by probabilities. Even crossing the road can sometimes be fatal. There is always a chance of a catastrophic outcome of anything; on the other hand, the opposite is also true: we can win 50 million in lotto.
When we build a software component, we have to rely on the probabilities of its usage.
Typically, the component has a set of parameters. Naturally, all of them are set to some defaults.
How do we choose these defaults?
The rule is very simple and straightforward: the default must rely on the expected frequency of usage. If 99% of developers set param A to, say, 5 and the remaining 1% set it to 10, the component must be released with the default set to 5. Then, if the parameter is not set explicitly at all, the system will still be functional. That is obvious; however, the major component vendors for some reason keep forgetting this simple rule.
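The rule can be written down almost literally. In the Python sketch below, the usage statistics, the helper most_common_setting and the factory make_component are hypothetical; only the 99%/1% split and the values 5 and 10 come from the example above.

```python
from collections import Counter


def most_common_setting(observed_values):
    """Pick the value that callers choose most often."""
    return Counter(observed_values).most_common(1)[0][0]


# Hypothetical usage statistics: 99 callers set A to 5, one caller sets it to 10.
observed_param_a = [5] * 99 + [10]
DEFAULT_PARAM_A = most_common_setting(observed_param_a)


def make_component(param_a=DEFAULT_PARAM_A):
    """A component that is functional even when nothing is configured."""
    return {"param_a": param_a}
```

The majority simply call make_component() and get a working component; the remaining 1% pass param_a=10 explicitly.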
Imagine you are sending a letter to your beloved girlfriend, and in order for this letter to be delivered you have to specify the color of the envelope, the number plate of the post truck that will carry the letter, the religion of the driver and so on. Perhaps you will change your mind about sending the letter at all. Clearly it is all irrelevant; you just want the letter to be delivered in the default manner, and if you need extra options, like confirmation of delivery, you specify them separately.
However, we have exactly the same situation with WCF and with other components and frameworks.
The configuration required for even the simplest operation is enormous.
What is the difference between the server and the client?
The actual difference is only in who initiates the connection. After the connection is made, there is no difference between the server and the client. The relation between them becomes peer-to-peer, and the canonical software architecture bluntly ignores this fact.
They are no longer the client and the server. They interact with each other.
Let us take an example from real life. You come to a restaurant for dinner. The waiter is a typical server, and you are a client. You ask the waiter to approach, and when he comes, you order the meal. OK, up to this point the relation is client-server, but after the first words the waiter has to clarify which kind of vodka martini you prefer. Shaken, or maybe stirred? In fact, the two of you start talking; it is not as if you keep issuing orders until the very end.
Software (which is a reflection of our world) built from standard components simply cannot do that. The software most programmers use is inadequate. We twist it one way or another, but it is not designed to serve us properly, because the people who designed it in the first place never thought about anything real.
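At the level of a raw connection, the symmetry described above is easy to see. The sketch below uses Python's standard socket.socketpair(), which returns two already connected sockets; the function name dinner_conversation and the phrases are made up for the illustration, and the messages are small enough that each recv is assumed to return a whole phrase.

```python
import socket


def dinner_conversation():
    """Once connected, either side may speak first at any moment."""
    guest, waiter = socket.socketpair()   # connected, fully bidirectional pair
    transcript = []
    try:
        guest.sendall(b"A vodka martini, please.")
        transcript.append(("guest", waiter.recv(1024).decode()))
        waiter.sendall(b"Shaken or stirred?")   # the "server" now asks the question
        transcript.append(("waiter", guest.recv(1024).decode()))
        guest.sendall(b"Shaken.")
        transcript.append(("guest", waiter.recv(1024).decode()))
    finally:
        guest.close()
        waiter.close()
    return transcript
```

Nothing in the connection itself distinguishes the waiter from the guest; the client-server roles exist only in the layers built on top of it.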
Brain surgery and coding
Software that works copies the real world, because the world around us, as we know it, simply works. Let us assume for a moment that you are a brain surgeon, right in the middle of an operation. At this moment your wife calls you and starts talking about the cute kitten that is playing in the backyard. What would you do? Most likely you would hang up and apologize later for not being nice. What would the average software do? I suspect that in 99% of cases it would drop the brain surgery, talk to the wife and, when the business with the cute kitten is finished, get back to the (apparently dead) patient.
So, what was wrong? The priority. We do not think about it much, but our life is a set of priorities, and robust software must prioritize its actions; otherwise it ends up like our unlucky patient. The priority can be static or dynamic, depending on the real task.
Software is firstly a system, and only secondly a sequence of commands.
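A static priority can be sketched with a priority queue. The Python class below, PriorityInbox, is a made-up name for the illustration; it is built on the standard heapq module, and a lower number means a more urgent message.

```python
import heapq
import itertools


class PriorityInbox:
    """Messages come out by priority (lower number first), then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()   # tie-breaker that keeps arrival order

    def post(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._arrival), message))

    def next_message(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

If "talk about the kitten" is posted with priority 5 and "continue the surgery" with priority 0, the surgery comes out first regardless of which message arrived earlier.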
If we have just a couple of components, interaction is easy. Ordinary event handling will do the trick:
Writer.MessEvent += new /...
void HereWeReceive(string mess)
{
    // react to the received message here
}
What if we have thousands of subsystems and they have to interact?
If we just use simple event handling, the system stalls as soon as one of the components develops a fault. Oops! So the system must be built in a way that allows it to ignore less significant signals. That is how it happens in real life. The chirping of a bird up in a tree should not stop our heart from beating.
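One way to keep a faulty component from stalling the others is to give every subscriber its own inbox and its own thread. This is a minimal Python sketch under that assumption; the class Subscriber and the function publish are hypothetical names, not part of any framework.

```python
import queue
import threading
import time


class Subscriber:
    """Each subscriber owns an inbox and a thread, so it can only stall itself."""

    def __init__(self, handler):
        self._inbox = queue.Queue()       # unbounded, so post() never blocks
        self._handler = handler
        threading.Thread(target=self._run, daemon=True).start()

    def post(self, message):
        self._inbox.put(message)

    def _run(self):
        while True:
            message = self._inbox.get()
            try:
                self._handler(message)
            except Exception:
                pass                      # a faulty handler must not kill the worker


def publish(subscribers, message):
    """Hand the message to every inbox; returns without waiting for any handler."""
    for subscriber in subscribers:
        subscriber.post(message)
```

With plain event handling, a handler that hangs would block every handler registered after it; here the publisher only drops the message into each inbox and moves on.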
A really robust system always has more than one level of signal priority, and typically this is implemented by having more than one message delivery system.
In practical terms, there should always be a subsystem that runs in its own thread. Without multithreading it is physically impossible to ignore a useless or wrong signal, because of the sequential nature of our CPUs. Our body also has multiple signal delivery systems (central nervous, peripheral nervous, endocrine, etc.) because they have different speeds and priorities.
The rule of thumb is: the less important the signal is, the lower the probability of delivering it to the core of the system should be; the peripherals should deal with the garbage. The least important signals have to be processed locally without being delivered to the core at all.
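A simplified version of this rule replaces the probability of delivery with a fixed threshold. The Python function below, route_signals, and the importance scale are assumptions made up for the illustration.

```python
def route_signals(signals, core_threshold):
    """Split (importance, name) pairs between the core and the periphery.

    Simplification: a hard threshold stands in for the probability of delivery.
    """
    to_core, handled_locally = [], []
    for importance, name in signals:
        if importance >= core_threshold:
            to_core.append(name)           # important enough to bother the core
        else:
            handled_locally.append(name)   # the periphery deals with it
    return to_core, handled_locally
```

For example, with a threshold of 5, a signal ("pain", 9) reaches the core, while ("bird chirping", 1) is handled by the periphery and never gets there.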
Exceptions
How should exceptions be handled?
So much has been written about it. Is everything that has been written wrong? No, it is not wrong; it is simply good sometimes to look at things from a different angle.
The way an exception is used should depend, first of all, on what we are going to do with this exception. In some organizations there are very strict rules on how exceptions should be handled. Usually they require an error code, a message and something else.
The error codes may be kept in a list with thousands of numbers (typically unsigned integers). Therefore, when an exception occurs, we know the error code. How nice! However, the point is: why would we need the error code in the first place?
We need the error code only for recovery from the fault, so that the system is able to undertake some action to recover. However, in 99.99% of cases no such intelligent recovery system has ever been implemented, and in fact that might be right. The design of such a recovery system is already a challenge and is usually a waste of resources.
So why do we need to maintain tables with thousands of error codes?
As we can see, the designers of such a system did not consider that the exception handling system is not only a recovery system; it is also a signal delivery system, and a signal, once delivered, must be interpreted, otherwise the delivery does not make any sense. A signal that was delivered and not interpreted is garbage by definition. The smart designer has to take this into consideration: what to deliver and, most importantly, why. Getting back to practical code, the rule is: the error message is usually the most important information, because it is interpreted by humans when other systems fail, whereas the error code is more or less optional, depending on what is implemented in terms of fault recovery. Usually that is nothing.
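In code, this rule means a mandatory message and an optional code. The Python sketch below uses made-up names, ComponentError and describe, purely as an illustration.

```python
class ComponentError(Exception):
    """The message is mandatory because humans read it; the code is optional."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code    # useful only if some recovery logic interprets it


def describe(error):
    """Render the error for a human, mentioning the code only when there is one."""
    if error.code is None:
        return str(error)
    return f"{error} (code {error.code})"
```

A component without any recovery logic raises ComponentError("disk is full") and nothing more; a code is added only where something on the receiving side will actually interpret it.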
Redundancy
"That which does not kill us makes us stronger."
Friedrich Nietzsche
What is redundancy? Redundancy is the excess resources that can be used in case of emergency.
Racing car example
What about redundancy in a racing car? Well, it must be zero. The ideal racing car should fall apart right after it crosses the finish line.
Have you seen a healthy old person? One day he falls ill, nothing serious, probably the flu, and a few days later he dies from kidney failure. Why? He looked healthy.
In fact, he not only looked healthy, he was healthy. Why did he die? He died because all his redundancies were exhausted, and any external cause (the flu in our case) could kill him. What happened? The flu simply triggered a chain reaction: it stressed the immune system, then the failure of the immune system caused a kidney infection, and the person died. What does it have to do with software? The same thing: software modules that have some degree of freedom must have redundancy; otherwise, any stress on an individual component will provoke a chain reaction and eventually cause a catastrophic failure.
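The simplest form of redundancy in software is a spare that takes over when the primary component fails. The Python sketch below is only an illustration; the function call_with_spares and the idea of passing the providers as a list are assumptions.

```python
def call_with_spares(providers, request):
    """Try the primary provider first, then each spare in turn."""
    failures = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:
            failures.append(str(exc))   # remember why this one failed, try the next
    # All redundancy is exhausted: only now does the failure reach the caller.
    raise RuntimeError("all providers failed: " + "; ".join(failures))
```

As long as at least one spare is alive, the stress on one component stays local; the failure propagates to the rest of the system only when the last spare is gone.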