Logins and Transactions
This chapter deals with the following concepts: identification, authentication, confirmation, authorization and financial transactions. It deals with the actual infrastructure that supports numerous real-life activities, ranging from using our phone, to browsing the internet, to buying or selling something.
This chapter currently has only two sections. The first section, The Problem, elaborates the problem that we are trying to solve. In short, the problem is of providing the ideal identification, authentication and authorization mechanism. It is only a problem from our current experience and perspective. The second section provides The Outline Solution to the problem.
The Problem
In our society we have "no stealing" and "no trespassing" laws. In short, these laws prohibit us from doing anything with something that does not belong to us.
Let us consider our home; the place where we generally live. While we may have laws that tell people to not violate our rights related to our home, we also try to reduce temptations for others by having doors on our home and keeping them shut and locked from inside when we are at home. Similarly, when we leave our home, we shut these doors and lock them from outside. In a way, we make it difficult for others to break the law. When some people break these "no stealing" or "no trespassing" laws, we deem it to be an intentional infraction of the law and we give such people some adverse consequences so as to dissuade them from repeating their behavior.
In the digital world, we have many things that we own. Currently, we attempt to protect them with passwords. The login screen is like the shut and locked door and our userid and password together is like the key to open the locked door.
In the physical world we may call those who use our property without our permission bad guys. The bad guys, in order to gain entry to our home, could simply break down the doors of our home or they could steal our key or they could pick the lock. In the digital world the bad guys are called hackers and they do the equivalent of breaking down our door or stealing our key or picking our lock.
When we are thinking about security, we are thinking about ways and means of making it difficult, and ideally impossible, for others to use our things without our permission. Locks and keys are our primary tools.
Currently, in the digital world, we could consider "our stuff" as being placed in the care of several websites. A login screen is like an intelligent door to a website. We identify ourselves with our userid. The userid is the equivalent of our name in the digital world. If we are not even authorized to enter some website, we won't have a userid for that website. Our password is like our key to gain access to our stuff that is placed at the website. We gain access to our stuff by getting in front of the intelligent door (the login screen), we identify ourselves (by our userid) and then we open the door using our key (that is our password).
The idea of userid and password is simple and easy to implement in the digital world. Thus, it started out with every website implementing these concepts. Since it is easy to implement, there was never a need for any coordination between two organizations implementing such things for their websites; they did it independently. And thus we have a different userid and password for each of these websites.
Not just websites, even devices (like our phones and laptops) have this notion of a local user. From a device's perspective, a local user has an entry in the device's database of users. This database could be very small and may contain only a handful of users (as on a home computer). Each such user has a userid (a nice short name) and a password. Thus any userid that is not in this local users database cannot log into this device.
Everything said in the context of websites and our devices also holds true for logging in to applications installed on our electronic devices (computers, phones, tablets). We do not want anyone to merely get hold of our device, start it and use the installed applications, as there may be things stored in those applications that are supposed to be private, personal, confidential and even secret. Perhaps the application lets the hacker do things that we would consider undesirable or harmful, and if such a thing actually happens then it will appear that we have done it. So, all these installed applications need to identify who is invoking them and authenticate that they are really who they say they are. How many applications actually do it?
Then there are organizations and their computers. An organization has userids for its employees and could have a single database that is consulted by the computers and other devices owned by the organization to allow people to log into them.
From our perspective, each one of us may end up having over a hundred userids and their associated passwords. Since each password is like a key, essentially, it is as though we are carrying around with us a key-chain with a hundred keys!
Remembering a hundred different passwords is not easy, so people started having a handful of passwords for multiple userids. Even remembering a few passwords is not easy unless the passwords themselves are easy. Hence people started choosing easy passwords (like "1234" or "password"). Similarly, remembering a hundred different userids is not easy and since a userid is like our name, given a choice, we tend to choose a userid that matches our name as closely as possible. Now if the passwords that people choose are easy and the userids are a close match to their actual name, both of these are easy to guess. This makes it easy for bad people to guess other people's userids and passwords and gain access to their accounts (that is the userid-password combination).
To counter this, technical people started "educating" the rest of us about why we need to have "strong passwords". Passwords are considered to be strong if they are difficult to guess. Such passwords need to be a long string of digits, letters and symbols that seems to be randomly picked. This makes the passwords nearly impossible to guess, but not impossible to "crack" or "hack" using programs. To counter this "crack" or "hack" issue, the further guidance that we receive from technical people is to change our passwords frequently.
Thus, the need to have seemingly random and long passwords that also need to be changed frequently makes these passwords impossible to remember for almost all of us.
Further, our userid on various websites may not be the same and that means we need to memorize all these website-userid-password combinations. What was already a very hard task now becomes even harder. That means that either we don't use randomly generated passwords (which is bad as they can be easily guessed) or we use the same randomly generated password for many websites (which is also bad because cracking the password for one website means the hacker has cracked our password for many of our website accounts).
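The gap between an easy password and a strong one can be put in rough numbers. The sketch below is only an illustration: the function name and the two example cases are assumptions, and the estimate holds only when every character is picked uniformly at random (a human-chosen "1234" is far easier to guess than the figure suggests).

```python
import math

def guess_space_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy of a password whose characters are chosen
    uniformly at random from an alphabet of the given size."""
    return length * math.log2(alphabet_size)

# Hypothetical comparison (illustrative values, not from any standard):
# a 4-digit password versus 16 characters drawn from the
# 94 printable ASCII characters (space excluded).
weak_bits = guess_space_bits(10, 4)
strong_bits = guess_space_bits(94, 16)

print(f"4 digits:      about {weak_bits:.1f} bits")
print(f"16 characters: about {strong_bits:.1f} bits")
```

Each additional bit doubles the number of guesses a program needs in the worst case, which is why length and randomness matter more than any single clever substitution.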
The main problem of this system of identification and authentication is that we have to remember too many random things and that is well beyond the capabilities of most humans. But, because userids and passwords are easy to implement and give to people and since such implementations don't need any coordination with others' implementations, organizations and devices continue to do it. Essentially, to overcome what technology has not yet implemented, we are asking ourselves to do things that are humanly impossible.
To help us manage our passwords, some technical people created a software solution in the form of a password manager. When we use a password manager, we store a combination of a website, our userid for that website and our password associated with our userid for that website into the password manager. When we visit the website, we use the password manager to "fill" the userid and password fields that the website presents to us to login. The password manager can also generate random passwords for us to be used by us on various websites.
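A minimal sketch of the two jobs described above, generating a random password and storing a website-userid-password combination, might look as follows. It uses Python's standard `secrets` module; the names `generate_password`, `store` and `fill`, and the in-memory dictionary, are assumptions made for illustration. A real password manager would encrypt the stored data under the master password.

```python
import secrets
import string

# Letters, digits and symbols: 94 characters in total.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 20) -> str:
    """Return a password of cryptographically random characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

# Hypothetical vault: website -> (userid, password).
# Kept in plain memory here only to keep the sketch short.
vault: dict[str, tuple[str, str]] = {}

def store(website: str, userid: str, password: str) -> None:
    """Remember the userid and password used on a website."""
    vault[website] = (userid, password)

def fill(website: str) -> tuple[str, str]:
    """Return the (userid, password) pair to fill into the login screen."""
    return vault[website]

store("example.org", "abc-xyz", generate_password())
userid, password = fill("example.org")
```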
Because the password manager is storing something of value to us, access to this stored information also needs to be protected by a password. So, the problem of passwords is not gone, it has been reduced to a manageable proportion.
Then there is the problem of many different software creators creating their own password manager software. Now we have to choose a specific password manager software from among the several available choices. It is a problem because most of us really don't know how to evaluate whether a specific password manager is good or not. We simply trust others who express their opinions about the goodness of such password managers. Those who we trust may themselves have no true sense about the actual goodness of various password managers; they may just have their opinions. Finally, we don't have the knowledge to judge others on their competence in choosing a good password manager. Thus, if we start using a password manager, we do so by happenstance.
What happens if we forget the password to our password manager? We lose access to all the websites for which we stored userid-password combinations in that password manager.
More recently, instead of or in addition to the password, some websites send a one time password, abbreviated as OTP, to the person's phone and the person has to enter the OTP to login. The idea behind sending an OTP and requiring its entry is that "you are in possession of something unique and you can provide sufficient evidence of it by entering the OTP". This is considered by some to be another level of security in authenticating people.
But, what happens if you lose your phone? Will such an unfortunate event lock you out of some important website? To address this kind of unfortunate event, some websites provide multiple ways of sending you an OTP like: sending a text message to your phone or calling you or sending you the OTP in an email using any one of your designated email addresses. So if you have lost your phone, then you can ask for the OTP to be sent to an email account.
But, what happens if you lose access to your password manager? Would you be able to get to your email to fetch the OTP? Or would you use easy to remember passwords for your email accounts?
Password managers are at best a temporary solution to our password problem. Further, if we have to depend on software and technology to generate and remember our nearly impossible to remember passwords for us, then, in effect, it is as though we are not relying on passwords, but we are relying on some technology. There ought to be a better technological solution to our identification and authentication problem.
Some other technical people saw the issue with passwords and password managers. Instead of making it easy for us to remember a hundred or so passwords, they started to reduce the number of userids that we need to have. The idea is for a person to have a single account, a single userid-password combination, that can be used to access many websites and devices.
As with any good idea, many organizations implemented this good idea and made it available to their customers. Thus today we have an ecosystem of such service-provider organizations and several customer organizations where one service-provider organization maintains the accounts, that is the userid-password combinations, and verifies, on behalf of its customer organizations, whether the person intending to log into a customer organization's website is indeed the owner of the userid with which the login attempt was made. The ownership is still on the basis of knowing the password.
With this sort of arrangement between the service-provider organizations and customer organizations, the total number of accounts that we would need to possess gets reduced. The arrangement can extend to even devices. Thus, these days, we could log into our devices and websites using a single account that we may have with a service provider organization. For the purpose of this discussion, we will call this capability of using a single account to login to multiple websites and devices as a single signon capability.
But there are many such service-provider organizations, and we may have accounts with only a few of them. The organizations that we interact with may or may not be customers of the service-provider organization with whom we have our account.
So, we are still not close to a single account per person situation and we still need passwords for our accounts and those passwords need to be strong and we still need to periodically change those passwords and hence we still need password managers.
Even with these two advancements of password managers and single signon capabilities, we still cannot solve the real problem of knowing the true identity of the person who is using the underlying userid. The main reason is that these organizations that offer single signon capabilities are not the authorities in identifying individuals. The only authority in a society that can attest to a person's identity is the society itself. Moreover, we do not want to get in a situation where private sector organizations claim such authority. Why? Because private sector organizations are not equally owned by all citizens.
Because the true identity of the person who logs into some website is not revealed by the person's userid, when the person wants to buy some product or service from that organization, the person has to go through the process of providing the organization a "payment card" and all associated identification information (name as it appears on the card, card number, date of expiry, security code on the card, address and so on).
Once we have shared all this information about the payment card, it is stored in the databases of multiple organizations, and that means that hackers can potentially hack any one of those organizations to get at the same data. If some organization has poor security precautions, then our payment card information can get stolen and misused.
Every time we use our payment card, the organization that issued us the payment card does not ask us whether we intend to make a payment. Most of the time, there is no explicit check to ascertain if we intend to make a payment.
Sometimes, when we are at a retailer and use the card to make a large payment, the card reading machine may ask us to enter a PIN. These PINs are supposed to be a secret known only to us. Thus, we have to remember this PIN and enter it in a public place in full view of whoever may be watching or recording. That exposes our PIN in a public place and makes it possible for someone else to know this secret.
The payment card number is like a userid and the PIN is like a password. Thus, if the physical card gets stolen or even when information about the card gets stolen, it can be misused and we would not even know about it till we check our payment history with that particular payment card.
If you have many such cards, then it is advisable to have a different PIN for each one of them, and that means remembering even more PINs.
If the card that we use to buy things is our bank's debit card, then what is required for us to obtain such a bank account and the debit card? The most important piece is a government issued identification document (like a driver's license, a passport, a social security card or some such document). So our bank creates an account for us and gives us a debit card only because it can trust our government to validate that we are who we are by issuing us some identification documents.
If the card that we use to buy things is a credit card, then we still need at least one government issued identification document to obtain the credit card.
The point is that we can only undertake such online financial transactions because some time before that, our government issued us an identification document.
Interestingly enough, since there are many government issued identification documents, clearly one of them must have been the very first such document. For citizens born in the society that they live in, the first such document is their birth certificate. For such citizens, even a citizenship card or certificate is the second document. Passports, driver's licenses and health cards are all subsequently issued documents.
Since each one is important and each one serves a single purpose, misplacing or losing any one of these would result in a situation that is at least inconvenient. Having lost it, to get another copy of it also requires some effort on our part.
Why are they paper based? Why are they not a digital record that can be accessed by anyone who needs to access it and is also authorized to access it? Because, digital records only emerged in the last few decades. Before that it was paper. Moreover, the practice of digital record-keeping has been evolving since it was introduced a few decades ago.
Why do we have to carry them around with us? If someone asks us "who are you?" and we just tell them our name, that is not enough because they have no reason to believe that we are speaking the truth. If they have sufficiently strong reason to know who exactly we are, then they will demand some evidence to back our claim. A driver's license is usually considered as sufficient evidence. But usually any government issued id should serve as evidence to substantiate our claim.
Why does the government have to issue so many documents? This is a result of the paper based legacy of all these government issued identification cards issued by different departments of government. In the digital world, there is no real reason for each of these kinds of documents to exist.
Since all these government issued documents started as paper documents, yet another problem with them is forgery. With forged documents, one could get bank accounts. Back in those days, there was also a concept of a cheque. These cheques were instructions to our bank to pay someone some amount of money. In order for cheques to be honored by our banks, they required our signatures on them which they presumably matched before making every payment. Cheques being paper based could also be forged. Signatures can also be forged.
Technological evolution always attempts to overcome these limitations and our current system has mostly discarded cheques, and most of the government issued ids are encased with hard plastic and some of them even have some electronic chips with information encoded on those chips; with appropriate reading devices, this information can be directly read by computers.
With all these improvements, it is still archaic; archaic enough to be totally unsatisfactory for it to exist in Utopia.
Yet another problem is whether the so-called secret password can remain secret. There are two ways in which the secret password may not remain all that secret. The first is when someone forces us to reveal it. The other is when someone guesses it. Most of us may think that someone else would not have a sufficiently strong reason to force us to reveal our password. That may be true for many; but not for all. We also hear occurrences of user accounts being hacked. Sometimes these hacks are based on guessing the password using some algorithms and computers. Sometimes hackers hack the databases of the websites we visit and steal the userid and passwords from there.
Finally, there are our computers and phones with plenty of security holes in them. Mass produced computers were designed to be tools to do some numerical crunching work. The notions in cryptographic security have only been discovered, invented and implemented in the past few decades. Thus, back then, when the core design of computers and basic software was occurring, a comprehensive security solution was impossible. But the innovation desire of people is strong and they provided various means in the computers and basic software, like browsers, to deal with the problem of numerous logins, passwords and the state of being logged in.
All these innovations were mere conveniences to work with the underlying security system which was itself being developed and enhanced. There have been many enhancements to the security; the most recent and publicly noticeable example is the "marking" of HTTPS sites as "secure" and of HTTP sites as insecure, and there is truth in it. The "cookie" inside a browser is a good innovation. The underlying technological solution enables us to "stay logged in" to various websites and web applications that we visit and it allows all that state associated with our "logged in session" to be reused when we intentionally or unintentionally open new browser tabs or new browser windows. That has led to security exploits. When we hear advice like "don't click on links in an email", we are being instructed to not do something in order to thwart attempts at security breaches. Experts in this field also give extensive advice to website and web application developers to follow some best practices to mitigate such security breaches. All this advice implies that we have to do something so that bad people cannot do bad things. It highlights the fact that people can do bad things; it highlights the fact that the security system is not sufficiently secure.
Making "nascent and inadequate security" easy to use by providing conveniences can only mask the problem; never solve it. Such kinds of enhancements will provide better conveniences and better risk mitigation. When we account for the progress that has been made, it will most definitely seem like tremendous progress. But all that progress still does not solve the issue.
As a result of all these inadequacies in the identification and authentication systems of the current times, we have what we term as identity theft.
In short, identity theft is when Person1 successfully "convinces" some Person2 (or Organization2) that they are actually Person3 and then proceeds to do some financial transactions that benefit Person1 (pretending to be Person3), usually to the detriment of Person3. The "convincing" is accomplished with stolen information about the victim (like name, address, date of birth, government issued identification numbers, passwords, etc.). The "convincing" is done to a person who does not know the victim's true identity. So, such a person is incapable of detecting a fraud as it is unfolding. This is a systemic problem.
Thus, we face many challenges in securely identifying and authenticating ourselves to the websites we wish to access. How many of us are capable of dealing with all these challenges? Are the little kids capable? Are the very old people capable? What about everyone in between?
What would eliminate all these problems? Instead of the current state of affairs, how should it really be?
Take your time to think about it before proceeding to read further.
The Outline Solution
The ideal solution should be able to handle the needs that all these identification documents, cards, userids and passwords are trying to accomplish. In this section, we present an outline of the approach that solves all the problems mentioned in the previous section.
We will be discussing legal entities (like citizens, organizations, etc.) and robotic agents (like phones, computers, web applications, etc.). Some legal entities, like organizations, are not human; and, of course, all things are not human. Regardless of whether the other entity is human or not, we humans interact with organizations and robotic agents. When humans interact with organizations, they may be interacting with some human who works for the organization or they may be interacting with some robotic agent. While discussing these interactions, we will personify all non-human entities and robotic agents, because it is easier to write the descriptions and understand them.
We need an ability to uniquely identify each one of the legal entities and each one of our devices (computers, phones, robotic agents, and similar things). While names are perfect to refer to people and things, such names are not unique across all humans and things; hence a name is just not good enough for the purpose of identifying someone or something.
We need an identification number to associate with and record any information about us (and our devices) in a digital system (that is the computers and databases). One such identification number is necessary and sufficient; we don't need more than one.
This identification number should not be entirely made up of some random letters and digits. There should be something meaningful for us to interpret in the identification number. That makes the identification number simple enough for us to remember if we wish to, even though we don't have to remember it. In addition, this identification number also needs a country code in it so that the same identification number can be used across country boundaries.
This identification number is to be used everywhere where we currently use a userid or show some form of government issued identification card or document.
The act of identifying someone or something is called Identification. We will abbreviate identification number as ID. When we communicate our identity, we communicate both our ID and our legal name; not just one or the other.
The identification number for all entities has the format CCC.UniqueId where CCC is an up to three character country code and UniqueId is the actual and intuitively meaningful unique identifier for that entity within its country. Further, for humans, UniqueId has the format Type.FullName-DOB-SrNo where:
- Type: a two or three character type of entity. For humans it could be CIT as an abbreviation for citizen.
- FullName: the person's full name without a space character and using a dash (-) to separate parts of the full name. For someone whose first name is ABC and last name is XYZ and does not have a middle name, the FullName is ABC-XYZ. Note that organizations and self-owning entities must have unique names within a country. Also note that we are distinguishing a person's legal name from a person's full cultural name. Cultural names can be as long as they need to be, but a legal name can consist of just a few parts. For the purpose of the FullName, the legal name is used.
- DOB: the human's date of birth in the format YYYYMMDD.
- SrNo: a serial number that makes the UniqueId actually unique for humans. Thus the first person whose full name (without spaces) is ABC-XYZ and is born on a specific date will have SrNo as 1, the second person with the same full name and also born on the same date will have SrNo 2, and so on.
Thus the following can all be memorable IDs of different persons: USA.CIT.ABC-XYZ-20211201-1, USA.CIT.ABC-XYZ-20211201-2, CAN.CIT.ABC-20211231-1, CAN.CIT.ABC-XYZ-20211231-1, UK.CIT.ABCXYZ-20211231-1, IN.CIT.ABCXYZ-20211231-1
For the organization name LMNO, the following all represent distinct organizations: USA.ORG.LMNO, CAN.ORG.LMNO, UK.ORG.LMNO, IN.ORG.LMNO, etc. These could be the unique identification numbers of those organizations as well.
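The citizen ID format described above, CCC.CIT.FullName-DOB-SrNo, can be sketched in code. This is a minimal illustration, not a definitive implementation: the class and function names are hypothetical, and it assumes that name parts contain neither dots nor dashes.

```python
from dataclasses import dataclass
from datetime import date, datetime

@dataclass(frozen=True)
class CitizenId:
    country: str                 # up to three characters, e.g. "USA"
    name_parts: tuple[str, ...]  # legal name parts, e.g. ("ABC", "XYZ")
    dob: date                    # date of birth
    serial: int                  # SrNo, starting at 1

    def __str__(self) -> str:
        full_name = "-".join(self.name_parts)
        return f"{self.country}.CIT.{full_name}-{self.dob:%Y%m%d}-{self.serial}"

def parse_citizen_id(text: str) -> CitizenId:
    """Split an ID such as USA.CIT.ABC-XYZ-20211201-1 into its parts."""
    country, entity_type, rest = text.split(".", 2)
    if not (1 <= len(country) <= 3 and country.isalpha()):
        raise ValueError(f"bad country code: {country!r}")
    if entity_type != "CIT":
        raise ValueError(f"not a citizen ID: {entity_type!r}")
    # The last two dash-separated fields are DOB and SrNo;
    # everything before them is the FullName.
    full_name, dob_text, serial_text = rest.rsplit("-", 2)
    if len(dob_text) != 8:
        raise ValueError(f"DOB must be YYYYMMDD: {dob_text!r}")
    dob = datetime.strptime(dob_text, "%Y%m%d").date()
    return CitizenId(country, tuple(full_name.split("-")), dob, int(serial_text))

parsed = parse_citizen_id("USA.CIT.ABC-XYZ-20211201-1")
single_name = parse_citizen_id("CAN.CIT.ABC-20211231-1")
```

Splitting from the right is what lets a FullName with any number of dash-separated parts coexist with the fixed DOB and SrNo fields.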
Citizens can change their legal name and that will change their ID. So, for citizens, underlying this memorable and yet unique identification number, there is still a non-memorable serial number that is used for actual implementation of data storage, so that all legal name changes still reference the same person's record. This is an example where our intuition about memorable and unique identification numbers cannot be implemented as-is. This is also an example of the reasons why any discussion of implementation details can get very involved.
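One way to picture the hidden serial number is as a small registry that maps every memorable ID a person has ever held to one internal record number. The sketch below is an assumption-laden illustration: the class `IdRegistry`, its methods, and the choice to keep old IDs working as aliases after a name change are all hypothetical and not specified in the text.

```python
class IdRegistry:
    """Maps memorable IDs to a stable, non-memorable record number."""

    def __init__(self) -> None:
        self._next_record = 1
        self._record_by_id: dict[str, int] = {}
        self._current_id: dict[int, str] = {}

    def register(self, memorable_id: str) -> int:
        """Create a new record for a first-time ID."""
        if memorable_id in self._record_by_id:
            raise ValueError(f"ID already in use: {memorable_id}")
        record = self._next_record
        self._next_record += 1
        self._record_by_id[memorable_id] = record
        self._current_id[record] = memorable_id
        return record

    def rename(self, old_id: str, new_id: str) -> None:
        """Attach a new memorable ID to the same underlying record."""
        if new_id in self._record_by_id:
            raise ValueError(f"ID already in use: {new_id}")
        record = self._record_by_id[old_id]
        self._record_by_id[new_id] = record  # assumption: old ID stays as an alias
        self._current_id[record] = new_id

    def lookup(self, memorable_id: str) -> int:
        return self._record_by_id[memorable_id]

    def current_id(self, record: int) -> str:
        return self._current_id[record]

registry = IdRegistry()
record = registry.register("USA.CIT.ABC-XYZ-20211201-1")
registry.rename("USA.CIT.ABC-XYZ-20211201-1", "USA.CIT.ABC-PQR-20211201-1")
```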
With these unique IDs, remembering phone numbers of people is not necessary. Forgetting people's birthdays will be hard. These unique IDs can also serve as our email addresses. The similarity of these memorable and yet unique identification numbers to domain names is not accidental or coincidental. All these aspects will be discussed later in this book.
We need a method to ask for someone's ID and present our ID when asked for it. Currently we use identification cards. In a digital world, we should be able to present our ID digitally. This requires us to have the ID on a digital device.
Most of us have one such digital device already; it is our phone and more precisely our smart mobile phone. For every individual, we will store the person's ID on their phone. A person could have more than one phone and every such phone will have the person's ID. With this arrangement, we would not need to carry any identification cards anymore; our phone should do that job.
The way someone or something asks for our ID will remain the same. They will just communicate the request using words (like "ID Please"). This aspect is the same whether a police officer is asking or our home computer is asking or some website is asking. Anyone who asks for our ID should have some legitimate need to know it. Sometimes it is because they have authority (like a police officer) and sometimes it is just curiosity (like a website giving us free services and hoping to earn its revenue by showing us advertisements).
The one who asks for our ID must be willing to give us their ID and this willingness is implicit in their asking us for our ID. That is, the interaction of one entity giving a second entity their ID is symmetric and hence accompanied by the second entity also giving the first entity its own ID. Giving our identification information is a two-way interaction, a two-way exchange of information. This applies to interactions of all entities; not just humans. Thus when we wish to log into our own laptop, we are giving the laptop our ID and the laptop provides us with its ID. Similarly, when we wish to log into a website, we give our ID to the website and the website gives its ID to us.
The person or thing (like a laptop or a website) asking for our ID should have a digital device (with appropriate sensors) to receive this information from us, which we would give them digitally using our phone. For humans, it would be their phone (or a phone-like device). For computers, websites and such kinds of things, it will be some sensor (like camera, bluetooth, etc.) or some software interface (an API) that they can use to communicate this information.
When someone asks us for our ID, that is just the beginning of their interaction with us. Since, in practice, anyone can give any ID to anyone, the one who asks for the ID has a need to be sure that they are interacting with the same person whose ID was given to them. Thus when we give our ID, we will also need to prove that "we are who we claim we are by means of the ID that we provided". This is called Authentication. We are asked to authenticate ourselves and we authenticate ourselves.
To accomplish authentication, we will use financial infrastructure as an intermediary or Trusted Third Party. Financial infrastructure will receive authentication requests, conduct the appropriate authentication (with the cooperation of the party being authenticated) and report the results back to those who requested such authentication.
Authentication is a mutual need. Just like the other party wants to be sure that we are who we claim to be, we also want to be sure that the other party is who the other party claims to be. When we wish to log into our own computer, we need to know that our own computer has not been somehow altered and its identity changed. Also note that when we wish to log into some website, which we know by a name, we need to be sure that we are indeed logging into the website that we intended to log in.
Just like identification, authentication is symmetric. When one entity asks the financial infrastructure to authenticate the second entity, the first entity must do so by authenticating itself to the financial infrastructure and this authentication is communicated to the second entity by financial infrastructure when the financial infrastructure seeks authentication of the second entity.
When we exchange our ID with some other entity, we ask the financial infrastructure to authenticate the other entity for us by giving the financial infrastructure the ID that we got from the other entity. At the same time the other entity also asks the financial infrastructure to authenticate us using the ID that we gave. Thus the two requests that reach the financial infrastructure arrive in quick succession and mutually match; that is, Entity1 asks for Entity2's authentication and Entity2 asks for Entity1's authentication. Only when such matching requests are received by the financial infrastructure within a few seconds of each other does it proceed to complete the mutual authentication.
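The matching of the two requests can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `AuthRequest` and `find_mutual_match` are hypothetical, and the 5-second window is an assumed value, since the text only says "a few seconds".

```python
from dataclasses import dataclass

# Assumed value; the chapter only says "within a few seconds".
MATCH_WINDOW_SECONDS = 5.0


@dataclass(frozen=True)
class AuthRequest:
    requester_id: str   # the entity asking for an authentication
    target_id: str      # the entity that is to be authenticated
    received_at: float  # arrival time at the financial infrastructure, in seconds


def find_mutual_match(new_request, pending_requests):
    """Return the pending request that mirrors new_request, or None."""
    for other in pending_requests:
        mirrored = (other.requester_id == new_request.target_id
                    and other.target_id == new_request.requester_id)
        close_in_time = (abs(new_request.received_at - other.received_at)
                         <= MATCH_WINDOW_SECONDS)
        if mirrored and close_in_time:
            return other
    return None
```

In this sketch, mutual authentication would begin only when `find_mutual_match` returns a request. A request that finds no mirror image within the window would simply wait or expire.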
Because we use the financial infrastructure as a trusted third party, after authentication is complete, both entities can be sure that they are interacting with the entity that they intended to interact with. This is important when we interact with strangers, websites and even our own computers.
How can we accomplish authentication of humans? How can we "prove" that we are who we claim to be? We make the claim when we give out our ID (and our name). It is possible that someone else may give out our ID and claim to be us. We want such an impersonator to fail when authenticating as us and we want us to succeed when we are authenticating ourselves.
The first part of the authentication idea is that every human is unique and can be uniquely identified by technology using our unique characteristics; our biometrics. Some examples of biometrics are fingerprint pattern, iris pattern, face pattern and voice pattern. We have technology that is fully capable of identifying humans by just observing humans in action; it is based solely on the uniqueness of various human characteristics; their biometrics. Sometimes, for some specific biometrics, the observation may be needed at a close range (e.g. fingerprint pattern, iris pattern). Other potential biometrics could be gait pattern, movement patterns, and other yet to be discovered human characteristics that can be picked up by sensors or deduced by algorithms.
The second part of the authentication idea is that we allow our government to record all these biometrics. In this context, by "government" we specifically mean the financial infrastructure, since it is the part of our government that is most concerned with it. Financial infrastructure uses all the technological advances and sets up elaborate sensor infrastructure to gather and store the identification characteristics in a way that is secure and not disputable. This infrastructure will be located in the local branch offices of the financial infrastructure.
The third part of the authentication idea is that we ourselves are willing to give these biometrics. Humans make themselves available, periodically, to ensure that their identification records are up-to-date. If, for some reason, a single identification characteristic changes drastically (e.g. burning a fingertip) then humans have to go and update their identification characteristics as and when such a change happens.
The fourth part of the authentication idea is that the financial infrastructure initiates two parallel streams of communication to complete our authentication. On the first stream financial infrastructure directs our phone to gather our biometrics and send them back. On the second stream financial infrastructure directs us to interact with our phone's biometric sensors. For example, the financial infrastructure could tell us to swipe our right index finger on the fingerprint sensor, or it could tell us to hold the phone's camera closer to our right eye for two seconds, or it could direct us to swing our phone horizontally in front of us twice, or it could tell us to say some word that it knows that we can say, etc. There are many more things that the financial infrastructure can ask us to do and these examples are only for illustrating the kinds of things that it can ask us to do; these examples are not intended to be an exhaustive list.
Through all this the financial infrastructure is attempting to determine whether we are responsive to its instructions and at the same time it is asking the phone to capture our biometrics (fingerprint scans, iris scans, voice scans, phone movement scans, etc.) and send them back to the financial infrastructure.
Note that our current phones have a sufficient number of different sensors. For example, we already use a fingerprint sensor to unlock our phone. In addition, our phones have cameras, microphones and gyroscopes. In the future we will have even more sensors like stereoscopic cameras facing us, temperature sensors, etc.
The fifth part of the authentication idea is that our phone sends the biometric scans back to the financial infrastructure as instructed. The financial infrastructure then matches the actions it instructed us to perform against the evidence of those actions (the biometric scans), and matches our biometric scans against the scans that it has on record. Using all this matched data, it determines whether we are in control of our phone, are alert enough to follow instructions accurately and can provide biometrics that match what it has on record. If all of this checks out, it decides that authentication was successful.
The sixth and final part of the authentication idea is that the financial infrastructure waits for the authentication of both parties to complete. As mentioned earlier, both identification and authentication are symmetric. Once authentication completes for both parties, the financial infrastructure informs both of them of the success or failure of the authentication process.
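The decision described in the last few parts can be sketched in Python. This is a rough illustration, not a description of any real biometric system: the instruction names come from the examples above, while the function names and the similarity threshold are assumptions made only for this sketch.

```python
import secrets

# Hypothetical instruction names, based on the examples given in the text.
CHALLENGES = [
    "swipe_right_index_finger",
    "hold_camera_to_right_eye",
    "swing_phone_twice",
    "say_known_word",
]

# Assumed cut-off for "the scan matches the record"; not taken from the text.
MATCH_THRESHOLD = 0.95


def pick_challenge():
    """Choose one instruction at random for the person to follow."""
    return secrets.choice(CHALLENGES)


def one_side_ok(instructed, observed, similarity_to_record):
    """True if the person did what was asked and the scan matches the record."""
    followed_instruction = (instructed == observed)
    matches_record = (similarity_to_record >= MATCH_THRESHOLD)
    return followed_instruction and matches_record


def mutual_authentication_ok(first_side_ok, second_side_ok):
    """Authentication succeeds only if it succeeds for both parties."""
    return first_side_ok and second_side_ok
```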
How can we accomplish authentication of our devices and robotic agents?
We need to authenticate only those devices and robotic agents that will interact with either humans or the financial infrastructure or both. For any such device, we humans and the financial infrastructure both need to be sure that they are indeed what we think they are.
For our phones, the financial infrastructure needs to know that it can trust the phone in order to ask it to capture our biometrics. For that, the financial infrastructure requires our phones to satisfy certain technical requirements. Also, the financial infrastructure needs to directly set up our phone for us. During this setup, the financial infrastructure gives our phone an ID (for the phone itself) and gives it a predictably changing and yet unique characteristic (similar to our human characteristics) and then the financial infrastructure binds our ID to our phone. All that is done in a cryptographically secure way.
We will call the "predictably changing and yet unique characteristic" the phone-fingerprint (and, when given to some other device, the device-fingerprint). Since phones are just computers, they can be fully copied. The phone-fingerprint is a way to detect such duplicates, raise red flags and disable all such duplicates.
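One possible reading of a "predictably changing and yet unique characteristic" is a value derived from a shared secret and a step counter. The Python sketch below is an assumption made for illustration only, not the mechanism this book specifies; `fingerprint` and `DeviceRecord` are hypothetical names. It shows how a copied phone could be noticed: once the original and the copy both try to use the same step, one of them presents a value that is no longer expected.

```python
import hashlib
import hmac


def fingerprint(secret: bytes, counter: int) -> str:
    """Derive the fingerprint value for one step from the shared secret."""
    message = counter.to_bytes(8, "big")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class DeviceRecord:
    """What the infrastructure remembers about one device (illustrative)."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.next_counter = 0   # the step the infrastructure expects next
        self.flagged = False    # set when a possible duplicate is seen

    def check(self, counter: int, value: str) -> bool:
        expected = fingerprint(self.secret, counter)
        if counter != self.next_counter or not hmac.compare_digest(expected, value):
            # A stale or wrong value suggests a copy of the device exists.
            self.flagged = True
            return False
        self.next_counter += 1
        return True
```

In this sketch a flagged record would be the "red flag" mentioned above. What happens next (for example, disabling the device) is outside the sketch.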
The first assurance that the financial infrastructure gets, in the normal course of things, is that when our phone is in communication with the financial infrastructure, it can be nothing other than our phone. The second assurance that the financial infrastructure gets is that when things are normal, we will be using our phone to communicate with the financial infrastructure. The third assurance has to be given by us to the financial infrastructure and it is that "if we lose our phone or think it has been compromised, we will inform the financial infrastructure (at any one of its branches) about it as soon as possible". With this final assurance, the financial infrastructure will be able to narrow down the period of uncertainty about who is in possession of the phone and investigate all actions taken by our phone for any suspicious activity. Note that the financial infrastructure records everything that our phone communicates with the financial infrastructure, so the financial infrastructure knows everything that the phone may have been used for during the time that we were not in possession of the phone.
A note regarding the statements in previous paragraphs that mention phrases like "certain technical requirements", "cryptographically secure way", "device-fingerprints": This book is not intended to outline the technical "how" part of the systems being discussed here. The first reason is that this book is intended to be read by everyone; so, placing an inordinate amount of technical detail in this book will hinder understanding of the subject matter of the book (which is intended to be conceptual in nature). The second reason is that writing the technical details will require plenty of pages and hence such technical details need to be documented in a separate book.
With the setup and bindings mentioned above, the financial infrastructure can trust its communication with our phone, our phone's unique identity (which in essence is the authentication of the phone) and its communications to us (through our phone).
Authentication of other non-phone devices is similar to the authentication of a phone. We humans, using software provided by the financial infrastructure, initialize our computers and other devices by giving each such device a new ID (bound to the ID of the creator of that new ID) and a device-fingerprint, and by reporting all of that to the financial infrastructure. For each such device, the financial infrastructure carries out a first "test communication" with the device. If that succeeds, the financial infrastructure knows about the device from then on and can trust its communications with a device initialized in this way.
These computers and devices, if permitted, could initialize other robotic agents in a similar way.
In each case, whoever initializes a device must keep track of it (by assuming ownership of the actions of that device). If such an owner (that is, the owner of the responsibility of keeping track of the device) loses control over a device, then that loss must be reported to the financial infrastructure so that the financial infrastructure can deal with that device (and with any other devices that it is responsible for).
So far we have discussed identification and authentication. With these two implemented, we humans only ever need a single userid (which is our ID) and we do not need any password to login to any system that permits us to login. The authentication mechanism is the assurance that the party interacting at the other end of the "line" is indeed the one claimed to be.
Note the phrase "permits us to login" in the above paragraph.
Here is what it means for a computer. Imagine that we visit some friend and wander off into their study room and find the friend's laptop on the desk. Should we be able to login to this laptop? No. Because the friend never permitted us to login to their laptop. Just because we can authenticate ourselves to any laptop using our phone does not mean we get to login to it. For that we need the permission of the owner. The owner can instruct their laptop to allow us to login by providing the laptop with our ID and permitting login using that ID.
Similarly, for websites, our ID and the authentication mechanism could allow us to login to any website. But websites only allow their customers to login. So, our ID should be in the website's customer database for us to have permission to login.
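The distinction between being authenticated and being permitted to log in can be shown with a short Python sketch. The function name `may_log_in` and the sample IDs are hypothetical and used only for illustration.

```python
def may_log_in(entity_id, authenticated, permitted_ids):
    """Login needs both a successful authentication and the owner's permission."""
    return authenticated and entity_id in permitted_ids


# The friend's laptop only lists the friend's own ID (hypothetical IDs).
laptop_permitted_ids = {"friend-id"}

# We can authenticate ourselves, but the laptop's owner never permitted us.
we_can_log_in = may_log_in("our-id", True, laptop_permitted_ids)

# After the owner adds our ID, the same check succeeds.
laptop_permitted_ids.add("our-id")
we_can_log_in_now = may_log_in("our-id", True, laptop_permitted_ids)
```

For a website, the set of permitted IDs in this sketch would correspond to the IDs in its customer database.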
Speaking of permissions ...
We, humans, have freedom to do many things and we are prohibited (by some law or regulation) from doing some things. The idea of a Permission is somewhere between freedom and prohibition. In this general context, there are things that we can do, but we require permission before we can do them. In the context of a society we call such permissions "privileges". In the context of an organization we call such permissions "authority" or "authorization to do something".
An example of privilege in our societies is that of driving a motor vehicle. Driving a motor vehicle is not a prohibited activity, but it requires us to obtain a driving permit and different types of motor vehicles have different kinds of permits and the process of obtaining those permits requires us to demonstrate that we can operate those kinds of vehicles safely and by following all the rules of operating them. Driving a motor vehicle is considered a privilege; not a freedom.
Recording the fact that we have driving privileges is in essence noting down that fact in association with our ID. Note that the financial infrastructure maintains the identification data along with other kinds of data. The department of motor vehicles maintains data about driving privileges and it does so in association with our ID, since that is the only thing that identifies us. So, when we are driving a vehicle and a police officer pulls us over and demands to see proof of our driving privileges, we present the officer with our ID and the officer presents theirs, and we authenticate ourselves to each other. Then, because the police officer would have permission to look into our records, they would see that we indeed have driving privileges. Thus there is no need for a physical driver's license.
Let us consider an example of authority in an organization. In an organization there are many tasks to be done and not every employee of that organization is permitted to do every task. For example, the CEO of a retailing organization would not have the authorization to stand at a checkout counter and help customers check out the things that they desire to buy; primarily because the CEO may not have adequate training and practice in how to do it well enough to result in a satisfactory customer experience. Another example is that only a few people in an organization are authorized to buy things on behalf of the organization; not every employee can do this. Both these examples require some record of permissions in some database. In the case of permission to operate a checkout counter, its record would be in the organization's database. In the case of permission to buy things on behalf of the organization, its record would be in the financial infrastructure's database, because it deals with money and ownership and hence it involves a transaction.
Now it is time to discuss financial transactions.
We call anything that we typically do an action. In this context, it is some digital action. Some examples are: saving a file, permanently deleting a file, buying something online, buying something in a physical store for which we pay using some digital mechanism, etc.
Some actions (like saving a file) are completed without any confirmation. Some actions (like permanently deleting some file) require a confirmation; typically the computer presents us with a dialog box with a question like "Are you sure?" and with two buttons called "yes" and "no" to indicate our confirmation or rejection of the proposed action. Intuitively, it is like answering "yes" to a question like "are you sure you want to proceed?".
Some actions are multi-part. So in order to complete them, all the parts need to be done. If in doing one of the parts, the system encounters some sort of failure, then what is the fate of this multi-part action? Is it partially done or is it like it was never attempted to be done?
The word transaction conveys the following meaning: any single-part or multi-part action that is intended either to be done completely or, if any one of its parts fails, to appear as if it had never been initiated. Computer systems, that is software and databases combined, have been good at doing this for decades.
A financial transaction is an exchange or a trade of some assets between two entities. In such an exchange or trade, both parties first agree what they are intending to trade and then actually make the trade. It is implicit in this understanding that both parties own the things that they are willing to give to the other party.
For example, Participant1 may agree to give Participant2 ownership of Item1 in exchange for 10 dollars. To make such an agreement, they must first own the items that they are intending to transfer. If they both agree to this exchange, they can proceed to complete this financial transaction. In actually doing it, Participant1 transfers ownership of Item1 to Participant2 and simultaneously Participant2 transfers the ownership of 10 dollars to Participant1. If for some reason, one of the two simultaneous actions of the transaction cannot be completed, then the financial transaction fails as a whole, and both participants are left with the things that they originally owned.
While the above description and example explicitly mentioned two participants, a financial transaction can be conducted by a single participant (for example when transferring some asset from one account to another) or there can be more than two parties engaged in a single financial transaction (for example when many friends pay their restaurant bill by splitting it in some proportion).
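The all-or-nothing behaviour of a financial transaction can be sketched in a few lines of Python. This is a simplified illustration under assumptions: the books of accounts are modelled as plain dictionaries, and the names `apply_transaction` and `TransactionFailed` are hypothetical. Because a transaction is given as a list of transfers, the same sketch covers one, two or many participants.

```python
class TransactionFailed(Exception):
    """Raised when any part of a transaction cannot be completed."""


def apply_transaction(books, transfers):
    """Return updated books with every transfer applied, or raise.

    books:     {owner: {asset: quantity}}
    transfers: list of (giver, receiver, asset, quantity)
    The books passed in are never modified, so a failure leaves
    every participant with what they originally owned.
    """
    # Work on a copy so that a failure part-way changes nothing.
    draft = {owner: dict(assets) for owner, assets in books.items()}
    for giver, receiver, asset, quantity in transfers:
        owned = draft.get(giver, {}).get(asset, 0)
        if quantity <= 0 or owned < quantity:
            raise TransactionFailed(f"{giver} cannot give {quantity} of {asset}")
        draft[giver][asset] -= quantity
        receiver_assets = draft.setdefault(receiver, {})
        receiver_assets[asset] = receiver_assets.get(asset, 0) + quantity
    return draft


# The example from the text: Item1 in exchange for 10 dollars.
books = {"Participant1": {"Item1": 1}, "Participant2": {"dollars": 10}}
after = apply_transaction(books, [
    ("Participant1", "Participant2", "Item1", 1),
    ("Participant2", "Participant1", "dollars", 10),
])
```

If Participant2 owned fewer than 10 dollars, the call would raise `TransactionFailed` and `books` would be left exactly as it was.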
In the previous chapter, we discussed that the financial infrastructure will maintain a record of "who owns what". In doing that, every legal entity will be identified and recorded in the entity registry. These entities will use the ID as clarified in this chapter. Everything that can be owned will also have an identifier (in the product registry). The facts like "who owns what" will be recorded in the financial book of accounts, which is a computer system; that is software and databases maintained by the financial infrastructure. That enables the financial infrastructure to record financial transactions. And that enables us to account for the wealth of everyone.
But, before we can record a transaction, we need the parties conducting the transaction to agree to it. This is where the identification and authentication discussed earlier are used.
Both transacting parties identify and authenticate themselves. This enables the financial infrastructure to open their respective books of accounts, and from these they can pick the items that they want to trade (or exchange). The parties digitally put together the list of items that they wish to give and take. Then the parties review this list and see if they missed something that they wish to change. Once the review is done, they both confirm to the financial infrastructure that they wish to carry out the changes indicated by their transaction to their respective books of accounts. This confirmation is called authorization. When a person gives such a confirmation, the person is said to have authorized the transaction.
Authorization may seem very similar to a confirmation. The difference between a confirmation and an authorization is in the importance of the changes that the action will do. Important changes require authorization; and for normal changes a confirmation is sufficient.
Authorization means giving the final permission to carry out an action. If multiple participants together initiate an action, like a transaction, then all the participants involved in conducting the transaction must authorize the transaction before the transaction is attempted.
When discussing digital systems and actions, the digital system definitely needs to know who is initiating the action and must also be certain that the initiator is indeed who they claim to be. That is, a digital system must never undertake to initiate any action instructed to it by someone before completing identification and authentication. When a robotic agent is interacting with the financial infrastructure, its identification and authentication is done automatically. When we interact with the financial infrastructure, we complete our identification and authentication using our phone.
When a digital system performs normal actions and requires a confirmation for it, authentication is assumed and hence not explicitly required. Just pressing the "yes" button to confirm is sufficient.
However, when a digital system is about to perform an important action (like a financial transaction) and requires a confirmation for it, all prior identification and authentication is not sufficient. So, just pressing the "yes" button to confirm is not sufficient. A fresh authentication is required to complete the authorization. This final authentication in conjunction with the immediately prior "yes" is authorization.
For example, in order to authorize a purchase, the financial infrastructure will ask the buyer to review the intended transaction. This is done by presenting the transaction on their phone. Then the financial infrastructure will ask the person to press the "yes" button to confirm. If the individual presses the "yes" button, then the financial infrastructure will ask the person to authorize the transaction by asking the person to perform one or more authentication actions (as discussed earlier). This final authentication must succeed in order for the financial infrastructure to deem the authorization to have completed.
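The difference between a confirmation and an authorization can be summarized in a short Python sketch. The names `Importance` and `may_proceed` are hypothetical; the sketch assumes that the system already knows whether an action is "normal" or "important".

```python
from enum import Enum


class Importance(Enum):
    NORMAL = "normal"        # e.g. permanently deleting a file
    IMPORTANT = "important"  # e.g. a financial transaction


def may_proceed(importance, pressed_yes, fresh_authentication_ok):
    """Decide whether an action that asks for confirmation may go ahead."""
    if not pressed_yes:
        return False
    if importance is Importance.NORMAL:
        # A plain confirmation is enough; earlier authentication is assumed.
        return True
    # Important actions also need a fresh, successful authentication.
    return fresh_authentication_ok
```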
If, in the immediately preceding example, we were to pay by cheque instead of using the system we described, then our signature on the cheque would be the authorization, but only for the payment part. It lacks clarity about the ownership of the things that we purchased.
If there is only a single party to the transaction (like a person moving some assets from one account to another), then only that person needs to authorize the transaction. If there are two parties in a transaction, both parties must authorize the transaction in order for the transaction to complete. If there are more than two parties that together want to do a transaction, then all those parties must authorize the transaction and only then the financial transaction can complete.
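The rule that every party must authorize before the transaction can complete is easy to state as a check. The following Python sketch uses a hypothetical function name and hypothetical participant IDs.

```python
def fully_authorized(parties, authorized_by):
    """True only if every party to the transaction has authorized it."""
    parties = set(parties)
    # A transaction with no parties at all is never treated as authorized.
    return bool(parties) and parties <= set(authorized_by)


# Three friends splitting a restaurant bill; one has not authorized yet.
friends = ["friend-a", "friend-b", "friend-c"]
ready = fully_authorized(friends, ["friend-a", "friend-b"])
```

Here `ready` is false until the third friend also authorizes. The same check covers the single-party and two-party cases.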
When we go to a retail store and buy things, we take physical possession of the items we buy and engage in a financial transaction. Thus all items in such transactions are known to be in the possession of the rightful owner immediately after the conclusion of the transaction and further tracking is not needed for the items on the transaction. This situation (transfer of physical goods) is indicated within the transaction itself.
If we buy something online, those items get shipped, so while the financial transaction has concluded, physical possession is still unresolved. Items in this situation (that is, to-be-shipped) are marked as such in the transaction, and when they are shipped and delivered, their possession also needs to be recorded. Such tracking and record keeping helps in ensuring that owned items reach the rightful owner in a reasonable amount of time. It confirms whether the rightful owner is actually in possession of the physical item.
When we buy or sell things that have purely a digital form (like ownership shares in some publicly traded organization), the conclusion of the financial transaction also means that the rightful owner actually possesses the purchased item.
Money is always in digital form (since we have eliminated cash). Hence payments are immediate and immediately reflected in the account balances.
Thus financial transactions can be conducted with full identification, authentication and authorization of the participants and with confirmation that the rightful owner has the possession of the things that they own.
When transacting participants are assembling their intent in a to-be-completed transaction, the financial infrastructure validates such planned transactions. It checks if the transaction is valid from an ownership perspective. It checks whether what is being transferred by a participant is indeed owned by that participant.
This validation eliminates the possibility of stolen items being sold. If the item was stolen by someone and that someone was trying to sell it to someone else, then that item would not be on the list of owned items for that person; that is it will not be in their book of accounts.
Such validation occurs in the pre-authorization phase of the transaction as well as immediately after the authorization and before embarking on doing the changes as planned in the transaction.
This second validation also eliminates selling things that one has unwittingly come into possession of without actually owning them. There are many ways in which this can happen. For example, a retail store may get a shipment of items with a quantity much larger than it ordered and paid for (because the manufacturer or wholesaler made a mistake). So the physical items that represent the excess quantity are not owned by the retail store and yet the retail store is in possession of these physical items. When the retail store placed the order for the shipment, the seller had at least that many items and the ownership of those items was transferred over to the retailer in exchange for payment. So, once the retailer has exhausted its legitimate inventory of those items, the remaining items are not for it to sell; they need to be sent back to the seller. The person stocking the retail store's shelves may not know that these items are excess. A shopper definitely does not know of this situation and hence may pick such an item from the shelf. This situation is neither fraud nor theft; it is a mistake. Yet, the financial infrastructure is capable of identifying the situation and pointing it out. The net outcome is that such excess items cannot be sold by the retail store.
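The ownership validation described above amounts to comparing what a participant wants to transfer with what the book of accounts says that participant owns. The Python sketch below is illustrative only; the function names and the quantities (100 items owned, more than that on the shelf) are assumptions.

```python
def owned_quantity(book_of_accounts, owner, item):
    """How many of an item the book of accounts records for this owner."""
    return book_of_accounts.get(owner, {}).get(item, 0)


def validate_transfer(book_of_accounts, giver, item, quantity):
    """A transfer is valid only if the giver is recorded as owning enough."""
    return 0 < quantity <= owned_quantity(book_of_accounts, giver, item)


# The store paid for 100 items but, by mistake, received more than that.
book = {"retail-store": {"item-x": 100}}

first_hundred_ok = validate_transfer(book, "retail-store", "item-x", 100)

# Once the 100 owned items are sold, the record shows none left,
# so an excess item picked from the shelf cannot be sold.
book["retail-store"]["item-x"] = 0
excess_item_ok = validate_transfer(book, "retail-store", "item-x", 1)

# A stolen item never appears in the thief's book of accounts.
stolen_item_ok = validate_transfer(book, "thief", "item-x", 1)
```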
Financial Infrastructure maintains a full record of these transactions for a long time. Participants have access to this record. Hence any disputes regarding those transactions can be investigated with the help of the complete record of the transaction.
Because transactions are authenticated and authorized as they occur, there is no need to give each other a printed receipt, send a text message about the conclusion of such transactions or send an email about a payment having occurred for such transactions. For us, it means fewer (mostly useless) messages to check and less paper to shred.
Financial Infrastructure provides extensive security related to authentication and authorization, so financial frauds associated with actual financial transactions should be impossible. However, if someone makes a claim of fraudulent transaction, those claims can be fully investigated because the financial infrastructure has a complete record of not just those specific transactions but all the transactions of the parties involved in the alleged fraudulent transactions. In such cases both parties to the transaction are fully known and their entire transaction history is known and if necessary all that can be traced to whatever length is necessary. Such a possibility does not exist today.
Note that frauds can be of many kinds and we discussed frauds pertaining to actual transactions themselves. There are other frauds outside the scope of financial transactions like misrepresentation, misinformation, etc.
The final concept in this context is that of Limits. The purpose of "limits" is to limit the amount of damage due to unwise but permitted actions. Simple examples of limits are our current credit card limits and the limits on the amount of cash that we can withdraw in a single day. In the context of organizations, those employees who are authorized to buy things on behalf of the organization may have per-day and per-month limits.
Limits have to be reviewed periodically and then either confirmed or revised.
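A per-day and per-month limit check can be sketched as follows. The class and function names, and the amounts, are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class SpendingLimits:
    per_day: float
    per_month: float


def within_limits(limits, spent_today, spent_this_month, amount):
    """True if a new purchase keeps both running totals within the limits."""
    return (spent_today + amount <= limits.per_day
            and spent_this_month + amount <= limits.per_month)


# An employee authorized to buy on behalf of the organization (assumed amounts).
buyer_limits = SpendingLimits(per_day=500.0, per_month=5000.0)
small_purchase_ok = within_limits(buyer_limits, 100.0, 1200.0, 300.0)
large_purchase_ok = within_limits(buyer_limits, 100.0, 1200.0, 450.0)
```

In this example the second purchase would exceed the per-day limit even though the per-month limit still has room. A periodic review would amount to replacing the `SpendingLimits` values.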