Sunday, April 19, 2020

statistical and password-based memory protection

I've recently had an interesting online conversation with Terry Lambert (whom I know from the FreeBSD times) about the memory protection that resulted in I think a new and interesting idea, that I want to write down here.

Terry was talking about the statistical memory protection. I'm not sure if he invented it, or at least the term, since I can't find it anywhere else. Terry gives the related links but none of them are exactly about that:


The statistical memory protection works by having a large common address space where the protection is achieved by allocating memory at random addresses, so the address itself serves as a password. Plus the same physical page can be mapped at different addresses with different permissions, so knowing one address would give the write access while another one the read-only access.

As far as I see it, the major benefit of this system would be in the ability to pass the complex data structures linked by pointers between processes. With the address map being common, the pointers would work in any process. This gives cheap and convenient "mailboxes" for exchanging messages between processes.

This would have multiple potential problems too.

One is the “password leakage” that is also mentioned in one of the papers. Once a password gets out, it can be saved and passed around. With the separate addresses for RO and RW access to the same page this could get pretty bad:  if one of these pointers in an RO page accidentally refers to the RW address, the permission leaks. And actually the whole passing of pointer-connected structures breaks down because the writer and reader would have to use different pointers (with RO and RW addresses). 

Another issue, the memory segments would have to be allocated at the globally unique random addresses, which would have to be a synchronized operation. Considering that the segments have to be spaced far enough form each other, that would consume a sizable number of bits and degrade security of the passwords. 

This would also be susceptible to the brute-force attacks: maybe the attack won’t enumerate the whole address space but it can start with a random point and just see what can be found there. It won’t always be lucky, but sometimes it would. Another way of the attack would be to stick incorrect pointers into the shared pages, hoping that the other process reading the pointer from that page would cause some damage.

The brute-force attacks can be frustrated by keeping a fault counter, and killing the process that has too many faults, but it's harder than it seems at first. The fault counters would have to have a scope greater than a process. Because if you kill a process, nothing stops the user from spawning another one. Also nothing would stop the user from spawning a lot of processes, each doing fewer probes than the fault limit and exiting. But on the other hand, with a greater scope, a malicious application would be able to screw the user completely, so drawing this scope boundary won’t be easy. A possible solution for that problem is to introduce a delay in process's scheduling after a fault, and these delays can even be made to increase if user's global rate is high. This would dramatically slow down the attacks without affecting the correctly running code.

There is one more reason for why the fault counting would have issues: if we allow to transfer the pointers through the global memory, there is no way to know whose fault is the faulty pointer, did the current process get it from someone else?

I've come up with an idea to improve on the statistical memory protection, let's call it password-based memory protection: the same can be achieved in a better and cheaper way by associating a long (say, 128 bit) “password” with a page, effectively making the addresses in the “password-protected” pages 192-bit long (64 bit for the address and 128-bit password). There can be separate passwords for reading-only and reading-writing a page. Possibly execute-only too. But the passwords can be kept separate in some context, so the “short” 64-bit part of the address would be the same for the reader and writer. The context can be inherited from the page: when we load a pointer from a page, we set in the CPU register the password part of it equal to the password of the page. This would also allow to catch the malicious addresses in the shared pages by mismatch of the password. And the whole address space can be split into the private and shared parts, with only the shared part using the passwords. This would allow the private segments to allocate pages without getting a global lock on the address space. And the whole 128-bit space can be used for the password.

It would still be randomly susceptible to the brute-force attacks, as well as to the password leakage.

Some progress can be made on the fault-blaming through. If only the pointers without passwords cold be transferred through the password-protected memory then when we read such a pointer, we can implicitly add the password of the protected page to it and immediately verify that the pointer is valid, that this password also matches the page referred by the pointer. If it doesn't, it means that someone tried to slip in an invalid pointer. If our process has read-only access to the page, we can even say for sure that if was another process's fault and not penalize this one (unless of course it mapped the same page twice, and the other mapping is read-write).


Let's consider in more detail how the pointers would work. For the pointers-with-passwords let's use the "far" keyword from the days of MS-DOS. The common (AKA "near") pointers without password would be just pointers without a special keyword.
 

int *pn; // a common pointer
int far *pf; // a far pointer that includes the password

They generally can't be cast to each other, a far pointer has to be explicitly converted from a near pointer and a password:

pf = make_far(pn, password);

Typically we would use the common pointers in the structures that get passed through the password-protected memory, to store only the address but not password:

 
struct data {
  int *p1;
  int *p2;
};

This would also allow to use the same structure in the normal and password-protected memory.

But when  a common pointer gets accessed through a far pointer, it gets treated in a special way: essentially, it becomes implicitly converted to a far pointer.

 
data far *dpf;
data *dpn; 

pn = dpf->p1; // incorrect, won't compile
dpn->p1 = dpf->p1; // incorrect, won't compile
pf = dpf->p1; // correct, implicitly copies the password 
  // from dpf to pf and verifies that pf is valid, throws an
  // exception if invalid
dpf->p1 = pn; // incorrect, won't compile
dpf->p1 = dpn->p1; // incorrect, won't compile
dpf->p1 = pf; // correct, verifies that dpf and pf have
  // the same password (throws an exception if the
  // passwords differ), and drops the password when
  // writing into p1
dpf->p2 = dpf->p1; // correct

It would actually be more correct to say that the reference to a pointer accessed through a far pointer turns implicitly into a reference to a far pointer, and handles the conversion between the kinds of pointers.

This of course wouldn't stop someone from copying pointers as data to or from the password-protected memory via memcpy (though it would of course have to be a special version of memcpy that accepts the far pointers as arguments), but that would be their own problem of forfeiting the hardware help in the protection.

On the assembly level, this can be done with the addition of the password registers in the CPU along with the instructions for accessing the password-protected pages. Something like this:


 
  ; load the address part of the far pointer
  load (r1), r2
  ; load the password part of the far pointer into a password register
  loadpwd (r1+16), pr1
  ; read a value at a password-protected  far pointer
  loadpp r2, pr1, r3
  ; assuming that the read value was a pointer, verify that it was correct,
  ; it must be pointing to a page that uses the same password from pr1
  verifyp r3, pr1, failure_label
  ; or maybe read and verify in one instruction
  loadpv r2, pr1, r3, failure_label
  ; store a far pointer
  store r3, (r5)
  storepwd pr1, (r5+16)


The next development to prevent the brute-force attacks and password leaks is to have the OS to track on the side, access to which segments is granted to the process. 

Then the problem of the leaked passwords can be solved by changing them periodically. Instead of keeping the pair (password, address) in the far pointer, place the all the passwords for all the granted segments into a canonical location in the process. Then represent the far pointer as  the pair (address of the password, address of the data). Whenever a password needs to be changed, change it in the canonical location (of which the OS would keep track). And when an access exception happens due to a stale password in the CPU register, the kernel can check the access rights of the process against the canonical list. If the rights are present, it can update the password in the register and return to retry the access. 

But a consequence of keeping this special section is that the passwords themselves can be stored in a protected section of memory, otherwise inaccessible in the user mode. The password registers in the CPU would have two parts: the address of the password in the special section that can be loaded and stored by the process and the shadow value of the password itself, invisible to the process and used to check the access. Then the users won’t be able to brute-force the passwords at all because they won't be ever able to set the arbitrary passwords, all the password management would happen in the OS. 

Then if the passwords are not user-visible, they don’t have to be long any more, 64 bits, or maybe even 32 bits would suffice as long as they are all unique. Another consequence would be that changing the password would not be needed most of the time, its only use would be to revoke someone’s access.

But wait, there is more. If we don't have the special hardware available, instead of checking on every access we can check when mapping the page into the address space of the process. It’s just that the shared pages would be mapped at the same virtual address in every process. So we’ve made a full circle back to the existing solution here. It feels like not disturbing the page table would be more efficient, but actually not: the segment would have to be allocated first anyway before making it available to any process, and if the page map is shared between all the processes, that means potentially disturbing all the processes, not only the ones that will use that segment. 

The only other part of the implementation would be the verification that the pointers received in the shared pages are safe, that they point to the pages of he same source. This can be done in user space, allocating a “page id map”, where each page address is direct-mapped to the “source id” of this page (say, 4 bytes each). Look up the ids of the page that contained the pointer and of the pointed page, and verify that the ids are the same. The map itself can be sparse, the unused parts mapped to an all-0 page, so it would be cheap. Some kernel help might be useful for handling the sparseness and making sure that the page id map is consistent with the page table. The compiler support with the "far pointers" would also still be useful for the automated verification, even with the user space solution.