[Linux][Android] Analyzing Memory Usage.


First of all, I am going to avoid methods that rely on shell commands, because in an embedded environment (embedded Linux such as Android) we cannot expect a full-featured shell (e.g. bash, tcsh).
Before talking about memory analysis, let’s look over the fundamental concepts related to memory.
(This article is based on Kernel 2.6.29.)

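There are 4 types of memory:
private clean, private dirty, shared clean and shared dirty.
* Clean vs. Dirty
Clean means “discarding it does not affect the system semantically.” So, the kernel can drop a clean page at any time and load it again later. Usually, mmap()ed files or memory that has never been written fall into this category.
Dirty is exactly the opposite: the page has been modified, so its contents cannot simply be thrown away.
* Private vs. Shared
This is trivial: private memory is used by only one process, while shared memory is mapped by several processes.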

Here is an example from Android:
Shared clean : common dex files.
Private clean : application-specific dex files.
Shared dirty : library “live” dex structures (e.g. Class objects), shared copy-on-write heap. That’s why ‘Zygote’ exists.
Private dirty : application “live” dex structures, application heap.

Linux uses Virtual Memory (henceforth VM). I assume the reader is already familiar with this, so let’s move one step forward. Usually, “demand paging” is used. With demand paging, Linux doesn’t use RAM for a page until that page is actually accessed. What exactly does this mean? Let’s look at the code below.

#define _BUFSZ (1024*1024*10)
static int _mem[_BUFSZ];
int main (int argc, char* argv[]) {
    int i;
    /* --- (*1) --- */
    for(i=0; i<_BUFSZ; i++) {
        _mem[i] = i;
    }
    /* --- (*2) --- */
    return 0;
}

As you can see, “sizeof(_mem)” is sizeof(int)*10*1024*1024 = 40MB (assuming sizeof(int)==4).
But at (*1), _mem has not REALLY been accessed yet, so Linux has not allocated pages for it in RAM. At (*2), every element of _mem has been written, so the pages for _mem are in RAM.
OK? Later, we will confirm this from the Kernel.

Now, let’s move on to the practical stage.
As the reader may already know, Linux has a special file system called procfs. We can get lots of kernel information from procfs, including memory information.
Try “cat /proc/meminfo”.
You will see lots of information about memory. Let’s ignore everything except ‘MemTotal’, ‘MemFree’, ‘Buffers’ and ‘Cached’.
(The descriptions below are quoted from the documentation in the Kernel source.)
———————————————–
MemTotal : Total usable ram (i.e. physical ram minus a few reserved bits and the kernel binary code)
MemFree: The sum of LowFree + HighFree
LowFree: Lowmem is memory which can be used for everything that highmem can be used for, but it is also available for the kernel’s use for its own data structures.  Among many other things, it is where everything from the Slab is allocated.  Bad things happen when you’re out of lowmem.
HighFree: Highmem is all memory above ~860MB of physical memory. Highmem areas are for use by userspace programs, or for the pagecache.  The kernel must use tricks to access this memory, making it slower to access than lowmem.
Buffers: Relatively temporary storage for raw disk blocks; shouldn’t get tremendously large (20MB or so)
Cached: in-memory cache for files read from the disk (the pagecache). Doesn’t include SwapCached.
———————————————–

Now we know the size of total memory, free memory, and so on.
Type ‘adb shell ps’.
We can see the VSIZE (henceforth VSS, Virtual Set Size) and RSS (Resident Set Size) columns. VSS is the amount of virtual address space that the process has mapped. RSS is the amount of memory that is REALLY located in physical memory, that is, the demanded pages.
As mentioned above, the main reason for the difference between VSS and RSS is ‘demand paging’.
Now, let’s sum all the RSS values. Interestingly, the sum of RSS is larger than the total memory size from ‘meminfo’.
Why? Can you guess? Right, because of shared memory. For example, in the case of Android, there are some prelinked objects that are shared among processes, and each process’s RSS includes those shared pages. That’s why the sum of RSS is larger than the memory size.

To understand VSS more deeply, see the following example.
Make an empty program, execute it and check its VSS. For example:

#include <unistd.h>
int main(void) { sleep(1000); return 0; }

Its size is over 1M! Why? Even an empty program has its executable, the C library, the dynamic linker and a stack mapped into its address space, and all of these count toward VSS. On top of that, the kernel reserves memory to handle the process itself, for example the page table and the control block. As an example, in the case of the page table, a normal 32-bit machine uses 4KB pages and 4GB of virtual memory. So, the number of pages is 4G/4K = 1M. To keep track of 1M pages, a certain amount of memory is definitely required.
So, at least some (actually, not a small) amount of memory is required even for a very tiny process.

As mentioned above, RSS includes shared memory blocks. But we want to know a reasonable size for the memory that the process really occupies.
This is where PSS (Proportional Set Size) comes in.

PSS = "Non-shared process private memory" + "Shared memory" / "Number of processes that share it".

Great! So, the sum of PSS over all processes is the real size of the occupied memory.
Then, how can we find it?
The most primitive way is to check

/proc/<PID>/smaps

You can easily find the PSS field in it. (For more details, see ‘fs/proc/task_mmu.c’ in the kernel source.)
==> This article is not complete yet. It will be updated later.
