Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs

Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs

Awesome… View On WordPress
from Tumblr http://ift.tt/1TEv6GJ
via IFTTT

Advertisements

Urlshortener in python

I attempted this(url shortener) project as a kind of a dare. Over a lunch one day, atul casually commented that a url shortener,

would be a ten minute work for someone like you and a blog about it would make a nice fit for my site.

I was not sure it’s a 10 minute work, but sounded like a small fun project for weekend attempts.

It took more than half a day, for me to be content(not happy, but just content) with the resulting work.

Nevertheless, it was time well spent. Here’s the link to the actual project.<a href=”https://github.com/emofeedback/urlshortener”&gt;

 

Few clarifications about some choices(should i call it software engineering??):

  1. Redis because it’s extremely efficient, stays in RAM, and has some unconfirmed reports of being used to power china’s internal twitter clone.
  2. RAM based data base means lower latency, as opposed to a disk+ cache based database, but needs higher RAM.
  3. Used 5 chars, because, well the internet is big, and thought it’s a good number to cover most unique urls.(especially, when combined with the 78 chars allowed for url, i.e: 5^78) Thanks to a colleague’s suggestion for the idea, i was thinking about hashing/random before his suggestion.
  4.  A hash map of shortened url-> original url and original url -> shortened url was kinda against my idea. I still keep thinking this is too much memory, we should find a different way. I think if I take away the use-case of already shortened url, being submitted for new shortening, should return, I can eliminate the original url -> shortened url hash map.
  1. To check if a new original url has already been shortened, the hyperloglog comes to the rescue. Just pass it through the HLL algorithm and see if it returns True and False.
  2. From the UI point, well i just put up a minimal html form to take a original url and return a json.
  3. Yet to do, add a proper fabric script to setup virtualenv, python packages, redis install and nginx config and restart nginx.

You can see it live here.
 

C Arch patterns

There are some patterns you can see in C code.
Mostly signs that give off indications of what are the data structures used.

For example, this is a code from cpython source(Include/object.h):(obfuscated for illustrative purposes)

typedef struct XXX {
struct XXX *nnn;
const char* string;
PyObject *object;
} XXX;

The sign here is that the struct name (XXX here) is used, within the
struct definition for another element. In this case, this element is a pointer to the self. This indicates this is a linked list.

It can be a circular linked list or singly linked list with the last node pointing to NULL. That distinction can be guessed with initial values. Given it’s not being initialized, I would guess this is a circular linked list, but to verify I will have to check parts of the code where this struct is initialized.

On the contrary, if I had needed this to be a doubly linked list instead, I would have defined it this way instead.

typedef struct XXX {
struct XXX *ppp;
struct XXX *nnn;
const char* string;
} XXX;

The point being two pointers to the same structure, but this does not necessarily mean there’s a doubly linked list implementation. If someone’s trying to write Obfuscated C code, they might use this struct but use it differently, on purpose to hinder code readability.

Oh, by the way, this could also be made into a binary-tree structure by manipulating the insertions into *ppp, *nnn, as new structs. In case of making this a binary tree, we would need a check to make sure no new object linked to *nnn(and *ppp) is already in the structure. Infact it would be more apt naming those *left and *right than *ppp, *nnn.

i.e:

typedef struct XXX {
struct XXX *left;
struct XXX *right;
const char* nodelabel;
} XXX;

Note purely in terms of the type of elements composing/constituting the struct, there’s no difference between this and the above doubly linked list. However, the difference will arise out of how these struct pointers are inserted.
#TODO: CODE SAMPLE HERE.

Now, I’ll graduate to the next data structure, I believe is logically next, (and is one of my favourite) data structure — Graphs.

Linked lists, are simply a very specific case of a graph(which can be called a specific case of sets).
Linked lists are simply graphs restricted to only one edge,(directed) between any two nodes.
Ofcourse, am assuming planar graph<link>,(i.e: one edge can only connect two nodes) and not hyper-graph<link>(think nodes in 3D, and edges that can connect more than “2 nodes”.)

So, Opencog<link> is a open-source attempt to create an Artificial General Intelligence system.
At the core of it is a hypergraph<link to why hypergraphs>(aka atomspace).
Note: this is a C++ project.

Here’s the code for defining an atom:

class Atom
: public std::enable_shared_from_this<Atom>
{
friend class ::AtomUTest; // Needs to call setFlag()
friend class AtomTable; // Needs to call MarkedForRemoval()
friend class ImportanceIndex; // Needs to call setFlag()
friend class Handle; // Needs to view _uuid
friend class SavingLoading; // Needs to set _uuid
friend class TLB; // Needs to view _uuid

private:
//! Sets the AtomTable in which this Atom is inserted.
void setAtomTable(AtomTable *);

//! Returns the AtomTable in which this Atom is inserted.
AtomTable *getAtomTable() const { return _atomTable; }

protected:
UUID _uuid;
AtomTable *_atomTable;

Type _type;
char _flags;

TruthValuePtr _truthValue;
AttentionValuePtr _attentionValue;

// Lock needed to serialize AV changes.
// Just right now, we will use a single shared mutex for all
// locking. If this causes too much contention, then we can
// fall back to a non-global lock, at the cost of 40 additional
// bytes per atom.
static std::mutex _avmtx;
static std::mutex _mtx;

/**
* Constructor for this class.
*
* @param The type of the atom.
* @param Outgoing set of the atom, that is, the set of atoms this
* atom references. It must be an empty vector if the atom is a node.
* @param The truthValue of the atom. note: This is not cloned as
* in setTruthValue.
*/
Atom(Type, TruthValuePtr = TruthValue::NULL_TV(),
AttentionValuePtr = AttentionValue::DEFAULT_AV());

struct InSet
{
// incoming set is not tracked by garbage collector,
// to avoid cyclic references.
// std::set<ptr> uses 48 bytes (per atom).
IncomingSet _iset;
// Some people want to know if the incoming set has changed...
AtomPairSignal _addAtomSignal;
AtomPairSignal _removeAtomSignal;
};
typedef std::shared_ptr<InSet> InSetPtr;
InSetPtr _incoming_set;
void keep_incoming_set();
void drop_incoming_set();

// Insert and remove links from the incoming set.
void insert_atom(LinkPtr);
void remove_atom(LinkPtr);

private:
/** Returns whether this atom is marked for removal.
*
* @return Whether this atom is marked for removal.
*/
bool isMarkedForRemoval() const;

/** Returns an atom flag.
* A byte represents all flags. Each bit is one of them.
*
* @param An int indicating which of the flags will be returned.
* @return A boolean indicating if that flag is set or not.
*/
bool getFlag(int) const;

/** Changes the value of the given flag.
*
* @param An int indicating which of the flags will be set.
* @param A boolean indicating the new value of the flag.
*/
void setFlag(int, bool);

//! Marks the atom for removal.
void markForRemoval();

//! Unsets removal flag.
void unsetRemovalFlag();

public:

virtual ~Atom();

/** Returns the type of the atom.
*
* @return The type of the atom.
*/
inline Type getType() const { return _type; }

/** Returns the handle of the atom.
*
* @return The handle of the atom.
*/
inline Handle getHandle() {
return Handle(shared_from_this());
}

/** Returns the AttentionValue object of the atom.
*
* @return The const reference to the AttentionValue object
* of the atom.
*/
AttentionValuePtr getAttentionValue() const { return _attentionValue; }

//! Sets the AttentionValue object of the atom.
void setAttentionValue(AttentionValuePtr) throw (RuntimeException);

/** Returns the TruthValue object of the atom.
*
* @return The const referent to the TruthValue object of the atom.
*/
TruthValuePtr getTruthValue() const { return _truthValue; }

//! Sets the TruthValue object of the atom.
void setTruthValue(TruthValuePtr);
void setTruthValue(CompositeTruthValuePtr ctv) {
setTruthValue(std::static_pointer_cast<TruthValue>(ctv));
}

//! Return the incoming set of this atom.
IncomingSet getIncomingSet() const;

/** Returns a string representation of the node.
*
* @return A string representation of the node.
*/
virtual std::string toString(std::string indent = "") const = 0;
virtual std::string toShortString(std::string indent = "") const = 0;

/** Returns whether two atoms are equal.
*
* @return true if the atoms are equal, false otherwise.
*/
virtual bool operator==(const Atom&) const = 0;

/** Returns whether two atoms are different.
*
* @return true if the atoms are different, false otherwise.
*/
virtual bool operator!=(const Atom&) const = 0;
};

Whoa, C++, reputation of being verbose and fearsome is justified, i think. Now let’s break this down.
# TODO: EXPLAIN THE STRUCT
Let’s now check one of my favourite project redis, And a data structure/algorithm that fascinates me. The hyperloglog, HLL


struct hllhdr {
char magic[4]; /* "HYLL" */
uint8_t encoding; /* HLL_DENSE or HLL_SPARSE. */
uint8_t notused[3]; /* Reserved for future use, must be zero. */
uint8_t card[8]; /* Cached cardinality, little endian. */
uint8_t registers[]; /* Data bytes. */
};

Now this looks simple, and some of you might think it’s deceptively simple.(Don’t it’s straight forward simple). The real magic is actually in how these values are computed. There’s a reason HLL is called an algorithm and not considered a data structure.
#TODO: MORE ELABORATION

Software codebase as an organism/human child

It’s a useful metaphor for provoking thought. It’s all from here

Software

Here’s what no one tells you when you graduate with a CS degree and take up a job in software engineering:

The computer is a machine, but a codebase is an organism.

This should make sense to anyone who’s worked on a large software project. Computer science is all about controlling the machine — making it do what you want, on the time scale of nano- and milliseconds. But software engineering is more than that. It’s also about nurturing a codebase — keeping it healthy (low entropy) as it evolves, on a time scale of months and years.

Like any organism, a codebase will experience both growth and decay, and much of the art of software development lies in learning to manage these two forces.

Growth, for example, isn’t an unmitigated good. Clearly a project needs to grow in order to become valuable, but unfettered growth can be a big problem. In particular, a codebase tends to grow opportunistically, by means of short-sighted local optimizations. And the bigger it gets, the more ‘volume’ it has to maintain against the forces of entropy. Left to its own devices, then, a codebase will quickly devolve into an unmanageable mess, and can easily collapse under its own weight.

Thus any engineer worth her salt soon learns to be paranoid of code growth. She assumes, correctly, that whenever she ceases to be vigilant, the code will get itself into trouble. She knows, for example, that two modules tend to grow ever more dependent on each other unless separated by hard (‘physical’) boundaries.

(Of course all code changes are introduced by people, by programmers. It’s just a useful shortcut to pretend that the code has its own agenda.)

Faced with the necessity but also the dangers of growth, the seasoned engineer seeks a balance between nurture and discipline. She can’t be too permissive — coddled code won’t learn its boundaries. But she also knows not to be too tyrannical. Code needs some freedom to grow at the optimal rate.

She also understands how to manage code decay. She has a good ‘nose’ for code smells: hints that a piece of code is about to take a turn for the worse. She knows about code rot, which is what happens when code doesn’t get enough testing/execution/exercise. (Use it or lose it, as they say.) She’s seen how bad APIs can metastasize across the codebase like a cancer. She even knows when to amputate a large piece of code. Better it should die than continue to bog down the rest of the project.

Bottom line: building software isn’t like assembling a car. In terms of growth management, it’s more like raising a child or tending a garden. In terms of decay, it’s like caring for a sick patient.

And all of these metaphors help explain why you shouldn’t build software using the factory model.

So the question is how “grown-up” is your codebase? Terrible twos? middle school? Teenaged? Left home early adult? Or more than that?

ELF file– Tentative Symbols

Tentative symbols:
I was messing around with an .o object file and exploring it.
I was having trouble with modifying redis source, and ended up
doing an objdump -t on the object.o.
I noticed an entry ‘bloom ‘ with the section classification as
‘*COM*’.
So what exactly does *COM* section mean in a elf object file?
After googling and traipsing through a list of links,on google page,
and stack overflow questions, I finally found this meaningful.

Unfortunately, it seems to be a documentation for solaris, or atleast written by the solaris team.
Any case, it’s meaningful. The basic reasoning being if the linker finds any variable at the file scope level,
that is not initialized or declared extern, it assumes that variable is defined in another source file.
And therefore while creating the ELF object file it marks that variable as COMMON blocks.
As it turns out the name COMMON blocks, originated while linking fortran files.
It seems it used to be a common practice in the days of Fortran based program compiling and linking.

Make lessons

I was messing around with opencv library for C++ and it’s interface.
Very soon, after I began typing out code from web pages and compiling, I got tired of running the compilation command manually from the terminal.

Not to mention, since I had built opencv from source, I had to explicitly pass -I folder.

Soon, I was wishing I can just run make and get the code to run.

Well, instead of that I figured out a way to run make and just compile and create all object files in the current folder.

Here’s the MakeFile I ended up using:

PROGRAMS_SRC=$(wildcard *.c)
TARGETS=$(patsubst %.c, %,$(PROGRAMS_SRC))
CPPFLAGS=I/usr/local/include/opencv2

all:
$(foreach var,$(PROGRAMS_SRC),g++ $(var) -o $(patsubst %.c, %,$(var)) -$(CPPFLAGS); )

The two key things I learnt were patsubst and foreach. I had seen patsubst and wildcard, while running through LCTHW before.
As the name implies, wildcard keyword is a regex to match a bunch of files.
patsubst is again regex, but this time to replace a set of text.

foreach is simply a way to iterate over a list of values in a variable. I just end up running the g++ command for each of the file.

Note: Am using .c extension here, though i should really be naming the files .cpp and using that in this makefile.
Note2: This assumes all of your code is confined to one file(except for library imports). Otherwise this just won’t work.

Deep C — activation frames/ stack frames

Ok, this ‘Deep C programming’ — Slide talk suggests that(slide 136) activation frames and execution stack are two very different concepts in the executable created by compiling and linking a C program.

I was curious and looked up activation frames in C, which led me to here. Ofcourse, before i began this journey, I only had heard of stack memory used by a program, and had an understanding that it is one part of the initial memory allocated to the process by the OS, that is used as a stack fashion.(i.e LIFO).

The university of calgary link mixes the terms activation,stack, frame, and execution rather freely. Something’s up. Though first let me try out the example in that Univ. of Calgary link and see what happens.

Here’s the code:, compilation and running it.

#include 

void print_facts(int num1, int num2);
int max_of_two(int j, int k);
double avg_of_two(int c, int d);

int main(void)
{
  int i;
  int j;
  /* point  1 */
  i = -8;
  j = 7;
  /* point  2 */
  print_facts(i, j);
  /* point 10 */
  return 0;
}

void print_facts(int num1, int num2)
{
  int larger;
  double the_avg;
  /* point  3 */
  larger = max_of_two(num1, num2);
  /* point  6 */
  the_avg = avg_of_two(num1, num2);
  /* point  9 */
  printf("For the two integers %d and %d,n", num1, num2);
  printf("the larger is %d and the average is %g.n",
         larger, the_avg);
}

int max_of_two(int j, int k)
{
  /* point  4 */
  if (j < k)
    j = k;
  /* point  5 */
  return j;
}

double avg_of_two(int c, int d)
{
  double sum;
  /* point  7 */
  sum = c + d;
  /* point  8 */
  return (c + d) / 2.0;
}

Compile(make), and run executable:

anand@anand-usb-boot:C [master] $ make activation_record_demo
gcc --std=c99 -g -pedantic -Wall -lm    activation_record_demo.c   -o activation_record_demo
activation_record_demo.c: In function ‘avg_of_two’:
activation_record_demo.c:50:12: warning: variable ‘sum’ set but not used [-Wunused-but-set-variable]
anand@anand-usb-boot:C [master] $ ./activation_record_demo 
For the tow integers -8 and 7, 
the larger is 7 and the average is -0.5.

Now comes the fun part: We need to be able to examine the stack, during the execution of the program.
GDB to the rescue.

Here’s the gdb session i went through:

anand@anand-usb-boot:C [master] $ gdb64 ./activation_record_demo
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/anand/workspace/github_stuff_public/Miscellaneous/C/activation_record_demo...done.
(gdb) b 11
Breakpoint 1 at 0x400534: file activation_record_demo.c, line 11.
(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/activation_record_demo 

Breakpoint 1, main () at activation_record_demo.c:12
12	    i = -8;
(gdb) frame
#0  main () at activation_record_demo.c:12
12	    i = -8;
(gdb) n
13	    j = 7;
(gdb) n
16	    print_facts(i,j);
(gdb) frame
#0  main () at activation_record_demo.c:16
16	    print_facts(i,j);
(gdb) s
print_facts (num1=-8, num2=7) at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) frame
#0  print_facts (num1=-8, num2=7) at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) s
max_of_two (j=-8, k=7) at activation_record_demo.c:42
42	    if (j < k)
(gdb) frame
#0  max_of_two (j=-8, k=7) at activation_record_demo.c:42
42	    if (j < k)
(gdb) frame 0
#0  max_of_two (j=-8, k=7) at activation_record_demo.c:42
42	    if (j < k)
(gdb) frame 
#0  max_of_two (j=-8, k=7) at activation_record_demo.c:42
42	    if (j < k)
(gdb) down
Bottom (innermost) frame selected; you cannot go down.
(gdb) s
43	        j = k;
(gdb) s
45	    return j;
(gdb) s
46	}
(gdb) s
print_facts (num1=-8, num2=7) at activation_record_demo.c:31
31	    the_avg = avg_of_two(num1,num2);
(gdb) s
avg_of_two (c=-8, d=7) at activation_record_demo.c:52
52	    sum = c + d;
(gdb) s
54	    return (c+d) / 2.0;
(gdb) frame
#0  avg_of_two (c=-8, d=7) at activation_record_demo.c:54
54	    return (c+d) / 2.0;
(gdb) s
55	}
(gdb) s
print_facts (num1=-8, num2=7) at activation_record_demo.c:34
34	    printf("For the tow integers %d and %d, n",num1,num2);
(gdb) n
For the tow integers -8 and 7, 
35	    printf("the larger is %d and the average is %g. n",larger,the_avg);
(gdb) n
the larger is 7 and the average is -0.5. 
37	}
(gdb) c
Continuing.
[Inferior 1 (process 24420) exited normally]
(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/activation_record_demo 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000

Breakpoint 1, main () at activation_record_demo.c:12
12	    i = -8;
(gdb) s
13	    j = 7;
(gdb) s
16	    print_facts(i,j);
(gdb) frame
#0  main () at activation_record_demo.c:16
16	    print_facts(i,j);
(gdb) s
print_facts (num1=-8, num2=7) at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) frame
#0  print_facts (num1=-8, num2=7) at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) info frame
Stack level 0, frame at 0x7fffffffddb0:
 rip = 0x400566 in print_facts (activation_record_demo.c:28); 
    saved rip 0x400551
 called by frame at 0x7fffffffddd0
 source language c.
 Arglist at 0x7fffffffdda0, args: num1=-8, num2=7
 Locals at 0x7fffffffdda0, Previous frame's sp is 0x7fffffffddb0
 Saved registers:
  rbp at 0x7fffffffdda0, rip at 0x7fffffffdda8
(gdb) s
max_of_two (j=-8, k=7) at activation_record_demo.c:42
42	    if (j < k)
(gdb) s
43	        j = k;
(gdb) s
45	    return j;
(gdb) info frame
Stack level 0, frame at 0x7fffffffdd80:
 rip = 0x4005e6 in max_of_two (activation_record_demo.c:45); 
    saved rip 0x400575
 called by frame at 0x7fffffffddb0
 source language c.
 Arglist at 0x7fffffffdd70, args: j=7, k=7
 Locals at 0x7fffffffdd70, Previous frame's sp is 0x7fffffffdd80
 Saved registers:
  rbp at 0x7fffffffdd70, rip at 0x7fffffffdd78
(gdb) frame
#0  max_of_two (j=7, k=7) at activation_record_demo.c:45
45	    return j;
(gdb) up
#1  0x0000000000400575 in print_facts (num1=-8, num2=7)
    at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) frame
#1  0x0000000000400575 in print_facts (num1=-8, num2=7)
    at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) info frame
Stack level 1, frame at 0x7fffffffddb0:
 rip = 0x400575 in print_facts (activation_record_demo.c:28); 
    saved rip 0x400551
 called by frame at 0x7fffffffddd0, caller of frame at 0x7fffffffdd80
 source language c.
 Arglist at 0x7fffffffdda0, args: num1=-8, num2=7
 Locals at 0x7fffffffdda0, Previous frame's sp is 0x7fffffffddb0
 Saved registers:
  rbp at 0x7fffffffdda0, rip at 0x7fffffffdda8
(gdb) up
#2  0x0000000000400551 in main () at activation_record_demo.c:16
16	    print_facts(i,j);
(gdb) info frame
Stack level 2, frame at 0x7fffffffddd0:
 rip = 0x400551 in main (activation_record_demo.c:16); 
    saved rip 0x7ffff7a33ea5
 caller of frame at 0x7fffffffddb0
 source language c.
 Arglist at 0x7fffffffddc0, args: 
 Locals at 0x7fffffffddc0, Previous frame's sp is 0x7fffffffddd0
 Saved registers:
  rbp at 0x7fffffffddc0, rip at 0x7fffffffddc8
(gdb) down
#1  0x0000000000400575 in print_facts (num1=-8, num2=7)
    at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) down
#0  max_of_two (j=7, k=7) at activation_record_demo.c:45
45	    return j;
(gdb) down
Bottom (innermost) frame selected; you cannot go down.
(gdb) s
46	}
(gdb) s
print_facts (num1=-8, num2=7) at activation_record_demo.c:31
31	    the_avg = avg_of_two(num1,num2);
(gdb) n
34	    printf("For the tow integers %d and %d, n",num1,num2);
(gdb) n
For the tow integers -8 and 7, 
35	    printf("the larger is %d and the average is %g. n",larger,the_avg);
(gdb) n
the larger is 7 and the average is -0.5. 
37	}
(gdb) c
Continuing.

Note the output of the frame commands.
It print a number followed by an address, followed by the caller function, and the source code file and line no.
The next line prints out the actual source code at that line.

Interestingly look at the address differences when i move up the stack using up/down.

For ex:

(gdb) info frame
Stack level 0, frame at 0x7fffffffdd80:
 rip = 0x4005e6 in max_of_two (activation_record_demo.c:45); 
    saved rip 0x400575
 called by frame at 0x7fffffffddb0
 source language c.
 Arglist at 0x7fffffffdd70, args: j=7, k=7
 Locals at 0x7fffffffdd70, Previous frame's sp is 0x7fffffffdd80
 Saved registers:
  rbp at 0x7fffffffdd70, rip at 0x7fffffffdd78
(gdb) frame
#0  max_of_two (j=7, k=7) at activation_record_demo.c:45
45	    return j;
(gdb) up
#1  0x0000000000400575 in print_facts (num1=-8, num2=7)
    at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);
(gdb) frame
#1  0x0000000000400575 in print_facts (num1=-8, num2=7)
    at activation_record_demo.c:28
28	    larger = max_of_two(num1,num2);

Here, i move up the stack twice. The stack address which originally was at

0x7fffffffdd80

moves to

0x0000000000400575

, At first look it seems bizarre, till you realize there’s a return and the first stack address located function returns.

Deep C++ —- virtual table

Moving further along in that ridiculously long, too many slides unneccessarily pdf Deep C, we come across Deep C++ .


#include

struct X
{
int a;
char b;
int c;

};

int main(void)
{
std:: cout << sizeof(X) << std::endl;
return 1;

}

Now that code is no different from C by much(Except for those std::cout instead of printf) and so should print out the same size as padded structs in C. (see prev post)
So it does:

 g++ cpp_vtable.cpp  -o cpp_vtable
anand@anand-usb-boot:C [master] $ gdb64 cpp_vtable
Reading symbols from /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable...(no debugging symbols found)...done.

(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
12

Now what happens if we add a member function to that struct? Let’s see:
Am adding this line to the end of the struct.

void set_value(int v) {a = v; }

Well it still prints the same 12. What does this mean? According to the pdf, normal member functions to C++ are just syntactic sugar and are equivalent to C functions with having an additional argument in the first position as being the struct which they are a member of.

That’s why we get this output:

 g++ cpp_vtable.cpp  -o cpp_vtable
anand@anand-usb-boot:C [master] $ gdb64 cpp_vtable
Reading symbols from /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
12

OK that no loadable sections found warning from gdb is weird.. but otherwise the output is the same 12 as expected.

Now let’s go ahead and make it a virtual function instead.


virtual void set_value(int v) {a = v; }

And here’s the result of compiling and running it


anand@anand-usb-boot:C [master] $ g++ cpp_vtable.cpp  -o cpp_vtable
anand@anand-usb-boot:C [master] $ gdb64 cpp_vtable

(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
24
[Inferior 1 (process 8290) exited with code 01]
(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000

Breakpoint 1, 0x0000000000400830 in main ()

Hmm we still get the same warning, but the size has been doubled. It must be the virtual table pointer. But how do we know, it’s not allocating memory separately for a virtual functions. Time to add another func to the source code.

So here’s how the new struct looks like:

struct X
{
int a;
char b;
int c;
virtual void set_value(int v) {a = v; }
virtual int get_value(int v) {return a; }
virtual void increase_value() { a++;}

};

And here’s the output of running the program.

anand@anand-usb-boot:C [master] $ g++ cpp_vtable.cpp  -o cpp_vtable
anand@anand-usb-boot:C [master] $ gdb64 cpp_vtable
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu

(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
24
[Inferior 1 (process 8353) exited with code 01]
(gdb) 

You can see that the size is the same as previous code 24. So it confirms if there’s a virtual function, there’s a pointer to a virtual table created as part of the structure. Infact, this same virtual table is used for verifying that inheriting classes, ensure overriding the virtual functions.

By inductive reasoning the next step is to simply add another struct to the code and see what happens.

Here’s what happens, if i copy paste the same struct as struct Y and keep only the get_value virtual function.

anand@anand-usb-boot:C [master] $ g++ cpp_vtable.cpp  -o cpp_vtable
anand@anand-usb-boot:C [master] $ gdb64 cpp_vtable
(gdb) run
Starting program: /home/anand/workspace/github_stuff_public/Miscellaneous/C/cpp_vtable 
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
24
24
[Inferior 1 (process 8674) exited with code 01]
(gdb) quit

Just as expected it’s the same size, and is printed as we coded. :) . So each struct really does have a separate virtual table. We have nullified the hypothesis that all structs/classes share a virtual table.

Next, test would be to see if inherited classes shared a vtable or not, but i’ll save that for some other time.

Deep C- struct packing

Once again, from Deep C. The size of structs.

The basic idea is to test the knowledge that sometimes compilers do optimizations on how the structure is situated in memory. According to the pdf, this is because most hardware architectures are optimized for byte sized access, and so compilers generally pad structures and their elements so that they can be accessed in a byte wise manner. Now this may have been true sometime back, and perhaps still true in a statistical majority sense, but with the profusion of mobile computing devices, I suspect new architectures might have a more fine-grained(than word) access. Anyway,that’s marked for future research. Besides, given how much we have shrunk memory, that’s probably not the case for Smartphones and tablet. They probably still use word optimized architectures. I suppose the older (voice calls only) phones had architectures where this was false. I also think other microcontroller based devices like RSA Security code generators, active noise cancelling chips on headphones/earphones etc.. (aka, wherever memory is still a constraint) this would make a difference, and their architectures might not have inefficient access for sub-word memory accesses.

For now, I’ll just go on with the code and a demo here.


#include

struct X { int a; char b; int c;};

int main(void)
{
printf("%zun",sizeof(int));
printf("%zun",sizeof(char));
printf("%zun",sizeof(struct X));

}

And here’s the output with the minimal set of compilation flags.

anand@anand-usb-boot:C [master] $ gcc struct_padding.c -o struct_padding
anand@anand-usb-boot:C [master] $ ./struct_padding 
4
1
12

Now am running a 64-bit machine, but with 32-bit mode.(Am just reverse guessing since it printed 12, but doesn’t matter for our purposes.
As expected the struct is padded to 12 bytes. Instead of the sum of it’s parts, which is 4+1+4 = 9.

Now let’s add the -fpack-struct option to the gcc command.

anand@anand-usb-boot:C [master] $ gcc struct_padding.c -o struct_padding -fpack-struct
anand@anand-usb-boot:C [master] $ ./struct_padding 
4
1
9

Voila, there it is. the size as our mental model says it should be.
Let’s go add another char pointer to the struct

#include

struct X { int a; char b; int c; char *d;};

int main(void)
{
printf("%zun",sizeof(int));
printf("%zun",sizeof(char));
printf("%zun",sizeof(char *));
printf("%zun",sizeof(struct X));

}


Here’s the output with

anand@anand-usb-boot:C [master] $ gcc struct_padding.c -o struct_padding
anand@anand-usb-boot:C [master] $ ./struct_padding 
4
1
8
24

Here’s the output with -fpack-struct:

gcc struct_padding.c -o struct_padding -fpack-struct
anand@anand-usb-boot:C [master] $ ./struct_padding 
4
1
8
17

That 17 with pack-struct is expected. Note that the pointer size is 8 and without pack-struct it pads up from 20 to 24.

Another good link I came across is this

Calling conventions, ABIs

Well, In another set of weird “following my nose”* instances, I end reading up about cdecl calling conventions for x86.

Turns out there’s more layers involved in between the conversion of C-code to executable than the ones I was aware of.
Not only that there’s a compiler and linker stage, they are there to provide abstractions about the hardware.

Now, that’s not new, anyone who took a basic peek at Operating Systems course can tell you that.
The interesting part according to this wikipedia page
is that there’s something called an Application Binary Interface.

And then there are calling conventions. These turn out to be a set of standards agreed on by hardware manufacturers.
It turns out they needed such a thing because before microcomputers, each hardware manufacturer used to supply his own OS and compilers.

Anyway, to be specific about the cdecl, the core part or according to the wikipage the core part is,
how the caller function, callee function is translated to assembly.

And as the page says the basic idea is to push some values on to the stack by the caller function, call and once returned, clean up the stack.


caller:
    push    ebp
    mov     ebp, esp
    push    3
    push    2
    push    1
    call    callee
    add     esp, 12
    add     eax, 5
    pop     ebp
    ret

Anyway, this obviously led on to one more click and read on wikipage for ABIs.
Ok, I read this page, but most of the sentences just don’t hit any connections in my brain. No wonder, I don’t have all the context or experience,
involved in working at the System internals levels. So this I won’t bother re-phrasing/summarizing.
Here’s the verbatim of Wiki page:
ABIs cover details such as:

the sizes, layout, and alignment of data types
the calling convention, which controls how functions’ arguments are passed and return values retrieved; for example, whether all parameters are passed on the stack or some are passed in registers, which registers are used for which function parameters, and whether the first function parameter passed on the stack is pushed first or last onto the stack
how an application should make system calls to the operating system and, if the ABI specifies direct system calls rather than procedure calls to system call stubs, the system call numbers
and in the case of a complete operating system ABI, the binary format of object files, program libraries and so on.

A complete ABI, such as the Intel Binary Compatibility Standard (iBCS),[1] allows a program from one operating system supporting that ABI to run without modifications on any other such system, provided that necessary shared libraries are present, and similar prerequisites are fulfilled.

Other ABIs standardize details such as the C++ name mangling,[2] exception propagation,[3] and calling convention between compilers on the same platform, but do not require cross-platform compatibility.

EABI

An embedded-application binary interface (EABI) specifies standard conventions for file formats, data types, register usage, stack frame organization, and function parameter passing of an embedded software program.

The main differences of an EABI with respect to an ABI for general purpose operating systems are that privileged instructions are allowed in application code, dynamic linking is not required (sometimes it is completely disallowed), and a more compact stack frame organization is used to save memory.[4]

* — I think the trail I followed was looking for bugs on cpython bugs page,
while came across one about ctypes module, went fishing after it, mucked around ctypes module and the examples provided there,
Ran into some error(original example was tried on windows + cygwin env), tried to figure out that problem and ended up cdecl,
while reading some blog post about ctypes usage, duh…