C Arch patterns

There are some patterns you can see in C code.
Mostly signs that give off indications of what are the data structures used.

For example, this is a code from cpython source(Include/object.h):(obfuscated for illustrative purposes)

typedef struct XXX {
struct XXX *nnn;
const char* string;
PyObject *object;
} XXX;

The sign here is that the struct name (XXX here) is used, within the
struct definition for another element. In this case, this element is a pointer to the self. This indicates this is a linked list.

It can be a circular linked list or singly linked list with the last node pointing to NULL. That distinction can be guessed with initial values. Given it’s not being initialized, I would guess this is a circular linked list, but to verify I will have to check parts of the code where this struct is initialized.

On the contrary, if I had needed this to be a doubly linked list instead, I would have defined it this way instead.

typedef struct XXX {
struct XXX *ppp;
struct XXX *nnn;
const char* string;
} XXX;

The point being two pointers to the same structure, but this does not necessarily mean there’s a doubly linked list implementation. If someone’s trying to write Obfuscated C code, they might use this struct but use it differently, on purpose to hinder code readability.

Oh, by the way, this could also be made into a binary-tree structure by manipulating the insertions into *ppp, *nnn, as new structs. In case of making this a binary tree, we would need a check to make sure no new object linked to *nnn(and *ppp) is already in the structure. Infact it would be more apt naming those *left and *right than *ppp, *nnn.

i.e:

typedef struct XXX {
struct XXX *left;
struct XXX *right;
const char* nodelabel;
} XXX;

Note purely in terms of the type of elements composing/constituting the struct, there’s no difference between this and the above doubly linked list. However, the difference will arise out of how these struct pointers are inserted.
#TODO: CODE SAMPLE HERE.

Now, I’ll graduate to the next data structure, I believe is logically next, (and is one of my favourite) data structure — Graphs.

Linked lists, are simply a very specific case of a graph(which can be called a specific case of sets).
Linked lists are simply graphs restricted to only one edge,(directed) between any two nodes.
Ofcourse, am assuming planar graph<link>,(i.e: one edge can only connect two nodes) and not hyper-graph<link>(think nodes in 3D, and edges that can connect more than “2 nodes”.)

So, Opencog<link> is a open-source attempt to create an Artificial General Intelligence system.
At the core of it is a hypergraph<link to why hypergraphs>(aka atomspace).
Note: this is a C++ project.

Here’s the code for defining an atom:

class Atom
: public std::enable_shared_from_this<Atom>
{
friend class ::AtomUTest; // Needs to call setFlag()
friend class AtomTable; // Needs to call MarkedForRemoval()
friend class ImportanceIndex; // Needs to call setFlag()
friend class Handle; // Needs to view _uuid
friend class SavingLoading; // Needs to set _uuid
friend class TLB; // Needs to view _uuid

private:
//! Sets the AtomTable in which this Atom is inserted.
void setAtomTable(AtomTable *);

//! Returns the AtomTable in which this Atom is inserted.
AtomTable *getAtomTable() const { return _atomTable; }

protected:
UUID _uuid;
AtomTable *_atomTable;

Type _type;
char _flags;

TruthValuePtr _truthValue;
AttentionValuePtr _attentionValue;

// Lock needed to serialize AV changes.
// Just right now, we will use a single shared mutex for all
// locking. If this causes too much contention, then we can
// fall back to a non-global lock, at the cost of 40 additional
// bytes per atom.
static std::mutex _avmtx;
static std::mutex _mtx;

/**
* Constructor for this class.
*
* @param The type of the atom.
* @param Outgoing set of the atom, that is, the set of atoms this
* atom references. It must be an empty vector if the atom is a node.
* @param The truthValue of the atom. note: This is not cloned as
* in setTruthValue.
*/
Atom(Type, TruthValuePtr = TruthValue::NULL_TV(),
AttentionValuePtr = AttentionValue::DEFAULT_AV());

struct InSet
{
// incoming set is not tracked by garbage collector,
// to avoid cyclic references.
// std::set<ptr> uses 48 bytes (per atom).
IncomingSet _iset;
// Some people want to know if the incoming set has changed...
AtomPairSignal _addAtomSignal;
AtomPairSignal _removeAtomSignal;
};
typedef std::shared_ptr<InSet> InSetPtr;
InSetPtr _incoming_set;
void keep_incoming_set();
void drop_incoming_set();

// Insert and remove links from the incoming set.
void insert_atom(LinkPtr);
void remove_atom(LinkPtr);

private:
/** Returns whether this atom is marked for removal.
*
* @return Whether this atom is marked for removal.
*/
bool isMarkedForRemoval() const;

/** Returns an atom flag.
* A byte represents all flags. Each bit is one of them.
*
* @param An int indicating which of the flags will be returned.
* @return A boolean indicating if that flag is set or not.
*/
bool getFlag(int) const;

/** Changes the value of the given flag.
*
* @param An int indicating which of the flags will be set.
* @param A boolean indicating the new value of the flag.
*/
void setFlag(int, bool);

//! Marks the atom for removal.
void markForRemoval();

//! Unsets removal flag.
void unsetRemovalFlag();

public:

virtual ~Atom();

/** Returns the type of the atom.
*
* @return The type of the atom.
*/
inline Type getType() const { return _type; }

/** Returns the handle of the atom.
*
* @return The handle of the atom.
*/
inline Handle getHandle() {
return Handle(shared_from_this());
}

/** Returns the AttentionValue object of the atom.
*
* @return The const reference to the AttentionValue object
* of the atom.
*/
AttentionValuePtr getAttentionValue() const { return _attentionValue; }

//! Sets the AttentionValue object of the atom.
void setAttentionValue(AttentionValuePtr) throw (RuntimeException);

/** Returns the TruthValue object of the atom.
*
* @return The const referent to the TruthValue object of the atom.
*/
TruthValuePtr getTruthValue() const { return _truthValue; }

//! Sets the TruthValue object of the atom.
void setTruthValue(TruthValuePtr);
void setTruthValue(CompositeTruthValuePtr ctv) {
setTruthValue(std::static_pointer_cast<TruthValue>(ctv));
}

//! Return the incoming set of this atom.
IncomingSet getIncomingSet() const;

/** Returns a string representation of the node.
*
* @return A string representation of the node.
*/
virtual std::string toString(std::string indent = "") const = 0;
virtual std::string toShortString(std::string indent = "") const = 0;

/** Returns whether two atoms are equal.
*
* @return true if the atoms are equal, false otherwise.
*/
virtual bool operator==(const Atom&) const = 0;

/** Returns whether two atoms are different.
*
* @return true if the atoms are different, false otherwise.
*/
virtual bool operator!=(const Atom&) const = 0;
};

Whoa, C++, reputation of being verbose and fearsome is justified, i think. Now let’s break this down.
# TODO: EXPLAIN THE STRUCT
Let’s now check one of my favourite project redis, And a data structure/algorithm that fascinates me. The hyperloglog, HLL


struct hllhdr {
char magic[4]; /* "HYLL" */
uint8_t encoding; /* HLL_DENSE or HLL_SPARSE. */
uint8_t notused[3]; /* Reserved for future use, must be zero. */
uint8_t card[8]; /* Cached cardinality, little endian. */
uint8_t registers[]; /* Data bytes. */
};

Now this looks simple, and some of you might think it’s deceptively simple.(Don’t it’s straight forward simple). The real magic is actually in how these values are computed. There’s a reason HLL is called an algorithm and not considered a data structure.
#TODO: MORE ELABORATION

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s