My stock market dabblings

Over the last couple of years, I've been dabbling in the stock market, really just buying on impulse and on random
stock tips read online in forums. At year-end, while filing taxes and tallying up, I realized (not
surprisingly, I might add) that I've lost money (thanks to the bull market, only a little).
Which is when I realized I've been half-assing the amount of research I should do before investing
in the stock market, and, what's worse, I've been falling prey to the fallacy "a little knowledge is a
dangerous thing".
So this is an attempt to hide the crime and, in the process, build a system to avoid committing the
crime in the future.

Before I begin, some of the sources I've been half-assing for research (but good sources
nevertheless) are:

This is the first of a series of posts:

Most of the data I used (and will use for the series) in the following analysis was picked up from investr (thanks to r/hapuchu
for sharing the data), but it can also be picked up by crawling the webpages of companies for their quarterly
and/or annual reports, and then parsing the PDFs to consume them. A rough sketch of that approach is below.
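A minimal sketch of the crawl-and-parse idea, assuming the `requests` and `pypdf` libraries and a hypothetical report URL (real reports need per-company URLs and smarter, layout-aware parsing):

```python
import io

import requests  # pip install requests pypdf
from pypdf import PdfReader


def fetch_report_text(url: str) -> str:
    """Download a PDF report and return its raw extracted text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    reader = PdfReader(io.BytesIO(response.content))
    # Concatenate text page by page; layout-heavy PDFs may need a smarter extractor.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    text = fetch_report_text("https://example.com/annual-report-2015.pdf")
    print(text[:500])
```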


Some Caveats and Exceptions:

  • I'm writing this around the end/last week of February.
  • These are all stocks I traded in, starting in the 2nd half of 2015.
  • I did some stock investing during 2006-2009 and made some money, but due to bad (nah, had no clue about it) portfolio/cashflow management, I had to sell a bunch of them in 2009, which put my overall returns negative, and I stopped trading, leaving whatever was left. But I learnt the lesson: never put into stocks any amount of money I'm not OK with losing.
  • I work in the IT industry and have spent some spare time reading finance, but nowhere near dedicatedly or with focus. (Not sure that kind of reading is good.)
  • Most of the energy stocks are from when I decided I'd go thematic on renewable energy and bought them, but I lost patience/nerve when the stocks went down and eventually sold off.

Direct-Equity[^1] Portfolio Opinions:

  • Way too many stocks
  • Way too disorganized, unfocused, and under-researched
  • Not enough focus on the long term (think companies that'll stay around for > 100 years)
  • Balance the long-term (black-swan) focus with a dividend-based focus (for ex. this)

OK, here's a list of the stocks I've traded:

Reasoning:

  • I've watched the share price of this hover around 1000 for the
    10 years I've seen this stock, so it can be part of a stable portfolio.

Flaws:

  • This is absurd: for all I know, the stock could have split 10 times in
    those 10 years, which would mean the stock has risen, or merged, which would mean it
    has fallen. I haven't checked, but the point is that the reasoning is a fallacy.

Outcome:

  • It has fallen a little bit, but I’m still keeping it and might even buy
    more.

  • Geometric Ltd.

Outcome:

Reasoning:

  • Not much, just that I've liked Infy stock in the past, and it seemed to be on a downtick.

Flaws:

  • Well, it's just an impulse buy, and no better than gambling.

Outcome:

  • It has recovered and got back to trading better, but that’s just luck.


Reasoning:

  • Can't even remember where, but it was some analyst rating I read.

Flaws:

  • Belief in expert fallacy.

Outcome:

  • Loss. Panicked and sold at 200. It seems to be doing a little better now, but
    even today it would be a loss for me to sell; either way, the system of
    investment was wrong.

  • HDFC Bank

Outcome:

Reasoning:

  • Can't even remember where, but it was a blog/reddit thread.

Flaws:

  • Belief in expert fallacy.
  • Belief in crowd decision fallacy??

Outcome:

  • Up by about 20%; a lucky bull ride.

Reasoning:

  • Analyst recommendation

Flaws:

  • Belief-in-expert bias.

Outcome:

  • Down by about 1/7th (Caveat: Down only because they had a rights issue
    that I missed)

  • Manappuram Finance

Outcome:
  • Up by about 1/8th

Outcome:

Outcome:

  • Gained a bit. (Can't be bothered to track the sale date and say how much.)

  • Orient Green Power

Outcome:
  • Lost a bit.

Outcome:
  • Lost money.

Outcome:
  • Gained a little bit.

Reasoning:

  • I've had a good experience buying it during the IPO and making money.
  • I have a bias for the renewable sector's future prospects.

Fallacy:

  • Using inductive reasoning where it doesn't apply (an IPO is different from regular trading).
  • Prior bias (ideally I should have built a prediction and accounted for the bias I hold about renewable energy's future).

Outcome:
  • Lost a fair amount of money.

Outcome:

Reasoning:

  • Building a dividend portfolio, and saw good ratings for it on investr's magic formula
  • Bought a scooter and decided I could buy some auto stocks

Outcome:

Reasoning:

  • I was on an automobile theme, and Maruti has a big brand in India
  • Also was thinking of future plans for a car, and Maruti was an automatic pick

Flaws:

  • It has a relatively high P/E

Outcome:

Reasoning:

  • Was on an automobile theme
  • Bought a TVS scooter

Outcome:

Reasoning:

  • I wanted to pick up something in airlines (theme idea) and found that IndiGo has
    a high P/E, so I picked up SpiceJet (based on its investr score)

  • Dabur India

Outcome:

  • Slightly up

Moral/TODO:

  • Reduce the number of stocks and focus the money on a few
  • Build an internal system for analyzing companies before investing in the future (a rough sketch of a starting point follows)
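As a starting point for such a system, here's a minimal sketch of a rule-based screener. The field names and thresholds are assumptions for illustration only, not recommendations; the real system would fill them from parsed quarterly/annual reports:

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    """A few illustrative per-company metrics (hypothetical fields)."""
    name: str
    pe_ratio: float        # price / earnings
    dividend_yield: float  # fraction, e.g. 0.02 for 2%
    debt_to_equity: float


def passes_screen(f: Fundamentals) -> bool:
    # Arbitrary example thresholds, purely for illustration.
    return f.pe_ratio < 20 and f.dividend_yield > 0.01 and f.debt_to_equity < 1.0


candidates = [
    Fundamentals("Example Co A", pe_ratio=15.0, dividend_yield=0.02, debt_to_equity=0.5),
    Fundamentals("Example Co B", pe_ratio=45.0, dividend_yield=0.00, debt_to_equity=2.0),
]

for company in filter(passes_screen, candidates):
    print(company.name)  # prints "Example Co A"
```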

[^1] — I might eventually broaden the scope of these blog posts, but don't expect it for a looong, loong time (count in decades)…

Consciousness — some dreamy hand-wavy theorizing

Disclaimer:
I am not a professional scientist or an expert in any of the areas I refer to in the following article. I have picked up a couple of unrelated degrees in my life so far, but don't consider myself an expert in any of them. Consider the rest of the article as something I would have written if I had been interviewed for that (pop-sci) book "What We Believe but Cannot Prove".

This is an attempt to summarize a bunch of my thoughts and ideas on consciousness.
This is definitely a biased summary: Abhay asked me to write a post with a set deadline, so my brain scampered around and recalled the most recent thoughts.
A few basic assumptions* I tend to make are:
1. Consciousness is an emergent phenomenon that arises out of an essentially materialistic universe.
2. The only known/agreed fundamentals of the universe are dictated by physics as Space, Time, Energy, and Mass.
3. While our neuronal activity may not be sufficient to explain all the phenomenology of consciousness, it definitely is a necessary condition for enabling consciousness.**
4. The space-time continuum, and the extended Space-Time-Energy-Mass continuum, is true and will be proven sometime in the future.

A couple of basic hypotheses I would like to add are:
1. Attention is a core, integral, fundamental part of our universe, on par with Space, Time, Energy, and Mass.
2. Attention is part of a Space-Time-Energy-Mass-Attention continuum.
3. The Big Bang is the event when this continuum evolved/broke apart into these distinct components.

For the rest of this piece, I will use the terms consciousness and attention rather interchangeably, because that's how I consider them.
Do note that I haven't defined attention: despite the standard definition, I am not sure it fits with what I am building here.
Instead, I'll try to provide as many specific examples as I can think of and let you infer or construct a definition that fits.

Some evidence*** I would like to sketch out:
1. Heisenberg's uncertainty principle:
i.e., it is impossible to know both the position and the momentum of a particle simultaneously to arbitrary precision, no matter how capable the equipment.
This has been interpreted in many ways, and discussed rather extensively in some philosophical circles and popular science books.
The main point I wish to make is that once you think of the measuring equipment as exhibiting some characteristics of attention, it's easy to interpret this as evidence for attention being a part of the universe and being converted into other forms.
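For reference, the standard quantitative form of the principle bounds the product of the standard deviations of position and momentum:

$$\sigma_x \, \sigma_p \geq \frac{\hbar}{2}$$

where $\hbar$ is the reduced Planck constant. Notably, the bound comes from the theory itself, not from any limitation of the measuring equipment.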

One corollary of this:

2. Paramathma-Jeevathma:
I am picking just the terms I am familiar with, but this principle, that all human/living beings are part of a whole, has been proposed in quite a few mythologies and stories spread around the world. Now, it could be argued that this really is just a bias that led to these independently originated similar myths, but I am at the other end. I'll just list a few variants I've come across: "Do unto others", "Gaia theory".
3. Causality and probabilistic graphical models:
Judea Pearl's work on causality is fairly well known, but I only have the basic ideas, from reading the easy text and avoiding the core math. Based on the few sketches I have indeed read, I would say quite a few of our currently used models of causality are suspect, which I believe is the biggest argument for a peer-review process rather than a made-up gold-standard template for scientific experimentation and conclusions.
But I digress. My core argument is that if and once you agree that highly engineered technical equipment (like the LHC) possesses some level of attention/consciousness, then your probability graph immediately grows in complexity. To put it mildly, the possible number of agents goes up so dramatically that trying to assign a strong probability to any event as being the cause of another becomes an unrealistic computation problem.

4. The speed-of-light constant:
That mysterious rule that the speed of light is a maximum limit that cannot be exceeded. Also the corollary about exchanging information.

Some predictions I would like to make:
1. Sometime in the distant future, physicists will start acknowledging and forming theories around the agency of their equipment, and around how to design experiments that account for it while still adding value/evidence to the original hypothesis.
2. Neuroscience, technology, and our understanding of ethics will progress to levels that enable us to experiment on consciousness more actively, by letting us set/change/modify the variables involved in affecting it.

A few corollaries, supposing those hypotheses are true:
1. Consciousness/Attention has been increasing in the universe since the Big Bang.
2. It's possible to convert consciousness into mass/energy/space/time. (This means some superpowers like those of Vista, Clockblocker, etc. in the Worm web serial.)
3. The orders of magnitude of consciousness contained in the form of mass/space/time are humongous. (Just think how much energy is released by converting mass into energy: of the order of 10^12 times our familiar measures; see the worked note after this list.)
4. Our current measures of consciousness/attention are too crude, and they underestimate by incredible amounts.
5. Viewing life as an optimization problem, we are currently woefully stuck at local optima. We are simply ignoring what is at least one extra dimension by not acknowledging it. (Not to imply we are utilizing the dimensions we do acknowledge optimally.)
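For scale, the mass-to-energy analogy in corollary 3 runs through $E = mc^2$: converting a single kilogram of mass releases

$$E = 1\,\mathrm{kg} \times (3 \times 10^8\,\mathrm{m/s})^2 \approx 9 \times 10^{16}\,\mathrm{J},$$

roughly a few years' output of a gigawatt-scale power plant. The speculative analogy is that a similarly enormous conversion factor might apply to consciousness.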

* — Some of these are just plain and simple biases I've accrued over the 30 years of my life, and not necessarily proven scientific facts.
** — It follows that I don't consider bacteria, viruses, DNA/RNA, etc. as conscious. I remember Harvard declared animals are conscious (though I'm not sure whether they included microbes), and I agree with that. I am open to being convinced otherwise, though. Microbes do have a few basic reactions to environmental changes, so to take a weaker position, I would say bacteria and viruses have a very tiny/micro/minute level of consciousness.
*** — Evidence as suggested by courtroom scenes in movies and serials, rather than by the hard sciences. I could probably argue there's strong evidence based on probabilistic graphical models, but I don't have the time or energy to put together a good one. This includes narrative, story-based evidence.

Dear interviewers — rants

Dear interviewers,
Please think about interviews first. And don't use information asymmetry to hide things from or manipulate the candidates (good candidates hate those interviews). Try to think of it as an honest conversation to figure out whether you can work together, instead of playing a cat-and-mouse game.

And seriously rethink the old habit of calling people in, asking them to write code on paper, and then criticizing the syntax.

And don't just sit back and shift the "burden of proof" onto the candidate altogether.

I tend to think of interviews as a two-way sales process. Not one-way,
where the candidate has to sell his skillset, with or without knowledge of the job role.
If you are going to sit there and question my skillset, and ask about my interest in the job,
without divulging anything more than a vague "pure Python framework", then, well,
we're not likely to work together at all. Sorry, cya.

Because:
1. You are employing information asymmetry.
2. You are shifting the "burden of proof", all the while absolving yourself of any such thing.
3. You are refusing to provide enough information for me to judge my own interest.

When I say a two-way sales process, I mean:
1. Both parties are trying to exchange something that's of value to the other.
2. Both parties have to convince the other that what they have is worth it.
3. Both parties have to be able to trust that the other won't cheat them out of what they're agreeing to.

Establishing all of these takes deliberate attention to these ideas first, acceptance next, and then open communication.

Unfortunately, real life is filled with imperfect people half-assing things (some deliberately, some ignorantly, some simply lacking the time/attention,
but always to their own advantage), and we end up with the messy abstractions we have.

And if you find it difficult to switch from an admittance-gate mode to a two-way sales communication mode,
I recommend you give at least one read to Daniel H. Pink's "To Sell Is Human". I highly recommend a read-pause-write/record-read cycle, though.

Quantum of work

Quantum of work:
He (Vivek Haldar) talks about the quantum of work here.
He mentions that "knowledge work" is measured in minutes. I hate going by words rather than some solid, measurable definition,
but if I have to roll with it, I would say he has a point, though he seems biased by the academic researcher's workstyle.
I get that most of his work (as a researcher) is limited to reading a set of papers, thinking about the ideas,
writing out the flaws/gaps/criticisms/related implications (in a mail to the author?), and then moving on to other papers and ideas.

He defines it as

A quantum of work is the theoretical longest amount of time you can work purely on your own without needing to break out into looking up something on the web or your mail or needing input from another person.

He may be right about it being measured in a matter of minutes, but there are implicit assumptions.
He uses a fiction writer as an example (he says that unless you have all your references and notes at hand, you'll break out).
I would contend that this presumes the fiction writer is looking up references to learn how to use them,
rather than looking them up to patch details or, as some writers like to claim, imagining most of what they write themselves.

When I am writing these blog posts (obviously not the same as fiction), I start off putting together the elements of the post in my head.
In fact, at this stage, I am usually not even at the computer; then I write out the text, leaving in placeholders for links and stuff I need to look up.

Now, I am not a really experienced or voluminous or good writer, but I'll refer to what VGR says here.
Most of his posts, even the ones that are 20K words long, could have been, and sometimes do get, written in a condensed form first.
In this post he says that he usually uses previous ideas and definitions he has established,
and can actually write down the condensed form on a napkin.

Note: this kind of work backlinks heavily to one's own previous work, and tends to ignore others' work around the area.

VH also says,

The trajectory from linear long form work to fragmented quantized work had been quite steady and there are absolutely no signs of it reversing.

Now, for people like me, that's a scarier thought.

My sense of work best done, and most of my work satisfaction, tends to come from having focused on some piece of work for a long time (going over the trade-offs repetitively and obsessively, till I am sure I can't improve the decisions).
I don't mean continuously incremental work over that time, but not doing anything else while I am on it.
I might just be pondering/reflecting on it. Unfortunately, the modern-day workplace is the antithesis of this.
And that seems to be not only seen as normal, but also a requisite to fit into the "Work Culture".
I suspect this is one of the reasons modern workplaces are "fragile".
This is one of those things that makes the reliance on big corporations scary.

I also wonder if these two work styles are related to the ghost vs vampire approach to life.

I tend to think of the second work-style (measured in minutes) as vampiric, in that it's made of experience-seeking rather than meaning-seeking. VGR claims people fall into either the extreme experience-seeking or the extreme meaning-seeking category. I am not so sure about that, though.

I have also noticed that when I write code for a reasonably complex feature, I tend to alternate between writing code and thinking, in sprints. The two reach a natural rhythm of variation that helps the feeling of flow. I have also observed that sitting at my desk is perhaps the easiest way to get into my thinking flow/mode.

But the most important of all is carrying out a conversation with someone, or clarifying a doubt, or something like that. These things almost always have an immediate impact on the ability to write code. A lot of programmers have written about this, so I will just note the observations/thoughts that came to my mind.

It seems the actual act of conversing with someone about code (either the code you're writing, or the code they're writing, or code either of you is planning) is perhaps the fastest way to lose track of your thoughts/ideas/dilemmas/trade-offs. It's not that you are not conversing with yourself; it's just that the conversation with yourself is at a level that's more primitive(??), or more natural, than words can capture. You're usually forming theses, testing them, and collecting data. Forming theses is the part that involves language the most; the rest need language only when you're trying to convince someone else of your conclusions. Otherwise they are left to the perceptive and pattern-matching parts of the mind.

As always, this is all theory, and reality has a way of finding richer variations and surprising people (more specifically theorists) :)

Probability – teaching, Bayesians vs frequentists, etc.

http://lesswrong.com/lw/1gc/frequentist_statistics_are_frequently_subjective/

I see this kind of reasoning at the core of the denunciation of standard null-hypothesis testing in financial models, as this blog says:
http://epchan.blogspot.in/2013/01/the-pseudo-science-of-hypothesis-testing.html

I see the core error as being the same, i.e., trying to derive inferences from probability calculations that ignore conditional probabilities or treat them as no different from other probabilities.
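The distinction at stake is the one Bayes' theorem makes explicit: the probability of a hypothesis given the data is not the same thing as the probability of the data given the hypothesis, and conflating the two (or dropping the prior that relates them) is arguably the error above:

$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}$$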

Now, I have deliberately tried to stay out of the finance sector as a field of employment. I never really thought about or questioned the whys of it, but I am beginning to understand. I actually like money and am a reasonable saver, and I like mathematics, so the sector has been, and perhaps still is, a perennial attraction; it does pay a hell of a lot more.
But I am beginning to realize the reason I have instinctively flinched from it: the most available jobs are in accounting and customer relations, and I don't have much stomach for the routine of accounting and am no good at customer relations. But beyond those, the jobs and openings are myriad, at higher and higher levels of abstraction,
like:
1. Quantitative trading
2. Derivatives trading
3. Risk analysis
4. Portfolio management

etc.

In fact, I think this is the same problem as organizations doing normalization of ratings and whatnot. I have a problem not because I think it makes no sense for all their employee ratings to fit a normal curve overall, but because tweaking them to fit the normal curve exactly at each reporting level is just stupid and crazy application of standards and rules.

Also, despite having a master's degree and a bachelor's in engineering, having read a lot of science publications, and having definitely studied for exams, I never really understood the significance of p-values. I don't really remember studying them very well, and somehow I don't think they made sense at whatever level of the statistics courses we studied them in; I must look it up some other time.
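For my future self, here's a minimal simulation sketch of what a p-value is under the textbook definition (the probability, assuming the null hypothesis, of seeing a result at least as extreme as the observed one); the coin-flip setup is an illustrative assumption:

```python
import random

# Observed: 62 heads out of 100 flips. Null hypothesis: the coin is fair.
observed_heads = 62
n_flips = 100
n_trials = 100_000

# Simulate the null: count how often a fair coin does at least this well.
extreme = sum(
    sum(random.random() < 0.5 for _ in range(n_flips)) >= observed_heads
    for _ in range(n_trials)
)
p_value = extreme / n_trials
print(f"one-sided p-value ~ {p_value:.4f}")  # roughly 0.01 for this setup
```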

(Obliquely related)
Probability by stories:

I came across this story form of teaching probability theory.
See here.

Reading along, my first thought at the initial read of the story was that it's awfully Bayesian-biased.
I soon realized I had never studied probability formally, and definitely never beyond the dice/coin-toss examples.
I have read about it here and there (LW, NNT, EY, and other blogs), and knew there were three different interpretations,
but was never sure what those three were.

Anyway, reading the blog: it defines the 'classical' view as chalkboard situations, where we naively assume equal likelihood.
Now, that's a category NNT would have called dangerously academic. (I am somehow skeptical of this definition.)

The 'empirical' view relies on real-world frequencies.
(Based on the examples, it's more like projecting empirical observations from the past onto the future.)
Again, that sounds dangerously naive, simply because it's extrapolation with static/linear implicit assumptions.

The 'subjective' view aims to express the uncertainty in our minds, and is therefore harder to define. A small sketch contrasting the first two views follows.
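To make the contrast concrete, here's a toy sketch, with made-up data, of how the classical and empirical views would each assign a probability to rolling a six:

```python
from collections import Counter

# Classical view: six faces, assumed equally likely by symmetry.
p_classical = 1 / 6

# Empirical view: frequency of sixes in (hypothetical) observed rolls.
observed_rolls = [6, 2, 3, 6, 1, 5, 6, 4, 2, 6]
counts = Counter(observed_rolls)
p_empirical = counts[6] / len(observed_rolls)

print(p_classical)  # ~0.167, from the equal-likelihood assumption
print(p_empirical)  # 0.4, from extrapolating past observations
```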

I am now finding all of these views rather useless.
At this point I am not sure what the point of these theoretical differences is,
as they don't seem to have a single effect on practice (i.e., on reasoning with probabilities).

After reading the rest of the series, I get why people are so divided on these interpretations.
But overall, I think these should be personal preferences, ultimately irrelevant to making a tight argument (which should be based on the theorems).

Organization man (Losers, Sociopaths, Clueless)

I have been reading the series dissecting "The Office" by Venkatesh Rao at Ribbonfarm, and watching the actual TV series in a mad obsession.
Some opinions on the decision-making behind the storyline writing. For example,
from the branch-closing episode to the next: I would say there was an
experiment, and they changed the next episode's storyline based on the TRP ratings.
But seriously, I was getting sick of Michael, and was preferring the Stamford-branch versions.
And, as is natural with any psychology parallels/articles characterized or defined only in words, I found myself comparing the show with the articles. It keeps running through my head.
The more I think about work and what I should get done next, the more I realize I have already become the checked-out Loser. Darn it….

OK, he has a sub-division within the Loser category:
the staff-Loser and the line-Loser.
The definition being that the staff-Loser's function is to be a priest of the hierarchy, while the line-Loser's functions are closer to revenue generation.
Now, because of the HIWTYL ("heads I win, tails you lose") policy that all human beings learn to employ automatically in any social situation, you can expect the organization's bureaucracy to be as heavily riddled with defensive policies as possible.

My instinctual shying away from Team Lead/Project Lead positions signals that I have a reflexive aversion to becoming a staff-Loser. I don't think they are useless; in the unfortunate real world we all live in, they are a necessity, if only to save businesses from opportunistic humans. It's probably only in an Ayn-Randian ideal world that they can be avoided altogether.

Well, the next thing to do is unlearn the habits of checked-outness and learn
the habits of the Sociopath.

Rule 1: Bayesian decision making. Estimate the potential risks and potential rewards
in any career move (i.e., the 8-9 hrs a day of what you do in the office); a toy version of the estimate follows.
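A crude expected-value sketch of Rule 1, with entirely made-up numbers, just to show the shape of the calculation:

```python
def expected_value(outcomes):
    """Sum of probability-weighted payoffs; payoffs in arbitrary career-utility units."""
    return sum(p * payoff for p, payoff in outcomes)


# Hypothetical move: take the risky, unpredictable project vs stay on the safe one.
risky = [(0.4, +10), (0.6, -2)]  # 40% big win, 60% modest setback
safe = [(0.9, +1), (0.1, -1)]    # mostly small, predictable gains

print(expected_value(risky))  # 2.8
print(expected_value(safe))   # 0.8
```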

Rule 2: To quote Venkatesh Rao from his Ribbonfarm post: "The risk-management work of an organization can be divided into two parts: the unpredictable part that is the responsibility of the line hierarchy, and the predictable, repetitive part that is the responsibility of the staff hierarchy."
So this means that when in doubt about which option is good for your career, take the unpredictable one. In your personal family system (IFS)

Rule 3: To quote him again: "Bureaucracies are structures designed to do certain things very efficiently and competently: those that are by default in the best interests of the Sociopaths.

They are also designed to do certain things incompetently: those expensive things that the organization is expected to do, but would cut into Sociopath profits if actually done right. And finally, they are designed to obstruct, delay and generally kill things that might hurt the interests of the Sociopaths." Take the priority of your decisions not from the bureaucracy and the rules/system it makes, but from what you want to achieve/get done.
Oh, and while doing that, keep in mind that bureaucracies are hardly ever set in stone and do change; perhaps a thixotropic fluid is a good analogy, i.e., they have high viscosity (read: resistance to change) under normal conditions, but are less viscous when stressed. I might even suggest this is the key behind Jack Welch's successful turnaround of GE*. One could argue that he personally created the stress required to get the GE bureaucracy to adopt the changes required to earn profits.

Rule 4: Recognize your habit patterns in taking sides within social groups. Make sure you use ambiguity by deliberate choice (calculating its effect on your social status) rather than by reflex habits (read: heuristics) shaped by past experiences.

* — While I don't claim to have a lot of knowledge about the history of GE or of Jack Welch's term at it, I have read his book "Straight from the Gut" and think I have some guess at his approach/ideas in decision making.

Programming abstractions

I remember my first program in C. I remember my self-chosen first programming challenge: write an image processing library in C++ and learn C++ on the way. Duh… how naive was I.

Once upon a time I imagined that new programming languages would make things easier and simpler. How naive was I.

http://www.joelonsoftware.com/articles/LeakyAbstractions.html

Now, 12 years on from when I first learnt C, I see myself still running into frustrating dead ends; it actually seems I run into them more often than I did when I started. Funny. But as Joel puts it here, new programming languages don't make the learning curve shorter or easier. They just make it easier to deal with higher-complexity programming projects.

Anyway, right now I am stuck at the abstraction level of the Nagios NRPE application/daemon trying to connect.

And what do I do? I take a break, go home, do nonsense and other stuff, and then come back. I end up getting exhausted or frustrated by the sheer volume and variety of open-source software out there, and begin to lose interest in figuring it all out, especially since new technology seems to be coming out very, very fast. I decide I should go back to math and work on that. Duh.

But overall it has been an interesting experience so far, and good training. One of the next things I need to do is quit the GUI for most of my focused work mode; the window-focus switching is too demanding and distracting if I have to think deeply. I am beginning to understand why so many people working across multiple languages and stacks use the console for work.

Insight porn

I recently unsubscribed (about a month ago) from Ribbonfarm's RSS feed (on my Google Reader). I had done it in an effort to cut my read posts on Reader to ~50 from 100 a month. But I did not really think a lot about why. I know I spend just that little extra time and/or attention reading the Ribbonfarm posts. I also know that what I learn from Ribbonfarm is valuable. But the number of posts per month convinced me to drop it and read it instead by visiting in the browser. Of course, due to my browsing of the historical old posts, it was already on my block list for when I am in 'work' mode.

I hadn't considered the tradeoff (attention vs. lesson/learning value) till I saw this post today. He refers to another blog, which talks about the insight-porn type of blog. I managed not to click on it and go reading around, yay… Anyway, I have been reading Ribbonfarm, and I do tend to write a blog post** on top of a specific Ribbonfarm post and feel a very shallow/temporary rush of work done. Note that it was nowhere close to what I get when I read up some math and try to sum it up, or just take notes/thoughts on it.

Anyway, I realized that in a sense Venkat is right. He's generating insights that are useful, but not change-your-world deep. At least not any more. I won't pretend to have been following him since back in 2008 to know this, but I do know I kinda have some idea or other of a post's direction with a little skimming. This is probably an effect of me going and digging around his archived posts, reading for a couple of hours at a time. Anyway, these are porn in the sense that they have surprising perspectives and sometimes interesting metaphors, but sometimes they are just one- or two-level logical connections from some well-known principles. (A side effect being that you become a lazier thinker.*) I guess that's where the 20-80 rule/problem comes up. In fact, I originally thought I would go and make a list of his blog posts that are insight porn and those that are not, and give that as feedback, before realizing that the list is going to be different for different sets/types of people.

Besides, I don't think there's any reliable way (that I can suggest) for him to measure (nay, grok) the distribution so as to make the blog better.

Takeaways:

1. I now vow to write fewer posts surrounding/developing around Ribbonfarm posts, and definitely no new ones.

2. I vow to read Ribbonfarm on an intermittent + serendipity-seeking basis. I now have a list of sites for serendipity, and LessWrong is moving up in that list.

3. Form a list of stuff to write about on a habit basis (some mix of math + open-source s/w) — not yet ready.

4. Progress enough to move John D. Cook's blog from the Google Reader subscriptions to the serendipity reading list, and instead get something like the P=NP blog as a subscription.


* — I guess realistically we all optimize for some kinds of thinking and become lazy thinkers in one area or another.

** — One example of my own is this. It doesn't really qualify as insight porn (it falls short of offering reasonably useful insights); it's not sophisticated, just very crude. But it definitely qualifies as a traffic-generator post.


UPDATE 1: Ironically enough, this post itself seems to have become a bit of a click attractor…

UPDATE 2: OK, I realized a couple of things. The site traffic monitors (the one built into WordPress, and Google Analytics), while perhaps useful if checked once a month or so, are a pain if checked every day (which is what I had started doing). The problem with checking every day, or once in two days, is that it is very easy to settle for the local maxima of clicks coming in from backtrack links on existing popular blogs. My core faults so far.

Python list vs dictionary.

I was talking to a colleague (a .NET developer) and ended up lecturing him about how an array (a list in Python) is a specific type of data structure, namely a specific type of associative array. Now, the logic goes like this. An associative array (a dict in Python) is a key-value store: a method of storing data in the format of a key mapped to a value. It is usually implemented with the help of a good hash function.

Anyway, the only constraint is that any new insertion has to be of the format key, value, where the key is a hashable* value.

Add one more condition, that the keys have to be the whole numbers in increasing order (starting at 0), and you have an array/list.
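A tiny sketch of that equivalence, treating a list as an associative array whose keys happen to be 0, 1, 2, …:

```python
# A list and a dict with consecutive integer keys answer lookups identically.
as_list = ["alpha", "beta", "gamma"]
as_dict = {0: "alpha", 1: "beta", 2: "gamma"}

assert all(as_list[i] == as_dict[i] for i in range(3))

# The extra constraint is contiguity: appending extends the key sequence.
as_list.append("delta")  # implicit key 3
as_dict[3] = "delta"     # explicit key 3
assert as_list[3] == as_dict[3]
```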

This discussion/lecture got me thinking about how it would be implemented at the Python core language level. I promise to actually check the open-source code base and write a summary later, but for now here are my thoughts/guesses after some pruning over a walk.

1. A list, by virtue of having whole numbers for keys, will be easier to access; i.e., it can be stored at constant-interval locations in memory. (I know that Python being dynamically typed, and Python lists being capable of storing values of different types in one list, complicates things, but only at the implementation level. In theory, I can just store in that memory segment a pointer to the real memory where the value is stored (you know, in case of a huge wall of text that doesn't fit in the fixed interval).) Effect? Accessing a list element can be done in constant time, O(1).**

2. A dictionary, since it can have an arbitrary data type as the key, cannot be assumed to occupy constant-interval memory locations. But then we have a hash function: we pass our key to a function that deterministically maps it to a number (collisions between different keys are possible, though a good hash function makes them rare). Now the lookup becomes two-fold:

a. hash, to get the hash value of the key;

b. search the table for an entry matching the hashed key value.

Now, what is the big-O time for this? My first thought is, well, it depends on:

a. the hashing function implementation;

b. the table size, or rather the number of stored entries relative to the table size (the load factor).

Anyway, this reminded me of an older post I had made and the excursions I had made into the CPython source at that time. I clearly remember some comment in the Objects/dictobject.c/h files about the hashing function being special enough to make lookup O(1). Now, I did not really get it at that time and will need to check the code + comment again in context. But the basic reasoning, as I remember it, is that by ignoring most of the outlier cases and assuming the most common distributions of keys, they can keep the lookup at O(1) in the common case. Will update more some time. A toy version of the two-step lookup is sketched below.
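To make the two-step lookup concrete, here's a toy separate-chaining hash table (a simplification for illustration; CPython's dict actually uses open addressing, and this sketch leans on Python's built-in hash()):

```python
class ToyHashTable:
    """Minimal chained hash table: step (a) hash the key, step (b) search the slot."""

    def __init__(self, n_slots: int = 8):
        self.slots = [[] for _ in range(n_slots)]  # each slot holds (key, value) pairs

    def _slot_for(self, key):
        # Step (a): hash, then fold into the table size.
        return self.slots[hash(key) % len(self.slots)]

    def put(self, key, value):
        slot = self._slot_for(key)
        for i, (k, _) in enumerate(slot):
            if k == key:
                slot[i] = (key, value)  # overwrite an existing key
                return
        slot.append((key, value))

    def get(self, key):
        # Step (b): linear search within the slot; short chains keep this near O(1).
        for k, v in self._slot_for(key):
            if k == key:
                return v
        raise KeyError(key)


table = ToyHashTable()
table.put("spam", 42)
print(table.get("spam"))  # 42
```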



** — Turns out it's not exactly constant time, but rather time invariant with respect to the number of elements in the list. In the case of pointers, there will be variation in time depending on the size of the element stored, but the first lookup is a simple constant-time table lookup.

* — Hashable by our hash function; but generally not a file stream, socket handle, port handle, device file, IPC pipe, etc.