On Thu, Oct 4, 2012 at 1:38 PM, Eric Purdy <epurdy@uchicago.edu> wrote:

Do you know which part of the code is being slow? If it's the MCMC
part (and I'm guessing it is?), I might be able to give advice on
speeding that up.

On Thu, Oct 4, 2012 at 11:31 AM, Michael Bishop <michaelbish@gmail.com> wrote:
> Thanks Eric. So, I do have the code the authors used, and it will be made
> public eventually, but they haven't given me permission to make it public,
> and I'm a reticent about sharing it over the list. If you're interested in
> it, let me know with an email off list, and we can discuss how you might
> get involved. Maybe at some point in the not-too-distant future I can just
> share it on the list. The code should run on any suitably cleaned up
> directed network data, but it has only been run on data from the Add Health
> study. Its very processor intensive... Last night I started running it on
> one of the largest of 84 networks (844 vertices and approx 4000 edges) and
> its still running.
>
>
>
> On Thu, Oct 4, 2012 at 9:59 AM, Eric Purdy <epurdy@uchicago.edu> wrote:
>>
>> First, some general advice about reading technical papers! Read the
>> introduction, then the discussion/conclusion, then the experimental
>> results, and THEN the technical details (in this case, the section
>> entitled "Inference of rank from network structure"). This is out of
>> order, obviously, but it's a good triage strategy: you read things in
>> increasing level of difficulty, so that you always know whether you
>> care enough to read the next, more difficult part. It's also good
>> because you have more context to interpret the difficult parts if
>> you've read all the easier parts of the paper.
>>
>> I've put a list of wikipedia articles at the bottom of this email,
>> under the heading "Study Guide". They are a hopefully comprehensible
>> explanation of the math needed to understand the most technical parts
>> of the paper. But, if you don't feel like wading through the most
>> technical parts, you don't have to read the study guide articles in
>> order to participate in the discussion. (Also, I'm omitting some
>> background for technical parts that I don't know much about, like
>> ANOVA and statistical tests of model fit. If someone feels like adding
>> anything to the study guide, go for it.)
>>
>> == Paper ==
>>
>> PDF link: http://arxiv.org/pdf/1205.6822v1.pdf
>> Arxiv page: http://arxiv.org/abs/1205.6822
>>
>> == Abstract ==
>>
>> In empirical studies of friendship networks participants are typically
>> asked, in interviews or questionnaires, to identify some or all of
>> their close friends, resulting in a directed network in which
>> friendships can, and often do, run in only one direction between a
>> pair of individuals. Here we analyze a large collection of such
>> networks representing friendships among students at US high and
>> junior-high schools and show that the pattern of unreciprocated
>> friendships is far from random. In every network, without exception,
>> we find that there exists a ranking of participants, from low to high,
>> such that almost all unreciprocated friendships consist of a
>> lower-ranked individual claiming friendship with a higher-ranked one.
>> We present a maximum-likelihood method for deducing such rankings from
>> observed network data and conjecture that the rankings produced
>> reflect a measure of social status. We note in particular that
>> reciprocated and unreciprocated friendships obey different statistics,
>> suggesting different formation processes, and that rankings are
>> correlated with other characteristics of the participants that are
>> traditionally associated with status, such as age and overall
>> popularity as measured by total number of friends.
>>
>> == Code ==
>>
>> Mike Bishop has some C++ code that (I think?) the authors of the paper
>> are releasing publicly. Mike, can you send that out? It would probably
>> be a good base to start from if we want to run simulations.
>>
>> == Study guide ==
>>
>> These are just to be helpful, don't feel like you have to read any of
>> them to participate. And feel free to suggest additions, or say that
>> one of these articles wasn't very helpful.
>>
>> Adjacency matrix: (http://en.wikipedia.org/wiki/Adjacency_matrix) This
>> is useful because it gives you a dictionary between graph theory and
>> linear algebra. It only comes up a little bit, but it's often used
>> without any explanation.
>>
>> Elementary Bayesian stuff:
>> (http://en.wikipedia.org/wiki/Bayesian_statistics) This is a very
>> high-level article that gives links to more in-depth articles.
>>
>> Less elementary Bayesian stuff:
>> (http://en.wikipedia.org/wiki/Bayesian_inference) This is the main
>> piece of background for the technical section of this paper. If you
>> really want to know this stuff, Bishop's "Pattern Recognition and
>> Machine Learning" is the best textbook I know, although it's not
>> accessible without a lot of mathematical background or a lot of free
>> time.
>>
>> Bayesian games: (http://en.wikipedia.org/wiki/Bayesian_game) I've
>> never seen this before, but it seems like the appropriate math to use
>> when thinking about social science stuff with Bayesian methods.
>>
>> EM:
>> (http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm)
>> The workhorse of mathematically rigorous Bayesian statistics. It's a
>> flaky piece of shit, but it's pretty much all we have.
>>
>> MCMC: (http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo) The
>> workhorse of mathematically rigorous frequentist statistics.
>>
>> Gibbs sampling: (http://en.wikipedia.org/wiki/Gibbs_sampling) This is
>> the most mature mathematics that specifies how emergent properties
>> emerge from local interactions. It was first created to model
>> thermodynamics, which is basically all about emergent properties of
>> molecule interactions, like temperature, pressure, etc.
>>
>> Ising model: (http://en.wikipedia.org/wiki/Ising_model) This is the
>> most straightforward example of a Gibbs distribution, it's what you
>> should think about when you try to understand them.
>>
>> On Wed, Oct 3, 2012 at 10:37 AM, Eric Purdy <epurdy@uchicago.edu> wrote:
>> > I am starting a new listhost, math@moomers.org. I am hoping that it
>> > will function much like readings, but that we can have long and
>> > intricate technical discussions without annoying anyone.
>> >
>> > Some proposed ground rules:
>> >
>> > - There are no stupid questions, and we're interested in everyone's
>> > opinion. In particular, you are encouraged to subscribe to
>> > math@moomers.org even if you do not have much mathematical training.
>> > If you're interested in watching math being done, that's all that
>> > matters.
>> >
>> > - There will be no status games, and no arguments from authority.
>> > Nobody should sign up to math@moomers.org because they want to impress
>> > anyone, and nobody can cite their professional status or mathematical
>> > training to make someone else shut up.
>> >
>> > - If any publishable research is done on math@moomers.org, we will
>> > decide by consensus whose names should be on any papers that result.
>> >
>> > The first proposed paper to read is "Friendship networks and social
>> > status", by Ball and Newman. It is available online at
>> > http://arxiv.org/abs/1205.6822 . I am interested in it because it
>> > gives a nice model of social stratification as an emergent property of
>> > social networks. I think it would be cool to try to extend it, and I
>> > think it should be possible to give a scientific account of the
>> > phenomenon of discursive privilege.
>> >
>> > Here is the abstract:
>> >
>> > In empirical studies of friendship networks participants are typically
>> > asked, in interviews or questionnaires, to identify some or all of
>> > their close friends, resulting in a directed network in which
>> > friendships can, and often do, run in only one direction between a
>> > pair of individuals. Here we analyze a large collection of such
>> > networks representing friendships among students at US high and
>> > junior-high schools and show that the pattern of unreciprocated
>> > friendships is far from random. In every network, without exception,
>> > we find that there exists a ranking of participants, from low to high,
>> > such that almost all unreciprocated friendships consist of a
>> > lower-ranked individual claiming friendship with a higher-ranked one.
>> > We present a maximum-likelihood method for deducing such rankings from
>> > observed network data and conjecture that the rankings produced
>> > reflect a measure of social status. We note in particular that
>> > reciprocated and unreciprocated friendships obey different statistics,
>> > suggesting different formation processes, and that rankings are
>> > correlated with other characteristics of the participants that are
>> > traditionally associated with status, such as age and overall
>> > popularity as measured by total number of friends.
>> >
>> > --
>> > -Eric
>>
>>
>>
>> --
>> -Eric
>>
>> _______________________________________________
>> Math mailing list
>> Math@moomers.org
>> http://mailman.moomers.org/mailman/listinfo/math
>>
>

--
-Eric