paper: "Friendship networks and social status" (Brian Ball, M. E. J. Newman)
On Thu, Oct 4, 2012 at 9:59 AM, Eric Purdy epurdy@uchicago.edu wrote:
First, some general advice about reading technical papers! Read the introduction, then the discussion/conclusion, then the experimental results, and THEN the technical details (in this case, the section entitled "Inference of rank from network structure"). This is out of order, obviously, but it's a good triage strategy: you read things in increasing order of difficulty, so that you always know whether you care enough to read the next, more difficult part. It's also good because you have more context for interpreting the difficult parts once you've read all the easier parts of the paper.
I've put a list of Wikipedia articles at the bottom of this email, under the heading "Study guide". They are, I hope, a comprehensible explanation of the math needed to understand the most technical parts of the paper. But if you don't feel like wading through the most technical parts, you don't have to read the study guide articles in order to participate in the discussion. (Also, I'm omitting some background for technical parts that I don't know much about, like ANOVA and statistical tests of model fit. If someone feels like adding anything to the study guide, go for it.)
== Paper ==
PDF link: http://arxiv.org/pdf/1205.6822v1.pdf
arXiv page: http://arxiv.org/abs/1205.6822
== Abstract ==
In empirical studies of friendship networks participants are typically asked, in interviews or questionnaires, to identify some or all of their close friends, resulting in a directed network in which friendships can, and often do, run in only one direction between a pair of individuals. Here we analyze a large collection of such networks representing friendships among students at US high and junior-high schools and show that the pattern of unreciprocated friendships is far from random. In every network, without exception, we find that there exists a ranking of participants, from low to high, such that almost all unreciprocated friendships consist of a lower-ranked individual claiming friendship with a higher-ranked one. We present a maximum-likelihood method for deducing such rankings from observed network data and conjecture that the rankings produced reflect a measure of social status. We note in particular that reciprocated and unreciprocated friendships obey different statistics, suggesting different formation processes, and that rankings are correlated with other characteristics of the participants that are traditionally associated with status, such as age and overall popularity as measured by total number of friends.
== Code ==
Mike Bishop has some C++ code that (I think?) the authors of the paper are releasing publicly. Mike, can you send that out? It would probably be a good base to start from if we want to run simulations.
== Study guide ==
These are just meant to be helpful; don't feel like you have to read any of them to participate. And feel free to suggest additions, or to say that one of these articles wasn't very helpful.
Adjacency matrix: (http://en.wikipedia.org/wiki/Adjacency_matrix) This is useful because it gives you a dictionary between graph theory and linear algebra. It only comes up a little bit, but it's often used without any explanation.
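A concrete example (a sketch in Python; the three-student network is made up): with A[i][j] = 1 meaning student i claims student j as a friend, comparing A with its transpose separates reciprocated from unreciprocated claims, which is exactly the split the paper cares about.

import numpy as np

# Toy directed friendship network on 3 students (made-up data):
# 0 -> 1 and 1 -> 0 (reciprocated), 2 -> 0 (unreciprocated)
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [1, 0, 0]])

reciprocated = np.minimum(A, A.T)   # claims present in both directions
unreciprocated = A - reciprocated   # claims present in one direction only
print(reciprocated)
print(unreciprocated)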
Elementary Bayesian stuff: (http://en.wikipedia.org/wiki/Bayesian_statistics) This is a very high-level article that gives links to more in-depth articles.
Less elementary Bayesian stuff: (http://en.wikipedia.org/wiki/Bayesian_inference) This is the main piece of background for the technical section of this paper. If you really want to know this stuff, Bishop's "Pattern Recognition and Machine Learning" is the best textbook I know, although it's not accessible without a lot of mathematical background or a lot of free time.
Bayesian games: (http://en.wikipedia.org/wiki/Bayesian_game) I've never seen this before, but it seems like the appropriate math to use when thinking about social science stuff with Bayesian methods.
EM: (http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm) The workhorse of mathematically rigorous maximum-likelihood statistics: it climbs the likelihood when some of the variables (in this paper, the ranks) are unobserved. It's a flaky piece of shit, but it's pretty much all we have.
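To see the E-step/M-step rhythm concretely, here is a toy EM fit, in Python, for a mixture of two biased coins (the data and setup are made up for illustration and have nothing to do with the paper's model):

def em_two_coins(heads, m, iters=50):
    # heads[i] = number of heads in session i, each session being m flips
    # of ONE of two coins with unknown biases; which coin is the latent variable.
    pA, pB, mix = 0.6, 0.5, 0.5  # asymmetric start so the two labels can separate
    for _ in range(iters):
        # E-step: posterior probability that each session used coin A
        resp = []
        for h in heads:
            la = mix * pA**h * (1 - pA)**(m - h)
            lb = (1 - mix) * pB**h * (1 - pB)**(m - h)
            resp.append(la / (la + lb))
        # M-step: re-estimate the parameters from the soft assignments
        ra = sum(resp)
        pA = sum(r * h for r, h in zip(resp, heads)) / (ra * m)
        pB = sum((1 - r) * h for r, h in zip(resp, heads)) / ((len(heads) - ra) * m)
        mix = ra / len(heads)
    return pA, pB, mix

print(em_two_coins([9, 8, 2, 1, 7], 10))  # separates into roughly 0.8 and 0.15

Start it from a symmetric point (pA == pB) and it sits at that fixed point forever, which is the "flaky" part in miniature.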
MCMC: (http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo) The workhorse of mathematically rigorous Bayesian statistics: it's how you draw samples from a distribution that you can only write down up to a normalizing constant. In this paper it's what does the averaging over rankings inside each EM iteration.
Gibbs sampling: (http://en.wikipedia.org/wiki/Gibbs_sampling) An MCMC method that resamples one variable at a time from its conditional distribution given all the others. It's named after the Gibbs distributions of statistical mechanics, which are the most mature mathematics specifying how emergent properties emerge from local interactions; that formalism was first created to model thermodynamics, which is basically all about emergent properties of molecular interactions, like temperature, pressure, etc.
Ising model: (http://en.wikipedia.org/wiki/Ising_model) This is the most straightforward example of a Gibbs distribution; it's what you should think about when you try to understand them.
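To make the last two entries concrete, here is a minimal Gibbs sampler for a small Ising model (a sketch; the grid size, temperature, and free boundary are arbitrary choices). Each spin is resampled from its conditional distribution given its four neighbors, which is exactly the local-interactions story above:

import math, random

def gibbs_ising(n=8, beta=0.6, sweeps=200):
    # Target: the Gibbs distribution P(s) proportional to
    # exp(beta * sum of s[i][j] * s[neighbor]) over grid neighbors.
    s = [[random.choice([-1, 1]) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                nb = 0  # sum of the (up to four) neighboring spins
                if i > 0: nb += s[i - 1][j]
                if i < n - 1: nb += s[i + 1][j]
                if j > 0: nb += s[i][j - 1]
                if j < n - 1: nb += s[i][j + 1]
                # conditional probability that this spin is +1 given its neighbors
                p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * nb))
                s[i][j] = 1 if random.random() < p_up else -1
    return s

# At low beta the spins look like noise; at high beta large aligned
# patches emerge, which is the emergent-order story in miniature.
print(sum(sum(row) for row in gibbs_ising(beta=1.0)))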
On Wed, Oct 3, 2012 at 10:37 AM, Eric Purdy epurdy@uchicago.edu wrote:
I am starting a new listhost, math@moomers.org. I am hoping that it will function much like the readings list, but that we can have long and intricate technical discussions without annoying anyone.
Some proposed ground rules:
- There are no stupid questions, and we're interested in everyone's opinion. In particular, you are encouraged to subscribe to math@moomers.org even if you do not have much mathematical training. If you're interested in watching math being done, that's all that matters.
- There will be no status games, and no arguments from authority. Nobody should sign up to math@moomers.org because they want to impress anyone, and nobody can cite their professional status or mathematical training to make someone else shut up.
- If any publishable research is done on math@moomers.org, we will decide by consensus whose names should be on any papers that result.
The first proposed paper to read is "Friendship networks and social status", by Ball and Newman. It is available online at http://arxiv.org/abs/1205.6822 . I am interested in it because it gives a nice model of social stratification as an emergent property of social networks. I think it would be cool to try to extend it, and I think it should be possible to give a scientific account of the phenomenon of discursive privilege.
On Thu, Oct 4, 2012 at 11:31 AM, Michael Bishop michaelbish@gmail.com wrote:
Thanks Eric. So, I do have the code the authors used, and it will be made public eventually, but they haven't given me permission to make it public, and I'm reticent about sharing it over the list. If you're interested in it, let me know with an email off list, and we can discuss how you might get involved. Maybe at some point in the not-too-distant future I can just share it on the list. The code should run on any suitably cleaned-up directed network data, but it has only been run on data from the Add Health study. It's very processor-intensive... Last night I started running it on one of the largest of 84 networks (844 vertices and approx. 4000 edges) and it's still running.
On Thu, Oct 4, 2012 at 1:38 PM, Eric Purdy epurdy@uchicago.edu wrote:
Do you know which part of the code is being slow? If it's the MCMC part (and I'm guessing it is?), I might be able to give advice on speeding that up.
On Thu, Oct 4, 2012 at 1:50 PM, Michael Bishop michaelbish@gmail.com wrote:
Yeah, it's the MCMC part... still running for 16 hours now on a year-old computer with 16 GB of RAM. Once again, this is just one, albeit the largest, of 84 networks in my dataset. I'm told the easiest thing that would help is to take advantage of my second core, but I don't know how to do that.
Here's the output:

$ ./highschool_rankings -i paj_files_r/comm93.scc.el -iter30 -v3
First step will be a linearly scored minimum violations ranking.
Ranks initialized via simulated annealing.
The score from the annealing process was -59692
Starting EM iteration 1 of a maximum of 30.
Log-Likelihood at the end of the iteration = 20387.6
0.242066 1.46033e-05 0.000997235 0.0618695 5.06643e-05 -0.0331409 0.0261432 0.0251279 -0.0177707 -0.0117149
Starting EM iteration 2 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19637.5
0.271515 1.45319e-05 0.000757182 0.0797456 4.72377e-05 0.0309019 -0.0246118 -0.0258381 0.0155623 0.0114179
Starting EM iteration 3 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19357.8
0.281222 1.43437e-05 0.000693565 0.0887446 4.46714e-05 -0.0349585 0.0142105 0.0235122 -0.0143165 -0.00574516
Starting EM iteration 4 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19278.5
0.281864 1.42948e-05 0.000692791 0.0931104 4.31624e-05 -0.0341695 0.0147996 0.0244729 -0.0133326 -0.00491241
Starting EM iteration 5 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19219.5
0.288413 1.35299e-05 0.000712436 0.098817 4.00959e-05 -0.0339019 0.0144534 0.0246769 -0.0129563 -0.00465449
Starting EM iteration 6 of a maximum of 30.
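Since the second core came up: the 84 networks are independent jobs, so the easiest win is probably to run two networks at once rather than to parallelize the program itself. A minimal sketch in Python, assuming the binary handles one network per invocation as in the command above (the *.el glob pattern and the per-file log names are my guesses, not something from the real setup):

import glob
import subprocess
from multiprocessing.dummy import Pool  # thread pool; fine for waiting on subprocesses

def run_one(path):
    # flags copied from the example invocation above
    with open(path + ".log", "w") as log:
        subprocess.call(["./highschool_rankings", "-i", path, "-iter30", "-v3"],
                        stdout=log, stderr=subprocess.STDOUT)
    return path

if __name__ == "__main__":
    files = sorted(glob.glob("paj_files_r/*.el"))  # assumed naming pattern
    with Pool(2) as pool:  # two workers keeps both cores busy
        for done in pool.imap_unordered(run_one, files):
            print("finished", done)

This won't make the one 16-hour network any faster, but it roughly halves the wall-clock time for the batch of 84.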
On Thu, Oct 4, 2012 at 2:01 PM, Eric Purdy epurdy@uchicago.edu wrote:
You might be able to just skip the MCMC part and do exact marginalization. Do you want me to try to code that up? I have an out-of-date version of the code sitting around somewhere.
On Thu, Oct 4, 2012 at 3:07 PM, Michael Bishop michaelbish@gmail.com wrote:
Eric, I'm very grateful for the offer, but I'm a bit uncomfortable asking you to do that. I also don't know what the benefits would be, and I might not be able to use it without help.
I *think* you can use a weird dynamic programming-style process to draw independent random samples from the set of all rankings in time O(EV) per sample (where E is the number of edges and V the number of nodes), which should only be a second or two if written well. Since the samples are independent, you would not need very many to get a good estimate of the things you need a good estimate for. It should be possible to get the total running time down to something more reasonable, like five minutes or an hour.
I'm feeling too tired/lazy to work out all the details right now, but it would be a good challenge/exercise for people who want to work on their algorithms skillz. You have to use amortized analysis, I think.
The trick that I think will work is that, if you have a ranking of the first k nodes, you can generate a ranking of the first k+1 nodes by inserting the (k+1)st node between any two nodes of the previous ranking. (Or, it can also go at the beginning or the end, so there are k+1 places it can go.) So, if you can generate a legitimate random sample for a ranking of the first k nodes, you can extend it randomly, and get a legitimate ranking of the first (k+1) nodes. You have to be careful while doing it, though, because any edge that goes from a node before the insertion to a node after the insertion will now be between nodes that are one more rank apart than before.
So basically, you just want to make sure that you can do this updating of the sample in such a way that you satisfy equation (1) from the paper at every step.
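Here is a rough sketch of the insertion idea in Python, so there's something concrete to poke at. Two big hedges: equation (1) isn't reproduced in this thread, so the per-edge weight w(gap) below is a strictly positive placeholder you'd swap for the real thing; and this is sequential importance sampling, meaning the samples are independent but each comes with a log-weight that must be used when averaging. The naive version below also rescores every placed edge for each candidate position, so it doesn't hit the O(EV) bound; the amortized bookkeeping (only edges spanning the insertion point change their gap, by exactly one) is the part left as the exercise.

import math, random

def sample_ranking(n, edges, w):
    # n nodes; edges = directed (u, v) pairs; w(gap) > 0 is a placeholder
    # per-edge weight as a function of rank difference (stand-in for eq. (1)).
    order = list(range(n))
    random.shuffle(order)                 # insertion order for the nodes
    ranking, placed = [], set()           # ranking[i] = node at rank i, low to high
    log_weight, prev_score = 0.0, 0.0
    for node in order:
        placed.add(node)
        scores = []                       # log-weight of each candidate insertion
        for pos in range(len(ranking) + 1):
            trial = ranking[:pos] + [node] + ranking[pos:]
            rank = {u: i for i, u in enumerate(trial)}
            s = sum(math.log(w(rank[v] - rank[u]))
                    for (u, v) in edges if u in placed and v in placed)
            scores.append(s)
        # choose a position proportionally to the incremental weight
        m = max(scores)
        probs = [math.exp(s - m) for s in scores]
        z = sum(probs)
        r, pos = random.random() * z, 0
        for pos, p in enumerate(probs):
            r -= p
            if r <= 0:
                break
        ranking.insert(pos, node)
        # importance weight: product over steps of the incremental normalizers
        log_weight += m + math.log(z) - prev_score
        prev_score = scores[pos]
    return ranking, log_weight

w = lambda d: math.exp(-abs(d - 1))       # arbitrary positive stand-in, NOT equation (1)
print(sample_ranking(4, [(0, 1), (2, 1), (3, 0)], w))

With a few hundred draws, any expectation the MCMC was estimating becomes a weighted average over the samples, with weights exp(log_weight).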
If anyone would like to work on this problem, but needs more hints, let me know.
On Thu, Oct 4, 2012 at 3:07 PM, Michael Bishop michaelbish@gmail.com wrote:
Eric, I'm very grateful for the offer, but I'm also a bit uncomfortable asking you to do that. I also don't know what the benefits would be, and I might not be able to use it without help.
On Thu, Oct 4, 2012 at 2:01 PM, Eric Purdy epurdy@uchicago.edu wrote:
You might be able to just skip the MCMC part and do exact marginalization. Do you want me to try to code that up? I have an out-of-date version of the code sitting around somewhere.
On Thu, Oct 4, 2012 at 1:50 PM, Michael Bishop michaelbish@gmail.com wrote:
Yeah, it's the MCMC part... it's been running for 16 hours now on a year-old computer with 16 GB of RAM. Once again, this is just one, albeit the largest, of 84 networks in my dataset. I'm told the easiest thing that would help is to take advantage of my second core, but I don't know how to do that.
Here's the output:

$ ./highschool_rankings -i paj_files_r/comm93.scc.el -iter30 -v3
First step will be a linearly scored minimum violations ranking.
Ranks initialized via simulated annealing.
The score from the annealing process was -59692
Starting EM iteration 1 of a maximum of 30.
Log-Likelihood at the end of the iteration = 20387.6
0.242066 1.46033e-05 0.000997235 0.0618695 5.06643e-05 -0.0331409 0.0261432 0.0251279 -0.0177707 -0.0117149
Starting EM iteration 2 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19637.5
0.271515 1.45319e-05 0.000757182 0.0797456 4.72377e-05 0.0309019 -0.0246118 -0.0258381 0.0155623 0.0114179
Starting EM iteration 3 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19357.8
0.281222 1.43437e-05 0.000693565 0.0887446 4.46714e-05 -0.0349585 0.0142105 0.0235122 -0.0143165 -0.00574516
Starting EM iteration 4 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19278.5
0.281864 1.42948e-05 0.000692791 0.0931104 4.31624e-05 -0.0341695 0.0147996 0.0244729 -0.0133326 -0.00491241
Starting EM iteration 5 of a maximum of 30.
Log-Likelihood at the end of the iteration = 19219.5
0.288413 1.35299e-05 0.000712436 0.098817 4.00959e-05 -0.0339019 0.0144534 0.0246769 -0.0129563 -0.00465449
Starting EM iteration 6 of a maximum of 30.
On Thu, Oct 4, 2012 at 1:38 PM, Eric Purdy epurdy@uchicago.edu wrote:
Do you know which part of the code is being slow? If it's the MCMC part (and I'm guessing it is?), I might be able to give advice on speeding that up.
On Thu, Oct 4, 2012 at 11:31 AM, Michael Bishop michaelbish@gmail.com wrote:
Thanks Eric. So, I do have the code the authors used, and it will be made public eventually, but they haven't given me permission to make it public, and I'm reticent about sharing it over the list. If you're interested in it, let me know with an email off list, and we can discuss how you might get involved. Maybe at some point in the not-too-distant future I can just share it on the list. The code should run on any suitably cleaned-up directed network data, but it has only been run on data from the Add Health study. It's very processor-intensive... Last night I started running it on one of the largest of 84 networks (844 vertices and approx. 4000 edges) and it's still running.
On Thu, Oct 4, 2012 at 7:03 PM, Eric Purdy epurdy@uchicago.edu wrote:
If anyone would like to work on this problem, but needs more hints, let me know.
and let me know too!
Eric, I don't want you to feel pressured to explain, but let me throw a few questions out there. Do you agree with the decision to use a Poisson formulation of the random network generation process?
For context if you need it, see the last few sentences of the quoted passage. Recall that α and β are the densities for reciprocated and unreciprocated ties, respectively; each is a function only of the difference between the ranks of the two people/vertices.
If we were not given a network but we were given the probability functions α and β and a complete set of rankings on n vertices, then we could use this model to generate—for instance on a computer—a hypothetical but plausible network in which edges appeared with the appropriate probabilities. In effect, we have a random graph model that incorporates rankings. In this paper, however, we want to perform the reverse operation: given a network we want to deduce the rankings of the nodes and the values of the functions α and β. To put that another way, if we are given a network and we assume that it is generated from our model, what values of the rankings and probability functions are most likely to have generated the network we observe? This question leads us to a maximum likelihood formulation of our problem, which we treat using an expectation–maximization (EM) approach in which the ranks R are considered hidden variables to be determined and the functions α and β are parameters of the model. Using a Poisson formulation of the random network generation process, we can write the probability of generation of a network G with rankings R, given the functions α and β, as (See Eq 1)
Also, is it surprising that:
...The function β takes a more complicated form which we parametrize as a Fourier cosine series, keeping five terms and squaring to enforce nonnegativity, plus an additional Gaussian peak at the origin
Of course the authors claim that the results are robust to parameterization...
I feel like, if people on this list have the energy, we could actually reimplement simplified versions of their experiments in scipy. Would that be rude or something? Certainly in traditional peer-review, it is a huge fuck-you to try to improve someone's results that you have been given a preprint of.
Eric, I don't want you to feel pressured to explain, but let me throw a few questions out there. Do you agree with the decision to use a Poisson formulation of the random network generation process?
I haven't looked that closely at the random network generation process, but in general I am strongly in favor of generative models like that. Bayesian methods typically try to maximize the "posterior probability" of a model, as follows:
P(model | data) = P(data | model) P(model) / P(data)
If you only care about maximizing P(model | data), then you typically take logs and throw away constants:
logP(model | data) = logP(data | model) + logP(model) - logP(data)
= logP(data | model) + logP(model) + C
P(model) is called the prior probability of the given model. P(data | model) is some proposed way that you would randomly generate the data observed given the model and model parameters.
Also, is it surprising that:
...The function β takes a more complicated form which we parametrize as a Fourier cosine series, keeping five terms and squaring to enforce nonnegativity, plus an additional Gaussian peak at the origin
I would probably just use polynomials? Then you could require alpha to be composed of only even-powered terms (1, x^2, x^4, etc.) and beta to be composed of only odd-powered terms (x, x^3, x^5, etc.). This would force alpha to be symmetric (alpha(-x) = alpha(x)) and beta to be anti-symmetric (beta(-x) = -beta(x)). But maybe there is some reason why that wouldn't work.
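In case the even/odd trick isn't clear, here's a tiny illustration (made-up helper names): build alpha from even powers and beta from odd powers, and the symmetry properties hold by construction. One caveat I should flag: a probability (or Poisson rate) can't be negative, so an anti-symmetric beta couldn't be used directly as a rate; you'd have to add a symmetric part or pass it through something like an exponential first.

def make_alpha(coeffs):
    # Even powers only, so alpha(-x) == alpha(x) automatically.
    # coeffs = [a0, a2, a4, ...] (hypothetical layout).
    return lambda x: sum(a * x ** (2 * i) for i, a in enumerate(coeffs))

def make_beta(coeffs):
    # Odd powers only, so beta(-x) == -beta(x) automatically.
    # coeffs = [b1, b3, b5, ...].
    return lambda x: sum(b * x ** (2 * i + 1) for i, b in enumerate(coeffs))

alpha = make_alpha([1.0, -0.5, 0.02])
beta = make_beta([0.3, -0.01])
assert abs(alpha(-0.7) - alpha(0.7)) < 1e-12  # symmetric
assert abs(beta(-0.7) + beta(0.7)) < 1e-12    # anti-symmetric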
If I open a second instance of cygwin and run another instance of this program, will windows/cygwin know to execute it on a separate core? Brian (the author) just sent me his timings (attached) to give me an idea about what to expect. #3 is the shortest, #50 is the longest.
Yeah I don't really know what the norms are surrounding this sorta thing. I really don't want to send these guys a huge fuck-you. But it is basically accepted at a real journal... this new one, I think (http://www.indiana.edu/~netsci/)? Does its going to press change anything?
It's good to hear their approach seems fairly reasonable to you. It seemed weird to me to bring periodic functions like cosine into it... my intuition is that the density for unreciprocated ties would have perhaps 3 local maxima: one somewhere in the middle, and the two "corner solutions" at the maximum rank differences.
I have another friend who suggested eliminating α and β as separate parameters, i.e. don't assume reciprocated ties and unreciprocated ties are different, just allow that to emerge... so he suggested a mixture of 3 beta distributions.
On Fri, Oct 5, 2012 at 10:13 AM, Michael Bishop michaelbish@gmail.com wrote:
If I open a second instance of cygwin and run another instance of this program, will windows/cygwin know to execute it on a separate core? Brian (the author) just sent me his timings (attached) to give me an idea about what to expect. #3 is the shortest, #50 is the longest.
Probably that would work? I know very little about windows.
Yeah I don't really know what the norms are surrounding this sorta thing. I really don't want to send these guys a huge fuck-you. But it is basically accepted at a real journal... this new one I think? Does its going to press change anything?
If it's already published, then we can do whatever we want. And we're not the referees, which is where the real fuck-yous come in. Maybe we should ask them??
It's good to hear their approach seems fairly reasonable to you. It seemed weird to me to bring periodic functions like cosine into it... my intuition is that the density for unreciprocated ties would have perhaps 3 local maxima: one somewhere in the middle, and the two "corner solutions" at the maximum rank differences.
Yes, the periodicity is a very weird assumption. That's exactly why I was saying polynomials made more sense to me.
I have another friend who suggested eliminating α and β as separate parameters, i.e. don't assume reciprocated ties and unreciprocated ties are different, just allow that to emerge... so he suggested a mixture of 3 beta distributions.
If you use polynomials like I suggested, then there's no real difference between alpha and beta being separate or being together. It is possibly helpful at a conceptual level to separate out the parts that are symmetric and anti-symmetric, because they tell you about reciprocated and unreciprocated friendships, respectively.
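To spell that out: any function splits uniquely into a symmetric part and an anti-symmetric part, so fitting one polynomial with all powers is the same as fitting the two parts jointly. A two-line illustration:

def symmetric_part(f):
    # f_s(x) = (f(x) + f(-x)) / 2, the "reciprocated" component.
    return lambda x: 0.5 * (f(x) + f(-x))

def antisymmetric_part(f):
    # f_a(x) = (f(x) - f(-x)) / 2, the "unreciprocated" component.
    return lambda x: 0.5 * (f(x) - f(-x))

# e.g. for f(x) = 1 + x, the symmetric part is 1 and the
# anti-symmetric part is x, and f = f_s + f_a.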
Well, it's not published yet. I think it was conditionally accepted and probably won't come out until the spring. I'm currently operating under the assumption that I can't publish using their ranking without their say-so. Since they said they were planning on doing things similar to what I was planning on doing, and they gave me their code before others have access to it (I think), I probably need to consider them coauthors.
But now that I've got the code, it's not clear how much more help I need or could get from them. Therefore I feel like I'm in a grey area where, given my plans, giving them co-authorship seems generous, but giving them nothing more than an acknowledgement seems sorta wrong as well.
Well, we could just say that we'll do what we want, not use their code (except possibly to help us understand the paper), and offer co-author credit to them if we manage to get a paper out of it.
In general, being overly generous with co-authorship doesn't usually have any bad effects?
Just out of curiosity, is anybody still reading this thread? What are people thinking of the listhost so far?
Well, it may just be you and me, Eric. And you can feel free to drop out at any time. Though I assume that Andrew and Trevor are on this list, and I know they are somewhat engaged.
This part of the paper explains a few things:
...and suppose we believe there to be a ranking of the individuals implied by the pattern of the unreciprocated friendships so that most such friendships run from lower to higher rank. One possible way to infer that ranking would be simply to ignore any reciprocated friendships and then construct a minimum violations ranking of the remaining network [7, 8]. That is, we find the *ranking of the network nodes that minimizes the number of connections running from higher ranked nodes to lower ranked ones. In practice this approach works quite well: for the networks studied in this paper the minimum violations rankings have an average of 98% of their unreciprocated friendships running from lower to higher ranks and only 2% running the other way. By contrast, versions of the same networks in which edge directions have been randomized typically have about 10% of edges running the wrong way. (Statistical errors in either case are 1% or less, so these observations are highly unlikely to be the results of chance.)*
The minimum violations ranking, however, misses important network features because it focuses only on unreciprocated friendships. In most cases there are a substantial number of reciprocated friendships as well, as many as a half of the total, and they contain significant information about network structure and ranking. For example, as we will see, pairs of individuals who report a reciprocated friendship are almost always closely similar in rank.
And they can use the minimum violations ranking to seed their maximum likelihood approach. Without that trick the computational demands would be even more absurd.
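A toy version of that seeding step, to make it concrete (a sketch under my own assumptions, not the authors' implementation): simulated annealing over pairwise swaps, recomputing the violation count from scratch on every proposal, so it's O(E) per step and only suitable for small networks.

import math
import random

def count_violations(order, edges):
    # Unreciprocated edges running from higher rank to lower rank.
    pos = {v: i for i, v in enumerate(order)}
    return sum(1 for (u, v) in edges
               if (v, u) not in edges and pos[u] > pos[v])

def min_violations_ranking(nodes, edges, steps=20000, temp=2.0, cool=0.9995):
    # Propose swapping two nodes; always accept improvements, and accept
    # worsenings with a Boltzmann probability that shrinks as we cool.
    order = list(nodes)
    random.shuffle(order)
    score = count_violations(order, edges)
    for _ in range(steps):
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new = count_violations(order, edges)
        if new <= score or random.random() < math.exp((score - new) / temp):
            score = new
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
        temp *= cool
    return order, score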
So, question: does the fact that they can achieve a ranking with 90% of unreciprocated edges directed from lower- to higher-status people WHEN THE DATA IS RANDOMLY GENERATED worry you? I mean, it's good that they can increase this to 98% when they use the EM algorithm, but it seems like they have a lot of flexibility. I hypothesize that in schools where people fill out their survey more thoroughly (fewer people with very few edges), they end up with more edges directed the wrong way, but that the rankings are in some sense truer. Do you think I'm right?
Congratulations, you have rediscovered the phenomenon of overfitting! Everything you said is correct and important. In general, too-flexible statistical models are bad news.
The only way to really know whether or not you're doing it is to have some data held out for testing/cross-validation. For example, you could erase a random 10% of the edges, do their procedure on the remaining 90%, and then see how many rank violations you get in the invisible 10% of edges.
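Something like this, say, with networkx (a sketch; split_edges and holdout_violation_rate are names I just made up):

import random
import networkx as nx

def split_edges(G, test_frac=0.1, seed=0):
    # Independently hold out ~10% of the directed edges; return the
    # training graph with those edges removed, plus the held-out list.
    rng = random.Random(seed)
    test = [e for e in G.edges() if rng.random() < test_frac]
    H = G.copy()
    H.remove_edges_from(test)
    return H, test

def holdout_violation_rate(pos, test_edges):
    # pos maps node -> rank inferred from the training graph only.
    # Count held-out edges that point from higher rank to lower rank.
    bad = sum(1 for (u, v) in test_edges if pos[u] > pos[v])
    return bad / max(1, len(test_edges))

Run the whole fitting procedure on H only, then compare the violation rate on the test edges to the rate on the training edges; a big gap between the two is the signature of overfitting.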
Pong. I am here and I am engaged. I have very much been enjoying reading this discussion. Been busy with other stuff.
Will comment with some thoughts when I get a chance next week. Please keep the conversation going though.
Trevor
Yes, some sort of cross-validation would be great, but a) I wouldn't know how to implement it, and b) this thing runs slowly enough as it is!
Let me also raise a substantive issue. Within groups there is more agreement about what is status-worthy than between groups. While I think the idea that there exists a status ranking across all the students in a school has enough truth to it to merit the work being done on it, this approach also obscures the large differences in what earns you status in different contexts. A focus on the causes and consequences of status differences (status as defined in this paper) inevitably ignores the importance of groups. Of course, there are ways of trying to address the existence of groups that complement the work done here, and maybe it's not even as hard as I'm making it out to be, but you always have to draw the line somewhere, and at that point you start waving your hands. I guess I'm just saying we have to do our best to be honest with ourselves and our audience about the many things that might be important but that we don't know for sure, because we can't fit them all in our model.
You can model people playing different status games at the same time. Like, say for simplicity that there are two games in high school, being smart and being cool. Then we could try to figure out which game each player is playing (or how much they're playing both) based on that. It would make the overfitting problems much worse, but it's not super-hard to formulate such a model.
Also, you could probably get some sort of independent check of such models by giving people surveys that try to determine what strategy people think they are following. Like, "how cool do you think you are, relative to your friends", "how do you decide whether you want to be friends with a new acquaintance", etc.
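Just to show it's not hard to write down, here's one (entirely made-up) way to formalize the two-game version: give each student a rank in each game plus a weight saying how much they care about each, and let the effective rank difference that drives an unreciprocated claim be the mixture.

def claim_rate(i, j, r_cool, r_smart, w, beta):
    # Student i plays "cool" with weight w[i] and "smart" with weight
    # 1 - w[i]; the rank difference fed to beta mixes the two rankings.
    d_cool = r_cool[j] - r_cool[i]
    d_smart = r_smart[j] - r_smart[i]
    return beta(w[i] * d_cool + (1 - w[i]) * d_smart)

Inferring r_cool, r_smart, and w jointly is exactly where the extra overfitting risk would come from.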
So, with all due respect to the author, it's annoying that the only ranking output the program saves is the final ranking. What I've started doing is telling it to run with 0 iterations, which outputs the minimum violations ranking. Then I change the output filename and run it again with a flag that allows you to specify a seed.
This way I'll be able to compare the minimum violations rankings to the expectation-maximization algorithm, which typically takes 4-8 iterations to converge. But given the computational requirements, it would seem to make sense to save every ranking along the way by default, in case you ever want to look at it later.
So, I'm still a bit too busy to just reimplement their code, but here's a sketch of roughly how I would do it. Maybe someone else will feel like writing this code? It sounds like we could probably use it to write a publishable paper, if that's a motivation.
The easiest would be to do this in Python with networkx.
1. Write code to separate the edges into training and test sets. Basically, independently label each edge "test" with probability 10% and "training" with probability 90%. Then make a new graph that doesn't have any of the test edges. This is the graph that you will run EM on.
2. Get a ranking that has the minimum number of rank violations.
3. Write code to infer alpha and beta from a set of rankings. This is basically just a least squares problem, if you take my advice and use polynomials to represent alpha and beta. You will probably want to have some sort of penalty for having large coefficients in your polynomial. A common way is to put a prior on the polynomial a_0 + a_1 x + ... + a_k x^k. A sensible prior would be a_1^2 + ... + a_k^2. Note that you probably want to let a_0 be whatever it wants. (There's a rough code sketch of this step right after the list.)
4. Write code to sample from the set of all rankings, assuming that alpha and beta are the correct distributions. This is the thing that can probably be done quickly if you do some clever algorithms. Basically, you generate a ranking of the first k nodes, then you extend it to a ranking of the first (k+1) nodes by inserting node #(k+1) somewhere in between two other nodes in the ranking.
5. Write code that does EM iterations. This is just going back and forth between steps 3 and 4. In step 4, generate a bunch of random rankings, and then use all of them when you go back to step 3. Save all of the rankings, like Mike says.
6. Write code that evaluates your proposed ranking by looking at the invisible test edges. See how the quality of the ranking changes over each EM iteration.
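Here's roughly what step 3 could look like in code (a sketch with a made-up name, not a full implementation), with the coefficient penalty done as ridge regression: in negative log space the prior on a_1, ..., a_k turns into an L2 penalty, and a_0 is left free.

import numpy as np

def fit_poly_ridge(x, y, degree, lam=1.0):
    # Fit y ~ a_0 + a_1 x + ... + a_k x^k by least squares with the
    # penalty lam * (a_1^2 + ... + a_k^2). The penalty is implemented
    # by augmenting the design matrix with sqrt(lam) * I rows, zeroed
    # out for the constant column so a_0 is unpenalized.
    X = np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)
    P = np.sqrt(lam) * np.eye(degree + 1)
    P[0, 0] = 0.0
    X_aug = np.vstack([X, P])
    y_aug = np.concatenate([np.asarray(y, dtype=float), np.zeros(degree + 1)])
    coeffs, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return coeffs  # [a_0, a_1, ..., a_k]

You'd call it with the rank differences observed in the sampled rankings and the corresponding empirical edge rates, e.g. fit_poly_ridge(diffs, rates, degree=5).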
Sorry, in step 3, I meant that the sensible prior would be e^{ - [a_1^2 + ... + a_k^2] }. I tend to think about probabilities in negative log space.
-- -Eric
Or you can also skip step 2, if you initialize alpha and beta to be something simple and reasonable, like
(unreciprocated probability of edge from #i to #j)
alpha = e^[ j-i ]

(reciprocated probability of edge between #i and #j)
beta = e^[ (j-i)^2 ]
(I might have alpha and beta backwards from how they are in the paper?)
Before I was saying that alpha and beta should be modeled as polynomials, but it probably makes more sense to model (log alpha) and (log beta) as polynomials, as the equations above do. That makes it more like logistic regression, which I think is supposed to work well in the social sciences?
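In code, those initial guesses are one-liners. Two caveats: as written, alpha = e^[ j-i ] exceeds 1 whenever j > i, so treat these as unnormalized scores rather than honest probabilities; and a follow-up message below corrects beta to include a minus sign, which this sketch already uses.

import math

def alpha0(i, j):
    # unreciprocated claim from #i to #j (unnormalized)
    return math.exp(j - i)

def beta0(i, j):
    # reciprocated friendship between #i and #j (with the corrected sign)
    return math.exp(-(j - i) ** 2)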
-- -Eric
GAH, left out a minus sign:
beta = e^[ - (j-i)^2 ]
-- -Eric
Here is a writeup of some notes related to the tricky algorithms question I was trying to pose. Hopefully it will be helpful to anyone who is thinking about reimplementing the code from the paper.
-- -Eric
Hi all,
I'm Kevin. I met Eric when he came to visit Trevor at Knewton. Trevor invited me along to this list.
Anyway, the discussion is interesting, and I would point out that the phenomenon observed by this paper is very well-known. See, for instance, this Kaggle competition and the ensuing discussion: http://www.kaggle.com/c/FacebookRecruiting
One interesting direction beyond this paper, in line with where the discussion has been going, would be to ask the same questions of larger and larger social networks. Some example questions:
- Most edges run in the direction of status. But how many triangles are there? I.e., suppose VLS has very low status, LS has low status, and HS has high status. A triangle would consist of VLS claiming friendship with LS, LS claiming friendship with HS, and VLS also claiming friendship with HS directly. One may expand from triangles to longer chains (where you ask about n-length chains from VLS to Lady Gaga next to a direct edge from VLS to Lady Gaga). (A rough counting sketch appears below.)
- Can you distinguish types of social networks by the length of these chains?
- What happens when you merge two friendship networks? Do relative statuses remain the same?
Each of these would be a paper, but you'd probably have to vastly improve the code. I think data sets exist to answer all three.
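For the triangle question, here is a quick networkx sketch, assuming you have the directed friendship graph as an nx.DiGraph and a dict mapping each node to its inferred rank (the names and setup are mine):

import networkx as nx

def status_triangles(G, rank):
    # Count chains u -> w -> v of friendship claims running up the
    # ranking that sit alongside the direct claim u -> v.
    count = 0
    for u, v in G.edges():
        if rank[u] >= rank[v]:
            continue  # only consider claims that run up the ranking
        for w in set(G.successors(u)) & set(G.predecessors(v)):
            if rank[u] < rank[w] < rank[v]:
                count += 1
    return count

# e.g. VLS -> LS -> HS plus the direct edge VLS -> HS:
G = nx.DiGraph([("VLS", "LS"), ("LS", "HS"), ("VLS", "HS")])
print(status_triangles(G, {"VLS": 0, "LS": 1, "HS": 2}))  # 1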
Kevin
Just to clarify: the phenomenon I said is very well-known is the directedness of many kinds of social graphs. This particular example of friendships among schoolchildren, and the social science implications, are rather nice.
Kevin
Thanks for the link to the Kaggle competition, Kevin. I'll have some fun looking at that. Of course the approach Ball and Newman are taking is only one of many. Check out http://statnet.org/ for an R package that can do some fancy things... I think the heavy lifting is implemented in C.
Is anyone interested in working on the sampling algorithm I have been writing about? If not, I may finish it myself, but I don't think I'll have time for a week or so. If people are interested but have questions, you can email me off-list if you want?
Updated to add more details.
On Thu, Oct 11, 2012 at 12:29 PM, Eric Purdy epurdy@uchicago.edu wrote:
Is anyone interested in working on the sampling algorithm I have been writing about? If not, I may finish it myself, but I don't think I'll have time for a week or so. If people are interested but have questions, you can email me off-list if you want?
On Wed, Oct 10, 2012 at 10:28 PM, Michael Bishop michaelbish@gmail.com wrote:
Thanks for the link to the kaggle competition Kevin. I'll have some fun looking at that. Of course the approach Ball and Newman are taking is only one of many. Check out http://statnet.org/ for an R package which can do some fancy things... I think the heavy lifting implemented in C.
On Wed, Oct 10, 2012 at 10:12 PM, Kevin Wilson khwilson@gmail.com wrote:
Just to clarify, the phenomenon I meant is very well-known is the directedness of many kinds of social graphs. This particular example of friendships among school children and the social science implications are rather nice.
Kevin
On Wed, Oct 10, 2012 at 10:36 PM, Kevin Wilson khwilson@gmail.com wrote:
Hi all,
I'm Kevin. I met Eric when he came to visit Trevor at Knewton. Trevor invited me along to this list.
Anyway, the discussion is interesting, and I would point out that the phenomenon observed by this paper is very well-known. See, for instance, this Kaggle competition and the ensuing discussion: http://www.kaggle.com/c/FacebookRecruiting
One interesting question beyond this paper and in the direction of where the discussion has been going would be to ask the same questions in larger and larger social networks. Some example questions:
Most edges run in the direction of status. But how many triangles are there? i.e., suppose VLS has very low status, LS has low status and HS has high status. A triangle would consist of a relationship with VLS claiming friendship with LS and HS and VLS claiming friendship with HS. One may expand from triangles to longer chains (where you ask about n-length chains from VLS to Lady Gaga next to a direct edge from VLS to Lady Gaga). Can you distinguish types of social networks by the length of these chains? What happens when you merge two friendship networks? Do relative statuses remain the same?
Each of these would be a paper, but you'd probably have to vastly improve the code. I think data sets exist to answer all three.
Kevin
On Wed, Oct 10, 2012 at 1:02 PM, Eric Purdy epurdy@uchicago.edu wrote:
Here is a writeup of some notes related to the tricky algorithms question I was trying to pose. Hopefully it will be helpful to anyone who is thinking about reimplementing the code from the paper.
On Sun, Oct 7, 2012 at 2:27 PM, Eric Purdy epurdy@uchicago.edu wrote:
GAH, left out a minus sign:
beta = e^[ - (j-i)^2 ]
On Sun, Oct 7, 2012 at 2:26 PM, Eric Purdy epurdy@uchicago.edu wrote:

Or you can also skip step 2, if you initialize alpha and beta to be something simple and reasonable, like

(unreciprocated probability of edge from #i to #j)
alpha = e^[ j-i ]

(reciprocated probability of edge between #i and #j)
beta = e^[ (j-i)^2 ]

(I might have alpha and beta backwards from how they are in the paper?)

Before, I was saying that alpha and beta should be modeled as polynomials, but it probably makes more sense to model (log alpha) and (log beta) as polynomials, as the equations above do. That makes it more like logistic regression, which I think is supposed to work well in the social sciences?

On Sun, Oct 7, 2012 at 12:35 PM, Eric Purdy epurdy@uchicago.edu wrote:

So, I'm still a bit too busy to just reimplement their code, but here's a sketch of roughly how I would do it. Maybe someone else will feel like writing this code? It sounds like we could probably use it to write a publishable paper, if that's a motivation.

The easiest would be to do this in Python with networkx.

1. Write code to separate the edges into training and test sets. Basically, independently label each edge "test" with probability 10% and "training" with probability 90%. Then make a new graph that doesn't have any of the test edges. This is the graph that you will run EM on.

2. Get a ranking that has the minimum number of rank violations.

3. Write code to infer alpha and beta from a set of rankings. This is basically just a least-squares problem, if you take my advice and use polynomials to represent alpha and beta. You will probably want some sort of penalty for having large coefficients in your polynomial. A common way is to put a prior on the polynomial a_0 + a_1 x + ... + a_k x^k; a sensible prior would be a_1^2 + ... + a_k^2. Note that you probably want to let a_0 be whatever it wants.

4. Write code to sample from the set of all rankings, assuming that alpha and beta are the correct distributions. This is the thing that can probably be done quickly if you do some clever algorithms. Basically, you generate a ranking of the first k nodes, then you extend it to a ranking of the first (k+1) nodes by inserting node #(k+1) somewhere in between two other nodes in the ranking.

5. Write code that does EM iterations. This is just going back and forth between steps 3 and 4. In step 4, generate a bunch of random rankings, and then use all of them when you go back to step 3. Save all of the rankings, like Mike says.

6. Write code that evaluates your proposed ranking by looking at the invisible test edges. See how the quality of the ranking changes over each EM iteration.

(A sketch of some of this scaffolding appears after this thread.)

On Sun, Oct 7, 2012 at 12:16 PM, Michael Bishop michaelbish@gmail.com wrote:

So, with all due respect to the author, it's annoying that the only ranking output the program saves is the final ranking. What I've started doing is telling it to run with 0 iterations, which outputs the minimum-violations ranking. Then I change the output filename and run it again with a flag which allows you to specify a seed.

This way I'll be able to compare the minimum-violations rankings to the expectation-maximization algorithm, which typically takes 4-8 iterations to converge. But given the computational requirements, it would seem to make sense to default to saving every single ranking along the way, in case you ever want to look at it later.

On Sun, Oct 7, 2012 at 10:33 AM, Eric Purdy epurdy@uchicago.edu wrote:

> Let me also raise a substantive issue. Within groups there is more agreement about what is status-worthy than between groups. While I think the idea that there exists a status ranking across all the students in a school has enough truth to it to merit the work being done on it, this approach is also obscuring the large differences in what earns you status in different contexts. A focus on the causes and consequences of status differences (status as defined in this paper) inevitably ignores the importance of groups. Of course, there are ways of trying to address the existence of groups that complement the work done here, and maybe it's not even as hard as I'm making it out to be, but you always have to draw the line somewhere, and at that point you start waving your hands. I guess I'm just saying we have to do our best to be honest with ourselves and our audience about the many things that might be important but we don't know for sure because we can't fit them in our model.

You can model people playing different status games at the same time. Say, for simplicity, that there are two games in high school: being smart and being cool. Then we could try to figure out which game each player is playing (or how much they're playing both) based on that. It would make the overfitting problems much worse, but it's not super-hard to formulate such a model.

Also, you could probably get some sort of independent check of such models by giving people surveys that try to determine what strategy people think they are following. Like, "how cool do you think you are, relative to your friends", "how do you decide whether you want to be friends with a new acquaintance", etc.
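To make step 1 and the initialization above concrete, here is a minimal Python/networkx sketch. It is not the authors' released code: the function names are invented, alpha is capped at 1 so it can serve as a probability (the raw e^(j-i) exceeds 1 whenever j > i), and beta uses the corrected minus sign from the top of the thread.

    # A sketch, not the authors' code: step 1 (train/test split) plus
    # the simple alpha/beta initialization from the messages above.
    import math
    import random

    import networkx as nx

    def split_edges(g, test_frac=0.1, seed=0):
        """Step 1: independently label each edge 'test' with probability
        test_frac, 'training' otherwise; return the training graph (all
        nodes kept) plus the held-out test edges."""
        rng = random.Random(seed)
        test_edges = [e for e in g.edges() if rng.random() < test_frac]
        g_train = g.copy()
        g_train.remove_edges_from(test_edges)
        return g_train, test_edges

    def initial_alpha(i, j):
        """Unreciprocated edge from rank i to rank j: alpha = e^(j - i).
        Capped at 1 (my addition) so it can be used as a probability;
        the raw formula exceeds 1 whenever j > i."""
        return min(1.0, math.exp(j - i))

    def initial_beta(i, j):
        """Reciprocated edge between ranks i and j, with the corrected
        minus sign: beta = e^(-(j - i)^2)."""
        return math.exp(-((j - i) ** 2))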
So back when this thread was active, I mentioned that I was concerned that Ball & Newman's attempt to measure status was overfitting the data. Eric replied:
The only way to really know whether or not you're overfitting is to hold some data out for testing/cross-validation. For example, you could erase a random 10% of the edges, run their procedure on the remaining 90%, and then see how many rank violations you get in the invisible 10% of edges.
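Concretely, that check might look like the following sketch (same Python setup as above; rank is assumed to map each node to its inferred position, higher meaning higher status):

    def held_out_violations(g_full, test_edges, rank):
        """Count rank violations among the held-out edges: an
        unreciprocated edge u -> v is a violation when the claimer u
        outranks the claimed friend v. Reciprocation is judged on
        g_full, the graph from *before* the test edges were removed."""
        violations = 0
        for u, v in test_edges:
            if not g_full.has_edge(v, u) and rank[u] > rank[v]:
                violations += 1
        return violations

Comparing this count after each EM iteration against the count for the minimum-violations ranking would give a direct read on whether the extra iterations are fitting signal or noise.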
Say cross-validation shows that they are overfitting. How hard is it to take the next step and change the algorithm so it isn't overfitting any more?
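For what it's worth, one lever is already in Eric's step 3: the ridge-style penalty on the polynomial coefficients of log alpha and log beta, with the penalty strength tuned on the held-out edges. A hedged numpy sketch; the function name and inputs are invented for illustration:

    import numpy as np

    def fit_log_curve(xs, log_ps, degree=3, penalty=1.0):
        """Fit log alpha (or log beta) as a degree-k polynomial in the
        rank difference x = j - i, penalizing a_1^2 + ... + a_k^2 but
        letting a_0 be whatever it wants, per step 3 of the outline.
        Turning penalty up biases toward flatter curves, i.e., less
        overfitting."""
        xs = np.asarray(xs, dtype=float)
        log_ps = np.asarray(log_ps, dtype=float)
        X = np.vander(xs, degree + 1, increasing=True)  # x^0 ... x^k
        P = penalty * np.eye(degree + 1)
        P[0, 0] = 0.0  # leave the constant term unpenalized
        return np.linalg.solve(X.T @ X + P, X.T @ log_ps)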
On Thu, Oct 11, 2012 at 5:45 PM, Eric Purdy epurdy@uchicago.edu wrote:
Updated to add more details.
On Thu, Oct 11, 2012 at 12:29 PM, Eric Purdy epurdy@uchicago.edu wrote:
Is anyone interested in working on the sampling algorithm I have been writing about? If not, I may finish it myself, but I don't think I'll have time for a week or so. If people are interested but have questions, you can email me off-list if you want?
On Wed, Oct 10, 2012 at 10:28 PM, Michael Bishop michaelbish@gmail.com wrote:

Thanks for the link to the Kaggle competition, Kevin. I'll have some fun looking at that. Of course, the approach Ball and Newman are taking is only one of many. Check out http://statnet.org/ for an R package which can do some fancy things... I think the heavy lifting is implemented in C.
On Wed, Oct 10, 2012 at 10:12 PM, Kevin Wilson khwilson@gmail.com wrote:

Just to clarify: the phenomenon I said is very well-known is the directedness of many kinds of social graphs. This particular example of friendships among school children, and its social science implications, is rather nice.

Kevin
On Wed, Oct 10, 2012 at 10:36 PM, Kevin Wilson khwilson@gmail.com wrote:

Hi all,

I'm Kevin. I met Eric when he came to visit Trevor at Knewton. Trevor invited me along to this list.

Anyway, the discussion is interesting, and I would point out that the phenomenon observed by this paper is very well-known. See, for instance, this Kaggle competition and the ensuing discussion: http://www.kaggle.com/c/FacebookRecruiting

One interesting question beyond this paper, and in the direction of where the discussion has been going, would be to ask the same questions in larger and larger social networks. Some example questions:

Most edges run in the direction of status. But how many triangles are there? That is, suppose VLS has very low status, LS has low status, and HS has high status. A triangle would consist of VLS claiming friendship with LS and HS, and LS claiming friendship with HS. One may expand from triangles to longer chains (where you ask about n-length chains from VLS to Lady Gaga next to a direct edge from VLS to Lady Gaga).

Can you distinguish types of social networks by the length of these chains?

What happens when you merge two friendship networks? Do relative statuses remain the same?

Each of these would be a paper, but you'd probably have to vastly improve the code. I think data sets exist to answer all three.

Kevin
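Kevin's first question is directly computable once you have a ranking. A sketch, under my reading of his triangle (VLS -> LS, LS -> HS, plus the shortcut VLS -> HS, all unreciprocated and all pointing up the ranking):

    def status_triangles(g, rank):
        """Count triangles a -> b -> c plus the shortcut a -> c, where
        every edge is unreciprocated and points up the ranking. Brute
        force: fine for school-sized networks, too slow as-is for the
        larger graphs Kevin has in mind."""
        def up_edge(u, v):
            return (g.has_edge(u, v) and not g.has_edge(v, u)
                    and rank[u] < rank[v])

        count = 0
        for a, b in g.edges():
            if not up_edge(a, b):
                continue
            for c in g.successors(b):
                if up_edge(b, c) and up_edge(a, c):
                    count += 1
        return count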
On Wed, Oct 10, 2012 at 1:02 PM, Eric Purdy epurdy@uchicago.edu wrote:
Here is a writeup of some notes related to the tricky algorithms question I was trying to pose. Hopefully it will be helpful to anyone who is thinking about reimplementing the code from the paper.
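The writeup itself isn't preserved in the archive, so the following is only a guess at its shape: a minimal sketch of the incremental sampler from step 4, where insertion_logweight is a hypothetical helper that scores a node's edges under the current alpha and beta if the node were placed at a given position.

    import math
    import random

    def sample_ranking(g, insertion_logweight, seed=0):
        """Step 4 sketch: grow a ranking one node at a time, drawing
        each insertion position with probability proportional to the
        likelihood of that node's edges at that position."""
        rng = random.Random(seed)
        nodes = list(g.nodes())
        rng.shuffle(nodes)
        ranking = []
        for node in nodes:
            logs = [insertion_logweight(g, node, ranking, pos)
                    for pos in range(len(ranking) + 1)]
            m = max(logs)  # normalize in log space to avoid underflow
            weights = [math.exp(x - m) for x in logs]
            pos = rng.choices(range(len(weights)), weights=weights)[0]
            ranking.insert(pos, node)
        return ranking

Running this a few hundred times per EM iteration would produce the batch of rankings that step 5 feeds back into the fit, and saving all of them addresses Mike's complaint about the released code.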
participants (4):
- Eric Purdy
- Kevin Wilson
- Michael Bishop
- Trevor Smith