About the ecology of online aggregates

Academic papers usually must meet two goals. On one hand, they communicate novel scientific progress in a specific discipline, along with its relevance or application. On the other hand, they support those claims with experimental results, proofs, evidence, references to other works, descriptions of the method used, and so on. In this way the scientific community validates the authors' work and knowledge is shared. Science dissemination aims to present that work to the non-scientific community in simpler language. However, some techniques and formulas are complex and difficult to convey without resorting to a very generic simplification. This is why the idea of this new section of our 7Puentes blog is to present an interesting paper in simpler language than the paper itself uses. We will try to keep formulas to a minimum and explain them as well as possible so that our readers can follow them. It is worth mentioning that all figures shown here are copies of the original figures, that the credit for the discovery belongs to Johnson et al., and that for more details you will have to refer to the original work and its complementary material.

What Johnson et al. present in “New online ecology of adversarial aggregates: ISIS and beyond” is a model for the dynamics of the origin, union, predation and breakup of protest and support groups in social networks. Once these processes are characterized, a variable of the model is correlated with a real “event” so as to validate the model's predictive capacity. In the work (which, by the way, is fantastic, very original and deeply inspiring for those who work in social media analysis) they present very good results regarding the 2013 protests in Brazil and an ISIS attack in September 2014.

The model

The model proposes that the full set of groups that express support for protests, strikes, or even terrorist organizations works as an ecosystem with its own dynamics, in which organisms merge, predators appear, and so on.

In simple words, the dynamics we are talking about look like this:

  1. A new “page” appears within the ecosystem with a few followers.
  2. The followers promote the page on other pages and more followers appear.
  3. The page is detected, or its purpose is no longer timely, and the page is closed.
  4. The followers split up and migrate to other pages of the same “ecosystem”, either because those pages are more attractive or because they have greater traction (the followers remain inside the ecosystem).

To set up the ecosystem, a “seed” corpus of keywords unique to this type of support is created (for example, “We love CFK” if the supporters were Cristina Kirchner’s fans). Then a “snowball” process starts, adding followers and keywords until the loop closes. That is to say:

  1. For each keyword, look for related posts and their groups.
  2. For each group, look for its followers and the followers’ groups.
  3. Eliminate the groups that do not correspond to the specific topic.
  4. Create new keywords and repeat from 1 until no new groups are added.

This process runs in real time until the ecosystem is complete. Having a complete set of groups is important in order to model it correctly as an “ecosystem”.
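The snowball loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `snowball`, the dictionaries standing in for a social network's search results, the second keyword, and the hand-made `off_topic` set (standing in for the topic check in step 3) are all hypothetical.

```python
def snowball(seed_keywords, groups_by_keyword, followers_by_group,
             keywords_by_group, off_topic):
    """Grow a set of groups from seed keywords until no new group is found."""
    # Invert the follower map so we can ask "which groups does this user follow?"
    groups_by_follower = {}
    for group, followers in followers_by_group.items():
        for user in followers:
            groups_by_follower.setdefault(user, set()).add(group)

    keywords = set(seed_keywords)
    groups = set()
    while True:
        # Step 1: groups whose posts match any known keyword.
        candidates = set()
        for kw in keywords:
            candidates |= groups_by_keyword.get(kw, set())
        # Step 2: groups followed by the followers of the groups seen so far.
        for group in list(candidates | groups):
            for user in followers_by_group.get(group, set()):
                candidates |= groups_by_follower.get(user, set())
        # Step 3: drop groups that are not about the topic.
        new_groups = (candidates - off_topic) - groups
        if not new_groups:
            return groups, keywords  # loop closed: nothing new was added
        groups |= new_groups
        # Step 4: harvest new keywords from the groups just added.
        for group in new_groups:
            keywords |= keywords_by_group.get(group, set())


# Hypothetical toy data (invented for this example).
groups_by_keyword = {"we love cfk": {"A"}, "cfk fans": {"B"}, "football": {"Z"}}
followers_by_group = {"A": {"u1", "u2"}, "B": {"u2", "u3"},
                      "C": {"u3"}, "Z": {"u1"}}
keywords_by_group = {"A": {"we love cfk"}, "B": {"cfk fans"},
                     "C": {"cfk fans"}, "Z": {"football"}}

found_groups, found_keywords = snowball(
    {"we love cfk"}, groups_by_keyword, followers_by_group,
    keywords_by_group, off_topic={"Z"})
print(sorted(found_groups), sorted(found_keywords))
```

In the toy data, group Z shares a follower with group A but is filtered out as off-topic, while group C is only reachable through a follower of group B, which is why the loop needs more than one pass.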

In a chart with horizontal bars (as in the figure above) you can see that, as the event date approaches, the chart shows a fractal structure with self-organized transitions. The scale parameter scales as (t-tc)^1, as in many phase changes in physical processes.

What the authors observed is that a change of scale precedes an “event”, and they verify the “formula” against the empirical data obtained. In other words, the proliferation of these groups and their online followers is an indicator of the real conditions needed to carry out the event.

According to the model, the evolution of each group (its size measured in number of followers) is determined by two probabilities, one of increasing its size and the other of decreasing it.

On one hand there is v_coal, the probability of adding 1, 2, 3, etc. followers to any group; on the other hand there is v_frag, the probability that the group is closed. The graphic above shows the process of union and fragmentation more clearly. If we plot the number of followers over time, the curve looks like a shark fin. All groups, no matter their size, follow this shape, which is typical of fractal structures (self-similarity). The figure below shows the curves obtained from the empirical data. The great finding of Johnson et al. was a theoretical model that reproduces the same curves.

This closure is realistic since, in some cases, government agencies or other groups attack the group directly, producing a phenomenon of “predation”. An interesting point of the work is that this type of phenomenon does not occur on Facebook, since Facebook closes this kind of page almost automatically and such pages never have the opportunity to grow.
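The union and fragmentation process governed by v_coal and v_frag can be explored with a small simulation. The sketch below is a toy version built on assumptions that the text does not spell out: groups are picked with probability proportional to their size, a fragmenting group breaks completely into single followers, and v_coal + v_frag must not exceed 1. The function name and default values are hypothetical.

```python
import random


def simulate_groups(n_followers=200, v_coal=0.9, v_frag=0.02,
                    steps=5000, seed=1):
    """Toy union/fragmentation process; returns final sizes and a size history."""
    rng = random.Random(seed)
    sizes = [1] * n_followers          # start with every follower alone
    largest_over_time = []
    for _ in range(steps):
        # Assumption: a group is picked with probability proportional to its size.
        i = rng.choices(range(len(sizes)), weights=sizes)[0]
        u = rng.random()
        if u < v_frag:
            # Fragmentation: the group closes and its followers scatter.
            s = sizes.pop(i)
            sizes.extend([1] * s)
        elif u < v_frag + v_coal:
            # Union: the picked group merges with a second size-weighted pick.
            j = rng.choices(range(len(sizes)), weights=sizes)[0]
            if j != i:
                sizes[i] += sizes[j]
                sizes.pop(j)
        largest_over_time.append(max(sizes))
    return sizes, largest_over_time


final_sizes, history = simulate_groups()
print("number of groups at the end:", len(final_sizes))
print("largest group seen:", max(history))
```

Plotting `history` against the step number is one way to look for the slow rise and sudden drop of the shark-fin shape; the exact curve depends on the chosen parameters and random seed.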

The theoretical model presented in the work, which describes the number of groups of size s, consists of the following nonlinear differential equation:

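\[
\frac{\partial n_s(t)}{\partial t} = \frac{v_{coal}}{N^2} \sum_{k=1}^{s-1} k\,(s-k)\, n_k(t)\, n_{s-k}(t) \;-\; \frac{2\, v_{coal}\, s\, n_s(t)}{N^2} \sum_{k=1}^{\infty} k\, n_k(t) \;-\; \frac{v_{frag}\, s\, n_s(t)}{N}
\]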

This equation describes the dynamics of the ecosystem in terms of how many groups of size s (the variable n_s) exist at a given time (the variable t). The solution of the differential equation, both analytical and computational (through simulations), follows a power law s^(-α) with α = 2.5 (the empirical value obtained from the curves is 2.33). Looking at the details, the differential equation has 3 terms: the first describes the growth of groups through union, and the other two describe the loss of groups of size s as a consequence of union and fragmentation, respectively. Note the presence of the two probabilities v_coal and v_frag in those terms.
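To make the three terms concrete, the right-hand side of the equation can be evaluated numerically for a given distribution of group sizes. The Python sketch below is illustrative only. It assumes that N is the total number of followers (the sum of k times n_k over all sizes), that sizes above the length of the input list are empty, and it only covers s of 2 or more, since the equation as written has no source term for single followers. The function name is hypothetical.

```python
def rate_of_change(n, v_coal, v_frag):
    """Evaluate dn_s/dt for s >= 2.

    n[s] is the number of groups of size s; n[0] is unused.
    Returns a dict {s: dn_s/dt}.
    """
    s_max = len(n) - 1
    # Assumption: N is the total number of followers in the ecosystem.
    total = sum(k * n[k] for k in range(1, s_max + 1))
    N = total
    rates = {}
    for s in range(2, s_max + 1):
        # Term 1: groups of size k and s-k merge into a group of size s.
        gain = v_coal / N**2 * sum(
            k * (s - k) * n[k] * n[s - k] for k in range(1, s))
        # Term 2: groups of size s are lost when they merge with any other group.
        loss_union = 2 * v_coal * s * n[s] / N**2 * total
        # Term 3: groups of size s are lost when they fragment.
        loss_frag = v_frag * s * n[s] / N
        rates[s] = gain - loss_union - loss_frag
    return rates


# Ten single followers and nothing else: only pairs can form.
print(rate_of_change([0, 10, 0, 0], v_coal=0.5, v_frag=0.1))
```

With only single followers present, the only non-zero rate is the gain for s = 2, because pairs are the only groups that can form and there is nothing yet to lose.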