The DAILY CORRUPTION: NEWS FEED & DATABASE project is an effort to fill the information gap inherent in traditional aggregate indexes of corruption, and to provide a more efficient way to conduct case studies on the subject by significantly reducing the usual constraints.

The project consists of creating and continuously updating a web database of corruption and anti-corruption news from around the world, starting with a beta version that focuses on twenty-nine selected countries of the Americas.

Upon release, three products will be available immediately: (1) a daily news feed summary of corruption and anti-corruption news reported by the national media of each country; (2) automatic statistical information on variables such as frequency, forms, actors and amounts involved, etc.; and (3) qualitative follow-up data on all relevant cases.

The ultimate goals of developing the proposed database are: (a) to limit the loss of data accuracy regarding the perception of political corruption; (b) to drastically cut down the financial and logistical costs of conducting case studies in developing countries; (c) to offer a constantly updated map of political corruption at the national level; and, perhaps most importantly, (d) to provide an instrument for the follow-up of political will for anti-corruption efforts in each country.


For corruption events, the variables recorded are:

1. Sector, general (public/private)
2. Sector, specific (agriculture, education, tax administration, etc.)
3. Activity (human resource management, service delivery, etc.)
4. Sub-type (bribery, embezzlement, money laundering, etc.)
5. Range (local/national/international)
6. Level of the actors involved (low/middle/high)
7. Scale of the network (intra-organizational/inter-organizational)
8. Amount of resources involved (petty/grand)
9. Legality (criminal/administrative/ethical consequences)
10. Actors involved, general (executive, legislative, judiciary, military, etc.)
11. Actors involved, specific (incumbent/former)
12. Impact expected over the leadership (unfavorable/neutral/favorable)
13. Perception changes that may result (-5 to +5 corruption perception points)
14. Relevance, by number of media outlets reporting the event (single/multiple)
15. Relevance, by page of the printed media where the event is reported (front/inside)
16. Relevance, by the position of the news article on the page (main/secondary/tertiary)
17. Relevance, by word count

For anti-corruption events, the variables recorded are:

1. Sector (public/private)
2. Sub-type (procedure, plan, appointment, etc.)
3. Range (local/national/international)
4. Level of the actors targeted (low/middle/high)
5. Initiative actors (executive, legislative, judiciary, military, etc.)
6. Implementation actors (executive, legislative, judiciary, military, etc.)
7. Impact expected over the leadership (unfavorable/neutral/favorable)
8. Perception changes that may result (-5 to +5 corruption perception points)
9. Relevance, by number of media outlets reporting the event (single/multiple)
10. Relevance, by page of the printed media where the event is reported (front/inside)
11. Relevance, by the position of the news article on the page (main/secondary/tertiary)
12. Relevance, by word count
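
As an illustration only, a coded corruption event and the "automatic statistical information" mentioned earlier might look like the sketch below. It covers just a subset of the variables listed above; the class name, field names, the `country` field, the `frequency_by` helper and the sample entries are all assumptions made for this example, not part of the project's actual schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CorruptionEvent:
    """One coded corruption news item (hypothetical subset of the variables)."""
    country: str            # assumed field, not in the variable list
    sector_general: str     # "public" or "private"
    sub_type: str           # e.g. "bribery", "embezzlement"
    event_range: str        # "local", "national" or "international"
    actor_level: str        # "low", "middle" or "high"
    amount: str             # "petty" or "grand"
    perception_change: int  # -5 to +5 corruption perception points

    def __post_init__(self) -> None:
        # Enforce the -5 to +5 scale given for perception changes.
        if not -5 <= self.perception_change <= 5:
            raise ValueError("perception_change must be between -5 and +5")


def frequency_by(events, field_name):
    """Count how many events fall under each value of one variable."""
    return Counter(getattr(event, field_name) for event in events)


# Made-up sample entries, for illustration only.
sample = [
    CorruptionEvent("A", "public", "bribery", "national", "high", "grand", -2),
    CorruptionEvent("A", "public", "bribery", "local", "low", "petty", -1),
    CorruptionEvent("B", "private", "embezzlement", "local", "middle", "petty", 0),
]
print(frequency_by(sample, "sub_type"))
```

Calling `frequency_by(sample, "actor_level")` or any other field name gives the same kind of tally for that variable.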


6 thoughts on “FEEDBACK”

  1. Pablo Noriega Vinces

    Oversight of associations and NGOs.
    Members of the armed forces and police who bear responsibility in drug trafficking.

  2. Prof. Dr. Carsten Stark

    This database is a very good idea. I’m almost enthusiastic. But I fear there are some problems. How can you ensure that you will get representative data? What really is the difference between anti-corruption and corruption; is there always a reliable difference? And what about the quality of the data? I guess the greatest issue will be reliability. Perhaps you can solve these issues if you have a team of scientists in every country, cooperating with you and responsible for data quality. Perhaps you should cooperate with Transparency International. They have specialists in almost every country, and I guess they would be really interested in your data.

    1. JPozsgai

      Thank you for raising these issues. I will try to address each one the best I can here, but if there are further comments, please do not hesitate to share them with us.
      (1) The representativeness of the data is indeed problematic, but we hope to mitigate the threat somewhat by choosing media sources with different political leanings, with at least 2-3 main outlets per country at the beginning. With the later adoption of text classification tools, we hope to expand the number of outlets covered, especially in countries with larger populations. The main strategy to control for editorial bias, on the other hand, will be to weight the data by country-level media opposition, popular approval, media congestion, freedom of the press, and freedom of information, all variables found in the literature to potentially skew news coverage.
      (2) Regarding the difference between corruption and anti-corruption entries, we are considering the following rule: If the event pertains to a case of misuse of office for private gain (including all actions taken to investigate, prosecute, and punish it), the entry is categorized as CORRUPTION. If the event pertains to the prevention, control or punishment of corruption in general (that is, regardless of its impact on any specific corruption case), the entry is categorized as ANTI-CORRUPTION. Additionally, a secondary field addresses the “direction of impact on organizational legitimacy” in order to differentiate between favorable and unfavorable news of both corruption (control vs. scandal) and anti-corruption (policy adoption vs. implementation failure) cases. In this way, we can clearly differentiate between good and bad news of corruption, and good and bad news regarding anti-corruption, and measure them independently.
      (3) Finally, the quality of the data will depend largely on the capacity of our local partners conducting the daily input, and on our constant training, monitoring and support of this activity. For validation purposes, we are also currently extending invitations to organizations such as Global Integrity to keep independent track of the cases covered in each country. We hope this will provide both a source of legitimacy and a second level of quality control.
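
A minimal sketch of how the rule in (2) might be encoded, assuming the coder has already judged whether the news item concerns a specific case of misuse of office. The names `EntryType`, `Impact`, `classify_entry` and `news_kind` are hypothetical and chosen for this example only; the four labels in the lookup table are the ones given in the reply above.

```python
from enum import Enum


class EntryType(Enum):
    CORRUPTION = "CORRUPTION"
    ANTI_CORRUPTION = "ANTI-CORRUPTION"


class Impact(Enum):
    """Direction of impact on organizational legitimacy (secondary field)."""
    FAVORABLE = "favorable"
    UNFAVORABLE = "unfavorable"


def classify_entry(concerns_specific_case: bool) -> EntryType:
    """Apply the primary rule.

    True:  the event pertains to a specific case of misuse of office,
           including its investigation, prosecution or punishment.
    False: the event pertains to prevention, control or punishment of
           corruption in general.
    """
    if concerns_specific_case:
        return EntryType.CORRUPTION
    return EntryType.ANTI_CORRUPTION


# Good vs. bad news for each entry type, as described in the reply.
NEWS_KIND = {
    (EntryType.CORRUPTION, Impact.FAVORABLE): "control",
    (EntryType.CORRUPTION, Impact.UNFAVORABLE): "scandal",
    (EntryType.ANTI_CORRUPTION, Impact.FAVORABLE): "policy adoption",
    (EntryType.ANTI_CORRUPTION, Impact.UNFAVORABLE): "implementation failure",
}


def news_kind(entry_type: EntryType, impact: Impact) -> str:
    """Combine the primary category with the secondary impact field."""
    return NEWS_KIND[(entry_type, impact)]


# Example: a report on the prosecution of a specific bribery case,
# judged favorable to the organization's legitimacy.
entry = classify_entry(concerns_specific_case=True)
print(entry.value, "/", news_kind(entry, Impact.FAVORABLE))
```

Keeping the two fields separate is what allows good and bad news to be measured independently for each category.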

  3. Andrej Skolkay

    It may be a useful project. However, I wonder how to deal with the fact that the media
    a) mostly report accusations or suspicions and not necessarily real corruption (although sometimes it turns out to be real);
    b) do not report all cases of corruption, and this in three senses: some media report on some cases of what they call corruption, while other media may ignore these reports, and there are also cases that do not get reported at all;
    c) moreover, sometimes courts or prosecutors drop cases that may be real corruption because evidence is missing or there is political pressure behind the verdicts.
    In other words, is the aim here to get the media version of corruption or a somehow more reliable picture of corruption?

    1. JPozsgai

      Thank you for raising this issue. Indeed, media reports cannot, and should not, be equated to ACTUAL corruption. Reports are, however, as close as we can currently get to assessing the real dimension of corruption in any country, aside from strategies employing proxy measures such as size of informal market, cost overrun, and the like (all of which have their own issues).

      At a minimum, we can hopefully agree that media reports are the main (if not the only) source of information regarding political corruption, and thus the origin of actual corruption perceptions. When we consider that perceptions are currently assessed by surveying experts (representing a secondary source of data), it is clear that tapping the media directly can at least reduce the noise of individual bias.

      Additionally, in order to increase data validity, the project includes information on “Original source of information/allegations”, which is a variable that will allow the user to differentiate between actual, legal cases and discursive usage/attacks when conducting analysis. Its classification rule is: If the news article reports on information provided by actors in exercise of their non-partisan, technical public roles (e.g. prosecutors, judges, international public servants), please select OFFICIAL. If information is provided by actors in exercise of their partisan (congress members) and/or private roles (experts), or it is mainly generated by private organizations (NGOs, media), please select UNOFFICIAL. If the article reports on information that cannot be assigned to any of the previous categories with a high level of confidence, please select UNKNOWN/UNCLEAR.
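
The classification rule above could be sketched roughly as follows. This is an assumption-laden illustration: it reduces the coder's judgment to a lookup on a single role label, and the role vocabularies, `Source` and `classify_source` are hypothetical names built from the examples in the rule, not an actual project component.

```python
from enum import Enum
from typing import Optional


class Source(Enum):
    OFFICIAL = "OFFICIAL"
    UNOFFICIAL = "UNOFFICIAL"
    UNKNOWN = "UNKNOWN/UNCLEAR"


# Example roles taken from the rule; a real codebook would be longer.
NON_PARTISAN_TECHNICAL_ROLES = {
    "prosecutor",
    "judge",
    "international public servant",
}
PARTISAN_OR_PRIVATE_ROLES = {
    "congress member",
    "expert",
    "ngo",
    "media",
}


def classify_source(role: Optional[str]) -> Source:
    """Map the role of whoever provided the information to a source label."""
    if role is None:
        return Source.UNKNOWN
    normalized = role.strip().lower()
    if normalized in NON_PARTISAN_TECHNICAL_ROLES:
        return Source.OFFICIAL
    if normalized in PARTISAN_OR_PRIVATE_ROLES:
        return Source.UNOFFICIAL
    # Anything that cannot be assigned with confidence stays unclear.
    return Source.UNKNOWN


for role in ("Judge", "NGO", None, "anonymous tip"):
    print(role, "->", classify_source(role).value)
```

Filtering on the resulting label is what would let a user separate legal cases from discursive usage or attacks when running an analysis.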

      Finally, the objective of this database is not to provide a perfect measurement of the corruption phenomenon, but to disaggregate it as much as possible along the dimensions commonly considered as relevant in the literature; and to give users the freedom to analyze and combine it as they see fit. We do this by explicitly defining every concept used to classify the data, without attempting to extend its logic to the perfect description of such a complex issue as that of corruption. The only place where decisions regarding conceptual validity will be made is in the area of ADVANCED ANALYTICS, a section of this platform where statistical trends will be shown based on the combination of variables selected by our team. But, even there, the rationale behind the exercise will be clearly shown so that the concepts behind the numbers are understood and, if necessary, challenged.

  4. Omar SHDEIFAT

    Hi all. Hi Joseph, if you need any support with the analysis, I am ready to help.

    Best regards,
