Cointime

A Quantitative Simulation of Pairwise Voting for RetroPGF

From CryptoEconLab by Kiran Karra

Introduction

CryptoEconLab has been bootstrapping the Filecoin RetroPGF program. The first round of FIL-RetroPGF followed the Optimism framework as closely as possible. From a voting design perspective, FIL-RetroPGF-1 used the Quorum and Threshold (Q+T) model to convert badgeholder votes to funding decisions. In the Q+T model, badgeholders are asked to vote on how much funding they would like to allocate to all projects, simultaneously. These votes are aggregated by a scoring mechanism that determines the final funds to be distributed to each project. A key point in Q+T voting is that badgeholders must assess all projects against each other simultaneously.

However, other voting mechanisms are possible. In this post, we quantitatively characterize another voting mechanism, Pairwise, in the context of RetroPGF-based funding. We introduce a new open-source framework, voting_mechanism_design, currently under active development, to compare Pairwise to the Quorum and Threshold voting mechanism.

Using this framework, we show that Pairwise can allocate capital more efficiently than Q+T voting. We then explore the robustness of Pairwise voting to negative behaviors such as conflict of interest (COI) and collusion.

Pairwise Voting Mechanism

Pairwise voting is a mechanism that enables badgeholders to express their preferences for which projects they would like to see funded. It works by presenting pairs of projects to the badgeholders. For each pair, the badgeholder selects which project they feel deserves more funding. This is done for as many pairs as the badgeholder wishes. After all badgeholder votes are collected, a model (e.g., the Bradley-Terry model) is applied to infer a global ranking of projects from the individual pairwise votes. This is similar to a chess ranking system, where one-on-one match results are aggregated to create a global ranking of all players. Rankings are then mapped to funding amounts according to a pre-defined distribution or mapping (interfaces for pairwise voting have already been implemented).
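To make the inference step concrete, the following is a minimal sketch of fitting Bradley-Terry strengths from (winner, loser) votes with the standard minorization-maximization update. The function name `bradley_terry`, the `prior` smoothing term, and the iteration count are illustrative assumptions and are not taken from the voting_mechanism_design framework.

```python
def bradley_terry(n_items, votes, iters=200, prior=0.1):
    """Estimate Bradley-Terry strengths from a list of (winner, loser) pairs.

    Assumes n_items >= 2. `prior` adds a small number of virtual wins and
    losses between every pair of items (an assumption made here so that items
    with no real wins still get a positive strength).
    """
    # Virtual wins: `prior` wins against each of the other items.
    wins = [prior * (n_items - 1)] * n_items
    # games[i][j] = number of comparisons between i and j (real + virtual).
    games = [[0.0 if i == j else 2 * prior for j in range(n_items)]
             for i in range(n_items)]
    for winner, loser in votes:
        wins[winner] += 1
        games[winner][loser] += 1
        games[loser][winner] += 1

    strength = [1.0] * n_items
    for _ in range(iters):
        updated = []
        for i in range(n_items):
            denom = sum(games[i][j] / (strength[i] + strength[j])
                        for j in range(n_items) if j != i)
            updated.append(wins[i] / denom)
        # Rescale so the strengths sum to n_items (only ratios matter).
        total = sum(updated)
        strength = [s * n_items / total for s in updated]
    return strength


# Project 0 wins most of its comparisons, project 2 loses most of them.
votes = [(0, 1)] * 3 + [(1, 2)] * 3 + [(0, 2)] * 2 + [(2, 1)]
strengths = bradley_terry(3, votes)
ranking = sorted(range(3), key=lambda i: -strengths[i])
print(ranking)
```

Sorting projects by estimated strength gives the global ranking, which would then be mapped to funding amounts.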

The pairwise voting mechanism differs from the “Quorum + Threshold” approach used by Optimism and FIL-RetroPGF-1, which we denote as Q+T. In Q+T, badgeholders vote on how much funding they would like to go to each project, considering all projects simultaneously. Pairwise instead presents projects to each badgeholder two at a time. The hypotheses motivating Pairwise are:

  1. It reduces the cognitive load on badgeholders, who would otherwise have to assess hundreds of projects simultaneously, by limiting the scope of each vote to two projects.
  2. It is a more robust way to create a global ranking of projects since it can be inferred by well-known algorithms used in other related applications.
  3. It can result in more accurate capital allocation.

We created a pairwise voting simulator to test these hypotheses and to measure how robust this voting mechanism is to negative behaviors.

Comparing to Quorum + Threshold

We begin by comparing the baseline performance of the two mechanisms. Specifically, we want to compare how aligned the global rankings of each mechanism are to the true project rankings. This measures the efficacy of the voting mechanism’s capital allocation. To incorporate the realities of the RetroPGF process, where badgeholders have limited time and energy to evaluate projects, we assess the capital allocation accuracy as a function of laziness and expertise.

In our framework, badgeholder laziness is a value between 0 and 1 that translates to how many projects the particular badgeholder will vote on. A laziness of 0 indicates that the badgeholder will vote on all projects, whereas a laziness of 1 indicates that badgeholders will not vote on any projects. Expertise is a value between 0 and 1 that translates to how aligned a badgeholder’s votes are with the true project ratings. An expertise of 0 means that the badgeholder is placing fully random votes, whereas an expertise of 1 means that a badgeholder is placing perfectly accurate votes. Interpolation between these values is discussed in the Appendix.
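As a rough illustration of these two parameters, the sketch below blends each project's true impact with uniform noise according to expertise, and selects a fraction (1 - laziness) of projects to vote on. Both the linear blend and the function names are assumptions made for illustration; the mapping actually used in the simulations is the one described in the Appendix.

```python
import random


def perceived_impacts(true_impacts, expertise, rng):
    """Blend true impact with uniform noise (assumed linear interpolation).

    expertise = 1 returns the true impacts; expertise = 0 returns pure noise.
    """
    return [expertise * t + (1 - expertise) * rng.random() for t in true_impacts]


def projects_voted_on(n_projects, laziness, rng):
    """Pick the subset of projects a badgeholder actually evaluates.

    laziness = 0 means every project; laziness = 1 means none.
    """
    n_votes = round((1 - laziness) * n_projects)
    return sorted(rng.sample(range(n_projects), n_votes))


rng = random.Random(42)  # fixed seed so runs are repeatable
true_impacts = [rng.random() for _ in range(10)]
print(perceived_impacts(true_impacts, expertise=0.25, rng=rng))
print(projects_voted_on(10, laziness=0.5, rng=rng))
```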

We initialize the simulation by seeding each project with a “true impact” rating, a value between 0 and 1 indicating the impact. We then create a population of badgeholders with particular expertise and laziness values to be tested.

We then simulated, for each of the two voting mechanisms, how well the resulting global ranking of projects aligned with the true ranking, as a function of badgeholder laziness and expertise. Alignment is measured as the rank correlation between the true project rankings and the rankings inferred from badgeholder votes. An alignment of 1 is perfect, whereas 0 corresponds to a purely random ranking. Negative values are possible because rank correlation ranges from -1 to 1, but this is a technicality.
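The alignment metric can be sketched as follows. The post says only "rank correlation", so the choice of Spearman's coefficient here (rather than, say, Kendall's tau) is an assumption, as are the function names.

```python
def ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


true_impact = [0.1, 0.5, 0.9]
print(spearman(true_impact, [1.0, 2.0, 3.0]))  # same ordering
print(spearman(true_impact, [3.0, 2.0, 1.0]))  # reversed ordering
```

A score near 1 means the inferred ranking matches the true ordering, and a score near -1 means it is reversed.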

Fig 1 shows the results of our experiments. The x-axis sweeps across badgeholder expertise, from 0.0 (random guessing) to 1.0 (perfect badgeholder voting), scaling linearly between the two. The y-axis measures the alignment between the inferred rankings and the true project impact. Blue dots represent Monte Carlo simulation runs of the pairwise voting mechanism, and green dots represent the Quorum and Threshold mechanism. The darkness of the dots maps to different badgeholder laziness, as indicated by the legend.

Fig 1: A comparison of the effectiveness of Q+T and Pairwise voting mechanisms

The results indicate that the pairwise voting mechanism is more robust to both low badgeholder expertise and laziness than the Q+T voting mechanism. This is evident because, for corresponding values of expertise and laziness, the blue dot groupings consistently have higher alignment than the green dot groupings.

COI Modeling

Next, we wanted to test the effect of conflict of interest (COI) on the pairwise voting mechanism. COI is defined as the scenario where a badgeholder votes for a particular project in which they have a vested interest (perhaps financial). In our simulations, COI is modeled as a badgeholder who consistently votes for a particular project, even when presented with a pair in which the other project has a higher perceived impact. The metric we use to determine the effect of COI is the change in a project’s relative ranking between a run with COI voting and a run without it. Fig 2 shows the effect of COI on a project’s ranking as a function of three variables: a) the project’s true impact, b) the badgeholder’s expertise, and c) the badgeholder’s laziness. The y-axis shows the change in project ranking, and the x-axis is binned from the most impactful to the least impactful project. Color and brightness indicate the badgeholder population’s expertise and laziness.

Fig 2 shows that more impactful projects are less affected by COI voting, and that the effectiveness of COI voting increases with badgeholder laziness.

These observations match intuition — since more impactful projects are more likely to be voted for by badgeholders, a COI badgeholder applying COI behavior to a highly impactful project will not affect the project’s ranking significantly. Similarly, a lazier badgeholder population results in fewer overall votes, which can amplify the effect of COI voting.

Fig 2: Effect of COI on Project rankings using Pairwise Voting
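The COI voting rule described in this section can be sketched as a single decision function. The name `pairwise_vote` and its `favored` parameter are hypothetical and chosen for illustration; they are not the simulator's API.

```python
def pairwise_vote(a, b, perceived, favored=None):
    """Return (winner, loser) for a pair of project indices.

    An honest badgeholder picks the project with the higher perceived impact.
    A badgeholder with a conflict of interest always picks `favored` whenever
    it appears in the pair, regardless of perceived impact.
    """
    if favored in (a, b):
        other = b if favored == a else a
        return (favored, other)
    return (a, b) if perceived[a] >= perceived[b] else (b, a)


perceived = [0.9, 0.2, 0.5]
print(pairwise_vote(0, 1, perceived))             # honest vote
print(pairwise_vote(0, 1, perceived, favored=1))  # COI vote for project 1
print(pairwise_vote(0, 2, perceived, favored=1))  # favored project not in pair
```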

Collusion

Next, we measure the effect of collusion on a particular project’s rankings. We define collusion as a coordinated agreement between multiple badgeholders to vote for a particular project regardless of the impact of that project. In our simulations, this manifests as multiple badgeholders voting for a particular project consistently, regardless of the relative impact of the other project. This is essentially an amplified version of the COI voting, so we use the same metrics to determine the effect of collusion on project rankings.

Fig 3 below shows the effect of collusion on project rankings. Light colors indicate only one colluding agent (equal to the COI case), and darker colors indicate more badgeholders colluding for a particular project. The x-axis represents the impact of the project that is being colluded, and the y-axis represents the delta in the project rankings for that particular project.

Fig 3 shows that as the number of agents colluding for a particular project increases, the effect of the collusion is amplified. This matches intuition: each additional colluding badgeholder adds votes for the project regardless of its merit, which shifts its overall ranking further. Additionally, the effect of collusion depends on the project’s true rating: a more impactful project is less affected by collusion than a less impactful project.

Fig 3: The effect of collusion on project rankings, for Expertise=0.25 and Laziness=0.75.
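The ranking-delta metric can be illustrated with a deliberately simplified toy model. It assumes every badgeholder votes on every pair with perfect expertise, and it ranks projects by raw win counts as a stand-in for the Bradley-Terry inference; neither assumption matches the settings of Fig 3, and the function names are hypothetical.

```python
from itertools import combinations


def rank_of(target, impacts, n_badgeholders, n_colluders):
    """Rank of `target` (1 = top) when the first `n_colluders` voters collude."""
    n = len(impacts)
    wins = [0] * n
    for voter in range(n_badgeholders):
        colluding = voter < n_colluders
        for a, b in combinations(range(n), 2):
            if colluding and target in (a, b):
                winner = target  # colluders always back the target
            else:
                winner = a if impacts[a] >= impacts[b] else b
            wins[winner] += 1
    # Stable sort: ties keep the original index order.
    order = sorted(range(n), key=lambda i: -wins[i])
    return order.index(target) + 1


def rank_gain(target, impacts, n_badgeholders, n_colluders):
    """Number of ranking places the target gains due to collusion."""
    honest = rank_of(target, impacts, n_badgeholders, 0)
    colluded = rank_of(target, impacts, n_badgeholders, n_colluders)
    return honest - colluded


impacts = [0.9, 0.7, 0.5, 0.3, 0.1]
for k in (1, 3, 5):
    print(k, rank_gain(4, impacts, 10, k), rank_gain(0, impacts, 10, k))
```

In this toy setting the lowest-impact project gains more places as the number of colluders grows, while the top project has nothing to gain, which is consistent with the qualitative trends described for Fig 3.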

Conclusion

In this work, we described the Pairwise voting mechanism and quantitatively compared its accuracy with that of the Quorum+Threshold method. We found that for equivalent values of badgeholder laziness and expertise, Pairwise voting can help to allocate capital more efficiently than the Quorum+Threshold method. Next, we examined the robustness of Pairwise to negative behaviors such as COI and collusion. For both, our results indicate that the effect on a project depends on the badgeholder population’s behavior patterns and on the project’s true impact. More impactful projects are less affected by COI and collusion than less impactful projects. Similarly, a more active badgeholder population results in more total votes cast, reducing the effect of COI and collusion.

Our next steps are to implement COI and collusion modeling for the Quorum and Threshold voting mechanism and compare the two voting mechanisms along that dimension. This can help to fully characterize whether Pairwise can be a viable replacement for Quorum and Threshold voting for RetroPGF. Beyond this, we aim to expand the simulator to other voting mechanisms that can be applied to this style of funding, and we welcome contributions from others interested in working in this space!

Please reach out to us at [email protected] if you’re interested in working together on mechanism design, cryptoeconomics, quantitative modeling, or other related subjects for your project.

Appendix

A1 — Expertise Mapping

In this section, we discuss how badgeholder expertise maps to the alignment between the true project rankings and the badgeholder rankings. Fig 4A shows the mapping that was implemented by the original OP simulator. Here, we notice that an expertise of 0 still maps to a reasonably strong correlation between the true project impact and the badgeholder-assigned impact, which does not match the definition of expertise given earlier.

Fig 4B shows the updated mapping we used in the simulations above, which is more aligned with our definition of expertise. It also shows the pairwise version.

Fig 4: Mapping Expertise to Project Rankings

Future work could make a badgeholder’s voting accuracy depend on both the project’s true impact and the expertise factor. The logic here is that it may be easier to vote on projects with high true impact, regardless of expertise, whereas lower-impact projects need more expertise to discern their true impact.

A2 — Aligning Laziness

In this section, we identify how laziness differs between pairwise and Q+T voting. In pairwise voting, projects are presented in pairs, and the number of pairs grows quadratically with the number of projects. For 100 projects, there are 4950 distinct pairs that a badgeholder could vote on. However, in Q+T voting, there are only 100 projects to vote on. If we define laziness as the percentage of projects a badgeholder votes on, then we need to ensure that the mapping is normalized. Consider the following example for 100 projects: if a laziness of 0.5 corresponds to a Q+T badgeholder voting on 50 projects, then we cannot assume that a pairwise badgeholder will vote on 0.5*4950 = 2475 pairs. They may vote on far fewer, because even if pairwise voting is more accessible and less time-consuming, 2475 votes is still sizeable in absolute terms!
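The arithmetic in this example follows from the number of unordered pairs among n projects, n(n-1)/2. A short check, where the function name `pairwise_workload` is an illustrative assumption:

```python
from math import comb


def pairwise_workload(n_projects, laziness):
    """Return (total pairs, pairs voted on) if laziness were applied naively."""
    total_pairs = comb(n_projects, 2)  # n * (n - 1) / 2 unordered pairs
    return total_pairs, round((1 - laziness) * total_pairs)


total, naive_votes = pairwise_workload(100, 0.5)
print(total, naive_votes)
```

For 100 projects this reproduces the 4950 pairs and the 2475 naive votes quoted in the example, compared with only 50 votes for a Q+T badgeholder at the same laziness.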

To normalize the factors in the simulations above, we use the mapping described in Fig 5, which translates badgeholder laziness in the Q+T setting to badgeholder laziness factor in the pairwise setting. We note this to be imperfect and that it requires further investigation.

Fig 5: Mapping of laziness between the Q+T and Pairwise schemes.

A3 — Funding

This work was conducted through the OP funding vehicle and supported by Filecoin Impact Fund Public Goods funding.
