Auditing with ChatGPT: Complementary But Incomplete

In November 2022, OpenAI launched ChatGPT, an innovative Artificial Intelligence (AI) project. In addition to summarizing articles, crafting essays, and even writing jokes and poems, ChatGPT can be used to debug and generate code. With more than $3.7 billion lost to hacks and scams of Web3 projects, some people wondered if this new technology could improve insecure smart contract code.

ZKasino, a decentralized betting platform, recently engaged in a pre-audit with ChatGPT. ZKasino hoped that ChatGPT could give it an initial security review while CertiK’s comprehensive audit was still in progress. The team wanted to test the capabilities of ChatGPT as a smart contract auditor. So how did it perform? Is AI ready to take over from expert manual code auditors, or does the human touch still have something to offer?

On Dec. 23, 2022, ZKasino “hired” ChatGPT to identify potential security issues in their smart contracts. The tool raised several concerns that sounded valid on the surface.

While ChatGPT undeniably provides a valuable service to the Web3 security community, we found that there is quite a lot of room for improvement. ChatGPT missed a number of important vulnerabilities while giving false positives for good code.

We hope that our insight and recommendations can help ChatGPT become an even stronger tool for securing Web3 applications. The following sections present our findings on these two types of mistakes.

What Did ChatGPT Find?

What Did ChatGPT Miss?

ChatGPT mentioned several common security concerns that can be found in many smart contract implementations. However, it failed to identify certain serious security issues, including:

  • Project-specific logic vulnerabilities
  • Inaccurate math calculations and statistical models
  • Inconsistencies between implementation and design intention

Vulnerability #1: Project-Specific Logic

ChatGPT failed to identify a critical vulnerability, leaving ZKasino users vulnerable to an exploit where attackers could consistently win and drain funds from the Bankroll contract. Players join a game by requesting randomness from Chainlink's Verifiable Random Function (VRF), which then triggers the fulfillRandomWords() function with random numbers to complete the game. ZKasino’s code allowed a refund of the user's wager if the call to fulfillRandomWords() failed.

Figure 1: A consistent winning attack strategy

During CertiK’s code review of the same smart contract code, a potentially harmful _transferPayout() invocation was discovered. The function was designed to transfer winning payouts to the player's account. An attacker can deliberately make _transferPayout() revert when they lose, causing the entire fulfillRandomWords() call to fail. After a waiting period of 100 blocks, the attacker can then call CoinFlip_Refund() to get the wager back, meaning the attacker would never lose money.

While the transfer failure issue was recognized by ChatGPT, the potential attack methods linked to the project design were not. Thus, the impact of the failure combined with the project's logic was not identified by ChatGPT. See ZKasino’s full audit report for a description of the specific attack flow.
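The revert-then-refund strategy described above can be modeled in a few lines. This is only a Python sketch of the logic, not ZKasino's Solidity code: the receiver functions, the even-odds 2x payout, and the assumption that the payout transfer is attempted on a losing round are all hypothetical simplifications.

```python
import random

class PayoutReverted(Exception):
    """Models a revert raised inside the player's receive hook."""

def honest_receiver(payout, wager):
    # An ordinary player accepts whatever payout arrives.
    return None

def attacker_receiver(payout, wager):
    # Hypothetical attacker contract: reject any payout that signals a loss.
    if payout < wager:
        raise PayoutReverted()

def play_round(receiver, wager, rng):
    """Return the player's net result for one simplified coin flip."""
    won = rng.random() < 0.5
    payout = 2 * wager if won else 0  # assumed payout rule, no house edge
    try:
        receiver(payout, wager)  # stands in for _transferPayout()
    except PayoutReverted:
        # The callback failed, so after the waiting period the player
        # claims a refund and the wager comes back in full.
        return 0
    return payout - wager

def simulate(receiver, rounds=1000, wager=1, seed=0):
    rng = random.Random(seed)
    return [play_round(receiver, wager, rng) for _ in range(rounds)]
```

Comparing simulate(honest_receiver) with simulate(attacker_receiver) on the same seed shows the asymmetry: the honest player's results include losses, while the attacker's worst round is a break-even refund.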

Vulnerability #2: Inaccurate Math Calculations and Statistical Models

Ensuring randomness and outcomes that meet reasonable expectations is of the utmost importance in any gaming project. To confirm this, the randomness of each game outcome was thoroughly evaluated during the audit process. Though ChatGPT acknowledged the significance of this matter, it did not detect any cases of unfairness. It brought up the use of VRF and the potential for unfair outcomes if the VRF contract is compromised or manipulated:

“If the VRF contract is not secure or is manipulated, it could potentially lead to unfair outcomes for the game.”

However, this conclusion is limited and does not address the root causes of unfairness. We found a number of potential issues regarding randomness in the course of our audit.

Unfair Randomness Distribution

One medium-severity issue concerns unfair random number usage in the VideoPoker game, where players have a lower chance of getting certain cards.

Decimal Truncation

Another issue, discovered in the Dice game, would have allowed players to choose special multipliers to maximize their expected returns.
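To see how truncation can make some multipliers more attractive than others, consider the following sketch. The article does not give the Dice contract's formula, so every constant and both functions here are hypothetical; the point is only that integer division (which truncates, as it does for unsigned integers in Solidity) makes the expected return depend on the chosen multiplier. The direction and size of the effect in the real contract depend on its actual formula.

```python
from fractions import Fraction

OUTCOMES = 10_000          # hypothetical: rolls are integers 0..9999
SCALE = 10_000             # hypothetical: a 2.0x multiplier is stored as 20000
TARGET_RETURN_BPS = 9_900  # hypothetical: 99% intended return to the player

def win_threshold(multiplier_scaled):
    # Integer division drops the fractional part of the threshold.
    return TARGET_RETURN_BPS * SCALE // multiplier_scaled

def expected_return(multiplier_scaled):
    # Probability of winning times the payout multiplier, computed exactly.
    win_probability = Fraction(win_threshold(multiplier_scaled), OUTCOMES)
    return win_probability * Fraction(multiplier_scaled, SCALE)
```

In this model a 2.0x multiplier divides evenly and returns the full intended 99%, while a 7.0x multiplier loses part of its winning range to truncation and returns slightly less. A player who knows this would favor the multipliers that divide evenly.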

Vulnerability #3: Inconsistencies Between Implementation and Intended Design

ChatGPT is often able to understand the implementation of a single function while failing to grasp the design's underlying purpose. For example, it may understand the technical execution of a certain function, but not be able to place the purpose of this function in the broader context of the smart contract. To avoid such mistakes, ChatGPT needs to better understand smart contract code logic. As it currently stands, ChatGPT provides a surface-level reading of the code. To take its auditing to the next level, it must be able to work backwards from a function to derive its initial logic: a significant task.

Incorrect Input Validation

An input validation issue was discovered in the Plinko contract, resulting in incorrectly set multipliers.

According to ZKasino, the number of rows used in Plinko should be 8 to 16. However, the Bankroll contract owner can set a row number outside the expected range through the function setPlinkoMultipliers() because of a bug in its validation check.

The code indicates the transaction will revert if both numRows and risk are invalid. However, if only one of the two criteria is invalid, the check will still pass, and the code will not revert.
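The original Solidity check is not reproduced in this article, so the following Python sketch reconstructs only the logic described in the text: the flawed check combines the two conditions with "and" where the intended behavior needs "or". The function names are hypothetical, and the valid ranges (8 to 16 rows, risk below 3) are taken from the description in this section.

```python
def rows_invalid(num_rows):
    return num_rows < 8 or num_rows > 16

def risk_invalid(risk):
    return risk >= 3

def buggy_check_reverts(num_rows, risk):
    # Flawed logic as described: revert only when BOTH inputs are invalid.
    return rows_invalid(num_rows) and risk_invalid(risk)

def intended_check_reverts(num_rows, risk):
    # Intended logic: revert when EITHER input is invalid.
    return rows_invalid(num_rows) or risk_invalid(risk)
```

With numRows set to 20 and a valid risk of 1, buggy_check_reverts() returns False and the out-of-range value is accepted, whereas intended_check_reverts() returns True.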

ChatGPT gave a different answer in response to the second inquiry: “The function then checks if the value of "numRows" is between 8 and 16, and if the value of "risk" is less than 3. If either of these conditions are not met, the function reverts with the error "InvalidNumberToSet".”

ChatGPT appears to comprehend the purpose of the function. Nevertheless, it lacks knowledge of how the function is meant to be used and cannot identify the real vulnerability without extra information.

Inconsistent Value Update

In the Slots contract, an issue related to an inconsistent update to totalValue was identified, which could result in the game ending prematurely. The totalValue variable was used to monitor a user's winnings or losses, but it only kept track of the payout and failed to deduct the wager, leading to an incorrect calculation of the user's gain or loss.
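A short sketch shows why leaving out the wager can end a game early. The article does not say what condition ends the game, so the stop_gain threshold below is an assumption made for illustration, as are the function name and the sample numbers.

```python
def spins_before_stop(payouts, wager, stop_gain, deduct_wager):
    """Count the spins played before the running total reaches stop_gain.

    stop_gain is a hypothetical limit that ends the session once the
    tracked total reaches it.
    """
    total_value = 0
    for spins, payout in enumerate(payouts, start=1):
        total_value += payout
        if deduct_wager:
            total_value -= wager  # the step the flawed version omits
        if total_value >= stop_gain:
            return spins
    return len(payouts)

payouts = [0, 15, 0, 15, 0, 30]  # made-up payouts for six spins at wager 10
flawed = spins_before_stop(payouts, wager=10, stop_gain=20, deduct_wager=False)
correct = spins_before_stop(payouts, wager=10, stop_gain=20, deduct_wager=True)
```

Here the flawed accounting reaches the limit on the fourth spin, even though the player is actually down 10 at that point, while the corrected accounting lets all six spins play out.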

Conclusion

Despite its training, ChatGPT misses certain important security issues in its audits. This is due to the limitations of AI in fully understanding the complexities and nuances of code, as well as its lack of hands-on experience in real-world scenarios. As stated on its official website, ChatGPT is a research release that relies on natural language processing for dialogue purposes. It is often unable to understand the intent and reasoning behind the code as well as a human auditor can. As such, it is important to supplement ChatGPT's analysis with manual audits by experienced security experts to ensure accuracy.

The following summary highlights the strengths and weaknesses of human-based services and ChatGPT on various criteria.

The effectiveness of ChatGPT's answers is largely dependent on the format of the prompt. In this blog, we compare the pre-audit results of our customer's interactions with ChatGPT and the final audit results performed by experts at CertiK. As technology improves and a clearer understanding of prompt engineering arises, engineers will be able to make better use of ChatGPT. Keep a lookout for our future blog posts, in which we delve into the art and science of prompt engineering: posing effective questions to ChatGPT.

Read more: https://www.certik.com/resources/blog/6oBs1st22AsSYxpF7ENoiX-auditing-with-chatgpt-complementary-but-incomplete
